Node console.log on large array shows "... 86 more items" - javascript

I'm new to puppeteer. I used to use PhantomJS and CasperJS, but while setting up a newer server (FreeBSD 12) I found out that support for PhantomJS is gone and CasperJS gives me segmentation faults.
I was able to port my applications to puppeteer just fine but ran into the problem that when I want to capture data from a table, this data seems to be incomplete or truncated.
I need all the info from a table but always end up getting less.
I have tried smaller tables but it also comes out truncated.
I don't know if the console.log buffer can be extended or not, or if there is a better way to get the values of all tds in the table.
const data = await page.$$eval('table.dtaTbl tr td', tds => tds.map((td) => {
return td.innerHTML;
}));
console.log(data);
I should be able to get all rows, but instead I get this:
[ 'SF xx/xxxx 3-3999 06-01-16',
'Sample text - POLE',
'',
/* tons of other rows (removed by me in this example) <- */
'',
/* end of output */ ... 86 more items ]
I need the 86 other items, because I'm having PHP pick the output up from stdout as the code is executed.

Why console.log does not work
Under the hood, console.log uses util.inspect, which produces output intended for debugging. To create reasonable debugging information, this function will truncate output which would be too long. To quote the docs:
The util.inspect() method returns a string representation of object that is intended for debugging. The output of util.inspect may change at any time and should not be depended upon programmatically.
Solution: Use process.stdout
If you want to write output to stdout you can use process.stdout which is a writable stream. It will not modify/truncate what you write on the stream. You can use it like this:
process.stdout.write(JSON.stringify(data) + '\n');
I added a line break at the end, as the function will not produce a line break itself (in contrast to console.log). If your script does not rely on it you can simply remove it.

You can also use
console.log(JSON.stringify(data, null, 4));
instead of
process.stdout.write(JSON.stringify(data) + '\n');

I know the question is from a couple of years ago, but this has been an issue I've seen time and time again. Discovering (through this thread) the underlying util.inspect call has helped me to overcome this issue in the following way:
const util = require('util');
process.stdout.write(`${util.inspect(data, { maxArrayLength: 1000 })}\n`)
By default maxArrayLength is 100, which is why the data is truncated for longer arrays.
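As a rough illustration of the difference (the 150-element array is made up for this example), the sketch below compares the default util.inspect output with maxArrayLength: null, which the Node.js documentation describes as showing all elements, and with JSON.stringify:

```javascript
const util = require('util');

// Made-up sample data: 150 strings, more than the default limit of 100.
const sample = Array.from({ length: 150 }, (_, i) => `cell ${i}`);

// Default formatting, as used by console.log: stops after 100 elements.
const truncated = util.inspect(sample);

// maxArrayLength: null (or Infinity) lifts the element limit.
const full = util.inspect(sample, { maxArrayLength: null });

// JSON is usually easier for another program (such as PHP) to parse.
const json = JSON.stringify(sample);

console.log(truncated.includes('more items')); // true
console.log(full.includes('more items'));      // false
console.log(JSON.parse(json).length);          // 150
```

If the consumer is a PHP script, the JSON form is probably the safer choice, since the util.inspect format is meant for debugging and may change.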

Do you absolutely have to use stdout? It's not recommended to do that for monitoring, because it's very easy for stdout to overrun the buffer (or produce incomplete output), as the problem you've seen illustrates.
Why not modify the PHP script to read from a file as a stream using the readfile function, and write to that stream from your JS code using fs?
https://nodejs.org/docs/latest-v10.x/api/fs.html#fs_class_fs_writestream
https://www.php.net/manual/en/function.readfile.php
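A minimal sketch of the Node side of that idea, assuming a hypothetical output path in the system temp directory that the PHP script would also know about. It uses fs.writeFileSync rather than a write stream for brevity, and the rows array is a stand-in for the scraped data:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical location; any path both the Node and PHP scripts agree on works.
const outFile = path.join(os.tmpdir(), 'table-data.json');

// Stand-in for the array returned by page.$$eval(...).
const rows = ['SF xx/xxxx 3-3999 06-01-16', 'Sample text - POLE', ''];

// Write the whole array as JSON; nothing is truncated on the way.
fs.writeFileSync(outFile, JSON.stringify(rows));

// Reading the file back shows that every element survived the round trip.
const roundTrip = JSON.parse(fs.readFileSync(outFile, 'utf8'));
console.log(roundTrip.length === rows.length);
```

On the PHP side, something like json_decode(file_get_contents($path)) could then pick the data up.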

Related

Node - safest way to execute code from a string during runtime

My Node app gets an HTML page via axios, parses it via htmlparser2 then sends the valuable information to a frontend JS app as JSON.
The HTML page has some JavaScript in it that creates an array, and I need to work with that array in my code. htmlparser2 gets the content of the script as a string. I have two options to handle it as far as I know:
1. Write a parser that goes through the string and extracts the required info (doable, but complicated).
2. Run the string as JavaScript code and handle the values from that.
Assume I want to go with option 2. According to this StackOverflow question, using Node's VM module is possible, but the official documentation says "The node:vm module is not a security mechanism. Do not use it to run untrusted code."
I consider the code in my use case untrusted. What would be a safe solution for this?
EDIT: A snippet from the string:
hatizsakCucc = new Array();
hazbanCucc = new Array();
function adatokMessage(targyIndexStr,tomb) {
var targyIndex = parseInt(targyIndexStr);
if (tomb.length<1) alert("Nincs semmi!");
else alert(tomb[targyIndex]);
}
hatizsakCucc[0]="Név: ezüst\nSúly: 0.0001 kg.\nMennyiség: 453\nÖsszsúly: 0.0453 kg.\n";
hatizsakCucc[1]="Név: kaja\nSúly: 0.4 kg.\nÁr: 2 ezüst\nMennyiség: 68\nÖsszár: 136 ezüst\nÖsszsúly: 27.2 kg.\n";
hatizsakCucc[2]="Típus: fegyver\nNév: bot\nSúly: 2 kg.\nÁr: 6 ezüst\nMin. szint: 1\nMaximum sebzés: 6\nSebzés szórás: 5\nFajta: ütő/zúzó\n";
hatizsakCucc[3]="Típus: fegyver\nNév: parittya\nSúly: 0.3 kg.\nÁr: 14 ezüst\nMin. szint: 1\nMaximum sebzés: 7\nSebzés szórás: 4\nFajta: távolsági\n";
hatizsakCucc[4]="Név: csodatarisznya\nSúly: 4 kg.\nÁr: 1000 ezüst\nExtra: templomi árú\n";
hatizsakCucc[5]="Név: imamalom\nSúly: 5 kg.\nÁr: 150 ezüst\nExtra: templomi árú\n";
The whole string is about 100 lines of this, so it's not too much data.
What I need is the contents of the hatizsakCucc array. Actually, getting an array of those is not too difficult with a regex, I'm realizing now.
hatizsakSzkript.match(/hatizsakCucc(.*)\\n/g);
This gives me an array of the hatizsakCucc elements, so I guess my problem is solved.
That said, I'm still curious about the possibility of running "untrusted" code safely.
Further context:
I plan to parse each array element into an object whose properties are the substrings separated by the \n-s.
So the expected result for the first array element will be:
hatizsakCucc[0]{
nev: "ezüst",
suly: 0.0001,
mennyiseg: ...
}
I'll write a function that splits the string into substrings at each \n and then parses the data with match().
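One way such a function could look, as a sketch only: parseEntry is a hypothetical name, and the key normalisation (strip accents, lowercase, drop punctuation) and the rule "a value that starts with a number becomes a number" are assumptions based on the expected result above, not something the question specifies. It accepts both real newlines and the literal two-character \n found in the raw script text:

```javascript
// Hypothetical helper: turns one "Key: value\n..." entry into a plain object.
function parseEntry(entry) {
  const result = {};
  // Split on a literal backslash + n (raw script text) or on a real newline.
  for (const line of entry.split(/\\n|\n/)) {
    const parts = line.match(/^([^:]+):\s*(.*)$/);
    if (!parts) continue; // skip empty or malformed lines

    // Assumed key format: accents removed, lowercase, letters and digits only.
    const key = parts[1]
      .normalize('NFD')
      .replace(/[\u0300-\u036f]/g, '')
      .toLowerCase()
      .replace(/[^a-z0-9]/g, '');

    // Assumed value rule: a leading number wins and the unit is dropped.
    const raw = parts[2].trim();
    const leadingNumber = raw.match(/^-?\d+(?:\.\d+)?/);
    result[key] = leadingNumber ? Number(leadingNumber[0]) : raw;
  }
  return result;
}

const first = parseEntry('Név: ezüst\nSúly: 0.0001 kg.\nMennyiség: 453\n');
console.log(first);
```

Because this only reads the text and never evaluates it, the question of running untrusted code does not arise for this particular task.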

Bad data received from Arduino serial while reading from analog port

Maybe this is just nonsense, but it's driving me crazy. I'm trying to read one analog port on an Arduino and send the value through the serial port to JavaScript using Node. When I show the data in the Arduino console, everything works fine, but when I use the terminal on a Mac, some values appear split across two lines.
460
460
4
60
460
The code I'm using is:
Arduino:
const int analogInPin = A0;
int sensorValue = 0;
void setup() {
Serial.begin(500000);
}
void loop() {
sensorValue = analogRead(analogInPin);
Serial.print(sensorValue);
delay(200);
}
Node:
var com = require('serialport').SerialPort;
var opts = {baudrate: 500000};
var serialPort = new com('/dev/tty.usbmodem641', opts);
serialPort.on('data', function(data) {
console.log(data.toString());
});
The code couldn't be simpler, but it still doesn't work properly. I know I'm missing something but I can't see it. I have tested different baud rates, but nothing works. Could you please help me?
Thanks in advance
I think maybe Elias Benevedes is trying to suggest this in their answer: right now your Arduino data is not delimited at all. Suppose your sensorValue always reads as 1. In this case the output from Arduino will be
11111111111111111111111111111111111111111111111111111111111....
And so on; because you print the integer value without any delimiters. The way it is parsed into different numbers, therefore, has to do with the timing of the arrival of the data. Continuing with the example above then, sometimes your value is read as 1, sometimes as 11, sometimes as 111 and so on, just depending on the timing of the reads and the writes.
The way to begin to fix it is to insert some non-numeric data between your sensor reading outputs. One way (again, this is perhaps what Elias Benevedes has in mind) is to insert line breaks between every number printed
Serial.println(sensorValue);
Another way would be to add spaces between the data
Serial.print(sensorValue);
Serial.print(" ");
Either solution would separate your numeric readings from each other, which is what you want.
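On the Node side, the chunks still arrive at arbitrary boundaries, so the receiver has to buffer them and cut at the delimiter. A minimal sketch, assuming the Arduino now uses Serial.println; createLineSplitter is a hypothetical helper, and in the question's code feed would be called from the serialPort.on('data', ...) handler:

```javascript
// Hypothetical helper: buffers incoming chunks and emits only complete lines.
function createLineSplitter(onLine) {
  let pending = '';
  return function feed(chunk) {
    pending += chunk.toString();
    const parts = pending.split(/\r?\n/);
    // The last piece has no line ending yet, so keep it for the next chunk.
    pending = parts.pop();
    for (const line of parts) {
      if (line.length > 0) onLine(line);
    }
  };
}

// Simulated chunks that split one reading in the middle, as in the question.
const received = [];
const feed = createLineSplitter((line) => received.push(line));
feed('460\r\n4');
feed('60\r\n460\r\n');
console.log(received);
```

Each reading is reported only once its line ending has arrived, regardless of how the bytes were grouped in transit.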
I had this happen to me once too. Serial.print() sends the value from the Arduino with nothing after it, while Serial.println() sends the same value followed by a carriage return and newline, so each reading arrives on its own line in the serial monitor (or whatever is reading the port).

Simple way to check/validate javascript syntax

I have a big set of different JavaScript snippets (several thousand), and some of them have stupid syntax errors (like unmatched braces/quotes, HTML inside JavaScript, typos in variable names).
I need a simple way to check JS syntax. I've tried JSLint, but it sends too many warnings about style, the way variables are defined, etc. (even if I turn off all flags). I don't need to find style problems or improve JavaScript quality; I just need to find obvious syntax errors. Of course I can simply check it in a browser console, but I need to do it automatically, as the number of snippets is big.
Add:
JSLint/JSHint report a lot of problems in lines that are not pretty but work (i.e. have some potential problems), and miss the real problems, where a normal compiler would simply report a syntax error and stop execution. For example, try running JSLint on this code, which has syntax errors on line 4 (unmatched quotes), line 6 (comma required), and line 9 (unexpected <script>).
document.write('something');
a = 0;
if (window.location == 'http://google.com') a = 1;
document.write("aaa='andh"+a+"eded"');
a = {
something: ['a']
something2: ['a']
};
<script>
a = 1;
You could try JSHint, which is less verbose.
Just in case anyone is still looking, you could try Esprima.
It only checks syntax, nothing else.
I've found that SpiderMonkey has the ability to compile a script without executing it, and if compilation fails, it prints an error.
So I just created a small wrapper for SpiderMonkey
sub checkjs {
my $js = shift;
my ( $js_fh, $js_tmpfile ) = File::Temp::tempfile( 'XXXXXXXXXXXX', EXLOCK => 0, UNLINK => 1, TMPDIR => 1 );
$| = 1;
print $js_fh $js;
close $js_fh;
return qx(js -C -f $js_tmpfile 2>&1);
}
And javascriptlint.com also works very well in my case. (Thanks to @rajeshkakawat.)
Lots of options if you have an exhaustive list of the JSLint errors you do want to capture.
JSLint's code is actually quite good and fairly easy to understand (I'm assuming you already know JavaScript fairly well from your question). You could hack it to only check what you want and to continue no matter how many errors it finds.
You could also write something quickly in Node.js to use JSLint as-is to check every file/snippet quickly and output only those errors you care about.
Just use node --check filename
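For thousands of snippets it may be cheaper to stay inside one Node process instead of spawning node --check per file. A minimal sketch using the built-in vm module, whose Script constructor compiles source without running it; checkSyntax is a hypothetical helper, and snippets are treated as classic scripts, so ES module syntax such as import would be reported as an error:

```javascript
const vm = require('vm');

// Hypothetical helper: returns null if the source parses, else the error text.
function checkSyntax(source, name = 'snippet.js') {
  try {
    // Compiles only; the snippet is never executed.
    new vm.Script(source, { filename: name });
    return null;
  } catch (err) {
    if (err && err.name === 'SyntaxError') return err.message;
    throw err; // anything else is unexpected
  }
}

// A valid statement, and two of the errors from the question's example.
const okResult = checkSyntax("document.write('something');");
const missingComma = checkSyntax("a = {\n  something: ['a']\n  something2: ['a']\n};");
const strayTag = checkSyntax('<script>\na = 1;');

console.log(okResult === null);
console.log(typeof missingComma, typeof strayTag);
```

Like a compiler, this stops at the first syntax error in each snippet and says nothing about style.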
Semantic Designs' (my company) JavaScript formatter reads JS files and formats them. You don't want the formatting part.
To read the files it will format, it uses a full JavaScript parser, which does a complete syntax check (even inside regular expressions). If you run it and simply ignore the formatted result, you get a syntax checker.
You can give it a big list of files and it will format all of them. You could use this to batch-check your large set. (If there are any syntax errors, it returns a nonzero error status to the shell.)

I need a Javascript literal syntax converter/deobfuscation tools

I have searched Google for a converter but did not find anything. Are there any tools available, or must I write one to decode my obfuscated JavaScript code?
I presume there is such a tool, but I'm not searching Google with the right keywords.
The code is 3 pages long, which is why I need a tool.
Here is an example of the code:
<script>([][(![]+[])[!+[]+!+[]+!+[]]+(!![]+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[+!+[]+[+[]]]+(!![]+[])[+!+[]]+(!![]+[])[+[]]][([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]+(![]+[])[!+[]+!+[]]]()[(!![]+[])[!+[]+!+[]+!+[]]+(+(+[])+[][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]])[!+[]+!+[]+!+[]+[+[]]]+(![]+[])[+!+[]]+(![]+[])[!+[]+!+[]]])(([]+[])[([][(![]+[])[+[]]+([![]]+[][[]])[+!+[]+[+[]]]+(![]+[])[!+[]+!+[]]+(!![]+[])[+[]]+(!![]+[])[!+[]+!+[]+!+[]]+(!![]+[])[+!+[]]]+[])[!+[]+!+[]+!+[]]+(!![]+
Thank you
This code is fascinating because it seems to use only nine characters ("[]()!+,;" and empty space U+0020) yet has some sophisticated functionality. It appears to use JavaScript's implicit type conversion to coerce arrays into various primitive types and their string representations and then use the characters from those strings to compose other strings which type out the names of functions which are then called.
Consider the following snippet which evaluates to the array filter function:
([][
(![]+[])[+[]] // => "f"
+ ([![]]+[][[]])[+!+[]+[+[]]] // => "i"
+ (![]+[])[!+[]+!+[]] // => "l"
+ (!![]+[])[+[]] // => "t"
+ (!![]+[])[!+[]+!+[]+!+[]] // => "e"
+ (!![]+[])[+!+[]] // => "r"
]) // => function filter() { /* native code */ }
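The same construction can be spelled out step by step with named intermediate values. This is only a readable restatement of the snippet above (the variable names are made up for illustration), and each expression relies on nothing but implicit type conversion:

```javascript
// Strings obtained by coercing booleans, arrays, and undefined.
const falseStr = ![] + [];          // false + [] gives "false"
const trueStr = !![] + [];          // true + [] gives "true"
const mixedStr = [![]] + [][[]];    // [false] + undefined gives "falseundefined"

// Numbers (and one numeric string) used as indexes.
const zero = +[];                   // 0
const one = +!+[];                  // 1
const two = !+[] + !+[];            // true + true gives 2
const three = !+[] + !+[] + !+[];   // 3
const ten = +!+[] + [+[]];          // 1 + [0] gives the string "10"

// Pick single characters out of the strings to spell a method name.
const name =
  falseStr[zero] +  // "f"
  mixedStr[ten] +   // "i"
  falseStr[two] +   // "l"
  trueStr[zero] +   // "t"
  trueStr[three] +  // "e"
  trueStr[one];     // "r"

console.log(name);                                 // filter
console.log([][name] === Array.prototype.filter);  // true
```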
Reconstructing the code as such is time-consuming and error-prone, so an automated solution is obviously desirable. However, the behavior of this code is so tightly bound to the JavaScript runtime that deobfuscating it seems to require a JS interpreter to evaluate the code.
I haven't been able to find any tools that will work generally with this sort of encoding. It seems as though you'll have to study the code further and determine any patterns of usage (e.g. reliance on array methods) and figure out how to capture their usage (e.g. by wrapping high-level functions [such as Function.prototype.call]) to trace the code execution for you.
This question has already an accepted answer, but I will still post to clear some things up.
When this idea came up, some guy made a generator to encode JavaScript in this way. It is based on doing []["sort"]["call"]()["eval"](/* big blob of code here */). Therefore, you can decode the results of this encoder easily by removing the sort-call-eval part (i.e. the first 1628 bytes). In this case it produces:
if (document.cookie=="6ffe613e2919f074e477a0a80f95d6a1"){ alert("bravo"); }
else{ document.location="http://www.youtube.com/watch?v=oHg5SJYRHA0"; }
(Funnily enough, the creator of this code was not even able to compress it properly and save a kilobyte.)
There is also an explanation of why this code doesn't work in newer browsers anymore: they changed Array.prototype.sort so that it does not return a reference to window. As far as I remember, this was the only way to get a reference to window, so this code is kind of broken now.

document.evaluate won't work from content script

var allTags = document.evaluate("//*[contains(@src,'"+imgSrc+"')]", document, null, XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
This is the code that gives errors, it gives:
Uncaught Error: TYPE_ERR: DOM XPath Exception 52
Could someone tell me what the problem is?
I don't have a precise answer, but I can guess and give a workaround.
First the workaround: change UNORDERED_NODE_SNAPSHOT_TYPE to a type that doesn't create a snapshot (unless you need it that way) and returns multiple nodes, like UNORDERED_NODE_ITERATOR_TYPE (or ANY_TYPE).
And my guess: the spec says for this function 'TYPE_ERR: Raised if the result cannot be converted to return the specified type.'. It may be that it can't allocate the resources to create a snapshot, or something like this (the workaround assumes that).
Edit:
The real problem is most likely not the call to document.evaluate itself. In your code you call allTags.iterateNext, which expects allTags to be a *_NODE_ITERATOR_TYPE result and not a *_NODE_SNAPSHOT_TYPE one; using allTags.snapshotItem instead doesn't cause an error to be thrown. I wrote a sample at jsfiddle; it changes the borders after 2 seconds using the evaluate call from your question and iterates over the elements in the proper way.
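The two access patterns can be sketched as a pair of small helpers. snapshotToArray and iteratorToArray are hypothetical names; each takes the object returned by document.evaluate and uses only the members that belong to its result type:

```javascript
// For *_NODE_SNAPSHOT_TYPE results: index-based access.
function snapshotToArray(result) {
  const nodes = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    nodes.push(result.snapshotItem(i));
  }
  return nodes;
}

// For *_NODE_ITERATOR_TYPE results: iterateNext() returns null when done.
function iteratorToArray(result) {
  const nodes = [];
  let node = result.iterateNext();
  while (node) {
    nodes.push(node);
    node = result.iterateNext();
  }
  return nodes;
}

// In a browser, usage with the call from the question would look like:
//   var allTags = document.evaluate(xpath, document, null,
//       XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE, null);
//   snapshotToArray(allTags).forEach(function (el) { /* use el */ });
```

Calling the iterator helper on a snapshot result (or the other way round) is the kind of mismatch that raises the TYPE_ERR exception.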
