NodeJS child_process stdout if the process is waiting for stdin

I'm working on an application that allows users to compile and execute code submitted over an API.
The binary I want to execute is saved as input_c; it should print a prompt asking the user for their name and print another message after the input is received.
It works correctly using the code below: first text - input (on the terminal) - second text.
const {spawn} = require('child_process');
let cmd = spawn('input_c', [], {stdio: [process.stdin, process.stdout, process.stderr]});
Output:
$ node test.js
Hello, what is your name? Heinz
Hi Heinz, nice to meet you!
I'd like to handle stdout, stderr and stdin separately instead of writing them to the terminal. The following code was my attempt to achieve the same behaviour as above:
const {spawn} = require('child_process');
let cmd = spawn('input_c');
cmd.stdout.on('data', data => {
    console.log(data.toString());
});
cmd.stderr.on('data', data => {
    console.log(data.toString());
});
cmd.on('error', data => {
    console.log(data.toString());
});

// simulating user input
setTimeout(function() {
    console.log('Heinz');
    cmd.stdin.write('Heinz\n');
}, 3000);
Output:
$ node test.js
Heinz
Hello, what is your name? Hi Heinz, nice to meet you!
To simulate user input I write to stdin after 3000 ms. But here I don't receive the first data on stdout right after the run starts; the process seems to wait for stdin and then outputs everything at once.
How can I achieve the same behaviour as in the first case?
The following C code was used to compile the binary, but any application waiting for user input can be used:
#include <stdio.h>

int main() {
    char name[32];
    printf("Hello, what is your name? ");
    scanf("%s", name);
    printf("Hi %s, nice to meet you!", name);
    return 0;
}

node-pty can be used here to prevent the child process from buffering its output: when stdout is a pseudo-terminal rather than a pipe, the C library does not fully buffer it.
const pty = require('node-pty');

let cmd = pty.spawn('./input_c');
cmd.on('data', data => {
    console.log(data.toString());
});

// simulating user input
setTimeout(function() {
    console.log('Heinz');
    cmd.write('Heinz\n');
}, 3000);
Output (note that the pty echoes the written input back):
Hello, what is your name?
Heinz
Heinz
Hi Heinz, nice to meet you!

The problem you're facing, with stdout 'data' events not being emitted right after spawn(), comes from how the child's output is buffered. By default Node connects the child process's stdio to pipes, and when stdout is a pipe rather than a terminal, the C library switches from line buffering to full buffering: the child holds output back until its buffer fills or the process exits, which is in general good for performance.
But since you want real-time output, and you're also in charge of the binary, you can simply disable the internal buffering. In C you achieve that by adding setbuf(stdout, NULL); to the beginning of your program:
#include <stdio.h>

int main() {
    setbuf(stdout, NULL);
    char name[32];
    printf("Hello, what is your name? ");
    scanf("%31s", name);
    printf("Hi %s, nice to meet you!", name);
    return 0;
}
Alternatively, you can call fflush(stdout); after each printf(), puts(), etc:
#include <stdio.h>

int main() {
    char name[32];
    printf("Hello, what is your name? "); fflush(stdout);
    scanf("%31s", name);
    printf("Hi %s, nice to meet you!", name); fflush(stdout);
    return 0;
}
Upon disabling internal buffering or triggering explicit flushes in the child process, you will immediately get the behavior you expect, without any external dependencies.
UPD:
Many applications intentionally suppress, or at least allow suppressing, stdio buffering, so you may find related startup arguments. For example, you can launch the Python interpreter with the -u option, which forces stdin, stdout and stderr to be completely unbuffered. There are also several older questions about Node.js and stdio buffering problems that you might find useful, like this one: How can I flush a child process from nodejs
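To illustrate the -u flag, here is a minimal sketch (assuming a python3 binary on the PATH; the inline script is just a stand-in for any buffering child process):

const { spawn } = require('child_process');

// With -u, "tick" arrives immediately; without it, both lines arrive
// together at exit, because Python fully buffers stdout on a pipe.
const py = spawn('python3', ['-u', '-c',
    'import time; print("tick"); time.sleep(2); print("tock")']);

py.stdout.on('data', data => process.stdout.write(data.toString()));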

stream stdout causes RAM usage to increase dramatically

I use spawn to run a command that runs constantly (it is not supposed to stop) and continuously transmits data to its output. The problem is that the RAM usage of the Node app increases constantly.
After multiple tests, I narrowed it down to the following piece of code, which reproduces the problem even though the handlers are almost empty:
const runCommand = () => {
    const command = 'FFMPEG COMMAND HERE';
    let ffmpeg = spawn(command, [], { shell: true });
    ffmpeg.on('exit', function(code) { code = null; });
    ffmpeg.stderr.on('data', function(data) { data = null; });
    ffmpeg.stdout.on('data', function(data) { data = null; });
};
I get the same problem with the following:
const runCommand = () => {
    const command = 'FFMPEG COMMAND HERE';
    let ffmpeg = spawn(command, [], { shell: true });
    ffmpeg.on('exit', function(code) { code = null; });
    ffmpeg.stderr.on('data', function(data) { data = null; });
    ffmpeg.on('spawn', function() {
        ffmpeg.stdout.pipe(fs.createWriteStream('/dev/null'));
    });
};
The important part is that when I delete function (data) {} from ffmpeg.stdout.on('data', function (data) {}); the problem goes away. The received data is a Buffer object. I think the problem lies in that part.
The problem also appears when spawn pipes the data out to another writable (even to /dev/null).
UPDATE: After hours of research, I found out that it is related to the spawn output and stream backpressure. I configured the FFMPEG command to send chunks less frequently, which mitigated the problem (memory grows more slowly than before), but usage is still increasing.
If you delete the ffmpeg.stdout.on('data', function (data) {}); line, the problem fades away, but only partially, because ffmpeg keeps writing to stdout and may eventually stop, waiting for stdout to be consumed. MongoDB, for example, has this "pause until stdout is empty" logic.
If you are not going to process the stdout, just ignore it:
const runCommand = () => {
    const command = 'FFMPEG COMMAND HERE';
    let ffmpeg = spawn(command, [], { shell: true, stdio: 'ignore' });
    ffmpeg.on('exit', function(code) { code = null; });
};
This makes the spawned process discard its stdout and stderr, so nothing needs to be consumed. It is the correct way, as you don't need to waste CPU cycles and resources reading a buffer that you are going to discard. Take into account that although you just add a one-liner to read and discard the data, libuv (the Node.js I/O manager, among other things) does more complex things to read this data.
Still, I'm pretty sure that you are facing this bug: https://github.com/Unitech/pm2/issues/5145
It also seems that if you output too many logs, pm2 can't write them to the output files as fast as needed, so reducing the log output can fix the problem: https://github.com/Unitech/pm2/issues/1126#issuecomment-996626921
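If you need to keep stderr (ffmpeg writes its diagnostics there) while still discarding stdout, the stdio option also accepts a per-stream array; a small variant sketch of the snippet above:

const { spawn } = require('child_process');

const runCommand = () => {
    const command = 'FFMPEG COMMAND HERE';
    // Drop stdin and stdout, but keep stderr as a pipe for ffmpeg's diagnostics.
    let ffmpeg = spawn(command, [], { shell: true, stdio: ['ignore', 'ignore', 'pipe'] });
    ffmpeg.stderr.on('data', function(data) { /* log progress here */ });
    ffmpeg.on('exit', function(code) { /* handle exit code */ });
};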
As you mentioned you need the stdout output, stdio: "ignore" is not an option.
Depending on what you do with the data you receive, you may get more data than you can handle, so buffers build up and fill your memory.
A possible solution is to pause the stream when too much data has built up and resume it afterwards:
ffmpeg.stdout.on('data', function(data) {
    ffmpeg.stdout.pause();
    doSomethingWithDataAsyncWhichTakesAWhile(data).finally(() => ffmpeg.stdout.resume());
});
When to pause and resume the stream depends heavily on how you handle the data.
Used in combination with a writable (which, if I'm not mistaken, is what you're doing):
ffmpeg.stdout.on('data', function(data) {
    if (!writable.write(data)) {
        // We need to wait for the 'drain' event.
        ffmpeg.stdout.pause();
        writable.once('drain', () => ffmpeg.stdout.resume());
    }
});
writable.write(...) returns false if the stream wishes for the calling code to wait for the 'drain' event to be emitted before continuing to write additional data; otherwise true (source).
If you ignore this you'll end up accumulating buffers in memory.
This might be the cause of your problem.
PS, as a side note:
At least on Unix systems, when the stdout buffer is full and not being read (e.g. because the stream is paused), the application writing to stdout will block until there is space to write into. In the case of ffmpeg this is not an issue and is intended behaviour, but it's something to be mindful of.
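For completeness: Node's built-in stream.pipeline wires up exactly this pause/drain/resume logic for you; a minimal sketch, where the output file path is just a placeholder:

const { spawn } = require('child_process');
const { pipeline } = require('stream');
const fs = require('fs');

const ffmpeg = spawn('FFMPEG COMMAND HERE', [], { shell: true });

// pipeline() propagates backpressure from the destination back to
// ffmpeg.stdout, so chunks never pile up in memory without bound.
pipeline(ffmpeg.stdout, fs.createWriteStream('./out.bin'), (err) => {
    if (err) console.error('pipeline failed:', err);
});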

Why does my typescript program randomly stop running?

I wrote a very simple TypeScript program which does the following:
Transform users.csv into an array
For each element/user issue an API call to create that user on a 3rd party platform
Print any errors
The CSV file has >160,000 rows, and there is no way to create them all in one API call, so I wrote this program to run in the background of my computer for roughly 20 hours.
The first time I ran it, the code stopped mid-loop without an exception or anything. So I deleted the user rows from the CSV file that were already uploaded and re-ran the code. Unfortunately, this kept happening.
Interestingly, the code has stopped at non-deterministic iterations: one time it was at i=812, another at i=27650, and so on.
This is the code:
const main = async () => {
    const usersFile = await fsPromises.readFile("./users.csv", { encoding: "utf-8" });
    const usersArr = makeArray(usersFile);
    for (let i = 0; i < usersArr.length; i++) {
        const [userId, email] = usersArr[i];
        console.log(`uploading ${userId}. ${i}/${usersArr.length}`);
        try {
            await axios.post(/* create user */);
            await sleep(150);
        } catch (err) {
            console.error(`Error uploading ${userId} -`, err.message);
        }
    }
};

main();
I should mention that the try/catch is inside the for loop because many rows fail to upload with a 400 error code. I prefer to have the code run non-stop and print any errors to a file, so that I can later re-run it for the users that failed to upload; otherwise I'd have to check every 10 minutes whether it had halted because of an error.
Why does this happen? and What can I do?
I run after compiling as: node build/index.js 2>>errors.txt
EDIT:
There is no code after main() and no code outside the try ... catch block within the loop. errors.txt only contains 400 errors. Even if it contained another run-time exception, it seems to me that this wouldn't/shouldn't halt execution, because the catch would run and the loop would move on to the next iteration.
I think this may have been related to this post. The file I was reading was extremely large, as noted, and it was saved into a runtime variable. Non-deterministically, the OS could have decided that the memory demanded was too high. This is probably a situation in which to use a Readable Stream instead of readFile.
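A minimal sketch of that streaming approach (assumptions: one user per CSV line, and the naive comma split stands in for whatever parsing makeArray did):

const fs = require('fs');
const readline = require('readline');

const main = async () => {
    // Stream the file line by line instead of holding all rows in memory.
    const rl = readline.createInterface({
        input: fs.createReadStream('./users.csv', { encoding: 'utf-8' }),
        crlfDelay: Infinity,
    });
    for await (const line of rl) {
        const [userId, email] = line.split(','); // naive CSV split
        // ... issue the API call for this user, as before
    }
};

main();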

Is there any way to determine if a nodejs childprocess wants input or is just sending feedback?

I had a little free time, so I decided to rewrite all my bash scripts in JavaScript (NodeJS, ES6) with child processes. Everything went smoothly until I wanted to automate user input.
Yes, you can automate the user input. But there is one problem: you can't determine whether a given data event is feedback or a request for input. At least I can't find a way to do it.
So basically you can do this:
// New spawn.
let { spawn } = require('child_process');
// New ufw process.
let ufw = spawn('ufw', ['enable']);

// Use predefined input.
ufw.stdin.setEncoding('utf-8');
ufw.stdout.pipe(process.stdout);
ufw.stdin.write('y\n');

// Event: standard out.
ufw.stdout.on('data', (data) => {
    console.log(data.toString('utf8'));
});

// Event: standard error.
ufw.stderr.on('data', (err) => {
    // Log error.
    console.log(err);
});

// When the job is finished (with or without error) it ends up here.
ufw.on('close', (code) => {
    // Check if there were errors.
    if (code !== 0) console.log('Exited with code: ' + code.toString());
    // End the input stream.
    ufw.stdin.end();
});
The above example works totally fine, but there are two things giving me a headache:
Will ufw.stdin.write('y\n'); wait until it is needed, and what happens if I have multiple inputs, for example 'yes', 'yes', 'no'? Do I have to write three lines of stdin.write()?
Isn't the position where I use ufw.stdin.write('y\n'); a little confusing? I thought I needed to provide the input after my prompt asked for it, so I decided to change my code so that stdin.write() would run at the right time, which makes sense, right? However, the only place to check when the 'right' time has come is the stdout.on('data', callback) event. That makes things a little difficult, since I would need to know whether the prompt is asking for user input or not...
Here is my code, which I think is totally wrong:
// New spawn.
let { spawn } = require('child_process');
// New ufw process.
let ufw = spawn('ufw', ['enable']);

// Event: standard out.
ufw.stdout.on('data', (data) => {
    console.log(data.toString('utf8'));
    // Use predefined input.
    ufw.stdin.setEncoding('utf-8');
    ufw.stdout.pipe(process.stdout);
    ufw.stdin.write('y\n');
});

// Event: standard error.
ufw.stderr.on('data', (err) => {
    // Log error.
    console.log(err);
});

// When the job is finished (with or without error) it ends up here.
ufw.on('close', (code) => {
    // Check if there were errors.
    if (code !== 0) console.log('Exited with code: ' + code.toString());
    // End the input stream.
    ufw.stdin.end();
});
My major misunderstanding is when to use stdin for (automated) user input and where to place it in my code so it is used at the right time, for example if I have multiple inputs for something like mysql_secure_installation.
So I was wondering if it is possible, and it seems it is not. I posted an issue for Node which ended up being closed: https://github.com/nodejs/node/issues/16214
I am asking for a way to determine if the current process is waiting for input.
There isn't one. I think you have wrong expectations about pipe I/O because that's simply not how it works.
Talking about expectations, check out expect. There is probably a node.js port if you look around.
I'll close this out because it's not implementable as a feature, and as a question nodejs/help is the more appropriate place.
So if anyone has the same problem I had: you can simply write multiple lines into stdin and use them as predefined values. Keep in mind that this will eventually break the stream if any input is broken or changes in future updates:
// New spawn.
let { spawn } = require('child_process');
// New msqlsec process.
let msqlsec = spawn('mysql_secure_installation', ['']);

// Arguments as an array.
let inputArgs = ['password', 'n', 'y', 'y', 'y', 'y'];

// Set correct encodings for logging.
msqlsec.stdin.setEncoding('utf-8');
msqlsec.stdout.setEncoding('utf-8');
msqlsec.stderr.setEncoding('utf-8');

// Use the predefined input and write a line for each entry.
for (let a = 0; a < inputArgs.length; a++) {
    msqlsec.stdin.write(inputArgs[a] + '\n');
}

// Event: standard out.
msqlsec.stdout.on('data', (data) => {
    console.log(data.toString('utf8'));
});

// Event: standard error.
msqlsec.stderr.on('data', (err) => {
    // Log error.
    console.log(err);
});

// When the job is finished (with or without error) it ends up here.
msqlsec.on('close', (code) => {
    // Check if there were errors.
    if (code !== 0) console.log('Exited with code: ' + code.toString());
    // Close the input to the writable stream.
    msqlsec.stdin.end();
});
For the sake of completeness, if someone wants to fill in the user input manually, you can simply start the given process like this:
// New msqlsec process.
let msqlsec = spawn('mysql_secure_installation', [''], { stdio: 'inherit', shell: true });
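If you want to react to actual prompts instead of blind-writing predefined answers, the expect approach mentioned in the issue reply can be approximated with node-pty (used in an answer further up this page); a rough sketch, where the prompt regex is only an assumption about ufw's output:

const pty = require('node-pty');

// Answer only when the expected prompt text actually appears.
const child = pty.spawn('ufw', ['enable']);
child.on('data', (data) => {
    process.stdout.write(data);
    if (/\(y\|n\)/i.test(data)) child.write('y\n');
});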

Nodejs child_process execute shell command

I am working on a university project where I have to evaluate the security threats of an open WiFi network. I have chosen the aircrack-ng set of tools for penetration testing. My project uses Node.js for its rich set of features. However, I am a beginner and am struggling to solve a problem. Firstly, I shall present my code and then pose the problem.
var spawn = require('child_process').spawn;
var nic = "wlan2";

// Obtain the uid number of a user for spawning a new console command.
// var uidNumber = require("uid-number");
// uidNumber("su", function (er, uid, gid) {
//     console.log(uid);
// });

// Check for monitor tools.
var airmon_ng = spawn('airmon-ng');
airmon_ng.stdout.on('data', function(data) {
    nicList = data.toString().split("\n");
    // Use for data binding.
    console.log(nicList[0]); //.split("\t")[0]);
});

// airmon-ng start at the nic (var).
var airmon_ng_start = spawn('airmon-ng', ['start', nic]).on('error', function(err) { console.log(err); });
airmon_ng_start.stdout.on('data', function(data) {
    console.log(data.toString());
});
var airmon_ng_start = spawn('airodump-ng', ['mon0']).on('error', function(err) { console.log(err); });
airmon_ng_start.stdout.on('data', function(data) {
    console.log(data.toString());
});
As seen in the code above, I use child_process.spawn to execute shell commands. In the line var airmon_ng_start = spawn(... the actual command executes in the terminal, doesn't end until Ctrl+C is hit, and regularly updates the list of Wi-Fi networks available in the vicinity. My goal is to identify the network that I wish to test for vulnerability. However, when I execute the command, the process waits indefinitely for the shell command to terminate (which never terminates until killed). Moreover, I wish to use the stdout stream to display each new set of data as the scan finds and updates Wi-Fi networks. Can the Node.js experts suggest a better way to do this?
2) Also, I wish to execute some commands as root. How may this be done? For now I am running the JavaScript as root. However, in the project I wish to execute only some of the commands as root and not the entire JS file. Any suggestions?
// Inherit the parent's stdio streams.
var airmon_ng_start = spawn('airodump-ng', ['mon0'], { stdio: 'inherit' })
    .on('error', function(err) { console.log(err); });
Found this solution: simply inherit the parent's stdio.
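Regarding the second question (running only some commands as root), which the answer above doesn't cover: a common approach is to spawn just the privileged command through sudo; a sketch, assuming sudoers is configured to allow these binaries without a password prompt:

var spawn = require('child_process').spawn;

// Only this command runs as root; the Node process itself stays unprivileged.
var airmon = spawn('sudo', ['airmon-ng', 'start', 'wlan2']);
airmon.stdout.on('data', function(data) {
    console.log(data.toString());
});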

Stdout of Node.js child_process exec is cut short

In Node.js I'm using the exec command of the child_process module to call an algorithm in Java that returns a large amount of text to standard out, which I then parse and use. I'm able to capture most of it, but when it exceeds a certain number of lines, the content is cut off.
exec("sh target/bin/solver "+fields.dimx+" "+fields.dimy, function(error, stdout, stderr){
//do stuff with stdout
}
I've tried using setTimeout and callbacks but haven't succeeded, though I do feel this occurs because I'm referencing stdout in my code before it has been retrieved completely. I have verified that stdout is in fact where the data loss first occurs; it's not an asynchronous issue further down the line. I've also tested this on my local machine and on Heroku, and the exact same issue occurs, truncating at the exact same line number every time.
Any ideas or suggestions as to what might help with this?
I had exec stdout 'end' callbacks hang forever with #damphat's solution.
Another solution is to increase the buffer size in the options of exec: see the documentation here
{
    encoding: 'utf8',
    timeout: 0,
    maxBuffer: 200 * 1024, // increase here
    killSignal: 'SIGTERM',
    cwd: null,
    env: null
}
To quote: maxBuffer specifies the largest amount of data allowed on stdout or stderr; if this value is exceeded, the child process is killed. I now use the following; unlike the accepted solution, it does not require handling chunk parts joined by commas in stdout.
exec('dir /b /O-D ^2014*', {
    maxBuffer: 2000 * 1024 // quick fix
}, function(error, stdout, stderr) {
    list_of_filenames = stdout.split('\r\n'); // adapt to your line-ending char
    console.log("Found %s files in the replay folder", list_of_filenames.length);
});
The real (and best) solution to this problem is to use spawn instead of exec.
As stated in this article, spawn is better suited to handling large volumes of data:
child_process.exec returns the whole buffered output of the child process. By default the buffer size is set to 200k. If the child process returns anything more than that, your program will crash with the error message "Error: maxBuffer exceeded". You can fix that problem by setting a bigger buffer size in the exec options. But you should not do it, because exec is not meant for processes that return HUGE buffers to Node. You should use spawn for that. So what do you use exec for? Use it to run programs that return result statuses instead of data.
spawn requires a different syntax than exec:
var proc = spawn('sh', ['target/bin/solver', fields.dimx, fields.dimy]);

proc.on("exit", function(exitCode) {
    console.log('process exited with code ' + exitCode);
});
proc.stdout.on("data", function(chunk) {
    console.log('received chunk ' + chunk);
});
proc.stdout.on("end", function() {
    console.log("finished collecting data chunks from stdout");
});
Edited:
I tried dir /s on my computer (Windows) and got the same problem (it looks like a bug); this code solved it for me:
var exec = require('child_process').exec;

function my_exec(command, callback) {
    var proc = exec(command);
    var list = [];
    proc.stdout.setEncoding('utf8');
    proc.stdout.on('data', function(chunk) {
        list.push(chunk);
    });
    proc.stdout.on('end', function() {
        // Note: join() uses ',' by default, hence the comma-separated
        // chunk parts mentioned in the answer above.
        callback(list.join());
    });
}

my_exec('dir /s', function(stdout) {
    console.log(stdout);
});
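For reference, on modern Node the same collect-the-whole-output behaviour is available via util.promisify; a sketch, with maxBuffer raised as in the answer above:

const util = require('util');
const exec = util.promisify(require('child_process').exec);

(async () => {
    // The promisified exec resolves with { stdout, stderr } once the
    // process exits, and rejects if maxBuffer is exceeded.
    const { stdout } = await exec('dir /s', { maxBuffer: 2000 * 1024 });
    console.log(stdout);
})();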
