Invoke pre-parsed command line with Node.js - javascript

I need to invoke the following command, where password is user input. However, I am worried about the possibility of an attack, such as "; rm -rf / ;" being the input given by the user.
var exec = require('child_process').exec;
var checkPassword = exec('echo "' + password + '" | cracklib-check\n', function(err, stdout, stderr) {
...
...
});
Is there a way to invoke the command with pre-parsed arguments (preferably native to nodejs/javascript), kind of like the prepared statements which are used to avoid SQL injection?
I could probably avoid the problem by blacklisting certain characters, but that seems much less reliable, and I'd like to avoid it if possible.

As you point out, building a command line with user-provided input is a security issue. Typically you would write a wrapper that verifies that each user-provided parameter matches a whitelist before invoking the command, as sketched below.
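For illustration only, a minimal whitelist wrapper might look like the following; the character set and the error handling are assumptions, not something from the original answer:
// Hypothetical whitelist check: reject anything outside a conservative character set
function assertSafeArgument(value) {
    // Assumed policy; widen it to whatever your command actually accepts
    if (!/^[A-Za-z0-9 _.@-]+$/.test(value)) {
        throw new Error('argument contains characters outside the whitelist');
    }
    return value;
}
// e.g. assertSafeArgument(password) before ever building a command line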
In your case, however, there is a simpler solution: you are constructing a command line that simply sends the password to the stdin of cracklib-check. Instead of using child_process.exec you can switch to child_process.spawn, which lets you write directly to stdin, avoiding the need to build a command line from user-provided input.
The following sample code avoids the security problem:
const spawn = require('child_process').spawn;
// Read password from argument to nodejs invocation
var password = process.argv[2];
// Spawn cracklib-check
var cracklib_check = spawn("/usr/sbin/cracklib-check");
// Send password to cracklib-check STDIN
cracklib_check.stdin.write(password);
cracklib_check.stdin.end();
// Process results of cracklib-check
cracklib_check.stdout.on('data', function (data) {
console.log("[*] " + data);
});
cracklib_check.stderr.on('data', function (data) {
console.log("[-] " + data);
});

Ilmora's answer got me started, but I still had to handle encoding.
const spawn = require('child_process').spawn;
// Read password from argument to nodejs invocation
var password = process.argv[2];
var cracklib_check = spawn('/usr/sbin/cracklib-check');
// stdin is a writable stream, so set its default encoding before writing
cracklib_check.stdin.setDefaultEncoding('utf-8');
cracklib_check.stdin.write(password);
cracklib_check.stdin.end();
// Process results of cracklib-check
cracklib_check.stdout.on('data', function (data) {
console.log("[*] " + data.toString());
});
cracklib_check.stderr.on('data', function (data) {
console.log("[-] " + data.toString());
});

Related

using child_process spawn on aws lambda for a python script

I'm trying to run a python script from my javascript file through child_process.spawn, but it never seems to run on AWS Lambda.
The relevant code is :
getEntities: function (){
var spawn = require('child_process').spawn;
var py = spawn('python', ['mainPythonFile.py']);
var outputString = "starting string";
console.log("BEFORE ANY INPUT");
py.stdout.on('data', function (data) {
console.log("----Getting information from the python script!---");
outputString += data.toString();
console.log(outputString);
});
py.stdout.on('end', function (){
console.log("===hello from the end call in python files===");
console.log("My output : " + outputString);
});
console.log("NO INPUT CHANGED??");
return outputString;
}
These files are in the same level of the folder structure (surface level).
The python file being run is quite simple and only contains a few print statements:
MainPythonFile:
import sys;
print("Hello There");
print("My name is Waffles");
print("Potato Waffles");
sys.stdout.flush()
The output I get from the aws service is this :
BEFORE ANY INPUT
NO INPUT CHANGED??
starting string
I've tried different paths when trying to access the python file, such as
mainPythonFile.py, ./mainPythonFile.py, etc.
I think the code seems to be fine as this works on my local machine, but there's a subtlety of trying to make it run on AWS that I cannot understand.
I can provide any other info if need be.
NOTE: the "getEntities" function is being called by another node.js file, but I moved the code to the calling function, I get the same result.
Due to the asynchronous nature of JS, as explained by Chris, the function reaches the "return" statement before the "end" event of the spawned process has actually fired.
This means the code never got a chance to actually set the correct output text.
I changed my function to take in a callback, which then responds once the program has replied with its output.
My new function is slightly changed to this (without the prints):
getEntities: function(callbackFunction, that){
var spawn = require('child_process').spawn;
var py = spawn('python', ['mainPythonFile.py']);
var outputString = "starting string";
py.stdout.on('data', function (data) {
outputString += data.toString();
});
// that = "this == alexa" that's passed in as input.
py.stdout.on('end', function (){
callbackFunction(outputString, that);
});
}
The function that called this function is now as follows:
HelperFunctions.getEntities(function(returnString,that){
that.response.speak(returnString);
that.emit(':responseReady');
}, this);
I'm sure there's a prettier way to do this, but it seems to be working for now. Thanks to ChrisG.
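As one possible cleanup (not from the original answer), the spawn could be wrapped in a Promise so callers can await the output instead of passing callbacks; the file name and function shape below are just placeholders:
const { spawn } = require('child_process');
// Hypothetical Promise wrapper around the same python invocation
function getEntities() {
    return new Promise(function (resolve, reject) {
        const py = spawn('python', ['mainPythonFile.py']);
        let output = '';
        py.stdout.on('data', function (data) { output += data.toString(); });
        py.on('error', reject);
        py.on('close', function (code) {
            if (code === 0) { resolve(output); }
            else { reject(new Error('python exited with code ' + code)); }
        });
    });
}
// Usage: getEntities().then(function (text) { /* speak the result */ });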

Pass large array to node child process

I have complex CPU intensive work I want to do on a large array. Ideally, I'd like to pass this to the child process.
var spawn = require('child_process').spawn;
// dataAsNumbers is a large 2D array
var child = spawn(process.execPath, ['/child_process_scripts/getStatistics', dataAsNumbers]);
child.stdout.on('data', function(data){
console.log('from child: ', data.toString());
});
But when I do, node gives the error:
spawn E2BIG
I came across this article
So piping the data to the child process seems to be the way to go. My code is now:
var spawn = require('child_process').spawn;
console.log('creating child........................');
var options = { stdio: [null, null, null, 'pipe'] };
var args = [ '/getStatistics' ];
var child = spawn(process.execPath, args, options);
var pipe = child.stdio[3];
pipe.write(Buffer('awesome'));
child.stdout.on('data', function(data){
console.log('from child: ', data.toString());
});
And then in getStatistics.js:
console.log('im inside child');
process.stdin.on('data', function(data) {
console.log('data is ', data);
process.exit(0);
});
However the callback in process.stdin.on isn't reached. How can I receive a stream in my child script?
EDIT
I had to abandon the buffer approach. Now I'm sending the array as a message:
var cp = require('child_process');
var child = cp.fork('/getStatistics.js');
child.send({
dataAsNumbers: dataAsNumbers
});
But this only works when the length of dataAsNumbers is below about 20,000, otherwise it times out.
With such a massive amount of data, I would look into using shared memory rather than copying the data into the child process (which is what happens when you use a pipe or pass messages). This will save memory, take less CPU time for the parent process, and is unlikely to bump into any limit.
shm-typed-array is a very simple module that seems suited to your application. Example:
parent.js
"use strict";
const shm = require('shm-typed-array');
const fork = require('child_process').fork;
// Create shared memory
const SIZE = 20000000;
const data = shm.create(SIZE, 'Float64Array');
// Fill with dummy data
Array.prototype.fill.call(data, 1);
// Spawn child, set up communication, and give shared memory
const child = fork("child.js");
child.on('message', sum => {
console.log(`Got answer: ${sum}`);
// Demo only; ideally you'd re-use the same child
child.kill();
});
child.send(data.key);
child.js
"use strict";
const shm = require('shm-typed-array');
process.on('message', key => {
// Get access to shared memory
const data = shm.get(key, 'Float64Array');
// Perform processing
const sum = Array.prototype.reduce.call(data, (a, b) => a + b, 0);
// Return processed data
process.send(sum);
});
Note that we are only sending a small "key" from the parent to the child process through IPC, not the whole data. Thus, we save a ton of memory and time.
Of course, you can change 'Float64Array' (e.g. a double) to whatever typed array your application requires. Note that this library in particular only handles single-dimensional typed arrays, but that should only be a minor obstacle; see the flattening sketch below.
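For example (purely illustrative, on top of the answer above), a 2D numeric array can be flattened into the shared buffer with simple index arithmetic; the row and column counts below are made up:
const shm = require('shm-typed-array');
// Assumed dimensions of the 2D array of numbers
const ROWS = 100000;
const COLS = 100;
// One flat shared buffer holds the whole 2D array in row-major order
const flat = shm.create(ROWS * COLS, 'Float64Array');
// Element (r, c) of the 2D array lives at index r * COLS + c
function setCell(r, c, value) {
    flat[r * COLS + c] = value;
}
// Reading back works the same way after shm.get(key, 'Float64Array') in the child
function getCell(r, c) {
    return flat[r * COLS + c];
}
setCell(0, 0, 3.14);
console.log(getCell(0, 0)); // 3.14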
I too was able to reproduce the delay you were experiencing, but maybe not as bad as yours. I used the following:
// main.js
const fork = require('child_process').fork
const child = fork('./getStats.js')
const dataAsNumbers = Array(100000).fill(0).map(() =>
Array(100).fill(0).map(() => Math.round(Math.random() * 100)))
child.send({
dataAsNumbers: dataAsNumbers,
})
And
// getStats.js
process.on('message', function (data) {
console.log('data is ', data)
process.exit(0)
})
node main.js 2.72s user 0.45s system 103% cpu 3.045 total
I'm generating 100k elements composed of 100 numbers each to mock your data; make sure you are listening for the message event on process. But maybe your children do more complex work, which might be the reason for the failure; it also depends on the timeout you set on your query.
If you want to get better results, you could chunk your data into multiple pieces that are sent to the child process and reconstructed there to form the initial array, along the lines of the sketch below.
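A minimal sketch of that chunking idea, assuming the child simply collects pieces until the parent signals it is done (the chunk size and message shape are assumptions):
// main.js (sketch)
const fork = require('child_process').fork;
const child = fork('./getStats.js');
const dataAsNumbers = Array(100000).fill(0).map(() =>
    Array(100).fill(0).map(() => Math.round(Math.random() * 100)));
const CHUNK_SIZE = 1000; // assumed number of rows per message
for (let i = 0; i < dataAsNumbers.length; i += CHUNK_SIZE) {
    child.send({ chunk: dataAsNumbers.slice(i, i + CHUNK_SIZE) });
}
child.send({ done: true });
// getStats.js (sketch)
const rows = [];
process.on('message', function (msg) {
    if (msg.done) {
        // All chunks received; rows now holds the reconstructed array
        console.log('received rows:', rows.length);
        process.exit(0);
    } else {
        Array.prototype.push.apply(rows, msg.chunk);
    }
});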
Another possibility would be to use a third-party library or protocol, even if it's a bit more work. You could have a look at messenger.js or even something like an AMQP queue, which would allow you to communicate between the two processes with a pool and a guarantee that each message has been acknowledged by the sub process. There are a few Node implementations of it, like amqp.node, but it would still require a bit of setup and configuration work.
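As a rough illustration of the AMQP route (assuming a RabbitMQ server on localhost and the amqplib client; the queue name and message shape are invented):
// producer.js (sketch): push the work onto a durable queue
const amqp = require('amqplib');
async function publish(dataAsNumbers) {
    const conn = await amqp.connect('amqp://localhost');
    const ch = await conn.createChannel();
    await ch.assertQueue('stats-jobs', { durable: true });
    ch.sendToQueue('stats-jobs', Buffer.from(JSON.stringify(dataAsNumbers)), { persistent: true });
    await ch.close();
    await conn.close();
}
// consumer.js (sketch): the worker acknowledges each job once it has been processed
const amqp = require('amqplib');
async function consume() {
    const conn = await amqp.connect('amqp://localhost');
    const ch = await conn.createChannel();
    await ch.assertQueue('stats-jobs', { durable: true });
    ch.consume('stats-jobs', function (msg) {
        const data = JSON.parse(msg.content.toString());
        // ... compute statistics on data ...
        ch.ack(msg); // tells the broker the job was handled
    });
}
consume();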
Use an in-memory cache like https://github.com/ptarjan/node-cache, and let the parent process store the array contents under some key; the child process would then retrieve the contents through that key.
You could consider using OS pipes (you'll find a gist here) as an input to your node child application.
I know this is not exactly what you're asking for, but you could use the cluster module (included in node). This way you can get as many instances as your machine has cores to speed up processing. Moreover, consider using streams if you don't need to have all the data available before you start processing. If the data to be processed is too large, I would store it in a file so you can reinitialize if there is any error during the process.
Here is an example of clustering.
var cluster = require('cluster');
var numCPUs = 4;
if (cluster.isMaster) {
for (var i = 0; i < numCPUs; i++) {
var worker = cluster.fork();
console.log('id', worker.id)
}
} else {
doSomeWork()
}
function doSomeWork(){
for (var i=1; i<10; i++){
console.log(i)
}
}
More info on sending messages across workers: see question 8534462; a small sketch follows.
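A small sketch of that message passing between the master and a worker (the payload is invented for illustration):
const cluster = require('cluster');
if (cluster.isMaster) {
    const worker = cluster.fork();
    // Receive the result back from the worker
    worker.on('message', function (msg) {
        console.log('worker replied:', msg.sum);
        worker.kill();
    });
    // Send a small, illustrative chunk of numbers to the worker
    worker.send({ numbers: [1, 2, 3, 4] });
} else {
    process.on('message', function (msg) {
        const sum = msg.numbers.reduce(function (a, b) { return a + b; }, 0);
        process.send({ sum: sum });
    });
}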
Why do you want to make a subprocess? Sending the data across subprocesses is likely to cost more in CPU and real time than you will save by keeping the processing within the same process.
Instead, I would suggest that, for really efficient coding, you consider doing your statistics calculations in a worker thread that runs within the same memory as the nodejs main process.
You can use NAN to write C++ code that you can post to a worker thread, and then have that worker thread post the result and an event back to your nodejs event loop when done.
The benefit of this is that you don't spend extra time sending the data across to a different process; the downside is that you will have to write a bit of C++ code for the threaded action, though the NAN extension should take care of most of the difficult work for you.
To address the performance issue when passing large data to the child process, save the data to a .json or .txt file and pass only the filename to the child process; a rough sketch follows. I've achieved a 70% performance improvement with this approach.
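Roughly, that could look like the following; the temp-file path and script names are placeholders, not something from the original answer:
// parent.js (sketch): write the array to disk, pass only the path
const fs = require('fs');
const os = require('os');
const path = require('path');
const fork = require('child_process').fork;
const dataAsNumbers = [[1, 2], [3, 4]]; // stand-in for the large 2D array
const filePath = path.join(os.tmpdir(), 'dataAsNumbers.json');
fs.writeFileSync(filePath, JSON.stringify(dataAsNumbers));
const child = fork('./getStatistics.js', [filePath]);
child.on('message', function (result) { console.log('result:', result); });
// getStatistics.js (sketch): read the path from argv and load the data
const fs = require('fs');
const dataAsNumbers = JSON.parse(fs.readFileSync(process.argv[2], 'utf8'));
// ... compute statistics ...
process.send({ rows: dataAsNumbers.length });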
For long-running tasks you could use something like Gearman. You can do the heavy work on workers; that way you can set up as many workers as you need. For example, I do some file processing this way: if I need to scale, I create more worker instances, and I have different workers for different tasks (processing zip files, generating thumbnails, etc.). The nice part is that the workers can be written in any language (node.js, Java, Python) and can be integrated into your project with ease.
// worker-unzip.js
const debug = require('debug')('worker:unzip');
const {series, apply} = require('async');
const gearman = require('gearmanode');
const {mkdirpSync} = require('fs-extra');
const extract = require('extract-zip');
module.exports.unzip = unzip;
module.exports.worker = worker;
function unzip(inputPath, outputDirPath, done) {
debug('unzipping', inputPath, 'to', outputDirPath);
mkdirpSync(outputDirPath);
extract(inputPath, {dir: outputDirPath}, done);
}
/**
*
* @param {Job} job
*/
function workerUnzip(job) {
const {inputPath, outputDirPath} = JSON.parse(job.payload);
series([
apply(unzip, inputPath, outputDirPath),
(done) => job.workComplete(outputDirPath)
], (err) => {
if (err) {
console.error(err);
job.reportError();
}
});
}
function worker(config) {
const worker = gearman.worker(config);
if (config.id) {
worker.setWorkerId(config.id);
}
worker.addFunction('unzip', workerUnzip, {timeout: 10, toStringEncoding: 'ascii'});
worker.on('error', (err) => console.error(err));
return worker;
}
A simple index.js:
const unzip = require('./worker-unzip').worker;
unzip(config); // pass host and port of the Gearman server
I normally run workers with PM2
The integration with your code is very easy, something like:
//initialize
const gearman = require('gearmanode');
gearman.Client.logger.transports.console.level = 'error';
const client = gearman.client(configGearman); // same host and port
Then just add work to the queue, passing the name of the function:
const taskpayload = {inputPath: '/tmp/sample-file.zip', outputDirPath: '/tmp/unzip/sample-file/'}
const job = client.submitJob('unzip', JSON.stringify(taskpayload));
job.on('complete', jobCompleteCallback);
job.on('error', jobErrorCallback);

how to run a batch file in Node.js with input and get an output

In Perl, if you need to run a batch file, it can be done with the following statement.
system "tagger.bat < input.txt > output.txt";
Here, tagger.bat is a batch file, input.txt is the input file and output.txt is the output file.
I'd like to know whether it is possible to do this in Node.js or not. If yes, how?
You will need to create a child process. Unlike Perl's system call, node.js is asynchronous, meaning it doesn't wait for script.bat to finish. Instead, it calls functions you define when script.bat prints data or exits:
// Child process is required to spawn any kind of asynchronous process
var childProcess = require("child_process");
// This line initiates bash
var script_process = childProcess.spawn('/bin/bash',["test.sh"],{env: process.env});
// Echoes any command output
script_process.stdout.on('data', function (data) {
console.log('stdout: ' + data);
});
// Error output
script_process.stderr.on('data', function (data) {
console.log('stderr: ' + data);
});
// Process exit
script_process.on('close', function (code) {
console.log('child process exited with code ' + code);
});
Apart from assigning events to the process, you can connect the process's stdin and stdout streams to other streams. This means other processes, HTTP connections, or files, as shown below:
// Pipe input and output to files
var fs = require("fs");
var output = fs.createWriteStream("output.txt");
var input = fs.createReadStream("input.txt");
// Connect process output to file input stream
script_process.stdout.pipe(output);
// Connect data from file to process input
input.pipe(script_process.stdin);
Then we just make a test bash script test.sh:
#!/bin/bash
input=`cat -`
echo "Input: $input"
And test text input input.txt:
Hello world.
After running node test.js we get this in the console:
stdout: Input: Hello world.
child process exited with code 0
And this in output.txt:
Input: Hello world.
The procedure on Windows will be similar; I just think you can call the batch file directly:
var script_process = childProcess.spawn('test.bat',[],{env: process.env});

Running multiple threads in parallel using Webworker-Threads for Node.js

I am VERY new to using Node.js and WebWorker-Threads for Node.js (https://github.com/audreyt/node-webworker-threads). The WebWorker-Threads module is based on Thread-A-Gogo (https://github.com/xk/node-threads-a-gogo).
Here is a summary of my problem:
1. I have a file called "readFile.js", which contains code to read a csv file that is passed in and convert the csv data into a 2D array.
2. I want the function "exports.parseCSV" within "readFile.js" to be loaded and executed in multiple worker threads, with each thread running in parallel.
3. I have a file, "test1.js", which attempts to do point 2 by using a thread pool:
var Threads = require('webworker-threads');
var parser = require('./readFile.js');
var numThreads = 3;
var threadPool = Threads.createPool(numThreads);
threadPool.all.eval(parser.parseCSV);
for (var i = 1; i <=3; i++) {
(function(i) {
threadPool.any.eval(parser.parseCSV("csvFile.csv"), function (err, val) {
if (i == 3) {
console.log("bye!");
threadPool.destroy();
}
});
})(i);
}
I have another file, "test2.js", which attempts to do point 2 by
manually creating threads:
function cb (err, result) {
if (err) {
console.log("\n" + " ERROR! ERROR! ERROR!");
throw err;
}
console.log(" NO ERROR!");
thread.destroy();
}
var Threads = require('webworker-threads');
var parser = require('./readFile.js');
var thread = Threads.create();
thread.eval(parser.parseCSV);
thread.eval(parser.parseCSV("csvFile.csv"), cb);
thread.eval(parser.parseCSV);
thread.eval(parser.parseCSV("csvFile.csv"), cb);
thread.eval(parser.parseCSV);
thread.eval(parser.parseCSV("csvFile.csv"), cb);
When I run a "test[x].js" file using the command node test[x].js in the command prompt, the threads seem to be working but it appears as if they are being executed sequentially, and not in parallel. I say this based on the outputs of the time each thread starts, which I print out to command prompt window.
How do I execute multiple Webworker-Threads which run in parallel in the background? I want the threads to start at the same time and end at the same time (if that is possible). I have looked at the API for WebWorker-Threads but I haven't been able to solve this problem. Once this works, I would like to use the same methodology not only to read the csv files, but also to store their contents into a database.
Any help would be greatly appreciated! Thank you very much :)

NodeJS exec with binary from and to the process

I'm trying to write a function that would use native openssl to do some RSA heavy lifting for me, rather than using a js RSA library. The target is to:
Read binary data from a file
Do some processing in the node process, using JS, resulting in a Buffer containing binary data
Write the buffer to the stdin stream of the exec command
RSA encrypt/decrypt the data and write it to the stdout stream
Get the input data back to a Buffer in the JS-process for further processing
The child process module in Node has an exec command, but I fail to see how I can pipe the input to the process and pipe it back to my process. Basically I'd like to execute the following type of command, but without having to rely on writing things to files (didn't check the exact syntax of openssl)
cat the_binary_file.data | openssl -encrypt -inkey key_file.pem -certin > the_output_stream
I could do this by writing a temp file, but I'd like to avoid it, if possible. Spawning a child process allows me access to stdin/stdout, but I haven't found this functionality for exec.
Is there a clean way to do this in the way I drafted here? Is there some alternative way of using openssl for this, e.g. some native bindings for openssl lib, that would allow me to do this without relying on the command line?
You've mentioned spawn but seem to think you can't use it. Possibly showing my ignorance here, but it seems like it should be just what you're looking for: Launch openssl via spawn, then write to child.stdin and read from child.stdout. Something very roughly like this completely untested code:
var util = require('util'),
spawn = require('child_process').spawn;
function sslencrypt(buffer_to_encrypt, callback) {
var ssl = spawn('openssl', ['-encrypt', '-inkey', 'key_file.pem', '-certin']),
result = new Buffer(SOME_APPROPRIATE_SIZE),
resultSize = 0;
ssl.stdout.on('data', function (data) {
// Save up the result (or perhaps just call the callback repeatedly
// with it as it comes, whatever)
if (data.length + resultSize > result.length) {
// Too much data, our SOME_APPROPRIATE_SIZE above wasn't big enough
}
else {
// Append to our buffer
data.copy(result, resultSize);
resultSize += data.length;
}
});
ssl.stderr.on('data', function (data) {
// Handle error output
});
ssl.on('exit', function (code) {
// Done, trigger your callback (perhaps check `code` here)
callback(result, resultSize);
});
// Write the buffer
ssl.stdin.write(buffer_to_encrypt);
}
You should be able to set the encoding to binary when you make a call to exec, like so:
exec("openssl output_something_in_binary", {encoding: 'binary'}, function(err, out, err) {
//do something with out - which is in the binary format
});
If you want to write out the content of "out" in binary, make sure to set the encoding to binary again, like so:
fs.writeFile("out.bin", out, {encoding: 'binary'}, function (err) { if (err) throw err; });
I hope this helps!
