Node.js virtual memory increases constantly while using child process - javascript

I'm using the spawn-command npm package to spawn a shell in which I run a binary that was originally built in C++. I write input to the binary's stdin, and the binary then emits output on stdout every second. On the Node side I have a listener, something like stdout.on('data', function (data) {}), that forwards this data to an SSE channel.
Everything works, but my major concern is that the memory of the Node process grows constantly every time I send input to the binary's stdin. I have outlined my code below. Is there an elegant way to control this memory growth?
var sseChannel = require('sse-channel'),
    spawnCommand = require('spawn-command'),
    cmd = 'path to the binary file',
    globalArray = [],
    uuid = require('uuid');
module.exports = function(app) {
    var child = spawnCommand(cmd),
        privateChannel = new sseChannel({
            historySize: 0,
            cors: {
                origins: ['*']
            },
            pingInterval: 15 * 1000,
            jsonEncode: false
        });
    srvc = {
        get: function(req, res) {
            globalArray[uuid.v4()] = res;
            child.stdin.write('a json object in a format that is expected by binary' + '\n'); // req.query.<queryVal>
            child.stdout.on('data', function(data) {
                privateChannel.send(JSON.stringify(data));
            });
        },
        delete: function(sessionID) {
            var response = globalArray[sessionID];
            privateChannel.removeClient(response);
            response.end();
            delete globalArray[sessionID];
        }
    }
}
This code only illustrates how it looks in the app; it is not runnable as a standalone snippet.
I collected heap dumps at two different times, and the statistics show a tremendous increase in the Typed Array value. What can be done to limit the growth of Typed Array?

The problem is that you spawn the process once but then add a new data event handler for every request to your HTTP server, and those handlers never get removed. This explains why the memory usage never drops, even after GC.
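As a minimal sketch of that fix, the data handler can be registered once when the service is set up, so each request only writes to stdin. The names createService, child and channel are hypothetical stand-ins for the module, the spawned process and the SSE channel from the question:

```javascript
// Hypothetical helper: wires the child's stdout to the channel exactly once.
function createService(child, channel) {
  // Registered a single time at setup, not once per request.
  child.stdout.on('data', (data) => {
    channel.send(data.toString());
  });

  return {
    get(payload) {
      // Per request we only write to stdin; no new listeners are added.
      child.stdin.write(JSON.stringify(payload) + '\n');
    },
  };
}
```

With this shape, the listener count on child.stdout stays at one no matter how many requests arrive.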
Another (unrelated) problem is that if you are using your single child process to process multiple incoming requests, you can run into the problem of mixing responses for different requests (you cannot assume that one data event will contain only the data for a particular request). If the child process is node.js-based, you could set up an ipc channel with it and then just pass regular JavaScript values back and forth instead of setting up stdout handling/parsing. If the child isn't node.js-based or you want an alternative (no-ipc) solution, you could set up a queue that all requests get pushed onto and then have a function that processes the queue and responds to each request serially (only moving onto the next request once you have somehow determined you have received all output from the child process for the current request).
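The queue idea could look roughly like the sketch below. It rests on a loud assumption that is not in the question: the child answers each request with exactly one newline-terminated line. createSerialClient is a hypothetical name, and a binary that streams output continuously would need a different end-of-response marker.

```javascript
// Assumption: one newline-terminated reply line per request.
function createSerialClient(child) {
  let buffered = '';             // output received but not yet consumed
  let current = null;            // resolver for the request in flight
  let tail = Promise.resolve();  // settles when the previous request is done

  // Single listener for the lifetime of the child process.
  child.stdout.on('data', (chunk) => {
    buffered += chunk.toString();
    const newline = buffered.indexOf('\n');
    if (newline !== -1 && current) {
      const line = buffered.slice(0, newline);
      buffered = buffered.slice(newline + 1);
      const resolve = current;
      current = null;
      resolve(line);
    }
  });

  return function send(message) {
    // The next write happens only after the previous reply has arrived.
    const result = tail.then(() => new Promise((resolve) => {
      current = resolve;
      child.stdin.write(message + '\n');
    }));
    tail = result.catch(() => {});
    return result;
  };
}
```

Because a chunk boundary can fall anywhere, the sketch buffers output and only resolves once a full line is available.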
If you instead meant for the child process to only be used for a single request, you will need to tweak your code to spawn once per request instead (moving spawn() inside get()).

Related

Call stack size exceeded on re-starting Node function

I'm trying to overcome a "Call stack size exceeded" error, but with no luck.
The goal is to re-run the GET request until I get music in the type field.
//tech: node.js + mongoose
//import components
const https = require('https');
const options = new URL('https://www.boredapi.com/api/activity');
//obtain data using GET
https.get(options, (response) => {
    //console.log('statusCode:', response.statusCode);
    //console.log('headers:', response.headers);
    response.on('data', (data) => {
        //process.stdout.write(data);
        apiResult = JSON.parse(data);
        apiResultType = apiResult.type;
        returnDataOutside(data);
    });
})
.on('error', (error) => {
    console.error(error);
});
function returnDataOutside(data){
    apiResultType;
    if (apiResultType == 'music') {
        console.log(apiResult);
    } else {
        returnDataOutside(data);
        console.log(apiResult); //Maximum call stack size exceeded
    };
};
Your function returnDataOutside() is calling itself recursively. If it doesn't get an apiResultType of 'music' the first time, it just keeps calling itself deeper and deeper until the stack overflows, with no chance of ever getting the music type because you're calling it with the same data over and over.
It appears that you want to rerun the GET request when you don't have the music type, but your code is not doing that; it's just calling your response function over and over. Instead, you need to put the code that makes the GET request into a function and call that function, which actually makes a fresh GET request, when the apiResultType isn't what you want.
In addition, you shouldn't code something like this that keeps going forever, hammering some server. You should have either a maximum number of tries, a timer back-off, or both.
Also, you can't assume that response.on('data', ...) delivers a perfectly formed piece of JSON. Unless the data is very small, it may arrive in chunks of arbitrary size, and it may take multiple data events to get your entire payload. Code like this may work on fast networks but not on slow ones, or through some proxies but not others. Instead, you have to accumulate the data from the entire response (all the data events that occur), concatenate it, and then process the final result on the end event.
While you can code the plain https.get() to collect all the results for you (there's an example of that right in the doc here), it's a lot easier to use a higher-level library that brings support for a bunch of useful things.
My favorite library to use in this regard is got(), but there's a list of alternatives here and you can find the one you like. Not only do these libraries accumulate the entire response for you without you writing any extra code, but they are promise-based, which makes the asynchronous coding easier, and they also automatically check status codes, follow redirects, and so on: many things you would want an HTTP request library to "just handle" for you.
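For readers who want to stay with plain https.get(), here is a hedged sketch of the two points above: accumulating the whole body before parsing, and making a fresh request with a retry cap and back-off. The helper names getJson and fetchUntilType and the default limits are hypothetical, not taken from the question.

```javascript
const https = require('https');

// Collects every chunk and parses only once the response has ended.
function getJson(url) {
  return new Promise((resolve, reject) => {
    https.get(url, (response) => {
      const chunks = [];
      response.on('data', (chunk) => chunks.push(chunk));
      response.on('end', () => {
        try {
          resolve(JSON.parse(Buffer.concat(chunks).toString('utf8')));
        } catch (err) {
          reject(err);
        }
      });
    }).on('error', reject);
  });
}

// Hypothetical helper: repeats a fresh request until the wanted type
// shows up or the attempts run out.
async function fetchUntilType(fetchOnce, wantedType, maxAttempts = 5, delayMs = 500) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = await fetchOnce(); // a new request every time
    if (result.type === wantedType) return result;
    if (attempt < maxAttempts) {
      // Linear back-off so the server is not hammered.
      await new Promise((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
  return null; // gave up
}

// Possible usage (makes real network calls, so it is left commented out):
// fetchUntilType(() => getJson('https://www.boredapi.com/api/activity'), 'music')
//   .then((activity) => console.log(activity));
```

The loop replaces the recursion, so the call stack no longer grows with each retry.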

Receiving a stream from Rails 4.0 in a JS callback

I'm trying to transmit an image file from the server to the client, but my JavaScript callback fires before the stream closes. I'm doing this because sending it with a traditional render json: times out and takes way too long anyway. The stream takes much less time, but I can't get all the data before the callback fires.
controller code
def mytest
image=ImageList.new(AssistMe.get_url(image_url))
response.stream.write image.export_pixels(0, 0, image.columns, image.rows, 'RGBA').to_s
response.stream.close
end
javascript
var getStream, runTest;
runTest = function() {
    return $.post('/dotest', getStream);
};
getStream = function(params) {
    return document.getElementById('whatsup2').innerHTML =
        "stream is here " + params.length;
};
The response is an array. I can make it an array of arrays by adding a "[" at the front and a "],['finish'] at the end so that I can detect the end of the data, but I haven't been able to figure out how to get JavaScript to wait until the end of the stream before running. I assume I need to set up some kind of poll to check for the end, but how do I attach it to the callback?
Okay, here's a blog that describes this pretty well
blog
But I decided to forgo a stream and use .to_s. Since you can chain several calls together, as in
render object.method.method.to_s, you get all the server-side benefits of using a stream without the complexity. If you have a slow process where you need to overlap the client and server actions, then go to the blog and do it. Otherwise to_s covers it pretty well.

Socket.io: How to limit the size of emitted data from client to the websocket server

I have a node.js server with socket.io. My clients use socket.io to connect to the node.js server.
Data is transmitted from clients to server in the following way:
On the client
var Data = {'data1':'somedata1', 'data2':'somedata2'};
socket.emit('SendToServer', Data);
On the server
socket.on('SendToServer', function(Data) {
for (var key in Data) {
// Do some work with Data[key]
}
});
Suppose that somebody modifies his client and emits to the server a really big chunk of data. For example:
var Data = {'data1':'somedata1', 'data2':'somedata2', ...and so on until he reach for example 'data100000':'data100000'};
socket.emit('SendToServer', Data);
Because of this loop on the server...
for (var key in Data) {
// Do some work with Data[key]
}
... the server would take a very long time to loop through all this data.
So, what is the best solution to prevent such scenarios?
Thanks
EDIT:
I used this function to validate the object:
function ValidateObject(obj) {
    var i = 0;
    for(var key in obj) {
        i++;
        if (i > 10) { // object is too big
            return false;
        }
    }
    return true; // 10 keys or fewer: object is acceptable
}
So the easiest thing to do is just check the size of the data before doing anything with it.
socket.on('someevent', function (data) {
    if (JSON.stringify(data).length > 10000) // roughly 10 KB
        return;
    console.log('valid data:', data);
});
To be honest, this is a little inefficient. Your client sends the message, socket.io parses the message into an object, and then you get the event and turn it back into a String.
If you want to be even more efficient then on the client side you should be enforcing max lengths of messages.
For even more efficiency (and to protect against malicious users), as packets come into Socket.io, if the length gets too long, then you should discard them. You'll either need to figure a way to extend the prototypes to do what you want or you'll need to pull the source and modify it yourself. Also, I haven't looked into the socket.io protocol but I'm sure you'll have to do more than just "discard" the packet. Also, some packets are ack-backs and nack-backs so you don't want to mess with those, either.
Side note: If you ONLY care about the number of keys then you can use Object.keys(obj) which returns an array of keys:
if (Object.keys(obj).length > 10)
return;
You may want to consider switching to socket.io-stream and handling the input stream directly.
That way you have to join the chunks and parse the JSON input manually, but you get the chance to close the connection when the input length exceeds a threshold you choose.
Otherwise (staying with the plain socket.io approach) your callback won't be called until the whole data stream has been received. This doesn't block your JS main thread, but it wastes memory, CPU and bandwidth.
On the other hand, if your only goal is to avoid overloading your processing algorithm, you can keep limiting it by counting the elements in the received object. For instance:
if (Object.keys(data).length > n) return; // Where n is your maximum acceptable number of elements.
// But, anyway, this doesn't control the actual size of each element.
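To also bound the size of each element, the key-count check can be combined with a per-value length check. This is only a sketch: isAcceptablePayload and both limits are hypothetical, and it assumes the payload is a flat object of string values like the one in the question.

```javascript
// Hypothetical limits; tune them to your own payloads.
const MAX_KEYS = 10;
const MAX_VALUE_LENGTH = 100;

// Returns true only for a flat object with few keys and short string values.
function isAcceptablePayload(data) {
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return false;
  }
  const keys = Object.keys(data);
  if (keys.length > MAX_KEYS) return false;
  return keys.every(
    (key) => typeof data[key] === 'string' && data[key].length <= MAX_VALUE_LENGTH
  );
}
```

A handler could then start with if (!isAcceptablePayload(Data)) return; before looping over the keys. Note that this still runs only after socket.io has received and parsed the whole message.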
EDIT: Because the question is about "how to handle server overload", you should look into load balancing with nginx http://nginx.com/blog/nginx-nodejs-websockets-socketio/ - you could have additional servers in case one client is creating a bottleneck, so the other servers would still be available. Even if you solve this problem, there are still other problems, such as a client sending many small packets.
The Socket.io library seems to be a bit problematic here: limiting the size of messages is not available at the websockets layer. There was a pull request three years ago which gives an idea of how it might be solved:
https://github.com/Automattic/socket.io/issues/886
However, because the WebSocket protocol does have a finite packet size, it would allow you to stop processing packets once a certain size has been reached. The most effective place to do this is before the packet is transformed into objects on the JavaScript heap. This means that you would have to handle the WebSocket transform manually; this is what socket.io does for you, but it does not take the packet size into account.
If you want to implement your own websocket layer, this WebSocket implementation for Node might be useful:
https://github.com/theturtle32/WebSocket-Node
If you do not need to support older browsers, this pure-websockets approach might be a suitable solution.
Well, I'll go with the Javascript side of the thing... let's say you don't want to allow users to go over a certain limit of data, you can just:
var allowedSize = 10;
Object.keys(Data).forEach(function (key, idx) {
    if (idx >= allowedSize) return; // skip everything past the limit
    // Do some work with Data[key]
});
This not only lets you iterate over the elements of your object properly, it also lets you limit them easily (obviously, this can also break your own legitimate requests).
Maybe destroy buffer size is what you need.
From the wiki:
destroy buffer size defaults to 10E7
Used by the HTTP transports. The Socket.IO server buffers HTTP request bodies up to this limit. This limit is not applied to websocket or flashsockets.

Strange issue with socket.on method

I am facing a strange issue with calling socket.on methods from the Javascript client. Consider below code:
for(var i=0;i<2;i++) {
    var socket = io.connect('http://localhost:5000/');
    socket.emit('getLoad');
    socket.on('cpuUsage',function(data) {
        document.write(data);
    });
}
Here basically I am calling a cpuUsage event which is emitted by socket server, but for each iteration I am getting the same value. This is the output:
0.03549148310035006
0.03549148310035006
0.03549148310035006
0.03549148310035006
Edit: Server side code, basically I am using node-usage library to calculate CPU usage:
socket.on('getLoad', function (data) {
    usage.lookup(pid, function(err, result) {
        cpuUsage = result.cpu;
        memUsage = result.memory;
        console.log("Cpu Usage1: " + cpuUsage);
        console.log("Cpu Usage2: " + memUsage);
        /*socket.emit('cpuUsage',result.cpu);
        socket.emit('memUsage',result.memory);*/
        socket.emit('cpuUsage',cpuUsage);
        socket.emit('memUsage',memUsage);
    });
});
On the server side, however, I am getting different values for each emit and socket.on. I find it strange that this is happening. I tried setting data = null after each socket.on call, but it still prints the same value. I didn't know what phrase to search for, so I posted here. Can anyone please guide me?
Please note: I am primarily a Java developer and have little experience with JavaScript.
You are making the assumption that when you use .emit(), a subsequent .on() will wait for a reply, but that's not how socket.io works.
Your code basically does this:
it emits two getLoad messages directly after each other (which is probably why the returning value is the same);
it installs two handlers for a returning cpuUsage message being sent by the server;
This also means that each time you run your loop, you're installing more and more handlers for the same message.
Now I'm not sure what exactly it is you want. If you want to periodically request the CPU load, use setInterval or setTimeout. If you want to send a message to the server and want to 'wait' for a response, you may want to use acknowledgement functions (not very well documented, but see this blog post).
But you should assume that for each type of message, you should only call socket.on('MESSAGETYPE', ) once during the runtime of your code.
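The acknowledgement approach mentioned above might be sketched as follows. requestLoad and the timeout are hypothetical; the sketch relies on socket.io treating a function passed as the last argument to emit() as the acknowledgement callback, and it assumes the server handler is changed to call that callback instead of emitting cpuUsage.

```javascript
// Hypothetical helper: asks for the load and resolves with the value the
// server passes to the acknowledgement callback.
function requestLoad(socket, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error('no acknowledgement received')),
      timeoutMs
    );
    // A function given as the last argument acts as the acknowledgement.
    socket.emit('getLoad', (cpuUsage) => {
      clearTimeout(timer);
      resolve(cpuUsage);
    });
  });
}

// Server side (sketch, assuming the node-usage lookup from the question):
// socket.on('getLoad', function (ack) {
//   usage.lookup(pid, function (err, result) { ack(result.cpu); });
// });
```

Each call then gets exactly one reply, and no cpuUsage handler has to be installed per request.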
EDIT: here's an example client-side setup for a periodic poll of the data:
var socket = io.connect(...);
socket.on('connect', function() {
    // Handle the server response:
    socket.on('cpuUsage', function(data) {
        document.write(data);
    });
    // Start an interval to query the server for the load every 30 seconds:
    setInterval(function() {
        socket.emit('getLoad');
    }, 30 * 1000); // milliseconds
});
Use this line instead:
var socket = io.connect('iptoserver', {'force new connection': true});
Replace iptoserver with the actual ip to the server of course, in this case localhost.
Edit.
That is, if you want to create multiple clients.
Else you have to place your initiation of the socket variable before the for loop.
I suspected the call returns average CPU usage at the time of startup, which seems to be the case here. Checking the node-usage documentation page (average-cpu-usage-vs-current-cpu-usage) I found:
By default CPU Percentage provided is an average from the starting
time of the process. It does not correctly reflect the current CPU
usage. (this is also a problem with linux ps utility)
But If you call usage.lookup() continuously for a given pid, you can
turn on keepHistory flag and you'll get the CPU usage since last time
you track the usage. This reflects the current CPU usage.
It also gives an example of how to use it:
var pid = process.pid;
var options = { keepHistory: true }
usage.lookup(pid, options, function(err, result) {
});

How to wait for client response with socket.io?

I'm working on an online, turn-based game in order to teach myself Node.js and Socket.IO. Some aspects of the game are resolved server-side. At one point during one of these functions, the server may require input from the clients. Is there a way I can "pause" the resolution of the server's function in order to wait for the clients to respond (via a var x = window.prompt)?
Here's an idea of the code I'm working with:
Server:
for (some loop){
if (some condition){
request input via io.sockets.socket(userSocket[i]).emit('requestInput', data)
}
}
Client:
socket.on('requestInput', function (data) {
var input = window.prompt('What is your input regarding ' + data + '?');
//send input back to the server
socket.emit('refresh', input)
});
Any thoughts?
I don't think that is possible.
for (some loop){
if (some condition){
request input via io.sockets.socket(userSocket[i]).emit('requestInput', data)
/* Even if you were able to pause the execution here, there is no way to resume it when client emits the 'refresh' event with user input */
}
}
What you can do instead is emit all 'requestInput' events without pausing and save all responses you will get in socket.on('refresh',function(){}) event in an array, then you can process this array later. I don't know what your exact requirement is but let me know if that works.
Since you are emitting socket.emit('refresh', input) on the client side, you just need to set up a socket event listener on the server side as well. For example:
io.sockets.on('connection', function (socket) {
    socket.on('refresh', function (data) {
        console.log(data) //input
    });
})
I will also point out, so that you don't run into trouble down the line, that long-running or endless loops are a big no-no in Node. Node.js runs your code on a single thread, so you are blocking ALL clients for as long as your loop is running.
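One way to get the "pause" without blocking the thread is to wrap the request and its reply in a promise and await it. The sketch below is an assumption-laden illustration: requestInput and the timeout are hypothetical, and it expects the client to answer every requestInput with a single refresh event, as in the question.

```javascript
// Hypothetical helper: emits 'requestInput' and resolves with the client's
// next 'refresh' event, or rejects if the client stays silent.
function requestInput(socket, data, timeoutMs = 30000) {
  return new Promise((resolve, reject) => {
    function onRefresh(input) {
      clearTimeout(timer);
      resolve(input);
    }
    const timer = setTimeout(() => {
      socket.removeListener('refresh', onRefresh);
      reject(new Error('client did not respond in time'));
    }, timeoutMs);
    // once() removes the listener after the first reply, so handlers
    // do not pile up across turns.
    socket.once('refresh', onRefresh);
    socket.emit('requestInput', data);
  });
}

// Possible usage inside an async function on the server:
// for (const socket of playerSockets) {   // playerSockets is hypothetical
//   const input = await requestInput(socket, 'your move');
//   // resolve the game step with input
// }
```

While one await is pending, the event loop stays free to serve other clients.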
