read/parse json data from socket in node - javascript

I have a node.js net TCP server, where I receive JSON data from the client(s). I would like to buffer it until the whole JSON block arrives, then maybe parse it, or maybe just forward it somewhere else. What modules are out there that are compatible with the latest node (0.6.x)?
Should be fast, and if it's pure js that's better for me.
b.

Hmm, the net module gives you the socket and buffering, and JSON.parse() is the parser. Sorry, it seems obvious, but...
For forwarding, check out:
https://github.com/nodejitsu/node-http-proxy

I looked at this issue for quite a few hours before thinking about it again and solving it with a fairly easy and straightforward solution.
My application receives UTF-8 encoded JSON data, sometimes quite large. The JSON documents arrive split into chunks, which I need to reassemble before I can parse them successfully.
My solution is a simple string buffer variable that I fill with the incoming data until I receive a newline ("\n"). Then I "deliver" the buffered data to be parsed and continue filling the buffer with the next messages.
My code is as follows:
var buffer = '';
client.setEncoding('utf8'); // so 'data' arrives as a string rather than a Buffer

client.on('data', function(data) {
    if (data.indexOf('\n') < 0) {
        buffer += data; // no delimiter yet, keep buffering
    } else {
        var msg = buffer + data.substring(0, data.indexOf('\n'));
        buffer = data.substring(data.indexOf('\n') + 1);
        console.log('Sending msg: ' + msg);
    }
});
This solution only works because my protocol is based on utf-8 and has '\n' as the delimiter.
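For reference, a slightly more defensive sketch of the same idea (not from the original answer) drains every complete line in the buffer, so several messages arriving in a single 'data' event are each parsed; handleMessage is just a placeholder for whatever you do with a parsed object:
var buffer = '';
client.setEncoding('utf8');

client.on('data', function(chunk) {
    buffer += chunk;
    var index;
    // process every complete, newline-terminated message in the buffer
    while ((index = buffer.indexOf('\n')) >= 0) {
        var line = buffer.substring(0, index);
        buffer = buffer.substring(index + 1);
        if (line.length === 0) continue; // skip empty lines
        try {
            handleMessage(JSON.parse(line)); // handleMessage is a placeholder
        } catch (err) {
            console.error('Invalid JSON line: ' + err.message);
        }
    }
});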

Related

Cannot write to JSON in Nodejs

I'm trying to make a live JSON database of IDs and a tag. The database refreshes by reading from the JSON file, and I have traced my problem to Node.js not writing to disk, though I don't quite know why.
This is my reading operation, and yes there is a file there with proper syntax.
let dbraw = fs.readFileSync('db.json');
var db = JSON.parse(dbraw);
This is my writing operation, where I need to save the updates to disk.
var authorid = msg.author.id
db[authorid] = "M";
fs.writeFileSync('db.json', JSON.stringify(db));
Am I doing something wrong? Is it a simple error I am just forgetting? Is there an easier/more efficient way to do this I am forgetting about? I can't seem to figure out what exactly is going wrong, but it has something to do with these two bits. There are no errors in my console, just the blank JSON file it reads every time on the Read Operation.
There is a problem with your JSON file's path.
Try using __dirname.
__dirname tells you the absolute path of the directory containing the currently executing file.
— source (DigitalOcean)
Example:
If the JSON file is in the root directory:
let dbraw = fs.readFileSync(__dirname + '/db.json');
var db = JSON.parse(dbraw);
If the JSON file is in a subdirectory:
let dbraw = fs.readFileSync(__dirname + '/myJsonFolder/' + 'db.json');
var db = JSON.parse(dbraw);
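As a small aside (my suggestion, not part of the original answer), path.join does the same job without hand-assembling separators:
const path = require('path');
const fs = require('fs');

let dbraw = fs.readFileSync(path.join(__dirname, 'myJsonFolder', 'db.json'));
var db = JSON.parse(dbraw);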
Side note: I suggest you read about Google Firestore, as it will be a faster way to work with real time updates.
Here's a simple block that does what is desired
const fs = require('fs');
let file_path = __dirname + '/db.json',
dbraw = fs.readFileSync(file_path),
db = JSON.parse(dbraw),
authorid = 'abc';
console.log(db);
db[authorid] = "M";
fs.writeFileSync(file_path, JSON.stringify(db));
dbraw = fs.readFileSync(file_path); db = JSON.parse(dbraw); // re-read to verify the write
console.log(db);
I've added a couple of log statements for debugging. This works and so there may be something else that's missing or incorrect in your flow. The most probable issue would be that of different path references as pointed out by jfriend00 in the comment to your question.
As for better solutions, here are a few suggestions:
use require for the JSON file directly instead of a file read if the file is small, which will do the parsing for you
use the async fs functions (see the sketch after this list)
stream the file if it's big in size
see if you can use a cache like Redis or a database as the storage layer, to reduce your app's serialization and deserialization overhead
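A minimal sketch of the async variant, using the callback-style fs API (the file path and the 'abc' key are the same placeholders as above):
const fs = require('fs');
const path = require('path');

const file_path = path.join(__dirname, 'db.json');

fs.readFile(file_path, 'utf8', function (err, dbraw) {
    if (err) return console.error('read failed: ' + err.message);
    var db = JSON.parse(dbraw);
    db['abc'] = 'M'; // 'abc' stands in for msg.author.id
    fs.writeFile(file_path, JSON.stringify(db), function (err) {
        if (err) return console.error('write failed: ' + err.message);
        console.log('saved', db);
    });
});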

Parsing a large JSON array in Javascript

I'm supposed to parse a very large JSON array in Javascript. It looks like:
mydata = [
{'a':5, 'b':7, ... },
{'a':2, 'b':3, ... },
.
.
.
]
Now the thing is, if I pass this entire object to my parsing function parseJSON(), then of course it works, but it blocks the tab's process for 30-40 seconds (in case of an array with 160000 objects).
During this entire process of requesting this JSON from a server and parsing it, I'm displaying a 'loading' gif to the user. Of course, after I call the parse function, the gif freezes too, leading to bad user experience. I guess there's no way to get around this time, is there a way to somehow (at least) keep the loading gif from freezing?
Something like calling parseJSON() on chunks of my JSON every few milliseconds? I'm unable to implement that though being a noob in javascript.
Thanks a lot, I'd really appreciate if you could help me out here.
You might want to check this link. It's about multithreading.
Basically :
var url = 'http://bigcontentprovider.com/hugejsonfile';
var f = '(function() {' +
        '    send = function(e) {' +
        '        postMessage(e);' +
        '        self.close();' +
        '    };' +
        '    importScripts("' + url + '?format=json&callback=send");' +
        '})();';
var _blob = new Blob([f], { type: 'text/javascript' });
var _worker = new Worker(window.URL.createObjectURL(_blob));
_worker.onmessage = function(e) {
    // Do what you want with your JSON (it arrives as e.data)
};
_worker.postMessage(''); // an argument is required by some browsers
Haven't tried it myself to be honest...
EDIT about portability: Sebastien D. posted a comment with a link to MDN; I have added a reference to its browser compatibility section.
I have never encountered a complete page lock-down of 30-40 seconds; I'm almost impressed! Restructuring your data to be much smaller, or splitting it into many files on the server side, is the real answer. Do you actually need every little byte of the data?
Alternatively, if you can't change the file, @Cyrill_DD's answer of a worker thread will be able to parse the data for you and send it to your primary JS. This is not a perfect fix, as you would guess. Passing data between the two threads requires the information to be serialised and reinterpreted, so you could see a significant slowdown when the data is passed between the threads and be back to square one if you try to pass all the data across at once. Building a query system into your worker thread for requesting chunks of the data when you need them, and using the message callback, will prevent the slowdown from parsing on the main thread and allow you complete access to the data without loading it all into your main context.
I should add that worker threads are relatively new; desktop browser support is good, but mobile is terrible... just a heads-up!
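To make the "process it in chunks" idea from the question concrete, here is a minimal sketch (processInChunks, handleItem and hideLoadingGif are illustrative names, and it assumes the array has already been parsed; only the per-item work is sliced up so the browser can repaint the loading gif between slices):
function processInChunks(items, handleItem, chunkSize, done) {
    var index = 0;
    function nextChunk() {
        var end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) {
            handleItem(items[index]); // your per-object work goes here
        }
        if (index < items.length) {
            setTimeout(nextChunk, 0); // yield so the gif keeps animating
        } else if (done) {
            done();
        }
    }
    nextChunk();
}

// usage: processInChunks(mydata, function(obj) { /* ... */ }, 1000, hideLoadingGif);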

Socket.io: How to limit the size of emitted data from client to the websocket server

I have a node.js server with socket.io. My clients use socket.io to connect to the node.js server.
Data is transmitted from clients to server in the following way:
On the client
var Data = {'data1':'somedata1', 'data2':'somedata2'};
socket.emit('SendToServer', Data);
On the server
socket.on('SendToServer', function(Data) {
for (var key in Data) {
// Do some work with Data[key]
}
});
Suppose that somebody modifies his client and emits to the server a really big chunk of data. For example:
var Data = {'data1':'somedata1', 'data2':'somedata2', ...and so on until he reach for example 'data100000':'data100000'};
socket.emit('SendToServer', Data);
Because of this loop on the server...
for (var key in Data) {
// Do some work with Data[key]
}
... the server would take a very long time to loop through all this data.
So, what is the best solution to prevent such scenarios?
Thanks
EDIT:
I used this function to validate the object:
function ValidateObject(obj) {
    var i = 0;
    for (var key in obj) {
        i++;
        if (i > 10) { // object is too big
            return false;
        }
    }
    return true; // object is within the limit
}
So the easiest thing to do is just check the size of the data before doing anything with it.
socket.on('someevent', function (data) {
    if (JSON.stringify(data).length > 10000) // roughly 10 kB
        return;
    console.log('valid data: ' + data);
});
To be honest, this is a little inefficient. Your client sends the message, socket.io parses the message into an object, and then you get the event and turn it back into a String.
If you want to be even more efficient then on the client side you should be enforcing max lengths of messages.
For even more efficiency (and to protect against malicious users), you should discard packets as they come into Socket.io if they get too long. You'll either need to figure out a way to extend the prototypes to do what you want, or you'll need to pull the source and modify it yourself. I haven't looked into the socket.io protocol, but I'm sure you'll have to do more than just "discard" the packet. Also, some packets are ack-backs and nack-backs, so you don't want to mess with those either.
Side note: If you ONLY care about the number of keys then you can use Object.keys(obj) which returns an array of keys:
if (Object.keys(obj).length > 10)
return;
You may want to consider switching to socket.io-stream and handling the input stream directly.
This way you have to join the chunks and parse the JSON input yourself, but you get the chance to close the connection as soon as the input length exceeds whatever threshold you decide on.
Otherwise (staying with the plain socket.io approach) your callback won't be called until the whole message has been received. This doesn't block your JS main thread, but it wastes memory, CPU and bandwidth.
On the other hand, if your only goal is to avoid overloading your processing algorithm, you can keep limiting it by counting the elements in the received object. For instance:
if (Object.keys(data).length > n) return; // Where n is your maximum acceptable number of elements.
// But, anyway, this doesn't control the actual size of each element.
EDIT: Because the question is about "how to handle server overload", you should look at load balancing with nginx, http://nginx.com/blog/nginx-nodejs-websockets-socketio/ - you could have additional servers available in case one client creates a bottleneck. Even if you solve this problem, there are still others, such as a client sending many small packets, and so on.
The Socket.io library seems to be a bit problematic here: limiting overly large messages is not available at the websocket layer. There was a pull request three years ago which gives an idea of how it might be solved:
https://github.com/Automattic/socket.io/issues/886
However, because the WebSocket protocol has a finite frame size, you could stop processing a packet once a certain size has been reached. The most effective place to do this is before the packet is transformed onto the JavaScript heap. This means you would have to handle the WebSocket framing manually - this is what socket.io does for you, but it does not take the packet size into account.
If you want to implement your own websocket layer, this WebSocket-Node implementation might be useful:
https://github.com/theturtle32/WebSocket-Node
If you do not need to support older browsers, this pure-websockets approach might be a suitable solution.
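A rough sketch of that approach with WebSocket-Node (assuming the maxReceivedFrameSize / maxReceivedMessageSize server options behave as described in that library's documentation; the limits and port are placeholders):
var http = require('http');
var WebSocketServer = require('websocket').server;

var httpServer = http.createServer();
httpServer.listen(8080);

var wsServer = new WebSocketServer({
    httpServer: httpServer,
    // frames/messages above these limits are rejected before they reach your handler
    maxReceivedFrameSize: 64 * 1024,      // 64 KiB per frame
    maxReceivedMessageSize: 1024 * 1024   // 1 MiB per assembled message
});

wsServer.on('request', function (request) {
    var connection = request.accept(null, request.origin);
    connection.on('message', function (message) {
        if (message.type !== 'utf8') return;
        var data = JSON.parse(message.utf8Data);
        // work with data here
    });
});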
Well, I'll go with the JavaScript side of things... let's say you don't want to allow users to go over a certain amount of data; you can just:
var allowedSize = 10;
Object.keys(Data).map(function(key, idx) {
    if (idx > allowedSize) return;
    // Do some work with Data[key]
});
This not only lets you cycle through the elements of your object properly, it also lets you limit them easily. (Obviously this can also ruin your own pre-set requests.)
Maybe destroy buffer size is what you need.
From the wiki:
destroy buffer size defaults to 10E7
Used by the HTTP transports. The Socket.IO server buffers HTTP request bodies up to this limit. This limit is not applied to websocket or flashsockets.
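If you are on the socket.io 0.9.x line, that setting was applied through the io.set() configuration call; a minimal sketch, with the 1 MB value purely as an example:
var io = require('socket.io').listen(8080);

io.configure(function () {
    // cap buffered HTTP request bodies at ~1 MB instead of the 10E7 default
    io.set('destroy buffer size', 1024 * 1024);
});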

Javascript json eval() injection

I am making an AJAX chat room with the guidance of an AJAX book that teaches me to use JSON and the eval() function.
This chat room has a normal chat function and a whiteboard feature.
When a normal text message comes from the PHP server in JSON format, the JavaScript in the browser does this:
Without Whiteboard Command -------------------------------------------
function importServerNewMessagesSince(msgid) {
    // loadText() is going to return me a JSON object from the server
    // it is an array of {id, author, message}
    var latest = loadText("get_messages_since.php?message=" + msgid);
    var msgs = eval(latest);
    for (var i = 0; i < msgs.length; i++) {
        var msg = msgs[i];
        displayMessage(escape(msg.id), escape(msg.author), escape(msg.contents));
    } ...
The whiteboard drawing commands are sent by server in JSON format with special user name called "SVR_CMD", now the javascript is changed slightly:
With Whiteboard Command --------------------------------------------------
function importServerNewMessagesSince(msgid) {
    // loadText() is going to return me a JSON object from the server
    // it is an array of {id, author, message}
    var latest = loadText("get_messages_since.php?message=" + msgid);
    var msgs = eval(latest);
    for (var i = 0; i < msgs.length; i++) {
        var msg = msgs[i];
        if (msg.author == "SVR_CMD") {
            eval(msg.contents); // <-- Problem here ...
            // I have a javascript drawLine() function to handle the whiteboard drawing
            // server command sends a JSON function call like this:
            // "drawLine(200,345,222,333)" and eval() is going to parse and execute it
            // It is a hacker invitation to use eval() as someone in the chat room can
            // insert a piece of javascript code and send it using the name SVR_CMD
        } else {
            displayMessage(escape(msg.id), escape(msg.author), escape(msg.contents));
        }
    } ...
Now, if the hacker changes his username to SVR_CMD in the script and starts typing JavaScript code into the message input, then instead of drawLine(200,345,222,333) he is injecting redirectToMyVirusSite(), and eval() will just run it for him in everyone's browser in the chat room.
So, as you can see, letting eval() execute a command coming from another client in the chat room is obviously an invitation to hackers. I understand the book I followed is only meant to be an introduction to the functions. How do we do this properly with JSON in a real situation?
For example, is there a server-side PHP or .NET function to JavaScript-encode/escape the data, to make sure no hacker can send a valid piece of JavaScript code to another client's browser to be eval()'d? Or is it safe to use eval() on JSON at all? It seems to be a powerful but evil function.
Thank you,
Tom
What is this book? eval is evil, there is not a single reason to use it, ever.
To transform a JSON string into a javascript object, you can do the following:
var obj = JSON.parse(latest)
Which means you can then use:
[].forEach.call(obj, function( o ) {
// You can use o.message, o.author, etc.
} )
To do the opposite (javascript object -> JSON string), the following works:
var json = JSON.stringify(obj)
It is only unsafe if the executed code is generated by other clients and not by the server. Of course you would need to prevent anybody from using that name, though I don't understand why you would use the "author" field at all. Just send an object like {"whiteboard":"drawLine(x,y,z)"} instead of {"author":"SVR_CMD","contents":"drawLine(x,y,z)"}.
But it is right that eval() is still an invitation to hackers. One can always send invalid data and try to influence the output more or less directly. The only way out is a proper serialisation of the data you want to receive and send - the drawing data. How do you receive the whiteboard commands? There is no server-side "escape" function that makes JavaScript code "clean" - it would always be a security hole.
I would expect a serialisation like
message = {
    "author": "...", // carry the information /who/ draws
    "whiteboard": {
        "drawline": [200, 345, 222, 333]
    }
}
so you can sanitize the commands (here: "drawline") easily.
The use of eval() might be OK if you have very complex commands and want to reduce the transferred data by building them server-side. Still, you need to parse and validate the commands received from other clients properly. But I'd recommend finding a solution without eval.
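A sketch of that "sanitize the commands" idea without eval() (the whiteboardHandlers table and handleWhiteboard function are illustrative names; drawLine is the drawing function mentioned in the question):
// whitelist of whiteboard commands the client is willing to execute
var whiteboardHandlers = {
    drawline: function (args) {
        // args is expected to be [x1, y1, x2, y2]
        drawLine(args[0], args[1], args[2], args[3]);
    }
};

function handleWhiteboard(whiteboard) {
    for (var command in whiteboard) {
        var handler = whiteboardHandlers[command];
        if (handler && Array.isArray(whiteboard[command])) {
            handler(whiteboard[command]);
        }
        // unknown commands are simply ignored; nothing is ever eval()'d
    }
}

// usage with the message shape above:
// var message = JSON.parse(latest)[i];
// if (message.whiteboard) handleWhiteboard(message.whiteboard);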
Setting the eval() issue aside, do not use a field that can be filled in by the user (.author in your code) for authentication purposes. Add another field to your JSON message, say .is_server_command, that, when present, signifies special treatment of the message. This field does not depend on user input and thus cannot be hijacked by a "hacker".

Node.js JSON parsing error

I am attempting to make a Facebook application with node.js, however I'm having trouble in checking signed requests. Every time I make a request, the program throws a SyntaxError: Unexpected token ILLEGAL as such:
undefined:1
":"721599476"}
^^
SyntaxError: Unexpected token ILLEGAL
The culprit function is below:
var crypto = require('crypto');
// base64url here is the poster's own base64 helper module

function parse_signed_request(signed_request, secret) {
    var encoded_data = signed_request.split('.', 2);
    // decode the data
    var sig = encoded_data[0];
    var json = base64url.decode(encoded_data[1]);
    var data = JSON.parse(json); // ERROR occurs here!
    // check algorithm - not relevant to error
    if (!data.algorithm || data.algorithm.toUpperCase() != 'HMAC-SHA256') {
        console.error('Unknown algorithm. Expected HMAC-SHA256');
        return null;
    }
    // check sig - not relevant to error
    var expected_sig = crypto.createHmac('sha256', secret).update(encoded_data[1]).digest('base64').replace(/\+/g, '-').replace(/\//g, '_').replace('=', '');
    if (sig !== expected_sig) {
        console.error('Bad signed JSON signature!');
        return null;
    }
    return data;
}
Just for testing, a valid signed_request would be
WGvK-mUKB_Utg0l8gSPvf6smzacp46977pTtcRx0puE.eyJhbGdvcml0aG0iOiJITUFDLVNIQTI1NiIsImV4cGlyZXMiOjEyOTI4MjEyMDAsImlzc3VlZF9hdCI6MTI5MjgxNDgyMCwib2F1dGhfdG9rZW4iOiIxNTI1NDk2ODQ3NzczMDJ8Mi5ZV2NxV2k2T0k0U0h4Y2JwTWJRaDdBX18uMzYwMC4xMjkyODIxMjAwLTcyMTU5OTQ3NnxQaDRmb2t6S1IyamozQWlxVldqNXp2cTBmeFEiLCJ1c2VyIjp7ImxvY2FsZSI6ImVuX0dCIiwiY291bnRyeSI6ImF1In0sInVzZXJfaWQiOiI3MjE1OTk0NzYifQ
Why am I getting this error when it is valid JSON and simply using a static string of JSON will work fine, and are there any tips to fix this?
Thanks.
Ok, after a bit of testing I've fixed the problem myself, sorry for the wasted question.
Something in my base64 library wasn't decoding the string properly (although it appeared to be - so it must have been a non-displaying character or padding, etc.)
I've changed over to https://github.com/kriszyp/commonjs-utils/blob/master/lib/base64.js which suits my purposes, although it needed to be modified to support base64url decoding rather than plain base64, and it seems to work fine now.
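For reference, a base64url string can also be decoded with Node's built-in Buffer by mapping it back to regular base64 first (a sketch; Buffer.from is the modern API, the 0.x versions discussed in this thread used new Buffer() instead):
function base64urlDecode(input) {
    // map the URL-safe alphabet back to standard base64;
    // Node's base64 decoder tolerates the missing '=' padding
    var base64 = input.replace(/-/g, '+').replace(/_/g, '/');
    return Buffer.from(base64, 'base64').toString('utf8');
}

// usage:
// var json = base64urlDecode(signed_request.split('.')[1]);
// var data = JSON.parse(json);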
