How to set up a very fast node.js UDP server - javascript

I'm quite new to Node.js and I have a request for an application that would receive a payload of UDP packets and process it.
I'm talking about more than 400 messages per second (roughly 24,000 messages/minute).
I have written code to set up a UDP server (actually grabbed from the docs here: http://nodejs.org/api/all.html#all_udp_datagram_sockets) but it's losing around 5% of the packets.
What I really need to develop is a server that would get each packet and hand it to another worker to do the job with the message. But threading in node.js looks like a nightmare.
This is my code as is:
This is my core as is:
var dgram = require("dgram");
var fs = require("fs");

var stream = fs.createWriteStream("received.json", {
  flags: "w",
  encoding: "utf8",
  mode: 0o666 // legacy 0666 octal literals throw in strict mode; 0o666 is the modern form
});

var server = dgram.createSocket("udp4");

server.on("message", function (msg, rinfo) {
  // Note: logging every datagram is expensive at hundreds of messages
  // per second and is a likely contributor to the drops.
  console.log("server got: " + msg + " from " + rinfo.address + ":" + rinfo.port);
  stream.write(msg);
});

server.on("listening", function () {
  var address = server.address();
  console.log("server listening " + address.address + ":" + address.port);
});

server.bind(41234);
// server listening 0.0.0.0:41234

You're missing a concept: Node.js is not multi-threaded in the sense you mean; requests are handled in a single event-loop cycle. No other JavaScript thread exists, so no context switches happen. In a multi-core environment, you can create a cluster via node's cluster module; I have a blog post about this here.
You set the parent process to fork child processes, and ONLY the child processes should bind to the port. Your parent process will handle the load balancing between the children. A sketch of this split follows below.
Note: in my blog post I wrote i < os.cpus().length / 2; but it should be i < os.cpus().length;
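As a rough sketch of that parent/child split (hedged: the port 41234 comes from the question's code, handleMessage is a placeholder, and shared UDP sockets under cluster behave differently across platforms):
var cluster = require("cluster");
var dgram = require("dgram");
var os = require("os");

if (cluster.isMaster) {
  // Fork one worker per core; each worker shares the same UDP port.
  for (var i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  var server = dgram.createSocket("udp4");
  server.on("message", function (msg, rinfo) {
    handleMessage(msg, rinfo); // placeholder for your per-worker processing
  });
  // exclusive: false (the default) lets cluster workers share the port.
  server.bind({ port: 41234, exclusive: false });
}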

I wrote a soap/xml forwarding service with a similar structure, and found that the info would come in 2 packets. I needed to update my code to detect the 2 halves of the message and put them back together. This payload-size issue may be more of an HTTP issue than a UDP issue, but my suggestion is that you add logging to write out everything you are receiving and then go over it with a fine-tooth comb. It looks like you are already logging what you get now, but you may have to dig into the 5% that you are losing.
How do you know it's 5%?
If you send that traffic again, will it always be 5%?
Are the same messages always lost?
I built a UDP server for voip/sip call data using Ruby and EventMachine, and so far things have been working well. (I'm curious about your test approach though; I was doing everything over netcat or a small Ruby client, and I never did 10k messages.)
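One hedged way to answer those three questions (a sketch, not taken from the answer above; the rate, count, and port are arbitrary test values) is to stamp each datagram with a sequence number and diff what arrives:
var dgram = require("dgram");

var seen = new Set();
var receiver = dgram.createSocket("udp4");
receiver.on("message", function (msg) {
  seen.add(Number(msg.toString())); // record which sequence numbers landed
});

receiver.bind(41234, function () {
  var sender = dgram.createSocket("udp4");
  var total = 10000, sent = 0;
  var timer = setInterval(function () {
    sender.send(Buffer.from(String(sent)), 41234, "127.0.0.1");
    if (++sent === total) {
      clearInterval(timer);
      // Give in-flight datagrams a moment to land, then report losses.
      setTimeout(function () {
        var lost = [];
        for (var i = 0; i < total; i++) if (!seen.has(i)) lost.push(i);
        console.log("lost " + lost.length + "/" + total, lost.slice(0, 20));
        process.exit(0);
      }, 1000);
    }
  }, 1); // roughly 1000 msgs/sec
});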

A subtle tip here. Why did you use UDP for a stream? You need to use TCP for streams. The UDP protocol sends datagrams, discrete messages. It will not break them apart on you behind the scenes. What you send is what you receive with UDP. (IP fragmentation is a different issue I'm not talking about that here). You don't have to be concerned with re-assembling a stream on the other side. That's one of the main advantages of using UDP instead of TCP. Also, if you are doing localhost to localhost you don't have to worry about losing packets due to network hiccups. You could lose packets if you overflow the network stack buffers though so give yourself big ones if you are doing high speed data transfer. So, forget about the stream, just use UDP send:
var dgram = require("dgram");

var udp_server = dgram.createSocket({
  type: "udp4",
  reuseAddr: true,
  recvBufferSize: 20000000 // <<== mighty big buffers
});

// remote_port and remote_address are placeholders for your peer.
udp_server.send("Badabing, badaboom", remote_port, remote_address);
Go was developed by Google to deal with the proliferation of languages that occurs in modern tech shops (it's crazy, I agree). I cannot use it because its culture and design prohibit using exceptions, which are the most important feature modern languages have for removing the huge amount of clutter added by old-fashioned error handling. Other than that it's fine, but that's a show-stopper for me.

Related

Node.js: how to handle a server going down?

In the Node.js API there are lots of ifs, and one can easily send a request with some undefined var and crash the whole server until it restarts again - something that can take up to 20 seconds.
I know that a variable should be checked for being defined before working with it. But it's very easy to forget something and keep working with an undefined var.
Is there a global setting for the server to avoid such a crash?
The easiest solution I can think of is implementing a cluster, in which only one process will go down, not the whole server. You can also make the process come back up again automatically. See more here:
const cluster = require('cluster');
const http = require('http');
const numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  console.log(`Master ${process.pid} is running`);

  // Fork workers.
  for (let i = 0; i < numCPUs; i++) {
    cluster.fork();
  }

  cluster.on('exit', (worker, code, signal) => {
    console.log(`worker ${worker.process.pid} died`);
  });
} else {
  // Workers can share any TCP connection
  // In this case it is an HTTP server
  http.createServer((req, res) => {
    res.writeHead(200);
    res.end('hello world\n');
  }).listen(8000);

  console.log(`Worker ${process.pid} started`);
}
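Note that the 'exit' handler above only logs the death; to actually bring the process back up automatically, as suggested, fork a replacement there (a small addition, not in the original snippet):
cluster.on('exit', (worker, code, signal) => {
  console.log(`worker ${worker.process.pid} died, forking a replacement`);
  cluster.fork(); // restores capacity after a crash
});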
In any application there are a lot of "ifs" and assumptions. With JavaScript, being weakly typed and dynamic, you can really shoot yourself in the foot.
But the same rules apply here as in any other language: practice defensive programming. That is, cover all the bases in each function and statement block.
You can also try programming Nodejs with Typescript. It adds static type checking and other nice features that help you miss your foot when you shoot. You can also use (I think) Flow to statically type check things. But these won't make you a better programmer.
One other suggestion is to design your system as an SOA, so that one portion going down doesn't necessarily affect the others. "Microservices" is a subset of that.
First, defensive programming and extensive testing are your friends. Obviously, preventing an issue before it happens is much better than trying to react to it after it happens.
Second, there is no foolproof mechanism for catching all exceptions at some high level and then putting your server back into a known, safe state. You just can't really do that in any complex server because you don't know what you were in the middle of when the exception happened. People will often try to do this, but it's like proceeding with a wounded server that may have some messed up internals. It's not safe or advisable. If a problem was not intercepted (e.g. exception caught or error detected) at the level where it occurred by code that knows how to properly handle that situation, then the only completely safe path forward is to restart your server.
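To make that "restart instead of limping on" policy concrete, here is a minimal sketch of the usual last-resort handler (an illustration, not part of the answer above):
// Log the fatal error, then exit so a supervisor (cluster master,
// systemd, pm2, ...) can start a fresh process with known-good state.
process.on('uncaughtException', function (err) {
  console.error('Uncaught exception, shutting down:', err);
  process.exit(1); // do NOT keep serving: internal state may be corrupted
});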
So, if after implementing as much defensive programming as you possibly can and testing the heck out of it, you still want to prevent end-user downtime from a server crash/restart, then the best way to do that is to assume that a given server process will occasionally need to be restarted and plan for that.
The simplest way to prevent end-user downtime when a server process restarts is to use clustering and thus have multiple server processes with some sort of load balancer that both monitors server processes and routes incoming connections among the healthy server processes. When one server process is down, it is temporarily taken out of the rotation and other server processes can handle new, incoming connections. When the failed server process is successfully restarted, it can be added back to the rotation and be used again for new requests. Clustering in this way can be done within a single server (multiple processes on the same server) or across servers (multiple servers, each with server processes on them).
In some cases, this same process can even be used to roll out a new version of server code without any system downtime (doing this requires additional planning).
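A hedged sketch of that rolling-upgrade idea on top of the cluster module (it assumes the new code is already on disk and that workers emit 'listening'; this is an illustration, not a production deploy script):
// Replace workers one at a time so some always stay in rotation.
function rollingRestart() {
  var ids = Object.keys(cluster.workers);
  (function restartNext(i) {
    if (i >= ids.length) return; // every worker replaced
    var oldWorker = cluster.workers[ids[i]];
    var replacement = cluster.fork(); // starts on the new code
    replacement.once('listening', function () {
      oldWorker.disconnect(); // stop accepting, drain, then exit
      oldWorker.once('exit', function () {
        restartNext(i + 1);
      });
    });
  })(0);
}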

Is it bad to keep a server-to-client response connection open and alive in node.js?

Is it a bad idea, or just bad for performance overall, to keep a node.js server-to-client connection open? (not using x.end();)
I'm trying to play around with node.js to get the hang of it, and this is what I'm trying to do:
Use the node.js executable as an open handle for multiple clients to communicate with each other. What I've done so far, just to test things, is to create an HTTP server that simply writes "test" on a 5000 ms interval. Since the idea is to keep communicating with the client until they disconnect, I can't see myself ending the handle.
var http = require("http"),
    date = require("./modules/date.js");

var server = http.createServer(function (request, response) {
  // Log the client's address (the original server.address().address
  // logged the server's own bound address, not the client's).
  console.log("[" + date.currentTimestamp() + "] Receiving a connection from " + request.socket.remoteAddress);
  response.writeHead(200, { "Content-Type": "text/html" });
  response.write("Hello World!<br>Current time: " + date.currentTimestamp() + "<br>Current url: " + request.url);
  // Note: this interval is never cleared, so every request leaks a timer
  // that keeps writing to a possibly-closed response.
  setInterval(function () {
    response.write(" test");
  }, 5000);
}).listen(80);

console.log("[" + date.currentTimestamp() + "] Server has initialized.");
The HTTP protocol is a request-response paradigm. In a conversation, it would be like one person saying one thing and another responding. The code works, but the protocol is not being used the way it was designed. The better way to accomplish your goal is to use a websocket. Websockets are designed to be kept open, transferring data back and forth over long-lived connections.
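For comparison with the HTTP version above, here is a minimal sketch using the 'ws' package (an assumption; any WebSocket library would do) that pushes "test" every 5 seconds over a connection that is meant to stay open:
var WebSocketServer = require('ws').Server;
var wss = new WebSocketServer({ port: 8080 });

wss.on('connection', function (socket) {
  var timer = setInterval(function () {
    socket.send('test ' + new Date().toISOString());
  }, 5000);
  // Unlike the HTTP version, cleanup on disconnect is natural here.
  socket.on('close', function () {
    clearInterval(timer);
  });
});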
The idea is not that bad, you just reinvented long polling :)
It is a pattern of server-client interaction that has been widely used for sending server events. Here's how it works (a sketch follows after the steps):
1. The client subscribes to some event by connecting to the server.
2. The server keeps the connection open as long as possible.
3. If the event fires while the connection is open, the server responds with the event payload and metadata and closes the connection.
4. If no event fires, the connection is closed after some amount of time to avoid spending resources on inactive clients.
5. In both cases the client reconnects and expects further events.
Long polling was invented before websockets were introduced. While websockets are designed specifically for long-lived client-server event exchange, long polling is more of a trick. The best idea is to use websockets for browsers that support them, and fall back to long polling for the ones that don't. Some libraries, like socket.io, can do that automatically, btw.
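A minimal sketch of steps 1-5 in plain Node (hedged: the 30-second timeout and the publish trigger are illustrative assumptions):
var http = require("http");

var waiting = []; // clients currently parked on an open request

http.createServer(function (req, res) {
  var client = { res: res };
  client.timer = setTimeout(function () {
    // No event within the window: reply empty; the client reconnects (steps 4/5).
    waiting.splice(waiting.indexOf(client), 1);
    respond(res, { event: null });
  }, 30000);
  waiting.push(client); // step 2: keep the connection open
}).listen(8000);

// Step 3: call this when a server-side event fires.
function publish(payload) {
  waiting.forEach(function (client) {
    clearTimeout(client.timer);
    respond(client.res, { event: payload });
  });
  waiting = [];
}

function respond(res, body) {
  res.writeHead(200, { "Content-Type": "application/json" });
  res.end(JSON.stringify(body));
}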

Sending lots of small packets out of Node.js UDP doesn't send all of them

I have a project that I'm working on that needs to send a small 9-byte packet to 7000 different hosts outside the local network, after which it waits for their replies back on the same port and processes the responses.
The problem I'm having is that Node.js dgram (udp4) doesn't seem to be sending all the packets out. I'm not rate limiting in any way, so there might be an issue there.
I'm looping, creating the packets, then firing them straight out using .send(). With Wireshark open I can see that, of the 7000 being "sent", only ~1300 appear to hit the wire and leave.
The script itself reports all packets as sent with no errors; Wireshark shows a different result, and the hosts at the other end reflect what Wireshark says: they don't receive the packets. I'm using the following to send and verify; packet is a buffer.
udpServer.send(packet, 0, packet.length, port, address, function (error) {
  if (!error) {
    successes++;
    console.log(successes + "/" + total);
  } else {
    console.log(error);
  }
});
Any ideas on what I am doing wrong here, or what's been overlooked?
There are many junctures where your packet could be dropped:
Dropped when handed to the kernel. Are you on Linux? Try using strace to find the system call (probably send, sendmsg, or sendto) and its return value. If the system call is returning an error, I'd expect that to be reported in "error" by Node, though.
Dropped in the kernel tx queue. On Linux you can check, e.g., /sys/class/net/eth0/statistics/ and see if any drop counters are incrementing.
Dropped in the hardware tx queue. If using an Intel NIC you can run ethtool -S eth0 to see if any drop counters are incremented.
Dropped in intermediate hardware (e.g. switches/routers). This is trickier to see, as it's vendor-dependent and maybe invisible. You can eliminate this by hooking your machines up directly to each other.
Dropped in the hardware rx queue. On the remote end, check all the same stats to see if an overflow is occurring there.
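Separately, the asker's own hunch (no rate limiting) is easy to test: pace the 7000 sends in small batches instead of one tight loop, so the tx queues can drain. A sketch (batch size and delay are guesses to tune while watching Wireshark; udpServer and the targets list come from the question's setup):
function sendPaced(targets, batchSize, delayMs) {
  var i = 0;
  (function sendBatch() {
    for (var n = 0; n < batchSize && i < targets.length; n++, i++) {
      var t = targets[i];
      udpServer.send(t.packet, 0, t.packet.length, t.port, t.address);
    }
    if (i < targets.length) {
      setTimeout(sendBatch, delayMs); // let the kernel/NIC queues drain
    }
  })();
}

sendPaced(targets, 100, 10); // e.g. 100 packets every 10 ms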

Nodejs + Socket.IO - how to get total bytes transferred (read/write) after socket close

I have a simple socket.io client and server program, running on node.js. The client and server exchange messages between them for a few minutes, before disconnecting (like chat).
Is there any function/method I can use to get the total bytes transferred (read/write) after the socket is closed?
At present I am adding up the message size for each message sent and received by the client. But, as per my understanding, in socket.io, depending on which transport is used (websocket, xhr-polling, etc.), the size of the final payload being sent will differ due to the header/wrapper size. Hence, just adding up message bytes won't give me an accurate measure of bytes transferred.
I can use monitoring tools like Wireshark to get this value, but I would prefer using a javascript utility. Searching online didn't give me any reasonable answer.
For pure websocket connections, I am able to get this value using the properties socket._socket.bytesRead and socket._socket.bytesWritten.
Any help is appreciated!
As of socket.io v2.2.0 I managed to get the byte data like this. The only problem is that these values are only populated when the client closes the browser window and the reason parameter is 'transport error'. If the client uses socket.close() or .disconnect(), or the server uses .disconnect(), then the bytes are 0.
socket.on('disconnect', (reason) => {
  // Fragile: this digs private symbol-keyed properties out by index,
  // which can break between Node/socket.io versions.
  let symbs = Object.getOwnPropertySymbols(socket.conn.transport.socket._socket);
  let bytesRead = socket.conn.transport.socket._socket[symbs[3]];
  let bytesWritten = socket.conn.transport.socket._socket[symbs[4]];
});
If you wanted such a feature that would work no matter what the underlying transport was below a socket.io connection, then this would have to be a fundamental feature of socket.io because only it knows the details of what it's doing with each transport and protocol.
But, socket.io does not have this feature built in for the various transports that it could use. I would conclude that if you're going to use the socket.io interface to abstract out the specific protocol and implementation on top of that protocol, then you give up the ability to know exactly how many bytes socket.io chooses to use in order to implement the connection on its chosen transport.
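That said, if an approximate, transport-agnostic number is acceptable, one possibility (hedged: this relies on engine.io's 'packet' and 'packetCreate' events on socket.conn, and it counts payloads rather than wire bytes with all framing and headers) is to tally sizes yourself:
io.on('connection', function (socket) {
  var bytesIn = 0, bytesOut = 0;
  socket.conn.on('packet', function (packet) {       // received from client
    if (packet.data) bytesIn += Buffer.byteLength(String(packet.data));
  });
  socket.conn.on('packetCreate', function (packet) { // queued for sending
    if (packet.data) bytesOut += Buffer.byteLength(String(packet.data));
  });
  socket.on('disconnect', function () {
    console.log('approx bytes in/out:', bytesIn, bytesOut);
  });
});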
There are likely debug APIs (probably only available to browser extensions, not standard web pages) that can give you access to some of the page-wide info you see in the Chrome debugger so that might be an option to investigate. See the info for chrome.devtools.network if you want more info.

Proper way to monitor/control a server remotely over http in realtime

On my client (a phone with a browser) I want to see the stats of the server CPU, RAM & HDD and gather info from various logs.
I'm using ajax polling.
On the client, every 5 sec (setInterval), I call a PHP file to:
scan a folder containing N logs
read the last line of each log
convert that to JSON
Problems:
Open a new connection every 5 sec.
Multiple AJAX calls.
Request headers (they are also data and so consume bandwidth).
Response headers (^).
Using PHP to read files every 5 sec even if nothing changed.
The final JSON data is less than 5 KB, but I send it every 5 sec, and there are the headers and a new connection every time, so basically every 5 sec I have to send 5-10 KB to get 5 KB, which is 10-20 KB.
That's 60 sec / 5 sec = 12 new connections per minute, and about 15 MB per hour of traffic if I leave the app open.
Let's say I have 100 users that I let monitor/control my server; that would be around 1.5 GB of outgoing traffic in one hour.
Not to mention that the PHP server is reading multiple files 100 times every 5 sec.
I need something on the server that reads the last lines of those logs every 5 sec and maybe writes them to a file; then I want to push this data to the client only if it has changed.
SSE (server-sent events) with PHP:
header('Content-Type: text/event-stream');
header('Cache-Control: no-cache');
while (true) {
    echo "id: " . time() . "\ndata: " . ReadTheLogs() . "\n\n";
    ob_flush();
    flush();
    sleep(1);
}
In this case, after the connection is established with the first user,
the connection stays open (PHP is not made for that) and so I save some bandwidth (request headers, response headers). This works on my server, but most servers don't allow keeping the connection open for a long time.
Also, with multiple users I read the log multiple times (slowing down my old server).
And I can't control the server... I would need to use ajax to send a command...
I need WebSockets!!!
node.js and websockets
Using node.js, from what I understand, I can do all this without consuming a lot
of resources and bandwidth. The connection stays open, so no unnecessary headers, and I can receive and send data. It handles multiple users very well.
And this is where I need your help.
The node.js server should, in the background, update and store the log data every 5 sec if the files are modified. OR should the operating system do that (iwatch, dnotify...)?
The data should be pushed only if it changed.
The reading of the logs should happen only one time every 5 sec... so not be triggered by each user.
This is the first example I found... and modified:
var ws = require("nodejs-websocket");

var server = ws.createServer(function (conn) {
  var data = read(whereToStoreTheLogs);
  conn.sendText(data); // send the logs data to the user
                       // on first connection.
  setTimeout(checkLogs, 5000);
  /*
  here I need to continuously check if the logs have changed,
  but if I use setInterval(checkLogs, 5000) or setTimeout,
  every user invokes a new timer, and so I end up with lots of
  timers on the server.
  Can I do that in the background?
  */
  conn.on("text", function (str) {
    doStuff(str); // various commands to control the server.
  });
  conn.on("close", function (code, reason) {
    console.log("Connection closed");
  });
}).listen(8001);

var checkLogs = function () {
  var data = read(whereToStoreTheLogs);
  if (data != oldData) {
    conn.sendText(data); // bug: conn is not in scope here; a broadcast
                         // over all connections is needed instead.
  }
  setTimeout(checkLogs, 5000);
};
The above script would be the notification server, but I also need to find a solution to store the info from those multiple logs somewhere, and to do that every time something changes, in the background.
How would you keep the bandwidth low, but also the server resource usage?
How would you do it?
EDIT
Btw, is there a way to stream this data simultaneously to all the clients?
EDIT
About the logs: I also want to be able to scale the time dilation between updates... I mean, if I read the logs of ffmpeg I need the update every second if possible... but when no conversion is active I only need the basic machine info every 5 min maybe... and so on...
GOALS:
1. A performant way to read & store the log data somewhere (only if clients are connected... [mysql, a file; is it possible to store this info inside the RAM (with node.js??)]).
2. A performant way to stream the data to the various clients (simultaneously).
3. Be able to send commands to the server... (bidirectional)
4. Using web languages (js, php...) and Linux commands (something that is easy to implement on multiple machines)... free software if needed.
The best approach would be:
read the logs, based on current activity, into system memory, and stream them simultaneously and continuously, over an already-open connection, to the various clients with webSockets.
I don't know anything that could be faster.
UPDATE
The node.js server is up and running, using the http://einaros.github.io/ws/ webSocketServer implementation, as it appears to be the fastest one.
I wrote, with the help of @HeadCode, the following code to handle the client situation properly & to keep the overhead as low as possible, checking various things inside the broadcast loop. Now the pushing & the client handling are at a good point.
var wss = new (require('ws').Server)({ port: 8080 }),
    isBusy,
    logs,
    clients,
    i,
    checkLogs = function () {
      if (wss.clients && (clients = wss.clients.length)) {
        isBusy || (logs = readLogs() /*, isBusy = true */);
        if (logs) {
          i = 0;
          while (i < clients) {
            wss.clients[i++].send(logs);
          }
        }
      }
    };

setInterval(checkLogs, 2000);
But atm I'm using a really bad way to parse the logs... (nodejs -> httpRequest -> php)... lol. After some googling I found out that I could totally stream the output of Linux software directly to the nodejs app... I haven't checked, but maybe that would be the best way to do it. node.js also has a filesystem api where I could read the logs. Linux has its own filesystem api.
The readLogs() function (it can be async) is still something I'm not happy with.
nodejs filesystem?
linuxSoftware -> nodejs output implementation
linux filesystem api.
Keep in mind that I need to scan various folders for logs and then somehow parse the outputted data, and all this every 2 seconds.
ps.: I added isBusy to the server variables in case the log-reading system is async.
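A sketch of the second option (streaming a Linux tool's output straight into node via child_process; the tail flags and log path are illustrative assumptions):
var spawn = require('child_process').spawn;

// Follow the log with the system's own tail; each stdout chunk arrives
// as a Buffer that can be parsed and cached in a variable (i.e. in RAM).
var tail = spawn('tail', ['-F', '/var/log/myapp.log']);
var latest = '';

tail.stdout.on('data', function (chunk) {
  latest = chunk.toString().trim().split('\n').pop(); // keep the last line
});

tail.on('error', function (err) {
  console.error('tail failed:', err);
});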
EDIT
The answer is not complete.
Missing:
A performant way to read, parse and store the logs somewhere (Linux filesystem api, or nodejs api, so that I store directly into system memory).
An explanation of whether it's possible to stream data directly to multiple users.
Apparently nodejs loops through the clients and so (I think) sends the data multiple times.
Btw, is it possible/worth it to close the node server if there are no clients and restart it on new connections on the apache side? (e.g. if I connect to the apache-hosted html file, a script launches the nodejs server again). Doing so would further reduce the memory leaking??? Right?
EDIT
After some experimenting with websockets (some videos are in the comments) I learned some new stuff. The Raspberry Pi has the possibility to use some CPU DMA channels to do high-frequency stuff like PWM... I need to somehow understand how that works.
When using sensors and stuff like that, should I store everything inside the RAM? nodejs already does that?? (in a variable inside the script)
websockets remain the best choice, as they're now basically easily accessible from any device, simply using a browser.
I haven't used nodejs-websocket, but it looks like it will accept an http connection and do the upgrade as well as creating the server. If all you care about receiving is text/json then I suppose that would be fine, but it seems to me you might want to serve a web page along with it.
Here is a way to use express and socket.io to achieve what you're asking about:
var express = require('express');
var app = express();
var http = require('http').Server(app);
var io = require('socket.io')(http);

app.use(express.static(__dirname + '/'));

app.get('/', function (req, res) {
  res.sendFile(__dirname + '/index.html'); // res.sendfile() is deprecated
});

io.on('connection', function (socket) {
  // This is where we should emit the cached values of everything
  // that has been collected so far so this user doesn't have to
  // wait for a changed value on the monitored host to see
  // what is going on.
  // This code is based on something I wrote for myself so it's not
  // going to do anything for you as is. You'll have to implement
  // your own caching mechanism.
  for (var stat in cache) {
    if (cache.hasOwnProperty(stat)) {
      socket.emit('data', JSON.stringify(cache[stat]));
    }
  }
});

http.listen(3000, function () {
  console.log('listening on *:3000');
});

var oldData;
(function checkLogs() {
  var data = read(whereToStoreTheLogs);
  if (data != oldData) {
    oldData = data;
    io.emit('data', data); // emit needs an event name to match the client's listener
  }
  setTimeout(checkLogs, 5000);
})();
Of course, the checkLogs function has to be fleshed out by you. I have only cut and pasted it in here for context. The call to the emit function of the io object will send the message out to all connected users but the checkLogs function will only fire once (and then keep calling itself), not every time someone connects.
In your index.html page you can have something like this. It should be included in the html page at the bottom, just before the closing body tag.
<script src="/path/to/socket.io.js"></script>
<script>
  // Set up the websocket for receiving updates from the server
  var socket = io();
  socket.on('data', function (msg) {
    // Do something with your message here, such as using javascript
    // to display it in an appropriate spot on the page.
    document.getElementById("content").innerHTML = msg;
  });
</script>
By the way, check out the Nodejs documentation for a variety of built-in methods for checking system resources (https://nodejs.org/api/os.html).
Here's also a solution more in keeping with what it appears you want. Use this for your html page:
<!DOCTYPE HTML>
<html>
<head>
  <meta charset="utf-8">
  <title>WS example</title>
</head>
<body>
  <script>
    var connection;
    window.addEventListener("load", function () {
      connection = new WebSocket("ws://" + window.location.hostname + ":8001");
      connection.onopen = function () {
        console.log("Connection opened");
      };
      connection.onclose = function () {
        console.log("Connection closed");
      };
      connection.onerror = function () {
        console.error("Connection error");
      };
      connection.onmessage = function (event) {
        var div = document.createElement("div");
        div.textContent = event.data;
        document.body.appendChild(div);
      };
    });
  </script>
</body>
</html>
And use this as your web socket server code, recently tweaked to use the 'tail' module (as found in this post: How to do `tail -f logfile.txt`-like processing in node.js?), which you will have to install using npm (Note: tail makes use of fs.watch, which is not guaranteed to work the same everywhere):
var ws = require("nodejs-websocket");
var os = require('os');
var Tail = require('tail').Tail;

var tail = new Tail('./testlog.txt');

var server = ws.createServer(function (conn) {
  conn.on("text", function (str) {
    console.log("Received " + str);
  });
  conn.on("close", function (code, reason) {
    console.log("Connection closed");
  });
}).listen(8001);

setInterval(function () { checkLoad(); }, 5000);

function broadcast(mesg) {
  server.connections.forEach(function (conn) {
    conn.sendText(mesg);
  });
}

var load = '';
function checkLoad() {
  var new_load = os.loadavg().toString();
  if (new_load === load) { // only broadcast when the load average changed
    return;
  }
  load = new_load;
  broadcast(load);
}

tail.on("line", function (data) {
  broadcast(data);
});
Obviously this is very basic and you will have to change it for your needs.
I made a similar implementation recently using Munin. Munin is a wonderful, open-source server monitoring tool which also provides a REST API. There are several plugins available for monitoring the CPU, HDD and RAM usage of your server.
You need to build a push notification server. All clients who are listening will then get a push notification when new data is updated. See this answer for more information: PHP - Push Notifications
As for how you would update the data, I'd suggest using OS-based tools to trigger a PHP script (command line) that will generate and "push" the JSON file out to any client currently listening. Any new client logging on to "listen" will get served the current JSON available, until it's updated.
This way you're not subject to 100 users, using 100 connections and however much bandwidth, polling your server every 5 seconds; they only get updated when they need to know there's an update.
How about a service that reads all the log info (via IPMI, Nagios or whatever) and creates the output files on some schedule? Then anyone who wants to connect can just read this output rather than hammering the server logs. Essentially there is one hit on the server logs and then everyone else just reads a web page.
This could be implemented pretty easily.
BTW: Nagios has a very nice free edition.
Answering just these bits of your question:
A performant way to stream the data to the various clients (simultaneously).
Be able to send commands to the server... (bidirectional)
Using web languages (js, php...) and Linux commands (something that is easy to implement on multiple machines)... free software if needed.
I'll recommend the Bayeux protocol as made simple by the CometD project. There are implementations in a variety of languages and it's really easy to use in its simplest form.
Meteor is broadly similar. It's an application development framework rather than a family of libraries, but it solves the same problems.
Some suggestions:
Munin for charts
NetSNMP (used by Munin, but you can also use Bash and Cron to build traps that send SMS texts on alerts)
Pingdom for remote alerts about how well the server is responding to ping and HTTP checks. It can SMS text you or call a phone, as well as have call escalation rules.
