EDITED
I have a Node.js HTTP server that is meant for receiving uploads from multiple clients and processing them separately.
My problem is that, as I've verified, each request blocks the reception of every other request until it has been served.
This is the code I've tested:
var http = require('http');
http.globalAgent.maxSockets = 200;
var url = require('url');
var instance = require('./build/Release/ret');
http.createServer( function(req, res){
var path = url.parse(req.url).pathname;
console.log("<req>"+path+"</req>");
switch (path){
case ('/test'):
var body = [];
req.on('data', function (chunk) {
body.push(chunk);
});
req.on('end', function () {
body = Buffer.concat(body);
console.log("---req received---");
console.log(Date.now());
console.log("------------------");
instance.get(function(result){
postHTTP(result, res);
});
});
break;
}
}).listen(9999);
This is the native side (omitting obvious stuff) where getInfo is the exported method:
std::string ret2 (){
sleep(1);
return string("{\"image\":\"1.JPG\"}");
}
Handle<Value> getInfo(const Arguments &args) {
HandleScope scope;
if(args.Length() == 0 || !args[0]->IsFunction())
return ThrowException(Exception::Error(String::New("Error")));
Persistent<Function> fn = Persistent<Function>::New(Handle<Function>::Cast(args[0]));
Local<Value> objRet[1] = {
String::New(ret2().c_str())
};
Handle<Value> ret = fn->Call(Context::GetCurrent()->Global(), 1, objRet);
return scope.Close(Undefined());
}
I'm testing this with 3 parallel curl requests:
for i in {1..3}; do time curl --request POST --data-binary "@/home/user/Pictures/129762.jpg" http://192.160.0.1:9999/test & done
This is the output from the server:
<req>/test</req>
---req received---
1397569891165
------------------
<req>/test</req>
---req received---
1397569892175
------------------
<req>/test</req>
---req received---
1397569893181
------------------
These are the responses and the timings from the client:
"1.JPG"
real 0m1.024s
user 0m0.004s
sys 0m0.009s
"1.JPG"
real 0m2.033s
user 0m0.000s
sys 0m0.012s
"1.JPG"
real 0m3.036s
user 0m0.013s
sys 0m0.001s
Apparently each request is received only after the previous one has been served. The sleep(1) simulates a synchronous operation that requires about 1s to complete and can't be changed.
The client receives the responses with an incremental delay of ~1s.
I would like to achieve a kind of parallelism, although I'm aware I'm in a single-threaded environment such as Node.js. What I would like to achieve is receiving all 3 answers in ~1s.
Thanks in advance for your help.
This:
for(var i=0;i<1000000000;i++) var a=a+i;
is a pretty severe blocking operation. As soon as the first block ends, your whole server hangs until this for loop is done. I'm interested in why you are trying to do this.
Perhaps you are trying to simulate a delayed response?
setTimeout(function () {
send404(res);
}, 3000);
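To make the difference concrete, here is a minimal sketch (the helper names busyWait and timerFiresDuringBusyLoop are made up for illustration) showing that even a zero-delay timer callback cannot run while synchronous code is still executing. Incoming requests are in the same position while a blocking loop is spinning:

```javascript
// Spin the CPU for roughly `ms` milliseconds; a stand-in for the blocking for loop.
function busyWait(ms) {
  var end = Date.now() + ms;
  while (Date.now() < end) { /* burn CPU */ }
}

// Returns true if a 0 ms timer managed to fire while busyWait() was running.
function timerFiresDuringBusyLoop(ms) {
  var fired = false;
  setTimeout(function () { fired = true; }, 0);
  busyWait(ms);
  return fired; // the callback is still queued at this point
}

console.log(timerFiresDuringBusyLoop(50)); // prints false
```

A setTimeout-based delay, by contrast, returns immediately and leaves the event loop free to accept other requests in the meantime.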
Right now you are turning a non-flowing stream into flowing mode by attaching a data event handler, and subsequently loading the whole stream into memory. You probably don't want to do this.
You can use the stream in non-flowing mode as illustrated below; this is useful if you want to send the data to some place that is only accessible after some other event.
However, using the stream in flowing mode is the fastest. If you want to write your own body parser I suppose you might want to use flowing mode, it depends on your use case.
req.on('readable', function () {
var chunk;
while (null !== (chunk = req.read())) {
body.push(chunk);
}
});
Flowing and non-flowing modes are also known as v1 and v2 streams respectively, as the older streams used in Node only supported flowing mode.
My project works as intended except that I have to refresh the browser every time my keyword list sends something to it to display. I assume it's my inexperience with Expressjs and not creating the route correctly within my websocket? Any help would be appreciated.
Browser
let socket = new WebSocket("ws://localhost:3000");
socket.addEventListener('open', function (event) {
console.log('Connected to WS server')
socket.send('Hello Server!');
});
socket.addEventListener('message', function (e) {
const keywordsList = JSON.parse(e.data);
console.log("Received: '" + e.data + "'");
document.getElementById("keywordsList").innerHTML = e.data;
});
socket.onclose = function(event) {
// the handler receives a CloseEvent; code and reason are properties on it
console.log(event.code, event.reason, 'disconnected');
}
socket.onerror = error => {
console.error('failed to connect', error);
};
Server
const ws = require('ws');
const express = require('express');
const keywordsList = require('./app');
const app = express();
const port = 3000;
const wsServer = new ws.Server({ noServer: true });
wsServer.on('connection', function connection(socket) {
socket.send(JSON.stringify(keywordsList));
socket.on('message', message => console.log(message));
});
// `server` is a vanilla Node.js HTTP server, so use
// the same ws upgrade process described here:
// https://www.npmjs.com/package/ws#multiple-servers-sharing-a-single-https-server
const server = app.listen(3000);
server.on('upgrade', (request, socket, head) => {
wsServer.handleUpgrade(request, socket, head, socket => {
wsServer.emit('connection', socket, request);
});
});
In answer to "How to Send and/or Stream array data that is being continually updated to a client" as arrived at in comment.
A possible solution using WebSockets may be to:
1. Create an interface on the server for array updates (if you haven't already) that isolates the array object from arbitrary outside modification and supports a callback when updates are made.
2. Determine the latency allowed for multiple updates to occur without being pushed. The latency should allow reasonable time for previous network traffic to complete without overloading bandwidth unnecessarily.
3. When an array update occurs, start a timer for the latency period if one is not already running.
4. On timer expiry, JSON.stringify the array (to take a snapshot), clear the timer-running status, and message the client with the JSON text.
A slightly more complicated method to avoid delaying all push operations would be to immediately push single updates unless they occur within a guard period after the most recent push operation. A timer could then push modifications made during the guard period at the end of the guard period.
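The simpler timer-based variant could be sketched roughly as follows. This is only an illustration under assumptions: createKeywordStore, push, latencyMs and schedule are hypothetical names, and push stands for whatever function actually sends text to the client(s).

```javascript
// Wraps the keyword array so it can only change through add(), and pushes a
// JSON snapshot at most once per latency period.
// `push`, `latencyMs` and `schedule` are illustrative parameters, not part of ws.
function createKeywordStore(push, latencyMs, schedule) {
  schedule = schedule || setTimeout; // injectable so the timer can be faked in tests
  const keywords = [];
  let timerRunning = false;

  // Runs when the latency period expires: snapshot the array and send it.
  function flush() {
    timerRunning = false;
    push(JSON.stringify(keywords));
  }

  return {
    // The only way to modify the array; starts the timer if it is not running.
    add(keyword) {
      keywords.push(keyword);
      if (!timerRunning) {
        timerRunning = true;
        schedule(flush, latencyMs);
      }
    },
    // Returns a copy, so callers cannot modify the internal array.
    snapshot() {
      return keywords.slice();
    }
  };
}
```

Several add() calls made within one latency period result in a single pushed message containing all of them.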
Broadcasting
The WebSockets API does not directly support broadcasting the same data to multiple clients. Refer to Server Broadcast in ws documentation for an example of sending data to all connected clients using a forEach loop.
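A minimal sketch of such a loop is shown below. The broadcast helper is a made-up name; it only assumes that each client object exposes readyState and send(), as ws sockets do, and that the collection supports forEach (wsServer.clients in ws is a Set).

```javascript
// readyState value of an open connection (WebSocket.OPEN in the WebSocket API).
const OPEN = 1;

// Sends `text` to every client whose connection is open.
// Returns the number of clients the message was sent to.
function broadcast(clients, text) {
  let sent = 0;
  clients.forEach(function (client) {
    if (client.readyState === OPEN) {
      client.send(text);
      sent++;
    }
  });
  return sent;
}
```

With the server code from the question this might be called as broadcast(wsServer.clients, JSON.stringify(keywordsList)) whenever the list changes.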
Client side listener
In the client-side message listener
document.getElementById("keywordsList").innerHTML = e.data;
would be better as
document.getElementById("keywordsList").textContent = keywordsList;
to both present keywords after decoding from JSON and prevent them ever being treated as HTML.
So I finally figured out what I wanted to accomplish. It sounds straightforward now that I have learned enough and thought about how to structure the back end of my project.
If you have two websockets running and one needs information from the other, you cannot run them side by side. You need to have one encapsulate the other and then call the websocket INSIDE of the other websocket. This can easily cause problems down the road for other projects since now you have one websocket that won't fire until the other is run but for my project it makes perfect sense since it is locally run and needs all the parts working 100 percent in order to be effective. It took me a long time to understand how to structure the code as such.
I am using SignalR to start long computations on server side and post a message to the client when the result is available.
The input binding is an HTTP request.
I would like to be able to send multiple messages back in order to notify the client of the different steps of the process (e.g., computation starts, computation ends, etc.).
I tried pushing different messages to context.bindings.signalRMessages but I see that everything is sent together at the end of the whole process. Is there a way to send several messages at different times?
Another related issue is that my HTTP request on client side remains stuck until the end of the process. I would like to be able to post a quick response early, since I get the response via a signalR message.
Here is my server code :
module.exports = async function(context, req) {
let ID = context.bindingData.invocationId;
context.bindings.signalRMessages = [];
const messageQueue = context.bindings.signalRMessages;
var postMessage = (message) => {
message.userId = req.query.userId;
message.isPrivate = true;
messageQueue.push(message);
};
let preProcessData = preProcess(req.body.input);
let startMessage = {
"target": "optimStart",
"arguments": [{ preProcessData: preProcessData }]
};
postMessage(startMessage); // <<<< I want this one to be sent immediately
try {
let optimOutput = await computeOptim(req.body.input, ID); // that's the long process
let response = {
optimId: ID,
optimOutput: optimOutput
};
let optimCompleteMessage = {
"target": "optimComplete",
"arguments": [response]
};
postMessage(optimCompleteMessage);
} catch (err) {
// ....
}
};
Am I doing anything wrong or is it just not possible ?
Thanks!
This is not possible with a simple HTTP triggered function since bindings resolve only once the execution of the function completes.
For your scenario, durable functions would be the perfect choice.
You would still have an HTTP-triggered function (client function) to start an orchestration and return immediately. In the orchestration function, you would have separate activity functions for the processing and for sending updates to the client using the SignalR binding.
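As a rough sketch of what the orchestration could look like, here is the orchestrator body written as a plain generator. The activity names "SendSignalRMessage" and "ComputeOptim" and the shape of the input are assumptions made for illustration; the activity that sends messages would be a separate function carrying the SignalR output binding, so its message goes out when that activity completes rather than at the end of the whole process.

```javascript
// Orchestrator body: each `yield` hands one task to the Durable Functions runtime.
// "SendSignalRMessage" and "ComputeOptim" are hypothetical activity function names.
function* optimOrchestrator(context) {
  const input = context.df.getInput(); // assumed shape: { userId, input }

  // First notification, sent before the long computation starts.
  yield context.df.callActivity("SendSignalRMessage", {
    userId: input.userId,
    target: "optimStart",
    arguments: []
  });

  // The long-running step runs in its own activity function.
  const optimOutput = yield context.df.callActivity("ComputeOptim", input.input);

  // Second notification, sent once the result is available.
  yield context.df.callActivity("SendSignalRMessage", {
    userId: input.userId,
    target: "optimComplete",
    arguments: [{ optimOutput: optimOutput }]
  });

  return optimOutput;
}

// In a function app this would be exported through the durable-functions package:
// module.exports = require("durable-functions").orchestrator(optimOrchestrator);
```

The HTTP-triggered client function then only has to start this orchestration and return, which also addresses the request staying open until the end of the process.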
In short, I've run into an issue where multiple parallel GET requests to my Node.js server cause the server to get "clogged up" and hang, thus resulting in timeouts for the clients (503, service unavailable).
After a lot of performance analysis, I've realized it's a CPU issue. The specific request (we'll call it GET /foo) queries data from multiple services over HTTP, and then does a lot of computation, and returns the results to the client, like this:
1. Client requests GET /foo
2. /foo controller queries data over HTTP from multiple other services
3. /foo controller then does a bunch of iterations over the data to compile some output for the client
Step 3 takes around 2 seconds to complete. However, if I send 2 requests in parallel to /foo, each client will receive their response in about 4 seconds. When I run the app in a cluster using more cores, the requests run much faster, but not quite what I want.
Seems like I have several options here:
1. pre-compute the response (ideally would like to avoid this for now, since it will require a whole "cache invalidation" scheme), or
2. /foo sends the CPU-blocking computation asynchronously to another process (using Heroku, so that would be another dyno), and then I can use a websocket or something to push the results to the client (again, very complex for my situation), or
3. somehow yield to a child process in the request and return the results to the client
Would love to do something like option 3. Something like this:
get('/foo', function*(request) {
// I/O, so not blocking the event loop (I think)
let data = yield getData(request)
// make this happen in a different process
let response = yield doSomeHeavyProcessing(data)
return response
})
I've omitted a lot of implementation details above, but if it's necessary to know, I'm using Koa and Node.js 6.
Ideally, doSomeHeavyProcessing would do the CPU-intensive computation in some separate process, and when it's done, still send the results back in a "synchronous" fashion to the request client.
Been trying to wrap my head around child processes, web workers, fibers, etc., and have been doing some basic "hello worlds" with these to get them to do basically the above, but to no avail. Can post more details if necessary.
Here are some approaches that you can try:
1. Split the blocking computation into small chunks and use setImmediate to place the next chunk of work at the end of the event queue. The computation then no longer blocks, and other requests can be processed between chunks.
2. Microsoft recently released napajs. As stated in their README:
As it evolves, we find it useful to complement Node.js in CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them.
I haven't tried it, but it looks very promising:
var napa = require('napajs');
var zone1 = napa.zone.create('zone1', { workers: 4 });
get('/foo', function*(request) {
let data = yield getData(request)
let response = yield zone1.execute(doSomeHeavyProcessing, [data])
return response
})
3. If none of the above is enough and you need to spread the load across multiple machines, then you probably can't avoid using some sort of message queue to distribute work to different servers. In this case check out ZeroMQ. It is extremely easy to use from node, and you can implement any kind of distributed messaging pattern with it.
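For the first approach, a minimal sketch might look like the following. The function name processInChunks and its parameters are made up for illustration; it assumes the heavy work can be expressed as one call per item of an array.

```javascript
// Processes `items` in slices of `chunkSize`, yielding to the event loop between
// slices so other requests can be handled. Resolves with the per-item results.
// `schedule` defaults to setImmediate and is only a parameter to make testing easy.
function processInChunks(items, chunkSize, processItem, schedule) {
  schedule = schedule || setImmediate;
  return new Promise(function (resolve, reject) {
    const results = [];
    let index = 0;

    function runChunk() {
      try {
        const end = Math.min(index + chunkSize, items.length);
        for (; index < end; index++) {
          results.push(processItem(items[index]));
        }
      } catch (err) {
        reject(err);
        return;
      }
      if (index < items.length) {
        schedule(runChunk); // queue the next slice behind pending I/O callbacks
      } else {
        resolve(results);
      }
    }

    runChunk();
  });
}
```

Because it returns a promise, it could be yielded from a generator-based handler, for example let response = yield processInChunks(data, 500, compileRow), where compileRow is a hypothetical per-item function. Note that the total CPU time stays the same; only the blocking is broken up.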
You could use a child process with an additional wrapper for convenience.
worker.js - this module will run in a separate process and will do the heavy work
const crypto = require('crypto');
function doHeavyWork(data) {
return crypto.pbkdf2Sync(data, 'salt', 100000, 64, 'sha512');
}
process.on('message', (message) => {
const result = doHeavyWork(message.data);
process.send({ id: message.id, result });
});
client.js - a convenience (but primitive) wrapper for Child process
const cp = require('child_process');
let worker;
const resolves = new Map();
module.exports = {
init(moduleName, errorCallback) {
worker = cp.fork(moduleName);
worker.on('error', errorCallback);
worker.on('message', (message) => {
const resolve = resolves.get(message.id);
resolves.delete(message.id);
if (!resolve) {
errorCallback(new Error(`Got response from worker with unknown id: ${message.id}`));
return;
}
resolve(message.result);
});
console.log(`Service PID: ${process.pid}, Worker PID: ${worker.pid}`);
},
doHeavyWorkRemotly(data) {
const id = `${Date.now()}${Math.random()}`;
return new Promise((resolve) => {
worker.send({ id, data });
resolves.set(id, resolve);
});
}
}
I use fork() to utilize an additional communication channel as it is stated in the docs.
I also keep a record of all requests submitted to the worker process (const resolves = new Map();) and resolve the Promises (resolve(message.result);) only when the worker process returns a response for the specific request (const resolve = resolves.get(message.id);).
run.js - a startup module, it utilizes co to 'execute' generators.
const co = require('co');
const client = require('./client');
function errorCallback(error) {
console.log('Got an unexpected error!');
console.log(error);
}
client.init('./worker.js', errorCallback);
function* run() {
while(true) {
yield client.doHeavyWorkRemotly('mydata');
}
}
co(run);
To test it simply run node run.js, it will print
Service PID: XXXX, Worker PID: XXXX
Then take a look at CPU utilization: the worker process will probably take around 100% of a CPU while the service process stays quite idle.
I'm currently building a web app that has two clear use cases:
1. Traditional: the client requests data from the server.
2. The client requests a stream from the server, after which the server starts pushing data to the client.
Currently I'm implementing both 1 and 2 using JSON message passing over a websocket. However, this has proven hard, since I need to hand-code lots of error handling because the client is not waiting for the response. It just sends the message, hoping it will get a reply sometime.
I'm using JS and React on the frontend and Clojure on the backend.
I have two questions regarding this.
Given the current design, what alternatives are there for error handling over a websocket?
Would it be smarter to split the two use cases, using REST for UC1 and websockets for UC2? Then I could use something like fetch on the frontend for REST calls.
Update.
The current problem is not knowing how to build an async send function over websockets that can match sent messages with response messages.
Here's a scheme for doing request/response over socket.io. You could do this over plain webSocket, but you'd have to build a little more of the infrastructure yourself. This same library can be used in client and server:
function initRequestResponseSocket(socket, requestHandler) {
var cntr = 0;
var openResponses = {};
// send a request
socket.sendRequestResponse = function(data, fn) {
// put this data in a wrapper object that contains the request id
// save the callback function for this id
var id = cntr++;
openResponses[id] = fn;
socket.emit('requestMsg', {id: id, data: data});
}
// process a response message that comes back from a request
socket.on('responseMsg', function(wrapper) {
var id = wrapper.id, fn;
if (typeof id === "number" && typeof openResponses[id] === "function") {
fn = openResponses[id];
delete openResponses[id];
// error-first callback: no error, then the response data
fn(null, wrapper.data);
}
});
// process a requestMsg
socket.on('requestMsg', function(wrapper) {
// ids start at 0, so test the type rather than truthiness
if (requestHandler && typeof wrapper.id === "number") {
requestHandler(wrapper.data, function(responseToSend) {
socket.emit('responseMsg', {id: wrapper.id, data: responseToSend});
});
}
});
}
This works by wrapping every message sent in a wrapper object that contains a unique id value. Then, when the other end sends its response, it includes that same id value. That id value can then be matched up with a particular callback response handler for that specific message. It works both ways, from client to server or server to client.
You use this by calling initRequestResponseSocket(socket, requestHandler) once on a socket.io socket connection on each end. If you wish to receive requests, then you pass a requestHandler function which gets called each time there is a request. If you are only sending requests and receiving responses, then you don't have to pass in a requestHandler on that end of the connection.
To send a message and wait for a response, you do this:
socket.sendRequestResponse(data, function(err, response) {
if (!err) {
// response is here
}
});
If you're receiving requests and sending back responses, then you do this:
initRequestResponseSocket(socket, function(data, respondCallback) {
// process the data here
// send response
respondCallback(yourResponseData);
});
As for error handling, you can monitor for a loss of connection and you could build a timeout into this code so that if a response doesn't arrive in a certain amount of time, then you'd get an error back.
Here's an expanded version of the above code that implements a timeout for a response that does not come within some time period:
function initRequestResponseSocket(socket, requestHandler, timeout) {
var cntr = 0;
var openResponses = {};
// send a request
socket.sendRequestResponse = function(data, fn) {
// put this data in a wrapper object that contains the request id
// save the callback function for this id
var id = cntr++;
openResponses[id] = {fn: fn};
socket.emit('requestMsg', {id: id, data: data});
if (timeout) {
openResponses[id].timer = setTimeout(function() {
delete openResponses[id];
if (fn) {
fn("timeout");
}
}, timeout);
}
}
// process a response message that comes back from a request
socket.on('responseMsg', function(wrapper) {
var id = wrapper.id, requestInfo;
if (typeof id === "number" && typeof openResponses[id] === "object") {
requestInfo = openResponses[id];
delete openResponses[id];
if (requestInfo) {
if (requestInfo.timer) {
clearTimeout(requestInfo.timer);
}
if (requestInfo.fn) {
requestInfo.fn(null, wrapper.data);
}
}
}
});
// process a requestMsg
socket.on('requestMsg', function(wrapper) {
// ids start at 0, so test the type rather than truthiness
if (requestHandler && typeof wrapper.id === "number") {
requestHandler(wrapper.data, function(responseToSend) {
socket.emit('responseMsg', {id: wrapper.id, data: responseToSend});
});
}
});
}
There are a couple of interesting things in your question and your design. I prefer to ignore the implementation details and look at the high-level architecture.
You state that you are looking at a client that requests data and a server that responds with some stream of data. Two things to note here:
HTTP 1.1 has options to send streaming responses (chunked transfer encoding). If your use case is only the sending of streaming responses, this might be a better fit for you. This does not hold when you, e.g., want to push messages to the client that are not responses to some sort of request (sometimes referred to as server-sent events).
Websockets, contrary to HTTP, do not natively implement some sort of request-response cycle. You can use the protocol as such by implementing your own mechanism, something that e.g. the subprotocol WAMP is doing.
As you have found out, implementing your own mechanism comes with its pitfalls; that is where HTTP has the clear advantage. Given the requirements stated in your question I would opt for the HTTP streaming method instead of implementing your own request/response mechanism.
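On the frontend, consuming such a streamed response mostly comes down to reassembling messages from arbitrary chunks. Here is a minimal sketch, assuming the server writes one JSON document per line (newline-delimited JSON); that framing and the name createNdjsonParser are assumptions for illustration, not something the question specifies.

```javascript
// Returns a function that accepts text chunks as they arrive and calls
// onMessage(parsedObject) once for every complete line received so far.
function createNdjsonParser(onMessage) {
  let buffered = '';
  return function push(chunkText) {
    buffered += chunkText;
    const lines = buffered.split('\n');
    buffered = lines.pop(); // keep the trailing partial line (possibly empty)
    for (const line of lines) {
      if (line.trim() !== '') {
        onMessage(JSON.parse(line));
      }
    }
  };
}
```

Each decoded text chunk from the response stream would be passed to the returned function, so chunk boundaries that fall in the middle of a message do not matter.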
I'm creating http requests from my NodeJS application like this:
var start;
var req = http.request(options, function (res) {
res.setEncoding('utf8');
var body = '';
res.on('data', function (chunk) {
body += chunk;
});
res.on('end', function () {
var diff = process.hrtime(start); // [seconds, nanoseconds]
var elapsed = diff[0] * 1e3 + diff[1] / 1e6; // in milliseconds, including whole seconds
console.log(elapsed);
// whatever
});
});
req.on('socket', function (res) {
start = process.hrtime();
});
req.on('error', function (res) {
// whatever
});
if (props.data) req.write(props.data);
req.end();
I want to find out how long my requests take, starting from (or the closest I can get to) the moment the request has been sent over the wire (and not the moment that the promise has been created) and up to the moment that the "response end" event kicks in.
I'm having a bit of trouble finding the closest moment / event which I could hook into to start measuring the time. The HTTP client socket event is my best bet so far, but its description:
Emitted after a socket is assigned to this request.
doesn't really tell whether that's the event I'm looking for. So, my question is: am I doing this right, or is there a better event I could use (or even a better way of doing the whole thing)? Thanks.
Yes, my understanding is that the socket event is the way to go.
For some proof: in the Node.js HTTP code, this.emit('socket') happens right before this.socket.write(data, encoding).
There's a check in there which I'm not sure about:
if (!this.socket.writable) return; // XXX Necessary?
So, in your socket event handler you may want to perform the same check and see if it ever comes back that the socket is not writable.
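A small sketch of how that could be wired up is shown below. The helper names hrtimeToMs and createRequestTimer are made up for illustration; the only API relied on is process.hrtime(), which returns a [seconds, nanoseconds] pair, so both parts have to be combined to get a correct duration for requests longer than one second.

```javascript
// Convert a process.hrtime() tuple [seconds, nanoseconds] into milliseconds.
function hrtimeToMs(diff) {
  return diff[0] * 1e3 + diff[1] / 1e6;
}

// Small timer meant to be started from the request's 'socket' event handler.
function createRequestTimer() {
  var start = null;
  return {
    // Call from req.on('socket', ...); records the start time and reports
    // whether the socket was writable at that moment.
    begin: function (socket) {
      start = process.hrtime();
      return Boolean(socket && socket.writable);
    },
    // Call from res.on('end', ...); returns elapsed milliseconds since begin().
    elapsedMs: function () {
      if (start === null) {
        throw new Error('begin() was not called');
      }
      return hrtimeToMs(process.hrtime(start));
    }
  };
}
```

Logging the value returned by begin() would show whether the non-writable case ever actually occurs for your requests.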