SignalR with long process: send intermediate messages - javascript

I am using SignalR to start long computations on server side and post a message to the client when the result is available.
The input bindings is an HTTP request.
I would like to be able to send multiple messages back in order to notify the client of the differents steps of the process (eg, computation starts, computation ends, etc..).
I tried pushing different messages to context.bindings.signalRMessages but I see that everything is sent together at the end of the whole process. Is there a way to send several messages at different times?
Another related issue is that my HTTP request on client side remains stuck until the end of the process. I would like to be able to post a quick response early, since I get the response via a signalR message.
Here is my server code :
module.exports = async function(context, req) {
let ID = context.bindingData.invocationId;
context.bindings.signalRMessages = [];
const messageQueue = context.bindings.signalRMessages;
var postMessage = (message) => {
message.userId = req.query.userId;
message.isPrivate = true;
messageQueue.push(message);
};
let preProcessData = preProcess(req.body.input);
let startMessage = {
"target": "optimStart",
"arguments": [{ preProcessData: preProcessData }]
};
postMessage(startMessage); // <<<< I want this one to be sent immediately
try {
let optimOutput = await computeOptim(req.body.input, ID); // that's the long process
let response = {
optimId: ID,
optimOutput: optimOutput
};
let optimCompleteMessage = {
"target": "optimComplete",
"arguments": [response]
};
postMessage(optimCompleteMessage);
} catch (err) {
// ....
}
};
Am I doing anything wrong or is it just not possible ?
Thanks!

This is not possible with a simple HTTP triggered function since bindings resolve only once the execution of the function completes.
For your scenario, durable functions would be the perfect choice.
You would still have a HTTP Triggered function (client function) to start on orchestration and return immediately. In the orchestration function, you would have separate activity functions for the processing and for sending updates to the client using the SignalR binding.

Related

JS socket.io client doesn't catch an emitted event

I'm using a Python based socketio server and JS socketio client. The workflow looks roughly like this:
client emits <<action1>> -> server receives <<action1>> -> server performs some calculations -> server emits <<action2>> -> client receives <<action2>>
It should work in a loop.
The problem is that for some reason the last step isn't performed, client doesn't seem to catch action2 event. I can see that it has been fired from the server logs, they state that: emitting event "action2" to all [/]
Server:
#sio.on('action1')
def handle_analysis():
... some calculations ...
sio.emit('action2')
Client:
import { io } from "socket.io-client";
export const socket = io("http://localhost:5550",
{
timeout: 200000
});
async runCalculations() {
console.log('progress1');
let stdoutChunks: string = '';
const getCalculations = () => {
socket.emit('action1');
return new Promise((resolve, reject) => {
socket.on('action2', (msg) => {
stdoutChunks += msg;
resolve(stdoutChunks);
})
});
};
await getCalculations()
console.log('progress2');
return {};
}
progress2 is never logged, because js keeps on awaiting the getCalculations function.
Can anyone point where the issue might be?
Each calculation takes roughly 50 seconds, I wonder if that might be the cause.
I used same approach for different events that take much less time and it works perfectly, server communicates with the client without problems.
I'm not that good/versed with python-socket.io but isn't getCalculations() method defined inside runCalculations() in Client. So, await getCalculations() won't directly work and also in Server shouldn't sio.emit('action2') also have a second argument like this sio.emit('action2', 'hello'). Hope, this may be useful

How to notify HTTP client of the completion of a long task

I have a Node.js system that uploads a large number of objects to MongoDB and creates folders in dropbox for each object. This takes around 0.5 seconds per object. In situations therefore where i have many objects this could take up to around a minute. What i currently do is notify the client that the array of objects has been accepted using a 202 response code. However how do i then notify the client of completion a minute later.
app.post('/BulkAdd', function (req, res) {
issues = []
console.log(req.body)
res.status(202).send({response:"Processing"});
api_functions.bulkAdd(req.body).then( (failed, issues, success) => {
console.log('done')
})
});
bulkAdd: async function (req, callback) {
let failed = []
let issues = []
let success = []
i = 1
await req.reduce((promise, audit) => {
// return promise.then(_ => dropbox_functions.createFolder(audit.scanner_ui)
let globalData;
return promise.then(_ => this.add(audit)
.then((data)=> {globalData = data; return dropbox_functions.createFolder(data.ui, data)}, (error)=> {failed.push({audit: audit, error: 'There was an error adding this case to the database'}); console.log(error)})
.then((data)=>{console.log(data, globalData);return dropbox_functions.checkScannerFolderExists(audit.scanner_ui)},(error)=>{issues.push({audit: globalData, error: 'There was an error creating the case folder in dropbox'})})
.then((data)=>{return dropbox_functions.moveFolder(audit.scanner_ui, globalData.ui)},(error)=>{issues.push({audit: globalData, error: 'No data folder was found so an empty one was created'}); return dropbox_functions.createDataFolder(globalData.ui)})
.then(()=>success.push({audit:globalData}), issues.push({audit: globalData, error: 'Scanner folder found but items not moved'}))
);
}, Promise.resolve()).catch(error => {console.log(error)});
return(failed, issues, success)
},
Well the problem with making client request wait, is it will timeout after certain period or sometimes will show error with no response received.
What you can do is
- Make client request to server to initiate the task, and return 200OK and keep doing your task on server.
- Now write a file on server after insertion of every object as status.
- Read the file from client every 5-10 sec to check if server has completed creating objects or not.
- Mean while your task is not completed on server, show status with completion percentage or some animation.
Or simply implement WebHook or WebSockets to maintain communication.

Can I yield to a child process and return the response in Node.js?

In short, I've run into an issue where multiple parallel GET requests to my Node.js server cause the server to get "clogged up" and hang, thus resulting in timeouts for the clients (503, service unavailable).
After a lot of performance analysis, I've realized it's a CPU issue. The specific request (we'll call it GET /foo) queries data from multiple services over HTTP, and then does a lot of computation, and returns the results to the client, like this:
Client request GET /foo
/foo controller queries data over HTTP from multiple other services`
/foo controller then does a bunch of iterations over the data to compile some output for the client
Step 3 takes around 2 seconds to complete. However, if I send 2 requests in parallel to /foo, each client will receive their response in about 4 seconds. When I run the app in a cluster using more cores, the requests run much faster, but not quite what I want.
Seems like I have several options here:
pre-compute the response (ideally would like to avoid this for now, since it will require a whole "cache invalidation" scheme), or
/foo sends the CPU-blocking computation asynchronously to another process (using Heroku, so that would be another dyno), and then I can use a websocket or something to push the results to the client (again, very complex for my situation), or
somehow yield to a child process in the request and return the results to the client
Would love to do something like option 3. Something like this:
get('/foo', function*(request) {
// I/O, so not blocking the event loop (I think)
let data = yield getData(request)
// make this happen in a different process
let response = yield doSomeHeavyProcessing(data)
return response
})
I've omitted a lot of implementation details above, but if it's necessary to know, I'm using Koa and Node.js 6.
Ideally, doSomeHeavyProcessing would do the CPU-intensive computation in some separate process, and when it's done, still send the results back in a "synchronous" fashion to the request client.
Been trying to wrap my head around child processes, web workers, fibers, etc., and have been doing some basic "hello worlds" with these to get them to do basically the above, but to no avail. Can post more details if necessary.
Here are some approaches that you can try:
1.
Split blocking computation in small chunks and use setImmediate to place the next chunk of work at the end of the event queue. So computation is no longer blocking and other requests can be processed.
2.
Microsoft recently released napajs. As stated in their README
As it evolves, we find it useful to complement Node.js in CPU-bound tasks, with the capability of executing JavaScript in multiple V8 isolates and communicating between them.
I haven't tried it, but it looks very promising:
var napa = require('napajs');
var zone1 = napa.zone.create('zone1', { workers: 4 });
get('/foo', function*(request) {
let data = yield getData(request)
let response = yield zone1.execute(doSomeHeavyProcessing, [data])
return response
})
3. If nothing of the above is enough and you need to spread the load across multiple machines, then you probably couldn't avoid using some sort of message queue to distribute work to different servers. In this case check out ZeroMQ. It is extremely easy to use from node, and you can implement any kind of distributed messaging pattern with it.
You could utilize Child process with additional wrapper for convenience.
worker.js - this module will run in a separate process and will do the heavy work
const crypto = require('crypto');
function doHeavyWork(data) {
return crypto.pbkdf2Sync(data, 'salt', 100000, 64, 'sha512');
}
process.on('message', (message) => {
const result = doHeavyWork(message.data);
process.send({ id: message.id, result });
});
client.js - a convenience (but primitive) wrapper for Child process
const cp = require('child_process');
let worker;
const resolves = new Map();
module.exports = {
init(moduleName, errorCallback) {
worker = cp.fork(moduleName);
worker.on('error', errorCallback);
worker.on('message', (message) => {
const resolve = resolves.get(message.id);
resolves.delete(message.id);
if (!resolve) {
errorCallback(new Error(`Got response from worker with unknown id: ${message.id}`));
return;
}
resolve(message.result);
});
console.log(`Service PID: ${process.pid}, Worker PID: ${worker.pid}`);
},
doHeavyWorkRemotly(data) {
const id = `${Date.now()}${Math.random()}`;
return new Promise((resolve) => {
worker.send({ id, data });
resolves.set(id, resolve);
});
}
}
I use fork() to utilize an additional communication channel as it is stated in the docs.
Also I keep a record of all submitted to worker process requests (const resolves = new Map();) and resolve Promises (resolve(message.result);) only when the worker process returns response for the specific request (const resolve = resolves.get(message.id);).
run.js - a startup module, it utilizes co to 'execute' generators.
const co = require('co');
const client = require('./client');
function errorCallback(error) {
console.log('Got an unexpected error!');
console.log(error);
}
client.init('./worker.js', errorCallback);
function* run() {
while(true) {
yield client.doHeavyWorkRemotly('mydata');
}
}
co(run);
To test it simply run node run.js, it will print
Service PID: XXXX, Worker PID: XXXX
then take a look at CPU utilization, worker process will probably take around 100% of CPU while Service will be quite idle.

Error handling over websockets a design dessision

Im currently building a webapp that has two clear use cases.
Traditional client request data from server.
Client request a stream from the server after wich the server starts pushing data to the client.
Currently im implementing both 1 and 2 using json message passing over a websocket. However this has proven hard since I need to handcode lots of error handling since the client is not waiting for the response. It just sends the message hoping it will get a reply sometime.
Im using Js and react on the frontend and Clojure on the backend.
I have two questions regarding this.
Given the current design, what alternatives are there for error handling over a websocket?
Would it be smarter to split the two UC using rest for UC1 and websockets for UC2 then i could use something like fetch on the frontend for rest calls.
Update.
The current problem is not knowing how to build an async send function over websockets can match send messages and response messages.
Here's a scheme for doing request/response over socket.io. You could do this over plain webSocket, but you'd have to build a little more of the infrastructure yourself. This same library can be used in client and server:
function initRequestResponseSocket(socket, requestHandler) {
var cntr = 0;
var openResponses = {};
// send a request
socket.sendRequestResponse = function(data, fn) {
// put this data in a wrapper object that contains the request id
// save the callback function for this id
var id = cntr++;
openResponses[id] = fn;
socket.emit('requestMsg', {id: id, data: data});
}
// process a response message that comes back from a request
socket.on('responseMsg', function(wrapper) {
var id = wrapper.id, fn;
if (typeof id === "number" && typeof openResponses[id] === "function") {
fn = openResponses[id];
delete openResponses[id];
fn(wrapper.data);
}
});
// process a requestMsg
socket.on('requestMsg', function(wrapper) {
if (requestHandler && wrapper.id) {
requestHandler(wrapper.data, function(responseToSend) {
socket.emit('responseMsg', {id: wrapper.id, data; responseToSend});
});
}
});
}
This works by wrapping every message sent in a wrapper object that contains a unique id value. Then, when the other end sends it's response, it includes that same id value. That id value can then be matched up with a particular callback response handler for that specific message. It works both ways from client to server or server to client.
You use this by calling initRequestResponseSocket(socket, requestHandler) once on a socket.io socket connection on each end. If you wish to receive requests, then you pass a requestHandler function which gets called each time there is a request. If you are only sending requests and receiving responses, then you don't have to pass in a requestHandler on that end of the connection.
To send a message and wait for a response, you do this:
socket.sendRequestResponse(data, function(err, response) {
if (!err) {
// response is here
}
});
If you're receiving requests and sending back responses, then you do this:
initRequestResponseSocket(socket, function(data, respondCallback) {
// process the data here
// send response
respondCallback(null, yourResponseData);
});
As for error handling, you can monitor for a loss of connection and you could build a timeout into this code so that if a response doesn't arrive in a certain amount of time, then you'd get an error back.
Here's an expanded version of the above code that implements a timeout for a response that does not come within some time period:
function initRequestResponseSocket(socket, requestHandler, timeout) {
var cntr = 0;
var openResponses = {};
// send a request
socket.sendRequestResponse = function(data, fn) {
// put this data in a wrapper object that contains the request id
// save the callback function for this id
var id = cntr++;
openResponses[id] = {fn: fn};
socket.emit('requestMsg', {id: id, data: data});
if (timeout) {
openResponses[id].timer = setTimeout(function() {
delete openResponses[id];
if (fn) {
fn("timeout");
}
}, timeout);
}
}
// process a response message that comes back from a request
socket.on('responseMsg', function(wrapper) {
var id = wrapper.id, requestInfo;
if (typeof id === "number" && typeof openResponse[id] === "object") {
requestInfo = openResponses[id];
delete openResponses[id];
if (requestInfo) {
if (requestInfo.timer) {
clearTimeout(requestInfo.timer);
}
if (requestInfo.fn) {
requestInfo.fn(null, wrapper.data);
}
}
}
});
// process a requestMsg
socket.on('requestMsg', function(wrapper) {
if (requestHandler && wrapper.id) {
requestHandler(wrapper.data, function(responseToSend) {
socket.emit('responseMsg', {id: wrapper.id, data; responseToSend});
});
}
});
}
There are a couple of interesting things in your question and your design, I prefer to ignore the implementation details and look at the high level architecture.
You state that you are looking to a client that requests data and a server that responds with some stream of data. Two things to note here:
HTTP 1.1 has options to send streaming responses (Chunked transfer encoding). If your use-case is only the sending of streaming responses, this might be a better fit for you. This does not hold when you e.g. want to push messages to the client that are not responding to some sort of request (sometimes referred to as Server side events).
Websockets, contrary to HTTP, do not natively implement some sort of request-response cycle. You can use the protocol as such by implementing your own mechanism, something that e.g. the subprotocol WAMP is doing.
As you have found out, implementing your own mechanism comes with it's pitfalls, that is where HTTP has the clear advantage. Given the requirements stated in your question I would opt for the HTTP streaming method instead of implementing your own request/response mechanism.

NodeJS HTTP server stalled on V8 execution

EDITED
I have a nodeJS http server that is meant for receiving uploads from multiple clients and processing them separately.
My problem is that I've verified that the first request blocks the reception of any other request until the previous request is served.
This is the code I've tested:
var http = require('http');
http.globalAgent.maxSockets = 200;
var url = require('url');
var instance = require('./build/Release/ret');
http.createServer( function(req, res){
var path = url.parse(req.url).pathname;
console.log("<req>"+path+"</req>");
switch (path){
case ('/test'):
var body = [];
req.on('data', function (chunk) {
body.push(chunk);
});
req.on('end', function () {
body = Buffer.concat(body);
console.log("---req received---");
console.log(Date.now());
console.log("------------------");
instance.get(function(result){
postHTTP(result, res);
});
});
break;
}
}).listen(9999);
This is the native side (omitting obvious stuff) where getInfo is the exported method:
std::string ret2 (){
sleep(1);
return string("{\"image\":\"1.JPG\"}");
}
Handle<Value> getInfo(const Arguments &args) {
HandleScope scope;
if(args.Length() == 0 || !args[0]->IsFunction())
return ThrowException(Exception::Error(String::New("Error")));
Persistent<Function> fn = Persistent<Function>::New(Handle<Function>::Cast(args[0]));
Local<Value> objRet[1] = {
String::New(ret2().c_str())
};
Handle<Value> ret = fn->Call(Context::GetCurrent()->Global(), 1, objRet);
return scope.Close(Undefined());
}
I'm resting this with 3 curl parallel requests
for i in {1..3}; do time curl --request POST --data-binary "#/home/user/Pictures/129762.jpg" http://192.160.0.1:9999/test & done
This is the output from the server:
<req>/test</req>
---req received---
1397569891165
------------------
<req>/test</req>
---req received---
1397569892175
------------------
<req>/test</req>
---req received---
1397569893181
------------------
These the response and the timing from the client:
"1.JPG"
real 0m1.024s
user 0m0.004s
sys 0m0.009s
"1.JPG"
real 0m2.033s
user 0m0.000s
sys 0m0.012s
"1.JPG"
real 0m3.036s
user 0m0.013s
sys 0m0.001s
Apparently requests are received after the previous has been served. The sleep(1) simulates a synchronous operation that requires about 1s to complete and can't be changed.
The client receives the responses with an incremental delay of ~1s.
I would like to achieve a kind of parallelism, although I'm aware I'm in a single threaded environment such as nodeJS. What I would like to achieve is receiving all 3 answers is ~1s.
Thanks in advance for your help.
This:
for(var i=0;i<1000000000;i++) var a=a+i;
Is a pretty severe blocking operation. As soon as the first block ends. Your whole server hangs until this for loop is done. I'm interested in why you are trying to do this.
Perhaps you are trying to simulate a delayed response ?
setTimeout(function)({
send404(res);
}, 3000);
Right now you are turning a non-flowing stream into flowing mode by attaching a data event handler, and subsequently loading the whole stream into memory. You probably don't want to do this.
You can use the stream in now-flowing mode as illustrated below, this is useful if you want to send the data to some place that is only accessible after some other event.
However, using the stream in flowing mode is the fastest. If you want to write your own body parser I suppose you might want to use flowing mode, it depends on your use case.
req.on('readable', function () {
var chunk;
while (null !== (chunk = readable.read())) {
body.push(chunk);
}
});
Flowing and non-flowing mode is also know as respectively v1 and v2 streams, as the older streams used in node only supported flowing mode.

Categories

Resources