nodejs how to avoid blocking requests - javascript

hi, i wrote this simple Node.js server:
const express = require('express')
const app = express()
app.use(express.json())

app.get('/otherget', (req, res) => {
  res.send('other get');
});

app.get('/get', async (req, res) => {
  let response;
  try {
    response = await func1();
  } catch (error) {
    // errors ignored for this example
  }
  res.send('response: ' + response);
});

const func1 = () => {
  let j = 0;
  for (let i = 0; i < 10000000000; i++) {
    j++;
  }
  return j;
}

app.listen(4000, () => console.log('listen 4000'));
When the server gets a request to the route '/get', it can't serve any other request until it has finished with that one.
How can I prevent this situation when my server has a lot of requests doing a lot of things?

Any server can be blocked by your "crazy loop": there is only one JS thread handling requests.
Try to learn about JS's event loop to understand which kinds of work can block it (CPU-intensive tasks).
The general solution for surviving this kind of CPU-intensive task is to create more threads / processes:
you can fork the process, then you will be able to run two loops of this kind at once
you can create worker threads
or you can load balance between multiple servers
you can create a queue to prevent having too much parallel work
you can use the yield keyword (generators) to stop execution at specific checkpoints
etc.
Every solution has its pros and cons.

Related

Making 10,000 HTTP Requests Quickly

I would like to make 10,000 concurrent HTTP requests. I am currently doing it by using Promise.all. However, I seem to be rate limited in some way: it takes around 15-30 mins to complete all 10,000 requests. Is there something in axios or in the HTTP requests in Node that is limiting me? How can I raise the limit if there is one?
const axios = require('axios');

function http_request(url) {
  return new Promise(async (resolve) => {
    await axios.get(url);
    // -- DO STUFF
    resolve();
  });
}

async function many_requests(num_requests) {
  let all_promises = [];
  for (let i = 0; i < num_requests; i++) {
    let url = 'https://someurl.com/' + i;
    let promise = http_request(url);
    all_promises.push(promise);
  }
  return Promise.all(all_promises);
}

async function run() {
  await many_requests(10000);
}

run();
In Node.js there are two types of threads: one Event Loop (aka the
main loop, main thread, event thread, etc.), and a pool of k Workers
in a Worker Pool (aka the threadpool).
...
The Worker Pool of Node.js is implemented in libuv (docs), which
exposes a general task submission API.
The event loop runs in one thread and pushes tasks to the pool of k Workers, and those workers run in parallel. The default number of workers in the pool is 4; you can set more.
source
libuv
The default UV_THREADPOOL_SIZE is 4. You can raise UV_THREADPOOL_SIZE as described in the link; its upper limit depends on the OS, so check yours:
set UV_THREADPOOL_SIZE
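For example, the variable can be set on the command line when launching Node (the value 8 here is arbitrary). Note that the libuv threadpool serves fs, dns.lookup, zlib and some crypto work; plain TCP/HTTP sockets are handled by the event loop itself, so raising it may not affect raw request throughput.

```shell
# UV_THREADPOOL_SIZE must be set before the Node process starts
# (libuv sizes its pool at startup). Default is 4.
UV_THREADPOOL_SIZE=8 node -e 'console.log(process.env.UV_THREADPOOL_SIZE)'
```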

How to send batch axios GET requests by iterating +1

How do I send bulk HTTP GET requests using Axios, for example:
let maxI = 3000;
let i = 0;
do {
  i = i + 1;
  exampleUrl = await axios.get(`https://hellowWorld.com/${i}`);
} while (i < maxI);
How will I be able to receive the data from all the provided URLs and can this be merged into a single variable? And how can I make sure that this gets executed quickly?
I know about axios.all, but I don't know how to apply it in my case.
Thank you in advance.
You can do something like this, but be careful: servers may reject your requests if you make them in bulk (to prevent DDoS), and this also doesn't guarantee that every request returns successfully and that you receive all the data. Here is the snippet:
import axios from "axios";

const URL = "https://randomuser.me/api/?results=";

async function getData() {
  const requests = [];
  for (let i = 1; i < 6; i++) {
    requests.push(axios.get(URL + i));
  }
  const responses = await Promise.allSettled(requests);
  console.log(responses);
  const result = [];
  responses.forEach((item) => {
    if (item.status === "rejected") return;
    result.push(item.value.data.results);
  });
  console.log(result.flat());
}

getData();
AFAIK, you cannot meaningfully reduce the time taken to complete your batch unless you implement a batch-request API on the server, which would reduce both the number of requests the server handles and the number the browser makes. The solution above just demonstrates how it can be done from the client side; your approach is not an optimal way to do it.
There is also a per-browser limit on the number of parallel requests to a single domain, which further caps how fast the queries can execute.
Please read through these resources for further information they will be of great help:
Bulk Requests Implementation
Browser batch request ajax
Browser request limits
Limit solution
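If you do want to bound the load from the client side, one common pattern is to fire the requests in fixed-size chunks rather than all at once. A sketch, with a stand-in request function in place of axios.get (the function and URL names are illustrative):

```javascript
// Run requests in chunks of `limit`: at most `limit` are in flight at a time.
async function inBatches(urls, limit, fetchOne) {
  const results = [];
  for (let i = 0; i < urls.length; i += limit) {
    const chunk = urls.slice(i, i + limit);
    // allSettled keeps one failed request from rejecting the whole batch
    const settled = await Promise.allSettled(chunk.map(fetchOne));
    for (const s of settled) {
      if (s.status === 'fulfilled') results.push(s.value);
    }
  }
  return results;
}

// Example with a stand-in request function (replace with axios.get).
const fakeFetch = (url) => Promise.resolve(`data for ${url}`);
const urls = Array.from({ length: 10 }, (_, i) => `https://example.com/${i}`);
inBatches(urls, 3, fakeFetch).then((r) => console.log(r.length)); // prints 10
```

Smaller chunks are slower overall but gentler on the server and less likely to trip rate limits.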

multiple tcp connections despite http keep-alive

I'm not sure whether I understand HTTP keep-alive correctly. In my opinion, it should reuse the TCP connection, not build a new one. However, I found something really strange; it seems hard to anticipate the behavior of HTTP keep-alive.
Server: NodeJS & Express ^4.16.3
and I have used Wireshark to analyze the results
Situation 1:
Server-side
for (let i = 1; i < 11; i++) {
  app.use('/' + i, (req, res) => {
    res.header('cache-control', 'no-store');
    res.send('i');
  });
}
server.keepAliveTimeout = 50000;
Client side
setTimeout(() => {
  for (let i = 1; i < 11; i++) {
    fetch('' + i).then(data => console.log(data));
  }
}, 10000);
result: the TCP connection is reused (only one TCP connection); all fetch requests reuse the connection established by index.html
Situation 2:
The client-side code is the same; only the server-side code changes here:
for (let i = 1; i < 11; i++) {
  app.use('/' + i, (req, res) => {
    res.header('cache-control', 'no-store');
    // here I have added a timeout!
    setTimeout(() => {
      res.send('i');
    }, 2000);
  });
}
result: 5 more TCP connections are established (only 4 in the picture, because the screenshot is incomplete), despite my having set server.keepAliveTimeout = 50000;
So my question is: what does HTTP keep-alive really mean? Why does it behave like this?
If it will not use the same TCP connection in situation 2, what is the meaning of keep-alive?
I appreciate any thoughts!
Yes, HTTP keep-alive should reuse your TCP connection to the server. The server appends a Connection: keep-alive header to the response, and that header is what tells the client to keep the connection alive, so the client won't keep the connection alive until it has seen your server's response.
So in your first scenario the server replies as soon as the request is received, and the following requests reuse the TCP connection (you got somewhat lucky: the server responded to the first request before the client sent the second one).
But in the second scenario the server waits 2 seconds before sending the response, so the client doesn't learn that the connection is keep-alive for another 2 seconds. All the other requests need to be sent before that, so by default the client creates a new connection for each HTTP request.
This is efficient if you call an HTTP interface continuously, like req -> res -> req -> res, but it can be inefficient if you want to fetch independent pieces of data from the server.
Try this on the client side if you have any doubts:
setTimeout(() => {
  fetch('1').then(data => console.log(data));
  setTimeout(function () {
    for (let i = 2; i < 11; i++) {
      fetch('' + i).then(data => console.log(data));
    }
  }, 5000);
}, 10000);

How to prevent race condition in node.js?

Can someone explain me how to prevent race conditions in node.js with Express?
If have for example this two methods:
router.get('/addUser/:department', function(req, res) { ...})
router.get('/deleteUser/:department', function(req, res) { ...})
Both functions use a non-blocking I/O operation (like writing to a file or a database).
Now someone calls 'addUser' with Department 'A' and someone tries to delete all users with department 'A'. How can I solve this (or other similar) race conditions?
How can I solve the problem if every user has its own file/database-record?
How can I solve the problem if I have a single user (filesystem) file that I have to read alter and write again?
Note: This is just an example for understanding. No optimization tips needed here.
To achieve this goal, you need to implement communication between the two services.
This can be done with a simple queue of operations, processing each request in order.
The side effect is that a request waiting in the queue gets a delayed response (and may time out).
A simple "meta" implementation is:
const events = require('events');

const operationQueue = new Map();
const eventEmitter = new events.EventEmitter();

router.get('/addUser/:department', function(req, res) {
  const customEvent = `addUser-${new Date().getTime()}`;
  const done = () => {
    res.send('done');
    operationQueue.delete(customEvent);
  };
  eventEmitter.once(customEvent, done);
  operationQueue.set(customEvent, () => addUser(customEvent, req));
})

router.get('/deleteUser/:department', function(req, res) {
  const customEvent = `deleteUser-${new Date().getTime()}`;
  const done = () => {
    res.send('done');
    operationQueue.delete(customEvent);
  };
  eventEmitter.once(customEvent, done);
  operationQueue.set(customEvent, () => deleteUser(customEvent, req));
})

function addUser(customEvent, req) {
  // do the logic
  eventEmitter.emit(customEvent, {done: true});
}

function deleteUser(customEvent, req) {
  // do the logic
  eventEmitter.emit(customEvent, {done: true});
}

// not the best performance: poll the queue and run the oldest entry
// (a Map has no shift(), so take the first inserted entry)
setInterval(() => {
  const next = operationQueue.entries().next().value;
  if (next) {
    const [key, operation] = next;
    operationQueue.delete(key);
    operation();
  }
}, 1);
Of course, if you'll use tools like a DB or a Redis queue it could fit better than this solution in terms of robustness and failover.
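A lighter in-process variant of the same queueing idea, assuming a single Node process, is to chain the operations for one department on a promise so they can never interleave (withLock and the keys are illustrative names, not from the question):

```javascript
// One promise chain per key: each new task runs only after the previous
// task for the same key has settled. Tasks for different keys still
// run concurrently.
const locks = new Map();

function withLock(key, task) {
  const tail = locks.get(key) || Promise.resolve();
  // queue this task behind whatever is already pending for the key,
  // whether the previous task resolved or rejected
  const next = tail.then(task, task);
  // store a non-rejecting tail so one failure doesn't poison the chain
  locks.set(key, next.catch(() => {}));
  return next;
}

// usage: operations on department 'A' run strictly in order
const log = [];
withLock('A', async () => { log.push('add'); });
withLock('A', async () => { log.push('delete'); });
withLock('A', async () => console.log(log.join(','))); // prints "add,delete"
```

This only serializes within one process; across multiple processes or servers you are back to DB locks or an external queue, as the answers above note.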
(This is a very broad question.)
Typically, one would use a database (instead of regular text files) and make use its in-built locking mechanisms.
Example of locking mechanisms in the Postgres database management system: https://www.postgresql.org/docs/10/static/explicit-locking.html

NodeJS HTTP server stalled on V8 execution

EDITED
I have a nodeJS http server that is meant for receiving uploads from multiple clients and processing them separately.
My problem is that I've verified that the first request blocks the reception of any other request until the previous request is served.
This is the code I've tested:
var http = require('http');
http.globalAgent.maxSockets = 200;
var url = require('url');
var instance = require('./build/Release/ret');

http.createServer(function(req, res) {
  var path = url.parse(req.url).pathname;
  console.log("<req>" + path + "</req>");
  switch (path) {
    case ('/test'):
      var body = [];
      req.on('data', function (chunk) {
        body.push(chunk);
      });
      req.on('end', function () {
        body = Buffer.concat(body);
        console.log("---req received---");
        console.log(Date.now());
        console.log("------------------");
        instance.get(function(result) {
          postHTTP(result, res);
        });
      });
      break;
  }
}).listen(9999);
This is the native side (omitting obvious stuff) where getInfo is the exported method:
std::string ret2 (){
sleep(1);
return string("{\"image\":\"1.JPG\"}");
}
Handle<Value> getInfo(const Arguments &args) {
HandleScope scope;
if(args.Length() == 0 || !args[0]->IsFunction())
return ThrowException(Exception::Error(String::New("Error")));
Persistent<Function> fn = Persistent<Function>::New(Handle<Function>::Cast(args[0]));
Local<Value> objRet[1] = {
String::New(ret2().c_str())
};
Handle<Value> ret = fn->Call(Context::GetCurrent()->Global(), 1, objRet);
return scope.Close(Undefined());
}
I'm testing this with 3 parallel curl requests:
for i in {1..3}; do time curl --request POST --data-binary "@/home/user/Pictures/129762.jpg" http://192.160.0.1:9999/test & done
This is the output from the server:
<req>/test</req>
---req received---
1397569891165
------------------
<req>/test</req>
---req received---
1397569892175
------------------
<req>/test</req>
---req received---
1397569893181
------------------
These are the responses and timings from the client:
"1.JPG"
real 0m1.024s
user 0m0.004s
sys 0m0.009s
"1.JPG"
real 0m2.033s
user 0m0.000s
sys 0m0.012s
"1.JPG"
real 0m3.036s
user 0m0.013s
sys 0m0.001s
Apparently each request is received only after the previous one has been served. The sleep(1) simulates a synchronous operation that takes about 1 s to complete and can't be changed.
The client receives the responses with an incremental delay of ~1 s each.
I would like to achieve some kind of parallelism, although I'm aware that I'm in a single-threaded environment such as Node.js. What I would like to achieve is receiving all 3 answers in ~1 s.
Thanks in advance for your help.
This:
for (var i = 0; i < 1000000000; i++) var a = a + i;
is a pretty severe blocking operation. As soon as it starts, your whole server hangs until the for loop is done. I'm interested in why you are trying to do this.
Perhaps you are trying to simulate a delayed response?
setTimeout(function () {
  send404(res);
}, 3000);
Right now you are switching a non-flowing stream into flowing mode by attaching a 'data' event handler, and subsequently loading the whole stream into memory. You probably don't want to do this.
You can use the stream in non-flowing mode as illustrated below; this is useful if you want to send the data somewhere that only becomes accessible after some other event.
However, using the stream in flowing mode is the fastest. If you want to write your own body parser, I suppose you might want flowing mode; it depends on your use case.
req.on('readable', function () {
  var chunk;
  while (null !== (chunk = req.read())) {
    body.push(chunk);
  }
});
Flowing and non-flowing mode are also known as v1 and v2 streams respectively, since the older streams used in Node only supported flowing mode.
