I am exploring ways to abort client requests that are taking too long, thereby consuming server resources.
Having read some sources (see below), I tried a solution like the following (as suggested here):
const express = require('express');
const server = express();
server
  .use((req, res, next) => {
    req.setTimeout(5000, () => {
      console.log('req timeout!');
      res.status(400).send('Request timeout');
    });
    res.setTimeout(5000, () => {
      console.log('res timeout!');
      res.status(400).send('Request timeout');
    });
    next();
  })
  .use(...) // more stuff here, of course
  .listen(3000);
However, it seems not to work: the callbacks are never called, and the request is not interrupted.
Yet, according to recent posts, it should.
Apart from setting the timeout globally (i.e. server.setTimeout(...)), which would not suit my use case,
I have seen many suggesting the connect-timeout middleware.
However, I read in its docs that
While the library will emit a ‘timeout’ event when requests exceed the given timeout, node will continue processing the slow request until it terminates.
Slow requests will continue to use CPU and memory, even if you are returning a HTTP response in the timeout callback.
For better control over CPU/memory, you may need to find the events that are taking a long time (3rd party HTTP requests, disk I/O, database calls)
and find a way to cancel them, and/or close the attached sockets.
It is not clear to me how to "find the events that are taking a long time" and "a way to cancel them",
so I was wondering if someone could share their suggestions.
Is this even the right way to go, or is there a more modern, "standard" approach?
Specs:
Node 12.22
Ubuntu 18.04
Linux 5.4.0-87-generic
Sources:
Express.js Response Timeout
Express.js connect timeout vs. server timeout
Express issue 3330 on GitHub
Express issue 4664 on GitHub
Edit:
I have seen some answers and comments offering a way to set up a timeout on responses or "request handlers": in other words, the time taken by the middleware is measured and aborted if it takes too long. However, I was seeking a way to time out requests, for example in the case of a client sending a large file over a slow connection. This probably happens even before the first handler in the Express router, which is why I suspect that there must be some kind of setting at the server level.
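For reference, here is roughly what the server-level knobs look like (a sketch against the underlying http.Server; note that server.requestTimeout, which limits the time to receive the entire request, only exists in Node 14.11+ and is not available on the Node 12.22 from the specs above):

const app = require('express')();
const server = app.listen(3000);

// Limit how long the client may take to send the request headers.
server.headersTimeout = 10000;

// Fires when the socket has been idle for 5s; destroying the socket
// is what actually aborts the in-flight request.
server.setTimeout(5000, (socket) => {
  console.log('socket timeout!');
  socket.destroy();
});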
Before rejecting long requests, I think it's better to measure them, find the slow ones, and optimize them if possible.
How to measure requests?
The simplest way is to measure the time from the start to the end of the request. You'll get: Request Time Taken = time in the Node.js event loop + time in your Node.js code + time waiting on setTimeout + time waiting for remote HTTP/DB/etc. services.
If you don't have many setTimeouts in your code, then Request Time Taken is a good metric.
(But in high-load situations it becomes unreliable, since it is greatly affected by time spent in the event loop.)
So you can try this measure-and-log solution:
http://www.sheshbabu.com/posts/measuring-response-times-of-express-route-handlers/
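For example, a minimal measure-and-log middleware can be sketched with only core Node timers (the log format is illustrative):

const express = require('express');
const app = express();

// Stamp each request with a start time, then log the elapsed time
// once the response has actually finished.
app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on('finish', () => {
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    console.log(`${req.method} ${req.originalUrl} took ${elapsedMs.toFixed(1)} ms`);
  });
  next();
});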
How to abort a long request
It all depends on your request handler.
Handler does heavy computing
In the case of heavy computing that blocks the main thread, there's nothing you can do without rewriting the handler.
If you set req.setTimeout(5000, ...), it only fires after res.send(), once the main loop is unblocked:
function (req, res) {
  for (let i = 0; i < 1000000000; i++) {
    // blocking the main thread
  }
  res.send("halted " + timeTakenMks(req));
}
So you can make your code async by injecting setTimeout(..., 0) somewhere, or move the computation to a worker thread, as in the sketch below.
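A minimal sketch of the worker-thread option (the inline eval worker is just for brevity; a separate worker file is cleaner):

const { Worker } = require('worker_threads');

// Run the blocking loop off the main thread and resolve with its result.
function heavyComputeInWorker() {
  return new Promise((resolve, reject) => {
    const worker = new Worker(
      `
      const { parentPort } = require('worker_threads');
      let sum = 0;
      for (let i = 0; i < 1000000000; i++) { sum += i; } // the blocking loop
      parentPort.postMessage(sum);
      `,
      { eval: true }
    );
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Usage: the event loop stays free while the worker runs,
// so req.setTimeout callbacks can actually fire.
// app.get('/halt', async (req, res) => {
//   res.send('computed ' + await heavyComputeInWorker());
// });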
Handler has many remote requests
I simulate remote requests with a promisified setTimeout:
async function (req, res) {
  let delayMs = 500;
  await delay(delayMs); // maybe an axios http call
  await delay(delayMs); // maybe a slow db query
  await delay(delayMs);
  await delay(delayMs);
  res.send("long delayed " + timeTakenMks(req));
}
In this case you can inject a helper to abort your request chain:
blockLongRequest throws an error if the request has already taken too long.
async function (req, res) {
  let delayMs = 500;
  await delay(delayMs); // maybe an axios call
  blockLongRequest(req);
  await delay(delayMs); // maybe a slow db query
  blockLongRequest(req);
  await delay(delayMs);
  blockLongRequest(req);
  await delay(delayMs);
  res.send("long delayed " + timeTakenMks(req));
}
Single remote request
async function (req, res) {
  let delayMs = 1000;
  await delay(delayMs);
  //blockLongRequest(req);
  res.send("delayed " + timeTakenMks(req));
}
We don't use blockLongRequest here because it's better to deliver an answer than an error.
An error may trigger the client to retry, and then your slow requests get doubled.
Full example
(sorry for the TypeScript; run it with yarn ts-node server.ts)
import express, { Request, Response, NextFunction } from "express";

declare global {
  namespace Express {
    export interface Request {
      start?: bigint;
    }
  }
}

const server = express();

server.use((req, res, next) => {
  req["start"] = process.hrtime.bigint();
  next();
});
server.get("/", function (req, res) {
res.send("pong " + timeTakenMks(req));
});
server.get("/halt", function (req, res) {
for (let i = 0; i < 1000000000; i++) {
//halting loop
}
res.send("halted " + timeTakenMks(req));
});
server.get(
"/delay",
expressAsyncHandler(async function (req, res) {
let delayMs = 1000;
await delay(delayMs);
blockLongRequest(req); //actually no need for it
res.send("delayed " + timeTakenMks(req));
})
);
server.get(
"/long_delay",
expressAsyncHandler(async function (req, res) {
let delayMs = 500;
await delay(delayMs); // mayby axios call
blockLongRequest(req);
await delay(delayMs); // maybe db slow query
blockLongRequest(req);
await delay(delayMs);
blockLongRequest(req);
await delay(delayMs);
res.send("long delayed" + timeTakenMks(req));
})
);
// Error-handling middleware must be registered after the routes it covers,
// so blockLongRequest errors forwarded by expressAsyncHandler land here.
server.use((err: any, req: Request, res: Response, next: NextFunction) => {
  console.error("Error captured:", err.stack);
  res.status(500).send(err.message);
});

server.listen(3000, () => {
  console.log("Ready on 3000");
});
function delay(delayMs: number): Promise<void> {
  return new Promise((resolve) => {
    setTimeout(() => {
      resolve();
    }, delayMs);
  });
}

function timeTakenMks(req: Request) {
  if (!req.start) {
    return BigInt(0);
  }
  const now = process.hrtime.bigint();
  const taken = now - req.start;
  return taken / BigInt(1000); // microseconds
}

function blockLongRequest(req: Request) {
  const timeTaken = timeTakenMks(req);
  if (timeTaken > BigInt(1000000)) { // 1 second, in microseconds
    throw Error("slow request - aborting after " + timeTaken + " mks");
  }
}

function expressAsyncHandler(
  fn: express.RequestHandler
): express.RequestHandler {
  return function asyncUtilWrap(...args) {
    const fnReturn = fn(...args);
    const next = args[args.length - 1] as any;
    return Promise.resolve(fnReturn).catch(next);
  };
}
I hope this approach helps you find an acceptable solution.
Related
I am creating a Node.js module that takes a list of movie titles and fetches their respective metadata from omdbapi.com.
These lists are often very large and sometimes (with my current slow internet connection) the connection stalls due to too many concurrent connections. So I set up a timeout/abort method that restarts the process after 30 seconds.
The problem I'm having is that whenever I lose internet connection or the connection stalls, it just bails out of the process, and doesn't restart the connection.
Example:
async function getMetadata () {
  const remainingMovies = await getRemainingMovies();
  let data, didTimeout;
  for (let i = 0; i < remainingMovies.length; i++) {
    ({ data, didTimeout } = await fetchMetadata(remainingMovies[i]));
    // Update "remainingMovies" Array...
    if (didTimeout) {
      getMetadata();
      break;
    }
  }
  if (!didTimeout) {
    return data;
  }
}
This is obviously a simplified version, but essentially:
The getMetadata function gets the remainingMovies array from the global scope.
It fetches the metadata from the server with the fetchMetadata function.
It checks whether the connection timed out or not.
If it did, it should restart the function and attempt to connect again.
If it didn't time out, finish the for loop and continue.
I guess you want something similar to the script below. Error handling using try/catch with async/await is probably the missing puzzle piece you are looking for.
async function getMetadata() {
  const remainingMovies = await getRemainingMovies();
  // map with an async callback and await all results;
  // a plain (non-async) arrow cannot contain await
  return Promise.all(remainingMovies.map(async movie => {
    try {
      return await fetchMetadata(movie);
    } catch (err) {
      return getMetadata();
    }
  }));
}
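If unbounded recursion on repeated timeouts is a concern, a bounded per-movie retry is a possible refinement (a sketch; maxRetries is illustrative):

// Retry a single movie a limited number of times instead of
// restarting the whole batch on every timeout.
async function fetchWithRetry(movie, maxRetries = 3) {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await fetchMetadata(movie);
    } catch (err) {
      if (attempt === maxRetries) throw err;
    }
  }
}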
I am trying to send the contents of a text file through a socket connection every time the text file updates using Express:
console.log('Server running!');
var express = require('express');
var app = express();
var server = app.listen(3000);
var fs = require("fs");
var x = 0;
app.use(express.static('public'));
var socket = require('socket.io');
var io = socket(server);
io.sockets.on('connection', newConnection);
function newConnection(socket) {
  console.log("New connection: " + socket.id);
  while (true) {
    fs.readFile('song.txt', function(err, data) {
      if (err) throw err;
      console.log(data);
      if (data != x) {
        var songdata = data;
        console.log(songdata);
        io.sockets.emit('update', songdata);
        x = data;
      } else {
        console.log("Song is not different:)");
      }
    })
  }
}
Without the while loop, everything works just fine and I receive the contents in the separate client. However, now nothing is happening: there is no console log of the data. This indicates the readFile is suddenly no longer running. Why?
Thanks:)
First off, some basics. node.js runs your Javascript as single threaded and thus this is a single threaded server. It can only do one thing with your Javascript at a time. But, if you program it carefully, it can scale really well and do lots of things.
Second off, you pretty much never want to do while (true) in server-side Javascript. That's just going to run forever and never let anything else run on your server. Nothing else.
Third, you are attempting to create a new version of that infinite loop every time a new client connects. That's not a correct design (even if there wasn't an infinite loop). You only need one instance of code checking the file, not N.
Now, if what you're really trying to do is to "poll" for changes in song.txt and notify the client whenever it changes, you need to pick a reasonable time delta between checks on the file and use a timer. This will check the file every so often and let your server run normally the rest of the time.
Here's a simple version that polls with setInterval():
console.log('Server code started!');
const express = require('express');
const app = express();
const server = app.listen(3000);
const fs = require("fs");
let lastSongData = 0;
app.use(express.static('public'));
const io = require('socket.io')(server);
// get initial songData for future use
// there will not be any connected clients yet so we don't need to broadcast it
try {
    // read as a 'utf8' string so the comparisons below compare strings,
    // not Buffer objects (two distinct Buffers are never ===)
    lastSongData = fs.readFileSync('song.txt', 'utf8');
} catch(e) {
    console.log(e, "\nDidn't find song.txt on server initialization");
}

// here, we create a polling loop that notifies all connected clients
// any time the song has changed
const pollInterval = 60*1000; // 60 seconds; ideally it should be longer than this
const pollTimer = setInterval(() => {
    fs.readFile('song.txt', 'utf8', (err, songData) => {
        if (!err && songData !== lastSongData) {
            // notify all connected clients
            console.log("found changed songData");
            io.emit('update', songData);
            lastSongData = songData;
        }
    });
}, pollInterval);
io.sockets.on('connection', socket => {
console.log("New connection: " + socket.id);
});
If your songData is binary, then you will have to change how you send the data to the client and how you compare the data to the previous data so you are sending and receiving binary data, not string data and so you are comparing buffers, not strings.
Here are some references on sending binary data with socket.io:
How to send binary data with socket.io?
How to send binary data from a Node.js socket.io server to a browser client?
A more efficient way to detect changes to the file is fs.watch(), which should notify you of changes without repeated reads. The feature has a number of platform caveats (it does not work identically on all platforms), so you have to test it thoroughly on your platform to make sure it works the way you want.
console.log('Server code started!');
const express = require('express');
const app = express();
const server = app.listen(3000);
const fs = require("fs");
let lastSongData = 0;
app.use(express.static('public'));
const io = require('socket.io')(server);
// get initial songData for future use
// there will not be any connected clients yet so we don't need to broadcast it
try {
    lastSongData = fs.readFileSync('song.txt', 'utf8'); // read as string, as above
} catch(e) {
    console.log(e, "\nDidn't find song.txt on server initialization");
}

// ask node.js to tell us when song.txt is modified
fs.watch('song.txt', (eventType, filename) => {
    // check the file for all eventTypes
    fs.readFile('song.txt', 'utf8', (err, songData) => {
        if (!err && songData !== lastSongData) {
            // notify all connected clients
            console.log("found changed songData");
            lastSongData = songData;
            io.emit('update', songData);
        }
    });
});
io.sockets.on('connection', socket => {
console.log("New connection: " + socket.id);
});
It is unclear from your original code if you need to send the songData to each new connection (whether it has recently changed or not).
If so, you can just change your connection event handler to this:
io.sockets.on('connection', socket => {
console.log("New connection: " + socket.id);
// send most recent songData to each newly connected client
if (lastSongData) {
socket.emit('update', lastSongData);
}
});
Continuously reading the file to detect changes is not a good idea. Instead you should use fs.watch(filename[, options][, listener]) to notify you when the file has changed. When a new socket connects, only that socket should have the content sent to it; sending it to every client is redundant.
io.sockets.on('connection', newConnection);

var filename = 'song.txt';

function update(socket) {
    fs.readFile(filename, function (err, data) {
        if (err) throw err;
        socket.emit('update', data);
    });
}

function newConnection(socket) {
    console.log("New connection: " + socket.id);
    update(socket); // Read and send to just this socket
}

fs.watch(filename, function () {
    console.log("File changed");
    update(io.sockets); // Read and send to all sockets.
});
My code is just a regular app:
app
  .use(sassMiddleware({
    src: __dirname + '/sass',
    dest: __dirname + '/',
    // This line controls sass log output
    debug: false,
    outputStyle: 'compressed'
  }))
  // More libraries
  ...
  .get('/', auth.protected, function (req, res) {
    res.sendfile(__dirname + '/views/index.html');
  })
  .post('/dostuff', auth.protected, function (req, res) {
    console.log(req.body)
    res.redirect('back')
    child = require('child_process').spawn(
      './script.rb',
      ['arguments'],
      { stdio: ['ignore', 'pipe', 'pipe'] }
    );
    child.stdout.pipe(process.stdout);
    child.stderr.pipe(process.stdout);
  })
My original goal is to limit the number of spawns that /dostuff can spawn to a single instance. I was thinking that there might be a simple way to limit the number of users on the entire app, but can't seem to find any.
I was trying to look for some session limiting mechanism but can't find one either, only various rate limiters but I don't think that's what I want.
Since the app is running in Docker, I limit the number of TCP connections on the port using iptables, but this has proven to be less than ideal, since the app retains some connections in the established state, which prevents efficient hand-off from one user to another.
So... any programmatic way of doing this?
UPDATE
The app is not an api server. /dostuff is actually triggered by the user from a webpage. That's why simple rate limiting is not the best option. Also the times of execution for the ruby script are variable.
ANSWER
Based on the answer below from @jfriend00, and after fixing a couple of logical errors, I came up with:
.post('/dostuff*', auth.protected, function (req, res) {
  console.log(req.body)
  if (spawnCntr >= spawnLimit) {
    res.status(502).send('Server is temporarily busy');
    console.log("You already have a process running. Please either abort the current run or wait until it completes".red)
    return;
  }
  let childClosed = false
  function done () {
    if (!childClosed) {
      --spawnCntr;
      childClosed = true;
    }
  }
  ++spawnCntr;
  const child = require('child_process').spawn(
    blahblah
  );
  child.on('close', done);
  child.on('error', done);
  child.on('exit', done);
  child.stdout.pipe(process.stdout);
  child.stderr.pipe(process.stdout);
  res.redirect('back');
})
I am still going to accept his answer; although incomplete, it helped a lot.
You can keep a simple counter of how many spawn() operations are in process at the same time and if a new request comes in and you are currently over that limit, you can just return a 502 error (server temporarily busy).
let spawnCntr = 0;
const spawnLimit = 1;

app.post('/dostuff', auth.protected, function (req, res) {
  console.log(req.body)
  if (spawnCntr > spawnLimit) {
    return res.status(502).send("Server temporarily busy");
  }
  let childClosed = false;
  function done() {
    // make sure we count it closing only once
    if (!childClosed) {
      --spawnCntr;
      childClosed = true;
    }
  }
  ++spawnCntr;
  let child = require('child_process').spawn(
    './script.rb',
    ['arguments'],
    { stdio: ['ignore', 'pipe', 'pipe'] }
  );
  // capture all the ways we could know it's done
  child.on('close', done);
  child.on('error', done);
  child.on('exit', done);
  child.stdout.pipe(process.stdout);
  child.stderr.pipe(process.stdout);
  res.redirect('back');
});
Note: The code in your question does not declare child as a local variable which looks like a bug waiting to happen.
You can use the node package Express Rate Limit - https://www.npmjs.com/package/express-rate-limit.
For an API-only server where the rate-limiter should be applied to all requests:
var RateLimit = require('express-rate-limit');

app.enable('trust proxy'); // only if you're behind a reverse proxy (Heroku, Bluemix, AWS if you use an ELB, custom Nginx setup, etc)

var limiter = new RateLimit({
    windowMs: 15*60*1000, // 15 minutes
    max: 100, // limit each IP to 100 requests per windowMs
    delayMs: 0 // disable delaying - full speed until the max limit is reached
});

// apply to all requests
app.use(limiter);
For a "regular" web server (e.g. anything that uses express.static()), where the rate-limiter should only apply to certain requests:
var RateLimit = require('express-rate-limit');

app.enable('trust proxy'); // only if you're behind a reverse proxy (Heroku, Bluemix, AWS if you use an ELB, custom Nginx setup, etc)

var apiLimiter = new RateLimit({
    windowMs: 15*60*1000, // 15 minutes
    max: 100,
    delayMs: 0 // disabled
});

// only apply to requests that begin with /api/
app.use('/api/', apiLimiter);
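Note that newer major versions of express-rate-limit export a plain factory function and have removed the delayMs option; a rough sketch of the newer-style usage (option values are illustrative):

const rateLimit = require('express-rate-limit');

const apiLimiter = rateLimit({
    windowMs: 15 * 60 * 1000, // 15 minutes
    max: 100                  // limit each IP to 100 requests per windowMs
});

app.use('/api/', apiLimiter);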
I have a Node application which spawns a child process for another application;
the application has a host and port:
var exec = require('child_process').spawn;
var child = exec('start app');

console.log("Child Proc ID " + child.pid)

child.stdout.on('data', function(data) {
    console.log('stdout: ' + data);
});

child.stderr.on('data', function(data) {
    console.log('stderr: ' + data);
});

child.on('close', function(code) {
    console.log('closing code: ' + code);
});
Some applications will start immediately, and some will take 10-20 seconds until they start.
Now I use node-http-proxy to route to the app, and the problem is that I'm getting an error when the user tries to reach the app before it is up and running.
Any idea how I can solve this issue?
proxy.on('error', function (err, req, res) {
res.end('Cannot run app');
});
By the way, I cannot send a 500 response in the proxy error handler due to a limitation of our framework. Any other idea how I can track the application, maybe with some timeout, to see whether it sends a 200 response?
UPDATE - Sample of my logic
var http = require('http');
var httpProxy = require('http-proxy');
var proxy = httpProxy.createProxyServer({});

http.createServer(function (req, res) {
    console.log("App proxy new port is: " + 5000)
    res.end("Request received on " + 5000);
}).listen(5000);
function proxyRequest(req, res) {
var hostname = req.headers.host.split(":")[0];
proxy.web(req, res, {
target: 'http://' + hostname + ':' + 5000
});
proxy.on('error', function (err, req, res) {
res.end('Cannot run app');
});
}
What you need is to listen for the first response on your proxy and look at its status code to determine whether your app started successfully or not. Here's how you do that:
proxy.on('proxyRes', function (proxyRes, req, res) {
    // Your target app is definitely up and running now because it just sent a response;
    // use the status code on proxyRes (the response from the target) to determine
    // whether the app started successfully or not
    var status = proxyRes.statusCode;
});
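If you also want to handle the case where the app has not finished booting yet, proxy.web() accepts an error callback, so one hedged option is to retry for a while before giving up (the retry count and delay are illustrative; hostname is the variable from your proxyRequest):

// Retry proxying while the target app boots, then give up.
function proxyWithRetry(req, res, attempt) {
    attempt = attempt || 0;
    proxy.web(req, res, { target: 'http://' + hostname + ':' + 5000 }, function (err) {
        if (attempt < 10) {
            setTimeout(function () {
                proxyWithRetry(req, res, attempt + 1);
            }, 2000);
        } else {
            res.end('Cannot run app');
        }
    });
}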
Hope this helps.
Not sure if it makes sense, but in your main app the experience should start with an HTML page, and each child process should have its own loader.
So basically you need an HTTP handler which holds the request until the child process is ready. Just make an Ajax call from the HTML and show a loading animation until the service is ready.
//Ajax call for each process and update the UI accordingly
$.get('/services/status/100').then(function(resp) {
    $('#service-100').html(resp);
})

//server side code (express syntax)
app.get('/services/status/:id', function(req, res) {
    // Check if service is ready
    serviceManager.isReady(req.params.id, function(err, serviceStats) {
        if(err) {
            //do error logic here, maybe notify the ui that an error occurred
            res.send(err);
            return;
        }
        // notify the ui that the service is ready to run, and hide the loader
        res.send(serviceStats);
    });
})
I am not sure I understand the question correctly, but you want to spin up a child process on request, have the request wait for this child process, and then be sent to it?
If so, a simple solution would be to use something like this:
var async = require('async'); // assumes the "async" utility library

var count = 0;               // checks so far
var maxDelay = 45;           // in seconds
var checkEverySeconds = 1;   // in seconds
var started = false;

async.whilst(
    function () {
        // keep polling until the app responds or we run out of time
        return !started && count < maxDelay;
    },
    function (callback) {
        count++;
        self.getAppStatus(apiKey, function (err, status) {
            if (status != 200) {
                return setTimeout(callback, checkEverySeconds * 1000);
            }
            started = true;
            callback();
        });
    },
    function (err) {
        if (err || !started) {
            return continueWithRequest(new Error('process cannot spin!'));
        }
        continueWithRequest();
    }
);
The function continueWithRequest() will forward the request to the child process, and getAppStatus will return 200 when the child process has started (and some other code when it has not). The general idea is that whilst will check every second whether the process is running, and after 45 seconds, if it is not, it will return an error. The waiting time and check interval can be easily adjusted. This is a bit crude, but it will work for delaying the request, and setTimeout will start a new stack and will not block. Hope this helps.
If you know (roughly) how much time the app takes to get up and running, simply add setTimeout(proxyRequest, <time it takes the app to start in ms>).
(There are most likely smarter/more complicated solutions, but this one is the easiest.)
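In code, that might look something like this inside the request handler (the 15-second delay is an illustrative guess at the app's boot time):

// give the app ~15s to boot before proxying this request
setTimeout(function () {
    proxyRequest(req, res);
}, 15000);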
Why not use an event emitter or a messenger?

var EventEmitter = require('events');
var eventEmitter = new EventEmitter();

var childStart = require('./someChildProcess').start()

if (childStart !== true) {
    eventEmitter.emit('progNotRun', {
        data: data
    })
}

function proxyRequest(req, res) {
    var hostname = req.headers.host.split(":")[0];

    proxy.web(req, res, {
        target: 'http://' + hostname + ':' + 5000
    });

    eventEmitter.on('progNotRun', function(data) {
        res.end('Cannot run app')
    })
}
To optimize the response delay, it is necessary to perform work after the response has been sent back to the client. However, the only way I can seem to get code to run after the response is sent is by using setTimeout. Is there a better way? Perhaps somewhere to plug in code after the response is sent, or somewhere to run code asynchronously?
Here's some code.
koa = require 'koa'
router = require 'koa-router'
app = koa()

# routing
app.use router app

app
  .get '/mypath', (next) ->
    # ...
    console.log 'Sending response'
    yield next
    # send response???
    console.log 'Do some more work that the response shouldn\'t wait for'
Do NOT call ctx.res.end(): it is hacky and circumvents koa's response/middleware mechanism, which means you might as well just use express.
Here is the proper solution, which I also posted to https://github.com/koajs/koa/issues/474#issuecomment-153394277
app.use(function *(next) {
  // execute next middleware
  yield next

  // note that this promise is NOT yielded so it doesn't delay the response
  // this means this middleware will return before the async operation is finished
  // because of that, you also will not get a 500 if an error occurs, so better log it manually
  db.queryAsync('INSERT INTO bodies (?)', ['body']).catch(console.log)
})

app.use(function *() {
  this.body = 'Hello World'
})
No need for ctx.end()
So in short, do
function *process(next) {
yield next;
processData(this.request.body);
}
NOT
function *process(next) {
yield next;
yield processData(this.request.body);
}
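For newer koa versions (2.x), where middleware is async/await-based rather than generator-based, the equivalent pattern would look like this (my adaptation, not from the original answer):

// koa 2.x sketch: respond first, then do the deferred work
app.use(async (ctx, next) => {
  await next()
  // deliberately not awaited, so it does not delay the response;
  // log errors manually, since they can no longer surface as a 500
  db.queryAsync('INSERT INTO bodies (?)', ['body']).catch(console.log)
})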
I have the same problem.
koa will end the response only when all middleware finish. (In application.js, respond is a response middleware; it ends the response.)
app.callback = function(){
  var mw = [respond].concat(this.middleware);
  var gen = compose(mw);
  var fn = co.wrap(gen);
  var self = this;

  if (!this.listeners('error').length) this.on('error', this.onerror);

  return function(req, res){
    res.statusCode = 404;
    var ctx = self.createContext(req, res);
    onFinished(res, ctx.onerror);
    fn.call(ctx).catch(ctx.onerror);
  }
};
But we can solve the problem by calling the response.end function, which is Node's API:
exports.endResponseEarly = function*(next){
    var res = this.res;
    var body = this.body;

    if(res && body){
        body = JSON.stringify(body);
        this.length = Buffer.byteLength(body);
        res.end(body);
    }

    yield* next;
};
You can run code in an async task by using setTimeout, like this:
var co = require('co'); // needed to run the generator

exports.invoke = function*() {
    setTimeout(function(){
        co(function*(){
            yield doSomeTask();
        });
    }, 100);

    this.body = 'ok';
};