Node Express server terminates eight hours after inactivity - javascript

I have written a small backend application with Node Express.
Its purpose is to retrieve data from a MySQL database and send the resulting rows as a JSON-formatted string back to the requesting client.
app.get(`${baseUrl}/data`, (req, res) => {
    console.log("Get data");
    getDataFromDatabase((error, data) => {
        if (error) {
            return res.json({status: CODE_ERROR, content: error});
        } else {
            return res.json({status: CODE_SUCCESS, content: data});
        }
    });
});
Inside the getDataFromDatabase() method a simple SELECT statement is sent to the DB and it receives a status code plus content. In case of success, the content would be a JSON of returned rows, otherwise information about the MySQL error - again in JSON format.
Basically this code works fine. There are a few other methods which were built the same way but don't cause the following problem:
After running this code on a server, I found that the process always dies exactly eight hours after the last call of the above method. The method can be called dozens of times, the problem occurs only after inactivity.
A quick and dirty workaround, due to a lack of time, was to create a cronjob which kills the process and restarts the application every six hours. However, the new process also gets killed eight hours after the last request that was sent to the previous process.
While writing this question, I checked again for any differences between my methods. I found the following, here a snippet of getDataFromDatabase():
if (error) {
callback(error, null);
}
However, a method getOtherDataFromDatabase() has got a return keyword before its callback:
if(!error) {
return callback(null, data);
}
So, is the return keyword making a difference here? Is there some kind of unfinished asynchronous code which terminates after a timeout? I've got no exceptions in my console output, the process dies silently.
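For reference, a minimal standalone sketch (illustrative names, not the actual application code) of the two callback shapes in question: when nothing follows the if/else, a bare callback(...) and return callback(...) invoke the callback identically; the return only matters when there is code after the call that must be skipped.

```javascript
// Both shapes invoke the callback exactly once with the same arguments;
// `return` only prevents code *after* the call from running.
function withReturn(error, cb) {
  if (error) {
    return cb(error, null);
  }
  cb(null, "data");
}

function withoutReturn(error, cb) {
  if (error) {
    cb(error, null);
  } else {
    cb(null, "data");
  }
}

withReturn(null, (err, d) => console.log("withReturn:", d));       // withReturn: data
withoutReturn(null, (err, d) => console.log("withoutReturn:", d)); // withoutReturn: data
```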

Related

NodeJS and MongoDB Query Slow

So I'm running into an issue where I am using NodeJS with Express for API calls. I am fetching all documents in a collection using:
export async function main(req, res) {
    console.log('Get Data');
    try {
        const tokens = await tokenModel.find({}).lean();
        res.json(tokens);
    } catch (err) {
        console.log('err', err.message);
        res.status(500).json({ message: err.message });
    }
}
Now this request works great and returns the data I need. The problem is that I have over 10K documents; on a PC it takes about 10 seconds to return that data, and on a mobile phone over 45 seconds. I know the network on a phone matters, but is there any way I can speed this up? Nothing I have tried works. I keep finding that lean() is the option to use, and I am already using it with no improvement.
Well, it's slow because you are returning all 10k results.
Do you actually need all 10k results? If not, consider filtering so that only the results you actually need are returned.
If you do need all of them, I suggest implementing pagination, where you return results in batches (50 per page, for example).
In addition, if you are using only some of the fields from the documents, you should tell MongoDB to return only these fields, and not all of them. That would also increase the performance since less data will be transferred through the network.
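As a rough sketch of the pagination arithmetic suggested above (illustrative names; 50 results per page as in the example), the page number maps to skip/limit options that can be chained onto the query along with a field projection:

```javascript
// Map a 1-based page number to MongoDB skip/limit options (50 per page).
const PAGE_SIZE = 50;

function pageToQueryOptions(page) {
  const safePage = Math.max(parseInt(page, 10) || 1, 1); // default & clamp to page 1
  return { skip: (safePage - 1) * PAGE_SIZE, limit: PAGE_SIZE };
}

// Hypothetical usage with the model from the question, projecting only
// the fields the client actually renders:
//   const opts = pageToQueryOptions(req.query.page);
//   tokenModel.find({}, 'name symbol').skip(opts.skip).limit(opts.limit).lean();

console.log(pageToQueryOptions(1)); // { skip: 0, limit: 50 }
console.log(pageToQueryOptions(3)); // { skip: 100, limit: 50 }
```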

Weird socket.io behavior when Node server is down and then restarted

I implemented a simple chat for my website, where users can talk to each other, with ExpressJS and Socket.io. I added simple protection against the flood that one person can cause by spamming the window, like this:
if (RedisClient.get(user).lastMessageDate > currentTime - 1 second) {
return error("Only one message per second is allowed")
} else {
io.emit('message', ...)
RedisClient.set(user).lastMessageDate = new Date()
}
I am testing this with this code:
setInterval(function() {
$('input').val('message ' + Math.random());
$('form').submit();
}, 1);
It works correctly when Node server is always up.
However, things get extremely weird if I turn off the Node server, run the code above, and then start the Node server again a few seconds later. Suddenly, hundreds of messages are inserted into the window and the browser crashes. I assume this is because while the Node server is down, socket.io saves all the client emits, and once it detects the server is online again, it pushes all of those messages at once, asynchronously.
How can I protect against this? And what is exactly happening here?
edit: If I use Node in-memory storage instead of Redis, this doesn't happen. I am guessing it's because the server gets flooded with READs, and many READs happen before RedisClient.set(user).lastMessageDate = new Date() finishes. I guess what I need is an atomic READ/SET? I am using this module: https://github.com/NodeRedis/node_redis for connecting to Redis from Node.
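As an illustration of the atomic READ/SET idea the edit is reaching for, here is a minimal in-memory sketch of a check-and-set rate limiter (names are illustrative; in Redis the equivalent single atomic command would be SET key value PX 1000 NX):

```javascript
// Check-and-set in one synchronous step, so two concurrent checks cannot
// both pass before either one has recorded its timestamp.
const lastMessageAt = new Map();

function allowMessage(user, now) {
  const last = lastMessageAt.get(user);
  if (last !== undefined && now - last < 1000) {
    return false; // less than one second since this user's last message
  }
  lastMessageAt.set(user, now);
  return true;
}

console.log(allowMessage("alice", 0));    // true
console.log(allowMessage("alice", 500));  // false
console.log(allowMessage("alice", 1500)); // true
```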
You are correct that this happens due to queueing up of messages on client and flooding on server.
When the server comes back up, it receives all the queued messages at once, and none of them are processed synchronously: each socket.on("message", ...) event is executed separately, i.e. one invocation is not related to another.
Even if your Redis server has a latency of only a few milliseconds, those messages all arrive together, so every check still reads the old timestamp and execution always takes the else branch.
You have the following few options.
Use a rate limiter library like this library. This is easy to configure and has multiple configuration options.
If you want to do everything yourself, use a queue on the server. This will take up memory on your server, but you'll achieve what you want. Instead of handling every message immediately, each incoming message is put into a per-user queue; a new queue is created for every new client, and the queue is deleted once its last item has been processed.
(update) Use multi + watch to create lock so that all other commands except the current one will fail.
the pseudo-code will be something like this.
let queue = {};

let queueHandler = user => {
    while (queue[user].length > 0) {
        let messageObject = queue[user].shift();
        // your redis push logic here
    }
    delete queue[user];
};

let pushToQueue = messageObject => {
    let user = messageObject.user;
    if (!queue[user]) {
        queue[user] = [messageObject];
    } else {
        queue[user].push(messageObject);
    }
    queueHandler(user);
};

socket.on("message", pushToQueue);
UPDATE
Redis supports locking with WATCH, which is used together with MULTI. Using this, you can lock a key, and any other commands that try to modify that key in that time will fail.
from the redis client README
Using multi you can make sure your modifications run as a transaction,
but you can't be sure you got there first. What if another client
modified a key while you were working with its data?
To solve this, Redis supports the WATCH command, which is meant to be
used with MULTI:
var redis = require("redis"),
    client = redis.createClient({ ... });

client.watch("foo", function (err) {
    if (err) throw err;
    client.get("foo", function (err, result) {
        if (err) throw err;
        // Process result
        // Heavy and time consuming operation here
        client.multi()
            .set("foo", "some heavy computation")
            .exec(function (err, results) {
                /**
                 * If err is null, it means Redis successfully attempted
                 * the operation.
                 */
                if (err) throw err;
                /**
                 * If results === null, it means that a concurrent client
                 * changed the key while we were processing it and thus
                 * the execution of the MULTI command was not performed.
                 *
                 * NOTICE: Failing an execution of MULTI is not considered
                 * an error. So you will have err === null and results === null
                 */
            });
    });
});
Perhaps you could extend your client-side code to prevent data from being sent while the socket is disconnected. That way, you stop the library from queuing messages during the disconnection (i.e. while the server is offline).
This could be achieved by checking to see if socket.connected is true:
// Only allow data to be sent to server when socket is connected
function sendToServer(socket, message, data) {
if(socket.connected) {
socket.send(message, data)
}
}
More information on this can be found at the docs https://socket.io/docs/client-api/#socket-connected
This approach will prevent the built-in queuing behaviour in all scenarios where a socket is disconnected, which may not be desirable; however, it should protect against the problem you are noting in your question.
Update
Alternatively, you could use a custom middleware on the server to achieve throttling behaviour via socket.io's server API:
/*
Server side code
*/
io.on("connection", function (socket) {
// Add custom throttle middleware to the socket when connected
socket.use(function (packet, next) {
var currentTime = Date.now();
// If socket has previous timestamp, check that enough time has
// lapsed since last message processed
if(socket.lastMessageTimestamp) {
var deltaTime = currentTime - socket.lastMessageTimestamp;
// If not enough time has lapsed, throw an error back to the
// client
if (deltaTime < 1000) {
next(new Error("Only one message per second is allowed"))
return
}
}
// Update the timestamp on the socket, and allow this message to
// be processed
socket.lastMessageTimestamp = currentTime
next()
});
});

How to handle the Storage Queue using WebJobs

I just started using Azure for both my mobile and my web development.
I am using Node.js as the framework for the Azure backend, with Mobile Services and Web Apps in Azure.
Here is the situation: I am using a Storage Queue in Azure, and a WebJob in my Web App to handle that queue. The messages in the queue are sent out to each specific user via Notification Hub (push notification).
The queue will hold 50,000 or more messages, all of which are used to push a notification to a user, one by one. I tried handling the queue with a WebJob scheduled at a two-minute interval; I know a scheduled WebJob won't run a second instance while one is already running.
Initially I wanted to use a continuously running WebJob, but it goes into a pending-restart state once the script finishes. My assumption was that a continuous WebJob would run the script in an endless loop, over and over, until it hit an exception or something went wrong. That assumption was wrong: it restarts itself once the whole script has succeeded. I know the restart delay can be adjusted to less than 60 seconds, but I am not sure that helps, as I also have a lot of async operations.
My script loops over the 50,000 or more user messages. It sends out each push message via the Azure Node.js package, and upon return it deletes the message so that it no longer appears in the queue. So there are async operations inside each iteration of the loop.
Everything works, except that the WebJob executes for a maximum of 5 minutes and then runs again on the next schedule; it runs for at most 5 minutes regardless of the operation. I tried with 1,000 messages in the queue and everything worked fine, but when the count goes up to 5,000 and above, the time is not sufficient, so some of the async operations do not complete and those messages are not deleted.
Is there a way to extend the 5-minute execution time, or a better way to handle Storage Queues? I looked into the WebJobs SDK, but it is limited to C# and Visual Studio, and I am on Mac OS X with JavaScript, so I cannot use it.
Please advise; I have spent a lot of time figuring out the best way to handle the Storage Queue using WebJobs, but it does not seem to serve the purpose once the messages grow in number and the async operations have to fit into a total of only 5 minutes of execution time. I do not have any VMs at the moment; I only use PaaS in Azure.
According to your description:
All these messages are used to push out the message to the user one by one
it will run 50,000 or more users messages in the loop
So your requirement is to send each message in the queue to its user, but right now you fetch all the messages in the queue at once, even when the count grows to more than 50,000, and then loop over them for further operations?
If there is any misunderstanding, feel free to let me know.
In my opinion, you could get just the message at the top of the queue each time and send it to your user; that would remarkably reduce the processing time, and it can be done in a continuous WebJob. You can refer to How To: Peek at the Next Message to see how to peek at the message at the front of a queue without removing it from the queue.
update
As I noticed you mentioned "I also have a Web App in Node.js" in your project architecture,
I considered whether you could leverage a continuous WebJob in Web Apps to get one message and send it to the Notification Hub at a time.
And here is my test code snippet:
var azureStorage = require('azure-storage'),
    azure = require('azure'),
    accountName = '<accountName>',
    accountKey = '<accountKey>';

var queueSvc = azureStorage.createQueueService(accountName, accountKey);
var notificationHubService = azure.createNotificationHubService('<notificationhub-name>', '<connectionstring>');

queueSvc.getMessages('myqueue', {numOfMessages: 1}, function (error, result, response) {
    if (!error) {
        // Message text is in result[0].messagetext
        var message = result[0];
        console.log(message.messagetext);
        var payload = {
            data: {
                msg: message.messagetext
            }
        };
        notificationHubService.gcm.send(null, payload, function (error) {
            if (!error) {
                // notification sent
                console.log('notification sent');
                queueSvc.deleteMessage('myqueue', message.messageid, message.popreceipt, function (error, response) {
                    if (!error) {
                        // Message deleted
                        console.log(response);
                    } else {
                        console.log(error);
                    }
                });
            }
        });
    }
});
For details, refer to How to use Notification Hubs from Node.js and https://github.com/Azure/azure-storage-node/blob/master/lib/services/queue/queueservice.js#L727
update2
Taking the idea from the Service Bus demo on GitHub, I modified the code above, which greatly improves the efficiency.
Here the code snippet, for your information:
var queueName = 'myqueue';

function checkForMessages(queueSvc, queueName, callback) {
    queueSvc.getMessages(queueName, function (err, message) {
        if (err) {
            if (err === 'No messages to receive') {
                console.log('No messages');
            } else {
                console.log(err);
                // callback(err);
            }
        } else {
            callback(null, message[0]);
            console.log(message);
        }
    });
}

function processMessage(queueSvc, err, lockedMsg) {
    if (err) {
        console.log('Error on Rx: ', err);
    } else {
        console.log('Rx: ', lockedMsg);
        var payload = {
            data: {
                msg: lockedMsg.messagetext
            }
        };
        notificationHubService.gcm.send(null, payload, function (error) {
            if (!error) {
                // notification sent
                console.log('notification sent');
                console.log(lockedMsg);
                console.log(lockedMsg.popreceipt);
                queueSvc.deleteMessage(queueName, lockedMsg.messageid, lockedMsg.popreceipt, function (err2) {
                    if (err2) {
                        console.log('Failed to delete message: ', err2);
                    } else {
                        console.log('Deleted message.');
                    }
                });
            }
        });
    }
}

var t = setInterval(checkForMessages.bind(null, queueSvc, queueName, processMessage.bind(null, queueSvc)), 100);
I set the loop interval to 100 ms in setInterval; it can now process almost 600 messages per minute in my test.
The various configuration settings for WebJobs are explained on this wiki page. In your case you should increase the WEBJOBS_IDLE_TIMEOUT value, which is the time in seconds that a triggered job will timeout if it hasn't produced any output for a period of time. The WEBJOBS_IDLE_TIMEOUT setting needs to be configured in the portal app settings, not via the app.config file.
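As a sketch, the app setting might look like this (the one-hour value is purely illustrative; pick whatever ceiling fits your queue size):

```
# Azure portal > Application settings (value in seconds) — not app.config
WEBJOBS_IDLE_TIMEOUT=3600
```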

Waiting for MongoDB findOne callback to complete before finishing app.get()

I'm relatively new to JavaScript and I am having trouble understanding how to use a MongoDB callback inside an ExpressJS get handler. My problem seems to be that if the database search takes too long, the request falls out of the app.get() and the webpage shows "Error code: ERR_EMPTY_RESPONSE".
Currently it works with most values, either finding the value or properly returning a 404 - not found, but there are some cases where it hangs for a few seconds before returning the ERR_EMPTY_RESPONSE. In the debugger, execution reaches the end of the app.get(), where it returns ERR_EMPTY_RESPONSE, and only after that does the findOne callback finish and go to the 404 branch, but by then it is too late.
I've tried using async and introducing waits with no success, which makes me feel like I am using app.get and findOne incorrectly.
Here is a general version of my code below:
app.get('/test', function (req, res) {
    var value = null;
    if (req.query.param)
        value = req.query.param;
    else
        value = defaultValue;
    var query = {start: {$lte: value}, end: {$gte: value}};
    collection.findOne(query, function (err, data) {
        if (err) {
            res.sendStatus(500);
        } else if (data) {
            res.end(data);
        } else {
            res.sendStatus(404);
        }
    });
});
What can I do to have the response wait for the database search to complete? Or is there a better way to return a database document from a request? Thanks for the help!
You should measure how long the db query takes.
If it's slow (>5 sec) and you can't speed it up, then it might be a good idea to decouple it from the request by using some kind of job framework.
Return a redirect to the URL where the job status/result will be available.
I feel silly about this, but I completely ignored the fact that when using http.createServer(), I had a timeout of 3000 ms set. I misunderstood what this timeout was for, and it was what caused my connection to close prematurely. Increasing this number allowed my most stubborn queries to complete.

NodeJS one-shot script using MongoDB connection pooling - how to terminate?

I'm aware of the best practice of MongoDB connection pooling in NodeJS of the singleton DB connection type like this
var db = null;
function getDBConnection(callback) {
    if (db) { callback(null, db); } else { MongoClient.connect( /* .... */ ); }
}
module.exports = getDBConnection;
However, what I cannot get my head around at the moment is how to handle this in a one-shot script that, say, does some pre-initialization on the documents of a certain db collection:
getDBConnection(function (err, database) {
    var collection = database.collection("objects");
    var allObjectsArray = collection.find( /* ... */
    ).toArray(function (err, objects) {
        if (err != null) console.log(err);
        assert.equal(null, err);
        _.each(objects, function (item) {
            collection.update(
                { id: item.id },
                { $set: { /* ... */ } },
                function (err, result) {
                    if (err != null) console.log(err);
                    assert.equal(null, err);
                }
            );
        });
        // database.close(); <-- this fails with "MongoError: Connection Closed By Application", thrown by the update callback
    });
    // database.close(); <-- this fails too, thrown by the toArray callback
});
If I call the script like that, it never terminates, due to the still open connection. If I close the connection at the bottom, it fails because of, well, a closed connection.
Considering that opening a new connection for every update is not really an option, what am I missing? Keeping the connection open may be fine for webapps, but for a one-shot script called from a shell script this really doesn't work out, does it?
Sorry if this question has arisen before, I've given it some research but have not quite been able to come up with a working answer for me...
Thanks!
Julian
With a "pooled connection", there is code running under the driver to keep the connection alive and to establish more connections in the pool if required. So, much like various "server code" patterns, event-loop handlers have been registered, and the process does not exit at the end of your code until these are de-registered.
Therefore your two choices to call after all your code has executed are either:
Call db.close() or in your code context specifically database.close() once all is done.
Call process.exit() which is a generic call in node.js applications which will shut the whole process down and therefore stop any other current event loop code. This actually gives you an option to throw an error on exit if you want your code to be "shell integrated" somewhere and look for the exit status.
Or call both: db.close() allows execution to continue to the next line of code, so whatever you put there will also run.
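A minimal sketch of combining the two (names are illustrative): close the pool so the driver's event-loop handles are de-registered, and record an exit status the calling shell script can inspect.

```javascript
// Close the connection (if open) and record an exit status. Setting
// process.exitCode lets node exit naturally once the event loop drains,
// unlike process.exit(), which stops everything immediately.
function finish(database, err) {
  if (database) database.close();
  process.exitCode = err ? 1 : 0;
}

finish(null, null);
console.log(process.exitCode); // 0
```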
But you have to wait until every callback has completed, so you cannot use a synchronous loop with asynchronous code in the middle:
async.each(objects, function (item, callback) {
    collection.update(
        { "_id": item._id },
        {
            // updates
        },
        callback
    );
}, function (err) {
    if (err) throw err;
    database.close();
});

Categories

Resources