Exceptions cause Node.js servers to crash. The common wisdom is that one needs to monitor Node.js processes and restart them on crash. Why can't one just wrap the entire script in a try {} catch {} or listen for exception events?
It's understood that catching all exceptions is bad because it leaves the interpreter in an unpredictable state. Is this actually true? Why? Is this an issue specific to V8?
Catching all exceptions does not leave the interpreter in a bad state, but it may leave your application in a bad state. An uncaught exception means that something you were expecting to work failed, and your application does not know how to handle that failure.
For example, if your app is a web server that listens on port 80 for connections, and the port is already in use when the app launches, an exception is raised. Your code may ignore it, in which case the process keeps running without actually listening on the port, which makes it useless. Or it can handle it: print an error message, log a warning, kill the other process, or whatever handling you prefer. Either way, you can see why ignoring it is not a good idea.
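For instance, a minimal sketch with a plain Node.js HTTP server (the exact handling is up to you) of reacting to the port-in-use error instead of ignoring it:

var http = require('http');

var server = http.createServer(function (req, res) {
  res.end('ok');
});

server.on('error', function (err) {
  if (err.code === 'EADDRINUSE') {
    console.error('Port 80 is already in use; shutting down.');
    process.exit(1); // or retry on another port, alert an operator, etc.
  } else {
    throw err;
  }
});

server.listen(80);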
Another example is failing to communicate with a database (a dropped connection, being unable to connect, receiving an unexpected error). If your app flow does not catch the exception properly, but just ignores it, you may be sending the user an acknowledgement of an operation that actually failed.
Exceptions are part of the event engine. Instead of wrapping everything in a try, what you want to do is listen for that exception:
http://debuggable.com/posts/node-js-dealing-with-uncaught-exceptions:4c933d54-1428-443c-928d-4e1ecbdd56cb Then respond in the proper manner.
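For example, a minimal sketch of listening at the process level (what you do inside the handler is up to your app; exiting is usually safest because the process state is unknown):

process.on('uncaughtException', function (err) {
  console.error('Uncaught exception:', err.stack || err);
  // Log or report the failure, release resources if you can,
  // then exit, since the process may be in an unpredictable state.
  process.exit(1);
});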
As for the second part of your question:
It really depends on your application. You need to test to see whether the exception is something you more or less expect. Sometimes an exception is a real failure, not just something benign like a missing file.
Related
I have a Node.js project that spawns multiple processes that communicate over Socket.IO (the processes both send and receive data).
Sometimes during feature development, other programmers might make mistakes that cause my socket infrastructure code to send large messages over some size X (for example, over 500MB).
I am looking for a way to catch such cases and log them, and I also want only that specific request to fail.
Right now the behavior is that the entire socket connection fails (not just the specific big message), and I haven't found any way to even catch the case in order to know that this is the cause.
Right now I limit the message size with the "maxHttpBufferSize" configuration:
The Socket.IO doc about it here says: "how many bytes or characters a message can be, before closing the session (to avoid DoS)."
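Roughly, the option is passed when creating the server, something like this (the byte limit here is just an illustration, and option handling may differ between Socket.IO versions):

var httpServer = require('http').createServer();
var io = require('socket.io')(httpServer, {
  maxHttpBufferSize: 1e6 // close the session if a message exceeds ~1 MB
});
httpServer.listen(3000);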
But I can't seem to catch any error when this size is exceeded and the entire connection fails, which is very bad for me.
I did think about adding checks around my sending code, to verify myself that the message size is not too big, but serializing the data just to make this check would carry a heavy performance price, and Socket.IO should already be doing this before sending the data, so I don't want it to happen twice.
The best solution I'm hoping to find is a Socket.IO-supported way to handle this; if there is a different way that carries some performance cost, that can also be acceptable.
Note: I am working with Socket.IO 2.0.4.
If there is a fix only in a higher version, that's also acceptable.
I am writing server code in ASP.NET using the MVC framework. My client code is JavaScript.
Whether or not you use jQuery to send an AJAX request from the browser to the server, you can set a timeout, and of course the browser also has a timeout it can enforce separately. If either of those timeouts is reached, the failure method of the AJAX invocation is called and one of its arguments indicates a timeout.
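For example, a rough jQuery sketch (the URL and timeout value are placeholders), where the error callback's textStatus tells you the client-side timeout fired:

$.ajax({
  url: '/api/slow-operation', // hypothetical endpoint
  timeout: 30000,             // client-side timeout in milliseconds
  success: function (data) {
    console.log('done', data);
  },
  error: function (jqXHR, textStatus) {
    if (textStatus === 'timeout') {
      console.log('The client-side timeout fired before the server responded.');
    } else {
      console.log('Request failed with HTTP status ' + jqXHR.status);
    }
  }
});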
HOWEVER. I have recently discovered, it seems, that the server can time out an AJAX call and terminate it with a 500 error code, which the client receives no differently than any unexpected error. In this case, though, the underlying server worker process continues: it is only the web server itself (IIS in my case) that interrupts the request and sends a 500 error code without letting the worker know, so the worker continues on, blissfully unaware that there is a problem.
Meanwhile, the client's error handler for the AJAX call gets a status of 500, which it would naturally interpret as a failure of the worker process, perhaps an unhandled exception and ungraceful termination. As far as I can see, there is no way for the client to KNOW that the problem was a timeout, as opposed to an unexpected exception. So the client code might incorrectly assume that the worker process is dead, when in fact it is very much alive.
So, a threefold question:
Is there a way in MVC ASP.NET to control the server timeout settings?
Is there a way for the worker process that the server thinks is taking too long to be informed if the server generates a timeout on its behalf?
Is there a way for the client side AJAX failure callback to know that this particular 500 error is not because the worker had an unexpected error, but because the wrapping server code decided it was taking too long?
With respect to #3, I can see that the responseText property of the AJAX response contains some HTML that, if rendered, would tell the user there was a timeout, but programmatically parsing for that seems really messy and unreliable.
Anyone else run into this?
ADD / EDIT, 4pm PDT on 1/26:
Based on a comment immediately below suggesting that I might find a solution in this article, I implemented the suggested filter. It did not work. I was able to trigger the new filter by explicitly throwing a timeout exception from the worker process, so the filter from that other SO article was clearly in play, but it was never triggered when the server-side timeout actually occurred.
I should add that this application is running as a Windows Azure web site. My belief, from the circumstantial evidence/data I have been able to accumulate, is that IIS on the VM is itself interrupting the request and responding with a 500 error without even telling the underlying worker process or MVC app that it has summarily terminated the request.
So this almost seems like an IT issue of being able to configure IIS on that particular VM for that particular web site.
From my data, it appears that IIS is just canceling the request after 230 seconds.
Try this web.config entry:
<system.web>
  <httpRuntime executionTimeout="your-timeout-in-seconds" />
  ...
</system.web>
I'm trying to apply this so that my server will tell the clients when it is closed. I don't understand why the server will not emit. It seems like the program closes before it gets a chance to emit, but console.log() works. I think my problem probably has to do with the synchronous nature of process.on as mentioned here, but honestly I don't understand enough about what (a)synchronous really means in this context. Also, I'm on Windows 7 if that helps.
// catch ctrl+c event and exit normally
process.on('SIGINT', function (code) {
  io.emit("chat message", "Server CLOSED");
  console.log("Server CLOSED");
  process.exit(2);
});
I just started messing around with this stuff today so forgive my ignorance. Any help is greatly appreciated!
Full server code.
io.emit() is an asynchronous operation (you can say that it works in the background) and due to various TCP optimizations (perhaps such as Nagle's algorithm), your data may not be sent immediately.
process.exit() takes effect immediately.
You are likely shutting down your app and thus all resources it owns before the message is successfully sent and acknowledged over TCP.
One possible work-around is to do the process.exit(2) on a slight delay that gives the TCP stack a chance to send the data before you shut it down.
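A minimal sketch of that workaround (the one-second delay is an arbitrary guess, not a guaranteed-safe value):

process.on('SIGINT', function () {
  io.emit("chat message", "Server CLOSED");
  console.log("Server CLOSED");
  // Give the TCP stack a moment to flush the outgoing message before exiting.
  setTimeout(function () {
    process.exit(2);
  }, 1000);
});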
Another possibility is to just avoid that last chat message. The client will shortly see that the connection to the server was closed and that it cannot reconnect so it should be equipped to display that info to the user anyway (in cases of a server crash).
You could also consider turning off the Nagle algorithm, which waits a short time before sending data in case you immediately send more data that could be combined into the same packet. But to know whether that would work reliably, you'd have to test thoroughly on the appropriate platforms, and it's possible that even turning this off wouldn't fix the issue, since it is a race between the TCP stack sending out its buffered data and the shutdown of all resources owned by this process (which includes the open socket).
I am running Node.js code (server.js) under JXcore using
jx mt-keep:4 server.js
We have a lot of request hits per second, and mostly transactions take place.
I am looking for a way to catch the error in case any thread dies, and to have the request information returned to me so that I can capture that request and take appropriate action on it. That way I would not lose the incoming request and could still handle it.
This is a Node.js project and, due to project urgency, it has been moved to JXcore.
Please let me know if there is a way to handle it even from code level.
Actually it's similar to a single Node.js instance. You have the same tools and options for handling the errors.
Besides, a JXcore thread warns the task queue when it catches an unexpected exception on the JS side (the task queue stops sending requests to that instance), then safely restarts the particular thread. You may listen to the 'uncaught exception' and 'restart' events for the thread and manage a softer restart.
process.on('restart', function (res_cb, exit_code) {
  // thread needs a restart (due to unhandled exception, IO, hardware etc..)
  // prepare your app for this thread's restart
  // call res_cb(exit_code) to proceed with the restart
});
Note: JXcore expects the application to have been up and running for at least 5 seconds before restarting any thread. Perhaps this limitation protects the application from looping thread restarts.
You may also start your application using 'jx monitor'; it supports multiple threads and reloads crashed processes.
I'm quite new to Express.js, and one of the things that surprised me most at first, compared to other servers such as Apache or IIS, is that an Express.js server crashes every time it encounters an uncaught exception or some error, taking down the site and making it inaccessible to users. A terrible thing!
For example, my application is crashing with a Javascript error because a variable is not defined due to a name change in the database table.
TypeError: Cannot call method 'replace' of undefined
This is not such a good example, because I should solve it before moving the site to production, but sometimes similar errors can occur which shouldn't cause a server crash.
I would expect to see an error page or just an error in that specific page, but turning down the whole server for these kind of things sounds terrifying.
Express error handlers don't seem to be enough for this purpose.
I've been reading about how to solve this kind of things in Node.js by using domains, but I found nothing specifically for Express.js.
Another option I found, which doesn't seem to be recommended in all cases, is using tools that keep a process running forever, so that after a crash it restarts itself: tools like Forever, Upstart or Monit.
How do you guys deal with this kind of problems in Express.js?
The main difference between Apache and Node.js in general is that Apache forks a process per request while Node.js is single-threaded. Hence if an error occurs in Apache, only the process handling that request crashes while the others continue to work; in Node.js, the one and only thread goes down.
In my projects I use Monit to check memory/CPU (if Node.js takes too many resources on my VPS, Monit will restart it) and daemontools to be sure Node.js is always up and running.
I would recommend using domains along with clusters. There is an example in the docs themselves at http://nodejs.org/api/domain.html. There are also modules for Express.js, such as https://www.npmjs.org/package/express-domain-middleware.
So when such errors occur, using a domain along with cluster helps us isolate the context in which the error occurred, and it affects only a single worker in the cluster. We should log the error, disconnect that worker from the cluster and re-fork it. We can then read the logs to fix the errors that need fixing in the code.
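A rough sketch of that pattern, loosely following the example in the Node.js domain docs rather than any particular middleware package (the port and error response are placeholders):

var cluster = require('cluster');

if (cluster.isMaster) {
  // Master: keep one worker alive and replace it whenever it dies.
  cluster.fork();
  cluster.on('exit', function (worker) {
    console.error('Worker ' + worker.id + ' died, forking a replacement');
    cluster.fork();
  });
} else {
  var domain = require('domain');
  var express = require('express');
  var app = express();

  // Run every request inside its own domain so an unhandled error
  // only takes down this worker, which the master then replaces.
  app.use(function (req, res, next) {
    var d = domain.create();
    d.on('error', function (err) {
      console.error('Unhandled error:', err.stack || err);
      try {
        res.statusCode = 500;
        res.end('Internal server error');
      } catch (ignore) {}
      cluster.worker.disconnect(); // stop accepting new work
      process.exit(1);             // master re-forks on 'exit'
    });
    d.add(req);
    d.add(res);
    d.run(next);
  });

  app.get('/', function (req, res) {
    res.send('ok');
  });

  app.listen(3000); // port is a placeholder
}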
I was facing the same issue and I fixed it using try/catch like this. I created different route files and included each route file in a try/catch block like this:
try {
  app.use('/api', require('./routes/user'));
} catch (e) {
  console.log(e);
}

try {
  app.use('/api', require('./routes/customer'));
} catch (e) {
  console.log(e);
}