Request functions are executing out of order in node.js

I am trying to make a scraper, but I can't seem to get the code to execute in the right order. I need the album/albumart request function to execute after the title and artist function. I know node.js is weird about this sort of thing, but I've tried moving things all over and still have no luck.
Here's the Code
Please pardon the mess and excess debug code.
Current output:
TESTED!!!
req
No error
Pentemple - Pazuzu 2
Now Playing: Pentemple - Pazuzu 2
10
Pentemple
10
Pentemple
1
{ artist: '',
title: '',
album: '',
albumArt: '',
testval: 'TESTED!!!' }
xtest

Because the request calls are asynchronous, the responses may arrive in any order. To keep the order, make the next request inside the previous request's callback. Here is an example:
request(url1, function(err, res, html){
if(!err)
{
// url1 successfully returned , call another dependent url
request(url2, function(err2, res2, html2){
if(!err2)
{
// url2 successfully returned, go on with another request call and so on ...
}
});
}
else
{
// first call failed, return gracefully here --
callback(err); // if you have any
}
})
However, as an earlier answer also suggests, this is an anti-pattern and results in messy, cluttered code known as the pyramid of doom or callback hell.
I would suggest the async npm module instead; the same code can then be written as:
var async = require('async');
var request = require('request');
async.waterfall([
  function(callback) {
    request(url1, function(error, res, html){
      // pass any error along so the waterfall stops at the first failure
      callback(error, res, html);
    });
  },
  function(res1, html1, callback) {
    // runs only after the first request has completed
    request(url2, function(error, res, html){
      callback(error, res1, html1, res, html);
    });
  } // ... AND SO ON
], function (err, res1, html1, res2, html2) {
  // the final callback receives the values passed on by the last task
  if(!err)
  {
    // use your data
  }
});

JavaScript is asynchronous. If the requests are dependent on each other I recommend using callbacks so that when one request is complete, it calls the next one.

In most cases, performing a request in JavaScript is asynchronous. That means that requests do not block the entire process. To perform an action when the request is done, callbacks are used. Callbacks are functions that are added to the event loop queue once the request has finished. The easiest way (but certainly not the best one) to make requests run one after another is to make the second request in the first callback, the third request in the second callback, and so on.
request(profileurl, function (error, response, html) {
  console.log("req");
  if (!error) {
    // ...
    request(albumurl, function (error, response, html) {
      if (!error) {
        // ...
        request(albumurl, function (error, response, html) {
          // ...
        });
      }
    });
  } else {
    console.log("ERROR: " + error);
  }
});
But this practice is considered an anti-pattern called the Pyramid of Doom, because nested callbacks make the code unreadable, hard to test and hard to maintain.
Using promises is considered good practice. They are built into ES2015. If you use ES5, you need an additional module for them, such as request-promise or Q.
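As a minimal sketch of the promise approach, here is how dependent requests could be chained. The `fetchPage` helper and the URLs are hypothetical: `fetchPage` stands in for any promise-returning HTTP client and only simulates latency with `setTimeout`.

```javascript
// Hypothetical stand-in for a promise-returning HTTP call; it simulates
// network latency instead of making a real request.
function fetchPage(url, delayMs) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve('body of ' + url);
    }, delayMs);
  });
}

var order = [];

// Each .then() starts only after the previous promise has resolved, so the
// requests run strictly one after another even though the first is slowest.
var done = fetchPage('http://example.com/profile', 30)
  .then(function (profileHtml) {
    order.push('profile');
    return fetchPage('http://example.com/album', 10);
  })
  .then(function (albumHtml) {
    order.push('album');
    return fetchPage('http://example.com/albumart', 1);
  })
  .then(function (artHtml) {
    order.push('albumart');
    return order;
  })
  .catch(function (err) {
    // A single handler receives a failure from any step in the chain.
    console.error('ERROR: ' + err);
  });
```

The chain stays flat no matter how many steps are added, which is the main readability gain over nested callbacks.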

Related

Waiting for MongoDB findOne callback to complete before finishing app.get()

I'm relatively new to Javascript and I am having trouble understanding how to use a MongoDB callback with an ExpressJS get. My problem seems to be if it takes too long for the database search, the process falls out of the app.get() and gives the webpage an "Error code: ERR_EMPTY_RESPONSE".
Currently it works with most values, either finding the value or properly returning a 404 - not found, but there are some cases where it hangs for a few seconds before returning the ERR_EMPTY_RESPONSE. In the debugger, it reaches the end of the app.get(), where it returns ERR_EMPTY_RESPONSE, and after that the findOne callback finishes and goes to the 404, but by then it is too late.
I've tried using async and introducing waits with no success, which makes me feel like I am using app.get and findOne incorrectly.
Here is a general version of my code below:
app.get('/test', function (req, res) {
var value = null;
if (req.query.param)
value = req.query.param;
else
value = defaultValue;
var query = {start: {$lte: value}, end: {$gte: value}};
var data = collection.findOne(query, function (err, data) {
if (err){
res.sendStatus(500);
}
else if (data) {
res.end(data);
}
else{
res.sendStatus(404);
}
});
});
What can I do to have the response wait for the database search to complete? Or is there a better way to return a database document from a request? Thanks for the help!

You should measure how long the db query takes.
If it's slow (>5 sec) and you can't speed it up, then it might be a good idea to decouple it from the request by using some kind of job framework.
Return a redirect to the URL where the job status/result will be available.

I feel silly about this, but I completely ignored the fact that when using http.createServer(), I had a timeout set of 3000 ms. I misunderstood what this timeout was for and this is what was causing my connection to close prematurely. Increasing this number allowed my most stubborn queries to complete.
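For illustration, a minimal sketch of raising the socket timeout on a plain `http` server. The handler body and the 30-second value are assumptions, not the asker's actual code; the server is created but not started here.

```javascript
var http = require('http');

// Hypothetical handler; a real one would run the database query here.
var server = http.createServer(function (req, res) {
  res.end('ok');
});

// Allow sockets to stay idle for up to 30 seconds before Node closes them.
// The value is an assumption; pick one longer than the slowest expected query.
server.setTimeout(30000);

console.log(server.timeout); // 30000
```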

Issuing internal express request

I'm curious if there is any way to issue an internal request in express without going through all the actual overhead of a real request. An example probably shows the motivation better:
app.get("/pages/:page", function(req, res)
{
  database_get(req.params.page, function(result)
  {
    // "Page" has an internal data reference, which we want to inline with the actual data:
    request(result.user_href, function(user_response)
    {
      result.user = user_response.json;
      res.send(result);
    });
  });
});
/// ....
app.get("/user/:name", function() ... );
So what we have here is a route whose data requires making another request to get further data. I'd like to access it by just doing something like app.go_get(user_href) instead of the heavy weight actual request. Now, I've asked around and the going strategy seems to be "split out your logic". However, it actually requires me to duplicate the logic, since the recursive data is referenced properly through URLs (as in the example above). So I end up having to do my own routing and duplicating routes everywhere.

Can you avoid the overhead of a real request? No. If you need the href from the first request in order to go to get a user object, you absolutely need to follow that link by making a second "real request."
If you have a database of users, you CAN avoid the request by including the user's ID on the page, and making a regular database call instead of following your own href.
Demo refactor on splitting out logic:
// Keep as little logic as possible in your routes:
app.get('/page/:page', function(req, res){
var pageId = req.params.page;
makePage(pageId, function(err, result){
if(err){ return res.sendStatus(500) }
res.send(result)
})
})
// Abstract anything with a bunch of callback hell:
function makePage(pageId, callback){
  database_get(pageId, function(result) {
    // Since it's only now you know where to get the user info, the second request is acceptable
    // But abstract it:
    getUserByHref(result.user_href, function(err, user){
      if(err){ return callback(err); }
      result.user = user;
      callback(null, result);
    });
  });
}
// Also abstract anything used more than once:
function getUserByHref(href, callback){
  request(href, function(err, response, body){
    // check the transport error first; response is undefined when err is set
    if(err){ return callback(err); }
    if(response.statusCode != 200){
      return callback(new Error('Unexpected status code: ' + response.statusCode));
    }
    var user = JSON.parse(body);
    return callback(null, user);
  })
}
// It sounds like you don't have local users
// If you did, you would abstract the database call, and use getUserById
function getUserById(id, callback){
db.fetch(id, function(err, data){
return callback(err, data);
})
}

I've made a dedicated middleware for this uest, see my detailed answer here: https://stackoverflow.com/a/59514893/133327

Node.js function without callback

I have a node.js server. When a user requests a page I call a function that pulls some info from db and services the request. Simple function with callback then execute response.send
I need to perform secondary computation/database updates which are not necessary for rendering the page request. I don't want the user to wait for these secondary ops to complete (even though they take only 200 ms.)
Is there a way to call a function and exit gracefully without callback?

You can simply do something like this
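app.get('/path', function(req, res){
  // get info from the database (assumed here to take a Node-style callback)
  getInfoFromDatabase(function(err, data){
    if (err) { return res.sendStatus(500); }
    res.render('myview', {info: data});
    // perform post render operations
    postRenderingCode();
  });
});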

If I understand your problem correctly you can use setTimeout with a value of 0 to place the maintenance code at the end of the execution queue.
function service(user, callback) {
// This will be done later
setTimeout(function() {
console.log("Doing some maintenance work now...");
}, 0);
// Service the user
callback("Here's your data " + user);
}
service("John", function(data) { console.log(data); });
service("Jane", function(data) { console.log(data); });
The output will be:
Here's your data John
Here's your data Jane
Doing some maintenance work now...
Doing some maintenance work now...

You can call your extra ASYNCHRONOUS function before, or after your actual response; for example:
yourCoolFunction() // does awesome stuff...
response.writeHead(200, 'OK');
response.write('some cool data response');
response.end();
Note that the "yourCoolFunction" mentioned must be asynchronous, else the rest of the code will wait for it to complete.

Assuming you're using express.js:
function(req, res, next) {
doSomeAsyncWork(function(e, d) {
// Some logic.
doSomeMoreAsyncWork(function() {})
res.send(/* some data*/)
})
}
Basically you don't really care about the response of the additional async work so you can put in a function that does nothing for the callback.

Since none of the answers so far seem helpful to me, and to avoid confusion, I suggest using events on the object you are working with:
// myObj is assumed to be an EventEmitter
function doStuff() {
  myObj.emit('myEvent', param);
}
function callback(param) {
  // do stuff
}
myObj.on('myEvent', callback);

Well, just do what you said: render the page, respond to the request and do whatever you have to do. Your code isn't suddenly going to die because you responded to the request.
with express:
function handleTheRequest(req, res) {
res.status(200).send("the response")
// do whatever you like here
}

Nested callbacks in javascript

I have a small problem with nested callbacks in JavaScript. Apparently I'm doing something wrong, but I did my research and tried to follow the tutorials available throughout the web. I know that my code works, since the query returns proper data, but I have no idea why my code doesn't "wait" within the executeQuery method until the res is fetched from the database; it just goes straight to the "oh noes" section.
DatabaseConnection.prototype.executeQuery = function(query, executeQueryDone){
var activeConnection;
console.log("YEAAA, executing Query: " + query);
this.pool.getConnection(function (err, connection){
console.log("Got Connection, we are ready to go!");
if (err){
console.log("Error, DAMMNIT! " + err);
executeQueryDone(err);
}
activeConnection = connection;
activeConnection.connect();
activeConnection.query(query, function(error, res){
console.log("Connection from pool is executing Query");
if(error){
console.log("Error during executing query");
executeQueryDone(error);
}
else {
console.log(" OK now release connection (dont be selfish)! ");
activeConnection.release();
executeQueryDone(null, res);
}
});
});
console.log("oh noes! IM AFTER CONNECTION, why dude? WHY???? ");
};
I'd be grateful for any hints, since I've been struggling with this since yesterday.
=====================
PROBLEM SOLVED:
generally all was OK, the "issue" was mistakenly written test:
i made it like that:
describe('testDB2', function () {
it('should return proper STUFF', function (done) {
assert.equal(1, someService.getStuff(function(err, result){
if (err === null){
console.log("err is null, as it should be!");
}
console.log(" result from DB " + result[1].NUMBERS);
}));
});
});
while it should be like this:
describe('testDB2', function () {
it('should return proper STUFF', function (done) {
someService.getStuff(function(err, result){
assert.equal(err, null);
assert.equal(result[1].NUMBERS, 43637654);
done();
});
});
});
As a result (in the incorrect case), I didn't fetch the result the way I wanted, as the assert couldn't "catch up".
Thanks to all for the enlightenment ;)

Your console.log call isn't part of a callback, so it will be called as soon as the getConnection call is made. If you want it to be called only after your callback to getConnection fires, you either need to call it at the end of that call, or you need to use some form of promises.

Javascript is single threaded, but it does use a task queue. When the database connection is instantiated and the pool connected to, the response to that is placed into the task queue to be executed or actioned when complete.
Directly after placing that in the task queue the next piece of execution is the log that does the "oh noes" - nice messaging btw lol.
So essentially what happens is the db call gets placed in the task queue for later execution, and then the log occurs, and then the task queue executes at a later time with the db response.
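The ordering described above can be reproduced with a small sketch. `getConnection` here is a hypothetical stand-in for the pool call; it only schedules its callback with `setTimeout` instead of touching a database.

```javascript
var events = [];

// Hypothetical stand-in for pool.getConnection: it returns immediately and
// invokes the callback on a later turn of the event loop.
function getConnection(callback) {
  setTimeout(function () {
    callback(null, { id: 1 });
  }, 0);
}

function executeQuery(done) {
  events.push('before getConnection');
  getConnection(function (err, connection) {
    events.push('inside callback');
    done(err, events);
  });
  // Runs before the callback above, because getConnection has only
  // scheduled its work at this point.
  events.push('after getConnection');
}

var finished = new Promise(function (resolve) {
  executeQuery(function (err, log) {
    console.log(log.join(' -> '));
    // prints: before getConnection -> after getConnection -> inside callback
    resolve(log);
  });
});
```

Anything that must happen after the query therefore belongs inside the callback (or in the `done` function passed to it), not after the call that registers it.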

error handling in asynchronous node.js calls

I'm new to node.js although I'm pretty familiar with JavaScript in general. My question is regarding "best practices" on how to handle errors in node.js.
Normally when programming web servers, FastCGI servers or web pages in various languages I'm using Exceptions with blocking handlers in a multi-threading environment. When a request comes in I usually do something like this:
function handleRequest(request, response) {
try {
if (request.url=="whatever")
handleWhateverRequest(request, response);
else
throw new Error("404 not found");
} catch (e) {
response.writeHead(500, {'Content-Type': 'text/plain'});
response.end("Server error: "+e.message);
}
}
function handleWhateverRequest(request, response) {
if (something)
throw new Error("something bad happened");
response.end("OK");
}
This way I can always handle internal errors and send a valid response to the user.
I understand that with node.js one is supposed to do non-blocking calls which obviously leads to various number of callbacks, like in this example:
var sys = require('sys'),
fs = require('fs');
require("http").createServer(handleRequest).listen(8124);
function handleRequest(request, response) {
fs.open("/proc/cpuinfo", "r",
function(error, fd) {
if (error)
throw new Error("fs.open error: "+error.message);
console.log("File open.");
var buffer = new require('buffer').Buffer(10);
fs.read(fd, buffer, 0, 10, null,
function(error, bytesRead, buffer) {
buffer.dontTryThisAtHome(); // causes exception
response.end(buffer);
}); //fs.read
}); //fs.open
}
This example will kill the server completely because the exceptions aren't being caught.
My problem here is that I can't use a single try/catch anymore and thus can't generally catch any error that may be raised during the handling of the request.
Of course I could add a try/catch in each callback, but I don't like that approach because then it's up to the programmer not to forget a try/catch. For a complex server with lots of different and complex handlers this isn't acceptable.
I could use a global exception handler (preventing the complete server crash) but then I can't send a response to the user since I don't know which request led to the exception. This also means that the request remains unhandled/open and the browser is waiting forever for a response.
Does someone have a good, rock solid solution?

Node 0.8 introduces a new concept called "Domains". They are very roughly analogous to AppDomains in .NET and provide a way of encapsulating a group of IO operations. They basically allow you to wrap your request processing calls in a context specific group. If this group throws any uncaught exceptions then they can be handled and dealt with in a manner which gives you access to all the scope and context specific information you require in order to successfully recover from the error (if possible).
This feature is new and has only just been introduced, so use with caution, but from what I can tell it has been specifically introduced to deal with the problem which the OP is trying to tackle.
Documentation can be found at: http://nodejs.org/api/domain.html

Check out the uncaughtException handler in node.js. It captures the thrown errors that bubble up to the event loop.
http://nodejs.org/docs/v0.4.7/api/process.html#event_uncaughtException_
But not throwing errors is always a better solution. You could just do a return res.end('Unable to load file xxx');

This is one of the problems with Node right now. It's practically impossible to track down which request caused an error to be thrown inside a callback.
You're going to have to handle your errors within the callbacks themselves (where you still have a reference to the request and response objects), if possible. The uncaughtException handler will stop the node process from exiting, but the request that caused the exception in the first place will just hang there from the user point of view.

Very good question. I'm dealing with the same problem now. Probably the best way would be to use uncaughtException. The reference to the response and request objects is not the problem, because you can wrap them into your exception object, which is passed to the uncaughtException event. Something like this:
var HttpException = function (request, response, message, code) {
this.request = request;
this.response = response;
this.message = message;
this.code = code || 500;
}
Throw it:
throw new HttpException(request, response, 'File not found', 404);
And handle the response:
process.on('uncaughtException', function (exception) {
exception.response.writeHead(exception.code, {'Content-Type': 'text/html'});
exception.response.end('Error ' + exception.code + ' - ' + exception.message);
});
I haven't tested this solution yet, but I don't see a reason why it couldn't work.

I give an answer to my own question... :)
As it seems, there is no way around catching errors manually. I now use a helper function that itself returns a function containing a try/catch block. Additionally, my own web server class checks if either the request handling function calls response.end() or the try/catch helper function waitfor() (raising an exception otherwise). This largely prevents requests from being mistakenly left unprotected by the developer. It isn't a 100% error-proof solution but works well enough for me.
handler.waitfor = function(callback) {
var me=this;
// avoid exception because response.end() won't be called immediately:
this.waiting=true;
return function() {
me.waiting=false;
try {
callback.apply(this, arguments);
if (!me.waiting && !me.finished)
throw new Error("Response handler returned and did neither send a "+
"response nor did it call waitfor()");
} catch (e) {
me.handleException(e);
}
}
}
This way I just have to add an inline waitfor() call to be on the safe side.
function handleRequest(request, response, handler) {
fs.read(fd, buffer, 0, 10, null, handler.waitfor(
function(error, bytesRead, buffer) {
buffer.unknownFunction(); // causes exception
response.end(buffer);
}
)); //fs.read
}
The actual checking mechanism is a little more complex, but it should be clear how it works. If someone is interested I can post the full code here.

One idea: You could just use a helper method to create your callbacks and make it your standard practice to use it. This still puts the burden on the developer, but at least you can have a "standard" way of handling your callbacks such that the chance of forgetting one is low:
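var callWithHttpCatch = function(response, fn) {
  try {
    fn && fn();
  }
  catch (e) {
    response.writeHead(500, {'Content-Type': 'text/plain'}); //No
    // end the response so the client is not left waiting
    response.end("Server error: " + e.message);
  }
}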
<snipped>
var buffer = new require('buffer').Buffer(10);
fs.read(fd, buffer, 0, 10, null,
function(error, bytesRead, buffer) {
// pass a function, so the exception is thrown inside the helper's try block
callWithHttpCatch(response, function() {
buffer.dontTryThisAtHome(); // causes exception
response.end(buffer);
});
}); //fs.read
}); //fs.open
I know that probably isn't the answer you were looking for, but one of the nice things about ECMAScript (or functional programming in general) is how easily you can roll your own tooling for things like this.

At the time of this writing, the approach I am seeing is to use "Promises".
http://howtonode.org/promises
https://www.promisejs.org/
These allow code and callbacks to be structured well for error management and also make the code more readable.
It primarily uses the .then() function.
someFunction().then(success_callback_func, failed_callback_func);
Here's a basic example:
var SomeModule = require('someModule');
var success = function (ret) {
console.log('>>>>>>>> Success!');
}
var failed = function (err) {
if (err instanceof SomeModule.errorName) {
// Note: I've often seen the error definitions in SomeModule.errors.ErrorName
console.log("FOUND SPECIFIC ERROR");
}
console.log('>>>>>>>> FAILED!');
}
someFunction().then(success, failed);
console.log("This line will appear instantly, since the last function was asynchronous.");
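Applied to the original question, a rough sketch could look like the following. `readSomething` and `makeFakeResponse` are hypothetical helpers introduced only to keep the example runnable without a server; the point is that an exception thrown inside `.then()` reaches the single `.catch()`, where the response object is still in scope.

```javascript
// Hypothetical promise-returning read; resolves asynchronously with a buffer.
function readSomething() {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(Buffer.from('cpu info')); }, 0);
  });
}

// Minimal fake of an HTTP response object, used only for this sketch.
function makeFakeResponse() {
  return {
    statusCode: 200,
    body: null,
    writeHead: function (code) { this.statusCode = code; },
    end: function (text) { this.body = String(text); }
  };
}

function handleRequest(request, response) {
  return readSomething()
    .then(function (buffer) {
      buffer.dontTryThisAtHome(); // throws a TypeError: not a function
      response.end(buffer);
    })
    .catch(function (e) {
      // The exception thrown inside .then() arrives here, so the request
      // still gets a response instead of crashing the process.
      response.writeHead(500, { 'Content-Type': 'text/plain' });
      response.end('Server error: ' + e.message);
    });
}

var response = makeFakeResponse();
var handled = handleRequest({ url: '/whatever' }, response);
```

This restores something close to the single try/catch per request that the question asks for, as long as every asynchronous step in the handler is part of the same promise chain.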

Two things have really helped me solve this problem in my code.
The 'longjohn' module, which lets you see the full stack trace (across multiple asynchronous callbacks).
A simple closure technique to keep exceptions within the standard callback(err, data) idiom (shown here in CoffeeScript).
ferry_errors = (callback, f) ->
return (a...) ->
try f(a...)
catch err
callback(err)
Now you can wrap unsafe code, and your callbacks all handle errors the same way: by checking the error argument.

I've recently created a simple abstraction named WaitFor to call async functions in sync mode (based on Fibers): https://github.com/luciotato/waitfor
It's too new to be "rock solid".
Using wait.for, you can use async functions as if they were sync, without blocking node's event loop. It's almost the same as what you're used to:
var wait=require('wait.for');
function handleRequest(request, response) {
//launch fiber, keep node spinning
wait.launchFiber(handleInFiber,request, response);
}
function handleInFiber(request, response) {
try {
if (request.url=="whatever")
handleWhateverRequest(request, response);
else
throw new Error("404 not found");
} catch (e) {
response.writeHead(500, {'Content-Type': 'text/plain'});
response.end("Server error: "+e.message);
}
}
function handleWhateverRequest(request, response, callback) {
if (something)
throw new Error("something bad happened");
response.end("OK");
}
Since you're in a fiber, you can program sequentially, "blocking the fiber", but not node's event loop.
The other example:
var sys = require('sys'),
fs = require('fs'),
wait = require('wait.for');
require("http").createServer( function(req,res){
wait.launchFiber(handleRequest,req,res) //handle in a fiber
}).listen(8124);
function handleRequest(request, response) {
try {
var fd=wait.for(fs.open,"/proc/cpuinfo", "r");
console.log("File open.");
var buffer = new require('buffer').Buffer(10);
var bytesRead=wait.for(fs.read,fd, buffer, 0, 10, null);
buffer.dontTryThisAtHome(); // causes exception
response.end(buffer);
}
catch(err) {
response.end('ERROR: '+err.message);
}
}
As you can see, I used wait.for to call node's async functions in sync mode,
without (visible) callbacks, so I can have all the code inside one try-catch block.
wait.for will throw an exception if any of the async functions returns err!==null
more info at https://github.com/luciotato/waitfor

Even in synchronous multi-threaded programming (e.g. .NET, Java, PHP) you can't return any meaningful information to the client when an unknown custom exception is caught. You can only return HTTP 500 when you have no information about the exception.
Thus, the 'secret' lies in filling a descriptive Error object, this way your error handler can map from the meaningful error to the right HTTP status + optionally a descriptive result. However you must also catch the exception before it arrives to process.on('uncaughtException'):
Step1: Define a meaningful error object
function appError(errorCode, description, isOperational) {
Error.call(this);
Error.captureStackTrace(this);
this.errorCode = errorCode;
//...other properties assigned here
};
appError.prototype.__proto__ = Error.prototype;
module.exports.appError = appError;
Step2: When throwing an Exception, fill it with properties (see step 1) that allow the handler to convert it to a meaningful HTTP result:
throw new appError(errorManagement.commonErrors.resourceNotFound, "further explanation", true)
Step3: When invoking some potentially dangerous code, catch errors and re-throw that error while filling additional contextual properties within the Error object
Step4: You must catch the exception during the request handling. This is easier if you use a leading promises library (BlueBird is great) which allows you to catch async errors. If you can't use promises, then note that any built-in Node library will return errors in its callback.
Step5: Now that your error is caught and contains descriptive information about what happens, you only need to map it to meaningful HTTP response. The nice part here is that you may have a centralized, single error handler that gets all the errors and map these to HTTP response:
//this specific example is using Express framework
res.status(getErrorHTTPCode(error))
function getErrorHTTPCode(error)
{
if(error.errorCode == commonErrors.InvalidInput)
return 400;
else if...
}
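As a self-contained sketch of steps 1, 2 and 5, assuming ES2015 class syntax instead of the prototype assignment above. The `commonErrors` codes and the status mapping are hypothetical and would be defined by the application.

```javascript
// Hypothetical error codes; a real application would define its own set.
var commonErrors = {
  invalidInput: 'invalidInput',
  resourceNotFound: 'resourceNotFound'
};

// Step 1: a descriptive error type that carries an application error code.
class AppError extends Error {
  constructor(errorCode, description, isOperational) {
    super(description);
    this.name = 'AppError';
    this.errorCode = errorCode;
    this.isOperational = isOperational;
  }
}

// Step 5: one central mapping from a descriptive error to an HTTP status.
function getErrorHTTPCode(error) {
  if (error.errorCode === commonErrors.invalidInput) return 400;
  if (error.errorCode === commonErrors.resourceNotFound) return 404;
  return 500; // unknown errors carry no information, so fall back to 500
}

// Step 2: create the error with enough context for the handler to map it.
var notFound = new AppError(commonErrors.resourceNotFound, 'further explanation', true);
console.log(getErrorHTTPCode(notFound));       // 404
console.log(getErrorHTTPCode(new Error('x'))); // 500
```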
You may find other related best practices here
