Node.js async, but only handle first positive/defined result - javascript

What is the best way to create parallel asynchronous HTTP requests and take the first result that comes back positive? I am familiar with the async library for JavaScript and would happy to use that but am not sure if it has exactly what I want.
Background - I have a Redis store that serves as state for a server. There is an API we can call to get some data that takes much longer than reaching the Redis store.
In most cases the data will already be in the Redis store, but in some cases it won't be there yet and we need to retrieve it from the API.
The simple thing to do would be to query Redis, and if the value is not in Redis then go to the API afterwards. However, we'll needlessly lose 20-50ms if the data is not yet in our Redis cache and we have to go to the API after failing to find the data with Redis. Since this particular API server is not under great load, it won't really hurt to go to the API simultaneously/in parallel, even if we don't absolutely need the returned value.
//pseudocode below
async.minimum([
function apiRequest(cb){
request(opts,function(err,response,body){
cb(err,body.result.hit);
}
},
function redisRequest(cb){
client.get("some_key", function(err, reply) {
cb(err,reply.result.hit);
});
}],
function minimumCompleted(err,result){
// this mimimumCompleted final callback function will be only fired once,
// and would be fired by one of the above functions -
// whichever one *first* returned a defined value for result.hit
});
is there a way to get what I am looking for with the async library or perhaps promises, or should I implement something myself?

Use Promise.any([ap, bp]).
The following is a possible way to do it without promises. It is untested but should meet the requirements.
To meet requirement of returning the first success and not just the first completion, I keep a count of the number of completions expected so that if an error occurs it can be ignored it unless it is the last error.
function asyncMinimum(a, cb) {
var triggered = false;
var completions = a.length;
function callback(err, data) {
completions--;
if (err && completions !== 0) return;
if (triggered) return;
triggered = true;
return cb(err, data);
}
a.map(function (f) { return f(callback); });
}
asyncMinimum([
function apiRequest(cb){
request(opts,function(err,response,body){
cb(err,body.result.hit);
}
},
function redisRequest(cb){
client.get("some_key", function(err, reply) {
cb(err,reply.result.hit);
});
}],
function minimumCompleted(err,result){
// this mimimumCompleted final callback function will be only fired once,
// and would be fired by one of the above functions -
// whichever one had a value for body.result.hit that was defined
});

The async.js library (and even promises) keep track of the number of asynchronous operations pending by using a counter. You can see a simple implementation of the idea in an answer to this related question: Coordinating parallel execution in node.js
We can use the same concept to implement the minimum function you want. Only, instead of waiting for the counter to count all responses before triggering a final callback, we deliberately trigger the final callback on the first response and ignore all other responses:
// IMHO, "first" is a better name than "minimum":
function first (async_functions, callback) {
var called_back = false;
var cb = function () {
if (!called_back) {
called_back = true; // block all other responses
callback.apply(null,arguments)
}
}
for (var i=0;i<async_functions.length;i++) {
async_functions[i](cb);
}
}
Using it would be as simple as:
first([apiRequest,redisRequest],function(err,result){
// ...
});

Here's an approach using promises. It takes a little extra custom code because of the non-standard result you're looking for. You aren't just looking for the first one to not return an error, but you're looking for the first one that has a specific type of result so that takes a custom result checker function. And, if none get a result, then we need to communicate that back to the caller by rejecting the promise too. Here's the code:
function firstHit() {
return new Promise(function(resolve, reject) {
var missCntr = 0, missQty = 2;
function checkResult(err, val) {
if (err || !val) {
// see if all requests failed
++missCntr;
if (missCntr === missQty) {
reject();
}
} else {
resolve(val);
}
}
request(opts,function(err, response, body){
checkResult(err, body.result.hit);
}
client.get("some_key", function(err, reply) {
checkResult(err, reply.result.hit);
});
});
}
firstHit().then(function(hit) {
// one of them succeeded here
}, function() {
// neither succeeded here
});
The first promise to call resolve() will trigger the .then() handler. If both fail to get a hit, then it will reject the promise.

Related

Observable determine if subscriber function has finished

What is the best way to determine if the subscriber has finished executing or better yet return something and catch it up-stream? For example:
this._subscriptions.push(this._client
.getCommandStream(this._command) // Returns an IObservable from a Subject stream
.subscribe(msg => {
// Do some processing maybe some promise stuff
http.request(url).then(
// some more stuff
);
});
What's the best know to determine that subscription has finished. I've implemented it as follows:
this._subscriptions.push(this._client
.getCommandStream(this._command)
.subscribe(msg => {
// Do some processing maybe some promise stuff
http.request(url).then(re => {
// some more stuff
msg.done()
}).catch(err => msg.done(err));
});
i.e. added a done method to the object being passed in to determine if this is finished. The issue with that is I'll have to call done in every promise or catch block and find that a little too exhaustive. Is there a cleaner and more automated way of doing this?
I think the examples I've given are not good enough. This implementation is using RX to build an internal messaging bus. The get command stream is actually returning a read-only channel (as an Observable) to get commands and process them. Now the processing could be a http request followed by many other things or just an if statement.
this._client
.getCommandStream(this._command) // Returns an IObservable from a Subject stream
.subscribe(msg => {
// Do some processing maybe some promise stuff
http.request(url).then({
// some more stuff
}).then({
// Here I wanna do some file io
if(x) {
file.read('path', (content) => {
msg.reply(content);
msg.done();
});
} else {
// Or maybe not do a file io or maybe even do some image processing
msg.reply("pong");
msg.done()
}
});
});
I feel like this is a fine usage of the Observable pattern as this is exactly a sequence of commands coming in and this logic would like to act on them. The question is notice msg.done() being called all over the place. I want to know what is the best way to limit that call and know when the entire thing is done. Another option is to wrap it all in a Promise but then again what's the difference between resolve or msg.done()?
Actually, making another asynchronous request inside subscribe() isn't recommended because it just makes things more complicated and using Rx in this way doesn't help you make your code more understandable.
Since you need to make a request to a remote service that returns a PRomise you can merge it into the chain:
this._subscriptions.push(this._client
.getCommandStream(this._command)
.concatMap(msg => http.request(url))
.subscribe(...)
Also the 3rd parameter to subscribe is a callback that is called when the source Observable completes.
You can also add your own teardown logic when the chain is being disposed. This is called after the complete callback in subscribe(...) is called:
const subscription = this._subscriptions.push(this._client
...
.subscribe(...)
subscription.add(() => doWhatever())
Btw, this is equivalent to using the finally() operator.
As per RxJs subscribe method documentation, the last Argument is completed function
var source = Rx.Observable.range(0, 3)
var subscription = source.subscribe(
function (x) {
console.log('Next: %s', x);
},
function (err) {
console.log('Error: %s', err);
},
function () {
console.log('Completed');
});
please refer this documentation
https://github.com/Reactive-Extensions/RxJS/blob/master/doc/api/core/operators/subscribe.md

Using promises/callbacks to wait for function to finish in javascript

I have a function which creates a database object out of three arrays. The arrays are filled in an each loop, one of the arrays relies on the value in the same iteration of the loop.
The dependent array uses the requests library and the cheerio library to grab a string to populate the array with.
Currently the dependent array fills with nulls which I think is because the loop is not waiting for the request to be returned.
I am still learning and would like to get this to work without direct blocking to keep things asynchronous so I'm looking into promises/callbacks.
This is being done server-side but from what I've seen in cheerios docs there is no promises capability.
Here's what I have so far. (getFile() is the function that isn't filling the 'c' array, it also depends on the current value being put into 'b'). I do know that the getFile function gets the correct value with a console log test, so the issue must be in the implementation of filling 'c'.
addToDB() is a function which saves a value into mongoDB, from testing I know that the objects are correctly being put into the db, just the c array is not correct.
function getInfo(path) {
$(path).each(function(i,e) {
a.push(...)
b.push(value)
c.push(getFile(value))
})
var entry = new DB...//(a,b,c)
addToDB(entry);
}
function getFile(...) {
request(fullUrl, function (err, resp, page) {
if (!err && resp.statusCode == 200) {
var $ = cheerio.load(page); // load the page
srcEp = $(this).attr("src");
return srcEp;
} // end error and status code
}); // end request
}
I've been reading about promises/callbacks and then() but I've yet to find anything which works.
First, you have to get your mind around the fact that any process that relies, at least in part, on an asynchronous sub-process, is itself inherently asynchronous.
At the lowest level of this question's code, request() is asynchronous, therefore its caller, getFile() is asynchronous, and its caller, getInfo() is also asynchronous.
Promises are an abstraction of the outcome of asynchronous processes and help enormously in coding the actions to be taken when such processes complete - successfully or under failure.
Typically, low-level asynchronous functions should return a promise to be acted on by their callers, which will, in turn, return a promise to their callers, and so on up the call stack. Inside each function, returned promise(s) may be acted on using promise methods, chiefly .then(), and may be aggregated using Promise.all() for example (syntax varies).
In this question, there is no evidence that request() currently returns a promise. You have three options :
discover that request() does, in fact, return a promise.
rewrite request() to return a promise.
write an adapter function (a "promisifier") that calls request(), and generates/returns the promise, which is later fulfilled or rejected depending on the outcome of request().
The first or second options would be ideal but the safe assumption for me (Roamer) is to assume that an adapter is required. Fortunately, I know enough from the question to be able to write one. Cheerio appears not to include jQuery's promise implementation, so a dedicated promise lib will be required.
Here is an adapter function, using syntax that will work with the Bluebird lib or native js promises:
//Promisifier for the low level function request()
function requestAsync(url) {
return new Promise(function(resolve, reject) {
request(url, function(err, resp, page) {
if (err) {
reject(err);
} else {
if (resp.statusCode !== 200) {
reject(new Error('request error: ' + resp.statusCode));
}
} else {
resolve(page);
}
});
});
}
Now getFile(...) and getInfo() can be written to make use of the promises returned from the lowest level's adapter.
//intermediate level function
function getFile(DOMelement) {
var fullUrl = ...;//something derived from DOMelement. Presumably either .val() or .text()
return requestAsync(fullUrl).then(function (page) {
var $ = cheerio.load(page);
srcEp = $(???).attr('src');//Not too sure what the selector should be. `$(this)` definitely won't work.
return srcEp;
});
}
//high level function
function getInfo(path) {
var a = [], b = [], c = [];//presumably
// Now map the $(path) to an array of promises by calling getFile() inside a .map() callback.
// By chaining .then() a/b/c are populated when the async data arrives.
var promises = $(path).map(function(i, e) {
return getFile(e).then(function(srcEp) {
a[i] = ...;
b[i] = e;
c[i] = srcEp;
});
});
//return an aggregated promise to getInfo's caller,
//in case it needs to take any action on settlement.
return Promise.all(promises).then(function() {
//What to do when all promises are fulfilled
var entry = new DB...//(a,b,c)
addToDB(entry);
}, function(error) {
//What to do if any of the promises fails
console.log(error);
//... you may want to do more.
});
}

WinJS, return a promise from a function which may or may not be async

I have a situation where my WinJS app wants to call a function which may or may not be async (e.g. in one situation I need to load some data from a file (async) but at other times I can load from a cache syncronously).
Having a look through the docs I though I could wrap the conditional logic in a promise like:
A)
return new WinJS.Promise(function() { // mystuff });
or possibly use 'as' like this:
B)
return WinJS.Promise.as(function() { // mystuff });
The problem is that when I call this function, which I'm doing from the ready() function of my first page like this:
WinJS.UI.Pages.define("/pages/home/home.html", {
ready: function () {
Data.Survey.init().done(function (result) {
// do some stuff with 'result'
});
}
});
When it is written like 'A' it never hits my done() call.
Or if I call it when it's written like 'B', it executes the code inside my done() instantly, before the promise is resolved. It also looks from the value of result, that it has just been set to the content of my init() function, rather than being wrapped up in a promise.
It feels like I'm doing something quite basically wrong here, but I'm unsure where to start looking.
If it's any help, this is a slimmed down version of my init() function:
function init() {
return new WinJS.Promise(function() {
if (app.context.isFirstRun) {
app.surveyController.initialiseSurveysAsync().then(function (result) {
return new WinJS.Binding.List(result.surveys);
});
} else {
var data = app.surveyController.getSurveys();
return new WinJS.Binding.List(data);
}
});
}
Does anyone have any thoughts on this one? I don't believe the 'may or may not be async' is the issue here, I believe the promise setup isn't doing what I'd expect. Can anyone see anything obviously wrong here? Any feedback greatly appreciated.
Generally speaking, if you're doing file I/O in your full init routine, those APIs return promises themselves, in which case you want to return one of those promises or a promise from one of the .then methods.
WinJS.Promise.as, on the other hand, is meant to wrap a value in a promise. But let me explain more fully.
First, read the documentation for the WinJS.Promise constructor carefully. Like many others, you're mistakenly assuming that you just wrap a piece of code in the promise and voila! it is async. This is not the case. The function that you pass to the constructor is an initializer that receives three arguments: a completeDispatcher function, an errorDispatcher function, and a progressDispatcher function, as I like to call them.
For the promise to ever complete with success, complete with an error, or report progress, it is necessary for the rest of the code in the initializer to eventually call one of the dispatchers. These dispatchers, inside the promise, then loop through and call any complete/error/progress methods that have been given to that promise's then or done methods. Therefore, if you don't call a dispatcher at all, there is no completion, and this is exactly the behavior you're seeing.
Using WinJS.Promise.as is similar in that it wraps a value inside a promise. In your case, if you pass a function to WinJS.promise.as, what you'll get is a promise that's fulfilled with that function value as a result. You do not get async execution of the function.
To achieve async behavior you must either use setTimeout/setInterval (or the WinJS scheduler in Windows 8.1) to do async work on the UI thread, or use a web worker for a background thread and tie its completion (via a postMessage) into a promise.
Here's a complete example of creating a promise using the constructor, handling complete, error, and progress cases (as well as cancellation):
function calculateIntegerSum(max, step) {
if (max < 1 || step < 1) {
var err = new WinJS.ErrorFromName("calculateIntegerSum", "max and step must be 1 or greater");
return WinJS.Promise.wrapError(err);
}
var _cancel = false;
//The WinJS.Promise constructor's argument is a function that receives
//dispatchers for completed, error, and progress cases.
return new WinJS.Promise(function (completeDispatch, errorDispatch, progressDispatch) {
var sum = 0;
function iterate(args) {
for (var i = args.start; i < args.end; i++) {
sum += i;
};
//If for some reason there was an error, create the error with WinJS.ErrorFromName
//and pass to errorDispatch
if (false /* replace with any necessary error check -- we don’t have any here */) {
errorDispatch(new WinJS.ErrorFromName("calculateIntegerSum", "error occurred"));
}
if (i >= max) {
//Complete--dispatch results to completed handlers
completeDispatch(sum);
} else {
//Dispatch intermediate results to progress handlers
progressDispatch(sum);
//Interrupt the operation if canceled
if (!_cancel) {
setImmediate(iterate, { start: args.end, end: Math.min(args.end + step, max) });
}
}
}
setImmediate(iterate, { start: 0, end: Math.min(step, max) });
},
//Cancellation function
function () {
_cancel = true;
});
}
This comes from Appendix A ("Demystifying Promises") of my free ebook, Programming Windows Store Apps in HTML, CSS, and JavaScript, Second Edition (in preview), see http://aka.ms/BrockschmidtBook2.
You would, in your case, put your data initialization code in the place of the iterate function, and perhaps call it from within a setImmediate. I encourage you to also look at the WinJS scheduler API that would let you set the priority for the work on the UI thread.
In short, it's essential to understand that new WinJS.Promise and WinJS.Promise.as do not in themselves create async behavior, as promises themselves are just a calling convention around "results to be delivered later" that has nothing inherently to do with async.

NodeJS Callback Scoping Issue

I am quite new (just started this week) to Node.js and there is a fundamental piece that I am having trouble understanding. I have a helper function which makes a MySQL database call to get a bit of information. I then use a callback function to get that data back to the caller which works fine but when I want to use that data outside of that callback I run into trouble. Here is the code:
/** Helper Function **/
function getCompanyId(token, callback) {
var query = db.query('SELECT * FROM companies WHERE token = ?', token, function(err, result) {
var count = Object.keys(result).length;
if(count == 0) {
return;
} else {
callback(null, result[0].api_id);
}
});
}
/*** Function which uses the data from the helper function ***/
api.post('/alert', function(request, response) {
var data = JSON.parse(request.body.data);
var token = data.token;
getCompanyId(token, function(err, result) {
// this works
console.log(result);
});
// the problem is that I need result here so that I can use it else where in this function.
});
As you can see I have access to the return value from getCompanyId() so long as I stay within the scope of the callback but I need to use that value outside of the callback. I was able to get around this in another function by just sticking all the logic inside of that callback but that will not work in this case. Any insight on how to better structure this would be most appreciated. I am really enjoying Node.js thus far but obviously I have a lot of learning to do.
Short answer - you can't do that without violating the asynchronous nature of Node.js.
Think about the consequences of trying to access result outside of your callback - if you need to use that value, and the callback hasn't run yet, what will you do? You can't sleep and wait for the value to be set - that is incompatible with Node's single threaded, event-driven design. Your entire program would have to stop executing whilst waiting for the callback to run.
Any code that depends on result should be inside the getCompanyId callback:
api.post('/alert', function(request, response) {
var data = JSON.parse(request.body.data);
var token = data.token;
getCompanyId(token, function(err, result) {
//Any logic that depends on result has to be nested in here
});
});
One of the hardest parts about learning Node.js (and async programming is general) is learning to think asynchronously. It can be difficult at first but it is worth persisting. You can try to fight and code procedurally, but it will inevitably result in unmaintainable, convoluted code.
If you don't like the idea of multiple nested callbacks, you can look into promises, which let you chain methods together instead of nesting them. This article is a good introduction to Q, one implementation of promises.
If you are concerned about having everything crammed inside the callback function, you can always name the function, move it out, and then pass the function as the callback:
getCompanyId(token, doSomethingAfter); // Pass the function in
function doSomethingAfter(err, result) {
// Code here
}
My "aha" moment came when I began thinking of these as "fire and forget" methods. Don't look for return values coming back from the methods, because they don't come back. The calling code should move on, or just end. Yes, it feels weird.
As #joews says, you have to put everything depending on that value inside the callback(s).
This often requires you passing down an extra parameter(s). For example, if you are doing a typical HTTP request/response, plan on sending the response down every step along the callback chain. The final callback will (hopefully) set data in the response, or set an error code, and then send it back to the user.
If you want to avoid callback smells you need to use Node's Event Emitter Class like so:
at top of file require event module -
var emitter = require('events').EventEmitter();
then in your callback:
api.post('/alert', function(request, response) {
var data = JSON.parse(request.body.data);
var token = data.token;
getCompanyId(token, function(err, result) {
// this works
console.log(result);
emitter.emit('company:id:returned', result);
});
// the problem is that I need result here so that I can use it else where in this function.
});
then after your function you can use the on method anywhere like so:
getCompanyId(token, function(err, result) {
// this works
console.log(result);
emitter.emit('company:id:returned', result);
});
// the problem is that I need result here so that I can use it else where in this function.
emitter.on('company:id:returned', function(results) {
// do what you need with results
});
just be careful to set up good namespacing conventions for your events so you don't get a mess of on events and also you should watch the number of listeners you attach, here is a good link for reference:
http://www.sitepoint.com/nodejs-events-and-eventemitter/

referencing the same object inside another callback from forEach

I'm trying to add a new property to an object. It seems that it works correctly from this scope:
rows.forEach(function (row, i) {
row.foo = i;
});
If I do console.log(rows) I can see that foo was added with the correct value. If I have another callback within the forEach, I don't see the change any more. Why?
rows.forEach(function (row, i) {
getUserById(row.user_id, function(user) {
row.foo = i;
});
});
Yes, the callback get's fired correctly. Here is the getUserById
function getUserById(userId, callback) {
connection.query('select * from t_user where id = ?', [userId], function(err, results) {
if (err) {
console.log("repo error");
} else {
if (results.length == 0) {
callback(null);
} else {
callback(results[0]);
}
}
});
}
I can only image that you would be seeing this issue if getUserById defers calling the callback function you pass it. On that assumption, your problem is a timing issue.
Each time you call getUserById and pass it a callback, you are basically saying "Call this function when you get the user for this id", and then going along with the rest of your program. That means that when console.log is called, your callbacks have not yet been called. Instead, you have to wait for all of your callbacks to finish, and then call any code that relies on the values you are setting in your callback.
The general approach to waiting for all your callbacks to finish is to craft a special callback function which will delegate to each callback and also keep track of once they've all been called.
Here is a question with an answer using the async library, which I'm a fan of (though there are a number of libraries to tackle this control flow problem).
Idiomatic way to wait for multiple callbacks in Node.js
Here's a raw JavaScript solution that is robust enough to work in the general case where not all the callbacks are the same (that is, if you wanted a different callback for each user id).
Multiple Asynchronous Callbacks - Using an array to fire function when complete

Categories

Resources