I have this code for managing a dashboard which contains approximately 100 independent checks.
Check results are received via an AJAX call.
There is one initial request for each check at the start. After the result is received for a particular check, the code waits for a set timeout and then recursively repeats the request for that check.
One promise = one check.
I am wondering why the promises start to resolve only after every one of them is pending (none of them is in its timeout period). That happens even if the response from the server is "instantaneous"; they just wait for the last promise in the cycle.
const TIMEOUT = 4000;

function checkForUpdate(environment, application, check) {
    Dashboard.setCheckPending(environment, application, check);
    return Communicator.getStatus(environment, application, check)
        .then(status => {
            Dashboard.updateCheckCell(environment, application, check, status);
            Dashboard.updateEnvironmentCell(environment, application);
            setTimeout(() => {
                return checkForUpdate(environment, application, check);
            }, TIMEOUT);
        });
}
Communicator.getEnvMatrix()
    .then(data => {
        Dashboard.create(data);
        $.each(data, (environment, applications) => {
            $.each(applications, (application, checks) => {
                $.each(checks, (key, check) => {
                    checkForUpdate(environment, application, check);
                });
            });
        });
    });
The question is also how to rewrite this so that each of the checks waits only for its own result to be delivered and for the set timeout.
EDIT (clarification):
Each of the 100 checks is independent; that is why I want to run the AJAX for each of them as soon as I can (inside the $.each() loops).
A check depends only on itself. I don't want it to wait for any other check.
After the result of a check is received, it has to wait for the set timeout before it tries to retrieve its status again. That is why I encapsulated the recursive call within setTimeout().
Even if I rewrite the setTimeout() as a promise (see below), the behavior unfortunately stays the same.
function delay(timeout) {
    return new Promise(resolve => {
        setTimeout(resolve, timeout);
    });
}

function checkForUpdate(environment, application, check) {
    Dashboard.setCheckPending(environment, application, check);
    let promise = Communicator.getStatus(environment, application, check).promise();
    return promise
        .then(status => {
            Dashboard.updateCheckCell(environment, application, check, status);
            Dashboard.updateEnvironmentCell(environment, application);
            return delay(TIMEOUT).then(() => {
                return checkForUpdate(environment, application, check);
            });
        });
}
Your code runs $.each() synchronously. That means it is going to call every checkForUpdate() before anything else can run. Since standards-conforming promises are always resolved asynchronously (on some future tick), that means that every single request here will get started before ANY promise can run its .then() handler. That's how promises work. Only once the $.each() loop is done can the JavaScript interpreter start to process the .then() handlers of resolved promises.
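Here is a minimal sketch of that ordering, using plain promises and no AJAX:

for (let i = 0; i < 3; i++) {
    // the .then() handler is queued, but cannot run yet
    Promise.resolve(i).then(n => console.log('then', n));
    console.log('loop', i);
}
// Logs: loop 0, loop 1, loop 2, then 0, then 1, then 2
// No .then() handler runs until the synchronous loop has finished.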
Also, it is unclear why you are trying to do a return checkForUpdate(environment, application, check) inside the setTimeout(). The return there does nothing. It just returns from the setTimeout() callback, which accomplishes nothing. The parent function has long since returned, so this is not chaining the next checkForUpdate() onto the previous promise chain. If you wanted to chain them together, then you need to make a delay with a promise and return that promise, as shown in these references:
using setTimeout on promise chain
Delays between promises in promise chain
Delay chained promise
The unexpected thing is that even if all 100 requests are sent immediately after the page loads, they are resolved just a few at a time and (roughly) in the order they've been sent. Roughly means 3, 2, 5, 1, 8, .... But I'd expect something like 3, 89, 12, 76, 21, 94, .... There seems to be some limit on how many promises can run concurrently and in what order.
Another thing that will influence your ajax calls is that each browser has a limit on how many concurrent ajax calls it will allow to the same host. If you exceed that limit, it will queue them and not run subsequent ones until some earlier ones finish. Each browser sets its own limit and they've changed over time, so I don't know exactly what the current limits are, but they are lowish. I know Chrome used to allow something like 6 at a time to the same host. So, that will also affect the exact order in which things complete.
When you hit this limit, Ajax calls will be sent in the order they were called by your code. So, if the limit was 6 per host, then your first 6 would be sent and the 7th request would only be sent when one of the first 6 finished and so on. That still doesn't guarantee a finish order, but it does affect the ability for a later request to finish before an earlier request.
Related
I know JS is single threaded. But I have a function which takes time to calculate. I would like it to run in parallel, so this function would not freeze the next statement. The calculation inside the function will take approximately 1-2 seconds.
I tried to do it using a promise, but it still freezes the next statement.
console.log("start");
new Promise((res, rej) => {
/* calculations */
}).then((res) => console.log(res));
console.log("end");
Then I used the setTimeout function with a time interval of 0. LoL
console.log("start");
setTimeout(() => {
/* calculations */
console.log(res);
}, 0);
console.log("end");
Outputs:
start
end
"calculation result"
Both cases show a similar result, but using a promise prevented console.log("end") from showing before the calculation finished. Using setTimeout works as I wanted and shows console.log("end") before the calculation, so it did not freeze until the calculation was done.
I hope that was clear enough. For now, using setTimeout is the best solution for me, but I would be happy to hear your ideas or any other method of calculating concurrently without setTimeout.
The code you write inside new Promise(() => { ..code here.. }) is not asynchronous. It is a very common misconception that everything inside the Promise block runs asynchronously.
Instead, this JS API just gives us a hook for some deferred task to be done once the promise is resolved. From MDN:
Promises are a comparatively new feature of the JavaScript language that allow you to defer further actions until after a previous action has completed, or respond to its failure. This is useful for setting up a sequence of async operations to work correctly.
new Promise(() => {
    // whatever I write here is synchronous:
    // console.log, function calls, setTimeout()/fetch()/async web APIs...
    // async tasks like fetch and setTimeout are async by themselves,
    // but their invocation here is still synchronous
})
setTimeout is not the correct option either. Code inside setTimeout runs when the call stack is empty, and once it starts running, it blocks the main thread again.
The right approach to this would be to use Web Workers.
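A minimal sketch of that approach (the file name worker.js and the loop standing in for your calculation are made up for illustration):

// main.js - spawn a worker so the calculation runs off the main thread
const worker = new Worker('worker.js');
console.log("start");
worker.postMessage(1e9);                       // hand the input to the worker
worker.onmessage = (e) => console.log(e.data); // fires when the result arrives
console.log("end");                            // logs immediately; nothing freezes

// worker.js - runs on its own thread
onmessage = (e) => {
    let sum = 0;
    for (let i = 0; i < e.data; i++) sum += i; // the 1-2 second calculation
    postMessage(sum);                          // send the result back
};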
I am trying to understand the mechanism of async functions. I found some code in the MDN docs, made some modifications and... cannot fully understand how it works.
var resolveAfter2Seconds = function() {
    console.log("starting slow promise");
    return new Promise(resolve => {
        setTimeout(function() {
            resolve(20);
            console.log("slow promise is done");
        }, 6000);
    });
};

var resolveAfter1Second = function() {
    console.log("starting fast promise");
    return new Promise(resolve => {
        setTimeout(function() {
            resolve(10);
            console.log("fast promise is done");
        }, 4000);
    });
};

var sequentialStart = async function() {
    console.log('==SEQUENTIAL START==');
    const slow = await resolveAfter2Seconds();
    const fast = await resolveAfter1Second();
    console.log(fast);
    console.log('why?');
    console.log(slow);
}

sequentialStart();
For now I know that if we run this code, we will immediately see '==SEQUENTIAL START==' on the console, then "starting slow promise". Then we have a Promise with setTimeout, which indicates that 'slow promise is done' will appear after 6 seconds, and the resolve(20) callback will be kept in the API container until the execution stack is empty. JS then gives us 'starting fast promise', and after four seconds we get 'fast promise is done' and then immediately 10, 'why?', 20.
I do not understand what happens in the background exactly. I know that resolve(20) is kept in the API container and the rest of the code is executed, then resolve(10) is also kept in the API container, and when the execution stack is empty they are returned as the results of resolving their Promises.
But:
What about the timer? 10, 'why?', 20 appear long after their timeouts pass - resolve(20) appears on the screen long after 6 seconds.
What about the order? It seems like they (resolve(20) and resolve(10)) are ready to be executed and kept in memory until I use them - print them to the console in this case? So: time of appearing, and order.
I am very determined to understand it correctly.
Perhaps this will help clear things up. Async-await is just syntactic sugar, so your sequentialStart function is the exact same thing as:
var sequentialStart = function() {
    console.log('==SEQUENTIAL START==');
    resolveAfter2Seconds().then(function(slow) {
        resolveAfter1Second().then(function(fast) {
            console.log(fast);
            console.log('why?');
            console.log(slow);
        });
    });
}
I know that resolve(20) is kept in api container and the rest of the code are executed, then resolve(10) is also kept in api container and when the execution stack is empty they are returned as the results of resolving their Promises
That's not what's happening when using async-await. What your code is doing is the following, in this specific order:
1. Call resolveAfter2Seconds()
2. await it to resolve, then assign the resolved value to the constant slow
3. Call resolveAfter1Second()
4. await it to resolve, then assign the resolved value to the constant fast
5. Call console.log(fast), then console.log('why?'), then console.log(slow)
It seems like you're expecting the promises to resolve in parallel, as they would if you weren't using async-await, but the whole purpose of async-await is to be able to write code with promises as if it were synchronous (i.e. blocking) code, without creating a nested mess of then callbacks.
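For contrast, if you did want them to run in parallel, one possible rewrite (reusing the two functions from the question) starts both timers before awaiting either:

var concurrentStart = async function() {
    console.log('==CONCURRENT START==');
    const slowPromise = resolveAfter2Seconds(); // the 6-second timer starts now
    const fastPromise = resolveAfter1Second();  // the 4-second timer starts now too
    const slow = await slowPromise; // resolves after ~6 seconds
    const fast = await fastPromise; // already resolved by then
    console.log(fast); // 10
    console.log(slow); // 20
}

This finishes in ~6 seconds total instead of ~10, because the timers run concurrently.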
I have the following:
for (let job of jobs) {
    resets.push(
        new Promise((resolve, reject) => {
            let oldRef = job.ref
            this._sequenceService.attachRef(job).then(() => {
                this._dbService.saveDoc('job', job).then(jobRes => {
                    console.log('[%s / %s]: %s', oldRef, jobRes['jobs'][0].ref, this._moment.unix(job.created).format('DD/MM/YYYY HH:mm'))
                    resolve()
                }).catch(error => {
                    reject(error)
                })
            }).catch(error => {
                reject(error)
            })
        })
    )
}

return Promise.all(resets).then(() => {
    console.log('Done')
})
this._sequenceService.attachRef has a console.log() call.
When this runs, I see all the console logs from the this._sequenceService.attachRef() calls and then I see all the logs in the saveDoc.then() calls. I was expecting to see them alternate. I understand that, according to this article, promises don't resolve in order, but I wouldn't expect my promise to resolve until I call resolve(), so I would still expect alternating logs, even if not in order.
Why is this not the case?
Your code can be written a lot cleaner by avoiding the promise anti-pattern of wrapping a promise in a new manually created promise. Instead, you just push the outer promise into your array and chain the inner promises to the outer one by returning them from inside the .then() handler. It can all be done as simply as this:
for (let job of jobs) {
    let oldRef = job.ref;
    resets.push(this._sequenceService.attachRef(job).then(() => {
        // chain this promise to the parent promise by returning it
        // from inside the .then() handler
        return this._dbService.saveDoc('job', job).then(jobRes => {
            console.log('[%s / %s]: %s', oldRef, jobRes['jobs'][0].ref, this._moment.unix(job.created).format('DD/MM/YYYY HH:mm'));
        });
    }));
}

return Promise.all(resets).then(() => {
    console.log('Done')
}).catch(err => {
    console.log(err);
});
Rejections will automatically propagate upwards so you don't need any .catch() handlers inside the loop.
As for sequencing, here's what happens:
1. The for loop is synchronous. It just immediately runs to completion.
2. All the calls to .attachRef() are asynchronous. That means that calling them just initiates the operation and then they return, and the rest of your code continues running. This is also called non-blocking.
3. All .then() handlers are asynchronous. The earliest they can run is on the next tick.
As such, this explains why the first thing that happens is that all the calls to .attachRef() execute, since that's what the loop does: it immediately calls all the .attachRef() methods. Since they just start their operation and then immediately return, the for loop finishes its work of launching all the .attachRef() operations pretty quickly.
Then, as each .attachRef() finishes, it will trigger the corresponding .saveDoc() to be called.
The relative timing between when .saveDoc() calls finish is just a race depending upon when they got started (how long their .attachRef() that came before them took) and how long their own .saveDoc() call took to execute. This relative timing of all these is likely not entirely predictable, particularly if there's a multi-threaded database behind the scenes that can process multiple requests at the same time.
The fact that the relative timing is not predictable should not be a surprise. You're running multiple two-stage async operations purposely in parallel, which means you don't care what order they run or complete in. They are all racing each other. If they all took exactly the same time to execute with no variation, then they'd probably finish in the same order they were started, but any slight variation in execution time could certainly change the completion order. If the underlying DB has lock contention between all the different requests in flight at the same time, that can drastically change the timing too.
So, this code is designed to do things in parallel and notify you when they are all done. Somewhat by definition, that means that you aren't concerned about controlling the precise order things run or complete in, only when they are all done.
Clarifying my comments above:
In this case the promises are at four levels.
Level 0: Promise created with Promise.all
Level 1: Promise created with new Promise for each Job
Level 2: Promise as generated by this._sequenceService.attachRef(job)
Level 3: Promise as generated by this._dbService.saveDoc('job', job)
Let's say we have two jobs J1 and J2
One possible order of execution:
L0 invoked
J1-L1 invoked
J2-L1 invoked
J1-L2 invoked
J2-L2 invoked
J1-L2 resolves, log seen at L2 for J1
J1-L3 invoked
J2-L2 resolves, log seen at L2 for J2
J2-L3 invoked
J1-L3 resolves, log seen at L3 for J1
J1-L1 resolves
J2-L3 resolves, log seen at L3 for J2
J2-L1 resolves
L0 resolves, 'Done' is logged
Which is probably why you see all L2 logs, then all L3 logs and then finally the Promise.all log
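Here's a runnable toy model of that timeline (the job names and delays are invented, and both stages take the same fixed time):

const attachRef = job => new Promise(resolve => setTimeout(() => {
    console.log('attachRef done for', job); // the Level 2 log
    resolve();
}, 50));
const saveDoc = job => new Promise(resolve => setTimeout(() => {
    console.log('saveDoc done for', job);   // the Level 3 log
    resolve();
}, 50));

Promise.all(['J1', 'J2'].map(job => attachRef(job).then(() => saveDoc(job))))
    .then(() => console.log('Done'));
// Typical output: both attachRef logs, then both saveDoc logs, then Done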
I am trying to create an array of Promises, then resolve them with Promise.all(). I am using got, which returns a promise.
My code works, but I don't fully understand how. Here it is:
const got = require('got');

const url = 'myUrl';
const params = ['param1', 'param2', 'param3'];
let promiseArray = [];

for (const param of params) {
    promiseArray.push(got(url + param));
}

// Inspect the promises
for (const promise of promiseArray) {
    console.log(JSON.stringify(promise));
    // Output: promise: {"_pending":true,"_canceled":false,"_promise":{}}
}

Promise.all(promiseArray).then((results) => {
    // Operate on results - works just fine
}).catch((e) => {
    // Error handling logic
});
What throws me off is that the Promises are marked as "pending" when I add them into the array, which means they have already started.
I would think that they should lie inactive in promiseArray, and Promise.all(promiseArray) would both start them and resolve them.
Does this mean I am starting them twice?
You're not starting them twice. Promises start running as soon as they're created - or as soon as the JS engine finds enough resources to start them. You have no control over when they actually start.
All Promise.all() does is wait for all of them to settle (resolve or reject). Promise.all() does not interfere with nor influence the order/timing of execution of the promises themselves.
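You can see this by logging from inside the executor; a small sketch (the timing is invented):

const p = new Promise(resolve => {
    console.log('executor runs immediately');   // logs at creation time
    setTimeout(() => resolve('settled'), 1000); // the async work is already underway
});
// p is already in flight here; Promise.all() merely waits for it
Promise.all([p]).then(([value]) => console.log(value)); // 'settled' after ~1s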
Promises don't run at all. They are simply a notification system for communicating when asynchronous operations are complete.
So, as soon as you ran this:
promiseArray.push(got(url + param));
Your asynchronous operation inside of got() is already started and when it finishes, it will communicate that back through the promise.
All Promise.all() does is monitor all the promises and tell you when the first one rejects or when all of them have completed successfully. It does not "control" the async operations in any way. Instead, you start the async operations and they communicate back through the promises. You control when you started the async operations and the async operations then run themselves from then on.
If you break down your code a bit into pieces, here's what happens in each piece:
let promiseArray = [];
for (const param of params) {
    promiseArray.push(got(url + param));
}
This calls got() a bunch of times starting whatever async operation is in that function. got() presumably returns a promise object which is then put into your promiseArray. So, at this point, the async operations are all started already and running on their own.
// Inspect the promises
for (const promise of promiseArray) {
    console.log(JSON.stringify(promise));
    // Output: promise: {"_pending":true,"_canceled":false,"_promise":{}}
}
This loop just looks at all the promises to see if any of them might already be resolved, though one would not expect them to be, because their underlying async operations were only just started in the prior loop.
Promise.all(promiseArray).then((results) => {
    // Operate on results - works just fine
}).catch((e) => {
    // Error handling logic
});
Then, with Promise.all(), you're just asking to monitor the array of promises so it will tell you when either there's a rejected promise or when all of them complete successfully.
Promises "start" when they are created, i.e. the function that gives you the promise, has already launched the (often asynchronous) operations that will eventually lead into an asynchronous result. For instance, if a function returns a promise for a result of an HTTP request, it has already launched that HTTP request when returning you the promise object.
No matter what you do or not do with that promise object, that function (got) has already created a callback function which it passed on to an asynchronous API, such as a HTTP Request/Response API. In that callback function (which you do not see unless you inspect the source of got) the promise will be resolved as soon as it gets called back by that API. In the HTTP request example, the API calls that particular callback with the HTTP response, and the said callback function then resolve the promise.
Given all this, it is a bit strange to think of promises as things that "start" or "run". They are merely created in a pending state. The remaining thing is a pending callback from some API that will hopefully occur and then will change the state of the promise object, triggering then callbacks.
Please note that fetching an array of urls with Promise.all has some possible problems:
1. If any of the urls fails to fetch, your resolve handler is never called: one failed request rejects the whole Promise.all, so your resolve function never runs.
2. If your array is very large, you will clobber the site and your network with requests; you may want to throttle the maximum number of open requests and/or the requests made in a certain timeframe.
The first problem is easily solved: you convert the failed requests into values and add them to the results. In the resolve handler you can then decide what to do with the failed requests:
const got = require('got');

const url = 'myUrl';
const params = ['param1', 'param2', 'param3'];
const Fail = function(details){ this.details = details; };

Promise.all(
    params.map(
        param =>
            got(url + param)
            .then(
                x => x, // if resolved, just pass along the value
                reject => new Fail([reject, url + param])
            )
    )
).then((results) => {
    const successes = results.filter(result => (result && result.constructor) !== Fail);
    const failedItems = results.filter(result => (result && result.constructor) === Fail);
}).catch((e) => {
    // Error handling logic
});
Point 2 is a bit more complicated: throttling can be done with this helper function, and would look something like this:
// ... other code
const max5 = throttle(5);

Promise.all(
    params.map(
        param =>
            max5(got)(url + param)
            .then(
                x => x, // if resolved, just pass along the value
                reject => new Fail([reject, url + param])
            )
    )
)
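For reference, one possible shape of that throttle helper (this is a sketch of my own, not the linked implementation): throttle(max) wraps a promise-returning function so that at most max calls are in flight at once.

const throttle = (max) => {
    let active = 0;   // wrapped calls currently in flight
    const queue = []; // calls waiting for a free slot
    const next = () => {
        if (active >= max || queue.length === 0) return;
        active++;
        const { fn, args, resolve, reject } = queue.shift();
        Promise.resolve(fn(...args))
            .then(resolve, reject)
            .then(() => { active--; next(); }); // free the slot, start the next call
    };
    return (fn) => (...args) =>
        new Promise((resolve, reject) => {
            queue.push({ fn, args, resolve, reject });
            next();
        });
};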
Recently I made a webscraper in nodejs using 'promise'. I created a Promise for each url I wanted to scrape and then used the all method:
var fetchUrlArray = [];
for (...) {
    var mPromise = new Promise(function(resolve, reject) {
        (http.get(...))()
    });
    fetchUrlArray.push(mPromise);
}
Promise.all(fetchUrlArray).then(...)
There were thousands of urls but only a few of them got timed out. I got the impression that it was handling 5 promises in parallel at a time.
My question is how exactly promise.all() works. Does it:
1. Call each promise one by one, switching to the next one only when the previous one is resolved?
2. Or does it process the promises in batches of a few from the array?
3. Or does it fire all the promises at once?
What is the best way to solve this problem in nodejs? Because as it stands, I can solve this problem way faster in Java/C#.
What you pass Promise.all() is an array of promises. It knows absolutely nothing about what is behind those promises. All it knows is that those promises will get resolved or rejected sometime in the future and it will create a new master promise that follows the sum of all the promises you passed it. This is one of the nice things about promises. They are an abstraction that lets you coordinate any type of action (usually asynchronous) without regard for what type of action it is. As such, promises have literally nothing to do with the actual action. All they do is monitor the completion or error of the action and report that back to those agents following the promise. Other code actually runs the action.
In your particular case, you are immediately calling http.get() in a tight loop and your code (nothing to do with promises) is launching a zillion http.get() operations at once. Those will get fired as fast as the underlying transport can do them (likely subject to connection limits).
If you want them to be launched serially or in batches of say 10 at a time, then you have to code it that way yourself. Promises have nothing to do with that.
You could use promises to help you launch them serially or in batches, but either way it would take extra code of your own to make that happen.
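For example, a minimal way to launch them one at a time with plain promises (httpGetAsPromise is a hypothetical promisified wrapper around http.get, and urlArray is your array of urls):

function runSerially(urls) {
    // chain each request onto the previous one, collecting results in order
    return urls.reduce(
        (chain, url) => chain.then(results =>
            httpGetAsPromise(url).then(result => results.concat([result]))
        ),
        Promise.resolve([])
    );
}

runSerially(urlArray).then(results => {
    // all requests have finished, one at a time, results in input order
});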
The Async library is specifically built for running things in parallel, but with a maximum number in flight at any given time, because this is a common scheme where you either have connection limits on your end or you don't want to overwhelm the receiving server. You may be interested in its parallelLimit option, which lets you run a number of async operations in parallel while capping how many are in flight at once.
I would do it like this
Personally, I'm not a big fan of Promises. I think the API is extremely verbose and the resulting code is very hard to read. The method defined below results in very flat code and it's much easier to immediately understand what's going on. At least imo.
Here's a little thing I created for an answer to this question
// void asyncForEach(Array arr, Function iterator, Function callback)
//   * iterator(item, done) - done can be called with an err to shortcut to callback
//   * callback(done) - done receives an error if an iterator sent one
function asyncForEach(arr, iterator, callback) {
    // create a cloned queue of arr
    var queue = arr.slice(0);
    // create a recursive iterator
    function next(err) {
        // if there's an error, bubble up to the callback
        if (err) return callback(err);
        // if the queue is empty, call the callback with no error
        if (queue.length === 0) return callback(null);
        // call the iterator with our next task
        // we pass `next` here so the task can let us know when to move on to the next task
        iterator(queue.shift(), next);
    }
    // start the loop
    next();
}
You can use it like this
var urls = [
    "http://example.com/cat",
    "http://example.com/hat",
    "http://example.com/wat"
];

function eachUrl(url, done) {
    http.get(url, function(res) {
        // do something with res
        done();
    }).on("error", function(err) {
        done(err);
    });
}

function urlsDone(err) {
    if (err) throw err;
    console.log("done getting all urls");
}

asyncForEach(urls, eachUrl, urlsDone);
Benefits of this
no external dependencies or beta apis
reusable on any array you want to perform async tasks on
non-blocking, just as you've come to expect with node
could be easily adapted for parallel processing
by writing your own utility, you better understand how this kind of thing works
If you just want to grab a module to help you, look into async and the async.eachSeries method.
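The call shape there is essentially the same as the asyncForEach above. Assuming the classic callback-style async module, the usage would look like:

var async = require("async");
// iterates the urls one at a time; eachUrl and urlsDone are the same functions as above
async.eachSeries(urls, eachUrl, urlsDone);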
First, a clarification: A promise does represent the future result of a computation, nothing else. It does not represent the task or computation itself, which means it cannot be "called" or "fired".
Your script does create all those thousands of promises immediately, and each of those creations does call http.get immediately. I would suspect that the http library (or something it depends on) has a connection pool with a limit of how many requests to make in parallel, and defers the rest implicitly.
Promise.all does not do any "processing" - it's not responsible for starting the tasks or resolving the passed promises. It only listens to them, checks whether they are all ready, and returns a promise for that eventual combined result.