Best practice for sequential url requests nodejs

I've got a list of URLs I need to request from an API. To avoid putting a lot of load on it, I would like to space the requests x seconds apart. Once all the requests have completed, some further logic runs; its details don't matter here.
There are many ways to go about it, and I've implemented a couple:
A) A recursive function that walks an array holding all the URLs and calls itself once each request is done and a timeout has elapsed.
B) Setting timeouts for every request in a loop with incremental delays and returning promises; once Promise.all resolves them, the rest of the logic executes.
Both work. However, what would you say is the recommended way to go about this? This is more of an academic question, and as I'm doing this to learn I would rather avoid using a library that abstracts the juice.

Your solutions are almost identical. I would take a slightly different approach: create an initial promise and a sleep function that returns a promise, then chain them together.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
// then() expects a function, so wrap each step instead of calling it immediately
ApiCall()
  .then(() => sleep(1000))
  .then(() => nextApiCall()) // ...
Or a more modular version:
let promise = Promise.resolve();
myApiCalls.forEach(call => {
  promise = promise.then(call).then(() => sleep(1000));
});
In the end, go with what you understand, what makes the most sense to you, and what you will still understand in a month. The one you can read best is your preferred solution; performance won't matter here.
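For comparison, the same chain can be written as an async/await loop. This is a minimal sketch rather than part of the answer above: `requestAll`, `fetchOne`, and `delayMs` are hypothetical names, and `fetchOne` stands in for whatever function performs the real request.

```javascript
// Resolve after `ms` milliseconds.
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Call `fetchOne(url)` for each URL in order, pausing `delayMs` between calls.
// `fetchOne` is a placeholder for the real request function.
async function requestAll(urls, fetchOne, delayMs) {
  const results = [];
  for (let i = 0; i < urls.length; i++) {
    results.push(await fetchOne(urls[i]));
    // No need to wait after the last request.
    if (i < urls.length - 1) await sleep(delayMs);
  }
  return results;
}
```

Because each request is awaited before the pause starts, the gap is measured from the end of one request to the start of the next, which matches the chained version.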

You could use something like this to throttle per period.
If you want all URLs to be processed even when some fail, you can catch the failed ones and pick them out of the result.
The code would look something like this (throttlePeriod is a helper that is not defined in this answer):
const Fail = function(details){this.details=details;};
const twoPerSecond = throttlePeriod(2,1000);
const urls = ["http://url1","http://url2",/* ... */"http://url100"];
Promise.all(//even though 100 promises are created, only 2 per second will be started
  urls.map(
    (url)=>
      //pass fetch function to twoPerSecond, twoPerSecond will return a promise
      // immediately but will not start fetch until there is an available timeslot
      twoPerSecond(fetch)(url)
      .catch(e=>new Fail([e,url]))
  )
)
.then(
  results=>{
    //filter (not map) to split the results into failed and succeeded
    const failed = results.filter(result=>(result&&result.constructor)===Fail);
    const succeeded = results.filter(result=>(result&&result.constructor)!==Fail);
  }
)
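The answer does not show how `throttlePeriod` works. The following is one possible sketch of such a helper, written only to illustrate the idea; it is an assumption, not the answerer's implementation. It schedules each call so that at most `max` calls start within any window of `periodMs` milliseconds.

```javascript
// Hypothetical helper: allow at most `max` calls to start per `periodMs`.
function throttlePeriod(max, periodMs) {
  // Scheduled start times of the most recent `max` calls.
  const startTimes = [];
  return fn => (...args) => {
    const now = Date.now();
    // The call `max` places earlier must have started at least one period ago.
    const earlier = startTimes.length === max ? startTimes[0] : undefined;
    const startAt = earlier === undefined ? now : Math.max(now, earlier + periodMs);
    startTimes.push(startAt);
    if (startTimes.length > max) startTimes.shift();
    // Return a promise immediately, but only call `fn` once its slot arrives.
    return new Promise(resolve => setTimeout(resolve, startAt - now))
      .then(() => fn(...args));
  };
}
```

With `throttlePeriod(2, 1000)`, five calls made at the same moment would be scheduled to start at roughly 0 ms, 0 ms, 1000 ms, 1000 ms, and 2000 ms. Timer precision means the real start times are approximate.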

Related

Control Flow. Promise Logic. How to deviate without the deviation eloping with the control flow, leaving behind basically just an IOU (Promise)?

I'm having trouble understanding control flow with asynchronous programming in JS. I come from classic OOP background. eg. C++. Your program starts in the "main" -- top level -- function and it calls other functions to do stuff, but everything always comes back to that main function and it retains overall control. And each sub-function retains control of what they're doing even when they call sub functions. Ultimately the program ends when that main function ends. (That said, that's about as much as I remember of my C++ days so answers with C++ analogies might not be helpful lol).
This makes control flow relatively easy. But I get how that's not designed to handle event driven programming as needed on something like a web server. While Javascript (let's talk node for now, not browser) handles event-driven web servers with callbacks and promises, with relative ease... apparently.
I think I've finally got my head around the idea that with event-driven programming the entry point of the app might do little more than set up a bunch of listeners and then get out of the way (effectively end itself). The listeners pick up all the action and respond.
But sometimes stuff still has to be synchronous, and this is where I keep coming unstuck.
With callbacks, promises, or async/await, we can effectively build synchronous chains of events. eg with Promises:
doSomething()
  .then(result => doSomethingElse(result))
  .then(newResult => doThirdThing(newResult))
  .then(finalResult => {
    console.log(`Got the final result: ${finalResult}`);
  })
  .catch(failureCallback);
Great. I've got a series of tasks I can do in order -- kinda like more traditional synchronous programming.
My question is: sometimes you need to deviate from the chain. Ask some questions and act differently depending on the answers. Perhaps conditionally there's some other function you need to call to get something else you need along the way. You can't continue without it. But what if it's an async function and all it's going to give me back is a promise? How do I get the actual result without the control flow running off and eloping with that function and never coming back?
Example:
I want to call an API in a database, get a record, do something with the data in that record, then write something back to the database. I can't do any of those steps without completing the previous step first. Let's assume there aren't any sync functions that can handle this API. No problem. A Promise chain (like the above) seems like a good solution.
But... Let's say when I call the database the first time, the authorization token I picked up earlier for it has expired and I have to get a new one. I don't know that until I make that first call. I don't want to get (or even test for the need for) a new auth token every time. I just want to be able to respond when a call fails because I need one.
Ok... In synchronous pseudo-code that might look something like this:
let Token = X
Step 1: Call the database(Token). Wait for the response.
Step 2: If response says need new token, then:
Token = syncFunctionThatGetsAndReturnsNewToken().
// here the program waits till that function is done and I've got my token.
Repeat Step 1
End if
Step 3: Do the rest of what I need to do.
But now we need to do it in Javascript/node with only async functions, so we can use a promise (or callback) chain?
let Token = X
CallDatabase(Token)
.then(check if response says we need new token, and if so, get one)
.then(...
Wait a sec. That "if so, get one" is the part that's screwing me. All this asynchronicity in JS/node isn't going to wait around for that. That function is just going to "promise" me a new token sometime in the future. It's an IOU. Great. Can't call the database with an IOU. Well ok, I'd be happy to wait, but node and JS won't let me, because that's blocking.
That's it in a (well, ok, rather large) nutshell. What am I missing? How do I do something like the above with callbacks or Promises?
I'm sure there's a stupid "duh" moment in my near future here, thanks to one or more of you wonderful people. I look forward to it. 😉 Thanks in advance!

A .then call attaches a function that will run in a future task, once the Promise resolves. That function itself runs synchronously and can use all the control flow you'd want:
getResponse()
  .then(response => {
    if (response.needsToken)
      return getNewToken().then(getResponse);
    return response;
  })
  .then(response => { /* runs if the token was still valid or once it has been renewed */ });
If the token is expired, the Promise returned by .then is not resolved right away. Instead, a new asynchronous action starts to retrieve a new token. When that action finishes, it resolves the Promise it returned in a new task, and because that Promise was returned from the .then callback, the outer Promise resolves with it and the chain continues.
Note that these Promise chains can get complicated very quickly; with async functions this can be written more elegantly (though under the hood it is about the same):
let response;
do {
  response = await getResponse();
  if (response.needsToken)
    await renewToken();
} while (response.needsToken);
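To make the loop above concrete, here is a small self-contained sketch with a retry limit. The names `callDatabase`, `getNewToken`, and `fetchRecord`, the token strings, and the `maxAttempts` default are all hypothetical stand-ins for the real API; the fake database simply rejects any token other than the current one.

```javascript
// Hypothetical stand-ins for the real API: only `validToken` is accepted.
const validToken = 'token-2';
async function getNewToken() {
  return validToken;
}
async function callDatabase(token) {
  return token === validToken
    ? { needsToken: false, data: 'record' }
    : { needsToken: true };
}

// Try the call, renew the token when asked to, and give up after `maxAttempts`.
async function fetchRecord(initialToken, maxAttempts = 3) {
  let token = initialToken;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = await callDatabase(token);
    if (!response.needsToken) return response.data;
    // This function pauses here until the token arrives; the event loop keeps running.
    token = await getNewToken();
  }
  throw new Error('Could not obtain a valid token');
}
```

The `await` gives the "wait here until I have my token" behaviour of the synchronous pseudo-code without blocking the rest of the program.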

First of all, I would recommend against using the then and catch methods to consume a Promise's result. They tend to produce deeply nested code that is hard to read and maintain.
I worked up a prototype for your case that uses async/await. It also keeps track of the attempts made to authenticate to the database. If the maximum number of attempts is reached, it would be reasonable to send an emergency alert to an administrator. This avoids an endless loop of authentication attempts and lets you take proper action instead.
'use strict'
const maxAttempts = 3; // assumed limit; pick whatever suits your setup
let token;
async function getBooks(options) {
  // In case you are not using an ORM (Sequelize, TypeORM), I would suggest using
  // at least a query builder like Knex
  const query = generateQuery(options);
  const books = await executeQuery(query);
  return books;
}
async function executeQuery(query) {
  let attempts = 0;
  if (!token) {
    await getDbAuthToken();
  }
  while (attempts < maxAttempts) {
    try {
      attempts++;
      // call database
      // return result
    }
    catch (err) {
      // token expired: get a new one and try again
      if (err.code == 401) {
        await getDbAuthToken();
      }
      else {
        // not an authentication problem, so retrying will not help
        throw err;
      }
    }
  }
  throw new Error('Critical error! After several attempts, authentication to db failed. Take immediate steps to fix this')
}
// This can be sync or async depending on
// how the auth token is retrieved; it is expected to set `token`
async function getDbAuthToken() {
}

AngularFire firestore get / snapshotchanges / valuechanges action on observable is not async?

Hi, I am seeing some very strange behaviour.
I am iterating over some documents and creating promises so that the UI is updated as each document is fetched.
However, although the promises are independent, Firestore / AngularFire seems to wait for all of them.
Example:
for (const event of events) {
  this.eventService.getEventActivitiesAndSomeStreams(this.user,
    event.getID(),
    [DataLatitudeDegrees.type, DataLongitudeDegrees.type])
    .pipe(take(1)).toPromise().then((fullEvent) => {
      this.logger.info(`Promise completed`)
    })
}
One would expect "Promise completed" to be printed gradually, once per promise, as the data for each arrives.
However, they are all printed at once. The promises don't seem to complete one by one but all together: there is a long wait until the first console log is printed, and then every promise prints.
So if I had a progress bar, I would expect it to increase little by little, but instead it jumps all at once.
The inner call, this.eventService.getEventActivitiesAndSomeStreams:
return this.afs
  .collection('users')
  .doc(userID)
  .collection('events')
  .doc(eventID)
  .collection('activities')
  .doc(activityID)
  .collection('streams', ((ref) => {
    return ref.where('type', 'in', typesBatch);
  }))
  .get()
  .pipe(map((documentSnapshots) => {
    return documentSnapshots.docs.reduce((streamArray: StreamInterface[], documentSnapshot) => {
      // Does nothing other than create a class instance from the JSON object passed back from Firestore
      streamArray.push(this.processStreamDocumentSnapshot(documentSnapshot));
      return streamArray;
    }, []);
  }))
Now, if I put an await inside the for loop, of course this works, completing the promises one by one as it should, but then it takes a lot of time.
I also tried the native JS SDK instead of AngularFire, with the same effect.
I suspect that IndexedDB or some other Firebase logic may be causing this.
What am I missing here, and how can I get the desired behaviour, if possible?
You could reproduce this with ["users" -> "events" -> "something"] Firestore collections, where each "user" has, let's say, 500 "events" and each of those events has 2 more docs.
Then get all the events for the user and, in a for loop, create for each one a promise that returns the 2 "something" documents.

This behavior is expected and has nothing to do with Firebase. You're iterating over an array and sending out requests. There is no waiting or delay between items, so the for loop (without await statements) finishes in an imperceptibly small amount of time, which means all of the requests are sent within milliseconds of each other, or basically at the same time. Their responses should therefore be expected to arrive at basically the same time as well.
You've stated that you don't want to use await statements and iterate one by one, so it's tough to know exactly what you do want or expect to happen. Maybe you want the requests spaced 0.5 seconds apart? If so, you need to write that logic:
timer(0, 500).pipe( // put whatever ms time between requests you want here
  take(events.length),
  // mergeMap keeps earlier requests running; switchMap would cancel any
  // request still in flight when the next tick arrives
  mergeMap(i => {
    return this.eventService.getEventActivitiesAndSomeStreams(this.user,
      events[i].getID(),
      [DataLatitudeDegrees.type, DataLongitudeDegrees.type]).pipe(take(1))
  })
).subscribe(fullEvent => {
  this.logger.info(`Promise completed`)
})
(I removed the promises because I don't know why they're being used in the first place, and this kind of control is easier with RxJS, in my opinion.)

Order of promise array in Promise.allSettled and order in which database transactions are created?

In the following code,
Promise.allSettled( [ entry_save(), save_state(), get_HTML() ] ).then( ... );
promises entry_save and save_state are both readwrite database transactions and get_HTML is readonly. The two readwrite transactions could be combined together but that complicates the undo/redo chain that is maintained and it ties the success and rollback of the two together which is undesired.
The entry_save transaction needs to write before the save_state transaction. Before moving entry_save into the Promise.allSettled that is how it worked because the entry_save transaction was created prior to those of the others. This MDN article explains how the order in which requests are performed is based upon when the transactions are created independently of the order in which the requests are made.
My question is does the synchronous code of each promise process in the order in which it is placed in the array, such that placing entry_save first will always result in its transaction being created first and guaranteeing its database requests will be performed first?
Although it works and is quick enough, I'd prefer to not do this:
entry_save().then( () => { Promise.allSettled( [ save_state(), get_HTML() ] ) } ).then( ... );
If it matters, that's not exactly the way it is written; it's more consistent with:
entry_save().then( intermediate ); where intermediate invokes the Promise.allSettled.
Thank you.
To clarify a bit, below is the example given in the above cited MDN document.
var trans1 = db.transaction("foo", "readwrite");
var trans2 = db.transaction("foo", "readwrite");
var objectStore2 = trans2.objectStore("foo")
var objectStore1 = trans1.objectStore("foo")
objectStore2.put("2", "key");
objectStore1.put("1", "key");
After the code is executed the object store should contain the value "2", since trans2 should run after trans1.
If entry_save creates trans1 and save_state creates trans2, both in the synchronous code of the functions (that is, not within an onsuccess or onerror handler of a database request or something similar), will the MDN example not hold?
Thus, where @jfriend00 writes,
The functions are called in the order they are placed in the array,
but that only determines the order in which the asynchronous operations are
started.
will this order the timing of the write requests by that of the creation of the transactions, since the transactions are created in the synchronous code before the asynchronous can commence?
I'd like to test it but I'm not sure how. If two nearly identical promises are used in a Promise.allSettled, how can the write request of the first created transaction be delayed such that it takes place after the write request of the second created transaction, to test if it will be written first? A setTimeout should terminate the transaction. Perhaps a long-running synchronous loop placed before the request.
The code at the very end of this question may better illustrate more precisely what I have attempted to ask. It takes the MDN example in the article cited above and spreads it across two promises placed in a Promise.allSettled, both of which attempt to write to the same object store from within the onsuccess event of a get request.
The question was will the same principle in the article of the first transaction created writing before the second transaction created, regardless of the order the requests are made, still hold in this set up. Since the synchronous portions of the promises will process in the order the promises are placed in the array, the transaction in promise p_1 will be created before that of p_2. However, the put request in the onsuccess event of the get request in p_1 is delayed by the loop building a large string. The question is will p_1 still write before p_2?
In experimenting with this, I cannot get p_2 to write before p_1. Thus, it appears that the MDN example applies even in this type of set up. However, I cannot be sure of why because I don't understand how the JS code is really interpreted/processed.
For example, why can the req.onsuccess function be defined after the request is made? I asked that question some time ago but still don't know enough to be sure that it doesn't affect the way I attempted to add a delay here. I know that it won't work the other way around; my point is that I'm not sure how the browser handles that synchronous loop before the put request is made in p_1, so I can't really know that this example demonstrates that the MDN article ALWAYS holds in this setup. However, I can observe that the requests take longer to complete as the number of loop iterations is increased, and in all cases I have observed, p_1 always writes before p_2. The only way p_2 writes before p_1 is if p_1 doesn't write at all because the string takes up too much memory, causing the transaction in p_1 to be aborted.
That being said, and returning to the fuller set up of my question concerning three promises in the array of the Promise.allSettled compared to requiring entry_save to complete before commencing a Promise.allSettled on the two remaining promises, in the full code of my project, for reasons I am not sure of, the latter is quicker than the former, that is, waiting for entry_save to complete is quicker than including it in the Promise.allSettled.
I was expecting it to be the other way around. The only reason I can think of at this point is that, since entry_save and save_state are both writing to the same object store, perhaps whatever the browser does equivalent to locking the object store until the first transaction, which is that in entry_save, completes and removing the lock takes longer than requiring that entry_save complete before the Promise.allSettled commences and not involving a lock. I thought that everything would be ready "in advance" just waiting for the two put requests to take place in transaction order. They took place in order but more slowly or at least not as quick as using:
entry_save().then( () => { Promise.allSettled( [ save_state(), get_HTML() ] ) } ).then( ... );
instead of:
Promise.allSettled( [ entry_save(), save_state(), get_HTML() ] ).then( ... );
function p_all() { Promise.allSettled( [ p_1(), p_2() ] ); }
function p_1()
{
  return new Promise( ( resolve, reject ) =>
  {
    let T = DB.transaction( [ 'os_1', 'os_2' ], 'readwrite' ),
        q = T.objectStore( 'os_1' ),
        u = T.objectStore( 'os_2' ),
        req;
    // settle the promise once the transaction finishes, one way or the other
    T.oncomplete = resolve;
    T.onabort = reject;
    req = q.get( 1 );
    req.onsuccess = () =>
    {
      let i, t = '', r = req.result;
      // long synchronous loop to delay the put request
      for ( i = 1; i < 10000000; i++ ) t = t + 'This is a string';
      r.n = 'p1';
      u.put( r );
      console.log( r );
    };
  });
}
function p_2()
{
  return new Promise( ( resolve, reject ) =>
  {
    let T = DB.transaction( [ 'os_1', 'os_2' ], 'readwrite' ),
        q = T.objectStore( 'os_1' ),
        u = T.objectStore( 'os_2' ),
        req;
    T.oncomplete = resolve;
    T.onabort = reject;
    req = q.get( 1 );
    req.onsuccess = () =>
    {
      let r = req.result;
      r.n = 'p2';
      u.put( r );
      console.log( r );
    };
  });
}

IndexedDB maintains the order of transactions in the order they were created, except when those transactions do not overlap (e.g. they do not involve any of the same object stores). This holds pretty much regardless of what you do at the higher promise layer.
At the same time, it may be unwise to rely on that behavior, because it is implicit and a bit confusing. So it may be fine to linearize with promises. The only real catch is when you need maximum performance, which I doubt applies here.
See https://www.w3.org/TR/IndexedDB-2/#transaction-construct
See also the question "Are indexeddb/localforage reads resolved from a synchronous buffer?"
Moreover, the code wrapped in a promise begins executing at the time the promise is created. It just does not necessarily finish at that time; it finishes eventually rather than immediately. That means the calls happen in the order in which you create the promises wrapping the IndexedDB calls, which in turn determines the order in which the transactions are created.
This holds regardless of which promise wins the race, and regardless of using Promise.all.
Also, Promise.all retains the order of results even if the promises complete out of order, just FYI, but do not let that throw you off.

When you do this:
Promise.allSettled( [ entry_save(), save_state(), get_HTML() ] ).then(...)
It's equivalent to this:
const p1 = entry_save();
const p2 = save_state();
const p3 = get_HTML();
Promise.allSettled([p1, p2, p3]).then(...);
So, the individual function calls you issue, such as save_state(), are STARTED in the order specified. But each of those calls is asynchronous, so the internal order of what happens before something else really depends on what they do inside, as they can all be in flight at the same time and parts of their execution can be interleaved in an indeterminate order.
Imagine that entry_save() actually consists of multiple asynchronous pieces such as first reading some data from disk, then modifying the data, then writing it to the database. It would call the first asynchronous operation to read some data from disk and then immediately return a promise. Then, save_state() would get to start executing. If save_state() just immediately issued a write to the database, then it very well may write to the database before entry_save() writes to the database. In fact, the sequencing of the two database writes is indeterminate and racy.
If you need entry_save() to complete before save_state(), then the above is NOT the way to code it at all. Your code is not guaranteeing that all of entry_save() is done before any of save_state() runs.
Instead, you SHOULD do what you seem to already know:
entry_save().then( () => Promise.allSettled( [ save_state(), get_HTML() ] ) ).then( ... );
Only that guarantees that entry_save() will complete before save_state() gets to run. And, this assumes that you're perfectly OK with save_state() and get_HTML() running concurrently and in an unpredictable order.
My question is does the synchronous code of each promise process in the order in which it is placed in the array, such that placing entry_save first will always result in its transaction being created first and guaranteeing its database requests will be performed first?
The functions are called in the order they are placed in the array, but that only determines the order in which the asynchronous operations are started. After that, they are all in flight at the same time, and the internal timing between them depends on how long their individual asynchronous operations take and what those operations do. If order matters, you can't just put them all in an indeterminate race. That's called a "race condition". Instead, you would need to structure your code to guarantee that the desired operation goes first, before the ones that need to execute after it.
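The difference between the order of starting and the order of finishing can be seen in a small sketch. The `task` function, its names, and its delays are hypothetical; timers stand in for the database work, so this illustrates only the promise behaviour, not IndexedDB's own transaction scheduling.

```javascript
const log = [];

// Each call runs its synchronous part immediately,
// then finishes asynchronously after `delayMs`.
function task(name, delayMs) {
  log.push(`start ${name}`);        // synchronous: runs during the call itself
  return new Promise(resolve => {
    setTimeout(() => {
      log.push(`finish ${name}`);   // asynchronous: runs later
      resolve(name);
    }, delayMs);
  });
}

// The array literal calls task('a'), task('b'), task('c') in that order.
const done = Promise.allSettled([task('a', 30), task('b', 10), task('c', 20)])
  .then(results => {
    console.log(log.join(', '));
    // Results keep array order, whatever order the tasks finished in.
    return results.map(result => result.value);
  });
```

The three "start" entries always appear in array order because they are logged during the calls themselves. The "finish" entries follow the delays instead.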

Javascript: How to determine whether to promisefy a function?

Consider small helper functions, some of which are obvious candidates for async / promise (as I think I understand them):
exports.processFile = function processFile(file) {
  return new Promise(resolve => {
    // super long processing of file
    resolve(file);
  });
};
While this:
exports.addOne = function addOne(number) {
  return new Promise(resolve => {
    resolve(number + 1);
  });
};
seems overkill.
What is the rule by which one determines whether they should promise-fy their functions?

I generally use a promise if I need to execute multiple async functions in a certain order. For example, if I need to make three requests and wait for them all to come back before I can combine the data and process them, that's a GREAT use case for promises. In your case, let's say you have a file, and you need to wait until the file and metadata from a service are both loaded, you could put them in a promise.
If you're looking to optimize operations on long running processes, I'd look more into web workers than using promises. They operate outside the UI thread and use messages to pass their progress back into the UI thread. It's event based, but no need for a promise. The only caveat there is that you can't perform actions on the DOM inside a web worker.
More on web workers here: https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers

What is the rule by which one determines whether they should promise-fy their functions?
It's quite simple actually:
When the function does something asynchronous, it should return a promise for the result
When the function does only synchronous things, it should not return a promise
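As a small illustration of that rule, here is a sketch with one synchronous and one asynchronous helper. `addOneLater` and `demo` are hypothetical names, and a timer stands in for real asynchronous work such as I/O. Note that wrapping a long synchronous computation in `new Promise` does not move it off the main thread, since the executor function runs synchronously.

```javascript
// Purely synchronous work: return the value directly.
function addOne(number) {
  return number + 1;
}

// Asynchronous work (a timer stands in for I/O): return a promise.
function addOneLater(number, delayMs) {
  return new Promise(resolve => setTimeout(() => resolve(number + 1), delayMs));
}

// A caller can mix both freely inside an async function.
async function demo() {
  const a = addOne(1);                 // plain value, no await needed
  const b = await addOneLater(a, 10);  // promise, so await it
  return b;
}
```

Callers of `addOne` get their result immediately and never have to deal with a promise, which is the benefit of not promisifying synchronous helpers.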

How to wrap jsonP callback in native javascript Promise?

I'm playing with native Promise to combine a bunch of XmlHttpRequests into one result, and I think I got it working; see http://jsfiddle.net/pjs06hdo/
(random calls to the Flickr API; see the console for what's actually going on and in which order)
There might be shorter implementations, but with this code I can understand what's going on.
But then comes the stupid JSONP :-( As it turns out, the actual target site does not allow cross-site requests, and I have to use a provided JSONP endpoint (again simulated with Flickr). And here I'm stuck: that stupid global callback does not fit into my basic understanding of Promise.
I think the solution has to do with the explanations in "How do I convert an existing callback API to promises?".
I tried to implement this, but it works only partially: http://jsfiddle.net/b33bj9k1/ There is no actual output, only console messages, sorry. But there you can see that there are three calls to create the promises, but the resolve(), the jsonFlickrApiAsync(), gets called only once.
What would be the right way to handle JSONP callbacks with Promise, so I can use Promise.all() to deal with the results as in the XmlHttpRequest version above?
No jQuery please - I want to understand what's really going on.

This is not a problem with promises, this is a problem with JSONP. Since it uses global callbacks, you need to use different callbacks - with different names - for each request. For Flickr that means you have to use their jsoncallback url parameter. The parameter name may vary for your actual endpoint.
However, your use of promises is indeed weird. Typically you'd use one promise per request, to represent that request's result. You are intentionally creating only one global promise, which cannot work.
function loadJSONP(url, parameter="callback") {
  // property of loadJSONP that holds this request's callback;
  // the server's response must call it as loadJSONP.<key>(...)
  var key = "back" + loadJSONP.counter++;
  var script = document.createElement("script");
  function withCleanUp(r) {
    return (x) => {
      loadJSONP[key] = null;
      document.head.removeChild(script);
      r(x);
    }
  }
  return new Promise((resolve, reject) => {
    loadJSONP[key] = withCleanUp(resolve);
    script.onerror = withCleanUp(reject);
    // setTimeout(script.onerror, 5000); might be advisable
    // assumes url already contains a query string
    script.src = url+"&"+parameter+"=loadJSONP."+key;
    document.head.appendChild(script);
  });
}
loadJSONP.counter = 0;
