Asynchronous loop inside an asynchronous loop: a good idea? - javascript

I have some javascript code which does the following:
Read a .txt file and fill up an array of objects
Loop through these items
Loop through an array of links inside each of these items and make a request using nightmarejs
Write the result to SQL Server
My code is like this:
const Nightmare = require('nightmare');
const fs = require('fs');
const async = require('async');
const sql = require('mssql');

const nightmare = Nightmare();

var links = recuperarLinks();

function recuperarLinks() {
    // Read the txt file and return an array
}

const bigFunction = () => {
    var aparelho = '';
    async.eachSeries(links, async function (link) {
        console.log('Zip Code: ' + link.zipCode);
        async.eachSeries(link.links, async function (url) {
            console.log('URL: ' + url);
            try {
                await nightmare.goto(url)
                    .evaluate(function () {
                        // return some elements
                    })
                    .end()
                    .then(function (result) {
                        // adjust the result
                        dadosAjustados.forEach(function (obj) {
                            // save the data
                            salvarBanco(obj, link.cep);
                        });
                    });
            } catch (e) {
                console.error(e);
            }
        }, function (err) {
            console.log('Erro: ');
            console.log(err);
        });
    }, function (erro) {
        if (erro) {
            console.log('Erro: ');
            console.log(erro);
        }
    });
}
async function salvarBanco(dados, cep) {
    const pool = new sql.ConnectionPool({
        user: 'sa',
        password: 'xxx',
        server: 'xxx',
        database: 'xxx'
    });
    pool.connect().then(function () {
        const request = new sql.Request(pool);
        const insert = "some insert";
        request.query(insert).then(function (recordset) {
            console.log('Dado inserido');
            pool.close();
        }).catch(function (err) {
            console.log(err);
            pool.close();
        });
    }).catch(function (err) {
        console.log(err);
    });
}
bigFunction();
It works fine, but I feel like this async loop inside another async loop is a hack of some sort.
My outputs are something like this:
Fetching Data from cep 1
Fetching Data from url 1
Fetching Data from cep 2
Fetching Data from url 2
Fetching Data from cep 3
Fetching Data from url 3
Then it starts making the requests. Is there a better (and possibly a correct way) of doing this?

If you want to serialize your calls to nightmare.goto() and simplify your code (which is what you seem to be trying to do with await), then you can avoid mixing the callback-based async library with promises and accomplish your goal using only promises, like this:
async function bigFunction() {
    var aparelho = '';
    for (let link of links) {
        for (let url of link.links) {
            try {
                let result = await nightmare.goto(url).evaluate(function () {
                    // return some elements
                }).end();
                // adjust the result
                await Promise.all(dadosAjustados.map(obj => salvarBanco(obj, link.cep)));
            } catch (e) {
                // log error and continue processing
                console.error(e);
            }
        }
    }
}
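To see the strict serialization this produces, here is a self-contained sketch of the same nested for...of/await pattern, with the scraping and database calls replaced by stand-in stubs (scrape, save, and the sample data are invented for illustration, not part of the original code):

```javascript
// Stub for the nightmare scrape: resolves after a short delay and
// records when it ran.
const order = [];

const scrape = url =>
  new Promise(resolve => setTimeout(() => {
    order.push('scrape:' + url);
    resolve('result of ' + url);
  }, 5));

// Stub for the database write.
const save = (data, cep) =>
  new Promise(resolve => setTimeout(() => {
    order.push('save:' + cep);
    resolve();
  }, 1));

async function bigFunctionSketch(links) {
  for (const link of links) {
    for (const url of link.links) {
      try {
        const result = await scrape(url); // one page at a time
        await save(result, link.cep);     // then its write
      } catch (e) {
        console.error(e); // log and continue with the next url
      }
    }
  }
}

const run = bigFunctionSketch([
  { cep: '111', links: ['a', 'b'] },
  { cep: '222', links: ['c'] },
]).then(() => console.log(order.join(' -> ')));
```

Each scrape/save pair completes before the next URL starts, which is exactly the ordering the nested awaits guarantee.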
Asynchronous loop inside an asynchronous loop: a good idea?
It's perfectly fine, and sometimes necessary, to nest loops that involve asynchronous operations. But the loops have to be designed carefully to both work appropriately and be clean, readable, maintainable code. Your bigFunction() does not seem to be either of those to me, given its mix of async coding styles.
It works fine, but i'm finding this async loop inside another async loop like a hack of some sort.
If I were teaching a junior programmer, or doing a code review for code from any level of developer, I would never allow code that mixes promises and the callback-based async library. It mixes two completely different programming styles for both control flow and error handling, and all you get is a very hard-to-understand mess. Pick one model or the other; don't mix. Personally, it seems to me that the future of the language is promises for both async control flow and error propagation, so that's what I would use.
Note: This appears to be pseudo-code, because some references used in this code are unused or undefined, such as result and dadosAjustados. So you will have to adapt this concept to your real code. In the future, we can offer you much more complete answers, and often suggest improvements you're not even aware of, if you include your real code rather than abbreviated pseudo-code.

Related

Asynchronous sequence in nodejs with multiple levels

I'm struggling trying to build an asynchronous sequence in JavaScript, although I have seen a lot of tutorials online.
Maybe you could help me :-)
I have an array of 6 functions (F1..F6).
Those functions need to be run sequentially, forwarding a shared parameter from one function to the next.
Inside F3, I want to call another little function multiple times (currently using an array.map); let's call them F3a, F3b, F3c.
F3a, F3b and F3c need to be sequential as well; at least, F4 must start AFTER all of F3 is completed, and these multilevel functions still share the same global parameter.
I'm not sure I understand the full concept of Promises but I bet I need to learn more on them. :-)
The expected scheme I have in mind:
F1---|
******F2------|
***************F3a-|
********************F3b-|
*************************F3c-|
******************************F4------|
******************************F5------|
******************************F6------|
It might look easy to all of you, but as a non-developer it's a bit harder for me to get the code working.
I have something partially working (using async and some Promises), and I guess (I'm not sure) it follows the sequencing below, so not as expected:
F1---|
******F2------|
***************F3a-|
***************F3b-|
***************F3c-|
********************F4------|
********************F5------|
********************F6------|
I can't really show you the current code; it's a total mess, and I'd prefer to explain the idea rather than just get the solution from you.
Should I use a nodejs library or should I code it using native js? Any library suggestion?
Any thought?
Thanks a lot.
You can do something like this without any external library:
const sharedVariable = {};

async function doTask(data) {
    const data1 = await f1(data, sharedVariable);
    const data2 = await f2(data1, sharedVariable);
    const data3 = await f3(data2, sharedVariable);
    const data4 = await f4(data3, sharedVariable);
    const data5 = await f5(data4, sharedVariable);
    const data6 = await f6(data5, sharedVariable);
    return data6;
}

async function f3(data) {
    const data1 = await f3a(data, sharedVariable);
    const data2 = await f3b(data1, sharedVariable);
    const data3 = await f3c(data2, sharedVariable);
    return data3;
}

doTask(/* initial data */);
I'm not sure I understand the full concept of Promises but I bet I need to learn more on them. :-)
Yes, go ahead! What you describe is really easy to do using promises with async/await syntax:
async function f3(arg) {
    await f3a();
    await f3b();
    const res = await f3c();
    return res;
}
async function main(val) {
    await f6(await f5(await f4(await f3(await f2(await f1(val))))));
}
You can also easily use an actual loop if you have the functions in an array:
async function main(val) {
    for (const f of fs)
        val = await f(val);
}
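To make the loop version concrete, here is a runnable sketch with six stub functions standing in for the real F1..F6; each stub records its name and passes an incremented value along, so you can watch both the ordering and the value threading:

```javascript
// Record of the order in which the steps actually ran.
const calls = [];

// Factory for stub steps: each appends its name, then forwards val + 1.
const makeStep = name => async val => {
  calls.push(name);
  return val + 1;
};
const fs = ['f1', 'f2', 'f3', 'f4', 'f5', 'f6'].map(makeStep);

async function main(val) {
  for (const f of fs)
    val = await f(val); // each step starts only after the previous one resolved
  return val;
}

const finished = main(0);
finished.then(result => console.log(calls.join(','), result)); // f1..f6 in order, result 6
```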
d3-queue does a pretty good job for me. I also used it to create multiple nested queues. The API's easy to understand.
In your case it can look like this:
// All functions f1..f6 should have the same signature:
//   function fN(sharedParameter, done) { ... }
// where done is the callback you call when the function has finished.
function F3(sharedParameter, done) {
    let nestedQ = d3.queue(1);
    nestedQ.defer(f3a, sharedParameter);
    nestedQ.defer(f3b, sharedParameter);
    nestedQ.defer(f3c, sharedParameter);
    nestedQ.awaitAll(error => {
        done(error);
    });
}

let q = d3.queue(1); // pass 1 for sequential execution
q.defer(f1, sharedParameter);
q.defer(f2, sharedParameter);
q.defer(F3, sharedParameter);
q.defer(f4, sharedParameter);
q.awaitAll((error, result) => {
    // when everything's done
});

Mongoose Cursor: http bulk request from collection

I have a problem related to rxJS and bulk HTTP requests from a huge collection (1M+ docs).
I have the following code, with quite simple logic. I push all the docs from the collection into the allPlayers array and make bulk HTTP requests to the API, 20 at a time (I guess you understand why it's limited). The code works fine, but I think it's time to refactor it from this:
const cursor = players_db.find(query).lean().cursor();
cursor.on('data', function (player) { allPlayers.push(player); });
cursor.on('end', function () {
    logger.log('warng', `S,${allPlayers.length}`);
    from(allPlayers).pipe(
        mergeMap(player => getPlayer(player.name, player.realm), 20),
    ).subscribe({
        next: player => console.log(`${player.name}#${player.realm}`),
        error: error => console.error(error),
        complete: () => console.timeEnd(`${updatePlayer.name}`),
    });
});
As for now, I'm using find with a cursor (and batchSize), but if I understood it right (via .length), and according to this question {mongoose cursor batchSize}, batchSize is just an optimization and it doesn't return me an array of X docs.
So what should I do now and what operator should I choose for rxJS?
For example, I could form arrays of the necessary length (like 20) and pass them to rxJS as before. But I guess there should be another way, where I could use rxJS inside this for-await loop:
const players = await players_db.find(query).lean().cursor({ batchSize: 10 });
for (let player = await players.next(); player != null; player = await players.next()) {
    // do something via RxJS inside the for loop
}
Also I found this question {Best way to query all documents from a mongodb collection in a reactive way w/out flooding RAM}, which is also relevant to my problem; I understand the logic, but not its syntax. I also know that the cursor variable isn't a doc, so I can't do anything useful with it. Or actually, could I?
rxJS's bufferCount is quite an interesting operator
https://gist.github.com/wellcaffeinated/f908094998edf54dc5840c8c3ad734d3 probable solution?
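For reference, bufferCount(n) groups the emissions of a stream into arrays of length n. Applied to a plain finite array instead of a stream, the grouping it performs looks like this sketch (the function name mirrors the operator but is plain JavaScript, not rxJS):

```javascript
// Plain-JavaScript sketch of what bufferCount(n) does to a finite
// sequence: collect items into arrays of length n; the last buffer
// may be shorter if the input doesn't divide evenly.
function bufferCount(items, n) {
  const buffers = [];
  for (let i = 0; i < items.length; i += n) {
    buffers.push(items.slice(i, i + n));
  }
  return buffers;
}

console.log(bufferCount([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```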
So, in the end I found that rxJS isn't needed (but can be used) for this case.
The solution was quite simple, using just a MongoCursor:
async function BulkRequest(bulkSize = 10) {
    try {
        let BulkRequest_Array = [];
        const cursor = collection_db.find({}).lean().cursor({ batchSize: bulkSize });
        cursor.on('data', async (doc) => {
            BulkRequest_Array.push(/* any function or axios instance */);
            if (BulkRequest_Array.length >= bulkSize) {
                cursor.pause();
                console.time(`========================`);
                await Promise.all(BulkRequest_Array);
                BulkRequest_Array.length = 0;
                cursor.resume();
                console.timeEnd(`========================`);
            }
        });
    } catch (e) {
        console.error(e);
    }
}
BulkRequest();
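The pause/await/resume pattern above is really just sequential batch processing: collect bulkSize items, wait for all of their requests, then continue. With a plain array instead of a cursor, the same idea can be sketched like this (bulkRequest and handler are illustrative names, not mongoose or rxJS API):

```javascript
// Sequential batches: each batch of `bulkSize` handlers runs in
// parallel via Promise.all, and the next batch starts only once the
// previous one has settled, mirroring cursor.pause()/cursor.resume().
async function bulkRequest(items, handler, bulkSize = 10) {
  const results = [];
  for (let i = 0; i < items.length; i += bulkSize) {
    const batch = items.slice(i, i + bulkSize);
    // wait for the whole batch before moving on
    results.push(...await Promise.all(batch.map(handler)));
  }
  return results;
}

const done = bulkRequest([1, 2, 3, 4, 5], async n => n * 2, 2);
done.then(results => console.log(results)); // doubled values, in input order
```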

Node.js - How to return callback with array from for loop with MySQL query?

I'm trying to get a list of virtual communities in my Node.js app and then return it with a callback function. When I call the getList() method with a callback, it returns an empty array.
const mysqli = require("../mysqli/connect");

class Communities {
    getList(callback) {
        var list = [];
        mysqli.query("SELECT * FROM communities", (err, communities) => {
            for (let i = 0; i < communities.length; i++) {
                mysqli.query("SELECT name FROM users WHERE id='" + communities[i].host + "'", (err, host) => {
                    list.push({
                        "id": communities[i].id,
                        "name": communities[i].name,
                        "hostID": communities[i].host,
                        "hostName": host[0].name,
                        "verified": communities[i].verified,
                        "people": communities[i].people
                    });
                });
            }
            callback(list);
        });
    }
}

new Communities().getList((list) => {
    console.log(list);
});
I need to make the for loop asynchronous and call the callback when the loop ends. Please let me know how to do this. Thanks.
Callbacks get really ugly if you have to combine multiple of them; that's why Promises were invented to simplify that. To use Promises in your case, you have to create a Promise first when querying the database¹:
const query = q => new Promise((resolve, reject) => mysqli.query(q, (err, result) => err ? reject(err) : resolve(result)));
Now doing multiple queries will return multiple promises, that can be combined using Promise.all to one single promise²:
async getList() {
    const communities = await query("SELECT * FROM communities");
    const result = await/*³*/ Promise.all(communities.map(async community => {
        const host = await query(`SELECT name FROM users WHERE id='${community.host}'`); /*⁴*/
        return {
            ...community,
            hostName: host[0].name,
        };
    }));
    return result;
}
Now you can easily get the result with:
new Communities().getList().then(list => {
    console.log(list);
});
Read on:
Working with Promises - Google Developers
Understanding async / await - Ponyfoo
Notes:
¹: If you do that more often, you should probably use a mysql library that supports promises natively; that saves a lot of work.
²: This way the requests are done in parallel, which means it is way faster than doing one after another (which could be done using a for loop and awaiting inside it).
³: That await is superfluous, but I prefer to keep it to mark it as an asynchronous action.
⁴: I guess that could also be done using one SQL query, so if it is too slow for your use case (which I doubt) you should optimize the query itself.

Mongoose inserting same data three times instead of iterating to next data

I am trying to seed the following data to my MongoDB server:
const userRole = {
    role: 'user',
    permissions: ['readPost', 'commentPost', 'votePost']
}

const authorRole = {
    role: 'author',
    permissions: ['readPost', 'createPost', 'editPostSelf', 'commentPost', 'votePost']
}

const adminRole = {
    role: 'admin',
    permissions: ['readPost', 'createPost', 'editPost', 'commentPost', 'votePost', 'approvePost', 'approveAccount']
}

const data = [
    {
        model: 'roles',
        documents: [userRole, authorRole, adminRole]
    }
]
When I try to iterate through this object / array, and to insert this data into the database, I end up with three copies of 'adminRole', instead of the three individual roles. I feel very foolish for being unable to figure out why this is happening.
My code to actually iterate through the object and seed it is the following, and I know it's actually getting every value, since I've done the console.log testing and can get all the data properly:
for (i in data) {
    m = data[i]
    const Model = mongoose.model(m.model)
    for (j in m.documents) {
        var obj = m.documents[j]
        Model.findOne({'role': obj.role}, (error, result) => {
            if (error) console.error('An error occurred.')
            else if (!result) {
                Model.create(obj, (error) => {
                    if (error) console.error('Error seeding. ' + error)
                    console.log('Data has been seeded: ' + obj)
                })
            }
        })
    }
}
Update:
Here is the solution I came up with after reading everyone's responses. Two private functions generate Promise objects, one for deleting any existing data and one for inserting the new data, and then all the promises are resolved with Promise.all.
// Stores all promises to be resolved
var deletionPromises = []
var insertionPromises = []

// Fetch the model via its name string from mongoose
const Model = mongoose.model(data.model)

// For each object in the 'documents' field of the main object
data.documents.forEach((item) => {
    deletionPromises.push(promiseDeletion(Model, item))
    insertionPromises.push(promiseInsertion(Model, item))
})

console.log('Promises have been pushed.')

// We need to fulfil the deletion promises before the insertion promises.
Promise.all(deletionPromises).then(() => {
    return Promise.all(insertionPromises).catch(() => {})
}).catch(() => {})
I won't include both promiseDeletion and promiseInsertion as they're functionally the same.
const promiseDeletion = function (model, item) {
    console.log('Promise Deletion ' + item.role)
    return new Promise((resolve, reject) => {
        model.findOneAndDelete(item, (error) => {
            if (error) reject()
            else resolve()
        })
    })
}
Update 2: You should ignore my most recent update. I've modified the result I posted a bit, but even then, half of the time the roles are deleted and not inserted. It's very random as to when it will actually insert the roles into the server. I'm very confused and frustrated at this point.
You ran into a very common problem when using JavaScript: you shouldn't define (async) functions in a regular for(-in) loop. What happens is that while you loop through the three values, the first async find is called. Since your code is async, Node.js does not wait for it to finish before it continues to the next loop iteration, counting up to the third value, here the admin role.
Now, since you defined your functions in the loop, when the first async call is over, the for-loop already looped to the last value, which is why admin is being inserted three times.
To avoid this, you can move the async functions out of the loop to force a call by value rather than by reference. Still, this can bring up a lot of other problems, so I'd recommend you have a look at promises and how to chain them (e.g. put all mongoose promises in an array and then await them using Promise.all), or use the more modern async/await syntax together with the for-of loop, which allows for both easy readability and sequential async instructions.
Check this very similar question: Calling an asynchronous function within a for loop in JavaScript
Note: for-of is sometimes discussed as being performance-heavy, so check whether this applies to your use case or not.
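The capture problem described above can be reproduced in a few lines without any database. In this sketch (with made-up role data, and setImmediate standing in for the async findOne callbacks), every deferred callback sees the last role when the variable is declared with var, but the correct one with a per-iteration binding:

```javascript
// Minimal reproduction of the seeding bug: `var` is function-scoped,
// so every deferred callback closes over the SAME `obj` binding and
// sees whatever it held last.
const roles = ['user', 'author', 'admin'];
const withVar = [];
const withConst = [];

for (var i in roles) {
  var obj = roles[i]; // one shared binding for the whole loop
  setImmediate(() => withVar.push(obj));
}

for (const j in roles) {
  const role = roles[j]; // a fresh binding per iteration
  setImmediate(() => withConst.push(role));
}

// Runs after all the callbacks above (setImmediate is FIFO).
const settled = new Promise(resolve => setImmediate(resolve)).then(() => {
  console.log(withVar);   // the last role, three times
  console.log(withConst); // the three distinct roles
});
```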
Using async functions in loops can cause some problems.
You should change the way you work with findOne so that it behaves like a synchronous call.
First you need to mark your function as async, and then use findOne like so:
async function myFunction() {
    // exec() fires the query and gives back a promise which await can handle
    let res = await Model.findOne({'role': obj.role}).exec();
    // do what you need to do here with the result...
}

Javascript/NodeJS: Array empty after pushing values in forEach loop

I got a little bit of a problem. Here is the code:
Situation A:
var foundRiders = [];

riders.forEach(function (rider) {
    Rider.findOne({_id: rider}, function (err, foundRider) {
        if (err) {
            console.log("program tried to look up rider for the forEach loop finalizing the results, but could not find");
        } else {
            foundRiders.push(foundRider);
            console.log(foundRiders);
        }
    });
});
Situation B
var foundRiders = [];

riders.forEach(function (rider) {
    Rider.findOne({_id: rider}, function (err, foundRider) {
        if (err) {
            console.log("program tried to look up rider for the forEach loop finalizing the results, but could not find");
        } else {
            foundRiders.push(foundRider);
        }
    });
});

console.log(foundRiders);
So in Situation A, when I console.log inside the loop, I see that foundRiders is an array filled with objects. In Situation B, when I put the console.log outside the loop, my foundRiders array is completely empty...
How come?
As others have said, your database code is asynchronous. That means that the callbacks inside your loop are called sometime later, long after your loop has already finished. There are a variety of ways to program an async loop. In your case, it's probably best to move to the promise interface for your database and then use promises to coordinate your multiple database calls. You can do that like this:
Promise.all(riders.map(rider => {
    return Rider.findOne({_id: rider}).exec();
})).then(foundRiders => {
    // all found riders here
}).catch(err => {
    // error here
});
This uses the .exec() interface of the mongoose database to run your query and return a promise. Then, riders.map() builds and returns an array of these promises. Then, Promise.all() monitors all the promises in the array and calls .then() when they are all done or .catch() when there's an error.
If you want to ignore any riders that aren't found in the database, rather than abort with an error, then you can do this:
Promise.all(riders.map(rider => {
    return Rider.findOne({_id: rider}).exec().catch(err => {
        // convert error to null result in resolved array
        return null;
    });
})).then(foundRiders => {
    foundRiders = foundRiders.filter(rider => rider !== null);
    console.log(foundRiders);
}).catch(err => {
    // handle error here
});
To help illustrate what's going on here, this is a more old fashioned way of monitoring when all the database callbacks are done (with a manual counter):
let cntr = 0; // the counter must live outside the loop, shared by all callbacks
riders.forEach(function (rider) {
    Rider.findOne({_id: rider}, function (err, foundRider) {
        ++cntr;
        if (err) {
            console.log("program tried to look up rider for the forEach loop finalizing the results, but could not find");
        } else {
            foundRiders.push(foundRider);
        }
        // if all DB requests are done here
        if (cntr === riders.length) {
            // put code here that wants to process the finished foundRiders
            console.log(foundRiders);
        }
    });
});
The business of maintaining a counter to track multiple async requests is what Promise.all() has built in.
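To see that Promise.all does this counting for you, and that it delivers results in input order even when the individual queries finish out of order, here is a sketch with a stubbed findRider standing in for Rider.findOne(...).exec() (the stub and its delays are invented for illustration):

```javascript
// Stub for Rider.findOne(...).exec(): each lookup resolves after a
// different delay, so the COMPLETION order is the reverse of the
// request order.
const findRider = id =>
  new Promise(resolve => setTimeout(() => resolve({ _id: id }), 40 - id * 10));

const riders = [1, 2, 3];

// Promise.all keeps the internal count of settled promises and,
// importantly, resolves with results in the order of the input array,
// not in completion order.
const all = Promise.all(riders.map(findRider))
  .then(foundRiders => foundRiders.map(r => r._id));

all.then(ids => console.log(ids)); // [ 1, 2, 3 ]
```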
The code above assumes that you want to parallelize your code and to run the queries together to save time. If you want to serialize your queries, then you could use await in ES6 with a for loop to make the loop "wait" for each result (this will probably slow things down). Here's how you would do that:
async function lookForRiders(riders) {
    let foundRiders = [];
    for (let rider of riders) {
        try {
            let found = await Rider.findOne({_id: rider}).exec();
            foundRiders.push(found);
        } catch (e) {
            console.log(`did not find rider ${rider} in database`);
        }
    }
    console.log(foundRiders);
    return foundRiders;
}

lookForRiders(riders).then(foundRiders => {
    // process results here
}).catch(err => {
    // process error here
});
Note that while this looks more like the synchronous code you may be used to in other languages, it still uses asynchronous concepts, and the lookForRiders() function still returns a promise whose result you access with .then(). This is a newer feature in JavaScript which makes some types of async code easier to write.
