setTimeout in nodejs - for loops - javascript

I have looked over this post: setTimeout in Node.js loop in hopes that it would solve my issue, and in a way it has. However, I am encountering a new issue that I am unable to resolve on my own.
I have been trying to fix this issue for the past couple of hours, but I have had no luck. Here is what I have:
The function that needs to be called in the timeout function:
function searchLDAP(i, app, userUID) {
    app.process_args(userUID[i]);
}
This is the portion of the code that is not working properly. The code works for the first iteration (userUID[0]); however, when it tries to recurse, i becomes undefined.
function doSetTimeout(i, count, app, userUID) {
    if (i == count - 1) { callback(); }
    searchLDAP(i, app, userUID);
    ++i;
    setTimeout(doSetTimeout, 2000);
}
I am using node's async module:
async.series([
    function(callback) {
        app.readLines(input, callback); // userUID is an array that
                                        // is returned from this function
    },
    function() {
        var count = userUID.length;
        var i = 0;
        doSetTimeout(i, count, app, userUID);
    }
], function(err) {
    console.log('all functions complete');
});
Thank you in advance
-Patrick

With setTimeout, you only specify which function should be called; you do not pass the set of arguments the function should be called with. Here's one possible approach to solving this problem:
function doSetTimeout(i, count, app, userUID) {
    if (i >= count) {
        return; // stop the recursion once every index has been processed
    }
    searchLDAP(i, app, userUID);
    setTimeout(function() {
        doSetTimeout(i + 1, count, app, userUID);
    }, 2000);
}
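For completeness, Node's setTimeout also forwards any arguments given after the delay straight to the callback, so the closure wrapper can be avoided. A minimal sketch reusing the names from the code above (the stop condition is written as i >= count so the last index is still processed):

```javascript
// Sketch: setTimeout(fn, delay, ...args) forwards the extra arguments to fn,
// so no wrapper closure is needed to carry i, count, app, and userUID along.
function searchLDAP(i, app, userUID) {
  app.process_args(userUID[i]);
}

function doSetTimeout(i, count, app, userUID) {
  if (i >= count) {
    return; // every index has been processed; stop the recursion
  }
  searchLDAP(i, app, userUID);
  // arguments after the delay are passed to doSetTimeout on the next tick
  setTimeout(doSetTimeout, 2000, i + 1, count, app, userUID);
}
```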

As an alternative to raina77ow's answer, you can also lean on async itself (and dramatically reduce complexity and lines of code), like this:
app.readLines(input, function(err, users) {
    async.each(users, function(user, next) {
        app.process_args(user);
        next();
    }, function(err) {
        console.log('all functions complete');
    });
});
This will iterate over each of your users (the userUID array from your post) and pass each value to the function. No need to mess with tracking the value of i yourself. This isn't rate limited and doesn't use setTimeout, which is only necessary if your process_args method cannot handle concurrent calls. If you need to make sure that only one app.process_args is being called at a time, you can use eachSeries instead:
app.readLines(input, function(err, users) {
    async.eachSeries(users, function(user, next) {
        app.process_args(user);
        next();
    }, function(err) {
        console.log('all functions complete');
    });
});
And if for some reason you really need a two second delay between calls, you can do the following:
app.readLines(input, function(err, users) {
    async.eachSeries(users, function(user, next) {
        setTimeout(function() {
            app.process_args(user);
            next();
        }, 2000);
    }, function(err) {
        console.log('all functions complete');
    });
});
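For illustration, here is a minimal hand-rolled sketch of the guarantee async.eachSeries provides (this eachSeries is a simplified stand-in, not the library's implementation): the iterator for item n+1 only starts after item n calls next().

```javascript
// Simplified stand-in for async.eachSeries: strictly sequential iteration.
function eachSeries(items, iterator, done) {
  var i = 0;
  function step(err) {
    if (err || i >= items.length) return done(err);
    iterator(items[i++], step); // next item starts only when step is called
  }
  step();
}

// Usage: pushes happen strictly in input order, even though the work is async.
var order = [];
eachSeries([1, 2, 3], function (n, next) {
  setImmediate(function () {
    order.push(n);
    next();
  });
}, function () {
  console.log(order); // [ 1, 2, 3 ]
});
```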

Related

How to manage nearly 5000 records in a for loop in node js server?

I created a server using NodeJS. There's a MySQL database with nearly 5000 users in it. I have to read the MySQL database, update it, and write a log to a MongoDB database. I implemented code for this.
https://gist.github.com/chanakaDe/aa9d6a511070c3c78ba3ebc018306ad8
Here's the problem: in this code, at line 50, I added this value:
userArray[i].ID
This is a user ID from for loop and I need to update mysql table with that ID. All those codes in the for loop block. But I am getting this error.
TypeError: Cannot read property 'ID' of undefined
So I assigned those values to variables at the top (see lines 38 and 39):
var selectedUserID = userArray[i].ID;
var selectedUserTelephone = userArray[i].telephone;
When I use it like this, there's no error, but the user ID is not updated: the 2 most recent values have the same user ID.
What is the solution for this?
This is a general JavaScript issue related to the concepts of scope and hoisting of variables during asynchronous operations.
var a = 0;
function doThingWithA() {
    console.log(a);
}
for (var i = 0; i < 1000; i++) {
    a++;
    setTimeout(function () {
        doThingWithA();
    }, 10);
}
In this example "a" will always log as 1000. The reason is that setTimeout (which mimics the slow db operation) takes time, and during that time (before the log happens) "a" is incremented to 1000, since the for loop does not wait for setTimeout to complete.
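A common fix for this capture problem is to give each iteration its own binding, either with an IIFE or (in modern Node) by declaring the loop variable with let. A sketch of the IIFE form:

```javascript
// Each IIFE call gives the timer callback its own captured copy of the
// counter, so the delayed push sees that iteration's value rather than
// the final one. Declaring the loop variable with `let` has the same effect.
var logged = [];
for (var i = 0; i < 3; i++) {
  (function (captured) {
    setTimeout(function () {
      logged.push(captured);
    }, 10);
  })(i);
}
setTimeout(function () {
  console.log(logged); // [ 0, 1, 2 ]
}, 50);
```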
The best solution is to use the "async" module.
pool.getConnection(function (err, connection) {
    connection.query(query, function (err, users) {
        async.eachSeries(users, function (user, next) {
            async.parallel([
                function updateUserStatus(cb) { /* your current code */ },
                function updateUserAccount(cb) { /* current code for this */ }
            ], next);
        }, function (err) { console.log('finished for all users!'); });
    });
});
You could also use promises. This is a typical async issue in node.js. From reading your code it appears you think each operation runs in series, whereas in node each input/output operation (e.g. a db call) is triggered, but your code continues to run, as shown in my for loop example above.
There is a cool library called es6-promise-pool:
https://www.npmjs.com/package/es6-promise-pool
It has a concurrency option. For example (delayValue is a promise-returning helper from the library's docs):
var count = 0;
var promiseProducer = function () {
    if (count < 5) {
        count++;
        return delayValue(count, 1000);
    } else {
        return null;
    }
};
var pool = new PromisePool(promiseProducer, 3);
pool.start()
    .then(function () {
        console.log('Complete');
    });

How to make some synchronous code run before some other asynchronous code?

I have a function like this:
var download = function(url, name) {
    http.get(url, function(response) {
        // part 1: create a new folder if it doesn't exist
        dir = './name';
        if (!fs.existsSync(dir)) {
            fs.mkdirSync(dir);
        }
        // part 2: download and save file into that folder
        response.on('data', function (data) {
            fs.appendFileSync(dir, data);
        });
    });
};
I want part 1 to finish before part 2 runs (so that the dir exists for part 2). How can I do that?
(In the code above, as far as I know (I am new to node.js), both parts will run simultaneously, so I'm not sure that part 1 will always finish before part 2 runs.)
both parts will run simultaneously
No, they will not. existsSync and mkdirSync are blocking calls, so the event handler is only attached after they have executed.
But we should take advantage of asynchronicity whenever applicable. In this case, you can use the asynchronous counterparts, exists and mkdir.
So, your code can be loosely refactored like this
function download(url, name) {
    function attachAppender(filename, response) {
        response.on('data', function (data) {
            fs.appendFile(filename, data, function (err) {
                if (err) {
                    // handle the write error, e.g. stop consuming the response
                }
            });
        });
    }
    http.get(url, function (response) {
        var dir = './' + name;
        fs.exists(dir, function (exists) {
            if (!exists) {
                fs.mkdir(dir, function (err) {
                    if (err) {
                        // handle the mkdir error
                    } else {
                        // pass the actual full file name
                        attachAppender(filename, response);
                    }
                });
            } else {
                attachAppender(filename, response);
            }
        });
    });
}
Note: fs.exists is deprecated and possibly removed soon. Better use fs.stat instead of it.
You are using sync functions, so the calls are blocking. However, as thefoureye mentioned, it is better to use the async versions for performance reasons.
If you want to avoid callback hell (i.e. your code becoming more and more difficult to read as you chain asynchronous calls), you can use a library such as async.js, which is written with the intent of making async code easier to write (and, of course, easier to read).
Here is an example taken from the unit tests of async.js: each async function is called after the other.
var series = function(test) {
    var call_order = [];
    async.series([
        function(callback) {
            setTimeout(function() {
                call_order.push(1);
                callback(null, 1);
            }, 25);
        },
        function(callback) {
            setTimeout(function() {
                call_order.push(2);
                callback(null, 2);
            }, 50);
        },
        function(callback) {
            setTimeout(function() {
                call_order.push(3);
                callback(null, 3, 3);
            }, 15);
        }
    ],
    function(err, results) {
        test.ok(err === null, err + " passed instead of 'null'");
        test.same(results, [1, 2, [3, 3]]);
        test.same(call_order, [1, 2, 3]);
        test.done();
    });
};
There are lots of other initiatives to make series of async calls easier to read and write (async/await and fibers.js, for example).
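For comparison, the same "run in order" flow with async/await (available in modern Node) reads almost like synchronous code. A sketch with a hypothetical delay helper:

```javascript
// delay: resolves with `value` after `ms` milliseconds (illustration helper).
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value); }, ms);
  });
}

// Each awaited step completes before the next begins, mirroring async.series:
// the results arrive in call order even though the delays differ.
async function runInOrder() {
  var callOrder = [];
  callOrder.push(await delay(25, 1));
  callOrder.push(await delay(50, 2));
  callOrder.push(await delay(15, 3));
  return callOrder;
}

runInOrder().then(function (callOrder) {
  console.log(callOrder); // [ 1, 2, 3 ]
});
```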

Asynchronously Write Large Array of Objects to Redis with Node.js

I created a Node.js script that creates a large array of randomly generated test data and I want to write it to a Redis DB. I am using the redis client library and the async library. Initially, I tried executing a redisClient.hset(...) command within the for loop that generates my test data, but after some Googling, I learned the Redis method is asynchronous while the for loop is synchronous. After seeing some questions on StackOverflow, I can't get it to work the way I want.
I can write to Redis without a problem with a small array, or even a larger one with, say, 100,000 items. However, it does not work well when I have an array of 5,000,000 items: I end up running out of memory, because the redis commands seem to queue up but aren't executed until after async.each(...) is complete, and the node process does not exit. How do I get the Redis client to actually execute the commands as I call redisClient.hset(...)?
Here is a fragment of the code I am working with:
var redis = require('redis');
var async = require('async');

var redisClient = redis.createClient(6379, '192.168.1.150');
var testData = generateTestData();

async.each(testData, function(item, callback) {
    var someData = JSON.stringify(item.data);
    redisClient.hset('item:' + item.key, 'hashKey', someData, function(err, reply) {
        console.log("Item was persisted. Result: " + reply);
    });
    callback();
}, function(err) {
    if (err) {
        console.error(err);
    } else {
        console.log.info("Items have been persisted to Redis.");
    }
});
You could call eachLimit to ensure you are not executing too many redisClient.hset calls at the same time.
To avoid overflowing the call stack you could do setTimeout(callback, 0); instead of calling the callback directly.
edit:
Forget what I said about setTimeout. All you need to do is call the callback in the right place, like so:
redisClient.hset('item:' + item.key, 'hashKey', someData, function(err, reply) {
    console.log("Item was persisted. Result: " + reply);
    callback();
});
You may still want to use eachLimit and try out which limit works best.
By the way, async.each is supposed to be used only with code that schedules the invocation of its callback via the JavaScript event queue (e.g. a timer, the network, etc.). Never use it with code that calls the callback immediately, as was the case in your original code.
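The pattern the answer describes can be sketched as follows, with a stub standing in for redisClient.hset (NOT the real redis client API): the per-item callback fires only after the write itself completes, and at most a fixed number of writes are in flight at once. writeAllLimited is a made-up helper name.

```javascript
// Stub pretending to be redisClient.hset: schedules its callback
// asynchronously, like a real network write would.
function hset(key, field, value, cb) {
  setImmediate(function () { cb(null, 'OK'); });
}

// Keep at most `limit` writes in flight; call `done` once every item's
// write has actually completed.
function writeAllLimited(items, limit, done) {
  var inFlight = 0, next = 0, finished = 0;
  if (items.length === 0) return done(null);
  function launch() {
    while (inFlight < limit && next < items.length) {
      inFlight++;
      var item = items[next++];
      hset('item:' + item.key, 'hashKey', JSON.stringify(item.data), function (err, reply) {
        inFlight--;
        finished++;
        if (finished === items.length) {
          done(null); // every write confirmed
        } else {
          launch(); // refill the in-flight pool
        }
      });
    }
  }
  launch();
}
```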
edit:
You can implement your own eachLimit function that takes a generator as its first argument instead of an array. Then you write a generator function to create the test data. For that to work, node needs to be run with "node --harmony code.js".
function eachLimit(generator, limit, iterator, callback) {
    var isError = false, j;
    function startNextSetOfActions() {
        var elems = [];
        for (var i = 0; i < limit; i++) {
            j = generator.next();
            if (j.done) break;
            elems.push(j.value);
        }
        var activeActions = elems.length;
        if (activeActions === 0) {
            callback(null);
        }
        elems.forEach(function(elem) {
            iterator(elem, function(err) {
                if (isError) return;
                else if (err) {
                    callback(err);
                    isError = true;
                    return;
                }
                activeActions--;
                if (activeActions === 0) startNextSetOfActions();
            });
        });
    }
    startNextSetOfActions();
}
function* testData() {
    while (...) {
        yield new Data(...);
    }
}
eachLimit(testData(), 10, function(item, callback) {
    var someData = JSON.stringify(item.data);
    redisClient.hset('item:' + item.key, 'hashKey', someData, function(err, reply) {
        if (err) callback(err);
        else {
            console.log("Item was persisted. Result: " + reply);
            callback();
        }
    });
}, function(err) {
    if (err) {
        console.error(err);
    } else {
        console.log("Items have been persisted to Redis.");
    }
});

NodeJS/Waterline, Need to wait for multiple create end before response

In my nodejs app (using sailsjs), I use a controller to receive multiple images uploaded in a .zip, then I save them in my database using waterline:
var images = [];
zipEntries.forEach(function(zipEntry) {
    zip.extractEntryTo(zipEntry.entryName, p, false, true);
    Image.create({
        filename: zipEntry.name,
    }).exec(function(err, image) {
        images.push(image);
    });
});
res.json(images);
The problem is I need to send back all the image IDs generated (auto-incremented), but the create method is asynchronous. Is there any way to wait for all the create calls to end before sending the server response?
EDIT: I found a workaround: by counting the images and incrementing an index in the .exec callback, I can check whether this is the last exec and then send the response.
var images = [],
    nbValidImage = 0,
    i = 0;

zipEntries.forEach(function(zipEntry) {
    zip.extractEntryTo(zipEntry.entryName, p, false, true);
    nbValidImage++;
    Image.create({
        filename: zipEntry.name,
    }).exec(function(err, image) {
        i++;
        images.push(image);
        if (i == nbValidImage) {
            res.json(images);
        }
    });
});
But if someone has a better solution... :)
In this case, I would go with async.each, since zipEntries is a collection (didn't test this code):
var async = require('async');

async.each(zipEntries, function (zipEntry, callback) {
    zip.extractEntryTo(zipEntry.entryName, p, false, true);
    Image.create({
        filename: zipEntry.name
    }).exec(function (err, image) {
        images.push(image);
        callback();
    });
}, function (err) {
    if (err) throw new Error(err);
    res.json(images);
});
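The same "wait for all, then respond once" requirement can also be expressed with Promise.all by wrapping each create call in a promise. A sketch where createImage is a stub standing in for Image.create(...).exec(...), not Waterline's actual API:

```javascript
// Stub pretending to be an ORM create call: resolves asynchronously with
// the created record.
function createImage(filename) {
  return new Promise(function (resolve) {
    setImmediate(function () {
      resolve({ filename: filename });
    });
  });
}

// Promise.all waits for every create to finish and preserves input order,
// so the response can be sent exactly once with all the records.
function saveAll(names) {
  return Promise.all(names.map(createImage));
}

saveAll(['a.png', 'b.png']).then(function (images) {
  console.log(images.length); // 2, in the same order as the input array
});
```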
I would say you want to check out promises or async.
Async has a parallel function I find useful in this situation.
https://github.com/caolan/async#paralleltasks-callback
It allows you to run multiple async functions at once then a callback is called when all functions are done.
I think it would work perfectly for your needs.
Here is an example from their docs:
var async = require('async');

async.parallel([
    function(callback) {
        setTimeout(function() {
            callback(null, 'one');
        }, 200);
    },
    function(callback) {
        setTimeout(function() {
            callback(null, 'two');
        }, 100);
    }
],
// optional callback
function(err, results) {
    // the results array will equal ['one','two'] even though
    // the second function had a shorter timeout.
});

Node.js: How to run asynchronous code sequentially

I have this chunk of code
User.find({}, function(err, users) {
    for (var i = 0; i < users.length; i++) {
        // pseudocode
        Friend.find({
            'user': curUser._id
        }, function(err, friends) { // ** ANOTHER CALLBACK **
            for (var i = 0; i < friends.length; i++) {
                // pseudocode
            }
            console.log("HERE I'm CHECKING " + curUser);
            if (curUser.websiteaccount != "None") {
                request.post({
                    url: 'blah',
                    formData: blah
                }, function(err, httpResponse, body) { // ** ANOTHER CALLBACK **
                    // pseudocode
                    sendMail(friendResults, curUser);
                });
            } else {
                // pseudocode
                sendMail(friendResults, curUser);
            }
        });
        console.log("finished friend");
        console.log(friendResults);
        sleep.sleep(15);
        console.log("finished waiting");
        console.log(friendResults);
    }
});
There are a couple of asynchronous things happening here. For each user, I want to find their relevant friends and concat them to a variable. I then want to check if that user has a website account, and if so, make a post request and grab some information there. The only thing is, everything happens out of order, since the code doesn't wait for the callbacks to finish. I've been using a sleep, but that doesn't solve the problem either, since the output is still jumbled.
I've looked into async, but these functions are intertwined and not really separate, so I wasn't sure how it'd work with async either.
Any suggestions to get this code to run sequentially?
Thanks!
I prefer the promise module (https://www.npmjs.com/package/promise) to q because of its simplicity:
var Promises = require('promise');

var promise = new Promises(function (resolve, reject) {
    // do some async stuff
    if (success) {
        resolve(data);
    } else {
        reject(reason);
    }
});

promise.then(function (data) {
    // function called when first promise resolved
    return new Promises(function (resolve, reject) {
        // second async stuff
        if (success) {
            resolve(data);
        } else {
            reject(reason);
        }
    });
}, function (reason) {
    // error handler
}).then(function (data) {
    // second success handler
}, function (reason) {
    // second error handler
}).then(function (data) {
    // third success handler
}, function (reason) {
    // third error handler
});
As you can see, you can continue like this forever. You can also return simple values instead of promises from the async handlers and then these will simply be passed to the then callback.
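To illustrate the last point, a minimal sketch of plain values flowing through a then chain:

```javascript
// A plain value returned from a then handler is passed straight to the
// next then callback; no explicit promise wrapping is needed.
Promise.resolve(1)
  .then(function (n) { return n + 1; })  // plain value, not a promise
  .then(function (n) { return n * 10; })
  .then(function (n) {
    console.log(n); // 20
  });
```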
I rewrote your code so it was a bit easier to read. You have a few choices of what to do if you want to guarantee synchronous execution:
Use the async library. It provides some helper functions that run your code in series, particularly, this: https://github.com/caolan/async#seriestasks-callback
Use promises to avoid making callbacks, and simplify your code APIs. Promises are a new feature in Javascript, although, in my opinion, you might not want to do this right now. There is still poor library support for promises, and it's not possible to use them with a lot of popular libraries :(
Now -- in regards to your program -- there's actually nothing wrong with your code at all right now (assuming you don't have async code in the pseudocode blocks). Your code right now will work just fine, and will execute as expected.
I'd recommend using async for your sequential needs at the moment, as it works both server and client side, is essentially guaranteed to work with all popular libraries, and is well used / tested.
Cleaned up code below
User.find({}, function(err, users) {
    for (var i = 0; i < users.length; i++) {
        Friend.find({ 'user': curUser._id }, function(err, friends) {
            for (var i = 0; i < friends.length; i++) {
                // pseudocode
            }
            console.log("HERE I'm CHECKING " + curUser);
            if (curUser.websiteaccount != "None") {
                request.post({ url: 'blah', formData: 'blah' }, function(err, httpResponse, body) {
                    // pseudocode
                    sendMail(friendResults, curUser);
                });
            } else {
                // pseudocode
                sendMail(friendResults, curUser);
            }
        });
        console.log("finished friend");
        console.log(friendResults);
        sleep.sleep(15);
        console.log("finished waiting");
        console.log(friendResults);
    }
});
First, let's go a bit more functional:
var users = User.find({});

users.forEach(function (user) {
    var friends = Friend.find({
        user: user._id
    });
    friends.forEach(function (friend) {
        if (user.websiteaccount !== 'None') {
            post(friend, user);
        }
        sendMail(friend, user);
    });
});
Then let's async that:
async.waterfall([
    async.apply(User.find, {}),
    function (users, cb) {
        async.each(users, function (user, cb) {
            async.waterfall([
                async.apply(Friend.find, { user: user._id }),
                function (friends, cb) {
                    async.each(friends, function (friend, cb) {
                        if (user.websiteaccount !== 'None') {
                            post(friend, user, function (err) {
                                if (err) {
                                    cb(err);
                                } else {
                                    sendMail(friend, user, cb);
                                }
                            });
                        } else {
                            sendMail(friend, user, cb);
                        }
                    }, cb);
                }
            ], cb);
        }, cb);
    }
], function (err) {
    if (err) {
        // all the errors in one spot
        throw err;
    }
    console.log('all done');
});
Also, this is you doing a join, SQL is really good at those.
You'll want to look into something called promises. They'll allow you to chain events and run them in order. Here's a nice tutorial on what they are and how to use them http://strongloop.com/strongblog/promises-in-node-js-with-q-an-alternative-to-callbacks/
You can also take a look at the Async JavaScript library: Async It provides utility functions for ordering the execution of asynchronous functions in JavaScript.
Note: I think the number of queries you are doing within a handler is a code smell. This problem is probably better solved at the query level. That said, let's proceed!
It's hard to know exactly what you want, because your pseudocode could use a cleanup IMHO, but I'm guessing what you want to do is this:
- Get all users, and for each user:
  - get all the user's friends, and for each friend:
    - send a post request if the user has a website account
    - send an email
- Do something after the process has finished
You can do this many different ways. Vanilla callbacks or async work great; I'm going to advocate for promises because they are the future, and library support is quite good. I'll use rsvp, because it is light, but any Promise/A+ compliant library will do the trick.
// helpers to simulate async calls
var User = {}, Friend = {}, request = {};
var asyncTask = User.find = Friend.find = request.post = function (cb) {
    setTimeout(function () {
        var result = [1, 2, 3];
        cb(null, result);
    }, 10);
};

User.find(function (err, usersResults) {
    // we reduce over the results, creating a "chain" of promises
    // that we can .then off of
    var userTask = usersResults.reduce(function (outerChain, outerResult) {
        return outerChain.then(function (outerValue) {
            // since we do not care about the return value or order
            // of the asynchronous calls here, we just nest them
            // and resolve our promise when they are done
            return new RSVP.Promise(function (resolveFriend, reject) {
                Friend.find(function (err, friendResults) {
                    friendResults.forEach(function (result) {
                        request.post(function (err, finalResult) {
                            resolveFriend(outerValue + '\n finished user' + outerResult);
                        }, true);
                    });
                });
            });
        });
    }, RSVP.Promise.resolve(''));

    // handle success
    userTask.then(function (res) {
        document.body.textContent = res;
    });

    // handle errors
    userTask.catch(function (err) {
        console.log(err);
    });
});

Categories

Resources