I have an interesting case where I need to do a few queries in MongoDB using Mongoose, but the response is returning before I can complete all of them.
I have two document types, list and item. In one particular call, I need to get all of the lists for a particular user, then iterate over each of them and fetch all of the items and append them to the appropriate list before returning.
List.find({'user_id': req.params.user_id}, function(err, docs){
if (!err) {
if (docs) {
var results = [];
_und.each(docs, function(value, key) {
var list = value.toObject();
list.items = [];
Item.find({'list_id': value._id}, function(err, docs) {
if (!err) {
_und.each(docs, function(value, key) { list.items.push(value.toObject()); });
results.push(list);
}
else {
console.log(err);
}
});
});
res.send(results);
(_und is how I've imported underscore.js)
Obviously the issue is the callbacks, and since there are multiple loops I can't return within a callback.
Perhaps this is a case where I would need to get the count in advance and check it on every iteration to decide when to return the results. This doesn't seem elegant though.
Code solution
First of all, the issue is with the code. You're sending the results before the Item.find queries finish. You can fix this quite easily:
var count = docs.length + 1;
next()
_und.each(docs, function(value, key) {
var list = value.toObject();
list.items = [];
Item.find({
'list_id': value._id
}, function(err, docs) {
if (!err) {
_und.each(docs, function(value, key) {
list.items.push(value.toObject());
});
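// push once this list's items have arrived (callbacks may finish in any order)
results.push(list);
}
else {
console.log(err);
}
// count this query as done even on error, otherwise the response is never sent
next();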
});
});
function next() {
--count === 0 && finish()
}
function finish() {
res.send(results)
}
The easiest way is reference counting: you initialize count to the number of documents. Then every time you're finished getting the items for a list, you call next and decrement the count by one.
Once you're finished getting all items, your count should be zero. Note that we do .length + 1 and call next immediately. This guards against the case where there are no documents, which would otherwise never send a response.
Database solution
The best solution is to use mongo correctly. You should not be doing what is effectively a join in your code, it's slow and inefficient as hell. You should have a nested document and denormalize your list.
so list.items = [Item, Item, ...]
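As a rough illustration of that nested layout, a denormalized list document might look like the sketch below. The field names `name` and `title` and all the values are assumptions made up for the example; they are not from the question.

```javascript
// Hypothetical shape of a denormalized list document; field names are assumptions.
var list = {
  _id: 'list1',
  user_id: 'user42',
  name: 'Groceries',
  items: [
    { title: 'Milk' },
    { title: 'Eggs' }
  ]
};

// No second query is needed: the items travel with the list.
var titles = list.items.map(function (item) { return item.title; });
console.log(titles.join(', '));
```

With this shape, a single query on user_id returns every list with its items already attached, so there is nothing left to wait for in a loop.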
As a further aside, avoid mongoose, it's inefficient, use the native mongo driver.
I use this module for that:
https://github.com/caolan/async
Related
I have an async operation in a loop which fetches a result and pushes to an array like this:
arr = []
while (some_limit_reaches) {
async_operation(arg, function(err, data) {
if (!err)
arr.push(data)
})
}
// now arr is not completely filled until all async
// operations are finished in the loop above
The problem is that the array is not completely filled until all async operations are done. How can I have a fully filled array after the loop is over without using setTimeout?
You're trying to make an asynchronous operation synchronous. Instead, check whether your desired state has been reached inside the asynchronous callback. Try something like this:
while (some_limit_reaches) {
async_operation(arg, function(err, data) {
if (!err)
arr.push(data);
if(check_if_the_array_is_full){
//Call some function that continues your operation
}
})
}
This way your processing won't continue until the array is full.
This solution is a little bit verbose, but respects the separation of tasks. Its schema is similar to kriskowal's q.all.
var arr = [], limit = 10, i = 0, terminated_operations = 0;
while (i < limit) {
    async_operation(arg, function(err, data) {
        if (!err) {
            arr.push(data);
            operationTerminated(arr);
        }
    });
    i++; // advance the loop, otherwise it never ends
}
function operationTerminated(data) {
    terminated_operations++;
    // fire only once every one of the `limit` operations has reported back
    if (terminated_operations === limit) {
        doStuff(data);
        terminated_operations = 0;
    }
}
function doStuff(data) {
    console.log('all data returned', data);
}
The first snippet represents the core logic. The second function is only a trigger of the action declared in the third one.
Edit:
To answer the original question in the title:
How can I make sure that an async operation does not keep array in my code empty?
I recommend returning data=undefined when async_operation fails, so that you can also accept [] as a valid return value and keep tighter control in the core logic. This way you can rewrite the loop as:
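while (i < limit) {
    async_operation(arg, function(err, data) {
        if (err) {
            console.error('error occurred', err);
            return; // `break` is not allowed inside a callback; return leaves it early
        }
        if (data) {
            arr.push(data);
            operationTerminated(arr);
        }
    });
    i++; // advance the loop, otherwise it never ends
}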
I have a situation where I need to perform logic on a distinct set of values from a mongo collection (A) and then save result to another collection (B). However the contents of (A) will change over time and so I only want to perform the logic on those documents in (A) where there is not a corresponding document in (B). As joins aren't supported, I am trying to do this at the Node level. I am querying all items in collection (A) and using findOne to look for the corresponding entry in collection (B). If I find it, I would like to remove it from the array, but I am stuck because findOne uses an asynchronous callback which doesn't seem to work with the array filter method. Is there a better way to do this:
function loadNewDocumentsFromDB(callback){
db.collection('A', function(err, acollection){
acollection.find().toArray(function(err, arr){
if(arr){
// toQuery is the global array to be used elsewhere
toQuery = arr.map(function(config){
transformed =
{
args: config._id, // these args are a doc representing a unique entry in 'B'
listings: config.value.id.split(',') // used by other functions
};
return transformed;
})
.filter(function(transformed){
db.collection('B', function(err, bcollection){
bcollection.findOne(transformed.args, function(err, doc){
// I want these values returned from the filter function not the callback
if(doc){
return false; // want to remove this from list of toQuery
}else{
return true; // want to keep in my list
}
});
});
});
}
callback();
});
});
}
This was how I managed to get it working:
function loadOptionsFromDB(callback){
toQuery = [];
db.collection('list', function(err, list){
db.collection('data', function(err, data){
list.find().each(function(err, doc){
if(doc){
transformed =
{
args: doc._id,
listings: doc.value.id.split(',')
};
(function(obj){
data.findOne(obj.args, function(err, found){
if(found){}
else{
toQuery.push(obj);
}
});
})(transformed);
}else{
//Done finding
setTimeout(callback, 20000);
}
});
});
});
}
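The fixed 20-second setTimeout above is a guess at how long the lookups take. A counter of pending lookups can replace that guess. The following is only a sketch under stated assumptions: `fakeFindOne`, `existing` and `collectMissing` are hypothetical stand-ins, and a plain array of keys takes the place of the cursor so that the example runs without a database.

```javascript
// Stand-in for data.findOne: reports asynchronously whether a key exists.
// The `existing` object and both helper names are hypothetical.
var existing = { a: true, c: true };
function fakeFindOne(key, cb) {
  setImmediate(function () { cb(null, existing[key] ? { _id: key } : null); });
}

function collectMissing(keys, callback) {
  var toQuery = [];
  var pending = keys.length + 1; // +1 for the "finished iterating" step below

  function done() {
    if (--pending === 0) callback(null, toQuery);
  }

  keys.forEach(function (key) {
    fakeFindOne(key, function (err, found) {
      if (!err && !found) toQuery.push(key); // keep only keys with no match
      done();
    });
  });

  done(); // iteration finished; also covers an empty key list
}

collectMissing(['a', 'b', 'c', 'd'], function (err, missing) {
  console.log(missing);
});
```

In the cursor version, the same idea would mean incrementing the counter for every document seen and calling `done()` in the "Done finding" branch instead of starting a timer.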
A better way would be to do this on the database. Check if 2 executions of http://docs.mongodb.org/manual/core/map-reduce/ would be of any use to you.
See Merging two collections in MongoDB for more information
I am doing a for loop to find the result from mongodb, and concat the array. But I am not getting the final results array when the loop is finished. I am new to node.js, and I think it's not working like objective-c callback.
app.get('/users/self/feed', function(req, res){
var query = Bill.find({user: req.param('userId')});
query.sort('-createdAt');
query.exec(function(err, bill){
if (bill) {
var arr = bill;
Following.findOne({user: req.param('userId')}, function(err, follow){
if (follow) {
var follows = follow.following; //this is a array of user ids
for (var i = 0; i < follows.length; i++) {
var followId = follows[i];
Bill.find({user: followId}, function(err, result){
arr = arr.concat(result);
// res.send(200, arr);// this is working.
});
}
} else {
res.send(400, err);
}
});
res.send(200, arr); //if put here, i am not getting the final results
} else {
res.send(400, err);
}
})
});
While I'm not entirely familiar with MongoDB, a quick reading of their documentation shows that they provide an asynchronous Node.js interface.
That said, both the findOne and find operations start, but don't necessarily complete by the time you reach res.send(200, arr), meaning arr will not yet include the results of those queries.
Instead, you should send your response back once all asynchronous calls complete, meaning you could do something like:
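var billsToFind = follows.length;
if (billsToFind === 0) {
    res.send(200, arr); // nobody is followed: no callback will fire, so respond now
}
for (var i = 0; i < follows.length; i++) {
    var followId = follows[i];
    Bill.find({user: followId}, function(err, result){
        if (!err) {
            arr = arr.concat(result);
        }
        billsToFind -= 1; // count failed queries too, so the response is always sent
        if(billsToFind === 0){
            res.send(200, arr);
        }
    });
}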
The approach uses a counter for all of the inner async calls (I'm ignoring the findOne because we're currently inside its callback anyway). As each Bill.find call completes it decrements the counter and once it reaches 0 it means that all callbacks have fired (this works since Bill.find is called for every item in the array follows) and it sends back the response with the full array.
That's true. The queries started inside your for loop run concurrently, and their callbacks fire only after the loop has finished (by which point i already has its final value, I think). If you add a console.log inside the callback and another after the for loop, you will find that the outside one is printed before the inside one.
You can wrap the code inside your for loop in an array of functions and execute them with the async module (https://www.npmjs.org/package/async), in parallel or in series, then retrieve the final result from the last parameter of async.parallel or async.series.
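To show the "array of functions" idea without installing anything, here is a minimal hand-rolled helper in the spirit of async.parallel. It is not the library itself: `parallel` and `findBills` are hypothetical names, and `findBills` only simulates a Bill.find call with a timer.

```javascript
// Minimal hand-rolled helper in the spirit of async.parallel (not the library itself).
function parallel(tasks, done) {
  var results = new Array(tasks.length);
  var remaining = tasks.length;
  var failed = false;
  if (remaining === 0) return done(null, results);

  tasks.forEach(function (task, index) {
    task(function (err, value) {
      if (failed) return;              // an earlier task already reported an error
      if (err) { failed = true; return done(err); }
      results[index] = value;          // keep results in task order
      if (--remaining === 0) done(null, results);
    });
  });
}

// Hypothetical stand-in for Bill.find({user: id}, cb).
function findBills(userId, cb) {
  setTimeout(function () { cb(null, ['bill-of-' + userId]); }, 5);
}

var follows = ['u1', 'u2', 'u3'];
// map() gives every task its own `id`, so no task depends on the loop variable.
var tasks = follows.map(function (id) {
  return function (cb) { findBills(id, cb); };
});

parallel(tasks, function (err, lists) {
  if (err) return console.error(err);
  var all = [].concat.apply([], lists); // flatten one level
  console.log(all);
});
```

The final callback is the single place where the response would be sent, which is the role the last parameter plays in the async module as well.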
I have a web server running on Node.js and Express which retrieves data from MongoDB. Collections are created dynamically in MongoDB, and the name of each newly created collection is stored in a metadata collection, “project”. My requirement is to first iterate over the metadata collection to get the collection names, and then run multiple queries against each collection based on some conditions. Because my collection metadata is dynamic, I have tried to do this using a for loop.
But it gives the wrong data. It does not execute sequentially: it returns the value before the loop has finished executing. How can I perform sequential execution in Node.js using only Node core modules (not another library like async)?
exports.projectCount = function (req, res) {
var mongo = require("mongodb"),
Server = mongo.Server,
Db = mongo.Db;
var server = new Server("localhost", 27017, {
auto_reconnect: true
});
var db = new Db("test", server);
// global JSON object to store manipulated data
var projectDetail = {
projectCount: 0,
projectPercent: 0
};
var totalProject = 0;
db.open(function (err, collection) {
//metadata collection
collection = db.collection("project");
collection.find().toArray(function (err, result) {
// Length of metadata collection
projectDetail.projectCount = result.length;
var count = 0;
//iterate through each of the array which is the name of collection
result.forEach(function (item) {
//change collection object to new collection
collection = db.collection(item.keyParameter.wbsName);
// Perform first query based on some condition
collection.find({
$where: "this.status == 'Created'"
}).toArray(function (err, result) {
// based on result of query one increment the value of count
count += result.length;
// Perform second query based on some condition
collection.find({
$where: "this.status=='Completed'"
}).toArray(function (err, result) {
count += result.length;
});
});
});
// it is returning the value without finishing the above manipulation
// not waiting for above callback and value of count is coming zero .
res.render('index', {
projectDetail: projectDetail.projectCount,
count: count
});
});
});
};
When you want to call multiple asynchronous functions in order, you should call the first one, call the next one in its callback, and so on. The code would look like:
asyncFunction1(args, function () {
asyncFunction2(args, function () {
asyncFunction3(args, function () {
// ...
})
})
});
Using this approach, you may end up with an ugly hard-to-maintain piece of code.
There are various ways to achieve the same functionality without nesting callbacks, like using async.js or node-fibers.
Here is how you can do it using node.js EventEmitter:
var events = require('events');
var EventEmitter = events.EventEmitter;
var flowController = new EventEmitter();
flowController.on('start', function (start_args) {
asyncFunction1(args, function () {
flowController.emit('2', next_function_args);
});
});
flowController.on('2', function (args_coming_from_1) {
asyncFunction2(args, function () {
flowController.emit('3', next_function_args);
});
});
flowController.on('3', function (args_coming_from_2) {
asyncFunction3(args, function () {
// ...
});
});
flowController.emit('start', start_args);
For loop simulation example:
var events = require('events');
var EventEmitter = events.EventEmitter;
var flowController = new EventEmitter();
var items = ['1', '2', '3'];
flowController.on('doWork', function (i) {
if (i >= items.length) {
flowController.emit('finished');
return;
}
asyncFunction(items[i], function () {
flowController.emit('doWork', i + 1);
});
});
flowController.on('finished', function () {
console.log('finished');
});
flowController.emit('doWork', 0);
Use callbacks or promises or a flow control library. You cannot program servers in node without understanding at the very least one of these approaches, and honestly all halfway decent node programmers thoroughly understand all three of them (including a handful of different flow control libraries).
This is not something where you can just get an answer coded for you by someone else on Stack Overflow and then move on. It is a fundamental thing you have to study and learn generically, as it is going to come up over and over again on a daily basis.
http://howtonode.org/control-flow
http://callbackhell.com/
Per the resources in the answer above, nesting the callback when you iterate and only calling it when you are on the last iteration will solve your problem.
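One way to read that advice, using nothing beyond core JavaScript, is a small step function that starts the next item only from the previous item's callback and calls the final callback after the last one. This is a sketch under assumptions: `eachInSeries` and `countFor` are hypothetical names, and `countFor` merely simulates a per-collection query.

```javascript
// Hypothetical stand-in for an asynchronous per-collection count query.
function countFor(name, cb) {
  setImmediate(function () { cb(null, name.length); });
}

// Processes items one at a time; calls `done` only after the last one.
function eachInSeries(items, worker, done) {
  var index = 0;
  function step(err) {
    if (err) return done(err);
    if (index >= items.length) return done(null);
    worker(items[index++], step);
  }
  step(null);
}

var total = 0;
eachInSeries(['alpha', 'beta', 'gamma'], function (name, next) {
  countFor(name, function (err, n) {
    if (err) return next(err);
    total += n;
    next(null);
  });
}, function (err) {
  if (err) return console.error(err);
  console.log('total:', total); // runs only after every item has been handled
});
```

In the route handler from the question, the res.render call would go in the final callback, so it cannot run before the counts are in.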
I am trying to do exactly what this mongo example is doing, but in mongoose. It seems more complex to me in mongoose. Possibly I'm trying to fit a square peg in a round hole?
This example is from http://www.codeproject.com/Articles/521713/Storing-Tree-like-Hierarchy-Structures-With-MongoD (tree structure with parent reference)
I'm trying to build a path.
var path=[];
var item = db.categoriesPCO.findOne({_id:"Nokia"});
while (item.parent !== null) {
item=db.categoriesPCO.findOne({_id:item.parent});
path.push(item._id);
}
path.reverse().join(' / ');
Thanks!
Mongoose is an asynchronous library, so
db.categoriesPCO.findOne({_id:"Nokia"});
doesn't return the answer to the query, it just returns a Query object itself. In order to actually run the query, you'll need to either pass in a callback function to findOne() or run exec() on the Query object returned.
db.categoriesPCO.findOne({_id:"Nokia"}, function (err, item) {
});
However, you can't use the same while loop code to generate the path, so you'll need to use recursion instead. Something like this should work:
var path=[];
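function addToPath(id, callback) {
    db.categoriesPCO.findOne({_id:id}, function (err, item) {
        if (err) {
            return callback(err);
        }
        if (!item) {
            // no document with this _id; stop instead of reading item._id on null
            return callback(new Error('category not found: ' + id));
        }
        path.push(item._id);
        if (item.parent !== null) {
            addToPath(item.parent, callback);
        }
        else {
            callback();
        }
    });
}
addToPath("Nokia", function (err) {
    if (err) { return console.log(err); }
    // the path is only complete here, inside the final callback
    console.log(path.reverse().join(' / '));
});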
NB In addition, instead of pushing new items onto the end of the path array and then reversing it, you could use path.unshift() which adds the item to the beginning of the array.
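A sketch of the unshift variant, run against an in-memory stand-in so it works without a database. The `categories` data, `findById` and `buildPath` are all hypothetical; `findById` only mimics the callback shape of findOne.

```javascript
// In-memory stand-in for the categories collection (parent references); data is hypothetical.
var categories = {
  Electronics: { _id: 'Electronics', parent: null },
  Phones: { _id: 'Phones', parent: 'Electronics' },
  Nokia: { _id: 'Nokia', parent: 'Phones' }
};

// Mimics the callback shape of findOne({_id: id}, cb).
function findById(id, cb) {
  setImmediate(function () { cb(null, categories[id] || null); });
}

function buildPath(id, path, callback) {
  findById(id, function (err, item) {
    if (err) return callback(err);
    if (!item) return callback(new Error('No category with _id ' + id));
    path.unshift(item._id);            // prepend, so no reverse() is needed
    if (item.parent !== null) return buildPath(item.parent, path, callback);
    callback(null, path.join(' / '));  // root reached: hand back the joined path
  });
}

buildPath('Nokia', [], function (err, result) {
  if (err) return console.error(err);
  console.log(result);
});
```

Passing the path array as an argument, rather than sharing a module-level variable, keeps two concurrent lookups from writing into the same array.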