Is it possible to load a Node.js module asynchronously?
This is the standard code:
var foo = require("./foo.js"); // waiting for I/O
foo.bar();
But I would like to write something like this:
require("./foo.js", function(foo) {
foo.bar();
});
// doing something else while the hard drive is crunching...
Is there a way to do this? Or is there a good reason why callbacks in require aren't supported?
While require is synchronous, and Node.js does not provide an asynchronous variant out of the box, you can easily build one for yourself.
First of all, you need to create a module. In my example I am going to write a module that loads data asynchronously from the file system, but of course YMMV. So, to start with, here is the old-fashioned, unwanted, synchronous approach:
var fs = require('fs');
var passwords = fs.readFileSync('/etc/passwd');
module.exports = passwords;
You can use this module as usual:
var passwords = require('./passwords');
Now, what you want to do is turn this into an asynchronous module. As you can not delay module.exports, what you do instead is instantly export a function that does the work asynchronously and calls you back once it is done. So you transform your module into:
var fs = require('fs');

module.exports = function (callback) {
  fs.readFile('/etc/passwd', function (err, data) {
    callback(err, data);
  });
};
Of course you can shorten this by directly providing the callback variable to the readFile call, but I wanted to make it explicit here for demonstration purposes.
Now when you require this module, at first, nothing happens, as you only get a reference to the asynchronous (anonymous) function. What you need to do is call it right away and provide another function as callback:
require('./passwords')(function (err, passwords) {
  // This code runs once the passwords have been loaded.
});
Using this approach you can, of course, turn any arbitrary synchronous module initialization into an asynchronous one. But the trick is always the same: Export a function, call it right from the require call, and provide a callback that continues execution once the asynchronous code has run.
Please note that for some people
require('...')(function () { ... });
may look confusing. Hence it may be better (although this depends on your actual scenario) to export an object with an asynchronous initialize function or something like that:
var fs = require('fs');

module.exports = {
  initialize: function (callback) {
    fs.readFile('/etc/passwd', function (err, data) {
      callback(err, data);
    });
  }
};
You can then use this module as follows:
require('./passwords').initialize(function (err, passwords) {
  // ...
});
which may be slightly more readable.
Of course you can also use promises or any other asynchronous mechanism which makes your syntax look nicer, but in the end, it (internally) always comes down to the pattern I just described here. Basically, promises & co. are nothing but syntactic sugar over callbacks.
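For illustration, here is a minimal sketch of the promise-returning variant of the passwords module (assuming a Promise implementation is available, e.g. native promises in current Node.js, or a library such as bluebird on older versions):

var fs = require('fs');

// the read starts as soon as this module is first required;
// the settled promise is then shared by all requirers via module caching
module.exports = new Promise(function (resolve, reject) {
  fs.readFile('/etc/passwd', function (err, data) {
    if (err) { reject(err); return; }
    resolve(data);
  });
});

Consumers would then write require('./passwords').then(function (passwords) { ... });.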
Once you build your modules like this, you can even build a requireAsync function that works like you initially suggested in your question. All you have to do is stick with a name for the initialization function, such as initialize. Then you can do:
var requireAsync = function (module, callback) {
  require(module).initialize(callback);
};

requireAsync('./passwords', function (err, passwords) {
  // ...
});
Please note that, of course, loading the module itself will still be synchronous due to the limitations of the require function, but all the rest will be asynchronous, as you wish.
One final note: If you want to actually make loading modules asynchronous, you could implement a function that uses fs.readFile to asynchronously load a file, and then run it through an eval call to actually execute the module, but I'd highly recommend against this: On the one hand, you lose all the convenience features of require such as caching & co., on the other hand you'll have to deal with eval - and as we all know, eval is evil. So don't do it.
Nevertheless, if you still want to do it, basically it works like this:
var fs = require('fs');

var requireAsync = function (modulePath, callback) {
  fs.readFile(modulePath, { encoding: 'utf8' }, function (err, data) {
    var module = {
      exports: {}
    };
    var code = '(function (module) {' + data + '})(module)';
    eval(code);
    callback(null, module);
  });
};
Please note that this code is not "nice", that it lacks any error handling and any of the other capabilities of the original require function, but basically it fulfills your demand of being able to asynchronously load synchronously designed modules.
Anyway, you can use this function with a module like
module.exports = 'foo';
and load it using:
requireAsync('./foo.js', function (err, module) {
  console.log(module.exports); // => 'foo'
});
Of course you can export anything else as well. To be more compatible with the original require function, it may be better to run
callback(null, module.exports);
as the last line of your requireAsync function, as then you have direct access to the exports object (which is the string foo in this case). Due to the fact that you wrap the loaded code inside an immediately executed function, everything in this module stays private, and the only interface to the outer world is the module object you pass in.
Of course one can argue that this usage of eval is not the best idea in the world, as it opens up security holes and so on - but if you require a module, you basically do nothing other than eval-uate it anyway. The point is: If you don't trust the code, eval is just as bad an idea as require. Hence in this special case, it might be fine.
If you are using strict mode, eval is no good for you, and you need to go with the vm module and use its runInNewContext function. Then, the solution looks like:
var fs = require('fs');
var vm = require('vm');

var requireAsync = function (modulePath, callback) {
  fs.readFile(modulePath, { encoding: 'utf8' }, function (err, data) {
    var sandbox = {
      module: {
        exports: {}
      }
    };
    var code = '(function (module) {' + data + '})(module)';
    vm.runInNewContext(code, sandbox);
    callback(null, sandbox.module.exports); // or sandbox.module, as above
  });
};
The npm module async-require can help you to do this.
Install
npm install --save async-require
Usage
var asyncRequire = require('async-require');

// Load script myModule.js
asyncRequire('myModule').then(function (module) {
  // module has been exported and can be used here
  // ...
});
The module uses vm.runInNewContext(), a technique discussed in the accepted answer. It has bluebird as a dependency.
(This solution appeared in an earlier answer but that was deleted by review.)
Yes - export a function accepting a callback, or maybe even export a full-featured promise object.
// foo.js + callback:
module.exports = function(cb) {
  setTimeout(function() {
    console.log('module loaded!');
    var fooAsyncImpl = {};
    // add methods, for example from db lookup results
    fooAsyncImpl.bar = console.log.bind(console);
    cb(null, fooAsyncImpl);
  }, 1000);
}

// usage
require("./foo.js")(function(foo) {
  foo.bar();
});
// foo.js + promise
var Promise = require('bluebird');
module.exports = new Promise(function(resolve, reject) {
  // async code here;
});

// using foo + promises
require("./foo.js").then(function(foo) {
  foo.bar();
});
Andrey's code above is the simplest answer that works, but it had a small mistake, so I'm posting the correction here as an answer. Also, I'm just using callbacks, not bluebird / promises like Andrey's code.
/* 1. Create a module that does the async operation - request etc. */

// foo.js + callback:
module.exports = function(cb) {
  setTimeout(function() {
    console.log('module loaded!');
    var foo = {};
    // add methods, for example from db lookup results
    foo.bar = function(test){
      console.log('foo.bar() executed with ' + test);
    };
    cb(null, foo);
  }, 1000);
}

/* 2. From another module you can require the first module and specify your callback function */

// usage
require("./foo.js")(function(err, foo) {
  foo.bar('It Works!');
});
/* 3. You can also pass in arguments from the invoking function that can be utilised by the module - e.g. the "It Works!" argument */
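As a variation on the same pattern, the exported function can also take its own arguments ahead of the callback. This is only a hedged sketch; the file name and fields below are made up:

// configured.js - a hypothetical module whose factory takes an argument
var fs = require('fs');
module.exports = function(configPath, cb) {
  fs.readFile(configPath, 'utf8', function(err, data) {
    if (err) return cb(err);
    cb(null, { config: data });
  });
};

// usage
require('./configured.js')('/etc/app.conf', function(err, mod) {
  // mod.config holds the file contents here
});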
For anyone who uses ESM modules and top-level await, this will just work out of the box, without using callbacks with CommonJS require or installing any packages like async-require.
// In foo.mjs
const foo = await doIOstuffHere();
export default foo;

// in bar.mjs
import foo from "./foo.mjs";
foo.bar(); // this function will not run till the async work in foo.mjs is finished
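A closely related sketch: dynamic import() also returns a promise, so a call site that wants to load a module asynchronously on demand can await it (assuming the same foo.mjs as above):

// in bar.mjs, with top-level await available
const { default: foo } = await import("./foo.mjs");
foo.bar();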
Related
I am so new to node and js that I had no idea about Callbacks and async programming until yesterday, so speak to me like I'm an idiot, 'cos I am...
With the death of mixture.io I thought I would write my own little static site builder. I looked at gulp and grunt but plumped for using npm as a build tool.
Building the CSS, minifying, listing, etc. was super easy, but when it came to building the pages, life quickly descended into callback hell.
A bit of reading and I have a start for the page building script:
var fm = require('front-matter'),
    fs = require('fs'),
    glob = require('glob'),
    md = require('marked');

const SEARCHPATH = "content/pages/";

var pages = [];

function searchFiles () {
  glob("*.md", { cwd: SEARCHPATH }, readFiles);
}

function readFiles (err, files) {
  if (err) throw err;
  for (var file of files) {
    fs.readFile(SEARCHPATH + file, 'utf8', processFiles);
  }
}

function processFiles(err, data) {
  if (err) throw err;
  var attributes = fm(data).attributes;
  var content = md(fm(data).body);
  pages.push(attributes, content);
  applyTemplate(pages);
}

function applyTemplate(pages) {
  console.log(pages);
}

searchFiles();
But it looks for all the world like I'm about to descend into daisy-chain hell where each function calls the next, but I can't access the pages variable without doing so.
It all seems a bit off.
Am I thinking about this right? What would be a better way to structure this programmatically?
Thanks as ever Overflowers.
You broke out all of the callbacks into function declarations instead of inline expressions, so that is already +1, because you have function objects that can be exported and tested.

For this response I'm assuming the priority is isolated programmatic unit tests, without mocking require (which is what I usually find myself refactoring legacy Node.js towards when I enter a new project).

When I go down this route of nested callbacks, I think the hardest thing to work with is a nested chain of anonymous expressions as callbacks (in pseudocode):
function doFiles() {
  glob('something', function(files) {
    for (f in files) {
      fs.readFile(file, function(err, data) {
        processFile(data);
      });
    }
  });
}
Testing the above programmatically is pretty convoluted. The only way to do it is to mock require. To test that the readFile callback is working, you have to control all calls before it! Which violates isolation in tests.

The 2nd best approach, imo, is to break out callbacks as you have done, which allows better isolation for unit tests but still requires mocking require for fs and glob.

The 3rd best approach, imo, is injecting all of a function's dependencies, to allow easy configuration of mock objects. This often looks very weird, but for me the goal is 100% coverage, in isolated unit tests, without using a mock-require library. It makes each function an isolated unit that is easy to test and configure mock objects for, but often makes calling that function more convoluted!
To achieve this:
function searchFiles () {
  glob("*.md", { cwd: SEARCHPATH }, readFiles);
}
Would become
function searchFiles (getFiles, getFilesCallback) {
  getFiles("*.md", { cwd: SEARCHPATH }, getFilesCallback);
}
Then it could be called with
searchFiles(glob, readFiles)
This looks a little funky because it is a one line function, but illustrates how to inject the dependencies into your functions, so that your tests can configure mock objects and pass them directly to the function. Refactoring readFiles to do this:
function readFiles (err, files, readFile, processFileCb) {
  if (err) throw err;
  for (var file of files) {
    readFile(SEARCHPATH + file, 'utf8', processFileCb);
  }
}
readFiles takes in a readFile method (fs.readFile) and a callback to execute once the file is read, which allows easy configuration of mock objects in programmatic testing.

Then tests could be, in pseudocode:
it('throws err when error is found', function() {
  var error = true;
  assert throws readFiles(error)
});

it('calls readFile for every file in files', function() {
  var files = ['file1'];
  var error = false;
  var readFile = createSpyMaybeSinon?();
  var spyCallback = createSpy();
  readFiles(error, files, readFile, spyCallback);
  assert(readFile.calls.count(), files.length)
  assert readFile called with searchpath + file1, 'utf8', spyCallback
});
Once you have these functions that require the client to provide all of the function's dependencies, calling them requires a dance of creative binding of callbacks, or small functional expressions to wrap the calls, as shown in the sketch below.
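For example, the production wiring might look like this (a sketch reusing the functions defined above):

var fs = require('fs');
var glob = require('glob');

// bind the real dependencies back in for production use
searchFiles(glob, function (err, files) {
  readFiles(err, files, fs.readFile, processFiles);
});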
The above assumes an end goal of complete test coverage without mocking require, which might not be your goal :)
A "cleaner" approach imo is just to use promises from the beginnning, which is a wonderful abstraction over asynchronous calls.
I'm defining a class which instantiates several modules which depend on previous ones. The modules themselves may require an async operation before they are ready (e.g. establishing a MySQL connection), so I've provided each constructor with a callback to be called once the module is ready. However, I've run into a problem when instantiating classes which are ready immediately:
var async = require('async');

var child = function(parent, cb) {
  var self = this;
  this.ready = false;
  this.isReady = function() {
    return self.ready;
  }

  /* This does not work, throws error below stating c1.isReady is undefined */
  cb(null, true);

  /* This works */
  setTimeout(function() {
    self.ready = true;
    cb(null, true);
  }, 0);
}
var Parent = function(cb) {
  var self = this;
  async.series([
    function(callback){
      self.c1 = new child(self, callback);
    },
    function(callback){
      self.c2 = new child(self, callback);
    }
  ],
  function(err, results){
    console.log(self.c1.isReady(), self.c2.isReady());
    console.log(err, results);
  });
}

var P = new Parent();
I'm guessing the issue is that calling cb within the constructor means async proceeds to the next function before the constructor finishes. Is there a better approach to this? I considered using promises, but I find this approach easier to understand/follow.
You will have to delay the call to the callback when everything is synchronous, because any code in the callback won't be able to reference the object yet (as you've seen). Because the constructor hasn't finished executing, the assignment of its return value also hasn't finished yet.
You could pass the object to the callback and force the callback to use that reference (which will exist and be fully formed), but you're requiring people to know that they can't use a normally accepted practice so it's much better to make sure the callback is only called asynchronously and people can then write their code in the normal ways they would expect to.
In node.js, it will be more efficient to use process.nextTick() instead of setTimeout() to accomplish your goal. See this article for more details.
var child = function(parent, cb) {
  var self = this;
  this.ready = false;
  this.isReady = function() {
    return self.ready;
  }

  /* This works */
  process.nextTick(function() {
    self.ready = true;
    cb(null, true);
  });
}
Here's a general observation. Many people (myself included) think that it overcomplicates things to put any async operation in a constructor for reasons related to this. Instead, most objects that need to do async operations in order to get themselves set up will offer a .init() or .connect() method or something like that. Then, you construct the object like you normally would in a synchronous fashion and then separately initiate the async part of the initialization and pass it the callback. That gets you away from this issue entirely.
If/when you want to use promises for tracking your async operation (which is great feature direction to go), it's way, way easier to let the constructor return the object and the .init() operation to return a promise. It gets messy to try to return both the object and a promise from the constructor and even messier to code with that.
Then, using promises you can do this:
var o = new child(p);
o.init().then(function() {
  // object o is fully initialized now
  // put code in here to use the object
}, function(err) {
  // error initializing object o
});
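For completeness, a minimal sketch of what such a child class could look like (the names are assumptions, and process.nextTick stands in for real async setup such as opening a connection):

var child = function (parent) {
  this.parent = parent;
  this.ready = false;
};

child.prototype.init = function () {
  var self = this;
  return new Promise(function (resolve, reject) {
    // stand-in for real async setup (e.g. connecting to MySQL)
    process.nextTick(function () {
      self.ready = true;
      resolve(self);
    });
  });
};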
If you can't put the object into a Promise, put a promise into the object.
Give it a Ready property that is a promise.
If you have a class hierarchy (I'm talking about TypeScript now) of (say) widgets, the base class can define the Ready property and assign it an already resolved Promise object. Doing this in a base class means it's always safe to write code like this:
var foo = new Foo();
foo.Ready.then(() => {
  // do stuff that needs foo to be ready
});
Derived classes can take control of promise resolution by replacing the value of Ready with a new promise object, and resolving it when the async code completes.
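A minimal sketch of that pattern (written in plain JavaScript here; connectToDb is a hypothetical async setup step):

function Widget() {
  // base class: nothing async to do, so Ready is an already resolved promise
  this.Ready = Promise.resolve();
}

function DbWidget() {
  Widget.call(this);
  var self = this;
  // derived class: replace Ready with a promise it resolves itself
  this.Ready = new Promise(function (resolve, reject) {
    connectToDb(function (err, db) { // hypothetical async setup
      if (err) { reject(err); return; }
      self.db = db;
      resolve();
    });
  });
}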
Here's a comprehensive workup of the Asynchronous Constructor design pattern.
In Node.js, a lot of function calls are implemented as non-blocking functions. Sometimes there can be dependencies between these function calls. For example, when a user logs in, I first want to get the user's type, and then, according to the user type, get the tasks assigned to that user type. With non-blocking functions, I use nested function calls:
$http.post(url_login, login_request)
  .success(function(req) {
    $http.post(url_get_tasks, req.type)
      .success(function(req) {
        //update tasks
      })
  })
A lot of libraries provide non-blocking functions only, for example the node_redis library. Is there a good way to handle nested non-blocking function calls to solve this dependency problem?
There is a page dedicated to this exact problem: http://callbackhell.com/

To sum up, you have the option to:

use named functions, as codebox suggests
use the async library (see the sketch below)
or use promises
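A sketch of the async-library option (this assumes npm install async and a hypothetical node-style post(url, body, cb) helper, since $http is promise-based rather than callback-based):

var async = require('async');

async.waterfall([
  function (cb) {
    post(url_login, login_request, cb); // post() is hypothetical
  },
  function (req, cb) {
    post(url_get_tasks, req.type, cb);
  }
], function (err, tasks) {
  // errors from either call land here; otherwise update the tasks
});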
If I am not mistaken, you are using the Angular $http service, which should support proper promise chaining, so you could reduce your code to:
$http.post(url_login, login_request)
  .then(function(req){
    return $http.post(url_get_tasks, req.type);
  })
  .then(function(req){
    // update tasks
  })
  .catch(function(err){
    // errors from the first AND second call land here
  })
You can write a simple wrapper for async functions to turn them into promises:
// this assumes standard node.js callbacks with the
// signature (error, data)
function toPromise(fn){
  return function() {
    var args = Array.prototype.slice.call(arguments);
    var scope = this;
    return new Promise(function(resolve, reject){
      var cb = function(err, data) {
        return err ? reject(err) : resolve(data);
      };
      args.push(cb);
      fn.apply(scope, args);
    });
  };
}
// turn fs.readdir into a promise
var readdir = toPromise(require('fs').readdir);

readdir('.')
  .then(function(files){
    console.log(files);
  })
  .catch(function(err){
    console.log('reading files failed');
  })
There was another question regarding Angular $http promises which might help you understand them better: here
I'm not 100% sure what you are asking, but if you want to reduce nesting in your code you can move the inner functions out like this:
function success1(req) {
  //update tasks
}

function success2(req) {
  $http.post(url_get_tasks, req.type).success(success1)
}

$http.post(url_login, login_request).success(success2)
obviously you should pick better names than success1 though
I am struggling with getting my head around how to overcome and handle the async nature of Node.JS. I have done quite a bit of reading on it and tried to make Node do what I want by either using a message passing solution or callback functions.
My problem is I have an object where I want the constructor to load a file and populate an array. Then I want all calls to this function to use that loaded data. So I need the original call to wait for the file to be loaded and all subsequent calls to use the already loaded private member.

My issue is that the function to load the data and get the data is being executed async even if it returns a function with a callback.
Anyways, is there something simple I am missing? Or is there an easier pattern I could use here? This function should return part of the loaded file but returns undefined. I have checked that the file is actually being loaded, and works correctly.
function Song() {
  this.verses = undefined;

  this.loadVerses = function(verseNum, callback) {
    if (this.verses === undefined) {
      var fs = require('fs'),
          filename = 'README.md';
      fs.readFile(filename, 'utf8', function(err, data) {
        if (err) {
          console.log('error throw opening file: %s, err: %s', filename, err);
          throw err;
        } else {
          this.verses = data;
          return callback(verseNum);
        }
      });
    } else {
      return callback(verseNum);
    }
  }

  this.getVerse = function(verseNum) {
    return this.verses[verseNum + 1];
  }
}

Song.prototype = {
  verse: function(input) {
    return this.loadVerses(input, this.getVerse);
  }
}

module.exports = new Song();
Update:
This is how I am using the song module from another module
var song = require('./song');
return song.verse(1);
"My issue is that the function to load the data and get the data is being executed async even if it return a function with a callback."
#AlbertoZaccagni what I mean by that scentence is that this line return this.loadVerses(input, this.getVerse); returns before the file is loaded when I expect it to wait for the callback.
That is how node works; I will try to clarify it with an example.
function readFile(path, callback) {
  console.log('about to read...');
  fs.readFile(path, 'utf8', function(err, data) {
    callback();
  });
}

console.log('start');
readFile('/path/to/the/file', function() {
  console.log('...read!');
});
console.log('end');
console.log('end');
You are reading a file and in the console you will likely have
start
about to read...
end
...read!
You can try that separately to see it in action and tweak it to understand the point. What's important to notice here is that your code will keep on running, skipping the execution of the callback, until the file is read.
Just because you declared a callback does not mean that the execution will halt until the callback is called and then resumed.
So this is how I would change that code:
function Song() {
  var self = this;
  this.verses = undefined;

  this.loadVerses = function(verseNum, callback) {
    if (self.verses === undefined) {
      var fs = require('fs'),
          filename = 'README.md';
      fs.readFile(filename, 'utf8', function(err, data) {
        if (err) {
          console.log('error throw opening file: %s, err: %s', filename, err);
          throw err;
        } else {
          // note: `self` is used here because `this` inside
          // the readFile callback is not the Song instance
          self.verses = data;
          return callback(verseNum);
        }
      });
    } else {
      return callback(verseNum);
    }
  }
}

Song.prototype = {
  verse: function(input, callback) {
    // I've removed returns here
    // I think they were confusing you, feel free to add them back in
    // but they are not actually returning your value, which is instead an
    // argument of the callback function
    var self = this;
    this.loadVerses(input, function(verseNum) {
      callback(self.verses[verseNum + 1]);
    });
  }
}

module.exports = new Song();
To use it:
var song = require('./song');

song.verse(1, function(verse) {
  console.log(verse);
});
I've ignored:

the fact that we're not treating the error as the first argument of the callback
the fact that calling this fast enough will create race conditions, but I believe this is another question
[Collected into an answer and expanded from my previous comments]
TL;DR You need to structure your code such that the result of any operation is only used inside that operation's callback, since you do not have access to it anywhere else.
And while assigning it to an external global variable will certainly work as expected, doing so will only occur after the callback has fired, which happens at a time you cannot predict.
Commentary
Callbacks do not return values because by their very nature, they are executed sometime in the future.
Once you pass a callback function into a controlling asynchronous function, it will be executed when the surrounding function decides to call it. You do not control this, and so waiting for a returned result won't work.
Your example code, song.verse(1); cannot be expected to return anything useful because it is called immediately and, since the callback hasn't yet fired, will simply return the only value it can: undefined.
I'm afraid this reliance on asynchronous functions with passed callbacks is an irremovable feature of how NodeJS operates; it is at the very core of it.
Don't be disheartened though. A quick survey of all the NodeJS questions here shows quite clearly that this idea that one must work with the results of async operations only in their callbacks is the single greatest impediment to anyone understanding how to program in NodeJS.
For a truly excellent explanation/tutorial on the various ways to correctly structure NodeJS code, see Managing Node.js Callback Hell with Promises, Generators and Other Approaches.
I believe it clearly and succinctly describes the problem you face and provides several ways to refactor your code correctly.
Two of the features mentioned there, Promises and Generators, are programming features/concepts, the understanding of which would I believe be of great use to you.
Promises (or as some call them, Futures) are a programming abstraction that allows one to write code a little more linearly in an "if this then that" style, like:
fs.readFileAsync(path).then(function(data){
  /* do something with data here */
  return result;
}).catch(function(err){
  /* deal with errors from readFileAsync here */
}).then(function(result_of_last_operation){
  /* do something with result_of_last_operation here */
  if(there_is_a_problem) throw new Error('there is a problem');
  return final_result;
}).catch(function(err){
  /* deal with errors when there_is_a_problem here */
}).done(function(final_result){
  /* do something with the final result */
});
In reality, Promises are simply a means of marshaling the standard callback pyramid in a more linear fashion. (Personally I believe they need a new name, since the idea of "a promise of some value that might appear in the future" is not an easy one to wrap one's head around, at least it wasn't for me.)
Promises do this by (behind the scenes) restructuring "callback hell" such that:
asyncFunc(args, function callback(err, result){
  if(err) throw err;
  /* do something with the result here */
});
becomes something more akin to:
var p = function(){
  return new Promise(function(resolve, reject){
    asyncFunc(args, function callback(err, result){
      if(err) return reject(err);
      resolve(result);
    });
  });
};

p();
where any value you provide to resolve() becomes the only argument to the next "then-able" callback, and any error is passed via reject(), so it can be caught by any .catch(function(err){ ... }) handlers you define.
Promises also do all the things you'd expect from the (somewhat standard) async module, like running callbacks in series or in parallel and operating over the elements of an array, returning their collected results to a callback once all the results have been gathered.
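For instance, the Promise analogue of async.map is Promise.all over a mapped array (using the promisified fs.readFileAsync mentioned above; the paths array is just example input):

var paths = ['a.txt', 'b.txt'];

Promise.all(paths.map(function (path) {
  return fs.readFileAsync(path, 'utf8');
})).then(function (contents) {
  // contents[i] corresponds to paths[i]
});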
But you will note that Promises don't quite do what you want, because everything is still in callbacks.
(See bluebird for what I believe is the simplest and thus, best Promises package to learn first.)
(And note that fs.readFileAsync is not a typo. One useful feature of bluebird is that it can be made to add this and other Promises-based versions of fs's existing functions to the standard fs object. It also understands how to "promisify" other modules such as request and mkdirp).
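Concretely, that one-time setup with bluebird looks like this:

var Promise = require('bluebird');
// adds readFileAsync, readdirAsync, ... alongside the original fs methods
var fs = Promise.promisifyAll(require('fs'));

fs.readFileAsync('README.md', 'utf8').then(function (data) {
  console.log(data.length);
});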
Generators are the other feature described in the tutorial above, but are only available in the new, updated but not yet officially released version of JavaScript (codenamed "Harmony").
Using generators would also allow you to write code in a more linear manner, since one of the features it provides is the ability of waiting on the results of an asynchronous operation in a way that doesn't wreak havoc with the JavaScript event-loop. (But as I said, it's not a feature in general use yet.)
You can however use generators in the current release of node if you'd like, simply add "--harmony" to the node command line to tell it to turn on the newest features of the next version of JavaScript.
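To give a flavor of the generator style, here is a deliberately tiny runner (a simplified stand-in for libraries like co, using thunks rather than promises to keep it short; error handling is omitted):

function run(gen) {
  var it = gen();
  (function step(err, value) {
    var res = err ? it.throw(err) : it.next(value);
    if (res.done) return;
    res.value(step); // each yielded value must be a thunk: function (cb)
  })();
}

var fs = require('fs');

run(function* () {
  // the generator "pauses" here until readFile calls back
  var data = yield function (cb) { fs.readFile('README.md', 'utf8', cb); };
  console.log(data.length);
});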
I currently have a database connection module containing the following:
var mongodb = require("mongodb");
var client = mongodb.MongoClient;
client.connect('mongodb://host:port/dbname', { auto_reconnect: true },
function(err, db) {
if (err) {
console.log(err);
} else {
// export db as member of exports
module.exports.db = db;
}
}
);
I can then successfully access it by doing the following:
users.js
var dbConnection = require("./db.js");
var users = dbConnection.db.collection("users");
users.find({name: 'Aaron'}).toArray(function(err, result) {
// do something
});
However, if I instead export module.exports = db, i.e., try to assign the db object to module.exports directly instead of making it a member of exports, and try to access it in users.js via var db = require("./db.js");, the object is undefined. Why?
If it is because there is a delay in setting up the connection (shouldn't require() wait until the module finishes running its code before assigning the value of module.exports?), then why do neither of these examples work?
one.js
setTimeout(function() {
  module.exports.x = {test: 'value'};
}, 500);
two.js
var x = require("./one");
console.log(x.test);
OR
one.js
setTimeout(function() {
  module.exports.x = {test: 'value'};
}, 500);
two.js
setTimeout(function() {
  var x = require("./one");
  console.log(x.test);
}, 1000);
Running $ node two.js prints undefined in both cases instead of value.
There are 3 key points to understand here and then I will explain them in detail.
module.exports is an object and objects are passed by copy-of-reference in JavaScript.
require is a synchronous function.
client.connect is an asynchronous function.
As you suggested, it is a timing thing. node.js cannot know that module.exports is going to change later. That's not its problem. How would it know that?
When require runs, it finds a file that meets its requirements based on the path you entered, reads it and executes it, and caches module.exports so that other modules can require the same module and not have to re-initialize it (which would mess up variable scoping, etc.)
client.connect is an asynchronous function call, so after you execute it, the module finishes execution and the require call stores a copy of the module.exports reference and returns it to users.js. Then you set module.exports = db, but it's too late. You are replacing the module.exports reference with a reference to db, but the module export in the node require cache is pointing to the old object.
It's better to define module.exports as a function which will get a connection and then pass it to a callback function like so:
var mongodb = require("mongodb");
var client = mongodb.MongoClient;
module.exports = function (callback) {
client.connect('mongodb://host:port/dbname', { auto_reconnect: true },
function(err, db) {
if (err) {
console.log(err);
callback(err);
} else {
// export db as member of exports
callback(err, db);
}
}
)
};
Warning: though it's outside the scope of this answer, be very careful with the above code to make sure you close/return the connections appropriately, otherwise you will leak connections.
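For example, users.js could then be rewritten against this callback-style module (a sketch, assuming the module above is saved as db.js):

var getDb = require('./db.js');

getDb(function (err, db) {
  if (err) return console.error(err);
  db.collection('users').find({ name: 'Aaron' }).toArray(function (err, result) {
    // do something with the result
  });
});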
Yes, dbConnection.db is undefined because the connection is made asynchronously, which means by definition that the node code just continues to execute without waiting for the DB connection to be established.
shouldn't require() wait until the module finishes running its code before assigning the value of module.exports?
Nope, it just doesn't work that way. require is for code that is always there. Database connections aren't code and aren't always there. It's best not to confuse these two types of resources and how to reference them from your program.
shouldn't require() wait until the module finishes running its code before assigning the value of module.exports?

module.exports.db is set in a callback; that operation is async, so in users.js you can't get db.collection. It would be better to access collections inside the connect callback. You can use this answer to change your code and use a shared connection in other modules.
And what is the question? This is how require works - it gets the module synchronously and passes you the exports.

Your suggestion to 'wait until code is run' could be answered two ways:

It does wait until the code is run. The setTimeout has successfully finished. Learn to separate asynchronous callbacks scheduled for the future from the actual thread of execution.

If you mean "until all of the asynchronous callbacks are run", that's nonsense - what if one of them is never run at all because it waits for, I don't know, a mouse click, but the user does not have a mouse attached? (And how do you even define 'all code has run'? That every statement was run at least once? What about if (true) { thisruns(); } else { thiswontrunever(); }?)