Behavior of require in node.js - javascript

I currently have a database connection module containing the following:
var mongodb = require("mongodb");
var client = mongodb.MongoClient;
client.connect('mongodb://host:port/dbname', { auto_reconnect: true },
function(err, db) {
if (err) {
console.log(err);
} else {
// export db as member of exports
module.exports.db = db;
}
}
);
I can then successfully access it doing the following:
users.js
var dbConnection = require("./db.js");
var users = dbConnection.db.collection("users");
users.find({name: 'Aaron'}).toArray(function(err, result) {
// do something
});
However, if I instead export module.exports = db, i.e., try to assign the exports object to the db object instead of making it a member of exports, and try to access it in users.js via var db = require("./db.js"); the object is undefined, why?
If it is because there is a delay in setting up the connection (shouldn't require() wait until the module finishes running its code before assigning the value of module.exports?), then why do neither of these examples work?
one.js
setTimeout(function() {
module.exports.x = {test: 'value'};
}, 500);
two.js
var x = require("./one");
console.log(x.test);
OR
one.js
setTimeout(function() {
module.exports.x = {test: 'value'};
}, 500);
two.js
setTimeout(function() {
var x = require("./one");
console.log(x.test);
}, 1000);
Running $ node two.js prints undefined in both cases instead of value.

There are 3 key points to understand here and then I will explain them in detail.
module.exports is an object and objects are passed by copy-of-reference in JavaScript.
require is a synchronous function.
client.connect is an asynchronous function.
As you suggested, it is a timing issue. Node.js cannot know that module.exports is going to change later, and that's not its problem. How would it know?
When require runs, it finds a file that meets its requirements based on the path you entered, reads it and executes it, and caches module.exports so that other modules can require the same module and not have to re-initialize it (which would mess up variable scoping, etc.)
client.connect is an asynchronous function call, so after you invoke it the module finishes executing and the require call returns the current module.exports reference to users.js. Only later does the callback run module.exports = db, and by then it is too late. You are replacing the module.exports reference with a reference to db, but the variable in users.js still points to the old object that require returned.
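The difference between adding a property to the exported object and replacing the object can be reproduced without Node's module loader. The following is a minimal sketch, not Node's actual implementation: fakeModule and simulate are hypothetical names, and a plain assignment stands in for the asynchronous callback firing later.

```javascript
// fakeModule is a hypothetical stand-in for the module record Node keeps.
function simulate(replaceExports) {
  var fakeModule = { exports: {} };

  // What the caller of require() receives once the module body has finished.
  var received = fakeModule.exports;

  // Later, the asynchronous callback inside the module fires.
  var db = { name: 'db' };
  if (replaceExports) {
    fakeModule.exports = db;    // rebinding: `received` still points to the old object
  } else {
    fakeModule.exports.db = db; // mutation: visible through `received`
  }
  return received;
}

console.log(simulate(false).db); // { name: 'db' }
console.log(simulate(true).db);  // undefined
```

In both cases the caller holds the same original object. Only the mutation ever becomes visible through it.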
It's better to define module.exports as a function which will get a connection and then pass it to a callback function like so:
var mongodb = require("mongodb");
var client = mongodb.MongoClient;
module.exports = function (callback) {
client.connect('mongodb://host:port/dbname', { auto_reconnect: true },
function(err, db) {
if (err) {
console.log(err);
callback(err);
} else {
// hand the open connection to the caller
callback(err, db);
}
}
)
};
Warning: though it's outside the scope of this answer, be very careful with the above code to make sure you close/return the connections appropriately, otherwise you will leak connections.
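One way to avoid opening a new connection on every call is to cache the first one and hand it to later callers. This is only a sketch under stated assumptions: fakeConnect is a hypothetical stand-in for client.connect so the code runs without MongoDB, and getDb, cachedDb and waiting are illustrative names, not part of any driver API.

```javascript
// fakeConnect is a hypothetical stand-in for client.connect.
var connectCount = 0;
function fakeConnect(url, callback) {
  connectCount++;
  setTimeout(function () { callback(null, { url: url }); }, 10);
}

var cachedDb = null; // the one shared connection, once it exists
var waiting = [];    // callers that asked before the connection was ready

function getDb(callback) {
  if (cachedDb) {
    // Stay asynchronous so callers always see the same ordering.
    return setImmediate(callback, null, cachedDb);
  }
  waiting.push(callback);
  if (waiting.length > 1) {
    return; // a connection attempt is already in flight
  }
  fakeConnect('mongodb://host:port/dbname', function (err, db) {
    if (!err) {
      cachedDb = db;
    }
    var callbacks = waiting;
    waiting = [];
    callbacks.forEach(function (cb) { cb(err, db); });
  });
}

// Two callers, one underlying connection.
getDb(function (err, first) {
  getDb(function (err, second) {
    console.log(first === second, connectCount); // true 1
  });
});
```

A module would export getDb instead of the bare connect wrapper. Closing the shared connection on shutdown is still up to the application.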

Yes, dbConnection.db is undefined because the connection is made asynchronously which means by definition the node code just continues to execute without waiting for the DB connection to be established.
shouldn't require() wait until the module finishes running its code before assigning the value of module.exports?
Nope, it just doesn't work that way. require is for code that is always there. Database connections aren't code and aren't always there. Best not to confuse these two types of resources and how to reference them from your program.

shouldn't require() wait until the module finishes running its code
before assigning the value of module.exports?
module.exports.db is set inside the callback, and that operation is asynchronous, so in users.js you can't rely on db.collection being available yet.
It would be better to set up the collections inside the connect callback.
You can use this answer to change your code and use a shared connection in other modules.

And what is the question? This is how require works: it loads the module synchronously and passes you the exports.
Your suggestion to 'wait until the code is run' could be answered two ways:
It does wait until the code is run. The setTimeout call has completed successfully. Learn to separate asynchronous callbacks scheduled for the future from the code that actually runs now.
If you mean "until all of the asynchronous callbacks are run", that's nonsense. What if one of them never runs at all because it is waiting for, I don't know, a mouse click, but the user has no mouse attached? (And how do you even define 'all code has run'? That every statement was run at least once? What about if (true) { thisruns(); } else { thiswontrunever(); }?)
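The ordering can be observed in a small sketch. Here moduleBody is a hypothetical stand-in for the code in one.js, and calling it stands in for require. Nothing below uses the real module loader.

```javascript
var events = [];

// Stand-in for the body of one.js: schedules work for later, then returns.
function moduleBody(exportsObject) {
  setTimeout(function () {
    exportsObject.x = { test: 'value' };
    events.push('timer callback ran');
    console.log(events);
  }, 0);
  events.push('module body finished');
}

var fakeExports = {};
moduleBody(fakeExports); // stands in for require('./one')
events.push('caller resumed, x is ' + typeof fakeExports.x);
```

The module body finishes and the caller resumes before the timer callback gets a chance to run, even with a delay of 0.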

Related

NodeJS & express catch write function of HTTP module using a Promise

I am writing an express application using
NodeJS v8
express (latest version)
After looking at the onHeaders module and finding out how the module rewrites the HTTP head, I wanted to make use of that function of JavaScript.
I wanted to write a small session system using my SQL server. I am aware of the session module from express, but this module is not able to handle the specific tables and customization I need.
For convenience reasons I wanted the session to be inserted into the request before the controllers and saved after all controllers finished. (e.g. the writeHead method has been called)
My code in the session looks like:
core = async function(req, res, next) {
res.writeHead = hijackHead(res.writeHead); // Hijack the writeHead method to apply our header at the last
}
//[...](Omitted code)
hijackHead = function(writeFunction) {
let fired = false;
return function hackedHead(statusCode) {
if ( fired ) {
return;
}
//[...](Omitted code)
debug("Session data has changed. Writing");
sessionManager.storeSessionData(session.identifier, session).then(() => { // Promise taking ~60ms to resolve
debug("Finished writing...");
callMe(); // Current idea of calling the writeHead of the original res
});
let that = this, // Arguments to apply to the original writeHead
args = arguments
function callMe() {
debug("Finished! Give control to http, statuscode: %i", statusCode);
writeFunction.apply(that, args); // Call the original writeHead from the response
debug("Sent!")
}
} // End of hackedHead
} // End of hijackHead
The core function is being passed to express as a middleware.
Additionally, sessionManager.storeSessionData returns a Promise that stores the data and fulfills afterwards, taking ~60ms. The Promise has been tested and works perfectly.
When I now make a request using this Controller, the Node net Module returns the error:
TypeError: Invalid data, chunk must be a string or buffer, not object
at Socket.write (net.js:704:11)
at ServerResponse._flushOutput (_http_outgoing.js:842:18)
at ServerResponse._writeRaw (_http_outgoing.js:258:12)
at ServerResponse._send (_http_outgoing.js:237:15)
at write_ (_http_outgoing.js:667:15)
at ServerResponse.end (_http_outgoing.js:751:5)
at Array.write (/usr/lib/node_modules/express/node_modules/finalhandler/index.js:297:9)
at listener (/usr/lib/node_modules/express/node_modules/on-finished/index.js:169:15)
at onFinish (/usr/lib/node_modules/express/node_modules/on-finished/index.js:100:5)
at callback (/usr/lib/node_modules/express/node_modules/ee-first/index.js:55:10)
Since the new function needs about 30ms to react and return the Promise, the function finishes earlier causing Node to crash.
I already tried blocking the Node loop with a while, timeout or even a recursive function. Neither of them worked.
I tried to simplify the code as much as possible and I hope that I didn't simplify it too much.
Now I am asking if anybody can help me: how do I call the writeHead function properly after the Promise has resolved?
The issue with this is that net.js responds to those methods directly when writeHead has finished. Even though the head has not been written, it tries to write the body.
Instead, it is possible to intercept the end() method, wait for everything there, and then close the connection.
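Intercepting end() could look roughly like the following. This is a sketch only: hijackEnd, fakeRes and fakeStore are hypothetical names, fakeStore stands in for sessionManager.storeSessionData, and the fake response is a plain object. It has not been checked against a real Express response, where end() has a return value and other code may call write() first.

```javascript
// Wrap an end-like method so the original only runs after save() has settled.
function hijackEnd(originalEnd, save) {
  var fired = false;
  return function hackedEnd() {
    var self = this;
    var args = arguments;
    if (fired) {
      return;
    }
    fired = true;
    function callOriginal() {
      originalEnd.apply(self, args);
    }
    // Finish the response whether or not the save succeeded.
    save().then(callOriginal, callOriginal);
  };
}

// Hypothetical fake response and store, used only to exercise the wrapper.
var fakeRes = {
  written: [],
  end: function (body) { this.written.push(body); }
};
function fakeStore() {
  return new Promise(function (resolve) { setTimeout(resolve, 20); });
}

fakeRes.end = hijackEnd(fakeRes.end, fakeStore);
fakeRes.end('hello');
console.log(fakeRes.written.length); // 0, the store has not finished yet
setTimeout(function () {
  console.log(fakeRes.written); // [ 'hello' ]
}, 50);
```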

Keeping Node function parameters safe in an async environment

This is something that has been bothering me and I can't seem to find a straight answer.
Here is a form of Node function that I use a lot, it handles a web request and does a bit of IO:
function handleRequest(req, res) {
doSomeIo()
.then(function(result) {
res.send(result)
})
}
This function gets called with the res parameter set to the current response object, and off it goes into the land of I/O. While it's playing out there, a second request comes through and sets res to a new object? Now the first request comes back from I/O and uses the res object. Is this now the first instance or the second? In other words, does Node essentially make a separate copy of everything each time the handleRequest function is called, with its own parameter values, or is there only one instance of it and its parameters? Are the parameters in the above function safe in an async environment, or would it be better to do something like this:
function handleRequest(req, res) {
doSomeIo()
.then(function(res) {
return function(result) {
res.send(result)
}
}(res))
}
Or am I just completely ignorant of how Node and JavaScript work, which seems pretty likely.
You don't have to worry at all about this case. The req object contains information about the HTTP request that the server received. Everything is sandboxed in per request basis. This will be helpful for you: What are "res" and "req" parameters in Express functions?
You can expect the current req and res object to remain the same among multiple events (I/O responses are essentially events) unless you do something to them, or you have other middleware that does something. There's no need to do anything like your second code snippet.
If you're new to JavaScript and/or the concept of closures, that's probably why you're uneasy with the syntax.
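To see that each call keeps its own res, here is a small sketch that completes two "requests" in reverse order. The pending array and makeRes helper are hypothetical and only stand in for real I/O and real response objects.

```javascript
var pending = []; // stands in for I/O operations that have not completed yet

function handleRequest(req, res) {
  // The callback closes over this particular call's `res`.
  pending.push(function (result) {
    res.send(result);
  });
}

// Hypothetical fake response that records what was sent and by whom.
function makeRes(name, out) {
  return { send: function (result) { out.push(name + ':' + result); } };
}

var out = [];
handleRequest({}, makeRes('first', out));
handleRequest({}, makeRes('second', out));

// Complete the I/O in reverse order.
pending[1]('b');
pending[0]('a');
console.log(out); // [ 'second:b', 'first:a' ]
```

Each result reaches the response object of the call that started it, regardless of completion order.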
Each call to the function handleRequest() will not use the variable values of previous calls to handleRequest(). An easy way to see this behavior would be to log the number of times the method was called, along with the value of a counter incremented in handleRequest().
Example:
In app.js (or whatever js file you initialize your server in), add the following:
var app = express(),
    calls = 0;
app.set('calls', calls);
Then in your handler add the following:
function handleRequest(req, res) {
  doSomeIo()
    .then(function(result) {
      // Read and increment the shared counter stored on the app
      var calls = req.app.get('calls') + 1;
      req.app.set('calls', calls);
      res.send(String(calls));
    })
}
You should notice that with each call to the endpoint, no matter how quickly you make the calls, the count increases by 1 (get ready for race conditions).

Node.js return variable used in callback

In some modules I saw this strange way of initializing variable used in a callback.
This particular example is from mssql module:
var sql = require('mssql');
var connection = new sql.Connection(config, function (err) {
var request = new sql.Request(connection);
request.query('select 1 as number', function (err, recordset) {
// do something
});
});
What appears strange to me is that connection is used inside callback as if it is already initialized, and in fact it is.
However, I would have thought that the callback would have to run before sql.Connection() returns. As far as I can see, there is no way to run anything after it returns.
So how does this thing work?
The callback is asynchronous, meaning it doesn't run immediately. Because of this, it gets placed in a queue and run whenever the interpreter isn't doing anything. For example, try this:
var connection = new sql.Connection(config, function(err) {
console.log('I run second');
});
console.log('I run first');

Q.js, promises, classes and "this", what is the context?

I'm completely confused about the context inside a Q promise. I don't think that's Q specific, but general with all promises. What the hell is the context of this inside a class?
This code uses TypeScript, everything is static now, because I basically failed to do anything non-static. This code works fine.
I tried to add a private _config; instance variable and use the _getConfig method to set the _config in the constructor. But when I used this._config inside the method checkMongodbConnection, well, it wasn't the same object as what was returned by the _getConfig() method. (I watched variable states in debug mode)
So I guess that this, inside a class, because I call the code from a Q promise, doesn't have the class instance context.
I'm wondering whether using promises is a good idea after all. If I run into context issues, the code will just be a lot more difficult to understand and debug. I would appreciate understanding why this happens so I can make an informed choice. I don't want to lose the class instance context; that's way too tricky. I'm afraid of using a technology that will actually make things more complicated; I'd prefer callback hell to that.
///<reference path='./def/defLoader.d.ts'/>
export class App {
/**
* Constructor.
* Load the config.
* @return {}
*/
private static _getConfig(){
if(typeof __config !== "undefined"){
return __config;
}else{
require('./../../shared/lib/globals/services');
return configHelper.load('_serverConfig', require('./../../shared/config/_serverConfig.json').path.config, __dirname + '/../../');
}
}
/**
* Check that the mongoose connection open correctly, meaning that the mongod process is running on the host.
* @return {Q.Promise<T>|Function}
*/
public static checkMongodbConnection(){
var config = App._getConfig();
// Building promise
var deferred: any = Q.defer();
if(config.game.checkMongodb){
// Retrieves the mongoose configuration file, the env doesn't matter here.
var mongodbConfig = require('./../../shared/config/mongodb.json')['development'];
// Try mongoose connexion
mongoose.connect('mongodb://' + mongodbConfig.host + '/' + mongodbConfig.database);
// Bind connexion
var db: mongoose.Connection = mongoose.connection;
// Get errors
db.on('error', function(err) {
deferred.reject('Mongodb is not running, please run the mongod process: \n' + err)
});
// If the connexion seems to be open
db.once('open', function callback () {
// Close it
db.db.close();
// Resolve promise
deferred.resolve();
});
}else{
deferred.resolve();
}
// Get back promise
return deferred.promise;
}
/**
* Check that the redis connection is open, meaning that the redis-server process is running on the host.
* @return {Q.Promise<T>|Function}
*/
public static checkRedisConnection(){
var config = App._getConfig();
// Building promise
var deferred: any = Q.defer();
if(config.game.checkRedis) {
// Create the redis client to test to connexion on server
var redisClient:any = redis.createClient();
// Get client errors
redisClient.on("error", function (err) {
deferred.reject(err);
});
// Try applying a key
redisClient.set("keyTest", true);
// Now key is applied, try getting it
redisClient.get("keyTest", function (err, reply) {
if (err) {
deferred.reject("Redis is not running, please make sure to run redis before to start the server. \n" + err);
} else {
deferred.resolve();
}
});
}else{
deferred.resolve();
}
// Get back promise
return deferred.promise;
}
}
Code that calls the class:
Q.fcall(App.checkRedisConnection)
.then(App.checkMongodbConnection)
.then(function(result) {
// run server
}, console.error);
The Promises/A+ specification explicitly dictates that the value of this inside a promise callback is always undefined (strict mode) or the global object (sloppy mode), via:
2.2.5 onFulfilled and onRejected must be called as functions (i.e. with no this value).
Specified here.
If you're not using a promise library like Bluebird that allows setting this explicitly (via .bind), you can still utilize TypeScript's fat arrows (also in ES6) to call something with lexical this.
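A small sketch of the effect and of two ways around it. It uses the native Promise rather than Q, and an instance method rather than the static methods from the question. App, describe and _config here are illustrative names only.

```javascript
class App {
  constructor() {
    this._config = { name: 'server' };
  }
  describe() {
    // Class bodies are always strict, so a detached call sees `this` as undefined.
    return this === undefined ? 'this is undefined' : this._config.name;
  }
}

var app = new App();

var results = Promise.all([
  Promise.resolve().then(app.describe),           // method passed detached
  Promise.resolve().then(() => app.describe()),   // arrow function keeps the receiver
  Promise.resolve().then(app.describe.bind(app))  // Function.prototype.bind fixes `this`
]);

results.then(function (values) {
  console.log(values); // [ 'this is undefined', 'server', 'server' ]
});
```

Passing app.describe on its own hands the promise a bare function, so the instance is lost. Wrapping the call or binding it keeps the instance attached.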

Node.js - asynchronous module loading

Is it possible to load a Node.js module asynchronously?
This is the standard code:
var foo = require("./foo.js"); // waiting for I/O
foo.bar();
But I would like to write something like this:
require("./foo.js", function(foo) {
foo.bar();
});
// doing something else while the hard drive is crunching...
Is there a way how to do this? Or is there a good reason why callbacks in require aren't supported?
While require is synchronous, and Node.js does not provide an asynchronous variant out of the box, you can easily build one for yourself.
First of all, you need to create a module. In my example I am going to write a module that loads data asynchronously from the file system, but of course YMMV. So, first of all the old-fashioned, not wanted, synchronous approach:
var fs = require('fs');
var passwords = fs.readFileSync('/etc/passwd');
module.exports = passwords;
You can use this module as usual:
var passwords = require('./passwords');
Now, what you want to do is turn this into an asynchronous module. As you can not delay module.exports, what you do instead is instantly export a function that does the work asynchronously and calls you back once it is done. So you transform your module into:
var fs = require('fs');
module.exports = function (callback) {
fs.readFile('/etc/passwd', function (err, data) {
callback(err, data);
});
};
Of course you can shorten this by directly providing the callback variable to the readFile call, but I wanted to make it explicit here for demonstration purposes.
Now when you require this module, at first, nothing happens, as you only get a reference to the asynchronous (anonymous) function. What you need to do is call it right away and provide another function as callback:
require('./passwords')(function (err, passwords) {
// This code runs once the passwords have been loaded.
});
Using this approach you can, of course, turn any arbitrary synchronous module initialization to an asynchronous one. But the trick is always the same: Export a function, call it right from the require call and provide a callback that continues execution once the asynchronous code has been run.
Please note that for some people
require('...')(function () { ... });
may look confusing. Hence it may be better (although this depends on your actual scenario) to export an object with an asynchronous initialize function or something like that:
var fs = require('fs');
module.exports = {
initialize: function (callback) {
fs.readFile('/etc/passwd', function (err, data) {
callback(err, data);
});
}
};
You can then use this module by using
require('./passwords').initialize(function (err, passwords) {
// ...
});
which may be slightly more readable.
Of course you can also use promises or any other asynchronous mechanism which makes your syntax look nicer, but in the end, it (internally) always comes down to the pattern I just described here. Basically, promises & co. are nothing but syntactic sugar over callbacks.
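As a sketch of the promise flavor, Node's util.promisify (available since Node 8) can wrap a callback-style function that follows the (err, result) convention. The initialize function below is a hypothetical stand-in for require('./passwords').initialize, using a timer instead of reading a file.

```javascript
var util = require('util');

// Hypothetical stand-in for require('./passwords').initialize.
function initialize(callback) {
  setTimeout(function () {
    callback(null, 'file contents');
  }, 10);
}

// Same callback underneath, promise on the surface.
var initializeAsync = util.promisify(initialize);

var loaded = initializeAsync().then(function (passwords) {
  console.log(passwords); // file contents
  return passwords;
});
```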
Once you build your modules like this, you can even build a requireAsync function that works like you initially suggested in your question. All you have to do is stick with a name for the initialization function, such as initialize. Then you can do:
var requireAsync = function (module, callback) {
require(module).initialize(callback);
};
requireAsync('./passwords', function (err, passwords) {
// ...
});
Please note, that, of course, loading the module will still be synchronous due to the limitations of the require function, but all the rest will be asynchronous as you wish.
One final note: If you want to actually make loading modules asynchronous, you could implement a function that uses fs.readFile to asynchronously load a file, and then run it through an eval call to actually execute the module, but I'd highly recommend against this: On the one hand, you lose all the convenience features of require such as caching & co., on the other hand you'll have to deal with eval - and as we all know, eval is evil. So don't do it.
Nevertheless, if you still want to do it, basically it works like this:
var fs = require('fs');
var requireAsync = function (module, callback) {
fs.readFile(module, { encoding: 'utf8' }, function (err, data) {
var module = {
exports: {}
};
var code = '(function (module) {' + data + '})(module)';
eval(code);
callback(null, module);
});
};
Please note that this code is not "nice", and that it lacks any error handling, and any other capabilities of the original require function, but basically, it fulfills your demand of being able to asynchronously load synchronously designed modules.
Anyway, you can use this function with a module like
module.exports = 'foo';
and load it using:
requireAsync('./foo.js', function (err, module) {
console.log(module.exports); // => 'foo'
});
Of course you can export anything else as well. Maybe, to be compatible with the original require function, it may be better to run
callback(null, module.exports);
as last line of your requireAsync function, as then you have direct access to the exports object (which is the string foo in this case). Due to the fact that you wrap the loaded code inside of an immediately executed function, everything in this module stays private, and the only interface to the outer world is the module object you pass in.
Of course one can argue that this usage of eval is not the best idea in the world, as it opens up security holes and so on - but if you require a module, you basically do nothing else, anyway, than eval-uating it. The point is: If you don't trust the code, eval is the same bad idea as require. Hence in this special case, it might be fine.
If you are using strict mode, eval is no good for you, and you need to go with the vm module and use its runInNewContext function. Then, the solution looks like:
var fs = require('fs');
var vm = require('vm');
var requireAsync = function (module, callback) {
fs.readFile(module, { encoding: 'utf8' }, function (err, data) {
var sandbox = {
module: {
exports: {}
}
};
var code = '(function (module) {' + data + '})(module)';
vm.runInNewContext(code, sandbox);
callback(null, sandbox.module.exports); // or sandbox.module…
});
};
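To try the vm part in isolation, without touching the file system, the evaluation step can be pulled into a helper that takes the module source as a string. This is a sketch: evaluateModuleSource is a hypothetical name, and the sandbox deliberately provides nothing but module, so the evaluated code cannot call require itself.

```javascript
var vm = require('vm');

// Evaluate module-style source text and return whatever it assigned to module.exports.
function evaluateModuleSource(source) {
  var sandbox = {
    module: {
      exports: {}
    }
  };
  // The newline keeps a trailing line comment in `source` from swallowing the wrapper's end.
  var code = '(function (module) {' + source + '\n})(module)';
  vm.runInNewContext(code, sandbox);
  return sandbox.module.exports;
}

console.log(evaluateModuleSource("module.exports = 'foo';")); // foo
```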
The npm module async-require can help you to do this.
Install
npm install --save async-require
Usage
var asyncRequire = require('async-require');
// Load script myModule.js
asyncRequire('myModule').then(function (module) {
// module has been exported and can be used here
// ...
});
The module uses vm.runInNewContext(), a technique discussed in the accepted answer. It has bluebird as a dependency.
(This solution appeared in an earlier answer but that was deleted by review.)
Yes - export a function that accepts a callback, or maybe even export a full-featured promise object.
// foo.js + callback:
module.exports = function(cb) {
setTimeout(function() {
console.log('module loaded!');
var fooAsyncImpl = {};
// add methods, for example from db lookup results
fooAsyncImpl.bar = console.log.bind(console);
cb(null, fooAsyncImpl);
}, 1000);
}
// usage
require("./foo.js")(function(foo) {
foo.bar();
});
// foo.js + promise
var Promise = require('bluebird');
module.exports = new Promise(function(resolve, reject) {
// async code here;
});
// using foo + promises
require("./foo.js").then(function(foo) {
foo.bar();
});
Andrey's code below is the simplest answer which works, but his had a small mistake so I'm posting the correction here as answer. Also, I'm just using callbacks, not bluebird / promises like Andrey's code.
/* 1. Create a module that does the async operation - request etc */
// foo.js + callback:
module.exports = function(cb) {
setTimeout(function() {
console.log('module loaded!');
var foo = {};
// add methods, for example from db lookup results
foo.bar = function(test){
console.log('foo.bar() executed with ' + test);
};
cb(null, foo);
}, 1000);
}
/* 2. From another module you can require the first module and specify your callback function */
// usage
require("./foo.js")(function(err, foo) {
foo.bar('It Works!');
});
/* 3. You can also pass in arguments from the invoking function that can be utilised by the module - e.g the "It Works!" argument */
For anyone who uses ES modules and top-level await, this works out of the box, without passing callbacks to a CommonJS require or installing packages like async-require.
// In foo.mjs
// assumes doIOstuffHere() resolves to the object you want to export
const foo = await doIOstuffHere();
export default foo;
// in bar.mjs
import foo from "./foo.mjs";
foo.bar(); // this does not run until the async work in foo.mjs has finished
