Node.js global variable property is purged - javascript

my problem is not about "memory leakage", but about "memory purge" of node.js (expressjs) app.
My app should maintain some objects in memory for the fast look-up's during the service. For the time being (one or two days) after starting the app, everthing seemed fine, until suddenly my web client failed to look-up the object bacause it has been purged (undefined). I suspect Javascript GC (garbage collection). However, as you can see in the psedu-code, I assigned the objects to the node.js "global" variable properties to prevent GC from purging them. Please give me some clue what caused this problem.
Thanks much in advance for your kind advices~
My node.js environments are node.js 0.6.12, expressjs 2.5.8, and VMWare cloudfoundry node hosting.
Here is my app.js pseudo-code :
var express = require("express");
var app = module.exports = express.createServer();
// myMethods holds a set of methods to be used for handling raw data.
var myMethods = require("myMethods");
// creates node.js global properties referencing objects to prevent GC from purging them
global.myMethods = myMethods();
global.myObjects = {};
// omited the express configurations
// creates objects (data1, data2) inside the global.myObjects for the user by id.
app.post("/createData/:id", function(req, res) {
// creates an empty object for the user.
var myObject = global.myObjects[req.prams.id] = {};
// gets json data.
var data1 = JSON.parse(req.body.data1);
var data2 = JSON.parse(req.body.data2);
// buildData1 & buildData2 functions transform data1 & data2 into the usable objects.
// these functions return the references to the transformed objects.
myObject.data1 = global.myMethods.buildData1(data1);
myObject.data2 = global.myMethods.buildData2(data2);
res.send("Created new data", 200);
res.redirect("/");
});
// returns the data1 of the user.
// Problem occurs here : myObject becomes "undefined" after one or two days running the service.
app.get("/getData1/:id", function(req, res) {
var myObject = global.myObjects[req.params.id];
if (myObject !== undefined) {
res.json(myObject.data1);
} else {
res.send(500);
}
});
// omited other service callback functions.
// VMWare cloudfoundry node.js hosting.
app.listen(process.env.VCAP_APP_PORT || 3000);

Any kind of cache system (whether is roll-your-own or a third party product) should account for this scenario. You should not rely on the data always being available on an in-memory cache. There are way too many things that can cause in-memory data to be gone (machine restart, process restart, et cetera.)
In your case, you might need to update your code to see if the data is in cache. If it is not in cache then fetch it from a persistent storage (a database, a file), cache it, and continue.

Exactly like Haesung I wanted to keep my program simple, without database. And like Haesung my first experience with Node.js (and express) was to observe this weird purging. Although I was confused, I really didn't accept that I needed a storage solution to manage a json file with a couple of hundred lines. The light bulb moment for me was when I read this
If you want to have a module execute code multiple times, then export a function, and call that function.
which is taken from http://nodejs.org/api/modules.html#modules_caching. So my code inside the required file changed from this
var foo = [{"some":"stuff"}];
export.foo;
to that
export.foo = function (bar) {
var foo = [{"some":"stuff"}];
return foo.bar;
}
And then it worked fine :-)

Then I suggest to use file system, I think 4KB overhead is not a big deal for your goals and hardware. If you familiar with front-end javascript, this could be helpful https://github.com/coolaj86/node-localStorage

Related

Node.js Function not updating value after 1st invoke

I've recently taken interest in the Discord.js framework, and was designing a bot for a server. Apologies in advance for the messiness of the code.
The issue I'm facing is that after I first run the command, the the function is invoked, the value of ticketValue does not update to the update value in my JSON file.
const fs = require("fs");
module.exports = {
commands: ["ticket"],
minArgs: 1,
expectedArgs: "<message>",
callback: (message, arguments, text) => {
// Console Log to notify the function has been invoked.
console.log("FUNCTION RUN")
let jsondata = require("../ticketNum.json")
let ticketValue = jsondata.reportNews.toString()
// Turning the number into a 4 digit number.
for(let i = ticketValue.length; i<4;i++) {
ticketValue = `0${ticketValue}`
}
console.log(`#1 ${ticketValue}`)
// Creating the Discord Chanel
message.guild.channels.create(`report-incident-${ticketValue}`, {
type: 'text',
permissionOverwrites: [
{
id: message.author.id,
deny: ['VIEW_CHANNEL'],
},
],
})
// Adding one to the ticket value and storing it in a JSON file.
ticketValue = Number(ticketValue)+1
console.log(`TICKET VALUE = ${ticketValue}`)
fs.writeFile("./ticketNum.json",JSON.stringify({"reportNews": Number(ticketValue)}), err => {
console.log(`Done writing, value = ${ticketValue}`)
})
console.log(require("../ticketNum.json").reportNews.toString())
},
}
I believe this is due to something called require cache. You can either invalidate the cache for your JSON file each time you write to it, or preferably use fs.readFile to get the up-to-date contents.
Also worth noting that you are requiring from ../ticketNum.json but you are writing to ./ticketNum.json. This could also be a cause.
You seem to be using JSON files as a database and while that is perfectly acceptable depending on your project's scale, I would recommend something a little more polished like lowdb which stills uses local JSON files to store your data but provides a nicer API to work with.
You should only use require when the file is static while the app is running.
The caching require performs is really useful, especially when you are loading tons of modules with the same dependencies.
require also does some special stuff to look for modules locally, globally, etc. So you might see unintended things happen if a file is missing locally and require goes hunting.
These two things mean it's not a good replacement for the fs tools node provides for file access and manipulation.
Given this, you should use fs.readFileSync or one of the other read functions. You're already using fs to write, so that isn't a large lift in terms of changing the line or two where you have require in place or a read.

How to modify the same array using multi thread in node js?

Hi does anyone know how to modify a same array by using 2 worker_threads in node js?
I add value in worker thread 1 and pop it in worker thread 2, but worker thread 2 can't see the value added by 1.
//in a.js
const {isMainThread, parentPort, threadId, MessageChannel, Worker} = require('worker_threads');
global.q = [1,2];
exports.setter_q= function(value){
q.push(value);}
exports.getter_q=function(value){
var v=q.pop()
return v;
}
if(isMainThread) {
var workerSche=new Worker("./w1.js")
var workerSche1=new Worker("./w2.js")
}
//in w1.js
const {isMainThread, parentPort, threadId, MessageChannel, Worker} = require('worker_threads');
if(isMainThread){
// do something
} else{
var miniC1=require("./a.js")
miniC1.setter_q(250);
// do something
}
//in w2.js
const {isMainThread, parentPort, threadId, MessageChannel, Worker} = require('worker_threads');
if(isMainThread){
// do something
} else{
var miniC1=require("./a.js")
var qlast=miniC1.getter_q();
// do something
}
qlast variable in w2.js file is always value '2' instead of 250.
In node.js, to share memory between threads, you have to allocate something like a SharedArrayBuffer that you can then access from multiple threads. The shared buffer objects are allocated differently that allows them to be accessed by multiple V8 threads in nodejs whereas regular arrays cannot.
You will then have to manage concurrency properly so you aren't attempting to update the data simultaneously from more than one thread (creating race conditions). In the cases where I've used shared memory in node.js WorkerThreads, I've designed the code so that only one thread ever had access to the shared memory at once and that is one way of solving concurrency issues. There are also Atomics in node.js that allow you to "control" access such that only one thread is accessing it at a time.
EDIT: This is apparently outdated and wrong. See comment below.
You can't do that in javascript. Objects to worker threads are passed by value. This is by design so you don't have to deal with locking and all the problems that come when multiple threads can mutate an object.
The way to solve this problem in javascript is to send the work result via a message channel (again, by value).
So if you wanted to further process that object, you could pass it in a message channel from worker 1 to worker 2.
Or if you have a worker pool, you could pass messages to the main thread and add them to a result array there for example.

In meteor, can pub/sub be used for arbitrary in-memory objects (not mongo collection)

I want to establish a two-way (bidirectional) communication within my meteor app. But I need to do it without using mongo collections.
So can pub/sub be used for arbitrary in-memory objects?
Is there a better, faster, or lower-level way? Performance is my top concern.
Thanks.
Yes, pub/sub can be used for arbitrary objects. Meteor’s docs even provide an example:
// server: publish the current size of a collection
Meteor.publish("counts-by-room", function (roomId) {
var self = this;
check(roomId, String);
var count = 0;
var initializing = true;
// observeChanges only returns after the initial `added` callbacks
// have run. Until then, we don't want to send a lot of
// `self.changed()` messages - hence tracking the
// `initializing` state.
var handle = Messages.find({roomId: roomId}).observeChanges({
added: function (id) {
count++;
if (!initializing)
self.changed("counts", roomId, {count: count});
},
removed: function (id) {
count--;
self.changed("counts", roomId, {count: count});
}
// don't care about changed
});
// Instead, we'll send one `self.added()` message right after
// observeChanges has returned, and mark the subscription as
// ready.
initializing = false;
self.added("counts", roomId, {count: count});
self.ready();
// Stop observing the cursor when client unsubs.
// Stopping a subscription automatically takes
// care of sending the client any removed messages.
self.onStop(function () {
handle.stop();
});
});
// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");
// client: subscribe to the count for the current room
Tracker.autorun(function () {
Meteor.subscribe("counts-by-room", Session.get("roomId"));
});
// client: use the new collection
console.log("Current room has " +
Counts.findOne(Session.get("roomId")).count +
" messages.");
In this example, counts-by-room is publishing an arbitrary object created from data returned from Messages.find(), but you could just as easily get your source data elsewhere and publish it in the same way. You just need to provide the same added and removed callbacks like the example here.
You’ll notice that on the client there’s a collection called counts, but this is purely in-memory on the client; it’s not saved in MongoDB. I think this is necessary to use pub/sub.
If you want to avoid even an in-memory-only collection, you should look at Meteor.call. You could create a Meteor.method like getCountsByRoom(roomId) and call it from the client like Meteor.call('getCountsByRoom', 123) and the method will execute on the server and return its response. This is more the traditional Ajax way of doing things, and you lose all of Meteor’s reactivity.
Just to add another easy solution. You can pass connection: null to your Collection instantiation on your server. Even though this is not well-documented, but I heard from the meteor folks that this makes the collection in-memory.
Here's an example code posted by Emily Stark a year ago:
if (Meteor.isClient) {
Test = new Meteor.Collection("test");
Meteor.subscribe("testsub");
}
if (Meteor.isServer) {
Test = new Meteor.Collection("test", { connection: null });
Meteor.publish("testsub", function () {
return Test.find();
});
Test.insert({ foo: "bar" });
Test.insert({ foo: "baz" });
}
Edit
This should go under comment but I found it could be too long for it so I post as an answer. Or perhaps I misunderstood your question?
I wonder why you are against mongo. I somehow find it a good match with Meteor.
Anyway, everyone's use case can be different and your idea is doable but not with some serious hacks.
if you look at Meteor source code, you can find tools/run-mongo.js, it's where Meteor talks to mongo, you may tweak or implement your adaptor to work with your in-memory objects.
Another approach I can think of, will be to wrap your in-memory objects and write a database logic/layer to intercept existing mongo database communications (default port on 27017), you have to take care of all system environment variables like MONGO_URL etc. to make it work properly.
Final approach is wait until Meteor officially supports other databases like Redis.
Hope this helps.

Is it possible to declare global variables in Node/Express 4.0

I have multiple routes that need to access a database, for development I use a local database, and obviously production I use a hosted database
The only problem is every time I go to push a release I have to go through each route manually changing the database link
e.g.
var mongodb = require('mongojs').connect('urlhere', ['Collection']);
It would be nice if I could declare a variable in app.js like
app.set('mongoDBAddress', 'urlhere');
then in each file do something like
var mongodb = require('mongojs').connect(app.get('mongoDBAddress'), ['Collection']);
Does anybody know if this is achievable I've been messing around with it for about an hour googling and trying to include different things but I have no luck. thanks.
From the docs:
In browsers, the top-level scope is the global scope. That means that
in browsers if you're in the global scope var something will define a
global variable. In Node this is different. The top-level scope is not
the global scope; var something inside a Node module will be local to
that module.
You have to think a bit differently. Instead of creating a global object, create your modules so they take an app instance, for example:
// add.js
module.exports = function(app) { // requires an `app`
return function add(x, y) { // the actual function to export
app.log(x + y) // use dependency
}
}
// index.js
var app = {log: console.log.bind(console)}
var add = require('./add.js')(app) // pass `app` as a dependency
add(1, 2)
//^ will log `3` to the console
This is the convention in Express, and other libraries. app is in your main file (ie. index.js), and the modules you require have an app parameter.
You can add a global variable to GLOBAL, see this this question, although this is probably considered bad practice.
We have two methods in node.js to share variables within modules.
global
module.export
But your problem seems to be different, what I got is you want to connect your application to different databases without changing code. What you need to do is use command line params
For more ref
server.js
var connectTo = {
dev : "url1"
production : "url2"
}
var mongodb = require('mongojs').connect(connectTo[process.argv[2]], ['Collection']);
Run your server.js as
node server.js dev
// for connecting to development database
or
node server.js production
// for connecting to prodiction database
To share connection across diffrent modules
//Method 1
global.mongodb = require('mongojs').connect(connectTo[process.argv[2]], ['Collection']);
//Method 2
exports.mongodb = require('mongojs').connect(connectTo[process.argv[2]], ['Collection']);
exports.getMongoDBAddress = function() {
return connectTo[process.argv[2]]
}

How to organize models in a node.js website?

I have just started experimenting with building a website using node.js, and I am encountering an issue when organizing the models of my project.
All the real world examples I have found on the Internet are using Mongoose. This library allows you to define your models in a static way. So you can write this:
// models/foo.js
module.exports = require('mongoose').model('Foo', ...);
// app.js
mongoose.connect(...);
// some_controller_1.js
var Foo = require('./models/foo');
Foo.find(...);
// some_controller_2.js
var Foo = require('./models/foo');
Foo.find(...);
But since I don't want to use MongoDB, I need another ORM. And all the other ORMs I have found don't allow this. You first need to create an instance, and then only you can register your models. Also they don't seem to allow access to the list of registered models.
So I tried doing this:
// models/user.js
var registrations = [];
module.exports = function(sequelize) {
var result = null;
registrations.forEach(function(elem) {
if (elem.db == sequelize)
result = elem.value;
});
if (result) return result;
// data definition
var user = sequelize.define("User", ...);
registrations.push({ db: sequelize, value: user });
return user;
};
Which I can use this like:
// some_controller_1.js
var Foo = require('./models/foo')(app.get('database'));
Foo.find(...); // using Foo
But these small header and footer that I have to write on every single model file are a bit annoying and directly violate the "don't repeat youself" principle. Also, while not a huge issue, this is kind of a memory leak since the "sequelize" object will never be freed.
Is there a better way to do, which I didn't think about?
You can find an article about handling models with sequelize here: http://sequelizejs.com/articles/express#the-application
You basically just create a models/index.js as described here: http://sequelizejs.com/articles/express#block-3-line-0
Afterwards you just put your model definitions within files in the models folder as pointed out here: http://sequelizejs.com/articles/express#block-4-line-0

Categories

Resources