I'm currently working with Node.js, and have built a socket that accepts data. I am attempting to process the data in a streaming fashion, meaning that I process the data (nearly) as quickly as I receive it. There is a rather significant bottleneck in my code, however, that is preventing me from processing as quickly as I'd like.
I've distilled the problem into the code below, removing the extraneous information, but it captures my issue well enough:
require('net').createServer(function (socket) {
var foo = [];
socket.on('data', function (data) {
foo.push(data); // Accessing 'foo' causes a bottleneck
});
}).listen(8080);
Changing the code in the data event handler as follows improves performance considerably:
var tmpFoo = foo;
tmpFoo.push(data);
// Do work on tmpFoo
The problem is, I eventually need to access the global (?) variable (to save information for the next data event); incurring the performance penalty along with it. I'd much prefer to process the data as I receive it, but there does not appear to be any guarantee that it will be a "complete" message, so I'm required to buffer.
So my questions:
Is there a better way to localize the variable, and limit the performance hit?
Is there a better way to process the data in a streaming fashion?
Don't use anonymous functions like that:
createServer(function (socket) {
Define the functions separately and pass them by name, as follows:
var foo = [];
function createMyServer(socket) {
socket.on('data', receiveDataFromSocket);
}
function receiveDataFromSocket(data) {
foo.push(data);
}
require('net').createServer(createMyServer).listen(8080);
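On the second question (incomplete messages), here is a minimal sketch of per-connection buffering. It assumes messages are text terminated by a newline, which the question does not state, and `createMessageSplitter` is a hypothetical helper name rather than anything from the `net` module. The leftover data lives in a closure, so nothing has to be shared between connections.

```javascript
// Hypothetical helper: collects chunks and hands only complete messages
// to onMessage. Assumes newline-terminated text messages (an assumption,
// not something stated in the question).
function createMessageSplitter(onMessage) {
  var pending = ''; // text left over from earlier chunks

  return function handleChunk(chunk) {
    // Works for strings and Buffers. With multi-byte characters, call
    // socket.setEncoding('utf8') first so chunks arrive as whole strings.
    pending += chunk.toString();

    var end;
    // Emit every complete message that is currently buffered.
    while ((end = pending.indexOf('\n')) !== -1) {
      onMessage(pending.slice(0, end));
      pending = pending.slice(end + 1);
    }
  };
}

// Usage with a socket (each connection gets its own splitter):
// require('net').createServer(function (socket) {
//   socket.setEncoding('utf8');
//   socket.on('data', createMessageSplitter(function (message) {
//     // process one complete message here
//   }));
// }).listen(8080);
```

Only the unfinished tail is kept between data events, so complete messages can be processed as soon as they arrive.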
I'm building a simple node.js websocket server and I want to be able to send a request from a client to the server and have it just take care of things (nothing that could cause harm). Ideally the client will pass the server an object with 2 variables, one of them for the object and the other for the specific function in that object to call. Something like this:
var callObject = {
'obj': 'testObject',
'func':'testFunc'
}
var testObject = {
func: function(){
alert('it worked');
}
}
// I would expect to be able to call it with something like this.
console.log( window[callObject.obj] );
console.log( window[callObject.obj][callObject.func] );
I tried calling it with global (since node.js uses it instead of a browser's window), but it won't work; it always tells me that it can't find callObject.func of undefined. If I call console.log on callObject.obj it shows the object's variable name, as a string, as expected. If I run console.log on the object itself I get the object back.
I'm guessing this is something rather simple, but my Google-fu has failed me.
My recommendation is to resist that pattern and not let client code pick an arbitrary function to call. If you are not careful, you will have built yourself a nice large security hole, especially if you are considering using eval.
Instead, have a more explicit mapping between data sent by the client and server code (similar to what routes in Express give you).
You might have something like this
const commands = { doSomething() { ... } };
// Then you should be able to say:
let clientCommand = 'doSomething'; // from client
commands[clientCommand](param);
This should be pretty close to what you want to achieve.
Just make sure doSomething validates any parameters passed in.
For two levels of indirection:
const commandMap = { room: { join() { ...} }, chat: { add() { ... } }};
// note this is ES6 syntax
let clientCmd = 'room';
let clientFn = 'join';
commandMap[clientCmd][clientFn]();
I think you might just have to find the right place to put the command map. Show your web socket handler code.
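To make the mapping idea concrete, here is a minimal sketch of a dispatcher that takes the `obj` and `func` fields from the question's callObject. The command names and the `dispatch` helper are hypothetical. It only calls functions that are own properties of the map, so inherited names such as `constructor` are rejected.

```javascript
// Whitelist of commands the client may invoke; the names are illustrative.
var commandMap = {
  room: {
    join: function (params) { return 'joined ' + params.name; }
  },
  chat: {
    add: function (params) { return 'added ' + params.text; }
  }
};

var hasOwn = Object.prototype.hasOwnProperty;

// Hypothetical dispatcher: returns the handler's result, or throws for
// anything that is not an own function in the map.
function dispatch(request, params) {
  if (!request || typeof request.obj !== 'string' || typeof request.func !== 'string') {
    throw new Error('Malformed request');
  }
  if (!hasOwn.call(commandMap, request.obj)) {
    throw new Error('Unknown object: ' + request.obj);
  }
  var group = commandMap[request.obj];
  if (!hasOwn.call(group, request.func) || typeof group[request.func] !== 'function') {
    throw new Error('Unknown function: ' + request.func);
  }
  // Each handler is still responsible for validating its own params.
  return group[request.func](params);
}

// Usage: dispatch({ obj: 'room', func: 'join' }, { name: 'lobby' });
```

Nothing is looked up on global, so the client can only reach what was deliberately put in the map.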
I want to define the same method for both client and server, so I have the following code inside common/methods.js.
Meteor.methods({
doSomething: function (obj) {
if(this.isSimulation) {
console.log('client');
}
if(!this.isSimulation) {
console.log('server');
}
console.log('both');
return "some string";
}
});
Then I called this method inside client/dev.js.
Meteor.call('doSomething', someObj, function (e, data) {
console.log(e);
console.log(data);
});
On the server's console, I can read:
I20150622-21:56:40.460(8)? server
I20150622-21:56:40.461(8)? both
On the client's (Chrome for Ubuntu v43.0.2357.125 (64-bit)) console, the e and data arguments are printed, but the console.log() from the Meteor method is not, where I expected it to output the strings
client
both
Why does console.log() not work on the client inside Meteor methods?
To debug, I split the Meteor.methods definition into separate client and server code. I then introduced a large loop on the server side so that the operation takes a long time to complete, while the client side is very quick.
server
doSomething: function (obj) {
var x = 0;
for(var i = 0; i< 9999999; i++) {
x++;
}
console.log(x);
return "some string";
}
client
doSomething: function (obj) {
console.log('client');
}
Still, no message is printed on the client.
Thanks to #kainlite for helping me debug this together. It turns out the problem was a simple one of file load order.
I defined my methods in common/methods.js, whereas my client-side calls were made in client/dev.js, which gets loaded first.
So when the call was made the method wasn't defined, hence it won't run. Moving the methods.js file inside the /lib directory fixed the issue.
Methods are only executed on the server; they are the sync way of doing a remote call.
Methods
Methods are server functions that can be called from the client. They
are useful in situations where you want to do something more
complicated than insert, update or remove, or when you need to do data
validation that is difficult to achieve with just allow and deny.
http://docs.meteor.com/#/basic/Meteor-users
I want to establish a two-way (bidirectional) communication within my meteor app. But I need to do it without using mongo collections.
So can pub/sub be used for arbitrary in-memory objects?
Is there a better, faster, or lower-level way? Performance is my top concern.
Thanks.
Yes, pub/sub can be used for arbitrary objects. Meteor’s docs even provide an example:
// server: publish the current size of a collection
Meteor.publish("counts-by-room", function (roomId) {
var self = this;
check(roomId, String);
var count = 0;
var initializing = true;
// observeChanges only returns after the initial `added` callbacks
// have run. Until then, we don't want to send a lot of
// `self.changed()` messages - hence tracking the
// `initializing` state.
var handle = Messages.find({roomId: roomId}).observeChanges({
added: function (id) {
count++;
if (!initializing)
self.changed("counts", roomId, {count: count});
},
removed: function (id) {
count--;
self.changed("counts", roomId, {count: count});
}
// don't care about changed
});
// Instead, we'll send one `self.added()` message right after
// observeChanges has returned, and mark the subscription as
// ready.
initializing = false;
self.added("counts", roomId, {count: count});
self.ready();
// Stop observing the cursor when client unsubs.
// Stopping a subscription automatically takes
// care of sending the client any removed messages.
self.onStop(function () {
handle.stop();
});
});
// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");
// client: subscribe to the count for the current room
Tracker.autorun(function () {
Meteor.subscribe("counts-by-room", Session.get("roomId"));
});
// client: use the new collection
console.log("Current room has " +
Counts.findOne(Session.get("roomId")).count +
" messages.");
In this example, counts-by-room is publishing an arbitrary object created from data returned from Messages.find(), but you could just as easily get your source data elsewhere and publish it in the same way. You just need to provide the same added and removed callbacks like the example here.
You’ll notice that on the client there’s a collection called counts, but this is purely in-memory on the client; it’s not saved in MongoDB. I think this is necessary to use pub/sub.
If you want to avoid even an in-memory-only collection, you should look at Meteor.call. You could create a Meteor.method like getCountsByRoom(roomId) and call it from the client like Meteor.call('getCountsByRoom', 123) and the method will execute on the server and return its response. This is more the traditional Ajax way of doing things, and you lose all of Meteor’s reactivity.
Just to add another easy solution: you can pass connection: null to your Collection instantiation on your server. Although this is not well documented, I heard from the Meteor folks that this makes the collection in-memory.
Here's some example code posted by Emily Stark a year ago:
if (Meteor.isClient) {
Test = new Meteor.Collection("test");
Meteor.subscribe("testsub");
}
if (Meteor.isServer) {
Test = new Meteor.Collection("test", { connection: null });
Meteor.publish("testsub", function () {
return Test.find();
});
Test.insert({ foo: "bar" });
Test.insert({ foo: "baz" });
}
Edit
This should go under a comment, but I found it would be too long for that, so I am posting it as an answer. Or perhaps I misunderstood your question?
I wonder why you are against mongo. I somehow find it a good match with Meteor.
Anyway, everyone's use case can be different, and your idea is doable, but not without some serious hacks.
If you look at the Meteor source code, you can find tools/run-mongo.js, which is where Meteor talks to mongo. You may tweak it or implement your own adaptor to work with your in-memory objects.
Another approach I can think of is to wrap your in-memory objects and write a database logic/layer to intercept existing mongo database communications (default port 27017). You have to take care of all system environment variables like MONGO_URL etc. to make it work properly.
The final approach is to wait until Meteor officially supports other databases like Redis.
Hope this helps.
Decided to test out Meteor JS today to see if I would be interested in building my next project with it and decided to start out with the Deps library.
To get something up extremely quick to test this feature out, I am using the 500px API to simulate changes. After reading through the docs quickly, I thought I would have a working example of it on my local box.
The function seems to autorun only once, which is not how it is supposed to work based on my initial understanding of this feature in Meteor.
Any advice would be greatly appreciated. Thanks in advance.
if (Meteor.isClient) {
var Api500px = {
dep: new Deps.Dependency,
get: function () {
this.dep.depend();
return Session.get('photos');
},
set: function (res) {
Session.set('photos', res.data.photos);
this.dep.changed();
}
};
Deps.autorun(function () {
Api500px.get();
Meteor.call('fetchPhotos', function (err, res) {
if (!err) Api500px.set(res);
else console.log(err);
});
});
Template.photos.photos = function () {
return Api500px.get();
};
}
if (Meteor.isServer) {
Meteor.methods({
fetchPhotos: function () {
var url = 'https://api.500px.com/v1/photos';
return HTTP.call('GET', url, {
params: {
consumer_key: 'my_consumer_key_here',
feature: 'fresh_today',
image_size: 2,
rpp: 24
}
});
}
});
}
Welcome to Meteor! A couple of things to point out before the actual answer...
Session variables have reactivity built in, so you don't need to use the Deps package to add Deps.Dependency properties when you're using them. This isn't to suggest you shouldn't roll your own reactive objects like this, but if you do so then its get and set functions should return and update a normal JavaScript property of the object (like value, for example), rather than a Session variable, with the reactivity being provided by the depend and changed methods of the dep property. The alternative would be to just use the Session variables directly and not bother with the Api500px object at all.
It's not clear to me what you're trying to achieve reactively here - apologies if it should be. Are you intending to repeatedly run fetchPhotos in an infinite loop, such that every time a result is returned the function gets called again? If so, it's really not the best way to do things - it would be much better to subscribe to a server publication (using Meteor.subscribe and Meteor.publish), get this publication function to run the API call with whatever the required regularity, and then publish the results to the client. That would dramatically reduce client-server communication with the same net result.
Having said all that, why would it only be running once? The two possible explanations that spring to mind would be that an error is being returned (and thus Api500px.set is never called), or the fact that a Session.set call doesn't actually fire a dependency changed event if the new value is the same as the existing value. However, in the latter case I would still expect your function to run repeatedly as you have your own depend and changed structure surrounding the Session variable, which does not implement that self-limiting logic, so having Api500px.get in the autorun should mean that it reruns when Api500px.set returns even if the Session.set inside it isn't actually doing anything. If it's not the former diagnosis then I'd just log everything in sight and the answer should present itself.
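To illustrate the first point above (a reactive object backed by a plain value rather than a Session variable), here is a minimal sketch. `makeReactiveValue` is a hypothetical helper, not part of Meteor. It takes the dependency constructor as a parameter, so in a Meteor app you would pass Deps.Dependency, and it relies only on the depend and changed methods already used in the question.

```javascript
// Hypothetical helper: wraps a plain value and makes it reactive.
// `Dependency` is expected to be a constructor whose instances have
// depend() and changed(), such as Meteor's Deps.Dependency.
function makeReactiveValue(Dependency, initialValue) {
  var dep = new Dependency();
  var value = initialValue; // ordinary variable, no Session involved

  return {
    get: function () {
      dep.depend();   // register the calling computation as a dependent
      return value;
    },
    set: function (nextValue) {
      value = nextValue;
      dep.changed();  // invalidate every dependent computation
    }
  };
}

// Usage inside Meteor (assumed, mirrors the question's code):
// var photos = makeReactiveValue(Deps.Dependency, []);
// Template.photos.photos = function () { return photos.get(); };
```

Note that calling get and set from the same autorun, as the question's code does, makes the computation invalidate itself every time a result arrives.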
My problem is not about "memory leakage", but about a "memory purge" in a node.js (expressjs) app.
My app should maintain some objects in memory for fast lookups during the service. For a while (one or two days) after starting the app, everything seemed fine, until suddenly my web client failed to look up an object because it had been purged (undefined). I suspect JavaScript GC (garbage collection). However, as you can see in the pseudo-code, I assigned the objects to properties of the node.js "global" variable to prevent GC from purging them. Please give me some clue as to what caused this problem.
Thanks in advance for your kind advice.
My environment is node.js 0.6.12, expressjs 2.5.8, and VMWare cloudfoundry node hosting.
Here is my app.js pseudo-code :
var express = require("express");
var app = module.exports = express.createServer();
// myMethods holds a set of methods to be used for handling raw data.
var myMethods = require("myMethods");
// creates node.js global properties referencing objects to prevent GC from purging them
global.myMethods = myMethods();
global.myObjects = {};
// omitted the express configurations
// creates objects (data1, data2) inside the global.myObjects for the user by id.
app.post("/createData/:id", function(req, res) {
// creates an empty object for the user.
var myObject = global.myObjects[req.params.id] = {};
// gets json data.
var data1 = JSON.parse(req.body.data1);
var data2 = JSON.parse(req.body.data2);
// buildData1 & buildData2 functions transform data1 & data2 into the usable objects.
// these functions return the references to the transformed objects.
myObject.data1 = global.myMethods.buildData1(data1);
myObject.data2 = global.myMethods.buildData2(data2);
res.send("Created new data", 200);
res.redirect("/");
});
// returns the data1 of the user.
// Problem occurs here : myObject becomes "undefined" after one or two days running the service.
app.get("/getData1/:id", function(req, res) {
var myObject = global.myObjects[req.params.id];
if (myObject !== undefined) {
res.json(myObject.data1);
} else {
res.send(500);
}
});
// omitted other service callback functions.
// VMWare cloudfoundry node.js hosting.
app.listen(process.env.VCAP_APP_PORT || 3000);
Any kind of cache system (whether it is roll-your-own or a third-party product) should account for this scenario. You should not rely on the data always being available in an in-memory cache. There are far too many things that can cause in-memory data to be lost (machine restart, process restart, et cetera).
In your case, you might need to update your code to see if the data is in cache. If it is not in cache then fetch it from a persistent storage (a database, a file), cache it, and continue.
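A minimal sketch of that check-then-fetch flow is below. `createCachedLookup` is a hypothetical helper, and `loadFromStore(id, callback)` stands in for whatever persistent storage you use (a database, a file); neither comes from Express or node.js itself.

```javascript
// Hypothetical cache-aside helper. loadFromStore(id, callback) is an
// assumed function that reads the value from persistent storage and
// calls back with (err, value).
function createCachedLookup(loadFromStore) {
  var cache = Object.create(null); // no prototype, so any id is a safe key

  return function lookup(id, callback) {
    // Fast path: the value is still in memory.
    if (id in cache) {
      return callback(null, cache[id]);
    }
    // Slow path: the process may have restarted, so reload and re-cache.
    loadFromStore(id, function (err, value) {
      if (err) {
        return callback(err);
      }
      cache[id] = value;
      callback(null, value);
    });
  };
}

// Usage in a route (assumed shape, based on the question's code):
// var getUserData = createCachedLookup(readUserDataFromDisk);
// app.get('/getData1/:id', function (req, res) {
//   getUserData(req.params.id, function (err, obj) {
//     if (err) { return res.send(500); }
//     res.json(obj.data1);
//   });
// });
```

With this shape a restart costs one extra read per id instead of a failed request.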
Exactly like Haesung I wanted to keep my program simple, without database. And like Haesung my first experience with Node.js (and express) was to observe this weird purging. Although I was confused, I really didn't accept that I needed a storage solution to manage a json file with a couple of hundred lines. The light bulb moment for me was when I read this
If you want to have a module execute code multiple times, then export a function, and call that function.
which is taken from http://nodejs.org/api/modules.html#modules_caching. So my code inside the required file changed from this
var foo = [{"some":"stuff"}];
exports.foo = foo;
to that
exports.foo = function (bar) {
var foo = [{"some":"stuff"}];
return foo.bar;
}
And then it worked fine :-)
Then I suggest using the file system; I think a 4KB overhead is not a big deal for your goals and hardware. If you are familiar with front-end JavaScript, this could be helpful: https://github.com/coolaj86/node-localStorage
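If you would rather not add a dependency, a minimal sketch using only node's built-in fs module could look like this. `saveObjects` and `loadObjects` are hypothetical helper names, and the file path is whatever you choose; the synchronous calls are fine for a small JSON file but will block the event loop on large ones.

```javascript
var fs = require('fs');

// Hypothetical helper: writes the objects as JSON. It writes to a
// temporary file first and then renames it, so a crash in the middle
// of a write does not leave a truncated file behind.
function saveObjects(filePath, objects) {
  var tmpPath = filePath + '.tmp';
  fs.writeFileSync(tmpPath, JSON.stringify(objects));
  fs.renameSync(tmpPath, filePath);
}

// Hypothetical helper: reads the objects back, or returns an empty
// object when the file does not exist yet (first start).
function loadObjects(filePath) {
  try {
    return JSON.parse(fs.readFileSync(filePath, 'utf8'));
  } catch (err) {
    if (err.code === 'ENOENT') {
      return {};
    }
    throw err;
  }
}

// Usage (assumed path):
// var myObjects = loadObjects('./myObjects.json');
// myObjects.alice = { data1: [1, 2] };
// saveObjects('./myObjects.json', myObjects);
```

Loading once at startup and saving after each change keeps the in-memory lookups fast while surviving a process restart.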