Hi, does anyone know how to modify the same array from two worker_threads in Node.js?
I push a value in worker thread 1 and pop it in worker thread 2, but worker thread 2 can't see the value added by thread 1.
// in a.js
const { isMainThread, parentPort, threadId, MessageChannel, Worker } = require('worker_threads');

global.q = [1, 2];

exports.setter_q = function (value) {
  q.push(value);
};

exports.getter_q = function () {
  return q.pop();
};

if (isMainThread) {
  var workerSche = new Worker('./w1.js');
  var workerSche1 = new Worker('./w2.js');
}

// in w1.js
const { isMainThread, parentPort, threadId, MessageChannel, Worker } = require('worker_threads');

if (isMainThread) {
  // do something
} else {
  var miniC1 = require('./a.js');
  miniC1.setter_q(250);
  // do something
}

// in w2.js
const { isMainThread, parentPort, threadId, MessageChannel, Worker } = require('worker_threads');

if (isMainThread) {
  // do something
} else {
  var miniC1 = require('./a.js');
  var qlast = miniC1.getter_q();
  // do something
}
The qlast variable in w2.js is always 2 instead of 250.
In Node.js, to share memory between threads, you have to allocate something like a SharedArrayBuffer that you can then access from multiple threads. Shared buffers are allocated in a special way that allows them to be accessed by multiple V8 threads in Node.js, whereas regular arrays cannot be.
You will then have to manage concurrency properly so you aren't attempting to update the data simultaneously from more than one thread (creating race conditions). In the cases where I've used shared memory in Node.js worker threads, I've designed the code so that only one thread ever had access to the shared memory at a time, which is one way of solving concurrency issues. There is also the Atomics API in Node.js, which lets you coordinate access so that only one thread touches a given location at a time.
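For example, here is a minimal sketch of the question's scenario using a SharedArrayBuffer and Atomics (the file names and the single-slot layout are illustrative, not the only way to wire this up):

// main.js - allocate the shared memory and hand it to both workers
const { Worker } = require('worker_threads');
const sab = new SharedArrayBuffer(4); // one Int32 slot
new Worker('./w1.js', { workerData: sab });
new Worker('./w2.js', { workerData: sab });

// w1.js - write through Atomics so the store is visible to other threads
const { workerData } = require('worker_threads');
const shared = new Int32Array(workerData);
Atomics.store(shared, 0, 250);
Atomics.notify(shared, 0); // wake anyone waiting on slot 0

// w2.js - block until slot 0 changes from its initial 0, then read it
const { workerData } = require('worker_threads');
const shared = new Int32Array(workerData);
Atomics.wait(shared, 0, 0);           // sleeps while shared[0] is still 0
console.log(Atomics.load(shared, 0)); // 250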
EDIT: This is apparently outdated and wrong. See comment below.
You can't do that in JavaScript. Objects passed to worker threads are copied by value. This is by design, so you don't have to deal with locking and all the problems that come up when multiple threads can mutate the same object.
The way to solve this problem in JavaScript is to send the work result over a message channel (again, by value).
So if you wanted to further process that object, you could pass it over a message channel from worker 1 to worker 2.
Or, if you have a worker pool, you could pass messages to the main thread and collect them in a result array there, for example.
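A sketch of that worker-to-worker wiring with a MessageChannel (file names are illustrative, and the value travels by copy):

// main.js - create a channel and transfer one port to each worker
const { Worker, MessageChannel } = require('worker_threads');
const { port1, port2 } = new MessageChannel();
const w1 = new Worker('./w1.js');
const w2 = new Worker('./w2.js');
w1.postMessage({ port: port1 }, [port1]); // ports must be in the transfer list
w2.postMessage({ port: port2 }, [port2]);

// w1.js - send the work result through the port
const { parentPort } = require('worker_threads');
parentPort.once('message', ({ port }) => {
  port.postMessage(250);
});

// w2.js - receive worker 1's result
const { parentPort } = require('worker_threads');
parentPort.once('message', ({ port }) => {
  port.on('message', (value) => console.log(value)); // 250
});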
Let's assume I have an endpoint /print. Whenever a request is made to this endpoint, it executes a function printSomething(). While printSomething() is processing, if the same or another user hits this endpoint, it will execute printSomething() again. If this occurs multiple times, the method will be executed multiple times in parallel.
app.get('/print', (req, res) => {
printSomething(someArguments);
res.send('processing receipts');
});
The issue with the default behavior is that printSomething() creates some local files which are needed during its execution, and another call to printSomething() will overwrite those files, so none of the calls return the desired result.
What I want is to make sure one execution of printSomething() has completed before another one starts, or to stop the current execution of printSomething() once a new request is made to the endpoint.
You have different options based on what you need, how many requests you expect to receive, and what you do with the files you are creating.
Solution 1
If you expect little traffic on this endpoint,
you can add some unique key to someArguments, based on the request or randomly generated, and create the files you need in a different directory per request.
Solution 2
If you think it will cause performance issues, you will have to create some sort of queue and a worker to handle the tasks.
That way you can control how many tasks are executed simultaneously, as sketched below.
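A minimal sketch of such a queue using a single promise chain (this assumes printSomething returns a promise; the names come from the question):

let queue = Promise.resolve();

app.get('/print', (req, res) => {
  // each request is appended to the chain, so jobs run one at a time
  queue = queue
    .then(() => printSomething(someArguments))
    .catch((err) => console.error(err)); // keep the chain alive after a failure
  res.send('processing receipts');
});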
If you are building a REST system with distributed calls, this sounds problematic; usually you don't want one request to block another.
If the order of operations is crucial (FIFO), then it looks like a classic queue problem.
There are many ways to implement the queue: you could use an array or similar, or implement a singleton class extending EventEmitter.
const myQueue = new Q(); // Q: a singleton queue extending EventEmitter (sketched below)
const crypto = require('crypto');

const route = (req, res) => {
  const uniqueID = crypto.randomUUID();
  // respond once this request's job has finished processing
  myQueue.once(uniqueID, (data) => {
    res.send(data);
  });
  myQueue.process(uniqueID, req.someData);
};
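The snippet assumes a Q class that isn't shown; here is a minimal sketch of one (doWork stands in for whatever job you actually run, e.g. the printSomething from the question):

const EventEmitter = require('events');

class Q extends EventEmitter {
  constructor() {
    super();
    if (Q._instance) return Q._instance; // singleton
    Q._instance = this;
    this.chain = Promise.resolve();
  }
  process(id, data) {
    // append the job to the chain and emit its unique ID when it finishes
    this.chain = this.chain
      .then(() => doWork(data)) // doWork: your printSomething-style task
      .then((result) => this.emit(id, result))
      .catch((err) => this.emit(id, { error: err.message }));
  }
}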
You could use app.locals, as the docs suggest: "Once set, the value of app.locals properties persist throughout the life of the application."
You need to use req.app.locals, something like:
req.app.locals.isPrinting = true;

// ...then check it elsewhere
if (req.app.locals.isPrinting) { /* skip or defer the new job */ }
req.app.locals is the way Express lets you access app.locals from inside a request handler.
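Putting it together, a sketch of a simple busy flag with app.locals (this rejects overlapping requests rather than queuing them, and assumes printSomething returns a promise):

app.get('/print', async (req, res) => {
  if (req.app.locals.isPrinting) {
    return res.status(429).send('a print job is already running');
  }
  req.app.locals.isPrinting = true;
  try {
    await printSomething(someArguments);
    res.send('done printing');
  } catch (err) {
    res.status(500).send(err.message);
  } finally {
    req.app.locals.isPrinting = false; // always release the flag
  }
});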
I'm trying to refactor code that uses let to declare a module scoped Auth instance which is later reassigned (in the same module) due to a configuration change. The original implementation looks like this.
let client = new Auth({ config });
// ...later in the same module
export const updateConfig = (_config) => { client = new Auth(_config) };
My first question: is the original client released after updateConfig()? How would you prove that?
Are there any drawbacks to this approach?
My proposed refactor aims to make this a little less magical, by wrapping the Auth module in a singleton with an implicit constructor. However, it requires a getter for the instance. But, in essence it does the same thing by re-assigning a reference when a new configuration is applied.
function Client(options) {
  // support calling without `new`
  if (!(this instanceof Client)) {
    return new Client(options);
  }
  // singleton: every `new Client()` returns the same object
  if (Client._instance) {
    return Client._instance;
  }
  Client._instance = this;
  this._instance = new Auth(options);
}

Client.prototype.getInstance = function () { return this._instance; };
Client.prototype.updateConfig = function (opts) {
  this._instance = new Auth(opts); // drop the old Auth, hold the new one
};
// usage
const client = new Client({ foo: 'bar'});
client.updateConfig({bar: 'baz'});
console.log(client.getInstance()); // eg. { bar: 'baz' } works!
Same questions apply, from a code safety and memory management perspective, which solution is more appropriate? These are Authentication classes, so I want to make sure they are collected properly and not potentially abused.
My first question, is the original Client released after updateConfig()?
Maybe, if client is the only variable that references it.
How would you prove that?
Take a memory dump (e.g. a heap snapshot in the devtools) and search for those client objects.
Are there any drawbacks to this approach?
No, as long as no one is referencing the client which you expect to update:
const referenced = client;
updateConfig();
console.log(referenced === client); // false
My proposed refactor aims to make this a little less magical ... but, in essence it does the same thing
Why is it "less magical" if you hide that change behind 20 lines of code? If I would be the reviewer, I would reject this change, because it introduces some unexpected behaviour, and provides no benefit whatsoever:
console.log(new Client === new Client); // true, wtf
How I would refactor that (good comments are underestimated):
// Note: client gets re-set in updateConfig(), do not reference this!
let client = new Auth({ config });
From a code safety and memory management perspective, which solution is more appropriate?
"but, in essence it does the same thing ". Wise words.
When we call new on a constructor function, it will always return a new object, which means that when client is reassigned later, it definitely holds the new value. That is one thing.
The other thing is that the JavaScript runtime's garbage collector looks for objects which are in memory but are no longer referenced from any variable and, if it finds any, removes them.
So basically when I do this
let obj = {name: 'Hello'};
obj refers to some object at, say, memory address 2ABCS, and when I reassign it
obj = {name: 'World'};
it now refers to an object at address 1ABCS, which leaves 2ABCS orphaned, meaning it will be removed by the garbage collector.
For more read https://javascript.info/garbage-collection
In JavaScript, GC is not a big concern for potentially abusing the information available in objects; the objects themselves are. With modern developer tools, one can easily get into any part of front-end code and make sense of it unless it is obfuscated. IMO, obfuscation is pretty much necessary these days: first, it reduces the file size, and second, it makes it a bit more difficult for curious users to pick apart production code.
Now coming to the actual question: once a new Auth instance is assigned to client, the old instance is no longer hard-referenced by client, so it is eligible for garbage collection, provided no other references are held. There is no guarantee on how quickly the memory is reclaimed.
The advantage of using let is its scope: it is restricted to its block. However, it is not uncommon to have huge blocks. Compared to global variables, let gives you a small scope, so the value may be released soon after the block ends. The JavaScript runtime may even keep let variables on the method stack, in which case the references are dropped as soon as the block (method) ends.
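For illustration, a block-scoped let makes the lifetime explicit (assuming nothing else captures the reference):

{
  let client = new Auth({ config });
  // ... use client ...
} // block ends: client is unreachable, so the Auth instance is collectable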
Finally, it is absolutely fine to keep it the way it is; your implementation does not offer any advantage over the previous one.
I want to establish a two-way (bidirectional) communication within my meteor app. But I need to do it without using mongo collections.
So can pub/sub be used for arbitrary in-memory objects?
Is there a better, faster, or lower-level way? Performance is my top concern.
Thanks.
Yes, pub/sub can be used for arbitrary objects. Meteor’s docs even provide an example:
// server: publish the current size of a collection
Meteor.publish("counts-by-room", function (roomId) {
var self = this;
check(roomId, String);
var count = 0;
var initializing = true;
// observeChanges only returns after the initial `added` callbacks
// have run. Until then, we don't want to send a lot of
// `self.changed()` messages - hence tracking the
// `initializing` state.
var handle = Messages.find({roomId: roomId}).observeChanges({
added: function (id) {
count++;
if (!initializing)
self.changed("counts", roomId, {count: count});
},
removed: function (id) {
count--;
self.changed("counts", roomId, {count: count});
}
// don't care about changed
});
// Instead, we'll send one `self.added()` message right after
// observeChanges has returned, and mark the subscription as
// ready.
initializing = false;
self.added("counts", roomId, {count: count});
self.ready();
// Stop observing the cursor when client unsubs.
// Stopping a subscription automatically takes
// care of sending the client any removed messages.
self.onStop(function () {
handle.stop();
});
});
// client: declare collection to hold count object
Counts = new Mongo.Collection("counts");
// client: subscribe to the count for the current room
Tracker.autorun(function () {
Meteor.subscribe("counts-by-room", Session.get("roomId"));
});
// client: use the new collection
console.log("Current room has " +
Counts.findOne(Session.get("roomId")).count +
" messages.");
In this example, counts-by-room is publishing an arbitrary object created from data returned from Messages.find(), but you could just as easily get your source data elsewhere and publish it in the same way. You just need to provide the same added and removed callbacks like the example here.
You’ll notice that on the client there’s a collection called counts, but this is purely in-memory on the client; it’s not saved in MongoDB. I think this is necessary to use pub/sub.
If you want to avoid even an in-memory-only collection, you should look at Meteor.call. You could create a Meteor.method like getCountsByRoom(roomId) and call it from the client like Meteor.call('getCountsByRoom', 123) and the method will execute on the server and return its response. This is more the traditional Ajax way of doing things, and you lose all of Meteor’s reactivity.
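A sketch of that method-based alternative, using the same names as the docs example above:

// server: return a one-off count instead of publishing it
Meteor.methods({
  getCountsByRoom: function (roomId) {
    check(roomId, String);
    return Messages.find({roomId: roomId}).count();
  }
});

// client: no reactivity, just a single round trip
Meteor.call('getCountsByRoom', Session.get('roomId'), function (err, count) {
  if (!err) console.log('Current room has ' + count + ' messages.');
});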
Just to add another easy solution: you can pass connection: null when instantiating a Collection on your server. Even though this is not well documented, I heard from the Meteor folks that it makes the collection in-memory.
Here's an example code posted by Emily Stark a year ago:
if (Meteor.isClient) {
Test = new Meteor.Collection("test");
Meteor.subscribe("testsub");
}
if (Meteor.isServer) {
Test = new Meteor.Collection("test", { connection: null });
Meteor.publish("testsub", function () {
return Test.find();
});
Test.insert({ foo: "bar" });
Test.insert({ foo: "baz" });
}
Edit
This should really go under a comment, but it turned out too long, so I'm posting it as an answer. Or perhaps I misunderstood your question?
I wonder why you are against Mongo; I somehow find it a good match with Meteor.
Anyway, everyone's use case is different, and your idea is doable, but not without some serious hacks.
If you look at the Meteor source code, you can find tools/run-mongo.js; it's where Meteor talks to Mongo, and you may tweak it or implement your own adaptor to work with your in-memory objects.
Another approach I can think of would be to wrap your in-memory objects and write a database layer to intercept the existing Mongo database communications (default port 27017); you would have to take care of all the system environment variables like MONGO_URL etc. to make it work properly.
A final approach is to wait until Meteor officially supports other databases like Redis.
Hope this helps.
I have written my code across several files for my node server.
If I have a file, say basket.js:
var Basket = {
  fruits: 0,
  addFruit: function () {
    this.fruits++; // `this.` is needed; a bare `fruits` would throw a ReferenceError
  },
  removeFruit: function () {
    this.fruits--;
  },
  printFruit: function () {
    console.log(this.fruits);
  }
};
module.export = Basket;
And I have another file called give.js:
var Basket1 = require("./basket.js");
Basket1.addFruit();
Basket1.printFruit();
And another file called take.js:
var Basket2 = require("./basket.js");
Basket2.removeFruit();
Basket2.printFruit();
Will both files write into the same instance of Basket?
In other words, will they both have control over the property, fruits?
Does node manage race conditions on its own? I.e., if two commands to modify fruits come in at the same time, one adding and one subtracting, does node know how to handle it?
If I want two files to look at a singleton at the same time and access it, is this the way to go? Or how else does one do it?
Yes, they will access the same object.
Modules are cached after the first time they are loaded. This means (among other things) that every call to require('foo') will get exactly the same object returned, if it would resolve to the same file.
– Modules docs
No, node does not manage race conditions on its own, because race conditions will not be caused by node itself. Node is single-threaded and thus no code can be executed at the same time as other code. See for example this answer for some more explanation.
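Conversely, if you ever want each file to get its own basket instead of the shared cached one, export a factory function rather than the object, roughly like this:

// basket.js
function createBasket() {
  return {
    fruits: 0,
    addFruit: function () { this.fruits++; },
    removeFruit: function () { this.fruits--; },
    printFruit: function () { console.log(this.fruits); }
  };
}
module.exports = createBasket;

// give.js - this basket is independent of every other file's basket
var basket = require('./basket.js')();
basket.addFruit();
basket.printFruit(); // 1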
I'm a beginner, but I think the correct syntax is module.exports, not module.export. Correcting it here so that people don't wonder why it doesn't work, like I just did :)
My problem is not about a "memory leak" but about a "memory purge" in my Node.js (Express) app.
My app maintains some objects in memory for fast look-ups during the service. For a while (one or two days) after starting the app, everything seemed fine, until suddenly my web client failed to look up an object because it had been purged (undefined). I suspected the JavaScript GC (garbage collection). However, as you can see in the pseudo-code, I assigned the objects to properties of the Node.js "global" object to prevent the GC from purging them. Please give me some clue as to what caused this problem.
Thanks much in advance for your kind advice.
My Node.js environment is Node.js 0.6.12, Express 2.5.8, and VMware Cloud Foundry node hosting.
Here is my app.js pseudo-code:
var express = require("express");
var app = module.exports = express.createServer();
// myMethods holds a set of methods to be used for handling raw data.
var myMethods = require("myMethods");
// creates node.js global properties referencing objects to prevent GC from purging them
global.myMethods = myMethods();
global.myObjects = {};
// omited the express configurations
// creates objects (data1, data2) inside the global.myObjects for the user by id.
app.post("/createData/:id", function(req, res) {
// creates an empty object for the user.
var myObject = global.myObjects[req.params.id] = {};
// gets json data.
var data1 = JSON.parse(req.body.data1);
var data2 = JSON.parse(req.body.data2);
// buildData1 & buildData2 functions transform data1 & data2 into the usable objects.
// these functions return the references to the transformed objects.
myObject.data1 = global.myMethods.buildData1(data1);
myObject.data2 = global.myMethods.buildData2(data2);
res.send("Created new data", 200);
res.redirect("/");
});
// returns the data1 of the user.
// Problem occurs here : myObject becomes "undefined" after one or two days running the service.
app.get("/getData1/:id", function(req, res) {
var myObject = global.myObjects[req.params.id];
if (myObject !== undefined) {
res.json(myObject.data1);
} else {
res.send(500);
}
});
// omited other service callback functions.
// VMWare cloudfoundry node.js hosting.
app.listen(process.env.VCAP_APP_PORT || 3000);
Any kind of cache system (whether it's roll-your-own or a third-party product) should account for this scenario. You should not rely on the data always being available in an in-memory cache; there are way too many things that can cause in-memory data to be gone (machine restart, process restart, et cetera).
In your case, you might need to update your code to check whether the data is in the cache. If it is not, fetch it from persistent storage (a database, a file), cache it, and continue.
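A sketch of that check-cache-then-fetch pattern against the route from the question (loadFromStore is a hypothetical persistent lookup, e.g. a database query or file read):

app.get("/getData1/:id", function (req, res) {
  var myObject = global.myObjects[req.params.id];
  if (myObject) {
    return res.json(myObject.data1);
  }
  // not cached (e.g. the process was restarted): fall back to storage
  loadFromStore(req.params.id, function (err, data) {
    if (err || !data) {
      return res.send(500);
    }
    global.myObjects[req.params.id] = data; // re-populate the in-memory cache
    res.json(data.data1);
  });
});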
Exactly like Haesung, I wanted to keep my program simple, without a database. And like Haesung, my first experience with Node.js (and Express) was to observe this weird purging. Although I was confused, I really didn't accept that I needed a storage solution to manage a JSON file with a couple of hundred lines. The light-bulb moment for me was when I read this:
If you want to have a module execute code multiple times, then export a function, and call that function.
which is taken from http://nodejs.org/api/modules.html#modules_caching. So my code inside the required file changed from this
var foo = [{"some":"stuff"}];
export.foo;
to that
exports.foo = function () {
  var foo = [{"some":"stuff"}];
  return foo;
};
And then it worked fine :-)
Then I suggest using the file system; I think a 4KB overhead is not a big deal for your goals and hardware. If you are familiar with front-end JavaScript, this could be helpful: https://github.com/coolaj86/node-localStorage
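A minimal sketch of that file-system approach for the look-up objects from the earlier question (the file name and save timing are illustrative):

var fs = require("fs");

function saveObjects() {
  fs.writeFileSync("objects.json", JSON.stringify(global.myObjects));
}

function loadObjects() {
  try {
    global.myObjects = JSON.parse(fs.readFileSync("objects.json", "utf8"));
  } catch (e) {
    global.myObjects = {}; // first run: nothing saved yet
  }
}

loadObjects();
// call saveObjects() after each mutation, or on an interval / at shutdown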