Performance heavy algorithms on Node.js - javascript

I'm creating some algorithms that are very performance-heavy, e.g. evolutionary and artificial-intelligence algorithms. What matters to me is that my update function gets called often (precision), and I just can't get setInterval to update faster than once per millisecond.
Initially I wanted to just use a while loop, but I'm not sure that those kinds of blocking loops are a viable solution in the Node.js environment. Will Socket.io's socket.on("id", cb) still work if I run into an "infinite" loop? Does my code somehow need to return control to Node.js to let it check for all the events, or is that done automatically?
And last (but not least): if while loops will indeed block my code, what is another solution for getting really low delta times between my update calls? I think threads could help, but I doubt they're possible, since my Socket.io server and other classes need to communicate somehow. By "other classes" I mean the main World class, which has an update method that needs to get called and does the heavy lifting, and a getInfo method that is used by my server. I feel like most of the time the program is just sitting there waiting for the interval to fire, wasting time instead of doing calculations...
Also, I'd like to know whether Node.js is even suited for these sorts of tasks.

You can execute heavy algorithms in a separate process using child_process.fork and wait for the results in the main process via child.on('message', function (message) { });
app.js
var child_process = require('child_process');
var child = child_process.fork('./heavy.js', ['some', 'argv', 'params']);
child.on('message', function (message) {
  // heavy results arrive here
});
heavy.js
// A busy loop is fine here, because this file runs in its own forked process
while (true) {
  if (Math.random() < 0.001) {
    process.send({ result: 'wow!' });
  }
}
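To tie this back to the question: because the busy loop lives in the forked child, the parent's event loop stays free, so a Socket.io server in app.js keeps handling events while heavy.js computes. A minimal sketch of that wiring (the Socket.io part is my own illustration, not from the original answer):
// app.js -- the parent stays responsive while the child computes
var child_process = require('child_process');
var io = require('socket.io')(3000);

var latest = null;
var child = child_process.fork('./heavy.js');
child.on('message', function (message) {
  latest = message.result; // cache the newest result from the worker
});

io.on('connection', function (socket) {
  socket.on('getInfo', function () {
    socket.emit('info', latest); // answered immediately; never blocked by heavy.js
  });
});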

Related

JavaScript - Persistent memory leaks

I have been getting persistent memory leaks with my TypeScript application (3PG). The memory management appears to be flawed.
Two applications:
2PG -> https://github.com/theADAMJR/2pg [does not have memory leaks]
3PG -> the application in question [an extension of 2PG; uses setInterval etc.]
Here is a class of 3PG that uses lots of intervals and could be a cause: https://pastebin.com/Z6K8a2vK
private schedule(uuid: string, savedGuild: GuildDocument, interval: number) {
  const task = this.findTask(uuid, savedGuild.id);
  if (!task.timer) return;

  task.status = 'ACTIVE';
  task.timeout = setInterval(
    async () => await this.sendTimer(task, savedGuild), interval);
}
I was wondering if this code could be the issue, and if so, what I am doing or should avoid doing in JavaScript that would build up memory usage. Thanks.
Update: The problem was due to discord.js calling the ready event multiple times which built memory up over time. It was also my mistake of not providing enough info to help precisely answer the question.
The problem is probably that each of those setInterval calls sets up a new repeating execution. There are two things you need to do:
Clear the interval for anything that has already done its job and is no longer needed (see the sketch below).
Hold a reference to the interval handles and clear them when the component is no longer in use.
Without knowing exactly the business logic behind your code, also consider whether setTimeout would be more appropriate for this task.
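To make the clean-up concrete, here is a minimal sketch (the class and method names are made up for illustration): keep the handle that setInterval returns and clear it when the task is done.
class TimerManager {
  constructor() {
    this.timeouts = new Map(); // interval handles, keyed by task uuid
  }

  schedule(uuid, fn, interval) {
    this.timeouts.set(uuid, setInterval(fn, interval));
  }

  cancel(uuid) {
    const handle = this.timeouts.get(uuid);
    if (!handle) return;
    clearInterval(handle);      // without this the interval keeps firing and
    this.timeouts.delete(uuid); // its closure keeps objects alive, so memory grows
  }
}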

How to initialize a child process with passed in functions in Node.js

Context
I'm building a general-purpose game-playing A.I. framework/library that uses the Monte Carlo Tree Search algorithm. The idea is quite simple: the framework provides the skeleton of the algorithm, the four main steps: Selection, Expansion, Simulation and Backpropagation. All the user needs to do is plug in four simple(ish) game-related functions of their own making:
a function that takes in a game state and returns all possible legal moves to be played
a function that takes in a game state and an action and returns a new game state after applying the action
a function that takes in a game state and determines if the game is over and returns a boolean and
a function that takes in a state and a player ID and returns a value based on whether the player has won, lost, or drawn the game.
With that, the algorithm has all it needs to run and select a move to make.
What I'd like to do
I would love to make use of parallel programming to increase the strength of the algorithm and reduce the time it needs to run each game turn. The problem I'm running into is that, when using Child Processes in NodeJS, you can't pass functions to the child process and my framework is entirely built on using functions passed by the user.
Possible solution
I have looked at this answer but I am not sure this would be the correct implementation for my needs. I don't need to be continually passing functions through messages to the child process, I just need to initialize it with functions that are passed in by my framework's user, when it initializes the framework.
I thought about one way to do it, but it seems so inelegant, on top of probably not being the most secure, that I find myself searching for other solutions. I could, when the user initializes the framework and passes his four functions to it, get a script to write those functions to a new js file (let's call it my-funcs.js) that would look something like:
const func1 = {... function implementation...}
const func2 = {... function implementation...}
const func3 = {... function implementation...}
const func4 = {... function implementation...}
module.exports = {func1, func2, func3, func4}
Then, in the child process worker file, I guess I would have to find a way to lazily require my-funcs.js. Or maybe I wouldn't; I guess it depends on how and when Node.js loads the worker file into memory. This all seems very convoluted.
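For what it's worth, the write-and-require idea could look roughly like this (a sketch only; fs.writeFileSync, the file names, and initFramework are my own illustration):
var fs = require('fs');
var child_process = require('child_process');

function initFramework(func1, func2, func3, func4) {
  // Serialize the user's functions into a module the worker can require
  var src =
    'const func1 = ' + func1.toString() + ';\n' +
    'const func2 = ' + func2.toString() + ';\n' +
    'const func3 = ' + func3.toString() + ';\n' +
    'const func4 = ' + func4.toString() + ';\n' +
    'module.exports = { func1, func2, func3, func4 };\n';
  fs.writeFileSync('./my-funcs.js', src);

  // The worker requires './my-funcs.js' at startup, so no lazy loading is needed
  return child_process.fork('./worker.js');
}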
Can you describe other ways to get the result I want?
child_process is less about running a user's function and more about spawning a new process to execute a file or command.
Node is inherently a single-threaded system, so for I/O-bound things, the Node Event Loop is really good at switching between requests, getting each one a little farther. See https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
What it looks like you're doing is trying to get JavaScript to run multiple threads simultaneously. Short answer: you can't ... or rather, it's really hard. See is it possible to achieve multithreading in nodejs?
So how would we do it anyway? You're on the right track: child_process.fork(). But it needs a hard-coded script file to run. So how do we get user-generated code into place?
I envision a datastore where you can take userFn.toString() and save it to a queue. Then fork the process, and let it pick up the next unhandled thing in the queue, marking that it did so. Then write the results to another queue, which this "GUI" thread polls against, returning the calculated results back to the user. At this point, you've got multi-threading ... and race conditions.
Another idea: create a REST service that accepts the userFn.toString() content and execs it. Then in this module, you call out to the other "thread" (service), await the results, and return them.
Security: Yeah, we just flung this out the window. Whether you're executing the user's function directly, calling child_process#fork to do it, or shimming it through a service, you're trusting untrusted code. Sadly, there's really no way around this.
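To make the toString() hand-off concrete, here is a minimal sketch (the file names and the eval reconstruction are my own illustration, with the security caveat above fully in force): the parent serializes the user's function and ships it to a forked worker, which rebuilds and runs it.
// parent.js -- serialize the user's function and send it to a forked child
var child_process = require('child_process');

var userFn = function (gameState) { return gameState.score + 1; };

var child = child_process.fork('./worker.js');
child.on('message', function (msg) {
  console.log('result from worker:', msg.result);
});
child.send({ fnSource: userFn.toString(), arg: { score: 41 } });

// worker.js -- rebuild the function from its source text and run it
process.on('message', function (msg) {
  // eval-ing untrusted code is exactly the trade-off discussed above
  var fn = eval('(' + msg.fnSource + ')');
  process.send({ result: fn(msg.arg) });
});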
Assuming that security isn't an issue, you could do something like this.
// Client side
<input id="func1"> <!-- For example the user inputs '(gamestate) => { return 1 }' -->
<input id="func2">
<input id="func3">
<input id="func4">
<script>
socket.on('syntax_error', function (err) { alert(err); });
function submit_funcs_strs() {
  // Get the function strings from the user inputs and send them as an array
  socket.emit('functions', [
    document.getElementById('func1').value,
    document.getElementById('func2').value,
    ...
  ]);
}
</script>
// Server side
// The socket listener is async
socket.on('functions', (funcs_strs) => {
  let funcs = [];
  for (let i = 0; i < funcs_strs.length; i++) {
    try {
      funcs.push(eval(funcs_strs[i]));
    } catch (e) {
      if (e instanceof SyntaxError) {
        socket.emit('syntax_error', e.message);
        return;
      }
    }
  }
  // Run the algorithm here
});

How do I simulate multiple simultaneous slow Meteor publications?

I want to simulate multiple slow subscriptions. The client subscribes to two or more publications at the same time, and the results arrive later.
The goal is to be able to see how network latencies and randomness can affect my application (it misbehaves because I expected one publication to be ready before another, ...).
Using the following short setup for the publications:
// server/foo.js
Meteor.publish('foo', function() {
  console.log('publishing foo');
  Meteor._sleepForMs(2000);
  console.log('waking up foo');
  this.ready();
});

// server/bar.js is the same with a different name
Meteor.publish('bar', function() {
  console.log('publishing bar');
  Meteor._sleepForMs(2000);
  console.log('waking up bar');
  this.ready();
});
Both publications are slowed down thanks to Meteor._sleepForMs as seen in this amazing answer.
The client then subscribes to each publication:
Meteor.subscribe('bar'); // /client/bar.js
Meteor.subscribe('foo'); // /client/foo.js
From there I expected to see both 'publishing' logs first, then both 'waking up'.
However, this appears in the console:
15:37:45 publishing bar
15:37:47 waking up bar
15:37:47 publishing foo
15:37:49 waking up foo
(I removed some irrelevant fluff like the day)
So obviously it runs in a synchronous fashion. I thought that two things could cause this: the server's _sleepForMs entirely blocking the server (which would be fairly weird), or the client subscription design.
To make sure that it wasn't the server I added a simple heartbeat:
Meteor.setInterval(function() { console.log('beep'); }, 500);
And it did not stop beeping, so the server isn't fully blocked.
I thus suspect that the issue lies within the client subscription model, which perhaps waits for one subscription to be ready before starting another...?
Thus, two questions:
Why doesn't my experiment run the way I wanted it to?
How should I modify it to achieve my desired goal (multiple slow publications)?
Meteor processes DDP messages (which include subscriptions) in sequence. This ensures that you can perform actions like deleting an object and then inserting it back in the correct order, without running into errors.
There is support for getting around this in Meteor.methods using this.unblock() to allow the next available DDP message to process without waiting for the previous one to finish executing. Unfortunately this is not available for Meteor.publish in the Meteor core. You can see discussion (and some workarounds) about this issue here: https://github.com/meteor/meteor/issues/853
There is also a package that adds this functionality to publications:
https://github.com/meteorhacks/unblock/
Why doesn't my experiment run the way I wanted it to?
Meteor._sleepForMs is blocking because of the way it is implemented:
Meteor._sleepForMs = function (ms) {
  var fiber = Fiber.current;
  setTimeout(function() {
    fiber.run();
  }, ms);
  Fiber.yield();
};
Calling it prevents the next line from executing inside the fiber until the duration passes. However, this does not block the Node server from handling other events (i.e. executing another publication) due to the way fibers work.
Here is a talk about Fibers in Meteor: https://www.youtube.com/watch?v=AWJ8LIzQMHY
How should I modify it to achieve my desired goal (multiple slow publications) ?
Try using Meteor.setTimeout to simulate latency asynchronously.
Meteor.publish('foo', function() {
  console.log('publishing foo');
  var self = this;
  Meteor.setTimeout(function () {
    console.log('waking up foo');
    self.ready();
  }, 2000);
});
I believe it's because the publications are blocking.
You can use meteorhacks:unblock to unblock publications:
https://atmospherejs.com/meteorhacks/unblock
It could be a good idea to use this.unblock() at the start of every publication (once you've added meteorhacks:unblock), as in the sketch below.
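For illustration, a publication using that package might look like this (a minimal sketch; it assumes meteorhacks:unblock is installed, which is what adds this.unblock() to the publication context):
Meteor.publish('foo', function () {
  this.unblock(); // let the next DDP message run without waiting on this one
  console.log('publishing foo');
  Meteor._sleepForMs(2000); // still blocks this fiber, but no longer the queue
  console.log('waking up foo');
  this.ready();
});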

Conflicting purposes of IndexedDB transactions

As I understand it, there are three somewhat distinct reasons to put multiple IndexedDB operations in a single transaction rather than using a unique transaction for each operation:
Performance. If you’re doing a lot of writes to an object store, it’s much faster if they happen in one transaction.
Ensuring data is written before proceeding. Waiting for the “oncomplete” event is the only way to be sure that a subsequent IndexedDB query won’t return stale data.
Performing an atomic set of DB operations. Basically, “do all of these things, but if one of them fails, roll it all back”.
#1 is fine, most databases have the same characteristic.
#2 is a little more unique, and it causes issues when considered in conjunction with #3. Let’s say I have some simple function that writes something to the database and runs a callback when it's over:
function putWhatever(obj, cb) {
  var tx = db.transaction("whatever", "readwrite");
  tx.objectStore("whatever").put(obj);
  tx.oncomplete = function () { cb(); };
}
That works fine. But now if you want to call that function as a part of a group of operations you want to atomically commit or fail, it's impossible. You'd have to do something like this:
function putWhatever(tx, obj, cb) {
  tx.objectStore("whatever").put(obj).onsuccess = function () { cb(); };
}
This second version of the function is very different from the first, because the callback runs before the data is guaranteed to be written to the database. If you try to read back the object you just wrote, you might get a stale value.
Basically, the problem is that you can only take advantage of one of #2 or #3. Sometimes the choice is clear, but sometimes not. This has led me to write horrible code like:
function putWhatever(tx, obj, cb) {
  if (tx === undefined) {
    tx = db.transaction("whatever", "readwrite");
    tx.objectStore("whatever").put(obj);
    tx.oncomplete = function () { cb(); };
  } else {
    tx.objectStore("whatever").put(obj).onsuccess = function () { cb(); };
  }
}
However even that still is not a general solution and could fail in some scenarios.
Has anyone else run into this problem? How do you deal with it? Or am I simply misunderstanding things somehow?
The following is just opinion as this doesn't seem like a 'one right answer' question.
First, performance is an irrelevant consideration. Avoid this factor entirely, unless later profiling suggests a material problem. Chances of perf issues are ridiculously low.
Second, I prefer to organize requests into transactions solely to maintain integrity. Integrity is paramount. Integrity as I define it here simply means that the database at any one point in time does not contain conflicting or erratic data. Essentially the database is never able to enter into a 'bad' state. For example, to impose a rule that cross-store object references point to valid and existing objects in other stores (a.k.a. referential integrity), or to prevent duplicated requests such as a double add/put/delete. Obviously, if the app were something like a bank app that credits/debits accounts, or a heart-attack monitor app, things could go horribly wrong.
My own experience has led me to believe that code involving indexedDB is not prone to the traditional facade pattern. I found that what worked best, in terms of organizing requests into different wrapping functions, was to design functions around transactions. I found that quite often there are very few DRY violations because every request is nearly always unique to its transactional context. In other words, while a similar 'put object' request might appear in more than one transaction, it is so distinct in its behavior given its separate context that it merits violating DRY.
If you go the function-per-request route, I am not sure why you are checking whether the transaction parameter is undefined. Have the caller create the transaction and then pass it to the requests in turn. Expect the tx to always be defined and do not over-zealously guard against it. If it is ever not defined, there is either a serious bug in indexedDB or in your calling function.
Explicitly, something like:
function doTransaction1(db, onComplete) {
  var tx = db.transaction(...);
  tx.oncomplete = onComplete; // note: the property name is all lowercase
  doRequest1(tx);
  doRequest2(tx);
  doRequest3(tx);
}

function doRequest1(tx) {
  var store = tx.objectStore(...);
  // ...
}

// ...
If the requests should not execute in parallel, and must run in a series, then this indicates a larger and more difficult design issue.
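That said, if requests within one transaction genuinely must run in series, each one can be chained from the previous request's onsuccess handler. A minimal sketch (my own illustration, assuming an object store with keyPath 'id'):
function putInSeries(db, onComplete) {
  var tx = db.transaction("whatever", "readwrite");
  tx.oncomplete = onComplete;
  var store = tx.objectStore("whatever");
  store.put({ id: 1, name: "first" }).onsuccess = function () {
    // the second put is only issued once the first has succeeded
    store.put({ id: 2, name: "second" });
  };
}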

Threads (or something like them) in JavaScript

I need a piece of code to always run, independently of other code. Is there a way to create a thread in JavaScript to run this function?
--why setTimeout didn't work for me
I tried it, but it runs just a single time. And if I call the function recursively, it throws a "too much recursion" error after some time. I need it running every 100 ms (it's communicating with an embedded system).
--as you asked, here is some code
function update(v2) {
  // I removed the use of v2 here for simplicity
  dump("update\n"); // this just prints the string
  setTimeout(new function() { update(v2); }, 100); // this attempt doesn't work
}
update(this.v);
It throws "too much recursion".
I am assuming you are asking about executing a function on a different thread. However, Javascript does not support multithreading.
See: Why doesn't JavaScript support multithreading?
The JavaScript engine in all current browsers executes on a single thread. As stated in the post above, running functions on a different thread would lead to concurrency issues, for example two functions modifying a single HTML element simultaneously.
As pointed out by others here, perhaps multi-threading is not what you actually need for your situation. setInterval might be adequate.
However, if you truly need multi-threading, JavaScript does support it through the web workers functionality. Basically, the main JavaScript thread can interact with the other threads (workers) only through events and message passing (strings, essentially). Workers do not have access to the DOM. This avoids any of the concurrency issues.
Here is the web workers spec: http://www.whatwg.org/specs/web-workers/current-work/
A more tutorial treatment: http://ejohn.org/blog/web-workers/
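As a rough illustration of that message-passing model (a minimal sketch; the file name and messages are made up):
// main.js -- spawn a worker and talk to it purely via messages
var worker = new Worker('worker.js');
worker.onmessage = function (e) {
  console.log('from worker:', e.data);
};
worker.postMessage('start');

// worker.js -- no DOM access here, only message passing
onmessage = function (e) {
  if (e.data === 'start') {
    postMessage('running in the background');
  }
};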
Get rid of the new keyword for the function you're passing to setTimeout(), and it should work.
function update(v2) {
  try {
    dump("update\n");
  } catch (err) {
    dump("Fail to update " + err + "\n");
  }
  setTimeout(function() { update(v2); }, 100);
}
update(this.v);
Or just use setInterval().
function update(v2) {
  try {
    dump("update\n");
  } catch (err) {
    dump("Fail to update " + err + "\n");
  }
}

var this_v = this.v;
setInterval(function() { update(this_v); }, 100);
EDIT: Referenced this.v in a variable since I don't know what the value of this is in your application.
window.setTimeout() is what you need.
Maybe you should look into JavaScript Workers (dedicated Web Workers provide a simple means for web content to run scripts in background threads); here is a nice article which explains how they work and how to use them:
HTML5 web mobile tutorial
You can try a loop instead of recursion.
