Is it possible to pause/resume a web worker externally? - javascript

I've seen that web workers have a terminate() function, but is there a way to pause and resume web workers from the host thread similar to interrupts?
The critical factor here is that calling terminate() on a web worker destroys the worker's internal state (and any operation in progress may be left corrupted in a half-complete state). I'm looking for a mechanism to switch between hundreds or thousands of tasks spread across as many web workers as the machine has cores, without losing state within those tasks and without relying on cooperative multithreading styles like fibers.

Is it possible to pause/resume a web worker externally?
No. Browser JavaScript does not provide a means of externally pausing a web worker without some cooperation from the worker itself.
A work-around (with cooperation from the worker) would have the worker run a chunk of code, then allow a message from the outside world to arrive telling it to pause, and then have the worker itself not call the next unit of work until it receives a message to resume. But because web workers are event driven too, a worker can't receive a message from the outside world until it finishes what it is doing and returns to the event loop.
In practice this means executing a chunk of work, then setting a timer for a few ms in the future so pending messages can arrive. When a pause/resume message arrives, you store the new state. When the timer fires, you check that state: if it is paused, you don't call the next unit of work; if it isn't, you execute the next unit of work and, when it finishes, call setTimeout() again with a very short delay (letting any pending messages process) and keep repeating.
Unfortunately, this technique requires you to chunk your work, which negates some of the benefit of web workers in the first place (though you still gain the use of more than one CPU to process work).
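A minimal sketch of that cooperative chunking, assuming a hypothetical doNextUnitOfWork() that performs one small slice of the job (the function name and the message strings are illustrative, not part of any API):
worker.js
let paused = false;

onmessage = (e) => {
  if (e.data === "pause") paused = true;
  if (e.data === "resume" && paused) {
    paused = false;
    scheduleNext(); // pick up where we left off
  }
};

function scheduleNext() {
  // setTimeout(..., 0) returns to the event loop so pending messages can be delivered
  setTimeout(() => {
    if (paused) return; // state is preserved; we simply stop scheduling more work
    doNextUnitOfWork();
    scheduleNext();
  }, 0);
}

scheduleNext();
The main thread then pauses and resumes with worker.postMessage("pause") and worker.postMessage("resume").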
I'm looking for a mechanism by which to switch between hundreds-thousands of tasks held in however many web workers equal the total cores on the machine, without losing state within those tasks
You would need each running worker to cooperate: let a message from the outside world be processed so the worker can see that it is being asked to stop, and then have it cooperatively not call the next unit of work until a later message tells it to start working again.
If you're going to have thousands of these, you probably don't want them all to be individual web workers anyway, as that's a fairly heavyweight setup. You probably want a small pool of workers servicing a work queue: you queue up units of work, each worker grabs the next unit from the queue, processes it, grabs the next one, and so on, while your main thread keeps the queue fed with whatever work you want done (a sketch of this pattern follows). This also requires chunking the work into proper units.
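A rough sketch of that pool/queue pattern, where buildWorkItems(), handleResult() and processItem() are hypothetical application functions and queue-worker.js is an assumed file name:
main.js
const queue = buildWorkItems(); // array of serializable work units
const poolSize = navigator.hardwareConcurrency || 4;

for (let i = 0; i < poolSize; i++) {
  const w = new Worker("queue-worker.js");
  w.onmessage = (e) => {
    handleResult(e.data);                           // collect the finished unit
    if (queue.length) w.postMessage(queue.shift()); // feed the next unit
  };
  if (queue.length) w.postMessage(queue.shift());   // prime each worker
}
queue-worker.js
onmessage = (e) => {
  postMessage(processItem(e.data)); // process one unit and report the result
};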

If you are targeting recent browsers with access to SharedArrayBuffer and Atomics, the simplest solution is to pass a SharedArrayBuffer reference to the worker:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);

const start = () => {
  const worker = new Worker("worker.js");
  worker.postMessage({kind: "init", sab});
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
The worker's loop then waits on that slot: it blocks while the value is 1 (paused) and proceeds while it is 0 (running).
worker.js
onmessage = (event) => {
  if (event.data.kind === "init") {
    const int32 = new Int32Array(event.data.sab);
    while (true) {
      Atomics.wait(int32, 0, 1); // blocks only while the flag is 1 (paused)
      // ... your bitcoin miner goes here
    }
  }
};
If your worker does not run a synchronous main loop but instead consists of async operations, you can block its event loop (preventing any other line of code from executing) as follows:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);
let worker;

const start = () => {
  worker = new Worker("worker.js");
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
  worker.postMessage({kind: "pause", sab});
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
worker.js
// ... async/await, fetch(), setTimeout() — whatever the worker normally does
onmessage = (event) => {
  if (event.data.kind === "pause") {
    // Blocks the worker's event loop until the main thread stores 0 and notifies
    Atomics.wait(new Int32Array(event.data.sab), 0, 1);
  }
};

Related

Asynchronously stopping a loop from outside node.js

I am using node.js 14 and currently have a loop that is made by a recursive function and a setTimeout, something like this:
this.timer = null;

async recursiveLoop() {
  //Do Stuff
  this.timer = setTimeout(this.recursiveLoop.bind(this), rerun_time);
}
But sometimes this loop gets stuck and I want it to automatically notice it, clean up and restart. So I tried doing something like this:
this.timer = null;

async recursiveLoop() {
  this.long_timer = setTimeout(() => { throw new Error('Taking too long!'); }, tooLong);
  //Do Stuff
  this.timer = setTimeout(this.recursiveLoop.bind(this), rerun_time);
}

main() {
  //Do other asynchronous stuff
  recursiveLoop()
    .then()
    .catch((e) => {
      console.log(e.message);
      cleanUp();
      recursiveLoop();
    });
}
I can't quite debug where it gets stuck, because it seems quite random and the program runs on a virtual machine. I still couldn't reproduce it locally.
This makeshift solution, instead of working, keeps crashing the whole node.js application, and now I am the one who's stuck. I have the constraint of working with node.js 14, without using microservices, and I have never used child processes before. I am a complete beginner. Please help me!
If you have a black box of code (which is all you've given us) with no way to detect errors in it, and you just want to know when it is no longer producing results, you can put it in a child_process and have the code in the child process send you a message every time it completes an iteration. Then, in your main process, you set a timer that resets itself every time it gets one of these "health" messages from the child. If the timer fires without a health message having arrived, the child must be "stuck", because you haven't heard from it within your timeout window. You can then kill the child process and restart it.
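A rough sketch of that watchdog idea, assuming an illustrative loop-worker.js file and a 30-second timeout (both are assumptions, not from the question):
// parent.js
const { fork } = require('child_process');

function startChild() {
  const child = fork('./loop-worker.js'); // the stuck-prone loop lives here
  let watchdog;

  const resetWatchdog = () => {
    clearTimeout(watchdog);
    watchdog = setTimeout(() => {
      console.log('no health message for 30s, restarting child');
      child.kill();
      startChild();
    }, 30000);
  };

  child.on('message', (msg) => {
    if (msg === 'health') resetWatchdog(); // child reports each iteration
  });
  resetWatchdog();
}

startChild();
Inside the child, the loop simply calls process.send('health') at the end of every iteration.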
But that is a giant hack. You should FIX the code that gets stuck, or at least understand what's going on. Probably you're leaking memory, file handles, or database handles, running code that takes locks and mishandles them, or there are unhandled errors happening. All of these are indications of code that should be fixed.

How to initialize a child process with passed in functions in Node.js

Context
I'm building a general-purpose game-playing A.I. framework/library that uses the Monte Carlo Tree Search algorithm. The idea is quite simple: the framework provides the skeleton of the algorithm, the four main steps (Selection, Expansion, Simulation and Backpropagation). All the user needs to do is plug in four simple(ish) game-related functions of his making:
a function that takes in a game state and returns all possible legal moves to be played
a function that takes in a game state and an action and returns a new game state after applying the action
a function that takes in a game state and determines if the game is over and returns a boolean and
a function that takes in a state and a player ID and returns a value based on whether the player has won, lost, or the game is a draw. With that, the algorithm has all it needs to run and select a move to make.
What I'd like to do
I would love to make use of parallel programming to increase the strength of the algorithm and reduce the time it needs to run each game turn. The problem I'm running into is that, when using Child Processes in NodeJS, you can't pass functions to the child process and my framework is entirely built on using functions passed by the user.
Possible solution
I have looked at this answer but I am not sure this would be the correct implementation for my needs. I don't need to be continually passing functions through messages to the child process, I just need to initialize it with functions that are passed in by my framework's user, when it initializes the framework.
I thought about one way to do it, but it seems so inelegant, on top of probably not being the most secure, that I find myself searching for other solutions. I could, when the user initializes the framework and passes his four functions to it, get a script to write those functions to a new js file (let's call it my-funcs.js) that would look something like:
const func1 = {... function implementation...}
const func2 = {... function implementation...}
const func3 = {... function implementation...}
const func4 = {... function implementation...}
module.exports = {func1, func2, func3, func4}
Then, in the child process worker file, I guess I would have to find a way to lazily require my-funcs.js. Or maybe I wouldn't; I guess it depends how and when Node.js loads the worker file into memory. This all seems very convoluted.
Can you describe other ways to get the result I want?
child_process is less about running a user's function and more about starting a new process to exec a file or command.
Node is inherently a single-threaded system, so for I/O-bound things, the Node Event Loop is really good at switching between requests, getting each one a little farther. See https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
What it looks like you're doing is trying to get JavaScript to run multiple threads simultaneously. Short answer: can't ... or rather it's really hard. See is it possible to achieve multithreading in nodejs?
So how would we do it anyway? You're on the right track: child_process.fork(). But it needs a hard-coded script file to run, so how do we get user-generated code into place?
I envision a datastore where you take userFn.toString() and save it to a queue. Then fork the process and let it pick up the next unhandled item in the queue, marking that it did so. It then writes the results to another queue, and this "GUI" thread polls against that queue, returning the calculated results back to the user. At this point, you've got multi-threading ... and race conditions.
Another idea: create a REST service that accepts the userFn.toString() content and execs it. Then in this module, you call out to the other "thread" (service), await the results, and return them.
Security: Yeah, we just flung this out the window. Whether you're executing the user's function directly, calling child_process#fork to do it, or shimming it through a service, you're trusting untrusted code. Sadly, there's really no way around this.
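A minimal sketch of the fork-and-serialize idea, where mcts-child.js, legalMoves and the message shapes are all illustrative assumptions (and, as noted above, eval of user-supplied code is a trust decision):
// parent.js
const { fork } = require('child_process');

// The user-supplied function, serialized to source text
const legalMoves = (state) => state.moves.filter((m) => m.legal);

const child = fork('./mcts-child.js');
child.send({ kind: 'init', fns: { legalMoves: legalMoves.toString() } });
child.send({ kind: 'run', state: { moves: [{ legal: true }, { legal: false }] } });
child.on('message', (result) => console.log('result from child:', result));

// mcts-child.js
const fns = {};
process.on('message', (msg) => {
  if (msg.kind === 'init') {
    // Rehydrate the serialized function; wrapping it in parentheses handles
    // both arrow functions and function expressions
    fns.legalMoves = eval('(' + msg.fns.legalMoves + ')');
  } else if (msg.kind === 'run') {
    process.send(fns.legalMoves(msg.state));
  }
});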
Assuming that security isn't an issue you could do something like this.
// Client side
<input id="func1"> <!-- For example the user types '(gamestate)=>{return 1}' -->
<input id="func2">
<input id="func3">
<input id="func4">
<script>
socket.on('syntax_error', function (err) { alert(err); });

function submit_funcs_strs() {
  // Get the function strings from the user inputs and put them into an array
  socket.emit('functions', [
    document.getElementById('func1').value,
    document.getElementById('func2').value,
    document.getElementById('func3').value,
    document.getElementById('func4').value
  ]);
}
</script>

// Server side
// Socket listener is async
socket.on('functions', (funcs_strs) => {
  let funcs = [];
  for (let i = 0; i < funcs_strs.length; i++) {
    try {
      funcs.push(eval(funcs_strs[i]));
    } catch (e) {
      if (e instanceof SyntaxError) {
        socket.emit('syntax_error', e.message);
        return;
      }
    }
  }
  // Run algorithm here
});

rxjs pausableBuffered multiple subscriptions

I'm trying to write a websocket rxjs based wrapper.
And I'm struggling with my rxjs understanding.
I have a pause stream which is supposed to pause the pausable buffered streams when an error occurs and resume them once I get an "ok" from the websocket.
Somehow only the first subscription on my pausable buffered streams is fired. From then on, the queue just stacks up higher and higher.
I have prepared a jsbin to reproduce the issue.
https://jsbin.com/mafakar/edit?js,console
There the "msg recived" stream only fires for the first subscription. And then the q and observer begin stacking up.
I somehow have the feeling this is about hot and cold obserables but I cannot grasp the issues. I would appreciate any help.
Thank you in advance!
It is not the cold/hot issue. What you do in your onMessage is subscribe, then dispose. The dispose terminates the sequence. The onMessageStream should be subscribed to only once, for example, in the constructor:
this.onmessageStream.subscribe(message => console.log('--- msg --- ', message.data));
The subscribe block, including the dispose, should be removed.
Also, note that you used a ReplaySubject without a buffer size, which means the queue holds all previous values. Unless that is the desired behavior, consider changing it to ReplaySubject(1).
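A tiny illustration of the difference (RxJS 4 style, as in the question):
const all = new Rx.ReplaySubject();    // replays every value to late subscribers
const last = new Rx.ReplaySubject(1);  // replays only the most recent value

[1, 2, 3].forEach((v) => { all.onNext(v); last.onNext(v); });

all.subscribe((v) => console.log('all:', v));   // 1, 2, 3
last.subscribe((v) => console.log('last:', v)); // 3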
Here is a working jsbin.
As @Meir pointed out, dispose in a subscribe block is a no-no since its behavior is non-deterministic. In general I would avoid the use of Subjects and rely on factory methods instead. You can see a refactored version here: https://jsbin.com/popixoqafe/1/edit?js,console
A quick breakdown of the changes:
class WebSocketWrapper {
  // Inject the pauser from the external state
  constructor(pauser) {
    // Input you need to keep as a subject
    this.input$ = new Rx.Subject();
    // Create the socket
    this._socket = this._connect();
    // Create a stream for the open event
    this.open$ = Rx.Observable.fromEvent(this._socket, 'open');
    // This concats the external pauser with the open event. The result is a
    // pauser that won't unpause until the socket is open.
    this.pauser$ = Rx.Observable.concat(
      this.open$.take(1).map(() => true),
      pauser || Rx.Observable.empty()
    )
    .startWith(false);
    // Subscribe and buffer the input
    this.input$
      .pausableBuffered(this.pauser$)
      .subscribe(msg => this._socket.send(msg));
    // Create a stream around the message event
    this.message$ = Rx.Observable.fromEvent(this._socket, 'message')
      // Buffer the messages
      .pausableBuffered(this.pauser$)
      // Create a shared version of the stream and always replay the last
      // value to new subscribers
      .shareReplay(1);
  }
  send(request) {
    // Push to input
    this.input$.onNext(request);
  }
  _connect() {
    return new WebSocket('wss://echo.websocket.org');
  }
}
As an aside, you should also avoid relying on internal variables like source, which are not meant for external consumption. Although RxJS 4 is relatively stable, since those internals are not meant for public use they could be changed out from under you.

Rxjs: What scenario do you want to use scheduler

I don't understand what the RxJS documentation means by a scheduler, so I'm trying to understand it through scenarios where it is useful.
tl;dr
In most cases you will never need to concern yourself with Schedulers, if only because for 90% of cases the default is fine.
Explanation
A Scheduler is simply a way of standardizing time when using RxJS. It effectively schedules events to occur at some time in the future.
We do this by using the schedule method to queue up new operations that the scheduler will execute in the future. How the Scheduler does this is completely up to the implementation. Often, though, it is simply about choosing the most efficient means of executing a future action.
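For example (RxJS 6 naming, which may differ from the version you are using), you can queue work on a scheduler directly:
import { asyncScheduler } from 'rxjs';

// Run a callback roughly 1000 ms from now, via the scheduler rather than calling setTimeout directly
asyncScheduler.schedule(() => console.log('scheduled work'), 1000);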
Take a simple example where we are using the timer operator to execute an action at some time in the future.
var source = Observable.timer(500);
This is pretty standard fare for RxJS. The Scheduler comes in when you ask the question: what does 500 mean? In the default case it equals 500 milliseconds, because that is the convention and that is what the default Scheduler does: it waits 500 milliseconds and then emits an event.
However, there are cases where we may not want the flow of time to operate normally. The most common use case is testing. We don't actually want to wait 500 milliseconds for a task to complete; otherwise our test suite would take ages to finish!
In that case we would want to control the flow of time so that we don't have to wait for 500 milliseconds to elapse before we can verify the result of a stream. Here we could use the TestScheduler, which can execute tasks synchronously so that we don't have to deal with any of that asynchronous messiness.
let scheduler = new TestScheduler();

// Override the default scheduler with the test scheduler
let source = Observable.timer(500, scheduler);

// Subscribe to the source, which behaves normally
source.subscribe(x => expect(x).to.be(0));

// When this gets called, all pending actions get executed
scheduler.flush();
There are some other corner cases where we want to alter the flow of time as well. For instance, if we are operating in the context of a game, we would likely want to link our scheduling to requestAnimationFrame or to some other faux time scale, which would call for something like the AnimationFrameScheduler or the VirtualTimeScheduler.
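A small sketch of that (RxJS 6 naming; the operator choice is just an illustration), emitting on animation frames instead of on a macrotask timer:
import { interval, animationFrameScheduler } from 'rxjs';
import { take } from 'rxjs/operators';

interval(0, animationFrameScheduler) // ticks on requestAnimationFrame rather than setInterval
  .pipe(take(10))
  .subscribe(frame => console.log('frame', frame));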
Scenario 1
You have an initial value and want the subscriber to first get that initial value and then get some other value (depending on a condition).
import { BehaviorSubject, asyncScheduler } from 'rxjs';
import { observeOn, tap } from 'rxjs/operators';

const subject = new BehaviorSubject("INITIAL");
const dispatcher = subject.pipe(observeOn(asyncScheduler));

let did = false; // condition

dispatcher.pipe(
  tap((value) => {
    if (!did) {
      did = true;
      subject.next("SECOND");
    }
  }))
  .subscribe((state) => {
    console.log('Subscription value: ', state);
  });

// Output: INITIAL ... SECOND
Without the observeOn(asyncScheduler) step it will output the values in the opposite order, since Subject.next is a synchronous operation.
codepen example

How to run an unblocking background task in a Meteor/JavaScript client?

I'd like to run a task on a Meteor client which is resource hungry in the background and keep the interface responsive for the user in the meantime. The task does some math (for example finding prime numbers like described here: https://stackoverflow.com/a/22930538/2543628 ).
I've tried to follow the tips from https://stackoverflow.com/a/21351966 but still the interface always "freezes" until the task is complete.
setTimeout, setInterval, and packages like the one in my current approach also didn't help:
var taskQueue = new PowerQueue();

taskQueue.add(function (done) {
  doSomeMath();
  // It's still blocking/freezing the interface here until done() is reached
  done();
});
Can I do something to keep the interface responsive while doSomeMath() is running, or am I doing something wrong? (It also doesn't look like there is much you could do wrong with PowerQueue.)
JavaScript libraries which solve the problem of asynchronous queuing, assume that the tasks being queued are running in a concurrent but single-threaded environment like node.js or your browser. However, in your case you need more than just concurrency - you need multi-threaded execution in order to move your CPU-intensive computation out of your UI thread. This can be achieved with web workers. Note that web workers are only supported in modern browsers, so keep reading if you don't care about IE9.
The above article should be enough to get you started; however, it's worth mentioning that the worker script needs to be kept outside of your application tree so it doesn't get bundled. An easy way to do this is to put it inside the public directory.
Here is a quick example where my worker computes a Fibonacci sequence (inefficiently):
public/fib.js
var fib = function (n) {
  if (n < 2) {
    return 1;
  } else {
    return fib(n - 2) + fib(n - 1);
  }
};

self.addEventListener('message', (function (e) {
  var n = e.data;
  var result = fib(n);
  self.postMessage(result);
  self.close();
}), false);
client/app.js
Meteor.startup(function () {
  var worker = new Worker('/fib.js');
  worker.postMessage(40);
  worker.addEventListener('message', function (e) {
    console.log(e.data);
  }, false);
});
When the client starts, it loads the worker and asks it to compute the 40th number in the sequence. This takes a few seconds to complete but your UI should remain responsive. After the value is returned, it should print 165580141 to the console.
