I don't understand what is meant by "scheduler" in the RxJS documentation, so I'm trying to understand it through scenarios it's useful in.
tl;dr
In most cases you will never need to concern yourself with Schedulers, if only because for 90% of cases the default is fine.
Explanation
A Scheduler is simply a way of standardizing time when using RxJS. It effectively schedules events to occur at some time in the future.
We do this by using the schedule method to queue up new operations that the scheduler will execute in the future. How the Scheduler does this is completely up to the implementation. Often though it is simply about choosing the most efficient means of executing a future action.
Take a simple example whereby we are using the timer operator to execute an action at some time in the future.
var source = Observable.timer(500);
This is pretty standard fare for RxJS. The Scheduler comes in when you ask the question: what does 500 mean? In the default case it equals 500 milliseconds, because that is the convention and that is what the default Scheduler does: it waits 500 milliseconds and then emits an event.
However, there are cases where we may not want the flow of time to operate normally. The most common use case for this is when we are testing. We don't actually want to wait 500 milliseconds for a task to complete, otherwise our test suite would take ages to actually complete!
In that case we would actually want to control the flow of time such that we don't have to wait for 500 milliseconds to elapse before we can verify the result of a stream. In this case we could use the TestScheduler, which can execute tasks synchronously so that we don't have to deal with any of that asynchronous messiness.
let scheduler = new TestScheduler();
//Overrides the default scheduler with the test scheduler
let source = Observable.timer(500, scheduler);
//Subscribe to the source, which behaves normally
source.subscribe(x => expect(x).to.be(0));
//When this gets called all pending actions get executed.
scheduler.flush();
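To make the idea concrete, here is a hypothetical plain-JavaScript sketch of what a virtual-time scheduler boils down to. This is not the real RxJS TestScheduler API, just the concept: work is queued with a due time, and flush() runs everything synchronously in due-time order.

```javascript
// Minimal sketch of a virtual-time scheduler (names are my own, not RxJS's).
class VirtualScheduler {
  constructor() {
    this.queue = [];
  }
  schedule(action, delay) {
    // Queue the action with its virtual due time; nothing runs yet.
    this.queue.push({ due: delay, action });
  }
  flush() {
    // Execute all pending actions synchronously, in due-time order.
    this.queue
      .sort((a, b) => a.due - b.due)
      .forEach(({ action }) => action());
    this.queue = [];
  }
}

const scheduler = new VirtualScheduler();
const emitted = [];
scheduler.schedule(() => emitted.push('b'), 500);
scheduler.schedule(() => emitted.push('a'), 100);
scheduler.flush(); // both actions run immediately, ordered by virtual time
```

No real time passes: the 500 "milliseconds" are just a sort key, which is why tests using this pattern complete instantly.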
There are some other corner cases where we want to alter the flow of time as well. For instance, if we are operating in the context of a game we would likely want to link our scheduling to requestAnimationFrame or to some other faux time scale, which would necessitate the use of something like the AnimationFrameScheduler or the VirtualTimeScheduler.
Scenario 1
You have an initial value and want the subscriber to get that initial value first and then get some other value (depending on a condition).
const dispatcher = (new BehaviorSubject("INITIAL"))
  .pipe(observeOn(asyncScheduler));

let did = false; // condition

dispatcher.pipe(
  tap((value) => {
    if (!did) {
      did = true;
      dispatcher.next("SECOND");
    }
  }))
  .subscribe((state) => {
    console.log('Subscription value: ', state);
  });
//Output: INITIAL ... SECOND
Without .pipe(observeOn(asyncScheduler)) it will output the values in the reverse order, since Subject.next is a synchronous operation.
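The re-entrancy behind this can be shown without RxJS at all. In this hedged sketch (all names are my own), a "sync" emit re-enters itself mid-delivery, so the second value reaches the final handler first; a "deferred" emit queues values and finishes the current delivery before starting the next, which is essentially what observeOn(asyncScheduler) achieves (modeled here with a synchronous drain queue rather than a real macrotask):

```javascript
// deferred = false models a plain Subject; deferred = true models
// the observeOn(asyncScheduler) behavior.
function run(deferred) {
  const log = [];
  const queue = [];
  let draining = false;
  const finalHandler = v => log.push(v);
  const tapThenDeliver = v => {
    if (v === 'INITIAL') emit('SECOND'); // the tap issues a re-entrant next()
    finalHandler(v);
  };
  function emit(v) {
    if (!deferred) { tapThenDeliver(v); return; } // sync: re-enters immediately
    queue.push(v); // deferred: finish the current delivery first
    if (draining) return;
    draining = true;
    while (queue.length) tapThenDeliver(queue.shift());
    draining = false;
  }
  emit('INITIAL');
  return log;
}

run(false); // → ['SECOND', 'INITIAL']  (sync delivery, reversed order)
run(true);  // → ['INITIAL', 'SECOND']  (deferred delivery, intuitive order)
```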
codepen example
Related
I've seen that web workers have a terminate() function, but is there a way to pause and resume web workers from the host thread similar to interrupts?
The critical factor with this question is that using the terminate() call on a web worker will break the state internal to the worker (and any operations taking place may in turn be corrupted in a half-complete state.) I'm looking for a mechanism by which to switch between hundreds-thousands of tasks held in however many web workers equal the total cores on the machine, without losing state within those tasks or relying on cooperative multithreading styles like fibers.
Is it possible to pause/resume a web worker externally?
No. Browser JavaScript does not have a means of externally pausing web workers without some cooperation from the webWorker itself.
A work-around (with cooperation from the webWorker) would involve the webWorker running a bit of code, then allowing a message to arrive from the outside world that tells it to pause and then the webWorker itself not calling the next unit of work until it receives a message to resume processing. But, because even webWorkers are event driven, you can't receive a message from the outside world until you finish what you were doing and return back to the event system.
This would probably involve executing a chunk of work, setting a timer for only a few ms from now which allows messages to arrive, then when a message arrives, you store the new pause/resume state and when the timer fires, you check the state. If it's paused, you don't call the next unit of work. If it's not paused, you execute the next unit of work and when it finishes, you again do setTimeout() with a really short time (allowing any pending messages to process) and keep going over and over again.
Unfortunately, this technique requires you to chunk your work which negates some of the benefits of webWorkers in the first place (though you still retain the benefit of using more than one CPU to process work).
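The chunking idea above can be sketched as follows. This is a hedged, simplified model (names are my own): the pause/resume state is checked between chunks, and in a real worker each runChunk call would be re-scheduled with a short setTimeout so incoming messages get processed between chunks.

```javascript
// Cooperative pause/resume by chunking work between event-loop turns.
function createChunkedTask(items, processItem) {
  let paused = false;
  let index = 0;
  const results = [];
  function runChunk(chunkSize) {
    // Process up to chunkSize items, then return control to the caller.
    while (!paused && index < items.length && chunkSize-- > 0) {
      results.push(processItem(items[index++]));
    }
    return { done: index >= items.length, results };
  }
  return {
    runChunk,
    pause() { paused = true; },   // set from an incoming "pause" message
    resume() { paused = false; }, // set from an incoming "resume" message
  };
}
```

Because state (index, results) lives outside the chunks, pausing between chunks loses nothing, which is the property terminate() cannot give you.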
I'm looking for a mechanism by which to switch between hundreds-thousands of tasks held in however many web workers equal the total cores on the machine, without losing state within those tasks
You would need to have each currently running webWorker cooperate and allow a message from the outside world to get processed so they could see when the outside world wants them to stop doing more work and then have them cooperatively not call the next unit of work so they wouldn't be doing anything until the next message arrives telling them to start doing their work again.
If you're going to have thousands of these, you probably don't want them all to be individual webWorkers anyway as that's a bit heavy weight of a system. You probably want a set of webWorkers that are working on a work queue. You queue up work units to be processed and each webWorker grabs the next unit of work from the queue, processes it, grabs the next unit and so on. Your main process then just keeps the queue fed with whatever work you want the webWorkers to be doing. This also requires chunking the work into proper units of work.
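The work-queue idea can be sketched like this. It is simulated synchronously in one thread here (real code would use Web Workers and postMessage; the round-robin stands in for "each worker grabs the next unit as soon as it is free"):

```javascript
// A fixed pool of simulated "workers" drains a shared queue of work units.
function runPool(tasks, poolSize, workerFn) {
  const queue = tasks.slice();
  const perWorker = Array.from({ length: poolSize }, () => []);
  let i = 0;
  while (queue.length) {
    // Hand the next unit of work to the next worker in rotation.
    perWorker[i % poolSize].push(workerFn(queue.shift()));
    i++;
  }
  return perWorker; // results grouped by which worker produced them
}
```

The main process only ever touches the queue, so thousands of tasks cost thousands of queue entries, not thousands of workers.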
Assuming you are targeting recent browsers with access to SharedArrayBuffer and Atomics, the simplest solution is to pass the SAB reference to the worker:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);
let worker;

const start = () => {
  worker = new Worker("worker.js");
  worker.postMessage({ kind: "init", sab });
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
The worker loop will wait while the flag has the value 1 (paused) and run while it is 0:
worker.js
onmessage = event => {
  if (event.data.kind === "init") {
    const int32 = new Int32Array(event.data.sab);
    while (true) {
      Atomics.wait(int32, 0, 1); // blocks while the flag is 1 (paused)
      // ... your bitcoin miner goes here
    }
  }
};
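The semantics the worker loop relies on can be demonstrated directly in Node.js (where, unlike in the browser, the main thread is allowed to call Atomics.wait): Atomics.wait(arr, i, v) only blocks while arr[i] === v, so flipping the flag with Atomics.store decides whether the worker parks or falls through.

```javascript
// Node.js demo of the Atomics.wait pause-flag semantics.
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);

// Flag is 0 ("running"): waiting for the value 1 returns 'not-equal'
// immediately, so a worker loop would fall through and keep working.
const whileRunning = Atomics.wait(int32, 0, 1);

// Flag set to 1 ("paused"): a wait on the value 1 now blocks until
// Atomics.notify; here we pass a 0 ms timeout so this demo doesn't park.
Atomics.store(int32, 0, 1);
const whilePaused = Atomics.wait(int32, 0, 1, 0);

console.log(whileRunning, whilePaused); // not-equal timed-out
```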
In case your worker does not contain a synchronous main loop but rather consists of async operations, you can block the worker's event loop (preventing any other line of code from executing) as follows:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);
let worker;

const start = () => {
  worker = new Worker("worker.js");
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
  worker.postMessage({ kind: "pause", sab });
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
worker.js
// ... async/await, fetch(), setTimeout()
onmessage = event => {
  if (event.data.kind === "pause")
    Atomics.wait(new Int32Array(event.data.sab), 0, 1);
};
I have a firestore app with a cloud function that triggers off a cronjob.
The cloud function takes a long time and pulls a large amount of data. I've set the memory limit of my function to 2Gb and the timeout to 540 seconds, and "Retry on failure" is NOT checked.
The cloud function essentially looks like this:
export const fetchEpisodesCronJob = pubsub
  .topic('daily-tick')
  .onPublish(() => {
    console.log(`TIMING - Before Fetches ${rssFeeds.length} feeds`, new Date());
    return Promise.map(
      rssFeeds.map(rssFeed => rssFeed.url),
      url => fetch(url).catch(e => e).then(addFeedToDB), // <-- This can take a long time
      {
        concurrency: 4
      }
    ).catch(e => {
      console.warn('Error fetching feeds', e);
    });
  });
What I see in the logs, however, is this (it continues indefinitely):
As you can see, the function finishes with a timeout status, yet it starts right back up again. What's weird is that I've specified a 540 second limit, but the timeout comes in at a consistent 5 minute mark. Also note I checked the cloud console and I manually spun off the last cronjob pubsub at 10:00AM, yet you can see multiple pubsub triggers since then. (So I believe the cronjob is set up fine.)
Also I get consistent errors repeating in the console:
My question is: how do I prevent the cloud function from re-executing when it's already been killed due to a timeout? Is this a bug, or do I need to explicitly set a kill statement somewhere?
Thanks!
So this is a bug with Firebase. According to @MichaelBleigh:
Turns out there's a backend bug in Cloud Functions that happens when a function is created with the default timeout but later increased that is causing this. A fix is being worked on and will hopefully address the issue soon.
If you're reading this between now and when the bug is fixed, though: I found that the function will be triggered again every 300 seconds. So an immediate workaround for me is to set the timeout to 250 seconds and keep the running time of the function as minimal as possible. This may mean increasing the memory allocation for the time being.
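As an aside, the Promise.map(..., { concurrency: 4 }) in the question is Bluebird's concurrency-limited map. A hedged plain-promise sketch of what that option does (the function and names here are my own, not Bluebird's implementation):

```javascript
// At most `limit` calls to fn are in flight at once; results keep input order.
async function mapWithConcurrency(items, fn, limit) {
  const results = new Array(items.length);
  let next = 0;
  async function runLane() {
    // Each "lane" claims the next unclaimed index and processes it.
    while (next < items.length) {
      const i = next++;
      results[i] = await fn(items[i]);
    }
  }
  // Start `limit` lanes that drain the shared index counter.
  await Promise.all(
    Array.from({ length: Math.min(limit, items.length) }, runLane)
  );
  return results;
}
```

Lowering the concurrency (or the number of feeds per invocation) is one way to keep the function's running time under the workaround's 250-second budget.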
Context
I'm building a general purpose game playing A.I. framework/library that uses the Monte Carlo Tree Search algorithm. The idea is quite simple: the framework provides the skeleton of the algorithm, the four main steps: Selection, Expansion, Simulation and Backpropagation. All the user needs to do is plug in four simple(ish) game-related functions of their making:
a function that takes in a game state and returns all possible legal moves to be played
a function that takes in a game state and an action and returns a new game state after applying the action
a function that takes in a game state and determines if the game is over and returns a boolean and
a function that takes in a state and a player ID and returns a value based on whether the player has won, lost or the game is a draw. With that, the algorithm has all it needs to run and select a move to make.
What I'd like to do
I would love to make use of parallel programming to increase the strength of the algorithm and reduce the time it needs to run each game turn. The problem I'm running into is that, when using Child Processes in Node.js, you can't pass functions to the child process, and my framework is entirely built on functions passed by the user.
Possible solution
I have looked at this answer but I am not sure this would be the correct implementation for my needs. I don't need to be continually passing functions through messages to the child process, I just need to initialize it with functions that are passed in by my framework's user, when it initializes the framework.
I thought about one way to do it, but it seems so inelegant, on top of probably not being the most secure, that I find myself searching for other solutions. I could, when the user initializes the framework and passes his four functions to it, get a script to write those functions to a new js file (let's call it my-funcs.js) that would look something like:
const func1 = {... function implementation...}
const func2 = {... function implementation...}
const func3 = {... function implementation...}
const func4 = {... function implementation...}
module.exports = {func1, func2, func3, func4}
Then, in the child process worker file, I guess I would have to find a way to lazily require my-funcs.js. Or maybe I wouldn't; I guess it depends how and when Node.js loads the worker file into memory. This all seems very convoluted.
Can you describe other ways to get the result I want?
child_process is less about running a user's function and more about spawning a new process to exec a file or command.
Node is inherently a single-threaded system, so for I/O-bound things, the Node Event Loop is really good at switching between requests, getting each one a little farther. See https://nodejs.org/en/docs/guides/event-loop-timers-and-nexttick/
What it looks like you're doing is trying to get JavaScript to run multiple threads simultaneously. Short answer: you can't ... or rather, it's really hard. See is it possible to achieve multithreading in nodejs?
So how would we do it anyway? You're on the right track: child_process.fork(). But it needs a hard-coded function to run. So how do we get user-generated code into place?
I envision a datastore where you can take userFn.toString() and save it to a queue. Then fork the process, and let it pick up the next unhandled item in the queue, marking that it did so. Then write the results to another queue, and this "GUI" thread then polls against that queue, returning the calculated results back to the user. At this point, you've got multi-threading ... and race conditions.
Another idea: create a REST service that accepts the userFn.toString() content and execs it. Then in this module, you call out to the other "thread" (service), await the results, and return them.
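The userFn.toString() hand-off both ideas rely on looks roughly like this (a hypothetical sketch with made-up names; the revive step executes arbitrary code, so it is only acceptable if you trust the source):

```javascript
// The user-supplied function on the parent side.
const userFn = (gameState) => gameState.score + 1;

// What you'd put on the queue / send to the child process: plain text.
const serialized = userFn.toString();

// What the child process does with the string it receives. The wrapping
// parentheses make eval treat the string as an expression, not a statement.
const revived = eval(`(${serialized})`);

revived({ score: 41 }); // → 42
```

Note that toString() only captures the function body: any closed-over variables from the parent's scope are lost, so the user's four functions must be self-contained.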
Security: Yeah, we just flung this out the window. Whether you're executing the user's function directly, calling child_process#fork to do it, or shimming it through a service, you're trusting untrusted code. Sadly, there's really no way around this.
Assuming that security isn't an issue, you could do something like this.
// Client side
<input id="func1"> <!-- For example the user inputs '(gamestate)=>{return 1}' -->
<input id="func2">
<input id="func3">
<input id="func4">
<script>
  socket.on('syntax_error', function (err) { alert(err); });

  function submit_funcs_strs() {
    // Get the function strings from the user inputs and put them into an array
    socket.emit('functions', [document.getElementById('func1').value, document.getElementById('func2').value, ...
  }
</script>
// Server side
// Socket listener is async
socket.on('functions', (funcs_strs) => {
  let funcs = [];
  for (let i = 0; i < funcs_strs.length; i++) {
    try {
      funcs.push(eval(funcs_strs[i]));
    } catch (e) {
      if (e instanceof SyntaxError) {
        socket.emit('syntax_error', e.message);
        return;
      }
    }
  }
  // Run the algorithm here
});
I'm trying to write an RxJS-based WebSocket wrapper, and I'm struggling with my understanding of RxJS.
I have a pause stream which is supposed to pause the pausable buffered streams when an error occurs and resume them once I get an "ok" from the websocket.
Somehow only the first subscription on my pausable buffered streams is fired. From then on, only the queue stacks up higher.
I have prepared a jsbin to reproduce the issue.
https://jsbin.com/mafakar/edit?js,console
There the "msg recived" stream only fires for the first subscription. From then on, the queue and observer counts just stack up.
I somehow have the feeling this is about hot and cold observables, but I cannot grasp the issue. I would appreciate any help.
Thank you in advance!
It is not the cold/hot issue. What you do in your onMessage is subscribe, then dispose. The dispose terminates the sequence. The onMessageStream should be subscribed to only once, for example, in the constructor:
this.onmessageStream.subscribe(message => console.log('--- msg --- ', message.data));
The subscribe block, including the dispose should be removed.
Also, note that you used replaySubject without a count, which means the queue holds all previous values. Unless this is the desired behavior, consider changing it to .replaySubject(1).
Here is a working jsbin.
As @Meir pointed out, disposing inside a subscribe block is a no-no since its behavior is non-deterministic. In general I would avoid the use of Subjects and rely on factory methods instead. You can see a refactored version here: https://jsbin.com/popixoqafe/1/edit?js,console
A quick breakdown of the changes:
class WebSocketWrapper {
  // Inject the pauser from the external state
  constructor(pauser) {
    // Input you need to keep as a subject
    this.input$ = new Rx.Subject();

    // Create the socket
    this._socket = this._connect();

    // Create a stream for the open event
    this.open$ = Rx.Observable.fromEvent(this._socket, 'open');

    // This concats the external pauser with the
    // open event. The result is a pauser that won't unpause until
    // the socket is open.
    this.pauser$ = Rx.Observable.concat(
      this.open$.take(1).map(() => true),
      pauser || Rx.Observable.empty()
    )
    .startWith(false);

    // Subscribe and buffer the input
    this.input$
      .pausableBuffered(this.pauser$)
      .subscribe(msg => this._socket.send(msg));

    // Create a stream around the message event
    this.message$ = Rx.Observable.fromEvent(this._socket, 'message')
      // Buffer the messages
      .pausableBuffered(this.pauser$)
      // Create a shared version of the stream and always replay the last
      // value to new subscribers
      .shareReplay(1);
  }

  send(request) {
    // Push to input
    this.input$.onNext(request);
  }

  _connect() {
    return new WebSocket('wss://echo.websocket.org');
  }
}
As an aside you should also avoid relying on internal variables like source which are not meant for external consumption. Although RxJS 4 is relatively stable, since those are not meant for public consumption they could be changed out from under you.
I want to simulate multiple slow subscriptions. The client subscribes to two or more publications at the same time, and the result arrive later.
The goal is to be able to see how network latencies and randomness can affect my application (it has bugs because I expected one publication to be ready before another, ...).
Using the following short setup for the publications:
// server/foo.js
Meteor.publish('foo', function() {
  console.log('publishing foo');
  Meteor._sleepForMs(2000);
  console.log('waking up foo');
  this.ready();
});

// server/bar.js is the same with a different name
Meteor.publish('bar', function() {
  console.log('publishing bar');
  Meteor._sleepForMs(2000);
  console.log('waking up bar');
  this.ready();
});
Both publications are slowed down thanks to Meteor._sleepForMs as seen in this amazing answer.
The client then subscribes to each publication:
Meteor.subscribe('bar'); // /client/bar.js
Meteor.subscribe('foo'); // /client/foo.js
From there I expected to see both 'publishing' logs first, then both 'waking up'.
However, this appears in the console:
15:37:45  publishing bar
15:37:47  waking up bar
15:37:47  publishing foo
15:37:49  waking up foo
(I removed some irrelevant fluff like the day)
So obviously it runs in a synchronous fashion. I thought two things could cause that: the server-side Meteor._sleepForMs, which would entirely block the server (fairly weird), or the client subscription design.
To make sure that it wasn't the server I added a simple heartbeat:
Meteor.setInterval(function() { console.log('beep'); }, 500);
And it did not stop beeping, so the server isn't fully blocked.
I thus suspect that the issue lies within the client subscription model, which maybe waits for one subscription to be ready before starting the next?
Thus, two questions:
Why doesn't my experiment run the way I wanted it to?
How should I modify it to achieve my desired goal (multiple slow publications)?
Meteor processes DDP messages (which include subscriptions) in a sequence. This ensures that you can perform some action like deleting an object and then inserting it back in the correct order, and not run into any errors.
There is support for getting around this in Meteor.methods using this.unblock() to allow the next available DDP message to process without waiting for the previous one to finish executing. Unfortunately this is not available for Meteor.publish in the Meteor core. You can see discussion (and some workarounds) about this issue here: https://github.com/meteor/meteor/issues/853
There is also a package that adds this functionality to publications:
https://github.com/meteorhacks/unblock/
Why doesn't my experiment run the way I wanted it to?
Meteor._sleepForMs is blocking from the way it is implemented:
Meteor._sleepForMs = function (ms) {
  var fiber = Fiber.current;
  setTimeout(function() {
    fiber.run();
  }, ms);
  Fiber.yield();
};
Calling it prevents the next line from executing inside the fiber until the duration passes. However, this does not block the Node server from handling other events (i.e. executing another publication) due to the way fibers work.
Here is a talk about Fibers in Meteor: https://www.youtube.com/watch?v=AWJ8LIzQMHY
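For intuition, the fiber trick is roughly what async/await gives you natively (a loose analogy, not Meteor's actual implementation): the current function is suspended at the sleep, but the event loop keeps servicing other work in the meantime.

```javascript
// Promise-based sleep: suspends only the awaiting function, not the process.
const sleepForMs = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function slowTask(log) {
  log.push('start');
  await sleepForMs(50); // this function parks here; the event loop runs on
  log.push('end');
}
```

Other code scheduled while slowTask is parked runs in between, which mirrors how a second publication can execute while the first fiber sleeps.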
How should I modify it to achieve my desired goal (multiple slow publications) ?
Try using Meteor.setTimeout to simulate latency asynchronously.
Meteor.publish('foo', function() {
  console.log('publishing foo');
  var self = this;
  Meteor.setTimeout(function () {
    console.log('waking up foo');
    self.ready();
  }, 2000);
});
I believe it's because the publications are blocking.
You can use meteorhacks:unblock to unblock publications:
https://atmospherejs.com/meteorhacks/unblock
It could be a good idea to use this.unblock() at the start of every publication (once you've added meteorhacks:unblock).