rxjs pausableBuffered multiple subscriptions - javascript

I'm trying to write a WebSocket wrapper based on RxJS, and I'm struggling with my understanding of RxJS.
I have a pause stream which is supposed to pause the pausable buffered streams when an error occurs and resume them once I get an "ok" from the websocket.
Somehow only the first subscription on my pausable buffered streams is fired. From then on the queue just keeps stacking up.
I have prepared a jsbin to reproduce the issue.
https://jsbin.com/mafakar/edit?js,console
There the "msg recived" stream only fires for the first subscription. And then the q and observer begin stacking up.
I somehow have the feeling this is about hot and cold obserables but I cannot grasp the issues. I would appreciate any help.
Thank you in advance!

It is not the cold/hot issue. What you do in your onMessage is subscribe, then dispose. The dispose terminates the sequence. The onMessageStream should be subscribed to only once, for example, in the constructor:
this.onmessageStream.subscribe(message => console.log('--- msg --- ', message.data));
The subscribe block, including the dispose, should be removed.
Also, note that you used a ReplaySubject without a count; this means the queue holds all previous values. Unless this is the desired behavior, consider changing it to new Rx.ReplaySubject(1).
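To illustrate the difference, here is a minimal standalone sketch (my own, not from the question's code) in the same RxJS 4 style: an unbounded ReplaySubject replays every value to each late subscriber, while ReplaySubject(1) replays only the latest one.
var unbounded = new Rx.ReplaySubject();    // no count: buffers everything
var latestOnly = new Rx.ReplaySubject(1);  // buffers only the last value

[1, 2, 3].forEach(function (v) {
  unbounded.onNext(v);
  latestOnly.onNext(v);
});

unbounded.subscribe(function (v) { console.log('unbounded:', v); });    // 1, 2, 3
latestOnly.subscribe(function (v) { console.log('latest only:', v); }); // 3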
Here is a working jsbin.

As @Meir pointed out, calling dispose inside a subscribe block is a no-no, since its behavior is non-deterministic. In general I would avoid the use of Subjects and rely on factory methods instead. You can see a refactored version here: https://jsbin.com/popixoqafe/1/edit?js,console
A quick breakdown of the changes:
class WebSocketWrapper {
  // Inject the pauser from the external state
  constructor(pauser) {
    // Input you need to keep as a subject
    this.input$ = new Rx.Subject();
    // Create the socket
    this._socket = this._connect();
    // Create a stream for the open event
    this.open$ = Rx.Observable.fromEvent(this._socket, 'open');
    // This concats the external pauser with the open event.
    // The result is a pauser that won't unpause until the socket is open.
    this.pauser$ = Rx.Observable.concat(
      this.open$.take(1).map(true),
      pauser || Rx.Observable.empty()
    )
    .startWith(false);
    // Subscribe and buffer the input
    this.input$
      .pausableBuffered(this.pauser$)
      .subscribe(msg => this._socket.send(msg));
    // Create a stream around the message event
    this.message$ = Rx.Observable.fromEvent(this._socket, 'message')
      // Buffer the messages
      .pausableBuffered(this.pauser$)
      // Create a shared version of the stream and always replay the last
      // value to new subscribers
      .shareReplay(1);
  }
  send(request) {
    // Push to input
    this.input$.onNext(request);
  }
  _connect() {
    return new WebSocket('wss://echo.websocket.org');
  }
}
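For reference, a hypothetical usage sketch of the wrapper above (the external pauser subject and the logging handler are assumptions, not part of the original answer):
// Emits false to pause and true to resume the buffered streams
const pauser = new Rx.Subject();
const wrapper = new WebSocketWrapper(pauser);

// Subscribe once; shareReplay(1) hands late subscribers the last message
wrapper.message$.subscribe(msg => console.log('--- msg ---', msg.data));

wrapper.send('hello'); // buffered until the socket opens
pauser.onNext(false);  // pause: messages queue up
pauser.onNext(true);   // resume: queued messages flush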
As an aside, you should also avoid relying on internal variables like source, which are not meant for external consumption. Although RxJS 4 is relatively stable, since those are not part of the public API they could be changed out from under you.

Related

Where and how does libUV interact with code on node.js

I've been wondering: where and how does libuv interact with code in Node.js? Lately I've been investigating streams, and I also read the source on GitHub.
Let's take the source of the script called destroy.js. This script is responsible for the destruction of streams: stream.destroy(). After that operation:
in the function destroy, the stream states are set to:
    writable._writableState.destroyed = true
    readable._readableState.destroyed = true
in the function _destroy, the stream states are set to:
    writable._writableState.closed = true
    readable._readableState.closed = true
in the function emitCloseNT:
    writable._writableState.closeEmitted is set to true
    readable._readableState.closeEmitted is set to true
    the close event is emitted
That's all. Where and how does libuv interact with stream.destroy()? Even the Node documentation for writable.destroy says:
This is a destructive and immediate way to destroy a stream
But what does it really do? All I see is state being set on the streams, and nothing more. So where does libuv actually destroy the stream?
I'm not a subject matter expert, but after debugging the following code, I got a rough idea of what happens behind the scenes:
const stream = require('stream');

var cnt = 0;
new stream.Readable({
  read(size) {
    if (++cnt > 10) this.destroy();
    this.push(String(cnt));
  }
}).pipe(process.stdout);
Upon this.destroy(), the readableState.destroyed is set to true here, and because of this the following this.push("11") returns false here. If readableState.destroyed had been false, it would instead have called addChunk, which would have ensured that reading goes on by emitting a readable event and calling maybeReadMore (see here).
If the readable stream was created by fs.createReadStream, then the _destroy method additionally calls a close method, which closes the file descriptor.
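A minimal sketch to observe that cleanup (reading this very script file is just for illustration): destroying an fs read stream releases the file descriptor and then emits close.
const fs = require('fs');

const rs = fs.createReadStream(__filename);
rs.on('close', () => console.log('underlying file descriptor closed'));
rs.destroy(); // _destroy() closes the fd, then 'close' is emitted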

Is it possible to pause/resume a web worker externally?

I've seen that web workers have a terminate() function, but is there a way to pause and resume web workers from the host thread similar to interrupts?
The critical factor with this question is that using the terminate() call on a web worker will break the state internal to the worker (and any operations taking place may in turn be corrupted in a half-complete state). I'm looking for a mechanism by which to switch between hundreds to thousands of tasks held in as many web workers as the machine has cores, without losing state within those tasks or relying on cooperative multithreading styles like fibers.
Is it possible to pause/resume a web worker externally?
No. Browser javascript does not have a means of externally pausing web workers without some cooperation from the webWorker itself.
A workaround (with cooperation from the webWorker) would involve the webWorker running a bit of code, then allowing a message to arrive from the outside world telling it to pause, and the webWorker itself then not calling the next unit of work until it receives a message to resume processing. But, because even webWorkers are event driven, you can't receive a message from the outside world until you finish what you were doing and return to the event system.
This would probably involve executing a chunk of work, then setting a timer for only a few ms from now, which allows messages to arrive. When a message arrives, you store the new pause/resume state, and when the timer fires, you check that state: if it's paused, you don't call the next unit of work; if it's not paused, you execute the next unit of work, and when it finishes you again call setTimeout() with a really short time (allowing any pending messages to process), and keep going over and over again.
Unfortunately, this technique requires you to chunk your work, which negates some of the benefits of webWorkers in the first place (though you still retain the benefit of using more than one CPU to process work).
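Here is a minimal sketch of that cooperative pattern, assuming the work can be split into a hypothetical doUnitOfWork() function:
// worker.js (sketch)
let paused = false;

onmessage = (e) => {
  if (e.data === 'pause') paused = true;
  if (e.data === 'resume' && paused) { paused = false; tick(); }
};

function doUnitOfWork() {
  // ... one small chunk of the real task goes here (assumed) ...
}

function tick() {
  if (paused) return;  // don't schedule more work while paused
  doUnitOfWork();
  setTimeout(tick, 0); // yield so pause/resume messages can arrive
}

tick();
The main thread then just calls worker.postMessage('pause') and worker.postMessage('resume') as needed.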
I'm looking for a mechanism by which to switch between hundreds-thousands of tasks held in however many web workers equal the total cores on the machine, without losing state within those tasks
You would need each currently running webWorker to cooperate and allow a message from the outside world to be processed, so it can see when the outside world wants it to stop doing more work, and then cooperatively not call the next unit of work until the next message arrives telling it to start again.
If you're going to have thousands of these, you probably don't want them all to be individual webWorkers anyway, as that's a bit heavyweight. You probably want a set of webWorkers working on a work queue: you queue up work units to be processed, and each webWorker grabs the next unit of work from the queue, processes it, grabs the next unit, and so on. Your main process then just keeps the queue fed with whatever work you want the webWorkers to be doing. This also requires chunking the work into proper units.
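A rough sketch of that queue arrangement (the message shapes and the process() function are assumptions for illustration):
// main.js (sketch)
const queue = [/* work units */];
const worker = new Worker('queue-worker.js');
worker.onmessage = (e) => {
  if (e.data === 'next' && queue.length > 0) {
    worker.postMessage(queue.shift());
  }
};

// queue-worker.js (sketch)
onmessage = (e) => {
  process(e.data);     // hypothetical: handle one unit of work
  postMessage('next'); // ask the main thread for the next unit
};
postMessage('next');   // announce readiness on startup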
Assuming you are targeting recent browsers with access to SharedArrayBuffer and Atomics, the simplest solution is to pass a SAB reference to the worker:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);

const start = () => {
  const worker = new Worker("worker.js");
  worker.postMessage({kind: "init", sab});
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
The worker loop then blocks while the shared value is 1 and runs while it is 0.
worker.js
onmessage = event => {
  if (event.data.kind === "init") {
    const int32 = new Int32Array(event.data.sab);
    while (true) {
      // Blocks as long as the shared value is 1 (paused)
      Atomics.wait(int32, 0, 1);
      // ... your bitcoin miner goes here
    }
  }
};
In case your worker does not contain a synchronous main loop but instead consists of async operations, you can block its event loop (preventing any other line of code from executing) as follows:
main.js
const sab = new SharedArrayBuffer(4);
const int32 = new Int32Array(sab);
// Keep a reference to the worker so pause() can post to it
let worker;

const start = () => {
  worker = new Worker("worker.js");
};

const pause = () => {
  Atomics.store(int32, 0, 1);
  Atomics.notify(int32, 0);
  worker.postMessage({kind: "pause", sab});
};

const resume = () => {
  Atomics.store(int32, 0, 0);
  Atomics.notify(int32, 0);
};
worker.js
// ... async/await, fetch(), setTimeout() ...

onmessage = event => {
  if (event.data.kind === "pause") {
    // Blocks the worker's event loop until the shared value is no longer 1
    Atomics.wait(new Int32Array(event.data.sab), 0, 1);
  }
};

Rxjs: What scenario do you want to use scheduler

I don't understand what the RxJS documentation means by a scheduler, so I'm trying to understand it through a scenario it's useful in.
tl;dr
In most cases you will never need to concern yourself with Schedulers, if only because for 90% of cases the default is fine.
Explanation
A Scheduler is simply a way of standardizing time when using RxJS. It effectively schedules events to occur at some time in the future.
We do this by using the schedule method to queue up new operations that the scheduler will execute in the future. How the Scheduler does this is completely up to the implementation. Often it is simply about choosing the most efficient means of executing a future action.
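To make the schedule method concrete, here is a small sketch using the modern rxjs package layout (an assumption; the rest of this answer uses the older global Observable style):
const { asyncScheduler } = require('rxjs');

// Ask the scheduler to run an action roughly 500 ms from now;
// how it arranges that (setTimeout, etc.) is up to the implementation.
asyncScheduler.schedule(() => console.log('executed later'), 500);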
Take a simple example whereby we are using the timer operator to execute an action at sometime in the future.
var source = Observable.timer(500);
This is pretty standard fare for RxJS. The Scheduler comes in when you ask the question: what does 500 mean? In the default case it will equal 500 milliseconds, because that is the convention and that is what the default Scheduler does: it waits 500 milliseconds and then emits an event.
However, there are cases where we may not want the flow of time to operate normally. The most common use case for this is when we are testing. We don't actually want to wait 500 milliseconds for a task to complete, otherwise our test suite would take ages to actually complete!
In that case we would want to control the flow of time so that we don't have to wait for 500 milliseconds to elapse before we can verify the result of a stream. Here we could use the TestScheduler, which can execute tasks synchronously so that we don't have to deal with any of that asynchronous messiness.
let scheduler = new TestScheduler();
// Overrides the default scheduler with the test scheduler
let source = Observable.timer(500, scheduler);
// Subscribe to the source, which behaves normally
source.subscribe(x => expect(x).to.be(0));
// When this gets called, all pending actions get executed.
scheduler.flush();
There are other corner cases where we want to alter the flow of time as well. For instance, if we are operating in the context of a game, we would likely want to link our scheduling to requestAnimationFrame or to some other faux time scale, which would necessitate something like the AnimationFrameScheduler or the VirtualTimeScheduler.
Scenario 1
You have an initial value and want a subscriber to first get that initial value and then get some other value (depending on a condition).
const { BehaviorSubject, asyncScheduler } = require('rxjs');
const { observeOn, tap } = require('rxjs/operators');

// Keep the subject itself around so we can still call next() on it;
// the result of .pipe() is a plain Observable without a next() method
const subject = new BehaviorSubject("INITIAL");
const dispatcher = subject.pipe(observeOn(asyncScheduler));

let did = false; // condition

dispatcher.pipe(
  tap((value) => {
    if (!did) {
      did = true;
      subject.next("SECOND");
    }
  }))
  .subscribe((state) => {
    console.log('Subscription value: ', state);
  });

// Output: INITIAL ... SECOND
Without .pipe(observeOn(asyncScheduler)) it would output the other way around, since a Subject's .next is a synchronous operation.
codepen example

Node Js process.stdin

I'm just now getting into node.js and reading input from the command line.
I'm a bit confused on the following code example.
process.stdin.setEncoding('utf8');

process.stdin.on('readable', () => {
  var chunk = process.stdin.read();
  if (chunk !== null) {
    process.stdout.write(`data: ${chunk}`);
  }
});

process.stdin.on('end', () => {
  process.stdout.write('end');
});
First, in old mode, process.stdin.resume() was needed to make it begin to listen. Doesn't using resume() make more sense for performance? Doesn't this continually listen, using up processing power that it doesn't need to use up?
Also, I read the docs but I'm not understanding what 'end' does here.
The docs say:
This event fires when there will be no more data to read.
But 'readable' is always listening so we never get to the 'end'?
Continually listening for input doesn't necessarily use more resources than resuming the stream manually. It's just a different way of handling the pipes.
The "readable" part stops listening when the "end" event is triggered, as end will close the stream and therefore there won't be anything readable anymore.
The end event is a translation of the end signal sent to standard input (for instance, Ctrl-D on a Unix system).
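To see both handlers fire, pipe a finite input into the script instead of typing interactively (app.js is just a hypothetical name for the snippet above):

echo "hello" | node app.js

This prints data: hello followed by end, because the pipe closes stdin once echo finishes.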

How do reactive streams in JS work?

I'm a novice with reactive streams and am now trying to understand them. The idea looks pretty clear and simple, but in practice I can't understand what's really going on.
For now I'm playing with most.js, trying to implement a simple dispatcher. The scan method seems to be exactly what I need for this.
My code:
var dispatch;
// expose method for pushing events to stream:
var events = require("most").create(add => dispatch = add);
// initialize stream, so callback in `create` above is actually called
events.drain();
events.observe(v => console.log("new event", v));

dispatch(1);

var scaner = events.scan(
  (state, patch) => {
    console.log("scaner", patch);
    // update state here
    return state;
  },
  { foo: 0 }
);
scaner.observe(v => console.log("scaner state", v));

dispatch(2);
As I understand it, the first observer should be called twice (once per event), and the scaner callback and second observer once each (because they were added after the first event was triggered).
In practice, however, the console shows this:
new event 1
new event 2
scaner state { foo: 0 }
The scaner callback is never called, no matter how many events I push into the stream.
But if I remove the first dispatch call (the one before creating scaner), everything works just as I expected.
Why is this? I've been reading docs and articles, but so far haven't found anything even similar to this problem. Where are my assumptions wrong?
Most probably, you have studied examples like this from the API:
most.from(['a', 'b', 'c', 'd'])
  .scan(function(string, letter) {
    return string + letter;
  }, '')
  .forEach(console.log.bind(console));
They are suggesting a step-by-step execution like this:
Get an array ['a', 'b', 'c', 'd'] and feed its values into the stream.
The values fed are transformed by scan().
... and consumed by forEach().
But this is not entirely true. This is why your code doesn't work.
Here in the most.js source code, you see at line 1340 ff.:
exports.from = from;
function from(a) {
  if (Array.isArray(a) || isArrayLike(a)) {
    return fromArray(a);
  }
  ...
So from() is forwarding to some fromArray(). Then, fromArray() (below in the code) is creating a new Stream:
...
function fromArray (a) {
  return new Stream(new ArraySource(a));
}
...
If you follow through, you will get from Stream to sink.event(0, array[i]);, with 0 as the timeout in millis. There is no setTimeout in the code, but if you search the code further for .event = function, you will find a lot of additional code that uncovers more. In particular, around line 4692 there is the Scheduler with delay() and timestamps.
To sum it up: the array in the example above is fed into the stream asynchronously, after some time, even if that time seems to be 0 millis.
Which means you have to assume that, somehow, the stream is first built and then used, even if the program code doesn't look that way. But hey, isn't the goal always to hide complexity :-)?
Now you can check this with your own code. Here is a fiddle based on your snippet:
https://jsfiddle.net/aak18y0m/1/
Look at your dispatch() calls in the fiddle. I have wrapped them with setTimeout():
setTimeout( function() { dispatch( 1 /* or 2 */); }, 0);
By doing so, I force them also to be asynchronous calls, like the array values in the example actually are.
In order to run the fiddle, you need to open the browser debugger (to see the console) and then press the run button above. The console output shows that your scanner is now called three times:
doc ready
(index):61 Most loaded: [object Object]
(index):82 scanner state Object {foo: 0}
(index):75 scanner! 1
(index):82 scanner state Object {foo: 0}
(index):75 scanner! 2
(index):82 scanner state Object {foo: 0}
First for drain(), then for each event.
You can also reach a valid result (but it's not the same behind scenes) if you use dispatch() synchronously, having them added at the end, after JavaScript was able to build the whole stream. Just uncomment the lines after // Alternative solution, run again and watch the result.
Well, my question turns out to be not as general as it sounds; it's lib-specific.
First, the approach from my question is not valid for most.js. Its authors argue that you should 'take a declarative, rather than imperative, approach'.
Second, I tried the Kefir.js lib, and with it the code from my question works perfectly. It just works. Even more, the same approach that is not supported in most.js is explicitly recommended for Kefir.js.
So the problem lies in a particular lib's implementation, not in my head.
