Node streams, from what I could find [1], delay all flowing of data until the end of the current tick (typically via process.nextTick, though which queue exactly isn't important), and do not start pumping data synchronously on method calls.
Sadly, I could not find a guarantee for this behavior in the docs (hopefully I just missed it), and can therefore hardly depend on it. Is this behavior part of the public API? Does it extend to all streams built with the API for stream implementers?
To elaborate: except for explicitly reading synchronously with stream.Readable.prototype.read, none of the mechanisms appear to start pumping data synchronously.
Multiple event handlers in a row:
import * as stream from 'node:stream';
const r = stream.Readable.from(['my\ndata', 'more\nstuff']);
r.on('data', data => { /* do something */ });
// oops, attaching the first event handler could have already
// sent out all data synchronously
r.on('data', data => { /* do more */ });
Starting a pipe before the event handlers exist:
import * as stream from 'node:stream';
import * as readline from 'node:readline';
const r = stream.Readable.from(['my\ndata', 'more\nstuff']);
const p = new stream.PassThrough();
const rl = readline.createInterface({
  input: p,
  output: null,
});
r.pipe(p);
// oops, starting the pipe could have already sent all the data,
// giving "line" events without any listener being attached yet
rl.on('line', line => { /* do something */ });
My hope is that I missed a line in the docs, and that's that. Otherwise it will be hard to tell how much I can depend on this behavior. E.g. in the examples above, one could explicitly pause the stream first and resume it on nextTick (or similar), but it would be considerably cleaner if this behavior could be depended upon.
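The pause-then-resume workaround mentioned above can be sketched like this, as a defensive variant of the first example (unnecessary if the deferred start turns out to be guaranteed):

```javascript
import { Readable } from 'node:stream';
import { once } from 'node:events';

const r = Readable.from(['my\ndata', 'more\nstuff']);
r.pause(); // explicitly paused: attaching 'data' handlers will not resume it

const seenByFirst = [];
const seenBySecond = [];
r.on('data', chunk => seenByFirst.push(chunk));
r.on('data', chunk => seenBySecond.push(chunk));

// resume only once every handler is attached
process.nextTick(() => r.resume());

await once(r, 'end');
// both handlers have now seen both chunks
```

This leans only on the documented rule that attaching a 'data' listener does not resume a stream that was explicitly paused, rather than on any tick-timing behavior.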
[1]: e.g. the entire added complexity and construct of keeping track of sync in the stream internals, or old posts asking about this behavior at the time
If you work with JavaScript, you probably know about the event loop along with macro- and microtasks. The WHATWG spec defines the structure of a task as consisting of:
steps
source
document
script evaluation environment settings object set
While the first three items are clear, the last one is not. It is mostly used in "prepare to run script" for gathering "environment settings objects", and it is also used as part of the event loop for reporting long tasks using the gathered "environment settings objects".
My question is: what is the case in which a task has more than one "environment settings object" in its "script evaluation environment settings object set"? An example would not be excessive.
My supposition: there is one place in the WHATWG spec that says the following:
These algorithms are not invoked by one script directly calling another, but they can be invoked reentrantly in an indirect manner, e.g. if a script dispatches an event which has event listeners registered.
Hence the example would be the following:
// example based on the chromium flag --process-per-site
setTimeout(() => {
  console.log("task has been started");
  const auxiliary = window.open("about:blank"); // or an address from the same origin
  auxiliary.document.body.onclick = () => {
    auxiliary.document.writeln(`<script>const observer = new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        console.log(entry);
      }
    });
    observer.observe({entryTypes: ['longtask']});</script>`);
    auxiliary.document.writeln("Here we go");
    auxiliary.document.body.onclick = null;
  };
  auxiliary.document.body.click();
  const start = performance.now();
  while (performance.now() < start + 3000);
  console.log("heavy computation in the entry script is over");
});
What do we see? We see that this code produces one task, and that it creates an auxiliary browsing context in which an observer is set up. Finally, this observer fires in the auxiliary context during the long task.
So I can conclude that this code demonstrates how a task can have more than one "environment settings object" in the set of its task structure.
This is my first time writing a JS library. The library is intended to execute, at specific times, functions defined in the file that required the library. Kind of like Angular executes user-implemented hooks such as $onInit, except that, in my case, the user can define an arbitrary number of functions to be called by my library. How can I implement that?
One way I have in mind is to define a registerFunction(name, function) method, which maps function names to implementations. But can the user just give me an array of names so that I automatically register the corresponding functions for them?
Unless you have a specific requirement that it do so, your module does not need to know the names of the functions it is provided. When your module invokes those functions, it will do so by acting on direct references to them rather than by using their names.
For example:
// my-module.js
module.exports = function callMyFunctions( functionList ) {
  functionList.forEach( fn => fn() )
}
// main application
const myFunc1 = () => console.log('Function 1 executing')
const myFunc2 = () => console.log('Function 2 executing')
const moduleThatInvokesMyFunctions = require('./my-module.js')
// instruct the module to invoke my 2 cool functions
moduleThatInvokesMyFunctions([ myFunc1, myFunc2 ])
//> Function 1 executing
//> Function 2 executing
See that the caller provides direct function references to the module, which the module then uses -- without caring or even knowing what those functions are called. (Yes, you can obtain their names by inspecting the function references, but why bother?)
If you want a more in-depth answer or explanation, it would help to know more about your situation. What environment does your library target: browsers? nodejs? Electron? react-native?
The library is intended to execute, at specific times, functions in the file that required the library
The "at specific times" suggests to me something that is loosely event-based. So, depending on what platform you're targeting, you could actually use a real EventEmitter. In that case, you'd invent unique names for each of the times that a function should be invoked, and your module would then export a singleton emitter. Callers would then assign event handlers for each of the events they care about. For callers, that might look like this:
const lifecycleManager = require('./your-module.js')
lifecycleManager.on( 'boot', myBootHandler )
lifecycleManager.on( 'config-available', myConfigHandler )
// etc.
A cruder way to handle this would be for callers to provide a dictionary of functions:
const orchestrateJobs = require('./your-module.js')
orchestrateJobs({
  'boot': myBootHandler,
  'config-available': myConfigHandler
})
If you're not comfortable working with EventEmitters, this may be appealing. But going this route requires that you consider how to support other scenarios like callers wanting to remove a function, and late registration.
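A minimal sketch of what that dictionary approach could look like inside the module, including removal and late registration (all names here are illustrative, not a real API):

```javascript
// illustrative module internals: a Map from event name to handler
const handlers = new Map();

function orchestrateJobs(dict) {
  // late registration is just another call to this function
  for (const [name, fn] of Object.entries(dict)) handlers.set(name, fn);
}

function removeJob(name) {
  handlers.delete(name);
}

// the module would call this internally when one of its "specific times" arrives
function fire(name, ...args) {
  const fn = handlers.get(name);
  if (fn) fn(...args);
}
```

Note that this supports only one handler per name; a real EventEmitter gives you multiple listeners per event (and removal) for free.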
Quick sketch showing how to use apply with each function:
// my-module.js
module.exports = function callMyFunctions( functionList ) {
  // thisValue and arrayOfArguments are placeholders for whatever
  // context and arguments your library wants to supply
  functionList.forEach( fn => fn.apply( thisValue, arrayOfArguments ) )
}
Note that this module still has no idea what names the caller has assigned to these functions. Within this scope, each routine bears the moniker "fn."
I get the sense you have some misconceptions about how execution works, and that's led you to believe that the parts of the program need to know the names of other parts of the program. But that's not how continuation-passing style works.
Since you're firing caller functions based on specific times, it's possible the event model might be a good fit. Here's a sketch of what that might look like:
// caller
const AlarmClock = require('./your-module.js')
function doRoosterCall( exactTime ) {
  console.log('I am a rooster! Cock-a-doodle-doo!')
}
function soundCarHorn( exactTime ) {
  console.log('Honk! Honk!')
}
AlarmClock.on('sunrise', doRoosterCall)
AlarmClock.on('leave-for-work', soundCarHorn)
// etc
To accomplish that, you might do something like...
// your-module.js
const EventEmitter = require('events')
const singletonClock = new EventEmitter()
function checkForEvents() {
  const currentTime = new Date()
  // check for sunrise, which we'll define as 6:00am +/- 10 seconds
  if (nowIs('6:00am', 10 * 1000)) {
    singletonClock.emit('sunrise', currentTime)
  }
  // check for "leave-for-work": 8:30am +/- 1 minute
  if (nowIs('8:30am', 60 * 1000)) {
    singletonClock.emit('leave-for-work', currentTime)
  }
}
setInterval( checkForEvents, 1000 )
module.exports = singletonClock
(nowIs is some handwaving for time comparisons. When doing cron-like work, you should assume your heartbeat function will almost never fire when the time value is an exact match, so you'll need something that provides "close enough" comparisons. I didn't provide an implementation because (1) it seems like a peripheral concern here, and (2) I'm sure Moment.js, date-fns, or some other package provides something great, so you won't need to implement it yourself.)
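For completeness, one possible shape of that nowIs helper; this is purely illustrative, and a date library will handle edge cases (midnight wraparound, time zones) far better:

```javascript
// nowIs('6:00am', 10 * 1000): is the given time-of-day within
// toleranceMs of `now`? The "h:mmam/pm" format is an assumption here.
function nowIs(timeString, toleranceMs, now = new Date()) {
  const m = /^(\d{1,2}):(\d{2})(am|pm)$/.exec(timeString);
  if (!m) throw new Error(`unparseable time: ${timeString}`);
  let hours = Number(m[1]) % 12;
  if (m[3] === 'pm') hours += 12;
  const target = new Date(now);
  target.setHours(hours, Number(m[2]), 0, 0);
  return Math.abs(now - target) <= toleranceMs;
}
```

The extra `now` parameter exists only so the comparison is testable without waiting for an actual sunrise.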
I have a TypeScript application running on Node.js where I use the EventEmitter class to emit an event when the value of a variable changes.
I want to wait for that event before proceeding further, and hence I need to induce a wait in my TypeScript code.
This is what I am doing:
if (StreamManagement.instance.activeStreams.get(streamObj.streamContext).streamState === 'Paused') {
  await StreamManagement.instance.waitForStreamActive(streamObj);
}
This is my waitForStreamActive method,
public async waitForStreamActive(stream: Stream) {
  const eventEmitter = new EventEmitter();
  return new Promise(( resolve ) => {
    eventEmitter.on('resume', resolve );
  });
}
And I trigger the emit event like this,
public async updateStream(streamContext: string, state: boolean): Promise<string> {
  const eventEmitter = new EventEmitter();
  if (state === Resume) {
    const streamState = StreamManagement.instance.activeStreams.get(streamContext).streamState = 'Active';
    eventEmitter.emit('resume');
    return streamState;
  }
}
All three code snippets are in different classes, in different files.
This code doesn't seem to work as I expected.
I want to achieve a wait in JavaScript until the promise is resolved, that is, until the state is changed to resume.
Can someone please point me where I am going wrong?
Can someone please point me where I am going wrong?
You have two different EventEmitters. Events emitted on one EventEmitter do not fire listeners on another.
More Code Review
Firing and listening on the same EventEmitter will work. That said, a Promise is not the correct abstraction for things that emit multiple times: a Promise can only be resolved once, whereas events can fire many times. I suggest using the EventEmitter as is, or alternatively some other stream abstraction, e.g. Observable 🌹
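A minimal sketch of that fix: both sides must reference one shared emitter, and Node's events.once conveniently wraps the next occurrence of an event in a promise (class and state details from the question are stripped out here):

```javascript
import { EventEmitter, once } from 'node:events';

// one emitter shared by both sides; creating a fresh EventEmitter in each
// method (as in the question) means 'resume' is emitted on an object
// nobody is listening to
const streamEvents = new EventEmitter();

function waitForStreamActive() {
  // resolves the next time 'resume' fires on the shared emitter
  return once(streamEvents, 'resume');
}

function updateStream() {
  streamEvents.emit('resume');
}
```

With this shape, `await waitForStreamActive()` behind the 'Paused' check from the question behaves as intended, at least for a single resume.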
EventEmitter (observer pattern) and Promise (chain-of-responsibility pattern) have different obligations. I see that you want to use both. In your case that does not work, because EventEmitter is not designed to chain observer processors. Use simple promises and builders only. There is a very good library, RxJS, which provides a lot of functionality. It can do what you ask: build an event-driven architecture with sync/async chained cases.
I'm looking at the Node.js readline module documentation for a task where I have to read a very large file line by line, and it looks good. But for my particular task, I need it to read lines synchronously (i.e. in order: no matter what, line 5 must not be read before line 4), and given the async nature of Node, I just want to confirm that this code is safe for that usage:
const readline = require('readline');
const fs = require('fs');
const rl = readline.createInterface({
  input: fs.createReadStream('sample.txt')
});
rl.on('line', (line) => {
  console.log(`Line from file: ${line}`);
});
If not, what should I use or do? Currently it is working for me, but I don't know if it will hold up with large files, where the next line could be parsed faster than the previous one was processed, etc.
I doubt very much that it is possible for a callback fired later to be executed earlier than another one.
Basically, this comes down to the event loop and the callback queue of the process.
Still, to guarantee it, I can suggest implementing something similar to async/queue, but with the ability to dynamically push callbacks.
Assuming you will have something like this:
const Queue = require('./my-queue')
const queue = new Queue()

function addLineToQueue(line) {
  queue.push(function() {
    // do some job with line
    console.log(`Line: "${line}" was successfully processed!`)
  })
}
You will modify your code:
rl.on('line', (line) => {
  addLineToQueue(line)
  console.log(`Added line to queue: ${line}`)
})
And of course your queue implementation should start executing as soon as it has any tasks. This way the order of the callbacks is guaranteed. But to me it looks like a bit of overhead.
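For reference, such a queue can be as small as a promise chain. A sketch of the './my-queue' idea (a hypothetical module, not a real package):

```javascript
// minimal serial queue: each pushed task starts only after the previous
// one has finished, so lines are processed strictly in push order even
// when individual tasks are asynchronous
class Queue {
  constructor() {
    this.tail = Promise.resolve();
  }
  push(task) {
    // chain the task onto the previous one; errors would need handling
    // here in real code so one failure doesn't stall the queue
    this.tail = this.tail.then(() => task());
    return this.tail; // resolves when this task is done
  }
}
```

Tasks can return promises or plain values; either way the chain preserves order.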
I am having problems testing a piece of code similar to the function below.
Basically the question boils down to: is it possible to change the scheduler for the debounce operator without passing a separate scheduler into the function call?
The following example should explain the use case a bit more concrete. I am trying to test a piece of code similar to the following. I want to test the chain in the function (using a TestScheduler) without having to pass a scheduler to the debounce() operator.
// Production code
function asyncFunctionToTest(subject) {
  subject
    .tap((v) => console.log(`Tapping: ${v}`))
    .debounce(1000)
    .subscribe((v) => {
      // Here it would call ReactComponent.setState()
      console.log(`onNext: ${v}`)
    });
}
The test file would contain the following code to invoke the function and make sure the subject emits the values.
// Testfile
const testScheduler = new Rx.TestScheduler();
const subject = new Rx.Subject();
asyncFunctionToTest(subject);
testScheduler.schedule(200, () => subject.onNext('First'));
testScheduler.schedule(400, () => subject.onNext('Second'));
testScheduler.advanceTo(1000);
The test code above still takes one actual second to do the debounce. The only solution I have found is to pass the TestScheduler into the function and hand it to debounce(1000, testScheduler). This makes the debounce operator use the test scheduler.
My initial idea was to use observeOn or subscribeOn to change the defaultScheduler that is used throughout the operation chain by changing
asyncFunctionToTest(subject);
to be something like asyncFunctionToTest(subject.observeOn(testScheduler)); or asyncFunctionToTest(subject.subscribeOn(testScheduler));
That does not give me the result I expected; however, I presume I might not exactly understand how the observeOn and subscribeOn operators work. (My guess now is that these operators change the scheduler the whole operation chain runs on, but operators still pick their own schedulers unless one is specifically passed?)
The following JSBin contains the runnable example where i passed in the scheduler. http://jsbin.com/kiciwiyowe/1/edit?js,console
No, not really, unless you actually patch the RxJS library. I know this was brought up recently as an issue, and there may be support for changing the default scheduler at some point in the future, but at this time it can't be reliably done.
Is there any reason why you can't include the scheduler? All the operators that accept schedulers already do so optionally and have sensible defaults, so it really costs you nothing, given that your production code can simply ignore the parameter.
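The same idea works outside RxJS too: any time-based code can accept its timer functions as optional parameters, which is exactly what makes it testable. A generic sketch (not the RxJS implementation) of a debounce with injectable timers:

```javascript
// debounce with injectable timer functions: production callers use the
// defaults, tests pass fakes they can trigger by hand
function debounce(fn, waitMs, { setTimer = setTimeout, clearTimer = clearTimeout } = {}) {
  let handle = null;
  return (...args) => {
    // cancel any pending invocation and schedule a fresh one
    if (handle !== null) clearTimer(handle);
    handle = setTimer(() => fn(...args), waitMs);
  };
}
```

In a test, `setTimer`/`clearTimer` can be fakes that record scheduled callbacks, so the "clock" advances only when the test says so, with no real waiting.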
As a more general aside, the reason simply adding observeOn or subscribeOn doesn't fix it is that both of those operators only affect how events are propagated after they have been received by that operator.
For instance you could implement observeOn by doing the following:
Rx.Observable.prototype.observeOn = function (scheduler) {
  // a function expression rather than an arrow function, so that `this`
  // refers to the source observable
  var source = this;
  return Rx.Observable.create((observer) => {
    return source.subscribe(
      x => {
        // Reschedule this for a later propagation
        scheduler.schedule(x, (s, state) => observer.onNext(state));
      },
      // Errors get forwarded immediately
      e => observer.onError(e),
      // Delay completion
      () => scheduler.schedule(null, () => observer.onCompleted()));
  });
};
All the above does is reschedule the incoming events; if operators downstream or upstream have other delays, this operator has no effect on them. subscribeOn has similar behavior, except that it reschedules the subscription rather than the events.
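That contrast can be shown with a tiny observable-like sketch that needs no RxJS at all (all names here are illustrative, and the "scheduler" is just a queue of deferred calls):

```javascript
// minimal observable-like values: just an object with subscribe(next)
function makeSource(values) {
  return { subscribe: (next) => values.forEach(next) };
}

// observeOn: pass each event through the scheduler before delivery
function observeOn(source, schedule) {
  return { subscribe: (next) => source.subscribe(v => schedule(() => next(v))) };
}

// subscribeOn: pass only the act of subscribing through the scheduler;
// once subscribed, events flow exactly as the source emits them
function subscribeOn(source, schedule) {
  return { subscribe: (next) => schedule(() => source.subscribe(next)) };
}
```

Neither variant changes any delay the source itself imposes, which is why neither helps with an upstream debounce.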