ANTLR4 how to wait for processing all rules? - javascript

I have created my own Xtext-based DSL and a VS Code-based editor using the Language Server Protocol. I parse the model from the current TextDocument with antlr4ts. Below is the code snippet for the listener:
class TreeShapeListener implements DebugInternalModelListener {
    public async enterRuleElem1(ctx: RuleElem1Context): Promise<void> {
        ...
        // by the time the response is received, the 'walk' function has already returned
        var resp = await this.client.sendRequest(vscode_languageserver_protocol_1.DefinitionRequest.type,
            this.client.code2ProtocolConverter.asTextDocumentPositionParams(this.document, position))
            .then(this.client.protocol2CodeConverter.asDefinitionResult, (error) => {
                return this.client.handleFailedRequest(vscode_languageserver_protocol_1.DefinitionRequest.type, error, null);
            });
        ...
        this.model.addElem1(elem1);
    }
    public async enterRuleElem2(ctx: RuleElem2Context): Promise<void> {
        ...
        this.model.addElem2(elem2);
    }
}
And here I create the parser and the tree walker:
// Create the lexer and parser
let inputStream = antlr4ts.CharStreams.fromString(document.getText());
let lexer = new DebugInternalModelLexer(inputStream);
let tokenStream = new antlr4ts.CommonTokenStream(lexer);
let parser = new DebugInternalModelParser(tokenStream);
parser.buildParseTree = true;
let tree = parser.ruleModel();
let model = new Model();
ParseTreeWalker.DEFAULT.walk(new TreeShapeListener(model, client, document) as ParseTreeListener, tree);
console.log(model);
The problem is that while processing one of the rules (enterRuleElem1) I make an async call (client.sendRequest) whose response only arrives after ParseTreeWalker.DEFAULT.walk has already returned. How can I make walk wait until all the rules have been processed?
Edit 1: Not sure if this is how the walk function works, but I tried to recreate the above scenario with the minimal code below:
function setTimeoutPromise(delay) {
    return new Promise((resolve, reject) => {
        if (delay < 0) return reject("Delay must not be negative")
        setTimeout(() => {
            resolve(`You waited ${delay} milliseconds`)
        }, delay)
    })
}
async function enterRuleBlah() {
    let resp = await setTimeoutPromise(2500);
    console.log(resp);
}
function enterRuleBlub() {
    console.log('entered blub');
}
function walk() {
    enterRuleBlah();
    enterRuleBlub();
}
walk();
console.log('finished parsing');
and the output is
entered blub
finished parsing
You waited 2500 milliseconds
Edit 2: I tried the suggestion from the answer and now it works! My solution looks like this:
public async doStuff() {
    ...
    return new Promise((resolve) => {
        resolve(0);
    })
}
let listener = new TreeShapeListener(model, client, document);
ParseTreeWalker.DEFAULT.walk(listener as ParseTreeListener, tree);
await listener.doStuff();

The tree walk is entirely synchronous, regardless of whether you make your listener/visitor rules async or not. It is better to separate the requests from the walk: the walk should only collect the information needed to know what to send, and afterwards you process that collection and actually send the requests, which you can then await.
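For illustration, here is a minimal sketch of that separation; positionOf() and lookupDefinition() are hypothetical helpers standing in for the position math and the client.sendRequest(...) call from the question, not part of the original code:

// Sketch: during the synchronous walk, the listener only records what it will need later.
class CollectingListener implements DebugInternalModelListener {
    public readonly pending: RuleElem1Context[] = [];

    public enterRuleElem1(ctx: RuleElem1Context): void {
        // synchronous: just remember the context, don't send anything yet
        this.pending.push(ctx);
    }
}

const listener = new CollectingListener();
ParseTreeWalker.DEFAULT.walk(listener as ParseTreeListener, tree);

// Now that the walk has finished, fire all requests and wait for every one of them.
const definitions = await Promise.all(
    listener.pending.map((ctx) => lookupDefinition(client, document, positionOf(ctx)))
);
// build the model from `definitions` here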

Related

Node.js htmlparser2 writableStream still emits events after end() call

Sorry for the probably trivial question, but I still fail to get how streams work in Node.js.
I want to parse an HTML file and get the path of the first script I encounter. I'd like to interrupt the parsing after the first match, but the onopentag() listener is still invoked until the effective end of the HTML file. Why?
const { WritableStream } = require("htmlparser2/lib/WritableStream");
const scriptPath = await new Promise(function(resolve, reject) {
    try {
        const parser = new WritableStream({
            onopentag: (name, attrib) => {
                if (name === "script" && attrib.src) {
                    console.log(`script : ${attrib.src}`);
                    resolve(attrib.src); // return the first script, effectively called for each script tag
                    // none of the calls below seem to work
                    indexStream.unpipe(parser);
                    parser.emit("close");
                    parser.end();
                    parser.destroy();
                }
            },
            onend() {
                resolve();
            }
        });
        const indexStream = got.stream("/index.html", {
            responseType: 'text',
            resolveBodyOnly: true
        });
        indexStream.pipe(parser); // and parse it
    } catch (e) {
        reject(e);
    }
});
Is it possible to close the parser stream before the effective end of indexStream, and if so, how?
If not, why not?
Note that the code works and my promise is effectively resolved using the first match.
There's a little confusion about how the WritableStream works. First off, when you do this:
const parser = new WritableStream(...)
that's misleading. It really should be this:
const writeStream = new WritableStream(...)
The actual HTML parser is an instance variable in the WritableStream object named ._parser (see code). It's that parser that is emitting the onopentag() callbacks, and because it works off a buffer that may still hold accumulated text, disconnecting from the read stream may not immediately stop events that are still coming from the buffered data.
The parser itself has a public reset() method, and it appears that if you disconnect it from the read stream and then call that reset() method, it should stop emitting events.
You can try this (I'm not a TypeScript person so you may have to massage some things to make the TypeScript compiler happy, but hopefully you can see the concept here):
const { WritableStream } = require("htmlparser2/lib/WritableStream");
const scriptPath = await new Promise(function(resolve, reject) {
    try {
        const writeStream = new WritableStream({
            onopentag: (name, attrib) => {
                if (name === "script" && attrib.src) {
                    console.log(`script : ${attrib.src}`);
                    resolve(attrib.src); // return the first script, effectively called for each script tag
                    // disconnect the readstream
                    indexStream.unpipe(writeStream);
                    // reset the internal parser so it clears any buffers it
                    // may still be processing
                    writeStream._parser.reset();
                }
            },
            onend() {
                resolve();
            }
        });
        const indexStream = got.stream("/index.html", {
            responseType: 'text',
            resolveBodyOnly: true
        });
        indexStream.pipe(writeStream); // and parse it
    } catch (e) {
        reject(e);
    }
});

How to execute synchronous HTTP requests in JavaScript

I need to execute an unknown number of HTTP requests in a Node.js program, and it needs to happen sequentially: only when one request gets its response should the next request be executed. How can I implement that in JS?
I tried it synchronously with the request package:
function HttpHandler(url){
    request(url, function (error, response, body) {
        ...
    })
}
HttpHandler("address-1")
HttpHandler("address-2")
...
HttpHandler("address-100")
And asynchronously with request-promise:
async function HttpHandler(url){
    const res = await request(url)
    ...
}
HttpHandler("address-1")
HttpHandler("address-2")
...
HttpHandler("address-100")
None of them works, and as I said, I can have an unknown number of HTTP requests over the program; it depends on the end user.
Any ideas on how to handle that?
Use the got() library, not the request() library, because the request() library has been deprecated and does not support promises. Then you can use async/await and a for loop to sequence your calls one after another.
const got = require('got');
let urls = [...]; // some array of urls
async function processUrls(list) {
    for (let url of list) {
        await got(url);
    }
}
processUrls(urls).then(() => {
    console.log("all done");
}).catch(err => {
    console.log(err);
});
You claim some sort of dynamic list of URLs, but don't show how that works, so you'll have to figure out that part of the logic yourself. I'd be happy to show how to solve that part, but you haven't given us any idea how it should work.
If you want a queue that you can regularly add items to, you can do something like this:
class sequencedQueue {
    // fn is a function to call on each item in the queue
    // if it's asynchronous, it should return a promise
    constructor(fn) {
        this.queue = [];
        this.processing = false;
        this.fn = fn;
    }
    add(...items) {
        this.queue.push(...items);
        return this.run();
    }
    async run() {
        // if not already processing, start processing
        // because of await, this is not a blocking while loop
        while (!this.processing && this.queue.length) {
            try {
                this.processing = true;
                await this.fn(this.queue.shift());
            } catch (e) {
                // need to decide what to do upon error
                // this is currently coded to just log the error and
                // keep processing. To end processing, throw an error here.
                console.log(e);
            } finally {
                this.processing = false;
            }
        }
    }
}
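A minimal usage sketch of that queue, assuming the got() import from above; the URLs are just placeholders for whatever the end user supplies:

const got = require('got');

// Create a queue that fetches each URL in order; got() returns a promise,
// so the queue waits for each request before starting the next one.
const queue = new sequencedQueue((url) => got(url));

// Items can be added at any time, even while earlier ones are still in flight.
queue.add("https://example.com/first");
queue.add("https://example.com/second", "https://example.com/third");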

What are ways to run a script only after another script has finished?

Let's say this is my code (just a sample I wrote up to show the idea):
var extract = require("./postextract.js");
var rescore = require("./standardaddress.js");
RunFunc();
function RunFunc() {
extract.Start();
console.log("Extraction complete");
rescore.Start();
console.log("Scoring complete");
}
And I don't want rescore.Start() to run until the entire extract.Start() has finished. Both scripts contain a spiderweb of functions inside them, so putting a callback directly into the Start() function does not appear viable, as the final function won't return to it, and I am having a lot of trouble understanding how to use Promises. What are ways I can make this work?
These are the scripts that extract.Start() begins and ends with. OpenWriter() is reached through multiple other functions and streams, with the actual fileWrite.write() being in another script attached to this one (though that is not needed to detect the end of the run). Currently, fileWrite.on('finish') is where I want the script to be considered done.
module.exports = {
    Start: function CodeFileRead() {
        //this.country = countryIn;
        //Read stream of the address components
        fs.createReadStream("Reference\\" + postValid.country + " ADDRESS REF DATA.csv")
            //Change separator based on file
            .pipe(csv({escape: null, headers: false, separator: delim}))
            //Indicate start of reading
            .on('resume', (data) => console.log("Reading complete postal code file..."))
            //Processes lines of data into storage array for comparison
            .on('data', (data) => {
                postValid.addProper[data[1]] = JSON.stringify(Object.values(data)).replace(/"/g, '').split(',').join('*');
            })
            //End of reading file
            .on('end', () => {
                postValid.complete = true;
                console.log("Done reading");
                //Launch main script, delayed to here in order to not read ahead of this stream
                ThisFunc();
            });
    },
    extractDone
}

function OpenWriter() {
    //File stream for writing the processed chunks into a new file
    fileWrite = fs.createWriteStream("Processed\\" + fileName.split('.')[0] + "_processed." + fileName.split('.')[1]);
    fileWrite.on('open', () => console.log("File write is open"));
    fileWrite.on('finish', () => {
        console.log("File write is closed");
    });
}
EDIT: I do not want to simply add the next script onto the end of the previous one and forgo the master file, as I don't know how long it will be and it's supposed to be designed to be capable of taking additional scripts past our development period. I cannot just use a package as it stands because approval time at the company takes up to two weeks and I need this more immediately.
DOUBLE EDIT: This is all my code, every script and function is all written by me, so I can make the scripts being called do what's needed
You can just wrap your function in a Promise and return that.
module.exports = {
    Start: function CodeFileRead() {
        return new Promise((resolve, reject) => {
            fs.createReadStream(
                'Reference\\' + postValid.country + ' ADDRESS REF DATA.csv'
            )
                // .......some code...
                .on('end', () => {
                    postValid.complete = true;
                    console.log('Done reading');
                    resolve('success');
                });
        });
    }
};
And run RunFunc like this:
async function RunFunc() {
    await extract.Start();
    console.log("Extraction complete");
    await rescore.Start();
    console.log("Scoring complete");
}
// call it and handle completion (or wrap the call in an IIFE):
RunFunc().then(() => {
    console.log("All Complete");
})
Note: you can/should also handle errors by calling reject(someError) when an error occurs.
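For instance, a minimal sketch of that inside the Promise wrapper above: hook the read stream's error event (the exact event source depends on your pipeline) so that an awaited Start() surfaces the failure.

// Sketch: reject the wrapping promise when the read stream errors,
// so `await extract.Start()` throws instead of hanging.
fs.createReadStream('Reference\\' + postValid.country + ' ADDRESS REF DATA.csv')
    .on('error', (err) => reject(err))
    // ...rest of the pipeline as above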
EDIT, after learning about ThisFunc():
Making a new event emitter will probably be the easiest solution:
eventEmitter.js
const EventEmitter = require('events').EventEmitter
module.exports = new EventEmitter()

const eventEmitter = require('./eventEmitter');
module.exports = {
    Start: function CodeFileRead() {
        return new Promise((resolve, reject) => {
            //after all of your code
            eventEmitter.once('WORK_DONE', () => {
                resolve("Done");
            })
        });
    }
};
function OpenWriter() {
    ...
    fileWrite.on('finish', () => {
        console.log("File write is closed");
        eventEmitter.emit("WORK_DONE");
    });
}
And run RunFunc as before.
There's no generic way to determine when everything a function call does has finished.
It might accept a callback. It might return a promise. It might not provide any kind of method to determine when it is done. It might have side effects that you could monitor by polling.
You need to read the documentation and/or source code for that particular function.
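As an illustration of the first two cases: a function that takes a Node.js-style (err, result) callback can be wrapped into a promise (for example with util.promisify) so it can be awaited. The doWork function below is a hypothetical stand-in, not anything from the question.

import { promisify } from "util";

// Hypothetical callback-style API (stand-in for a real library function).
function doWork(input: string, cb: (err: Error | null, result?: string) => void): void {
    setTimeout(() => cb(null, `processed ${input}`), 100);
}

// If a function follows the Node.js (err, result) callback convention,
// promisify() turns it into a promise-returning function that can be awaited.
const doWorkAsync = promisify(doWork);

async function run() {
    const result = await doWorkAsync("some input"); // resolves once the callback fires
    console.log(result);
}

run();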
Use async/await (promises), example:
var extract = require("./postextract.js");
var rescore = require("./standardaddress.js");
RunFunc();
async function extract_start() {
    try {
        // for the await in RunFunc() to actually wait, Start() must return a promise
        // (e.g. wrapped as shown in the answer above)
        return await extract.Start()
    }
    catch(e){
        console.log(e)
    }
}
async function rescore_start() {
    try {
        return await rescore.Start()
    }
    catch(e){
        console.log(e)
    }
}
async function RunFunc() {
    await extract_start();
    console.log("Extraction complete");
    await rescore_start();
    console.log("Scoring complete");
}

Node.js: running a function in the background

I have a Node.js program in which I need to run two functions at the beginning of the program and later on access their results. Currently I await each function one at a time, and that works.
However, in order to save time I don't want to wait for GetServices and GetProcess, since I only need their data later on in the project.
It takes about 4 seconds to get this data and I want to run it in the background, as I don't need the results immediately.
How can I do that in Node.js? If I use Promise.all, it would wait for GetServices and GetProcess before going on to the rest of the program.
An example:
async function main() {
    // I want to run both of these functions in the background to save time
    let service = await GetServices();
    this.process = await GetProcess();
    // ... here additional code is running
    // let's say that after 30 seconds this code is called
    let users = GetUser(service);
    let addresses = GetAdress(this.process);
}
I'm actually running a Yeoman generator:
https://yeoman.io/authoring/
https://yeoman.io/authoring/user-interactions.html
export default class myGenerator extends Generator {
    // here I want to run those functions in the background to save time, as prompting
    // the user takes some time (let's say the user has many questions...)
    async initializing() {
        let service = await GetServices();
        this.process = await GetProcess();
    }
    async prompting() {
        const answers = await this.prompt([
            {
                type: "input",
                name: "name",
                message: "Your project name",
                default: this.appname // Default to current folder name
            },
            {
                type: "confirm",
                name: "list",
                choices: this.process // here I need the data from the function running in the background
            }
        ]);
    }
}
Let's assume that getServices() may take 3 seconds and getProcess() may take 4 seconds; if you run both functions at the same time, you will get the return values of both promises after 4 seconds in total.
You can execute other code while this runs in the background; there will be a callback when the promises resolve, and your late functions will be called at that stage.
Check the simple example below:
let service;
let process;
function main() {
    // Both functions will execute in the background
    Promise.all([getServices(), getProcess()]).then((val) => {
        service = val[0];
        process = val[1];
        console.log(service, process);
        // After completion this code will be called
        // let users = GetUser(service);
        // let users = GetAdress(process);
        console.log('I am called after all promises completed.')
    });
    // Current example.
    // let service = await GetServices();
    // this.process = await GetProcess();
    /* Code blocks.. */
    console.log('Code will execute without delay...')
}
function getServices() {
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            resolve("service is returned")
        }, 3000);
    });
}
function getProcess() {
    return new Promise((resolve, reject) => {
        setTimeout(() => {
            resolve("process is returned")
        }, 4000);
    });
}
main();
You can start the asynchronous operation but not await it yet:
function suppressUnhandledRejections(p) {
    p.catch(() => {});
    return p;
}
async function main() {
    // We have to suppress unhandled rejections on these promises. If they become
    // rejected before we await them later, we'd get a warning otherwise.
    const servicePromise = suppressUnhandledRejections(GetServices());
    this.processPromise = suppressUnhandledRejections(GetProcess());
    // Do other stuff
    const service = await servicePromise;
    const process = await this.processPromise;
}
Also consider using Promise.all() which returns a promise for the completion of all promises passed to it.
async function main() {
    const [ services, process, somethingElse ] = await Promise.all([
        GetServices(),
        GetProcess(),
        SomeOtherAsyncOperation(),
    ]);
    // Use the results.
}
To do what you need, you have to understand the event loop.
Node.js is designed to work in a single thread, unlike languages like Go; however, Node.js does handle some work on other threads under the hood. You can use process.nextTick() to add a new callback to the main thread, and it will be executed at the end of the current block.
async function main() {
    // I want to run both of these functions in the background to save time
    let service = await GetServices();
    this.process = await GetProcess();
    // ... here additional code is running
    // let's say that after 30 seconds this code is called
    let users = GetUser(service);
    let addresses = GetAdr(this.process);
}
function someFunction(){
    // do something...
}
main();
process.nextTick(someFunction); // happens after all of main()'s processing has finished...

How to work properly with async operations in nodejs streams

Recently I started to wonder whether I handle async operations inside Node.js streams the right way, so I just want to make sure that I do.
class AsyncTransform extends Transform {
    constructor() {
        super({objectMode: true});
    }
    public async _transform(chunk, enc, done) {
        const result = await someAsyncStuff();
        result && this.push(result);
        done();
    }
}
This small example works great, but basically all async stuff takes some time to execute, and in most cases I want to work with chunks in parallel. I could call done at the top of _transform, but that is not a solution for several reasons, one of them being that once the last chunk has invoked done, the next push will throw Error: stream.push() after EOF. So we can't push once the last chunk has already called its done.
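To make that failure mode concrete, here is a minimal sketch (the someAsyncStuff() helper is a hypothetical stand-in for the real async work): calling done() before the async work finishes lets the stream end, so a late push() can throw.

import { Transform } from "stream";

// Hypothetical async work, e.g. a lookup that takes a while.
async function someAsyncStuff(chunk: any): Promise<any> {
    return new Promise((resolve) => setTimeout(() => resolve(chunk), 100));
}

class EagerDoneTransform extends Transform {
    constructor() {
        super({ objectMode: true });
    }
    _transform(chunk: any, _enc: string, done: (error?: Error | null) => void) {
        done(); // signal "ready for the next chunk" immediately
        someAsyncStuff(chunk).then((result) => {
            // If the upstream has already ended by the time this resolves,
            // this throws "Error: stream.push() after EOF".
            if (result) this.push(result);
        });
    }
}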
To handle this case I use the transform's _flush method in conjunction with a chunk queue counter (increased when a chunk comes in, decreased on push). God damn, there are already so many words, so here is just a sample of my code.
const minute = 60000;
const second = 1000;
const CRITICAL_QUEUE_POINT = 50;

export class AsyncTransform extends Transform {
    private queue: number = 0;

    constructor() {
        super({objectMode: true});
    }

    public async _transform(chunk, enc, done) {
        this.checkQueue()
            .then(() => this.init(chunk, done));
    }

    public _flush(done) {
        this._done(done, true);
    }

    private async init(chunk, done) {
        this.increaseQueueCounter();
        this._done(done);
        const user = await new UserRepository().search(chunk);
        this.decreaseQueueCounter();
        this._push(user);
    }

    /**
     * Queue
     * */
    private checkQueue(): Promise<any> {
        return new Promise((resolve) => {
            const _checkQueue = () => {
                if (this.queue >= CRITICAL_QUEUE_POINT) {
                    return setTimeout(_checkQueue, second * 10);
                }
                resolve();
            };
            _checkQueue();
        });
    }

    private increaseQueueCounter(): void {
        this.queue++;
    }

    private decreaseQueueCounter(): void {
        this.queue--;
    }

    /**
     * Transform API
     * */
    private _push(user) {
        this.push(user);
    }

    private _done(done, isFlush: boolean = false) {
        if (!isFlush) {
            return done();
        }
        if (this.queue === 0) {
            return done();
        }
        setTimeout(() => this._done(done, isFlush), second * 10);
    }
}
I thought about this two years ago and was looking for a solution - a framework of some sort. I found a couple of frameworks - highland and event-stream - but all were so complex to use that I decided to write a new one: scramjet.
Now your code can be as simple as:
const {DataStream} = require('scramjet');
yourDataStream.pipe(new DataStream())
    .map(async (chunk) => {
        await checkQueue();
        return new UserRepository().search(chunk);
    });
Or, if I understand checkQueue() correctly and it just keeps the number of simultaneous connections below the critical level, then it's even simpler, like this:
yourDataStream.pipe(new DataStream({maxParallel: CRITICAL_QUEUE_POINT}))
    .map(async (chunk) => new UserRepository().search(chunk));
It will keep the number of connections at a stable level (every time there's a response it'll start a new operation).
