Node.js: How to handle delayed errors in streams

I have the following situation.
const { PassThrough } = require("stream");

function emitErrorInStream() {
    let resultStream = new PassThrough();
    let testStream = new PassThrough();
    testStream.on("error", () => {
        throw new Error("AA");
    });
    // the setTimeout simulates what is actually happening in the code.
    /*
     * actual code
     * let testStream = s3.getObject(params).createReadStream();
     * if I pass in an incorrect parameter option to the getObject function,
     * it will be a few milliseconds before an error is thrown and subsequently
     * caught by the stream's error handling method.
     */
    setTimeout(() => { testStream.emit("error", "arg"); }, 100);
    return testStream.pipe(resultStream);
}

try {
    let b = emitErrorInStream();
}
catch (err) {
    console.log(err); // error will not be caught
}
///... continue
I have tried a slew of things to catch the error thrown inside the error handler, including promises, which never resolve. How can I catch the error thrown inside the testStream's error handler?
I have found that emitting an end event inside the on("error") handler partially solves my issue, since it keeps the application from crashing, but it is not a recommended solution: https://nodejs.org/api/stream.html#stream_event_end_1
Lastly, is catching this error possible if emitErrorInStream is a third party function to which I do not have access?
Any insights would be greatly appreciated.
// actual typescript code
downloadStream(bucketName: string, filename: string): Stream {
    const emptyStream = new PassThrough();
    const params = { Bucket: bucketName, Key: filename };
    const s3Stream = this.s3.getObject(params).createReadStream();
    // listen to errors returned by the service, i.e. the specified key does not exist.
    s3Stream.on("error", (err: any) => {
        log.error(`Service Error Downloading File: ${err}`);
        // Have to emit an end event here.
        // Cannot throw an error as it is outside of the event loop
        // and can crash the server.
        // TODO: find better solution as it is not recommended https://nodejs.org/api/stream.html#stream_event_end_1
        s3Stream.emit("end");
    });
    return s3Stream.pipe(emptyStream);
}
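For reference, here is a minimal sketch (not part of the original question) of one common pattern for surfacing such delayed errors to the caller: since pipe() does not forward "error" events from the source to the destination, the wrapper re-emits the error on the returned stream, and the caller attaches an "error" listener instead of relying on try/catch. The function name downloadStreamSketch is hypothetical.

const { PassThrough } = require("stream");

// Hypothetical sketch: forward the source error to the piped stream so the
// caller can observe it; pipe() alone does not propagate "error" events.
function downloadStreamSketch(s3, bucketName, filename) {
    const out = new PassThrough();
    const s3Stream = s3.getObject({ Bucket: bucketName, Key: filename }).createReadStream();
    s3Stream.on("error", (err) => out.emit("error", err));
    return s3Stream.pipe(out);
}

// Caller: handle the delayed error with an event listener rather than
// try/catch, because the error fires after the function has returned.
// downloadStreamSketch(s3, "my-bucket", "missing-key")
//     .on("error", (err) => console.error("download failed:", err));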

Related

Node.js htmlparser2 WritableStream still emits events after end() call

Sorry for the probably trivial question, but I still fail to get how streams work in node.js.
I want to parse an html file and get the path of the first script I encounter. I'd like to interrupt the parsing after the first match, but the onopentag() listener is still invoked until the effective end of the html file. Why?
const { WritableStream } = require("htmlparser2/lib/WritableStream");
const scriptPath = await new Promise(function(resolve, reject) {
    try {
        const parser = new WritableStream({
            onopentag: (name, attrib) => {
                if (name === "script" && attrib.src) {
                    console.log(`script : ${attrib.src}`);
                    resolve(attrib.src); // return the first script, effectively called for each script tag
                    // none of below calls seem to work
                    indexStream.unpipe(parser);
                    parser.emit("close");
                    parser.end();
                    parser.destroy();
                }
            },
            onend() {
                resolve();
            }
        });
        const indexStream = got.stream("/index.html", {
            responseType: 'text',
            resolveBodyOnly: true
        });
        indexStream.pipe(parser); // and parse it
    } catch (e) {
        reject(e);
    }
});
Is it possible to close the parser stream before the effective end of indexStream, and if yes, how? If not, why not?
Note that the code works and my promise is effectively resolved by the first match.
There's a little confusion about how the WritableStream works. First off, when you do this:
const parser = new WritableStream(...)
that's misleading. It really should be this:
const writeStream = new WritableStream(...)
The actual HTML parser is an instance variable in the WritableStream object named ._parser (see code). It's that parser that emits the onopentag() callbacks, and because it works off a buffer that may contain accumulated text, disconnecting from the readstream may not immediately stop events that are still coming from the buffered data.
The parser itself has a public reset() method, and it appears that if you disconnect from the readstream and then call that reset method, it should stop emitting events.
You can try this (I'm not a TypeScript person so you may have to massage some things to make the TypeScript compiler happy, but hopefully you can see the concept here):
const { WritableStream } = require("htmlparser2/lib/WritableStream");
const scriptPath = await new Promise(function(resolve, reject) {
    try {
        const writeStream = new WritableStream({
            onopentag: (name, attrib) => {
                if (name === "script" && attrib.src) {
                    console.log(`script : ${attrib.src}`);
                    resolve(attrib.src); // return the first script, effectively called for each script tag
                    // disconnect the readstream
                    indexStream.unpipe(writeStream);
                    // reset the internal parser so it clears any buffers it
                    // may still be processing
                    writeStream._parser.reset();
                }
            },
            onend() {
                resolve();
            }
        });
        const indexStream = got.stream("/index.html", {
            responseType: 'text',
            resolveBodyOnly: true
        });
        indexStream.pipe(writeStream); // and parse it
    } catch (e) {
        reject(e);
    }
});

Cancelling IDBOpenDBRequest?

I have the following snippet of code:
function open_db(dbname, dbversion, upgrade, onblocked) {
    if (upgrade === undefined) {
        upgrade = function basic_init(ev) {
            …
        };
    }
    if (onblocked === undefined) {
        onblocked = function onblocked(ev) {
            throw ev;
        };
    }
    let req = window.indexedDB.open(dbname, dbversion);
    return new Promise((resolve, reject) => {
        req.onsuccess = ev => resolve(ev.target.result);
        req.onerror = ev => reject(ev.target.error);
        req.onupgradeneeded = ev => {
            try {
                return upgrade(ev);
            } catch (error) {
                reject(error);
                ev.target.onsuccess = ev => ev.target.close(); // IS THIS LINE NECESSARY?
                throw error; // IS THIS LINE UNNECESSARY?
            }
        };
        req.onblocked = ev => {
            try {
                return onblocked(ev);
            } catch (error) {
                reject(error);
                ev.target.onsuccess = ev => ev.target.close(); // IS THIS LINE NECESSARY?
                throw error; // IS THIS LINE UNNECESSARY?
            }
        };
    });
}
If the .onblocked or .onupgradeneeded handlers throw a native error, will that cancel the open attempt? Or will the IDBOpenDBRequest object ignore such errors and steam on ahead obliviously until I manually close the db if/after it's opened?
In a nutshell: are the commented lines of code necessary? Are they sufficient to prevent a dangling open handle?
Is there a better way to cancel the request-to-open, rather than just adding .onsuccess = ev => … .close()?
You're asking the right question ("Is there a better way to cancel the request-to-open... ?") and the answer is: no, not as currently defined/implemented. All you can do is make the open a no-op by aborting the upgrade.
Throwing in a blocked handler doesn't have specified special behavior; doing anything here should be unnecessary, as it will be followed by an upgradeneeded eventually.
On upgradeneeded, closing the connection before the upgrade completes will terminate the request and abort the upgrade, so the version won't change. There are a handful of ways to do this (see the sketch below):
call close on the connection (db = e.target.result; db.close();)
Defined by: https://w3c.github.io/IndexedDB/#open-a-database - If connection was closed...
abort the transaction explicitly (tx = e.target.transaction; tx.abort();)
Defined by: https://w3c.github.io/IndexedDB/#open-a-database - If the upgrade transaction was aborted...
abort the transaction implicitly by throwing within the upgradeneeded event handler.
Defined by: https://w3c.github.io/IndexedDB/#run-an-upgrade-transaction - If didThrow is true...
Note that after seeing upgradeneeded, waiting until success (which your code does) means the transaction will have completed, and the upgrade will have happened.
So in your sample code, the throw statements are effectual (they will abort the upgrade), while the close calls are not. The success event should never fire, in that case, which makes adding handlers for success which close the connection irrelevant.
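For illustration, a minimal sketch (not from the original answer) of an onupgradeneeded handler showing where each of the three abort approaches would go; the surrounding open_db-style promise wrapper from the question is assumed:

req.onupgradeneeded = ev => {
    const db = ev.target.result;
    const tx = ev.target.transaction;

    // 1) close the connection before the upgrade completes
    // db.close();

    // 2) abort the upgrade transaction explicitly
    // tx.abort();

    // 3) abort implicitly by throwing from within the handler
    // throw new Error("abort the upgrade");
};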

AbortController.abort(reason), but the reason gets lost before it arrives to the fetch catch clause

I am implementing abortable fetch calls.
There are basically two reasons for aborting the fetch on my page:
the user decides he/she does not want to wait for the AJAX data anymore and clicks a button; in this case the UI shows a message "call /whatever interrupted"
the user has moved to another part of the page and the data being fetched are no longer needed; in this case I don't want the UI to show anything, as it'd just confuse the user
In order to discriminate the two cases I was planning to use the reason parameter of the AbortController.abort method, but the .catch clause in my fetch call always receives a DOMException('The user aborted a request', ABORT_ERROR).
I have tried to provide a different DOMException as reason for the abort in case 2, but the difference is lost.
Has anyone found how to send information to the fetch .catch clause with regards to the reason to abort?
In the example below, I demonstrate how to determine the reason a fetch request was aborted. I provide inline comments for explanation. Feel free to comment if anything is unclear.
Re-run the code snippet to see a (potentially different) random result
'use strict';

function delay (ms, value) {
    return new Promise(res => setTimeout(() => res(value), ms));
}

function getRandomInt (min = 0, max = 1) {
    return Math.floor(Math.random() * (max - min + 1)) + min;
}

// Forward the AbortSignal to fetch:
// https://docs.github.com/en/rest/repos/repos#list-public-repositories
function fetchPublicGHRepos (signal) {
    const headers = new Headers([['accept', 'application/vnd.github+json']]);
    return fetch('https://api.github.com/repositories', {headers, signal});
}

function example () {
    const ac = new AbortController();
    const {signal} = ac;

    const abortWithReason = (reason) => delay(getRandomInt(1, 5))
        .then(() => {
            console.log(`Aborting ${signal.aborted ? 'again ' : ''}(reason: ${reason})`);
            ac.abort(reason);
        });

    // Unless GitHub invests HEAVILY into our internet infrastructure,
    // one of these promises will resolve before the fetch request
    abortWithReason('Reason A');
    abortWithReason('Reason B');

    fetchPublicGHRepos(signal)
        .then(res => console.log(`Fetch succeeded with status: ${res.status}`))
        .catch(ex => {
            // This is how you can determine if the exception was due to abortion
            if (signal.aborted) {
                // This is set by the promise which resolved first
                // and caused the fetch to abort
                const {reason} = signal;
                // Use it to guide your logic...
                console.log(`Fetch aborted with reason: ${reason}`);
            }
            else console.log(`Fetch failed with exception: ${ex}`);
        });

    delay(10).then(() => console.log(`Signal reason: ${signal.reason}`));
}

example();

How to catch an error from an async callback function in an outer try/catch block

Ok,
So I am using the puppeteer framework and I have an async function that interact with a webpage.
This function clicks and selects elements on a webpage while waiting for the page's traffic to be idle.
This function works most of the time, but sometimes it stalls.
I want to be able to set a timeout so that if the function is taking longer than a certain amount of time, it throws an error and I can run it again.
So far I cannot seem to get this to work because I cannot get the callback function I pass to setTimeOut() to 'interact' with the outer function.
My code looks like this:
const scrap_webtite = async page => {
    /* scrap the site */
    try{ // catch all
        // set timeout
        let timed_out_ID = setTimeout(() => { throw "timeOut"; }, 1000);
        // run the async
        let el = await sometimes_stalls_function(page);
        // if the function finished correctly
        clearTimeout(timed_out_ID);
        // save el
        save_el(el);
    }catch(e){
        console.error("Something went wrong!", e);
        // this makes the function run again
        // here is where I want to ideally catch the timeout error
        return false
    }
}
I have also tried wrapping the setTimeout call in a Promise, as per this post, and then using the .then().catch() callbacks to try to catch the error, to no avail.
Apologies if this is a stupid question; thank you for your help.
The problem you're running into is essentially that the error thrown in setTimeout() is not related to your function flow, and thus can't be caught there. You can essentially think of the timer's callback function as a "detached" function: the variables from the parent scope will still be available, but you can't return a value to the parent directly etc.
To work around this problem you have a few options; Promise.race() is one possible solution. The idea is to first make an async version of a timeout:
const rejectAfter = (timeout) => {
    return new Promise((resolve, reject) => {
        setTimeout(() => reject(), timeout);
    });
};
Then extract your business logic out into a separate async function a-la:
const doTheThing = async () => {
    // TODO: Implement
};
And finally in your scraping function, use Promise.race() to use the result from whichever of the two finishes first:
const scrape = async (page) => {
    try {
        const el = await Promise.race([
            rejectAfter(1000),
            doTheThing()
        ]);
    } catch(error) {
        // TODO: Handle error
    }
}
Try turning everything in the try block into a promise:
const scrap_webtite = async page => {
    /* scrap the site */
    try{ // catch all
        return await new Promise(async (r, j) => {
            // set timeout
            let timed_out_ID = setTimeout(() => j("timeOut"), 1000);
            // run the async
            let el = await sometimes_stalls_function(page);
            // if the function finished correctly
            clearTimeout(timed_out_ID);
            // save el
            r(save_el(el));
        })
    }catch(e){
        console.error("Something went wrong!", e);
        // this makes the function run again
        // here is where I want to ideally catch the timeout error
        return false
    }
}

How to connect to mssql server synchronously in node.js

All of the examples for using the mssql client package/tedious driver are for async/callbacks/promises, but I'm only developing a microservice that will see limited use, and my understanding of asynchronous functions is still a bit fuzzy.
Here's what I have for trying to use async/await:
Report generation class:
const mssql = require('mssql');
const events = require('events');

class reporter {
    constructor(searcher, logger) {
        // Pass in search type and value, or log the error if none is defined
        this.lg = logger
        if (searcher.type && searcher.content) {
            this.lg.lg("reporter created", 3)
            this.srchType = searcher.type;
            this.srchContent = searcher.content;
        } else {
            this.lg.lg("!MISSING SEARCH PARAMETERS", 0);
            this.err = "!MISSING SEARCH PARAMETERS";
        }
    }
    proc() {
        //DB Connect async
        async () => {
            try {
                await mssql.connect('mssql://username:password@localhost/database')
                this.result = await mssql.query`select * from mytable where id = ${this.searcher}`
            } catch (err) {
                // ... error checks
            }
        }
        return this.result;
    }
}
Then called:
//Pass to reporter for resolution
var report1 = new reporter(searcher, logs);
report1.proc();
I'm sure this is probably a pretty bad way to accomplish this, so I'm also open to any input on good ways to accomplish the end goal, but I'd still like to know if it's possible to accomplish synchronously.
You can't do it synchronously. Figuring out this async stuff is definitely worth your time and effort.
async / await / promises let you more-or-less fake doing it synchronously
const report1 = new reporter(searcher, logs);
report1.proc()
    .then ( result => {
        /* in this function, "result" is what your async function returned */
        /* do res.send() here if you're in express */
    } )
    .catch ( error => {
        /* your lookup failed */
        /* inform the client of your web service about the failure
         * in an appropriate way. */
    } )
And, unwrap the async function in your proc function, like so:
async proc() {
    try {
        await mssql.connect('mssql://username:password@localhost/database')
        this.result = await mssql.query`select * from mytable where id = ${this.searcher}`
    } catch (err) {
        // ... error checks
    }
    return this.result;
}
await and .then are analogous.
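As a trivial illustration (not from the original answer), the two styles of consuming report1.proc() look like this:

// inside an async function: await the promise directly
const result = await report1.proc();

// or equivalently, attach a .then() callback
report1.proc().then(result => { /* use result */ });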
Kind of an updated answer that continues off of O. Jones' answer.
The current version of Node.js (v15+) has support for top-level await, meaning you can run it all sequentially.
import mssql from 'mssql';
await mssql.connect('mssql://username:password@localhost/database')
const result = await mssql.query`select * from mytable where id = ${this.searcher}`
But it should still be avoided, since you want to catch errors instead of letting them crash the process.
In current versions of Node.js, if an await/promise rejects and isn't caught with a .catch(), the unhandled rejection will terminate your application with the error.
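For example, a minimal sketch (my own, reusing the connection string and query from the question; someId is a hypothetical placeholder for the search value) of catching the error at the top level instead of letting the rejection crash the process:

import mssql from 'mssql';

const someId = 42; // hypothetical placeholder for the search value

try {
    // top-level await, wrapped so a rejection doesn't terminate the process
    await mssql.connect('mssql://username:password@localhost/database');
    const result = await mssql.query`select * from mytable where id = ${someId}`;
    console.log(result);
} catch (err) {
    // handle connection/query failures here instead of crashing
    console.error('Database error:', err);
}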
