Streaming with fetch API - javascript

I have created a .NET Core API that exposes a POST endpoint which streams its response over multiple chunks, where each chunk contains a JSON object. I have also created an Angular client app that calls that endpoint through the fetch API:
const response = await fetch(environment.apiUrl + `/Vehicle/ParseVehiclesData`, {
    method: 'post',
    body: file,
});
const reader = response.body.getReader();
while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    let res = JSON.parse(new TextDecoder("utf-8").decode(value));
    console.log(res);
}
My goal is to feed a progress bar with the progress information the server returns in each chunk. The problem is that for users with a weak connection, the speed at which they consume each chunk is lower than the speed at which the server produces them. Under that backpressure the browser merges chunks together, so my attempt to parse the decoded string fails because I end up trying to parse something like
{"name":"abc","progress":50}{"name":"ab1","progress":100}
Multiple JSON objects are concatenated with no delimiters, not even line breaks.
What I'm looking for is a solution that prevents the chunks from being merged under backpressure, or, in the worst case, a way to parse JSON objects with no delimiters and feed them to a custom stream or an observable. Any help on the subject would be appreciated.
These are some articles I have stumbled upon while searching:
InternetSpeed

You could define a pre-parser that splits the messages on the }{ separator (and handles the first/last entries):
let preParser = (a) => {
    const parts = a.split('}{');
    return parts.map((item, index) => {
        if (parts.length > 1) {
            if (index === 0) { return item + '}' }                      // first msg
            else if (index < parts.length - 1) { return '{' + item + '}' } // middle msgs
            else { return '{' + item }                                  // last msg
        } else {
            return a                                                    // no need to do anything if single msg
        }
    })
}

a = '{"name":"abc","progress":50}'
b = '{"name":"abc","progress":50}{"name":"ab1","progress":100}{"name":"ab2","progress":150}'

console.log('example 1')
preParser(a).forEach(v => console.log(JSON.parse(v)))

console.log('example 2')
preParser(b).forEach(v => console.log(JSON.parse(v)))
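To drive the progress bar, the pre-parser can be dropped into the reading loop from the question. This is only a sketch under the assumption that each chunk always contains whole JSON objects (no object is split across two chunks); updateProgressBar is a hypothetical callback standing in for whatever the UI actually uses:

const reader = response.body.getReader();
const decoder = new TextDecoder("utf-8");
while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    // A merged chunk may contain several concatenated objects; split them first.
    preParser(decoder.decode(value)).forEach(part => {
        const msg = JSON.parse(part);
        updateProgressBar(msg.name, msg.progress); // hypothetical UI callback
    });
}

If a single object can be split across two chunks, you would additionally need to buffer the trailing incomplete part and prepend it to the next chunk before splitting.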

Related

Get a particular URL in Node JS other ways

I am working with Reddit's REST API. I am trying to parse the JSON output to get the URL from the response. Every time I send the request I get a different output, and I am not sure how to handle that since it's a random response.
const https = require("https");

https
    .get("https://www.reddit.com/r/cute/random.json", resp => {
        let data = "";
        resp.on("data", chunk => {
            data += chunk;
        });
        const obj = JSON.parse(data);
        resp.on("end", () => {
            console.log(obj.url);
        });
    })
    .on("error", err => {
        console.log("Error: " + err.message);
    });
This is the code I have. I used Node's built-in https library and I don't think it worked. I have never used any Node libraries, so it would be helpful if you could suggest some too. Also let me know whether what I have done is right.
I understand that http is a core library of Node JS, but I strongly suggest you use something like node-fetch. Make sure you run the following command in your terminal (or cmd) in the directory where your package.json file exists:
$ npm install node-fetch
This will install the node-fetch library, which works similarly to the browser's fetch.
const fetch = require("node-fetch");
const main = async () => {
const json = await fetch("https://www.reddit.com/r/cute/random.json").then(
res => res.json()
);
console.log(
json
.map(entry => entry.data.children.map(child => child.data.url))
.flat()
.filter(Boolean)
);
};
main();
The URLs you are looking for I could find at data.children[0].data.url, so I mapped over that. I hope this helps.
I get a different output for the same code each time I run it, because the URL you have used is a random-article-fetching URL. From their wiki:
/r/random takes you to a random subreddit. You can find a link to /r/random in the header above. Reddit gold members have access to /r/myrandom, which is right next to the random button. Myrandom takes you to any of your subscribed subreddits randomly. /r/randnsfw takes you to a random NSFW (over 18) subreddit.
The output for me is like this:
[ 'https://i.redd.it/pjom447yp8271.jpg' ] // First run
[ 'https://i.redd.it/h9b00p6y4g271.jpg' ] // Second run
[ 'https://v.redd.it/lcejh8z6zp271' ] // Third run
Since it has only one URL, I changed the code to get the first one:
const fetch = require("node-fetch");
const main = async () => {
const json = await fetch("https://www.reddit.com/r/cute/random.json").then(
res => res.json()
);
console.log(
json
.map(entry => entry.data.children.map(child => child.data.url))
.flat()
.filter(Boolean)[0]
);
};
main();
Now it gives me:
'https://i.redd.it/pjom447yp8271.jpg' // First run
'https://i.redd.it/h9b00p6y4g271.jpg' // Second run
'https://v.redd.it/lcejh8z6zp271' // Third run
'https://i.redd.it/b46rf6zben171.jpg' // Fourth run
I hope this helps you. Feel free to ask if you need more help. Other alternatives include axios, but I am not sure whether it can be used on the backend.

How do I stream JSON objects from ExpressJS?

I'm trying to stream JSON objects from an ExpressJS / Node backend API to a frontend site.
I do not want to use Socket.IO for various reasons. As I understand it, the native streaming libraries should support streaming objects; it appears that it is just Express that is complicating this.
My frontend code seems straightforward. I use fetch to get my target URL, get a read stream from the response object, and set that read stream to objectMode: true.
Frontend Example:
async function () {
    let url = "myurl";
    let response = await fetch(url, {
        method: 'GET',
        mode: 'cors',
        credentials: 'include'
    });
    const reader = response.body.getReader({ objectMode: true });
    // Where things are a bit ambiguous
    let x = true;
    while (x) {
        const { done, value } = await reader.read()
        if (done) { break; }
        // do something with value ( I push it to an array )
    }
}
Backend Code Example (fails because I cannot change the stream to objectMode):
router.get('/', (request, response) => {
    response.writeHead(200, { 'Content-Type': 'application/json' });
    MongoDB.connection.db.collection('myCollection').find({}).forEach((i) => {
        response.write(i);
    }).then(() => {
        response.end()
    })
})
Now my problem is that there does not appear to be any way to change the ExpressJS write stream to objectMode: true. To my dismay, the ExpressJS documentation doesn't even acknowledge the existence of the write() function on the response object: https://expressjs.com/en/api.html#res
How do I change this over to objectMode: true ?
Conversely, I tried to work with the write stream as a string. The problem I run into is that when the send buffer fills up, it splits by characters, not by objects. That means at some point invalid JSON is passed to the requester.
A suggested solution that I run into often is that I could read all of the chunks on the client and assemble valid JSON. This defeats the purpose of streaming, so I'm trying to find a better way.
For what I believe is the same problem, I cannot figure out how to talk directly to the write stream object from the Express code, so I am unable to use the native writeStream property writable.length to manually check whether there is space for the entire JSON object as a string. This is preventing me from using stringified JSON with newline terminators.
https://nodejs.org/api/stream.html#stream_writable_writablelength
https://nodejs.org/api/stream.html#stream_writable_writableobjectmode
Could someone set me straight? I am working with 100k+ records in my Mongo database, and I really need partial page loading to work so that users can start picking through the data.
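For reference, a minimal sketch of the newline-delimited approach described above (stringified JSON with newline terminators, respecting the write() return value for backpressure instead of inspecting writable.length). This is only an illustration under the question's setup; the route path and collection name are taken from the example, and the client still has to reassemble complete lines before parsing:

router.get('/', async (request, response) => {
    response.writeHead(200, { 'Content-Type': 'application/x-ndjson' });
    const cursor = MongoDB.connection.db.collection('myCollection').find({});
    for await (const doc of cursor) {
        // write() returns false when the internal buffer is full; wait for 'drain'.
        const ok = response.write(JSON.stringify(doc) + '\n');
        if (!ok) {
            await new Promise(resolve => response.once('drain', resolve));
        }
    }
    response.end();
});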

Best way to structure a service in node.js/express?

I am starting to move the logic away from the routes in the express application, into a service provider. One of these routes deals with streams, not only that, it also requires some more logic to take place once the stream is finished. Here is an example of the express route.
router.get("/file-service/public/download/:id", async(req, res) => {
try {
const ID = req.params.id;
FileProvider.publicDownload(ID, (err, {file, stream}) => {
if (err) {
console.log(err.message, err.exception);
return res.status(err.code).send();
} else {
res.set('Content-Type', 'binary/octet-stream');
res.set('Content-Disposition', 'attachment; filename="' + file.filename + '"');
res.set('Content-Length', file.metadata.size);
stream.pipe(res).on("finish", () => {
FileProvider.removePublicOneTimeLink(file);
});
}
})
} catch (e) {
console.log(e);
res.status(500).send(e);
}
})
And here is one of the functions inside the service provider.
this.publicDownload = async (ID, cb) => {
    const bucket = new mongoose.mongo.GridFSBucket(conn.db, {
        chunkSizeBytes: 1024 * 255,
    })
    let file = await conn.db.collection("fs.files")
        .findOne({ "_id": ObjectID(ID) })
    if (!file || !file.metadata.link) {
        return cb({
            message: "File Not Public/Not Found",
            code: 401,
            exception: undefined
        })
    } else {
        const password = process.env.KEY;
        const IV = file.metadata.IV.buffer
        const readStream = bucket.openDownloadStream(ObjectID(ID))
        readStream.on("error", (e) => {
            console.log("File service public download stream error", e);
        })
        const CIPHER_KEY = crypto.createHash('sha256').update(password).digest()
        const decipher = crypto.createDecipheriv('aes256', CIPHER_KEY, IV);
        decipher.on("error", (e) => {
            console.log("File service public download decipher error", e);
        })
        cb(null, {
            file,
            stream: readStream.pipe(decipher)
        })
    }
}
Because it is not wise to pass res or req into the service provider (I'm guessing because of unit testing), I have to return the stream inside the callback. From there I pipe that stream into the response and also add an on-finish handler to remove the one-time download link for the file. Is there any way to move more of this logic into the service provider without passing res/req into it? Or am I going at this all wrong?
Is there any way to move more of this logic into the service provider without passing res/req into it?
As we've discussed in the comments, you have a download operation that is part business logic and part web logic. Because you're streaming the response with custom headers, it's not as simple as "business logic, get me the data and I'll manage the response completely on my own", like many classic database operations are.
If you are going to keep them completely separate while letting the download process encapsulate as much as it can, you would have to create a higher-bandwidth interface between your service provider and the Express code that knows about the res object than just the one callback you have now.
Right now, you only have one operation supported, and that's to pass back the piped stream. But the download code really wants to specify the content-type and size information (that's where it's known, inside the download code), and it wants to know when the write stream is done so it can do its cleanup logic. And something you don't show is proper error handling if there's an error while streaming the data to the client (with proper cleanup in that case too).
If you want to move more code into the downloader, you'd essentially have to make a little interface that allows the service code to drive more than one operation on the response without having an actual response object. That interface doesn't have to be a full response stream. It could just have methods on it for setting headers, starting the streaming, getting notified when the stream is done, and so on.
As I've said in the comments, you will have to decide whether that actually makes the code simpler or not. Design guidelines are not absolute. They are things to consider when making design choices. They shouldn't drive you in a direction that gives you code that is significantly more complicated than if you had made different design choices.
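A minimal sketch of what such an interface could look like; the names here (makeDownloadTarget and the shape of its methods) are hypothetical illustrations, not part of the original code:

// A thin abstraction the service provider can drive without knowing about Express.
function makeDownloadTarget(res) {
    return {
        setHeaders: ({ contentType, filename, size }) => {
            res.set('Content-Type', contentType);
            res.set('Content-Disposition', 'attachment; filename="' + filename + '"');
            res.set('Content-Length', size);
        },
        send: (stream) => new Promise((resolve, reject) => {
            stream.pipe(res).on("finish", resolve).on("error", reject);
        }),
        fail: (code) => res.status(code).send(),
    };
}

// The route shrinks to wiring; the provider owns headers, streaming, and cleanup.
router.get("/file-service/public/download/:id", (req, res) => {
    FileProvider.publicDownload(req.params.id, makeDownloadTarget(res));
});

The provider would then call setHeaders(...), await send(stream), and run removePublicOneTimeLink itself, which is the "higher bandwidth interface" described above.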

What is the best approach in handling multiple REST API calls for fast retrieving large number of data in JavaScript?

I have two API URLs to call. The first one is https://jsonplaceholder.typicode.com/todos; I call it to retrieve the ids. After retrieving them, I call the second URL, which is https://jsonplaceholder.typicode.com/todos/(id). I am using a promise-based approach, but my problem is:
How can I retrieve this large amount of data quickly?
Note: I am using only plain JavaScript and a CDN for axios.
export const getData = () => {
    const API = `https://jsonplaceholder.typicode.com/todos`;
    return axios.get(API, {
        headers: {
            "accept": "application/json;odata=verbose"
        }
    }).then(res => {
        const data = [];
        const requests = res.data.map(val => {
            const id = val.id;
            var obj = {};
            const url = `https://jsonplaceholder.typicode.com/todos/${id}`;
            return axios.get(url).then(res => {
                obj['Result'] = res;
                data.push(obj); // collect each result
            });
        });
        return Promise.all(requests).then(() => {
            return data;
        });
    });
}
This code works, but it is slow at getting the data, and I would like some suggestions on better approaches.
The fastest way would be to not perform all the AJAX calls from a web browser.
Web browsers cap the number of simultaneous requests to the same host at around 6–10 (from this post), so even if you can perform a request and get a response in 200 ms, you're still looking at a significant wait purely from client-side request queuing.
If instead you built a server-side solution to aggregate the data, you could query your custom endpoint to retrieve larger chunks of data at a time.
If that isn't an option for you, then either way, the browser request limit will probably be your bottleneck.
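As an illustration of that server-side aggregation idea, here is a minimal sketch of a custom Node/Express endpoint. The route name and the use of Express are assumptions for the example, not part of the question; the server fans out the per-id requests and returns everything in one response:

const express = require("express");
const axios = require("axios");

const app = express();

app.get("/todos-with-details", async (req, res) => {
    try {
        const { data: todos } = await axios.get("https://jsonplaceholder.typicode.com/todos");
        // The server is not subject to the browser's per-host connection cap,
        // so the detail requests can run in parallel.
        const details = await Promise.all(
            todos.map(todo =>
                axios.get(`https://jsonplaceholder.typicode.com/todos/${todo.id}`)
                     .then(r => r.data)
            )
        );
        res.json(details);
    } catch (err) {
        res.status(500).json({ error: err.message });
    }
});

app.listen(3000);

The client then makes a single fetch to /todos-with-details instead of hundreds of separate calls.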

How to pipe a data stream into a function that's assigned to a constant?

The example below from Google works but it uses pipe. For my situation I'm listening to a websocket that sends packets in 20ms increments and from what I've been able to find there is no way to pipe that data into a function.
The first argument that must be passed on initialization is a config object; after that, only data is accepted. So I set the function up as a variable, then pass the config. But I can't figure out how to pass the stream of data into it afterwards. How do I pass data into recognizeStream without using pipe? Or is there a way to use pipe with a websocket?
I can vouch for this setup working by reading and writing from temporary files at certain intervals, but this has the obvious disadvantages of 1) all of that overhead and 2) most importantly, not being a real-time stream.
There are two solutions that I think would work but have not been able to implement:
There is some way to setup a pipe from websocket (This is ideal)
Writing the data to a file while simultaneously reading it back with createReadStream using some implementation of fs (this seems like a minefield of problems)
tl;dr I need to send the stream of data from a websocket into a function assigned to a const as the data comes in.
Example setup from Google Docs
const Speech = require('@google-cloud/speech');

// Instantiates a client
const speech = Speech();

// The encoding of the audio file, e.g. 'LINEAR16'
const encoding = 'LINEAR16';

// The sample rate of the audio file, e.g. 16000
const sampleRate = 16000;

const request = {
    config: {
        encoding: encoding,
        sampleRate: sampleRate
    }
};

const recognizeStream = speech.createRecognizeStream(request)
    .on('error', console.error)
    .on('data', (data) => process.stdout.write(data.results));

// Start recording and send the microphone input to the Speech API
record.start({
    sampleRate: sampleRate,
    threshold: 0
}).pipe(recognizeStream);
Websocket setup
const WebSocketServer = require('websocket').server

wsServer.on('connect', (connection) => {
    connection.on('message', (message) => {
        if (message.type === 'utf8') {
            console.log(message.utf8Data)
        } else if (message.type === 'binary') {
            // send message.binaryData to recognizeStream
        }
    })
})
You should just be able to do:
recognizeStream.write(message.binaryData)
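In context, that means writing into the stream from the binary branch of the websocket handler and ending it when the connection closes; a minimal sketch under the question's setup (error handling and the rest of the server config omitted):

wsServer.on('connect', (connection) => {
    connection.on('message', (message) => {
        if (message.type === 'utf8') {
            console.log(message.utf8Data)
        } else if (message.type === 'binary') {
            // Push each 20 ms audio packet straight into the recognize stream.
            recognizeStream.write(message.binaryData)
        }
    })
    connection.on('close', () => {
        // Signal end-of-audio so the Speech API can finalize its results.
        recognizeStream.end()
    })
})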
