I'm trying to stream JSON objects from an ExpressJS / Node backend API to a frontend site.
I do not want to use Socket.IO for various reasons. As I understand it, Node's native streams support object mode; it appears that Express is what is complicating this.
My frontend code seems straightforward. I use fetch to get my target URL, get a read stream from the response object, and set that read stream to objectMode: true.
Frontend Example:
async function getData() {
  const url = "myurl";
  const response = await fetch(url, {
    method: 'GET',
    mode: 'cors',
    credentials: 'include'
  });
  const reader = response.body.getReader({ objectMode: true });
  // Where things are a bit ambiguous
  while (true) {
    const { done, value } = await reader.read();
    if (done) { break; }
    // do something with value (I push it to an array)
  }
}
Backend Node Example (fails because I cannot change the stream to objectMode):
router.get('/', (request, response) => {
  response.writeHead(200, { 'Content-Type': 'application/json' });
  MongoDB.connection.db.collection('myCollection').find({}).forEach((i) => {
    response.write(i); // fails: write() only accepts strings or Buffers, not objects
  }).then(() => {
    response.end();
  });
});
Now my problem is that there does not appear to be any way to change the Express write stream to objectMode: true. To my dismay, the Express documentation doesn't even acknowledge the existence of the write() function on the response object: https://expressjs.com/en/api.html#res
How do I change this over to objectMode: true?
Conversely, I tried to work with the write stream as a string. The problem I run into is that when the send buffer fills up, it splits on characters, not on object boundaries. This means that at some point invalid JSON is passed to the requester.
A suggested solution that I run into often is that I could read all of the chunks on the client and assemble valid JSON. This defeats the purpose of streaming, so I'm trying to find a better way.
For what I believe is the same reason, I cannot figure out how to talk directly to the underlying write stream from the Express code, so I am unable to use the native writable.length property to manually check whether there is space for an entire stringified JSON object. This is preventing me from using stringified JSON with newline terminators.
https://nodejs.org/api/stream.html#stream_writable_writablelength
https://nodejs.org/api/stream.html#stream_writable_writableobjectmode
Could someone set me straight? I am working with 100k+ records in my Mongo database, and I really need partial page loading to work so that users can start picking through the data.
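For reference, here is a minimal sketch of the newline-delimited idea mentioned above (not necessarily the only approach). It assumes the same MongoDB.connection handle as the backend example and a driver whose cursor supports async iteration. Express's response object is an http.ServerResponse underneath, so write() returns false when its internal buffer is full and you can wait for the 'drain' event instead of inspecting writable.length:
router.get('/', async (request, response) => {
  response.writeHead(200, { 'Content-Type': 'application/x-ndjson' });
  const cursor = MongoDB.connection.db.collection('myCollection').find({});
  for await (const doc of cursor) {
    // write() returns false once the internal buffer is full;
    // pause until the socket drains before sending the next record.
    if (!response.write(JSON.stringify(doc) + '\n')) {
      await new Promise((resolve) => response.once('drain', resolve));
    }
  }
  response.end();
});
On the client, the reader loop from the top of the question can then decode each chunk into a string buffer, split it on '\n', parse every complete line, and carry the trailing partial line over to the next read, so JSON.parse never sees a truncated object.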
Related
I have created a .NET Core API that exposes a POST endpoint which streams the response over multiple chunks, where each chunk contains a JSON object. I have also created an Angular client app that queries that endpoint through the fetch API:
const response = await fetch(environment.apiUrl + `/Vehicle/ParseVehiclesData`, {
  method: 'post',
  body: file,
});

const reader = response.body.getReader();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  let res = JSON.parse(new TextDecoder("utf-8").decode(value));
  console.log(res);
}
My goal here is to feed a progress bar with the pieces of progress information the server returns in each chunk. The problem is that for users with a weak connection, the speed at which they consume each chunk is lower than the speed at which the server produces them. When there is backpressure, the browser merges chunks together, so my attempt to parse the decoded string fails because I end up trying to parse something like
{"name":"abc","progress":50}{"name":"ab1","progress":100}
Multiple JSON objects get attached together with no delimiters, not even line breaks.
What I'm looking for is a solution that prevents the chunks from merging when there is backpressure, or, in the worst case, a way to parse JSON objects with no delimiters and feed them into a custom stream or an observable. Any help on the subject would be appreciated.
You could define a pre-parser to split the messages on the }{ separator (and handle the first/last entries):
let preParser = (a) => {
  const parts = a.split('}{');
  return parts.map((item, index) => {
    if (parts.length > 1) {
      if (index === 0) { return item + '}'; }                         // first msg
      else if (index < parts.length - 1) { return '{' + item + '}'; } // middle msgs
      else { return '{' + item; }                                     // last msg
    } else {
      return a; // no need to do anything if single msg
    }
  });
};
a='{"name":"abc","progress":50}'
b='{"name":"abc","progress":50}{"name":"ab1","progress":100}{"name":"ab2","progress":150}'
console.log('example 1')
preParser(a).forEach(v => console.log(JSON.parse(v)))
console.log('example 2')
preParser(b).forEach(v => console.log(JSON.parse(v)))
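To connect this to the fetch loop from the question, a rough sketch (assuming, as described above, that each received chunk holds one or more complete objects, and that this runs inside an async function) could look like:
const reader = response.body.getReader();
const decoder = new TextDecoder('utf-8');
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  // split possibly-merged chunks into individual JSON strings, then parse each one
  preParser(decoder.decode(value)).forEach(msg => {
    const item = JSON.parse(msg);
    // e.g. update the progress bar from item.progress here
  });
}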
I posted a question yesterday (linked here) where I had been trying to send a PDF to a database and then retrieve it at a later date. Since then I have been advised that it is best (in my case, as I cannot use cloud computing services) to upload the PDF files to local storage and save the URL of the file to the database instead. I have now begun implementing this, but I have come across some trouble.
I am currently using FileReader() as documented below to process the input file and send it to the server:
var input_file = "";
let reader = new FileReader();
reader.readAsText(document.getElementById("input_attachment").files[0]);
reader.onloadend = function () {
  input_file = "&file=" + reader.result;
  const body = /*all the rest of my data*/ + input_file;
  const method = {
    method: "POST",
    body: body,
    headers: {
      "Content-type": "application/x-www-form-urlencoded"
    }
  };
};
After this block of code I do the stock-standard fetch(), and a route on my server receives it. Almost all of the data comes in 100% as expected, but the file comes in cut off somewhere around 1300 characters in (making it quite an incomplete PDF). What does come in appears to match the first 1300 or so characters of the original PDF I uploaded.
I have seen suggestions that you are meant to use the "multipart/form-data" content type to upload files, but when I do this I seem to receive only the first 700 or so characters of my PDF. I have tried using the Multer middleware to handle the "multipart/form-data", but it just doesn't seem to upload anything (though I can't guarantee that I am using it correctly).
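For comparison, a typical Multer setup for a single file field looks roughly like the sketch below (the field name fileToUpload and the uploads/ directory are illustrative assumptions, not taken from this question); the answer further down ends up using Formidable instead, so treat this only as a point of reference:
const multer = require('multer');
const upload = multer({ dest: 'uploads/' }); // uploaded files are written to disk here

app.post('/upload', upload.single('fileToUpload'), (req, res) => {
  // req.file holds the stored file's metadata (path, size, original name);
  // req.body holds the remaining non-file fields
  res.json({ path: req.file.path });
});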
I also initially had trouble with a "payload too large" error from fetch, but I have resolved this for now with the following:
app.use(bodyParser.urlencoded({ limit: "50mb", extended: false, parameterLimit: 50000 }));
Though I have suspicions that this may not be correctly implemented as I have seen some discussion that the urlencoded limit is set prior to the file loading, and cannot be changed in the middle of the program.
Any and all help is greatly appreciated, and I will likely use any information here to construct an answer on my original question from yesterday so that anybody else facing these sorts of issues has a resource to go to.
I personally found the solution to this problem as follows. On the client-side of my application this code is an example of what was implemented.
const formData = new FormData();
formData.append("username", "John Smith");
formData.append("fileToUpload", document.getElementById("input_attachment").files[0]);

const method = {
  method: "POST",
  body: formData
};

fetch(url, method)
  .then(res => res.json())
  .then(res => alert("File uploaded!"))
  .catch(err => alert(err.message));
As can be noted, I have changed from the "application/x-www-form-urlencoded" encoding to "multipart/form-data" to upload files. Node.js and Express, however, do not natively support this encoding type. I chose to use the library Formidable (I found it to be the easiest to use without too much overhead), which can be investigated here. Below is an example of my server-side implementation of this middleware (Formidable).
const express = require('express');
const app = express();
const formidable = require('formidable');

app.post('/upload', (req, res) => {
  const form = formidable({ uploadDir: `${__dirname}/file/`, keepExtensions: true });
  form.parse(req, (err, fields, files) => {
    if (err) {
      console.log(err.stack);
    } else {
      console.log(fields.username);
    }
  });
});
The file(s) are automatically uploaded to the directory specified in uploadDir, and keepExtensions ensures that the file extension is saved as well. The non-file inputs are accessible through the fields object, as seen in the fields.username example above.
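As a rough sketch of how the stored file's location can then be read and persisted (the exact property name depends on the Formidable version: v1 exposes files.fileToUpload.path, while newer versions expose files.fileToUpload.filepath and may wrap entries in arrays):
form.parse(req, (err, fields, files) => {
  if (err) return res.status(500).json({ error: err.message });
  const uploaded = files.fileToUpload;
  const storedPath = uploaded.filepath || uploaded.path; // version-dependent
  // save storedPath (or a URL derived from it) to the database alongside fields.username
  res.json({ username: fields.username, file: storedPath });
});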
From what I have found, this is the easiest way to set up a simple file upload system.
I am starting to move the logic out of the routes in my Express application and into a service provider. One of these routes deals with streams; not only that, it also requires some extra logic to run once the stream is finished. Here is an example of the Express route.
router.get("/file-service/public/download/:id", async(req, res) => {
try {
const ID = req.params.id;
FileProvider.publicDownload(ID, (err, {file, stream}) => {
if (err) {
console.log(err.message, err.exception);
return res.status(err.code).send();
} else {
res.set('Content-Type', 'binary/octet-stream');
res.set('Content-Disposition', 'attachment; filename="' + file.filename + '"');
res.set('Content-Length', file.metadata.size);
stream.pipe(res).on("finish", () => {
FileProvider.removePublicOneTimeLink(file);
});
}
})
} catch (e) {
console.log(e);
res.status(500).send(e);
}
})
And here is one of the functions inside the service provider.
this.publicDownload = async (ID, cb) => {
  const bucket = new mongoose.mongo.GridFSBucket(conn.db, {
    chunkSizeBytes: 1024 * 255,
  });

  let file = await conn.db.collection("fs.files")
    .findOne({ "_id": ObjectID(ID) });

  if (!file || !file.metadata.link) {
    return cb({
      message: "File Not Public/Not Found",
      code: 401,
      exception: undefined
    });
  } else {
    const password = process.env.KEY;
    const IV = file.metadata.IV.buffer;
    const readStream = bucket.openDownloadStream(ObjectID(ID));
    readStream.on("error", (e) => {
      console.log("File service public download stream error", e);
    });

    const CIPHER_KEY = crypto.createHash('sha256').update(password).digest();
    const decipher = crypto.createDecipheriv('aes256', CIPHER_KEY, IV);
    decipher.on("error", (e) => {
      console.log("File service public download decipher error", e);
    });

    cb(null, {
      file,
      stream: readStream.pipe(decipher)
    });
  }
};
Because it is not wise to pass res or req into the service provider (I'm guessing because of unit testing), I have to return the stream inside the callback; from there I pipe that stream into the response and also add an on-finish handler to remove a one-time download link for the file. Is there any way to move more of this logic into the service provider without passing res/req into it? Or am I going about this all wrong?
Is there any way to move more of this logic into the service provider without passing res/req into it?
As we've discussed in the comments, you have a download operation that is part business logic and part web logic. Because you're streaming the response with custom headers, it's not as simple as "business logic, get me the data and I'll manage the response completely on my own," as many classic database operations are.
If you are going to keep them completely separate while letting the download process encapsulate as much as it can, you would have to create a higher-bandwidth interface between your service provider and the Express code that knows about the res object than just the one callback you have now.
Right now, you only have one operation supported and that's to pass the piped stream. But, the download code really wants to specify the content-type and size information (that's where it's known inside the download code) and it wants to know when the write stream is done so it can do its cleanup logic. And, something you don't show is proper error handling if there's an error while streaming the data to the client (with proper cleanup in that case too).
If you want to move more code into the downloader, you'd have to essentially make a little interface that allows the service code to drive more than one operation on the response, but without having an actual response object. That interface doesn't have to be a full response stream. It could just have methods on it for getting notified when the stream is done, starting the streaming, setting headers, etc...
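As a rough sketch (the names here are illustrative, not from the original code), that interface could be a small "sink" object the route hands to the provider, so the provider can set headers, stream, and be notified of completion without ever touching res:
// Route: builds a thin adapter around res and passes it to the provider.
router.get("/file-service/public/download/:id", (req, res) => {
  FileProvider.publicDownload(req.params.id, {
    setHeaders({ filename, size }) {
      res.set('Content-Type', 'binary/octet-stream');
      res.set('Content-Disposition', 'attachment; filename="' + filename + '"');
      res.set('Content-Length', size);
    },
    stream(readable) {
      // resolves when the response has been fully written, rejects on stream error
      return new Promise((resolve, reject) => {
        readable.pipe(res).on('finish', resolve).on('error', reject);
      });
    },
    fail(code) {
      res.status(code).send();
    }
  });
});
The provider would then call setHeaders(...), await stream(decryptedStream), and run removePublicOneTimeLink itself, leaving the route as the only piece of code that knows about Express.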
As I've said in the comments, you will have to decide whether that actually makes the code simpler or not. Design guidelines are not absolute. They are things to consider when making design choices. They shouldn't drive you in a direction that gives you code that is significantly more complicated than if you had made different design choices.
I've created an API-server-ish environment on Cloudflare using a Cloudflare Worker, so there is no Node server running (a Cloudflare Worker is essentially a serverless event handler). It does provide the configuration to handle subdomain calls much like an API does. I've used a package called cf-worker-router to do so.
My cloud service looks like this:
import { ApiError, ApiRedirect, DomainRouter, FetchRouter } from 'cf-worker-router';
const router = new FetchRouter();
// add the cloudflare event listener
addEventListener('fetch', (event) => {
  router.onFetch(event);
});

router.route('/users', 'POST', async (event) => {
  // automatically converts anything not of Response type to ApiResponse
  return await event.request.text();
});
What I did was create a POST request to the URL and supply some body with the request. I was able to get the request text successfully, but now I can't figure out how to parse the text I received.
When sending the request as multipart/form-data, the received body text is as follows:
"----------------------------093590450792211419875705\r\nContent-Disposition: form-data; name=\"name\"\r\n\r\nJon Doe\r\n----------------------------093590450792211419875705--\r\n"
I tried sending application/x-www-form-urlencoded and got the body text as:
"name=Jon%20Doe"
And similarly for an application/json request:
"{\n\t\"name\": \"Jon Doe\"\n}"
Since Cloudflare is not running a Node.js server, body-parser can't be applied here. This service is pretty much an open API, so it needs to handle all sorts of request content types. Is there any way to identify and decode the stringified contents of any of these content types into a valid object in JavaScript?
To handle form data uploads, you can use the request.formData() method which will return a promise of a FormData object.
For example:
addEventListener('fetch', event => {
  event.respondWith(handleRequest(event.request))
})

async function handleRequest(request) {
  const formData = await request.formData();
  const name = formData.get('name');
  return new Response(`Hello ${name}`);
}
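If the endpoint really has to accept arbitrary content types, one possible sketch is to branch on the Content-Type header and use the standard Request body methods for each case (a simplification; real requests may need more careful matching):
async function parseBody(request) {
  const type = request.headers.get('Content-Type') || '';
  if (type.includes('application/json')) {
    return await request.json();
  }
  if (type.includes('application/x-www-form-urlencoded')) {
    return Object.fromEntries(new URLSearchParams(await request.text()));
  }
  if (type.includes('multipart/form-data')) {
    return Object.fromEntries((await request.formData()).entries());
  }
  return await request.text(); // fall back to the raw body
}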
The example below from Google works, but it uses pipe. In my situation I'm listening to a websocket that sends packets in 20ms increments, and from what I've been able to find there is no way to pipe that data into a function.
The first argument that must be passed on initialization is a config object; after that, only data is accepted. So I set the function up as a variable, then pass the config, but I can't figure out how to pass the stream of data into it afterwards. How do I pass data into recognizeStream without using pipe? Or is there a way to use pipe with a websocket?
I can vouch for this setup working by reading and writing from temporary files at certain intervals but this has the obvious disadvantages of 1) all of that overhead and 2) most importantly, not being a real-time stream.
There are two solutions that I think would work but have not been able to implement:
Set up a pipe from the websocket (this is ideal)
Simultaneously write the data to a file while reading it back with fs.createReadStream (this seems like a minefield of problems)
tl;dr I need to send the stream of data from a websocket into a function assigned to a const as the data comes in.
Example setup from Google's docs
const Speech = require('@google-cloud/speech');
// Instantiates a client
const speech = Speech();
// The encoding of the audio file, e.g. 'LINEAR16'
const encoding = 'LINEAR16';
// The sample rate of the audio file, e.g. 16000
const sampleRate = 16000;
const request = {
  config: {
    encoding: encoding,
    sampleRate: sampleRate
  }
};

const recognizeStream = speech.createRecognizeStream(request)
  .on('error', console.error)
  .on('data', (data) => process.stdout.write(data.results));

// Start recording and send the microphone input to the Speech API
record.start({
  sampleRate: sampleRate,
  threshold: 0
}).pipe(recognizeStream);
Websocket setup
const WebSocketServer = require('websocket').server
wsServer.on('connect', (connection) => {
connection.on('message', (message) => {
if (message.type === 'utf8') {
console.log(message.utf8Data)
} else if (message.type === 'binary') {
// send message.binaryData to recognizeStream
}
})
})
You should just be able to do:
recognizeStream.write(message.binaryData)
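Wired into the websocket handler from the question, a minimal sketch (assuming recognizeStream is in scope) would be:
wsServer.on('connect', (connection) => {
  connection.on('message', (message) => {
    if (message.type === 'binary') {
      // push each 20ms audio frame into the recognize stream as it arrives
      recognizeStream.write(message.binaryData);
    }
  });
  connection.on('close', () => {
    // signal end-of-audio so the API can finalize its results
    recognizeStream.end();
  });
});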