Parse streamed chunk data into JSON - javascript

Hi, I'm trying to display data incrementally since I'm receiving it in chunks.
For example, let's assume the full data looks like this:
data: {
user: [
{
name: 'a',
bankAccounts: ['123', '234', '567'],
address: ['some address', 'some other address', 'some more addres']
},
{
name: 'b',
bankAccounts: ['1233', '2334', '5637'],
address: ['some address1', 'some other address1', 'some more addres1']
},
{
name: 'c',
bankAccounts: ['123355', '233455', '563700'],
address: ['some address12', 'some other address12', 'some more addres12']
},
]
}
but the chunks I'm receiving look something like this:
1st chunk: "data: user: [ {name: a"
2nd chunk: "bankAccounts: ['123', '234', '567'],"
3rd chunk: "address: ['some address', 'some other address', 'some more addres']"
and so on.
Each chunk on its own is incomplete, so it can't be converted into JSON.
How can I stream this data into the UI? Any ideas?
My code for fetching the streamed data:
fetch('some url which stream data')
// Retrieve its body as ReadableStream
.then(response => {
const reader = response.body.getReader();
let decoder = new TextDecoder();
return new ReadableStream({
start(controller) {
return pump();
function pump() {
return reader.read().then(({ done, value }) => {
// When no more data needs to be consumed, close the stream
let newData = decoder.decode(value, {stream: !done});
console.log(newData);
if (done) {
controller.close();
return;
}
// Enqueue the next data chunk into our target stream
controller.enqueue(value);
return pump();
});
}
}
})
})
.then(stream => new Response(stream))
.then(response => {
console.log('response', response)
})

I know that generators are not very commonly used, but I feel like they would be perfect for streaming the data in this task:
async function* streamAsyncIterator(stream) {
const reader = stream.getReader();
const decoder = new TextDecoder();
while (true) {
const {done,value} = await reader.read();
if (done) break;
yield decoder.decode(value, { stream: !done });
}
reader.releaseLock();
}
fetch('https://httpbin.org/stream/1')
.then(async response => {
let str="";
for await (const value of streamAsyncIterator(response.body))
str+=value;
return JSON.parse(str);
})
.then(response => {
console.log('response', response)
})
However, it seems what you want is to parse partially complete JSON, which can be achieved in a variety of ways, for instance with the npm library partial-json-parser:
import partialParse from 'partial-json-parser';
fetch('https://httpbin.org/stream/1')
.then(async response => {
let str="";
for await (const value of streamAsyncIterator(response.body)){
str+=value;
functionUpdatingYourUi(partialParse(str));
}
return JSON.parse(str);
})
.then(response => {
console.log('response', response)
})

You can pass a string (starting with an empty string) into your pump function and keep appending to it while chunks keep arriving. At the end, when terminating the recursion, return the parsed data.
fetch('some url which stream data')
// Retrieve its body as ReadableStream
.then(response => {
const reader = response.body.getReader();
let decoder = new TextDecoder();
return new ReadableStream({
start(controller) {
return pump('');
function pump(str) {
return reader.read().then(({ done, value }) => {
// When no more data needs to be consumed, close the stream
str += decoder.decode(value, { stream: !done });
console.log(str);
if (done) {
controller.close();
return JSON.parse(str);
}
// Enqueue the next data chunk into our target stream
controller.enqueue(value);
return pump(str);
});
}
}
})
})
.then(stream => new Response(stream))
.then(response => {
console.log('response', response)
})

See this thread for a more complete discussion and more complete examples from @Damian Nadales.
If you are expecting your chunks to be complete JSON, which is not at all guaranteed, you may decode your chunked value (of type Uint8Array) into UTF-8 using TextDecoder.decode, then parse the JSON using JSON.parse. E.g.,
var num = JSON.parse(
new TextDecoder("utf-8").decode(result.value)
);
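Put together, a minimal sketch of that approach, assuming every chunk really is a self-contained JSON document and reusing the placeholder URL from the question:
const decoder = new TextDecoder('utf-8');
fetch('some url which stream data')
  .then(async response => {
    const reader = response.body.getReader();
    while (true) {
      const { done, value } = await reader.read();
      if (done) break;
      // Only safe if the server flushes exactly one complete JSON document per chunk
      const obj = JSON.parse(decoder.decode(value));
      console.log('parsed chunk', obj);
    }
  });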


Upload byte array from axios to Node server

Background
The JavaScript library for Microsoft Office add-ins allows you to get the raw content of a DOCX file through the getFileAsync() API, which returns a slice of up to 4MB in one go. You keep calling the function using a sliding-window approach until you have read the entire content. I need to upload these slices to the server and then join them back together to recreate the original DOCX file.
My attempt
I'm using axios on the client side and the busboy-based express-chunked-file-upload middleware on my node server. As I call getFileAsync recursively, I get a raw array of bytes that I then convert to a Blob and append to FormData before posting it to the node server. The whole flow works and I get the slice on the server. However, the chunk that gets written to disk on the server is much larger than the blob I uploaded, normally on the order of 3 times larger, so it is obviously not exactly what I sent.
My suspicion is that this may have to do with stream encoding, but the node middleware does not expose any options to set encoding.
Here is the current state of code:
Client-side
public sendActiveDocument(uploadAs: string, sliceSize: number): Promise<boolean> {
return new Promise<boolean>((resolve) => {
Office.context.document.getFileAsync(Office.FileType.Compressed,
{ sliceSize: sliceSize },
async (result) => {
if (result.status == Office.AsyncResultStatus.Succeeded) {
// Get the File object from the result.
const myFile = result.value;
const state = {
file: myFile,
filename: uploadAs,
counter: 0,
sliceCount: myFile.sliceCount,
chunkSize: sliceSize
} as getFileState;
console.log("Getting file of " + myFile.size + " bytes");
const hash = makeId(12)
this.getSlice(state, hash).then(resolve(true))
} else {
resolve(false)
}
})
})
}
private async getSlice(state: getFileState, fileHash: string): Promise<boolean> {
const result = await this.getSliceAsyncPromise(state.file, state.counter)
if (result.status == Office.AsyncResultStatus.Succeeded) {
const data = result.value.data;
if (data) {
const formData = new FormData();
formData.append("file", new Blob([data]), state.filename);
const boundary = makeId(12);
const start = state.counter * state.chunkSize
const end = (state.counter + 1) * state.chunkSize
const total = state.file.size
return await Axios.post('/upload', formData, {
headers: {
"Content-Type": `multipart/form-data; boundary=${boundary}`,
"file-chunk-id": fileHash,
"file-chunk-size": state.chunkSize,
"Content-Range": 'bytes ' + start + '-' + end + '/' + total,
},
}).then(async res => {
if (res.status === 200) {
state.counter++;
if (state.counter < state.sliceCount) {
return await this.getSlice(state, fileHash);
}
else {
this.closeFile(state);
return true
}
}
else {
return false
}
}).catch(err => {
console.log(err)
this.closeFile(state)
return false
})
} else {
return false
}
}
else {
console.log(result.status);
return false
}
}
private getSliceAsyncPromise(file: Office.File, sliceNumber: number): Promise<Office.AsyncResult<Office.Slice>> {
return new Promise(function (resolve) {
file.getSliceAsync(sliceNumber, result => resolve(result))
})
}
Server-side
This code is entirely from the npm package (link above), so I'm not supposed to change anything here, but it is included for reference:
makeMiddleware = () => {
return (req, res, next) => {
const busboy = new Busboy({ headers: req.headers });
busboy.on('file', (fieldName, file, filename, _0, _1) => {
if (this.fileField !== fieldName) { // Current field is not handled.
return next();
}
const chunkSize = req.headers[this.chunkSizeHeader] || 500000; // Default: 500Kb.
const chunkId = req.headers[this.chunkIdHeader] || 'unique-file-id'; // If not specified, will reuse same chunk id.
// NOTE: Using the same chunk id for multiple file uploads in parallel will corrupt the result.
const contentRangeHeader = req.headers['content-range'];
let contentRange;
const errorMessage = util.format(
'Invalid Content-Range header: %s', contentRangeHeader
);
try {
contentRange = parse(contentRangeHeader);
} catch (err) {
return next(new Error(errorMessage));
}
if (!contentRange) {
return next(new Error(errorMessage));
}
const part = contentRange.start / chunkSize;
const partFilename = util.format('%i.part', part);
const tmpDir = util.format('/tmp/%s', chunkId);
this._makeSureDirExists(tmpDir);
const partPath = path.join(tmpDir, partFilename);
const writableStream = fs.createWriteStream(partPath);
file.pipe(writableStream);
file.on('end', () => {
req.filePart = part;
if (this._isLastPart(contentRange)) {
req.isLastPart = true;
this._buildOriginalFile(chunkId, chunkSize, contentRange, filename).then(() => {
next();
}).catch(_ => {
const errorMessage = 'Failed merging parts.';
next(new Error(errorMessage));
});
} else {
req.isLastPart = false;
next();
}
});
});
req.pipe(busboy);
};
}
Update
So it looks like I have found the problem, at least: busboy appears to be writing my array of bytes as text in the output file. I get 80,75,3,4,20,0,6,0,8,0,0,0,33,0,44,25 (as text) when I upload the array of bytes [80,75,3,4,20,0,6,0,8,0,0,0,33,0,44,25]. Now I need to figure out how to force it to write it as a binary stream.
Figured it out. Just in case it helps anyone: there was no problem with busboy, office.js or axios. I just had to convert the incoming chunk of data to a Uint8Array before creating a Blob from it. So instead of:
formData.append("file", new Blob([data]), state.filename);
do this:
const blob = new Blob([ new Uint8Array(data) ])
formData.append("file", blob, state.filename);
And it worked like a charm.
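To see why the Uint8Array wrapper matters: a plain number array passed to the Blob constructor is not treated as bytes, it is coerced to a string (the elements joined with commas), while a typed array is taken as raw binary data. A quick illustrative check:
const bytes = [80, 75, 3, 4];
// The array is stringified to "80,75,3,4" -> a 9-byte text blob
console.log(new Blob([bytes]).size); // 9
// The typed array keeps the raw bytes -> a 4-byte binary blob
console.log(new Blob([new Uint8Array(bytes)]).size); // 4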

Text -> PNG -> ReadStream, all done on the front-end?

I'm not sure if this is even possible, but here's what I'm trying to do:
Let the user enter some text
Generate a PNG from that text
Upload it to Pinata, which requires it to be in ReadStream format
Do all of this on the front-end
I've managed to accomplish (1) and (2) using html2canvas.
The tricky part is (3). It has to be in ReadStream format because that's what Pinata's SDK expects:
const fs = require('fs');
const readableStreamForFile = fs.createReadStream('./yourfile.png');
const options = {
pinataMetadata: {
name: MyCustomName,
keyvalues: {
customKey: 'customValue',
customKey2: 'customValue2'
}
},
pinataOptions: {
cidVersion: 0
}
};
pinata.pinFileToIPFS(readableStreamForFile, options).then((result) => {
//handle results here
console.log(result);
}).catch((err) => {
//handle error here
console.log(err);
});
I realize that this would be no problem to do on the backend with node, but I'd like to do it on the front-end. Is that at all possible? Or am I crazy?
I'm specifically using Vue if that matters.
For anyone interested, the solution ended up being fetch + blob:
const generateImg = async () => {
const canvas = await html2canvas(document.getElementById('hello'));
const img = canvas.toDataURL('image/png');
const res = await fetch(img);
return res.blob();
};
This blob can then be passed into a more manual version of their SDK:
const uploadImg = (blob: Blob) => {
const url = `https://api.pinata.cloud/pinning/pinFileToIPFS`;
const data = new FormData();
data.append('file', blob);
const metadata = JSON.stringify({
name: 'testname',
});
data.append('pinataMetadata', metadata);
const pinataOptions = JSON.stringify({
cidVersion: 0,
});
data.append('pinataOptions', pinataOptions);
return axios
.post(url, data, {
maxBodyLength: 'Infinity' as any, // this is needed to prevent axios from erroring out with large files
headers: {
// @ts-ignore
'Content-Type': `multipart/form-data; boundary=${data._boundary}`,
pinata_api_key: apiKey,
pinata_secret_api_key: apiSecret,
},
})
.then((response) => {
console.log(response);
})
.catch((error) => {
console.log(error);
});
};
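The two helpers can then be chained, for example from a click handler. A minimal sketch, assuming apiKey, apiSecret and the #hello element exist as in the snippets above; onPinClick is just a hypothetical name:
const onPinClick = async () => {
  const blob = await generateImg(); // html2canvas -> data URL -> Blob
  await uploadImg(blob);            // POST the Blob to Pinata's pinFileToIPFS endpoint
};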

Get list of objects from s3 bucket (min.io or amazon) using promise

I am trying to get a list of object names from an s3 bucket using the min.io JavaScript API (https://docs.min.io/docs/javascript-client-api-reference#listObjectsV2). The API returns a stream. However, I always get an empty list.
An example of the dataStream output is:
{
name: 'sample-mp4-file-1.mp4',
lastModified: 2020-10-14T02:35:38.308Z,
etag: '5021b3b7c402468d5b018a8b4a2b448a',
size: 10546620
}
{
name: 'sample-mp4-file-2.mp4',
lastModified: 2020-10-14T15:54:44.672Z,
etag: '5021b3b7c402468d5b018a8b4a2b448a',
size: 10546620
}
My function
public async listFiles(
bucketName: string,
prefix?: string
): Promise<string[]> {
const objectsList = [];
await minioClient.listObjectsV2(bucketName, "", true, "", function(
err,
dataStream
) {
if (err) {
console.log("Error listFiles: ", err);
return;
}
console.log("Succesfully get data");
dataStream.on("data", function(obj) {
objectsList.push(obj.name);
});
dataStream.on("error", function(e) {
console.log(e);
});
dataStream.on("end", function(e) {
console.log("Total number of objects: ", objectsList.length);
});
});
return objectsList;
}
The expected output is a list of object names, [sample-mp4-file-1.mp4, sample-mp4-file-2.mp4].
According to the documentation, listObjectsV2() returns a stream, not a promise. Therefore, await returns immediately, before objectsList contains anything.
The API you're using has to support Promises if you want to await them.
You could work around this by doing something like this:
const objectsList = await new Promise((resolve, reject) => {
const objectsListTemp = [];
const stream = minioClient.listObjectsV2(bucketName, '', true, '');
stream.on('data', obj => objectsListTemp.push(obj.name));
stream.on('error', reject);
stream.on('end', () => {
resolve(objectsListTemp);
});
});
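Wrapped back into the original method's shape, that could look roughly like this (a sketch only; minioClient, bucketName and prefix are the same as in the question):
function listFiles(bucketName: string, prefix?: string): Promise<string[]> {
  return new Promise<string[]>((resolve, reject) => {
    const objectsList: string[] = [];
    const stream = minioClient.listObjectsV2(bucketName, prefix || '', true, '');
    stream.on('data', (obj: any) => objectsList.push(obj.name));
    stream.on('error', reject);
    stream.on('end', () => resolve(objectsList));
  });
}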

DynamoDB update does not console.log any output

I have the following code. It is supposed to receive an SQS message, read the body, then update a DynamoDB record with the information contained in that body. The update is not working, which is one issue, but even stranger, I'm not getting any output from the DynamoDB update. The last line of output is the console.log that prints the SQS message, then the function ends.
How is this possible? Shouldn't DynamoDB return some kind of output?
console.log('Loading function');
const util = require('util')
const AWS = require('aws-sdk');
var documentClient = new AWS.DynamoDB.DocumentClient();
exports.handler = async(event) => {
//console.log('Received event:', JSON.stringify(event, null, 2));
for (const { messageId, body } of event.Records) {
//const { body } = event.Records[0];
//console.log(body)
console.log('SQS message %s: %j', messageId, body);
const JSONBody = JSON.parse(body)
//const message = JSON.parse(test["Message"]);
const id = JSONBody.id;
const city = JSONBody.City;
const address = JSONBody.Address;
const params = {
TableName: 'myTable',
Key: {
ID: ':id',
},
UpdateExpression: 'set address = :address',
ExpressionAttributeValues: {
':id': id,
':address': address,
':sortKey': "null"
}
//ReturnValues: "UPDATED_NEW"
};
documentClient.update(params, function(err, data) {
if (err) console.log(err);
else console.log(data);
});
}
return `Successfully processed ${event.Records.length} messages.`;
};
There are a couple of ways to do this, but I'm not sure about your use cases: Are the operations critical? Do failed items need to be handled? Does performance need to scale for a large dataset? etc.
// I don't recommend this implementation
const { DynamoDB } = require('aws-sdk');
const documentClient = new DynamoDB.DocumentClient();
exports.handler = async (event) => {
for (const { messageId, body } of event.Records) {
console.log('SQS message %s: %j', messageId, body);
// Parsing JSON without knowing the structure is dangerous; remember to handle
// the case where an error occurs
const JSONBody = JSON.parse(body)
const id = JSONBody.id;
const address = JSONBody.Address;
const params = {
TableName: 'myTable',
Key: {
ID: ':id',
},
UpdateExpression: 'set address = :address',
ExpressionAttributeValues: {
':id': id,
':address': address,
':sortKey': "null"
},
ReturnValues: "UPDATED_NEW"
};
// Wait for each update operation to finish;
// total I/O time grows with the number of records
await documentClient.update(params)
.promise()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err);
})
}
// Even if an update operation fails, this message is still returned by the lambda handler
return `Successfully processed ${event.Records.length} messages.`;
};
// My recommended way
const AWS = require('aws-sdk');
const documentClient = new AWS.DynamoDB.DocumentClient();
exports.handler = async (event) => {
// All the update operations are fired nearly concurrently,
// so I/O time is reduced
return Promise.all(event.Records.map(({ messageId, body }) => {
console.log('SQS message %s: %j', messageId, body);
// Parsing JSON without knowing the structure is dangerous; remember to handle
// the case where an error occurs
const JSONBody = JSON.parse(body)
const id = JSONBody.id;
const address = JSONBody.Address;
const params = {
TableName: 'myTable',
Key: {
ID: ':id',
},
UpdateExpression: 'set address = :address',
ExpressionAttributeValues: {
':id': id,
':address': address,
':sortKey': "null"
},
ReturnValues: "UPDATED_NEW"
};
return documentClient.update(params)
.promise()
.then(res => {
console.log(res)
})
}))
// When the lambda handler has finished all the updates, it returns a string
.then(() => {
return `Successfully processed ${event.Records.length} messages.`
})
// If any update operation fails, Promise.all rejects
// and the lambda handler returns undefined
.catch(error => {
console.error(error);
// return some error for lambda response.
})
};
P.S. My two cents: before you do any kind of Lambda development with the Node.js runtime, you should understand the differences between callbacks, promises, and async/await in JavaScript.
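As a tiny illustration of that last point, the same update call can be written in all three styles (a sketch; params is the update object built above, using aws-sdk v2 as in the answers):
// 1) Callback style -- nothing awaits this, which is why the original handler
//    returns before anything is logged
documentClient.update(params, (err, data) => (err ? console.error(err) : console.log(data)));
// 2) Promise style
documentClient.update(params).promise().then(console.log).catch(console.error);
// 3) async/await style
async function updateOnce() {
  try {
    const data = await documentClient.update(params).promise();
    console.log(data);
  } catch (err) {
    console.error(err);
  }
}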
Fixed it by making the method synchronous, i.e. by removing async from the function definition.

Angular 2 Synchronous File Upload

I am trying to upload a file to a Web API that takes the file as a byte array, from an Angular 2 application.
I am not able to pass the byte array from the Angular 2 page to the Web API. It looks like the FileReader read method is asynchronous. How do I make this a synchronous call, or wait for the file content to be loaded before executing the next line of code?
Below is my code
//attachment on browse - when the browse button is clicked
//It only assign the file to a local variable (attachment)
fileChange = (event) => {
var files = event.target.files;
if (files.length > 0) {
this.attachment = files[0];
}
}
//when the submit button is clicked
onSubmit = () => {
//Read the content of the file and store it in local variable (fileData)
let fr = new FileReader();
let data = new Blob([this.attachment]);
fr.readAsArrayBuffer(data);
fr.onloadend = () => {
this.fileData = fr.result; //Note : This always "undefined"
};
//build the attachment object which will be sent to Web API
let attachment: Attachment = {
AttachmentId: '0',
FileName: this.form.controls["attachmentName"].value,
FileData: this.fileData
}
//build the purchase order object
let order: UpdatePurchaseOrder = {
SendEmail: true,
PurchaseOrderNumber: this.form.controls["purchaseOrderNumber"].value,
Attachment: attachment
}
//call the web api and pass the purchaseorder object
this.updatePoService
.updatePurchaseOrder(this.form.controls["purchaseOrderRequestId"].value, order)
.subscribe(data => {
if (data) {
this.saveSuccess = true;
}
else {
this.saveSuccess = false;
}
},
error => this.errors = error,
() => this.res = 'Completed'
);
}
Any hint would be useful.
regards,
-Alan-
You cannot make this async call synchronous, but you can take advantage of observables to wait for the file to be read:
//when the submit button is clicked
onSubmit = () => {
let file = Observable.create((observer) => {
let fr = new FileReader();
let data = new Blob([this.attachment]);
fr.readAsArrayBuffer(data);
fr.onloadend = () => {
observer.next(fr.result);
observer.complete()
};
fr.onerror = (err) => {
observer.error(err)
}
fr.onabort = () => {
observer.error("aborted")
}
});
file.map((fileData) => {
//build the attachment object which will be sent to Web API
let attachment: Attachment = {
AttachmentId: '0',
FileName: this.form.controls["attachmentName"].value,
FileData: fileData
}
//build the purchase order object
let order: UpdatePurchaseOrder = {
SendEmail: true,
PurchaseOrderNumber: this.form.controls["purchaseOrderNumber"].value,
Attachment: attachment
}
return order;
})
.switchMap(order => this.updatePoService.updatePurchaseOrder(this.form.controls["purchaseOrderRequestId"].value, order))
.subscribe(data => {
if (data) {
this.saveSuccess = true;
} else {
this.saveSuccess = false;
}
},
error => this.errors = error,
() => this.res = 'Completed'
);
}
I arrived here looking for a solution to a similar issue. I'm performing requests to an endpoint which can respond with a binary blob if everything goes well, or with a JSON file in the event of an error.
this.httpClient.post(urlService, bodyRequest,
{responseType: 'blob', headers: headers})
.pipe(map((response: Response) => response),
catchError((err: Error | HttpErrorResponse) => {
if (err instanceof HttpErrorResponse) {
// here, err.error is a BLOB containing a JSON String with the error message
} else {
return throwError(ErrorDataService.overLoadError(err, message));
}
}));
As FileReaderSync apparently doesn't work in Angular 6, I took n00dl3's solution (above) to throw the error after parsing the Blob content:
return this.httpClient.post(urlService, bodyRequest,
{responseType: 'blob', headers: headers})
.pipe(map((response: Response) => response),
catchError((err: Error | HttpErrorResponse) => {
const message = `In TtsService.getTts(${locale},${outputFormat}). ${err.message}`;
if (err instanceof HttpErrorResponse) {
const $errBlobReader: Observable<HttpErrorResponse> = Observable.create((observer) => {
const fr = new FileReader();
const errorBlob = err.error;
fr.readAsText(errorBlob, 'utf8');
fr.onloadend = () => {
const errMsg = JSON.parse(fr.result).message;
const msg = `In TtsService.getTts(${locale},${outputFormat}). ${errMsg}`;
observer.error(ErrorDataService.overLoadError(err, msg));
};
fr.onerror = (blobReadError) => {
observer.error(blobReadError);
};
fr.onabort = () => {
observer.error('aborted');
};
});
return $errBlobReader;
} else {
return throwError(ErrorDataService.overLoadError(err, message));
}
}));
Thanks! You really saved my day!
