Nodejs - data download and saving in chunks - javascript

What I am trying to do
I am fetching an image from a website with node-fetch and saving it to disk. My question is whether I can save that image in chunks, rather than downloading the whole image and then writing all of the downloaded data at once. That doesn't seem like the Node way to do things.
node-fetch documentation on how to save images
fetch('https://assets-cdn.github.com/images/modules/logos_page/Octocat.png')
.then(res => {
const dest = fs.createWriteStream('./octocat.png');
res.body.pipe(dest);
});
That results in a 0-byte image. On closer inspection, res.body.pipe is undefined, yet no error is thrown?
What I currently have
After a bit of investigation I noticed that the "downloaded" data is in a blob form. MDN says that I should "process"(?) that blob with arrayBuffer and then I can save that buffer to disk with .write.
const result = await (await (await fetch(url)).blob()).arrayBuffer();
const fileStream = fs.createWriteStream('./1.jpg').write(Buffer.from(result));
This setup "works", but neither the saving nor the downloading of the file is done in chunks; everything is basically synchronous. I feel like I am missing something or doing something wrong.
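One way to get chunked behaviour is to pipe the response body straight into a write stream instead of buffering it with blob() and arrayBuffer(). The sketch below is only an illustration, not the method from the node-fetch documentation: it assumes Node 18 or later (built-in fetch, Response and Readable.fromWeb), and the helper names saveResponseToFile and downloadToFile are made up for this example. If res.body has no pipe method, it is probably a web ReadableStream rather than a Node stream, which may be why res.body.pipe was undefined; the sketch handles both cases.

```javascript
const fs = require('fs');
const { Readable } = require('stream');
const { pipeline } = require('stream/promises');

// Stream a fetch Response to disk chunk by chunk (hypothetical helper).
async function saveResponseToFile(res, destPath) {
  if (!res.ok) throw new Error(`Unexpected status ${res.status}`);
  // node-fetch v2 exposes a Node stream (has .pipe); built-in fetch
  // exposes a web ReadableStream, which Readable.fromWeb converts.
  const body = typeof res.body.pipe === 'function'
    ? res.body
    : Readable.fromWeb(res.body);
  // pipeline handles backpressure and rejects if either side fails
  await pipeline(body, fs.createWriteStream(destPath));
}

// Fetch a URL and save it without holding the whole body in memory.
async function downloadToFile(url, destPath) {
  const res = await fetch(url);
  await saveResponseToFile(res, destPath);
}

// Usage (not executed here):
//   downloadToFile('https://assets-cdn.github.com/images/modules/logos_page/Octocat.png', './octocat.png');
```

Awaiting pipeline means the promise settles only after the file has been fully written, and a failed download or disk error surfaces as a rejection rather than a silent 0-byte file.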

Related

How to use GCP DLP with a file stream

I'm working with Node.js and GCP Data Loss Prevention to attempt to redact sensitive data from PDFs before I display them. GCP has great documentation on this here
Essentially you pull in the nodejs library and run this
const fileBytes = Buffer.from(fs.readFileSync(filepath)).toString('base64');
// Construct image redaction request
const request = {
parent: `projects/${projectId}/locations/global`,
byteItem: {
type: fileTypeConstant,
data: fileBytes,
},
inspectConfig: {
minLikelihood: minLikelihood,
infoTypes: infoTypes,
},
imageRedactionConfigs: imageRedactionConfigs,
};
// Run image redaction request
const [response] = await dlp.redactImage(request);
const image = response.redactedImage;
So normally, I'd get the file as a buffer, then pass it to the DLP function like the above. But, I'm no longer getting our files as buffers. Since many files are very large, we now get them from FilesStorage as streams, like so
return FilesStorage.getFileStream(metaFileInfo1, metaFileInfo2, metaFileInfo3, fileId)
.then(stream => {
return {fileInfo, stream};
})
The question is, is it possible to perform DLP image redaction on a stream instead of a buffer? If so, how?
I've found some other questions that say you can stream with ByteContentItem and GCPs own documentation mentions "streams". But, I've tried passing the returned stream from .getFileStream into the above byteItem['data'] property, and it doesn't work.
Chunking the stream into buffers of an appropriate size is going to work best here. There are a number of approaches you can use to build buffers from a stream.
Potentially relevant: Convert stream into buffer?
(A native stream interface would be a good feature request; it just isn't there yet.)
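As a rough sketch of the "build a buffer from a stream" approach: the helpers below collect a readable stream into one Buffer and Base64-encode it, which is the shape the byteItem.data field receives in the request from the question. The names streamToBuffer and streamToBase64 are made up for this example, and the in-memory demo stream merely stands in for whatever FilesStorage.getFileStream returns. Note that this holds the whole file in memory, so it only helps when the file fits in memory and within whatever request-size limit the API enforces.

```javascript
const { Readable } = require('stream');

// Collect every chunk of a readable stream into one Buffer.
async function streamToBuffer(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    // Normalise string chunks so Buffer.concat always receives Buffers
    chunks.push(Buffer.isBuffer(chunk) ? chunk : Buffer.from(chunk));
  }
  return Buffer.concat(chunks);
}

// Produce the Base64 string form used for byteItem.data in the question.
async function streamToBase64(stream) {
  return (await streamToBuffer(stream)).toString('base64');
}

// Demo: an in-memory stream standing in for the real file stream
const demoStream = Readable.from([Buffer.from('ab'), Buffer.from('c')]);
streamToBase64(demoStream).then((b64) => console.log(b64)); // prints "YWJj"
```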

How to convert Base64 string to PNG (currently using Jimp)?

I'm currently creating a real-time chat application. This is a web application that uses node.js for the backend and uses socket.io to connect back and forth.
Currently, I'm working on creating user profiles with profile pictures. These profile pictures will be stored in a folder called images/profiles/. The file will be named by the user's id. For example: user with the id 1 will have their profile pictures stored in images/profiles/1.png. Very self-explanatory.
When the user submits the form to change their profile picture, the browser JavaScript will get the image, and send it to the server:
form.addEventListener('submit', handleForm)
function handleForm(event) {
event.preventDefault(); // stop page from reloading
let profilePicture; // set variable for profile picture
let profilePictureInput = document.getElementById('profilePictureInput'); // get image input
const files = profilePictureInput.files[0]; // get input's files
if (files) {
const fileReader = new FileReader(); // initialize file reader
fileReader.readAsDataURL(files);
fileReader.onload = function () {
profilePicture = this.result; // put result into variable
socket.emit("request-name", {
profilePicture: profilePicture,
id: userID,
}); // send result, along with user id, to server
}
}
}
I've commented most of the code so it's easy to follow. The server then gets this information. With this information, the server is supposed to convert the sent image to a png format (I can do whatever format, but it has to be the same format for all images). I am currently using the jimp library to do this task, but it doesn't seem to work.
const jimp = require('jimp'); // initialize Jimp
socket.on('request-name', (data) => { // when request has been received
// read the buffer from image (I'm not 100% sure what Buffer.from() does, but I saw this online)
jimp.read(Buffer.from(data.profilePicture), function (error, image) {
if (error) throw error; // throw error if there is one
image.write(`images/profiles/${data.id}.png`); // write image to designated place
});
});
The error I get:
Error: Could not find MIME for Buffer <null>
I've scoured the internet for answers but was unable to find any. I am open to using another library if that helps. I can also change the file format (.png to .jpg or .jpeg, if needed); it just needs to be consistent across all files. The only things I cannot change are the use of JavaScript/Node.js and socket.io to send the information to the server.
Thank you in advance. Any and all help is appreciated.
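If you're just getting the data URI as a string, you can strip its data:...;base64, prefix, construct a buffer from the remaining Base64 payload, and then use the built-in fs module to write the file. Make sure the relative path is accurate.
const fs = require('fs');
socket.on('request-name', data => {
  // Strip the "data:<mime>;base64," prefix so only the Base64 payload is decoded
  const base64Data = data.profilePicture.replace(/^data:.*?;base64,/, '');
  const imgBuffer = Buffer.from(base64Data, 'base64');
  // fs.writeFile needs a callback; log write errors instead of ignoring them
  fs.writeFile(`images/profiles/${data.id}.png`, imgBuffer, err => { if (err) console.error(err); });
});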
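For context, FileReader.readAsDataURL produces a string of the form data:<mime>;base64,<payload>, and Buffer.from(string) without an encoding yields the UTF-8 bytes of that text rather than the image bytes, which is a plausible reason for Jimp failing to detect a MIME type. The sketch below, with made-up helper names parseDataUrl and isPng, splits such a string and checks whether the decoded bytes really are a PNG. Writing bytes under a .png name does not convert them, so a check like this can help decide whether a conversion step (for example passing the decoded buffer to jimp.read) is still needed.

```javascript
// Split a Base64 data URL into its MIME type and decoded bytes.
function parseDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,([A-Za-z0-9+/=]+)$/.exec(dataUrl);
  if (!match) throw new Error('Not a Base64 data URL');
  return { mime: match[1], buffer: Buffer.from(match[2], 'base64') };
}

// Every PNG file begins with these eight signature bytes.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function isPng(buffer) {
  return buffer.length >= 8 && buffer.subarray(0, 8).equals(PNG_SIGNATURE);
}

// Demo with a tiny data URL that contains only the PNG signature
const sample = `data:image/png;base64,${PNG_SIGNATURE.toString('base64')}`;
console.log(parseDataUrl(sample).mime);          // prints "image/png"
console.log(isPng(parseDataUrl(sample).buffer)); // prints true
console.log(isPng(Buffer.from(sample)));         // prints false: bytes of the text itself
```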

How to read binary data response from AWS when doing a GET directly to an S3 URI in browser?

Some general context: This is an app that uses the MERN stack, but the question is more specific to AWS S3 data.
I have an S3 bucket set up where I store images and files from my app. I usually generate signed URLs on the server and do a direct upload from the browser.
Within my app DB I store the object URIs as strings, and an image, for example, I can render with an <img/> tag no problem. So far so good.
However, when they are PDFs and I want to let the user download the PDF I stored in S3, doing an <a href={s3Uri} download> just causes the PDF to be opened in another window/tab instead of prompting the user to download. I believe this is because the download attribute depends on same-origin, so you cannot download a file from an external resource (correct me if I'm wrong, please).
So my next attempt was to do an HTTP fetch of the resource directly using axios. It looks something like this:
axios.create({
baseURL: attachment.fileUrl,
headers: {common: {Authorization: ''}}
})
.get('')
.then(res => {
console.log(res)
console.log(typeof res.data)
console.log(new Buffer.from(res.data).toString())
})
By doing this I can successfully read the response headers (useful because then I can handle images and files differently), BUT when I try to read the binary data returned, I have been unsuccessful at parsing it or even determining how it is encoded. It looks like this:
%PDF-1.3
3 0 obj
<</Type /Page
/Parent 1 0 R
/Resources 2 0 R
/Contents 4 0 R>>
endobj
4 0 obj
<</Filter /FlateDecode /Length 1811>>
stream
x�X�R�=k=E׷�������Na˅��/���� �[�]��.�,��^ �wF0�.��Ie�0�o��ݧO_IoG����p��4�BJI���g��d|��H�$�12(R*oB��:%먺�����:�R�Ф6�Xɔ�[:�[��h�(�MQ���>���;l[[��VN�hK/][�!�mJC
.... and so on
I have another function I use to allow users to download PDFs that I store directly in my database as Base64 strings. These are PDFs my app generates and are fairly small, so I store them directly in the DB, as opposed to the ones I store in AWS S3, which are user-submitted and can be several MB in size (the ones in my DB are just a few KB).
The function I use to process my base64 pdfs and provide a downloadable link to the users looks like this
export const makePdfUrlFromBase64 = (base64) => {
const binaryImg = atob(base64);
const binaryImgLength = binaryImg.length;
const arrayBuffer = new ArrayBuffer(binaryImgLength);
const uInt8Array = new Uint8Array(arrayBuffer);
for (let i = 0; i < binaryImgLength; i++) {
uInt8Array[i] = binaryImg.charCodeAt(i);
}
const outputBlob = new Blob([uInt8Array], {type: 'application/pdf'});
return URL.createObjectURL(outputBlob)
}
HOWEVER, when I try to apply this function to the data returned from AWS I get this error:
DOMException: Failed to execute 'atob' on 'Window': The string to be decoded contains characters outside of the Latin1 range.
So what kind of binary data encoding do I have here from AWS?
Note: I am able to render an image with this binary data by passing the src in the img tag like this:
<img src={`data:${res.headers['Content-Type']};base64,${res.data}`} />
which is my biggest hint that this is some form of Base64?
PLEASE! If anyone has a clue how I can achieve my goal here, I'm all ears! The goal is to be able to prompt the user to download the resource which I have in an S3 URI. I can link to it and they can open it in the browser and then download it manually, but I want to force the prompt.
Does anybody know what kind of data is being returned here? Is there any way to parse it as a stream? A buffer?
I have tried to stringify it with JSON and to log it to the console as a string. I'm open to all suggestions at this point.
You're doing all kinds of unneeded conversions. When you do the GET request, you already have the data in the desired format.
const response = await fetch(attachment.fileUrl, {
  headers: {Authorization: ''}
});
// Read the body as a Blob; no Base64 or string conversion is needed
const blob = await response.blob();
return URL.createObjectURL(blob);
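To go from an object URL to an actual download prompt, one common pattern is to click a temporary link that carries the download attribute; because a blob: URL belongs to the page's own origin, the attribute is honoured. The sketch below is an illustration under assumptions: triggerDownload is a made-up helper name, the doc parameter exists only so the function can be exercised outside a browser, and the S3 bucket is assumed to allow cross-origin GET requests from the app.

```javascript
// Prompt the browser to save a Blob under the given file name.
function triggerDownload(blob, filename, doc = document) {
  const url = URL.createObjectURL(blob);
  const link = doc.createElement('a');
  link.href = url;
  link.download = filename; // honoured because blob: URLs are same-origin
  doc.body.appendChild(link);
  link.click();
  link.remove();
  // Release the object URL once the click has been dispatched
  setTimeout(() => URL.revokeObjectURL(url), 0);
  return url;
}

// Browser usage (inside an async function, not executed here):
//   const response = await fetch(attachment.fileUrl);
//   triggerDownload(await response.blob(), 'attachment.pdf');
```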

How do you upload a blob base64 string to google cloud storage using Node.js

I'm building an application that lets me take a picture from my React app, which accesses the webcam; I then need to upload the image to Google Cloud Storage using a Hapi Node.js server. The problem I'm encountering is that the React app snaps a picture and gives me this blob string (I actually don't even know if that's what it's called). The string is very large and looks like this (I've shortened it due to its really large size):
"imageBlob": "...
I'm finding it hard to find resources that show me exactly how to do this. I need to upload that blob file and save it to a Google Cloud Storage bucket.
I have this in my app so far:
Item.postImageToStorage = async (request, h) => {
const image = request.payload.imageBlob;
const projectId = 'my-project-id'
const keyFilename = 'path-to-my-file'
const gc = new Storage({
projectId: projectId,
keyFilename: keyFilename
})
const bucket = gc.bucket('my-bucket.appspot.com/securityCam');
const blob = bucket.file(image);
const blobStream = blob.createWriteStream();
blobStream.on('error', err => {
h.response({
success: false,
error: err.message || '=-->' + err
})
});
console.log('===---> ', 'no errors::::')
blobStream.on('finish', () => {
console.log('done::::::', `https://storage.googleapis.com/${bucket.name}/${blob.name}`)
// The public URL can be used to directly access the file via HTTP.
const publicUrl = format(
`https://storage.googleapis.com/${bucket.name}/${blob.name}`
);
});
console.log('===---> ', 'past finish::::')
blobStream.end(image);
console.log('===---> ', 'at end::::')
return h.response({
success: true,
})
// Utils.postRequestor(path, payload, headers, timeout)
}
I get to the success response (h.response), but no console logs appear except the ones outside of the blobStream.on handlers. I see all the ones that start with ===---> but nothing else.
Not sure what I'm doing wrong, thanks in advance!
At the highest level, let us assume you want to write into a file my-file.dat that is to live under my-folder in bucket my-bucket. Let us assume that the data you want to write is a binary chunk of data stored in a JavaScript Buffer object referenced by a variable called my_data. Note that a bucket name cannot contain a slash, so the folder belongs in the object name. We would then want to code something similar to:
const bucket = gc.bucket('my-bucket');
// The "folder" is just a prefix of the object name
const my_file = bucket.file('my-folder/my-file.dat');
const my_stream = my_file.createWriteStream();
my_stream.write(my_data);
my_stream.end();
In your example, something looks fishy with the value you are passing in as the file name in the line:
const blob = bucket.file(image);
It looks as if you think you are passing in the content of the file rather than the name of the file.
Also realize that your JavaScript object field called "imageBlob" will be a String. It may be that this is indeed what you want to save, but I can also imagine that what you want to save is the binary data corresponding to your webcam image. In that case you will have to decode the string to a binary Buffer. That likely means extracting the string data that follows the data:image/jpeg;base64, prefix and then creating a Buffer from it by treating the string as Base64-encoded binary.
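Separately, the handler in the question returns its response before the stream's 'finish' or 'error' events can fire, so any failure is never reported. A minimal sketch of one way to wait for the write: wrap the stream in a promise and await it before responding. The helper name writeAndWait is made up for this example, and the in-memory Writable only stands in for blob.createWriteStream().

```javascript
const { Writable } = require('stream');

// Resolve once the stream has flushed everything; reject on the first error.
function writeAndWait(writable, data) {
  return new Promise((resolve, reject) => {
    writable.once('error', reject);
    writable.once('finish', resolve);
    writable.end(data);
  });
}

// In-memory sink standing in for blob.createWriteStream()
const received = [];
const sink = new Writable({
  write(chunk, _encoding, callback) {
    received.push(chunk);
    callback();
  },
});

// Placeholder payload: Base64 text decoded to bytes before writing
const imageBase64 = Buffer.from('fake image bytes').toString('base64');
writeAndWait(sink, Buffer.from(imageBase64, 'base64')).then(() => {
  console.log(Buffer.concat(received).toString()); // prints "fake image bytes"
});
```

Inside an async Hapi handler this would become an awaited call to writeAndWait before the return of h.response, with a try/catch around it to build the error response.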

Write image file to Firebase Storage from HTTP function

I am trying to write an image to Firebase Storage via a Cloud Function (for more suitable write access).
My current attempt is to read the file object on the client, send it (the data) to an http firebase function, and then save it to storage from there. After saving the file successfully, I try using the download url as an img src value, but the file does not display. I also see an error in the Storage console (Error loading preview) when attempting to view the file.
If I save the data in Storage as base64, I can copy the contents of the file into the img src attribute, and it displays fine. However, I'd like to simply use the download URL as I could do if I just uploaded the image via the client SDK or directly via the console.
In the client, I'm simply using FileReader to read the uploaded file for sending. I've tried all the ways of reading it (readAsText, readAsBinaryString, readAsDataURL, readAsArrayBuffer), but none seem to solve the issue.
Here is how I am uploading the file via the Firebase Function:
import * as functions from 'firebase-functions';
import * as admin from 'firebase-admin';
import * as path from 'path';
import * as os from 'os';
import * as fs from 'fs-extra';
export default functions.https.onCall(async(req, context) => {
const filename = req.filename;
const bucket = admin.storage().bucket(environment.bucket)
const temp_filename = filename;
const temp_filepath = path.join(os.tmpdir(), temp_filename);
await fs.outputFile(temp_filepath, req.data, {});
// Upload.
await bucket.upload(temp_filepath, {destination: 'logos'})
.then((val) => {})
.catch((err) => {});
});
This uploads the file successfully, however, the Download URL does not work when used as the img src attribute.
One thing I have noticed is that when using the client SDK to send a file (via AngularFireStorage), the payload is the raw png contents. E.g. a snippet of the file:
PNG
IHDRÈÈ­X®¤IDATx^í]
Eµ¾·{&1,!dù»*yVQ#PTEDPA>ÊâC\P"ÈÄ"
F}òIW÷üCL#BÉL÷}
....
However, reading the file as text does not yield this encoding. I have tried several other encodings.
Any help would be immensely appreciated.
Edit
Here is what I mean about using the download URL:
<img alt='logo' src='https://firebasestorage.googleapis.com/v0/b/y<project-name>/o/logos%2FAnM65PlBGluoIzdgN9F5%2Fuser.png?alt=media&token=<token>' />
The above src URL is the one provided in the Firebase Storage console when clicking on the file. It is labeled 'Download URL' (I believe this is the one retrieved by calling getDownloadURL() via the SDK).
When using AngularFireStorage to put the file in storage, the Download URL will work. When I say it 'will work', I mean the image will display properly. When using FileReader to pass the data to an http cloud function to upload (as seen above), the image will not display. In other words, after uploading the file via the backend, the download url does in fact provide what was uploaded, it's just not in a format that an img tag can display.
One possible issue may be that I am not getting the encoding correct when using FileReader readAsText. Here is what I am doing with FileReader:
const reader = new FileReader();
reader.onloadend = () => {
firebase.functions().httpsCallable('http_put_logo')(reader.result);
};
// Have tried various encodings here, as well as all reader methods.
reader.readAsText(file);
Edit 2
All of the discussion on this question so far seems to be around correctly getting the download URL. I'm not sure if the Firebase docs have this information, but the download URL is available in the Storage console. I'm simply copying and pasting that URL for testing purposes at the moment.
The reason I am doing this is that I plan to save these image URLs in the DB, since they are going to be frequently used and publicly readable. So I'm not going to use the getDownloadURL() method to fetch these images; I'm simply going to link to them directly in img tags.
Here is an image of my console to see what I mean (bottom right):
You just have to click it and copy it. You can then open it in a browser tab, download it, use it as a src value, etc.
Edit 3
Here is an image of what the request payload looks like when using the client sdk:
Here is when I read the file as text and send to backend for upload:
Notice there are differences in the payloads. That's why I'm uncertain if I'm properly reading the file or encoding it incorrectly.
Which part of your code takes care of getting the URL? I recently used a similar approach to upload images to Firebase Storage using Cloud Functions. What worked best for me was to execute a separate function to get the URL after the upload is complete. Something like this:
// Define the helper before it is used in the upload callback
const retrieveUrl = (imageName) => {
  const storage = firebase.storage();
  return storage.ref(`/images/${imageName}.jpg`).getDownloadURL()
    .then(url => {
      /* Save the url to a variable or attach it directly to the src of your image, depending on the structure of your project */
    })
    .catch(err => console.log(err));
};
const bucket = admin.storage().bucket(environment.bucket);
const temp_filename = filename;
const temp_filepath = path.join(os.tmpdir(), temp_filename);
await fs.outputFile(temp_filepath, req.data, {});
// Upload, then look up the download URL once the upload has finished.
await bucket.upload(temp_filepath, {destination: 'images'})
  .then((val) => retrieveUrl(temp_filename))
  .catch((err) => console.log(err));
Keep in mind that you need to install firebase in your project in order to call firebase.storage.
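Apart from the URL lookup, the payload differences described in the question are consistent with binary data being decoded as text on the way to the function. The sketch below shows why that is lossy while a Base64 round trip is not. It is only an illustration: encodeForTransport and decodeFromTransport are made-up helper names, and it assumes the client would send Base64 (for example from readAsDataURL with the prefix removed) and the function would decode it before writing the temp file.

```javascript
// Encode raw bytes as Base64 so they survive a JSON payload unchanged.
function encodeForTransport(bytes) {
  return Buffer.from(bytes).toString('base64');
}

// Decode the Base64 payload back into the original bytes.
function decodeFromTransport(base64) {
  return Buffer.from(base64, 'base64');
}

// The eight-byte PNG signature; its first byte (0x89) is not valid UTF-8
const original = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Decoding as text replaces invalid bytes, so the round trip changes the data
const viaText = Buffer.from(original.toString('utf8'), 'utf8');
const viaBase64 = decodeFromTransport(encodeForTransport(original));

console.log(viaText.equals(original));   // prints false
console.log(viaBase64.equals(original)); // prints true
```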
