createReadStream not working/extremely slow for large files - javascript

I'm using the Dropbox API to upload files. The upload goes through the following steps:
First upload file from form to a local directory on the server.
Read File from local directory using fs.createReadStream
Send file to Dropbox via the dropbox API.
The issue:
For some reason fs.createReadStream takes ages when reading and uploading a large file. The file I'm trying to upload is only 12MB, which is not a big file, yet it takes approximately 18 minutes to upload/process.
I don't know whether the issue is in createReadStream or in the Dropbox API code.
It works with files that are only a few KB in size.
My Code:
let options = {
  method: 'POST',
  uri: 'https://content.dropboxapi.com/2/files/upload',
  headers: {
    'Authorization': 'Bearer TOKEN HERE',
    'Dropbox-API-Arg': "{\"path\": \"/test/" + req.file.originalname + "\",\"mode\": \"overwrite\",\"autorename\": true,\"mute\": false}",
    'Content-Type': 'application/octet-stream'
  },
  // I think the issue is here.
  body: fs.createReadStream(`uploads/${req.file.originalname}`)
};
rp(options)
  .then(() => {
    return _deleteLocalFile(req.file.originalname)
  })
  .then(() => {
    return _generateShareableLink(req.file.originalname)
  })
  .then((shareableLink) => {
    sendJsonResponse(res, 200, shareableLink)
  })
  .catch(function (err) {
    sendJsonResponse(res, 500, err)
  });
Update:
const rp = require('request-promise-native');

I ran into a similar issue before, and after a good deal of head scratching and digging around I was able to resolve it, in my case anyway.
For me, the issue arose because the default chunk size for createReadStream() is quite small (64 KB), and for some reason this had a knock-on effect when uploading to Dropbox.
The solution, therefore, was to increase the chunk size via the highWaterMark option.
// Try using chunks of 256 KB
body: fs.createReadStream(`uploads/${req.file.originalname}`, { highWaterMark: 256 * 1024 })
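To make the effect of highWaterMark concrete, here is a minimal sketch using only Node's built-in fs module. The helper names minChunkCount and countChunks are hypothetical, and the sketch only illustrates how the option changes the number of chunks; it does not show why Dropbox uploads slow down with small chunks.

```javascript
const fs = require('fs');

// Each 'data' chunk is at most highWaterMark bytes, so a file of
// fileSize bytes is delivered in at least this many chunks.
function minChunkCount(fileSize, highWaterMark) {
  return Math.ceil(fileSize / highWaterMark);
}

// Hypothetical helper: counts the 'data' events actually emitted for a file.
function countChunks(filePath, highWaterMark) {
  return new Promise((resolve, reject) => {
    let chunks = 0;
    fs.createReadStream(filePath, { highWaterMark })
      .on('data', () => { chunks += 1; })
      .on('end', () => resolve(chunks))
      .on('error', reject);
  });
}

const fileSize = 12 * 1024 * 1024; // the 12 MB file from the question
console.log(minChunkCount(fileSize, 64 * 1024));  // 192
console.log(minChunkCount(fileSize, 256 * 1024)); // 48
```

Running countChunks against a real file with both settings is a quick way to confirm that the option is actually being applied.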

https://github.com/request/request#streaming
I believe you need to pipe the stream to the request.
see this answer:
Sending large image data over HTTP in Node.js
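As a rough sketch of what piping the stream into the request could look like without the request library, the following uses Node's built-in https module. The endpoint and headers are taken from the question; the helper names buildDropboxArg and uploadToDropbox are hypothetical, and whether this removes the slowdown is an assumption that has not been verified against Dropbox.

```javascript
const fs = require('fs');
const https = require('https');
const { pipeline } = require('stream');

// Build the Dropbox-API-Arg header with JSON.stringify rather than
// hand-escaped quotes, so unusual file names cannot break the JSON.
function buildDropboxArg(fileName) {
  return JSON.stringify({
    path: `/test/${fileName}`,
    mode: 'overwrite',
    autorename: true,
    mute: false,
  });
}

// Hypothetical helper: streams localPath to Dropbox and resolves with the
// raw response body on HTTP 200.
function uploadToDropbox(localPath, fileName, token) {
  return new Promise((resolve, reject) => {
    const req = https.request('https://content.dropboxapi.com/2/files/upload', {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Dropbox-API-Arg': buildDropboxArg(fileName),
        'Content-Type': 'application/octet-stream',
      },
    }, (res) => {
      let body = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { body += chunk; });
      res.on('end', () => {
        if (res.statusCode === 200) resolve(body);
        else reject(new Error(`Upload failed with status ${res.statusCode}: ${body}`));
      });
    });
    req.on('error', reject);
    // pipeline() pipes the file into the request, ends the request when the
    // file has been fully read, and reports errors from either stream.
    pipeline(fs.createReadStream(localPath), req, (err) => {
      if (err) reject(err);
    });
  });
}
```

A call such as uploadToDropbox(`uploads/${req.file.originalname}`, req.file.originalname, token) would then take the place of rp(options) in the promise chain from the question.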

Related

Function stops without any logs in firebase (sometimes works well.. sometimes partly) - confused

Can you please share your thoughts or experience why this can happen:
I have a folder on Firebase cloud with a bunch of images (jpeg, png). It can contain from 2 to 40+ files. After the client finishes uploading the files, a request is sent to Firebase to trigger a function that resizes the new images.
Resizing should work as:
uploading every file to tmpFilePath (for (const file of files) loop is used)
resizing the files in tmpFilePath (I tried spawn and sharp, and sharp seems to be more stable)
downloading the resized file back to the folder.
The problem is that the function fails intermittently. Sometimes it stops at the step of downloading to tmpFilePath with no logs. After re-triggering, it may process a couple of files and stop again. Sometimes it works well. I thought the file size mattered, but the same folder with the same files can work fine after several attempts. I'm confused. Is this specific to Firebase functions?
This is a short version of the code:
for (const file of files) {
  console.log(`${file} before downloading to tempFolder`)
  await destBucket.file(filePath).download({
    destination: tmpFilePath
  })
    .catch((err) => console.log(err))
    .then(() => {
      console.log(`${file} before sharp`)
      return sharp(tmpFilePath, { failOnError: false })
        .resize(resizedW, resizedH)
        .toFile(afterSharpPath)
        .then(() => {
          console.log(`${file} before uploading back`)
          return destBucket.upload(afterSharpPath, {
            destination: filePath,
            contentType: orType,
            metadata: {metadata: newMetadata},
            bucket: bucket,
          })
        })
    })
}
Anyone have an idea how this can be done?

Electron upload with progress

I have an Electron app that uploads very big files to the server via HTTP in the renderer process without user input. I decided to use axios as my HTTP client, and it was able to report upload progress, but I ran into a few problems with it.
Browser JavaScript and Node.js aren't always "friendly" with each other. I used fs.createReadStream to get the file, but axios does not understand what a ReadStream object is, and I can't pipe this stream into FormData (which I need to place my file in). There are several topics about this in their GitHub issues, but nothing has been done so far.
I ended up using fs.readFileSync and then the form-data module with its getBuffer() method, but now the file is loaded entirely into memory before upload, and given how big my files are, this kills the Electron process.
Googling, I found the request library, which is in fact able to pipe a stream to a request, but it's deprecated and no longer supported, and apparently I can't get upload progress from it.
I'm running out of options. How do you upload files with Electron without user input (so without a file input) and without loading them into memory up front?
P.S. The form-data GitHub page has a piece of code explaining how to upload a file stream with axios, but it doesn't work: nothing is sent, and downgrading the library, as one issue suggested, didn't help either.
const form = new FormData();
const stream = fs.createReadStream(PATH_TO_FILE);
form.append('image', stream);
// In Node.js environment you need to set boundary in the header field 'Content-Type' by calling method `getHeaders`
const formHeaders = form.getHeaders();
axios.post('http://example.com', form, {
  headers: {
    ...formHeaders,
  },
})
  .then(response => response)
  .catch(error => error)
I was able to solve this, and I hope it helps anyone facing the same problem.
Since request is deprecated, I looked for alternatives and found got for Node.js HTTP requests. It supports Stream, fs.ReadStream, etc.
You will need form-data as well; it lets you put streams inside FormData and assign them to a key.
The following code solved my problem:
import fs from 'fs'
import got from 'got'
import FormData from 'form-data'
const stream = fs.createReadStream('some_path')
// NOT native form data
const formData = new FormData()
formData.append('file', stream, 'filename');
try {
  const res = await got.post('https://my_link.com/upload', {
    body: formData,
    headers: {
      ...formData.getHeaders() // sets the boundary and Content-Type header
    }
  }).on('uploadProgress', progress => {
    // here we get our upload progress, progress.percent is a float number from 0 to 1
    console.log(Math.round(progress.percent * 100))
  });
  if (res.statusCode === 200) {
    // upload success
  } else {
    // error handler
  }
} catch (e) {
  console.log(e);
}
Works perfectly in Electron renderer process!

POST cutting off PDF data

I posted a question yesterday (linked here) where I was trying to send a PDF to a database and retrieve it at a later date. Since then I have been advised that it is best (in my case, as I cannot use cloud computing services) to upload the PDF files to local storage and save the URL of the file to the database instead. I have now begun implementing this, but I have run into some trouble.
I am currently using FileReader() as documented below to process the input file and send it to the server:
var input_file = "";
let reader = new FileReader();
reader.readAsText(document.getElementById("input_attachment").files[0]);
reader.onloadend = function () {
input_file = "&file=" + reader.result;
const body = /*all the rest of my data*/ + input_file;
const method = {
method: "POST",
body: body,
headers: {
"Content-type": "application/x-www-form-urlencoded"
}
};
After this block of code I do the stock standard fetch() and a route on my server receives this. Almost all data comes in 100% as expected, but the file comes in cut off somewhere around 1300 characters in (making it quite an incomplete PDF). What does come in seems to match the first 1300 characters of the original PDF I uploaded.
I have seen suggestions that you are meant to use "multipart/form-data" content-type to upload files, but when I do this I seem to only then receive the first 700 characters or so of my PDF. I have tried using the middleware Multer to handle the "multipart/form-data" but it just doesn't seem to upload anything (though I can't guarantee that I am using it correctly).
I also initially had trouble with fetch payload too large error message, but have currently resolved this through this method:
app.use(bodyParser.urlencoded({ limit: "50mb", extended: false, parameterLimit: 50000 }));
Though I have suspicions that this may not be correctly implemented as I have seen some discussion that the urlencoded limit is set prior to the file loading, and cannot be changed in the middle of the program.
Any and all help is greatly appreciated, and I will likely use any information here to construct an answer on my original question from yesterday so that anybody else facing this sort of issue has a resource to go to.
I personally found the solution to this problem as follows. On the client-side of my application this code is an example of what was implemented.
const formData = new FormData();
formData.append("username", "John Smith");
formData.append("fileToUpload", document.getElementById("input_attachment").files[0]);
const method = {
method: "POST",
body: formData
};
fetch(url, method)
.then(res => res.json())
.then(res => alert("File uploaded!"))
.catch(err => alert(err.message))
As you can see, I switched from "application/x-www-form-urlencoded" encoding to "multipart/form-data" to upload files. Node.js and Express, however, do not natively support this encoding type. I chose the Formidable library (I found it the easiest to use without too much overhead), which you can read more about here. Below is an example of my server-side implementation of this middleware (Formidable).
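const express = require('express');
const app = express();
const formidable = require('formidable');
app.post('/upload', (req, res) => {
  // Uploaded files are written to ./file/ with their extensions kept.
  const form = formidable({ uploadDir: `${__dirname}/file/`, keepExtensions: true });
  form.parse(req, (err, fields, files) => {
    if (err) {
      console.log(err.stack);
      return res.status(500).json({ error: err.message });
    }
    console.log(fields.username);
    // The client calls res.json(), so reply with JSON.
    res.json({ uploaded: true });
  });
});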
The file(s) are automatically uploaded to the directory specified in uploadDir, and the keepExtensions ensures that the file extension is saved as well. The non-file inputs are accessible through the fields object as seen through the fields.username example above.
From what I have found, this is the easiest method to take to setup an easy file upload system.
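One plausible contributor to the truncation in the original approach (an assumption, since the question does not show the server side) is that the file text was concatenated into the urlencoded body without escaping, so the first "&" inside the file ends the "file" field. The sketch below illustrates this with a made-up stand-in string and Node's built-in URLSearchParams, which parses in a similar way to a urlencoded body parser.

```javascript
// Stand-in for file text that happens to contain '&' and '='.
const fileText = 'abc&def=ghi';

// Built as in the question: the file text is concatenated without escaping.
const rawBody = 'username=John&file=' + fileText;
// Escaped with encodeURIComponent before concatenation.
const safeBody = 'username=John&file=' + encodeURIComponent(fileText);

// Parse both bodies and read back the "file" field.
const rawFile = new URLSearchParams(rawBody).get('file');
const safeFile = new URLSearchParams(safeBody).get('file');

console.log(rawFile);  // abc
console.log(safeFile); // abc&def=ghi
```

Separately, FileReader.readAsText decodes the bytes as text, which is lossy for binary formats such as PDF, so sending the File object itself through FormData, as in the answer above, sidesteps both problems.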

Piping zip file from SailsJS backend to React Redux Frontend

I have a SailsJS backend where I generate a zip file requested by my frontend, a React app with Redux. I'm using Sagas for the async calls and fetch for the request. In the backend, I tried things like:
//zipFilename is the absolute path
res.attachment(zipFilename).send();
or
res.sendfile(zipFilename).send();
or
res.download(zipFilename).send();
or pipe the stream with:
const filestream = fs.createReadStream(zipFilename);
filestream.pipe(res);
On my frontend I try to parse it with:
const parseJSON = (response) => {
  return response.clone().json().catch(() => response.text());
}
Everything I tried ends up with an empty zip file. Any suggestions?
There are various issues with the options that you tried out:
res.attachment will just set the Content-Type and Content-Disposition headers, but it will not actually send anything.
You can use this to set the headers properly, but you need to pipe the ZIP file into the response as well.
res.sendfile: You should not call .send() after this. From the official docs' examples:
app.get('/file/:name', function (req, res, next) {
  var options = { ... };
  var fileName = req.params.name;
  res.sendFile(fileName, options, function (err) {
    if (err) {
      next(err);
    } else {
      console.log('Sent:', fileName);
    }
  });
});
If the ZIP is properly built, this should work fine and set the proper Content-Type header as long as the file has the proper extension.
res.download: Same thing, you should not call .send() after this. From the official docs' examples:
res.download('/report-12345.pdf', 'report.pdf', function(err) { ... });
res.download will use res.sendfile to send the file as an attachment, thus setting both Content-Type and Content-Disposition headers.
However, you mention that the ZIP file is being sent but it is empty, so you should probably check if you are creating the ZIP file properly. As long as they are built properly and the extension is .zip, res.download should work fine.
If you are building them on the fly, check this out:
This middleware will create a ZIP file with multiple files on the fly and send it as an attachment. It uses lazystream and archiver:
const fs = require('fs');
const lazystream = require('lazystream');
const archiver = require('archiver');
function middleware(req, res) {
  // Set the response's headers:
  // You can also use res.attachment(...) here.
  res.writeHead(200, {
    'Content-Type': 'application/zip',
    'Content-Disposition': 'attachment; filename=DOWNLOAD_NAME.zip',
  });
  // Files to add in the ZIP:
  const filesToZip = [
    'assets/file1',
    'assets/file2',
  ];
  // Create a new ZIP file:
  const zip = archiver('zip');
  // Set up some callbacks (errorHandler is assumed to be defined elsewhere):
  zip.on('error', errorHandler);
  zip.on('finish', function() {
    res.end(); // Send the response once ZIP is finished.
  });
  // Pipe the ZIP output to res:
  zip.pipe(res);
  // Add files to ZIP. lazystream opens each file only when it is first read:
  filesToZip.forEach((filename) => {
    zip.append(new lazystream.Readable(() => fs.createReadStream(filename)), {
      name: filename,
    });
  });
  // Finalize the ZIP. Compression will start and output will
  // be piped to res. Once ZIP is finished, res.end() will be
  // called.
  zip.finalize();
}
You can build on this to cache the built ZIPs instead of building them on the fly every time, which is time- and resource-consuming and inadvisable for most use cases.
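As one possible starting point for such a cache, the sketch below derives a stable file name for a built ZIP from the list of files it contains, using only Node's crypto and path modules. The helper names zipCacheKey and cachedZipPath and the cache directory are hypothetical. Note that the key depends only on file names, so it would not notice a file whose contents changed.

```javascript
const crypto = require('crypto');
const path = require('path');

// Same set of file names (in any order) -> same key.
function zipCacheKey(files) {
  const normalized = [...files].sort().join('\n');
  return crypto.createHash('sha256').update(normalized).digest('hex');
}

// Where a previously built ZIP for this set of files would be stored.
function cachedZipPath(files, cacheDir) {
  return path.join(cacheDir, `${zipCacheKey(files)}.zip`);
}

const a = cachedZipPath(['assets/file1', 'assets/file2'], 'zip-cache');
const b = cachedZipPath(['assets/file2', 'assets/file1'], 'zip-cache');
console.log(a === b); // true
```

The middleware could then check whether that path exists and send it with res.download, building the archive only on a cache miss.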

AWS SDK JS: Multipart upload to S3 resulting in Corrupt data

Trying to upload an mp4 file using the AWS JS SDK initiating a multipart upload, I keep getting a file corrupt error when I try to download and play it on my local.
Gists of my code:
Initiating the multipart upload with params:
const createMultipartUploadParams = {
Bucket: bucketname,
Key: fileHash.file_name,
ContentType: 'video/mp4' // TODO: Change hardcode
};
Call:
s3Instance.createMultipartUpload(createMultipartUploadParams, function(err, data) {
});
Doing the chunking:
Params:
const s3ChunkingParams = {
chunkSize,
noOfIterations,
lastChunk ,
UploadId: data.UploadId
}
Reading the file:
const reader = new FileReader();
reader.readAsArrayBuffer(file)
Uploading each chunk:
reader.onloadend = function onloadend(){
  console.log('onloadend');
  const partUploadParams = {
    Bucket: bucketname,
    Key: file_name,
    PartNumber: i, // Iterating over all parts
    UploadId: s3ChunkingParams.UploadId,
    Body: reader.result.slice(start, stop) // Chunking up the file
  };
  s3Instance.uploadPart(partUploadParams, function(err, data1) {
  });
};
Finally completing the multipartUpload:
s3Instance.completeMultipartUpload(completeMultipartParams, function(err, data) {
});
I am guessing the problem is how I am reading the file, so I have tried Content Encoding it to base64 but that makes the size unusually huge. Any help is greatly appreciated!
Tried this too
The only thing that could corrupt the object is uploading extra padded content in the individual parts, which makes the final object wrong. I do not believe S3 is doing anything fishy here.
After uploading, you can check the final size of the object. If it doesn't match your local copy, you know there is a problem somewhere.
Are you trying to upload from the browser?
Alternatively, you can look at https://github.com/minio/minio-js. It has a minimal set of abstracted APIs implementing the most commonly used S3 calls.
Here is a Node.js example of a streaming upload.
$ npm install minio
$ cat >> put-object.js << EOF
var Minio = require('minio')
var fs = require('fs')
// find out your s3 end point here:
// http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
// Constructor and putObject arguments are kept as given in this answer;
// other minio-js releases may differ, so check the docs for your version.
var s3Client = new Minio({
  url: 'https://<your-s3-endpoint>',
  accessKey: 'YOUR-ACCESSKEYID',
  secretKey: 'YOUR-SECRETACCESSKEY'
})
var file = 'your_localfile.zip'
var fileStream = fs.createReadStream(file)
fs.stat(file, function(e, stat) {
  if (e) {
    return console.log(e)
  }
  s3Client.putObject('mybucket', 'hello/remote_file.zip', 'application/octet-stream', stat.size, fileStream, function(e) {
    return console.log(e) // should be null
  })
})
EOF
putObject() here is a fully managed single function call; for file sizes over 5MB it automatically does a multipart upload internally. You can also resume a failed upload, and it will start from where it left off by verifying previously uploaded parts.
So you don't necessarily have to go through the trouble of writing lower-level multipart calls.
Additionally, this library is isomorphic and can be used in browsers as well.
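If you do stay with the lower-level multipart calls, the size check suggested above can be done before anything is uploaded. The sketch below, with a hypothetical helper partRanges, computes the byte range for each part and confirms that the slices add up to exactly the original size, with no gaps or overlap. It assumes the whole file is already in an ArrayBuffer, as in the question. Keep in mind that S3 part numbers start at 1 and that every part except the last must be at least 5 MiB.

```javascript
// Returns [{ PartNumber, start, stop }, ...] covering [0, totalSize) exactly once.
function partRanges(totalSize, chunkSize) {
  const parts = [];
  for (let start = 0, partNumber = 1; start < totalSize; start += chunkSize, partNumber += 1) {
    parts.push({
      PartNumber: partNumber, // S3 part numbers start at 1
      start,
      stop: Math.min(start + chunkSize, totalSize), // last part may be shorter
    });
  }
  return parts;
}

const MiB = 1024 * 1024;
const buffer = new ArrayBuffer(12 * MiB); // stand-in for reader.result
const parts = partRanges(buffer.byteLength, 5 * MiB);

// Sum the sizes of the slices that would be sent as each part's Body.
const uploadedBytes = parts.reduce(
  (sum, p) => sum + buffer.slice(p.start, p.stop).byteLength,
  0
);
console.log(parts.length, uploadedBytes === buffer.byteLength); // 3 true
```

If the total does not match, or a loop index starting at 0 is used as PartNumber, the assembled object will differ from the local file.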
