How to download a large S3 file with AWS Lambda (javascript) - javascript

I have been struggling to get the following code to work properly. I am using the serverless framework and it works beautifully when invoked locally. I have a whole workflow where I take the downloaded file and process it before uploading back to S3. Ideally I would like to process line-by-line, but small steps first!
I can download a small file from S3 with no problems, but the file I am handling is much larger and shouldn't be in memory.
const AWS = require("aws-sdk");
const S3 = new AWS.S3();
const fs = require("fs");

module.exports = async function (event, context, callback) {
  let destWriteStream = fs.createWriteStream("/tmp/a.csv");
  let s3Input = { Bucket: "somebucket", Key: "somekey.csv" };

  await new Promise((resolve, reject) => {
    let s3Stream = S3.getObject(s3Input)
      .createReadStream()
      .on("close", function () {
        resolve("/tmp/a.csv");
      })
      .pipe(destWriteStream);
  }).then(d => {
    console.log(d);
  });

  callback(null, "good");
};
I have tried many different ways of doing this, including Promises. I am running Node 8.10. It just does not work when run as a Lambda; it simply times out.
My issue is that I see very few complete examples. Any help would be appreciated.

I managed to get this working. Lambda really does not do a good job with some of its error reporting.
The issue was that the bucket I was downloading from was in a different region than the one the Lambda was hosted in. Apparently this does not make a difference when running locally.
So ... for others that may tread here ... check your bucket locations relative to your Lambda region.
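For anyone else who lands here and wants a fuller example: below is a minimal sketch of the same download with the bucket's region passed explicitly to the S3 client, and with the promise resolved on the write stream's finish event instead of the read stream's close event. The bucket name, key, and region are placeholders.

// Minimal sketch (not the asker's exact code): explicit region plus resolving
// when the write stream has finished flushing to /tmp.
const AWS = require("aws-sdk");
const fs = require("fs");

// "eu-west-1" is a placeholder; use the region the bucket actually lives in.
const S3 = new AWS.S3({ region: "eu-west-1" });

module.exports.handler = async (event) => {
  const params = { Bucket: "somebucket", Key: "somekey.csv" };

  await new Promise((resolve, reject) => {
    const readStream = S3.getObject(params).createReadStream();
    const writeStream = fs.createWriteStream("/tmp/a.csv");

    readStream.on("error", reject);   // e.g. wrong region or missing key
    writeStream.on("error", reject);
    writeStream.on("finish", () => resolve("/tmp/a.csv"));

    readStream.pipe(writeStream);
  });

  return "good";
};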

Related

Downloading a zip file from a given path in express api + react

So I'm completely lost at this point. I have had a mixture of success and failure, but I can't for the life of me get this working. I'm building up a zip file and storing it in a folder structure based on uploadRequestIds, and that all works fine. I'm fairly new to Node, but all I want is to take the file that was built up (which is completely valid and opens fine once it's been constructed in the backend) and send it on to the client.
const prepareRequestForDownload = (dirToStoreRequestData, requestId) => {
  const output = fs.createWriteStream(dirToStoreRequestData + `/Content-${requestId}.zip`);
  const zip = archiver('zip', { zlib: { level: 9 } });

  output.on('close', () => { console.log('archiver has been finalized.'); });
  zip.on('error', (err) => { throw err; });

  zip.pipe(output);
  zip.directory(dirToStoreRequestData, false);
  zip.finalize();
}
This is my function that builds up a zip file from all the files in a given directory and then stores it in that same directory.
All I thought I would need to do was set some headers with an attachment disposition type and pipe a read stream of the zip file into the response, and then React would be able to save the content, but that just doesn't seem to be the case. How should this be handled on both the API side (reading the zip and sending it) and the React side (receiving the response and auto-downloading / prompting the user to save the file)?
There are a few strategies to resolve this. All browsers, when redirected to a URL ending in .zip, will normally start downloading. What you can do is return the download path to your client, something like:
http://api.site.com.br/my-file.zip
and then you can use:
window.open('URL here','_blank')
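If you would rather serve the zip straight from the Express API (the approach described in the question), a minimal sketch might look like the following. The route path and the directory constant are assumptions, and note that archiver's finalize() is asynchronous, so the zip should only be served after the output stream's 'close' event has fired.

// Hypothetical Express route that streams an already-built zip back to the
// client with an attachment disposition so the browser saves it.
const express = require('express');
const path = require('path');

const app = express();
const dirToStoreRequestData = '/tmp/requests'; // placeholder base directory

app.get('/download/:requestId', (req, res) => {
  const zipPath = path.join(dirToStoreRequestData, `Content-${req.params.requestId}.zip`);

  // res.download sets Content-Disposition: attachment and streams the file.
  res.download(zipPath, `Content-${req.params.requestId}.zip`, (err) => {
    if (err) console.error('Failed to send zip:', err);
  });
});

On the React side, pointing window.open (or a plain anchor tag) at that route is enough to trigger the save dialog, since the response carries an attachment disposition.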

Javascript promise call happens just one time

Some Context
I'm running a Node.js application that works with the Twit API.
I'm using it to periodically tweet some images on Twitter; to generate the images I'm using Processing (Java API).
So I'm using Node's child_process module to execute the command:
processing-java --sketch="%cd%"\\processing\\sketch --run
Then I'm using an event listener to know when Processing has closed, so I can load the generated image. To do this I'm using this JS promise:
File: generate-sketch-images.js
const {exec} = require('child_process');
const fs = require('fs');

module.exports = new Promise((resolve, reject) => {
  let cmd = `processing-java --sketch="%cd%"\\processing\\sketch --run`;
  console.log('running processing');

  exec(cmd).addListener('close', () => {
    resolve(fs.readFileSync(
      'processing/paints/paint.png',
      {encoding: 'base64'}
    ));
    console.log('image received from processing');
  });
});
Then I made this function:
File: bot.js
const generate_image = require('./utils/generate-sketch-images');

async function gerar_imagem(){
  console.log('generating the image');
  let img = await generate_image;
}
The problem
The 'gerar_imagem' function only works the first time.
When I do something like setInterval(gerar_imagem, 1000); it works just once.
I have already added a lot of console logs, and it seems like the function call happens but the promise is ignored.
Does someone know what's happening?
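For what it's worth, a likely cause: module.exports = new Promise(...) runs the exec call once, when the module is first required, and every later await generate_image just awaits that same already-settled promise, so Processing never runs again. A minimal sketch of one way around this, keeping the names from the question, is to export a function that builds a fresh promise on every call:

// generate-sketch-images.js: export a factory instead of a single shared promise
const { exec } = require('child_process');
const fs = require('fs');

module.exports = () => new Promise((resolve, reject) => {
  const cmd = `processing-java --sketch="%cd%"\\processing\\sketch --run`;
  console.log('running processing');

  exec(cmd).addListener('close', () => {
    // Read the freshly generated image each time the sketch finishes.
    resolve(fs.readFileSync('processing/paints/paint.png', { encoding: 'base64' }));
  });
});

The caller then invokes it on each run, i.e. let img = await generate_image(); inside gerar_imagem, so a new promise (and a new Processing run) is created per call.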

AWS S3 File Download from the client-side

I am currently trying to download a file from an S3 bucket using a button on the front-end. How is it possible to do this? I don't have any idea how to start. I have tried researching and researching, but no luck -- everything I have found is about UPLOADING files to the S3 bucket, not DOWNLOADING them. Thanks in advance.
NOTE: I am applying this to ReactJS (frontend) and NodeJS (backend); also, the file is uploaded using Webmerge.
UPDATE: I am trying to generate a download link with this (tried Node even though I'm not a backend dev, lol). The screenshots that accompanied this showed what I have tried so far and the onClick function.
If the file you are trying to download is not public, then you have to create a signed URL to get that file.
The solution is here: Javascript to download a file from amazon s3 bucket?
It covers getting non-public files, and revolves around creating a Lambda function that will generate a signed URL for you; you then use that URL to download the file on button click.
BUT if the file you are trying to download is public, then you don't need a signed URL; you just need to know the path to the file. The URLs are structured like: https://s3.amazonaws.com/[file path]/[filename]
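A minimal sketch of that signed-URL approach on the backend might look like this (the bucket, key, and expiry are placeholders; getSignedUrl is from the AWS SDK for JavaScript v2):

// Hypothetical Lambda handler that returns a presigned GET URL to the frontend.
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

module.exports.getDownloadUrl = async (event) => {
  const url = s3.getSignedUrl('getObject', {
    Bucket: 'my-bucket',          // placeholder bucket
    Key: 'documents/merged.pdf',  // placeholder key
    Expires: 60                   // link is valid for 60 seconds
  });

  return { statusCode: 200, body: JSON.stringify({ url }) };
};

The React button can then fetch this endpoint and open the returned URL (for example with window.open(url)), which triggers the download.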
There is also AWS Amplify, which is created and maintained by the AWS team.
Just follow Get Started, and downloading the file from your React app is as simple as:
Storage.get('hello.png', {expires: 60})
  .then(result => console.log(result))
  .catch(err => console.log(err));
Here is my solution:
let downloadImage = url => {
  // Assumes a path-style URL: https://s3.amazonaws.com/<bucket>/<folder>/<file>
  let urlArray = url.split("/")
  let bucket = urlArray[3]
  let key = `${urlArray[4]}/${urlArray[5]}`
  let s3 = new AWS.S3({ params: { Bucket: bucket }})
  let params = {Bucket: bucket, Key: key}
  s3.getObject(params, (err, data) => {
    // Wrap the object bytes in a Blob and trigger a client-side download.
    let blob = new Blob([data.Body], {type: data.ContentType});
    let link = document.createElement('a');
    link.href = window.URL.createObjectURL(blob);
    link.download = url;
    link.click();
  })
}
The url argument refers to the URL of the S3 file.
Just call this in the onClick handler of your button. You will also need the AWS SDK.
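For example, the wiring on the React side might look roughly like this (the region and file URL are placeholders; it assumes the downloadImage function above is in scope and that the SDK is configured with credentials allowed to call getObject):

// Hypothetical React usage of the downloadImage helper above.
import React from 'react';
import AWS from 'aws-sdk';

AWS.config.update({ region: 'us-east-1' }); // placeholder region; credentials configured elsewhere

const fileUrl = 'https://s3.amazonaws.com/my-bucket/folder/file.pdf'; // placeholder path-style URL

export default function DownloadButton() {
  return <button onClick={() => downloadImage(fileUrl)}>Download</button>;
}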

Stream array of remote files to amazon S3 in Node.js

I have an array of URLs to files that I want to upload to an Amazon S3 bucket. There are 2916 URLs in the array and the files have a combined size of 361MB.
I try to accomplish this using streams to avoid using too much memory. My solution works in the sense that all 2916 files get uploaded, but (at least some of) the uploads seem to be incomplete, as the total size of the uploaded files varies between 200MB and 361MB for each run.
// Relevant code below (part of a larger function)

/* Used dependencies and setup:
   const request = require('request');
   const AWS = require('aws-sdk');
   const stream = require('stream');
   AWS.config.loadFromPath('config.json');
   const s3 = new AWS.S3();
*/

function uploadStream(path, resolve) {
  const pass = new stream.PassThrough();
  const params = { Bucket: 'xxx', Key: path, Body: pass };
  s3.upload(params, (err, data) => resolve());
  return pass;
}

function saveAssets(basePath, assets) {
  const promises = [];
  assets.map(a => {
    const url = a.$.url;
    const key = a.$.path.substr(1);
    const localPromise = new Promise(
      (res, rej) => request.get(url).pipe(uploadStream(key, res))
    );
    promises.push(localPromise);
  });
  return Promise.all(promises);
}

saveAssets(basePath, assets).then(() => console.log("Done!"));
It's a bit messy with the promises, but I need to be able to tell when all files have been uploaded, and this part seems to work well at least (it writes "Done!" after ~25 secs when all promises are resolved).
I am new to streams so feel free to bash me if I approach this the wrong way ;-) Really hope I can get some pointers!
It seems I was trying to complete too many requests at once. Using async.eachLimit I now limit my code to a maximum of 50 concurrent requests, which is the sweet spot for me in terms of the trade-off between execution time, memory consumption and stability (all of the downloads complete every time!).
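A sketch of what that limiting can look like, reusing the uploadStream helper from the question (the limit of 50 and the exact callback wiring are just one possible arrangement):

// Sketch: cap concurrency with async.eachLimit instead of starting all uploads at once.
const async = require('async');

function saveAssets(basePath, assets) {
  return new Promise((resolve, reject) => {
    async.eachLimit(assets, 50, (a, done) => {
      const url = a.$.url;
      const key = a.$.path.substr(1);

      request.get(url)
        .on('error', done)
        .pipe(uploadStream(key, () => done())); // done() fires once s3.upload completes
    }, err => (err ? reject(err) : resolve()));
  });
}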

Upload large files as a stream to s3 with Plain Javascript using AWS-SDK-JS

There is a pretty nice example available for uploading large files to S3 via the aws-sdk-js library, but unfortunately it uses Node.js fs.
Is there a way we can achieve the same thing in plain JavaScript? Here is a nice Gist as well which breaks the large file down into smaller chunks; however, it is still missing the .pipe functionality of Node.js fs, which is required to pass a stream to the aws-sdk-js upload function. Here is a relevant code snippet in Node.
var fs = require('fs');
var zlib = require('zlib');

var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
var s3obj = new AWS.S3({params: {Bucket: 'myBucket', Key: 'myKey'}});

s3obj.upload({Body: body})
  .on('httpUploadProgress', function(evt) {
    console.log('Progress:', evt.loaded, '/', evt.total);
  })
  .send(function(err, data) { console.log(err, data); });
Is there something similar available in plain JS (non-Node.js)? Usable with Rails.
Specifically, an alternative to the following line in plain JS:
var body = fs.createReadStream('bigfile').pipe(zlib.createGzip());
The same link you provided contains an implementation intended for the Browser, and it also uses the AWS client SDK.
// Get our File object
var file = $('#file-chooser')[0].files[0];

// Upload the File
var bucket = new AWS.S3({params: {Bucket: 'myBucket'}});
var params = {Key: file.name, ContentType: file.type, Body: file};

bucket.upload(params, function (err, data) {
  $('#results').html(err ? 'ERROR!' : 'UPLOADED.');
});
** EDITS **
Note the documentation for the Body field includes Blob, which means streaming will occur:
Body — (Buffer, Typed Array, Blob, String, ReadableStream)
You can also use the Event Emitter convention in the client offered by the AWS SDK's ManagedUpload interface if you care to monitor progress. Here is an example:
var managed = bucket.upload(params);

managed.on('httpUploadProgress', function (bytes) {
  console.log('progress', bytes.total);
});

managed.send(function (err, data) {
  $('#results').html(err ? 'ERROR!' : 'UPLOADED.');
});
If you want to read the file from your local system in chunks before you send to s3.uploadPart, you'll want to do something with Blob.slice, perhaps defining a Pipe Chain.
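A rough sketch of that chunked approach with Blob.slice and the SDK's multipart-upload calls (the 5 MB part size and the sequential loop are assumptions; error handling and aborting a failed upload are omitted):

// Sketch: slice a File/Blob into parts and push each part with uploadPart.
async function multipartUpload(s3, bucket, key, file) {
  const partSize = 5 * 1024 * 1024; // S3's minimum part size is 5 MB (except the last part)
  const { UploadId } = await s3.createMultipartUpload({ Bucket: bucket, Key: key }).promise();
  const parts = [];

  for (let start = 0, partNumber = 1; start < file.size; start += partSize, partNumber++) {
    const chunk = file.slice(start, start + partSize); // Blob.slice
    const { ETag } = await s3.uploadPart({
      Bucket: bucket,
      Key: key,
      UploadId: UploadId,
      PartNumber: partNumber,
      Body: chunk
    }).promise();
    parts.push({ ETag: ETag, PartNumber: partNumber });
  }

  return s3.completeMultipartUpload({
    Bucket: bucket,
    Key: key,
    UploadId: UploadId,
    MultipartUpload: { Parts: parts }
  }).promise();
}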
