FTP in AWS Lambda - Issues Downloading Files (Async/Await) - javascript

I have been struggling with various FTP Node modules to get anything working in AWS Lambda. The best and most popular seems to be "basic-ftp", which also supports async/await. But I cannot get it to download files as soon as any code is added beneath the FTP function.
I don't want to move the fs functions into the FTP async function: I need to understand what breaks when code is added below it, and I have other code to add later that works with the downloaded file and its content:
FTP SUCCESS - When the async function is used only with no fs code beneath it
FTP FAILURE - Adding the fs readdir/readFile functions or any other code below
ERROR Error: ENOENT: no such file or directory, open '/tmp/document.txt'
https://github.com/patrickjuchli/basic-ftp
const ftp = require("basic-ftp");
const fs = require("fs");
var FileNameWithExtension = "document.txt";
var ftpTXT;
exports.handler = async (event, context, callback) => {
example();
async function example() {
const client = new ftp.Client();
//client.ftp.verbose = true;
try {
await client.access({
host: host,
user: user,
password: password,
//secure: true
});
console.log(await client.list());
await client.download(fs.createWriteStream('/tmp/' + FileNameWithExtension), FileNameWithExtension);
}
catch (err) {
console.log(err);
}
client.close();
}
// Read the content from the /tmp/ directory to check FTP was successful
fs.readdir("/tmp/", function (err, data) {
if (err) {
return console.error("There was an error listing the /tmp/ contents.");
}
console.log('Contents of AWS Lambda /tmp/ directory: ', data);
});
// Read TXT file and convert into string format
fs.readFile('/tmp/' + FileNameWithExtension, 'utf8', function (err, data) {
if (err) throw err;
ftpTXT = data;
console.log(ftpTXT);
});
// Do other Node.js coding with the downloaded txt file and its contents
};

The problem is that you define an async function inside your handler but never await it. Since example() is async, it returns a Promise; without an await it is fire-and-forget, so the fs calls below it run before the download has finished, which is why readFile fails with ENOENT. Also, your Lambda is terminated before your callbacks fire, so even if the download completed you would not see the output.
I suggest wrapping your callbacks in Promises so you can easily await them from your handler function.
I have managed to make it work. I used https://dlptest.com/ftp-test/ for testing, so change the connection details accordingly. Note that I upload the file myself: to replicate this example, create a readme.txt in the root of your project and it will be uploaded. If you already have a readme.txt on your FTP server, just delete the line that uploads the file.
Here's a working example:
const ftp = require("basic-ftp");
const fs = require("fs");
const FileNameWithExtension = "readme.txt";
module.exports.hello = async (event) => {
const client = new ftp.Client();
try {
await client.access({
host: 'ftp.dlptest.com',
user: 'dlpuser#dlptest.com',
password: 'puTeT3Yei1IJ4UYT7q0r'
});
console.log(await client.list());
await client.upload(fs.createReadStream(FileNameWithExtension), FileNameWithExtension)
await client.download(fs.createWriteStream('/tmp/' + FileNameWithExtension), FileNameWithExtension);
}
catch (err) {
console.log('logging err')
console.log(err);
}
client.close();
console.log(await readdir('/tmp/'))
console.log(await readfile('/tmp/', FileNameWithExtension))
return {
statusCode: 200,
body: JSON.stringify({message: 'File downloaded successfully'})
}
};
const readdir = dir => {
return new Promise((res, rej) => {
fs.readdir(dir, function (err, data) {
if (err) {
return rej(err);
}
return res(data)
});
})
}
const readfile = (dir, filename) => {
return new Promise((res, rej) => {
fs.readFile(dir + filename, 'utf8', function (err, data) {
if (err) {
return rej(err);
}
return res(data)
})
})
}
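As an alternative to the hand-written readdir and readfile wrappers, Node's built-in fs.promises API already returns Promises. The following is a minimal sketch, not part of the original answer: inspectDownload and demoRead are hypothetical names, and a temporary directory stands in for Lambda's /tmp so the snippet can run anywhere.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical helper: lists a directory and reads one file from it,
// using the Promise-based fs API so both calls can simply be awaited.
async function inspectDownload(dir, filename) {
  const entries = await fs.promises.readdir(dir);
  const content = await fs.promises.readFile(path.join(dir, filename), "utf8");
  return { entries, content };
}

// Usage: write a stand-in for the downloaded file, then read it back.
async function demoRead() {
  const dir = await fs.promises.mkdtemp(path.join(os.tmpdir(), "ftp-demo-"));
  await fs.promises.writeFile(path.join(dir, "readme.txt"), "hello");
  return inspectDownload(dir, "readme.txt");
}
```

Inside the handler this would replace the two console.log lines with something like `console.log(await inspectDownload('/tmp/', FileNameWithExtension))`.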
My file contains nothing but 'hello', and that is what shows up in the function output and in the CloudWatch logs.
Do keep in mind that, in Lambda functions, /tmp is limited to 512 MB by default. You can see the limits in the AWS Lambda docs.

Related

How to pipe multiple Streams for uploading files/streams to Cloudinary or other storage provider in Nodejs & graphql-upload?

My apollo-server uses the graphql-upload package, which adds file upload support to GraphQL endpoints. But it only documents uploading single files, and we need multiple file upload support. I get the streams as an array, but whenever I call createReadStream for each file and pipe the streams to the Cloudinary uploader variable, it uploads only the last stream created rather than each one.
Code
// graphql resolver
const post = async (_, { post }, { isAuthenticated, user }) => {
if (!isAuthenticated) throw new AuthenticationError("User unauthorized");
const files = await Promise.all(post.files);
let file_urls = [];
const _uploadableFiles = cloudinary.uploader.upload_stream({ folder: "post_files" },
(err, result) => {
console.log("err:", err);
console.log("result:", result);
if (err) throw err;
file_urls.push({
url: result.secure_url,
public_id: result.public_id,
file_type: result.metadata,
});
return result;
}
);
files.forEach(async (file) => await file.createReadStream().pipe(_uploadableFiles));
.... other db related stuff
}
After that, I get the secure_url of the uploaded files from the callback of Cloudinary's upload_stream function. But it only gives me the properties of one stream, the last of them all. Please help me with this. Is there any way to pipe multiple streams?
Instead of creating a single upload stream, turn it into a factory function that returns a new upload stream on each call, so every file is piped into its own stream.
Use Array.prototype.map so that you get an array you can pass to Promise.all.
Each file is then uploaded through its own upload stream, and the generated file URL info is appended to file_urls in the success callback. Note that pipe() returns the destination stream, not a Promise, so each upload has to be wrapped in a Promise that settles in the upload callback. Once all of them are done, Promise.all resolves and the code can resume with the other db related stuff.
const post = async (_, { post }, { isAuthenticated, user }) => {
if (!isAuthenticated) throw new AuthenticationError("User unauthorized");
const files = await Promise.all(post.files);
let file_urls = [];
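// Factory: returns a fresh upload stream whose callback settles the given Promise
function createUploader(resolve, reject){
return cloudinary.uploader.upload_stream({ folder: "post_files" },
(err, result) => {
console.log("err:", err);
console.log("result:", result);
if (err) return reject(err);
file_urls.push({
url: result.secure_url,
public_id: result.public_id,
file_type: result.metadata,
});
resolve(result);
}
);
}
// pipe() returns the destination stream, not a Promise, so wrap each upload in one
await Promise.all( files.map((file) => new Promise((resolve, reject) => {
file.createReadStream().pipe(createUploader(resolve, reject));
})) ); //map instead of forEach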
//.... other db related stuff
}
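The one-stream-per-file pattern can be tried without a Cloudinary account. The sketch below is an illustration under stated assumptions: fakeUploadStream is a hypothetical stand-in that only mimics the callback shape of an upload stream (it is not the Cloudinary API), and uploadOne and uploadAll are made-up names.

```javascript
const { Readable, Writable } = require("stream");

// Stand-in for an upload stream (NOT the Cloudinary API): collects everything
// written to it and reports the result through a callback when the stream ends.
function fakeUploadStream(callback) {
  const chunks = [];
  return new Writable({
    write(chunk, _encoding, done) {
      chunks.push(chunk);
      done();
    },
    final(done) {
      callback(null, { body: Buffer.concat(chunks).toString() });
      done();
    },
  });
}

// One fresh destination stream per source, each wrapped in its own Promise.
function uploadOne(source) {
  return new Promise((resolve, reject) => {
    const uploader = fakeUploadStream((err, result) =>
      err ? reject(err) : resolve(result)
    );
    source.on("error", reject).pipe(uploader);
  });
}

// Resolves with one result per source, in the same order as the input array.
function uploadAll(sources) {
  return Promise.all(sources.map((source) => uploadOne(source)));
}
```

Because Promise.all preserves input order, the results line up with the files array even when the uploads complete in a different order.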

Why can my code run in a standard Node.js file, but not in an AWS Lambda Function?

What I'm trying to do is create a Lambda function that runs two commands on an EC2 instance. When I had trouble running this code in a Lambda function, I took the code out of the exports.handler() method and ran it in a standalone Node.js file on the same EC2 instance, and there it worked. The command I ran was 'node app.js'.
exports.handler = async (event) => {
const AWS = require('aws-sdk')
AWS.config.update({region:'us-east-1'});
var ssm = new AWS.SSM();
var params = {
DocumentName: 'AWS-RunShellScript', /* required */
InstanceIds: ['i-xxxxxxxxxxxxxxxx'],
Parameters: {
'commands': [
'mkdir /home/ec2-user/testDirectory',
'php /home/ec2-user/helloWorld.php'
/* more items */
],
/* '<ParameterName>': ... */
}
};
ssm.sendCommand(params, function(err, data) {
if (err) {
console.log("ERROR!");
console.log(err, err.stack); // an error occurred
}
else {
console.log("SUCCESS!");
console.log(data);
} // successful response
});
const response = {
statusCode: 200,
ssm: ssm
};
return response;
};
I figured that it could have been a permissions-related issue, but the Lambda is a part of the same VPC that the EC2 instance is in.
You're trying to combine async/await with callbacks. That won't work in a Lambda (see "AWS Lambda Function Handler in Node.js" in the AWS docs). The reason it works locally, or in a Node server, is that the process is still running when the function exits, so the callback still fires. In a Lambda with an async handler (or one that returns a Promise), execution stops as soon as the handler's Promise resolves, so the callback never gets a chance to run.
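The difference can be reproduced without AWS. In this minimal sketch, sendCommandLike is a hypothetical callback-style function standing in for a call such as ssm.sendCommand, and both handler names are made up for illustration. The first handler returns before the callback has run; the second awaits a promisified version, so its Promise stays pending until the callback has fired.

```javascript
const { promisify } = require("util");

// Hypothetical callback-style API standing in for a call such as ssm.sendCommand.
function sendCommandLike(params, callback) {
  setTimeout(() => callback(null, { commandId: "demo-" + params.name }), 10);
}

// Fire-and-forget: the async handler returns before the callback has run.
async function handlerWithoutAwait(event) {
  let result = "callback has not run yet";
  sendCommandLike(event, (err, data) => { result = data; });
  return result;
}

// Awaited: the handler's Promise stays pending until the callback has fired.
const sendCommandAsync = promisify(sendCommandLike);
async function handlerWithAwait(event) {
  const data = await sendCommandAsync(event);
  return data.commandId;
}
```

util.promisify works for any function that follows the Node convention of taking an (err, data) callback as its last argument.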
Solution based on Jason's Answer:
const AWS = require('aws-sdk');
const ssm = new AWS.SSM();
exports.handler = async (event,context) => {
AWS.config.update({region:'us-east-1'});
const params = {
DocumentName: 'AWS-RunShellScript', /* required */
InstanceIds: ['i-xxxxxxxxxxxxxx'],
Parameters: {
'commands': [
'mkdir /home/ec2-user/testDirectory',
'php /home/ec2-user/helloWorld.php'
/* more items */
],
/* '<ParameterName>': ... */
}
};
// Wrap the callback API in a Promise so the handler can await it
const ssmPromise = new Promise ((resolve, reject) => {
ssm.sendCommand(params, function(err, data) {
if (err) {
console.log("ERROR!");
console.log(err, err.stack); // an error occurred
reject(err);
}
else {
console.log("SUCCESS!");
console.log(data);
resolve(data);
} // successful response
});
});
// Await the Promise so the handler does not return before the call completes
const data = await ssmPromise;
const response = {
statusCode: 200,
ssm: data // return the command result rather than the SSM client object
};
return response;
};

How to assure that the Express response comes after the File System terminates the write stream?

In an Express.js API I'm creating a zip file that stores a collection of PDFs and is intended to be sent as a download.
I create the zip file with the yazl package, following its README, and that works well. The problem comes when I pipe the output into fs.createWriteStream, because I don't know how to properly wait until it is finished.
Then in my Express route I want to send the file, but this code is executed before the write stream is finished...
This is a piece of a Promise function named renderReports inside my repository.js file. After I write the PDF files, I use a loop to add them to yazl's zipFile, then I create the zip with fs.createWriteStream:
const renderFilePromises = renderResults.map((renderedResult, index) =>
writeFile(`./temporal/validatedPdfs/${invoices[index].id}.pdf`, renderedResult.content)
);
await Promise.all(renderFilePromises);
const zipfile = new yazl.ZipFile();
invoices.map((invoice, index) => {
zipfile.addFile(`./temporal/validatedPdfs/${invoice.id}.pdf`, `${invoice.id}.pdf`)
});
zipfile.outputStream.pipe(fs.createWriteStream("./temporal/output.zip").on('close', () => {
console.log('...Done');
}));
zipfile.end();
resolve();
And the following code is how I use the promise
app.post('/pdf-report', async (req, res, next) => {
const { invoices } = req.body;
repository.renderReports(reporter, invoices)
.then(() => {
res.sendFile('output.zip', {
root: path.resolve(__dirname, './../../temporal/'),
dotfiles: 'deny',
}, (err) => {
if (err) {
console.log(err);
res.status(err.status).end();
}
else {
console.log('Sent:', 'output.zip');
}
});
})
.catch((renderErr) => {
console.error(renderErr);
res.header('Content-Type', 'application/json');
return res.status(501).send(renderErr.message);
});
});
I hope somebody can explain how to approach this
You need to store the write stream in a variable so you can access it. Then, on this variable, wait for the stream's finish event, which Node.js emits once the stream is done writing. Only call resolve() from inside that event handler.
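A minimal sketch of that idea, with hypothetical names (writeStreamToFile, demoWrite) and a plain readable stream standing in for zipfile.outputStream so it runs without yazl:

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");
const { Readable } = require("stream");

// Pipes a readable stream into a file and resolves once the write stream
// has emitted 'finish', i.e. after all data has been flushed to the file.
function writeStreamToFile(readable, filePath) {
  return new Promise((resolve, reject) => {
    const output = fs.createWriteStream(filePath); // keep a reference to the stream
    output.on("finish", () => resolve(filePath));
    output.on("error", reject);
    readable.on("error", reject);
    readable.pipe(output);
  });
}

// Usage with a stand-in source; in the zip case the source would be zipfile.outputStream.
async function demoWrite() {
  const target = path.join(os.tmpdir(), `output-${process.pid}.zip`);
  await writeStreamToFile(Readable.from(["fake zip bytes"]), target);
  return fs.promises.readFile(target, "utf8");
}
```

In renderReports, the call to resolve() would move into the finish handler (or renderReports would await a helper like this one), so the Express route only calls res.sendFile once the file is complete.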

Download file via FTP, write to /tmp/ and output .txt contents to the console with AWS Lambda

I am using just a single Node package, basic-ftp, to try to download a TXT file and write its contents to the console. Further down the line I will be editing the text, so I will need to use fs. I am just struggling to work with the output from createWriteStream within the FTP code.
Can anyone help me write a TXT file to the /tmp/ directory within AWS Lambda, and show the correct syntax to open and edit the file after createWriteStream has been used?
var fs = require('fs');
const ftp = require("basic-ftp")
var path = require('path');
exports.handler = (event, context, callback) => {
var fullPath = "/home/example/public_html/_uploads/15_1_5c653e6f6780f.txt"; // File Name FULL PATH -------
const extension = path.extname(fullPath); // Used to calculate filenames below
const wooFileName = path.basename(fullPath, extension); // Uploaded filename with no path or extension eg. filename
const myFileNameWithExtension = path.basename(fullPath); // Uploaded filename with the file extension eg. filename.txt
const FileNameWithExtension = path.basename(fullPath); // Uploaded filename with the file extension eg. filename.txt
example()
async function example() {
const client = new ftp.Client()
client.ftp.verbose = true
try {
await client.access({
host: "XXXX",
user: "XXXX",
password: "XXXX",
//secure: true
})
await client.download(fs.createWriteStream('./tmp/' + myFileNameWithExtension), myFileNameWithExtension)
}
catch(err) {
console.log(err)
}
client.close()
}
//Read the content from the /tmp directory to check it's empty
fs.readdir("/tmp/", function (err, data) {
console.log(data);
console.log('Contents of AWS Lambda /tmp/ directory');
});
/*
downloadedFile = fs.readFile('./tmp/' + myFileNameWithExtension)
console.log(downloadedFile)
console.log("Raw text:\n" + downloadedFile.Body.toString('ascii'));
*/
}
Pretty sure your fs.createWriteStream() has to use an absolute path to /tmp in Lambdas. Your actual working directory is /var/task, not /, so './tmp/' points to /var/task/tmp.
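To see why the leading dot matters, here is a small sketch that can be run anywhere; the file name is only an example. A relative path is resolved against process.cwd(), while an absolute path is the same regardless of where the process was started.

```javascript
const path = require("path");

// A relative path is resolved against the current working directory...
const relativeTarget = path.resolve("./tmp/document.txt");
// ...while an absolute path is the same no matter where the process runs.
const absoluteTarget = path.posix.join("/tmp", "document.txt");

console.log("cwd:     ", process.cwd());
console.log("relative:", relativeTarget);
console.log("absolute:", absoluteTarget);
```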
Also, if you're using fs.createWriteStream() you'll need to wait for the finish event before reading from the file. Something like this...
async function example() {
var finalData = '';
const client = new ftp.Client()
client.ftp.verbose = true
try {
await client.access({
host: "XXXX",
user: "XXXX",
password: "XXXX",
//secure: true
})
let writeStream = fs.createWriteStream('/tmp/' + myFileNameWithExtension);
// Attach the listeners before starting the download so that a 'finish'
// emitted while the download is running is not missed
const written = new Promise((resolve, reject) => {
writeStream
.on('finish', resolve)
.on('error', reject);
});
await Promise.all([
client.download(writeStream, myFileNameWithExtension),
written
]);
// The file is complete now, so it is safe to read it back
finalData = await new Promise((resolve, reject) => {
fs.readFile("/tmp/" + myFileNameWithExtension, 'utf8', function (err, data) {
if (err) {
reject(err);
} else {
console.log('Contents of the downloaded file:', data);
resolve(data);
}
});
});
}
catch(err) {
console.log(err)
}
client.close();
return finalData;
}
You'll also need to read the file using fs.readFile(). The fs.readdir() call you were using gives you a list of the files in the directory, not a file's contents.
If you want to use readdir() you could do it like this, but as you can see it is redundant in your case. To handle errors I would suggest just handling the error event on the initial createWriteStream() instead of adding this extra overhead (added to the previous example)...
writeStream
.on('finish', ()=>{
fs.readdir('/tmp',(err, files)=> {
let saved = files.find(file => file === myFileNameWithExtension);
fs.readFile("/tmp/"+saved, function (err, data) {
if (err) throw new Error();
console.log(data);
console.log('Contents of AWS Lambda /tmp/ directory');
});
})
})
.on('error', (err)=> {
console.log(err);
throw new Error();
})
NOTE: Please log the value of saved. fs.readdir() returns bare file names rather than absolute paths, so it should match myFileNameWithExtension directly.

Permissions trouble on AWS Lambda, can't spawn child process

So I've created this nice little lambda, which runs great locally, however not so much when actually out in the wild.
The lambda takes an event, with html in the event source, converts that html to a PDF (using the html-pdf node module), passes that pdf to an s3 bucket, and then hands back a signed url that expires in 60 seconds.
Or at least that is what ought to happen (again, works locally). When testing on Lambda, I get the following error:
{
"errorMessage": "spawn EACCES",
"errorType": "Error",
"stackTrace": [
"exports._errnoException (util.js:870:11)",
"ChildProcess.spawn (internal/child_process.js:298:11)",
"Object.exports.spawn (child_process.js:362:9)",
"PDF.PdfExec [as exec] (/var/task/node_modules/html-pdf/lib/pdf.js:87:28)",
"PDF.PdfToFile [as toFile] (/var/task/node_modules/html-pdf/lib/pdf.js:83:8)",
"/var/task/index.js:72:43",
"Promise._execute (/var/task/node_modules/bluebird/js/release/debuggability.js:272:9)",
"Promise._resolveFromExecutor (/var/task/node_modules/bluebird/js/release/promise.js:473:18)",
"new Promise (/var/task/node_modules/bluebird/js/release/promise.js:77:14)",
"createPDF (/var/task/index.js:71:19)",
"main (/var/task/index.js:50:5)"
]
}
Here's the code itself (not compiled, there's a handy gulp task for that)
if(typeof regeneratorRuntime === 'undefined') {
require("babel/polyfill")
}
import fs from 'fs'
import pdf from 'html-pdf'
import md5 from 'md5'
import AWS from 'aws-sdk'
import Promise from 'bluebird'
import moment from 'moment'
const tempDir = '/tmp'
const config = require('./config')
const s3 = new AWS.S3()
export const main = (event, context) => {
console.log("Got event: ", event)
AWS.config.update({
accessKeyId: config.awsKey,
secretAccessKey: config.awsSecret,
region: 'us-east-1'
})
const filename = md5(event.html) + ".pdf"
createPDF(event.html, filename).then(function(result) {
uploadToS3(filename, result.filename).then(function(result) {
getOneTimeUrl(filename).then(function(result) {
return context.succeed(result)
}, function(err) {
console.log(err)
return context.fail(err)
})
}, function(err) {
console.log(err)
return context.fail(err)
})
}, function(err) {
console.log(err)
return context.fail(err)
})
}
const createPDF = (html, filename) => {
console.log("Creating PDF")
var promise = new Promise(function(resolve, reject) {
pdf.create(html).toFile(filename, function(err, res) {
if (err) {
reject(err)
} else {
resolve(res)
}
})
})
return promise
}
const uploadToS3 = (filename, filePath) => {
console.log("Pushing to S3")
var promise = new Promise(function(resolve, reject) {
var fileToUpload = fs.createReadStream(filePath)
var expiryDate = moment().add(1, 'm').toDate()
var uploadParams = {
Bucket: config.pdfBucket,
Key: filename,
Body: fileToUpload
}
s3.upload(uploadParams, function(err, data) {
if(err) {
reject(err)
} else {
resolve(data)
}
})
})
return promise
}
const getOneTimeUrl = (filename) => {
var promise = new Promise(function(resolve, reject) {
var params = {
Bucket: config.pdfBucket,
Key: filename,
Expires: 60
}
s3.getSignedUrl('getObject', params, function(err, url) {
if (err) {
reject(err)
} else {
resolve(url)
}
})
})
return promise
}
Seems like a problem within html-pdf. I thought it might be a problem with PhantomJS (which html-pdf depends on) due to some reading I did here: https://engineering.fundingcircle.com/blog/2015/04/09/aws-lambda-for-great-victory/ , however, since Lambda has bumped the max zip size to 50mb, I don't have a problem uploading the binary.
Any thoughts?
html-pdf uses PhantomJS under the hood, which needs to compile some binaries when it is installed. I guess your problem is that you are deploying those locally compiled binaries, but Lambda needs binaries compiled on Amazon Linux.
You can solve this problem by building your deployment package on an EC2 instance that is running Amazon Linux and then, for example, deploying it directly from there, as explained in this tutorial.
Also check out this answer on a similar problem.
