I am pulling down objects from S3. The objects are zipped, and I need to be able to unzip them and compare the contents with some strings. My problem is that I can't seem to get them properly unzipped. This is what I am seeing happen: S3 zipped -> over the wire -> to me as a JS Buffer -> ???
I am unsure of what to do next. I have seemingly tried everything, such as pako and lzutf8, to decompress the strings, but no dice.
here is an attempt with lzutf8:
lzutf8.decompress(buffer, { outputEncoding: "String" }, (result, error) => {
  if (error) console.log(error);
  if (result) console.log(result);
});
Here is an attempt with pako:
pako.ungzip(buffer, { to: "string" }, (result, error) => {
  if (error) console.log(error);
  if (result) console.log(result);
});
pako throws an "incorrect header check", and lzutf8 silently does nothing.
I am not married to these libraries, so if there is anything else that will do the job, I am happy to try anything. I am guessing that my problem might have something to do with the encoding types? Not sure though.
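Side note: from the pako docs, ungzip looks synchronous; it returns the decompressed data directly (or throws) rather than taking a callback, so the call should presumably look more like this (a sketch, assuming the buffer really does hold gzip data):

const pako = require('pako');

try {
  // ungzip is synchronous; with { to: 'string' } it returns a UTF-8 string
  const text = pako.ungzip(buffer, { to: 'string' });
  console.log(text);
} catch (err) {
  // "incorrect header check" here usually means the data is not actually gzip
  console.log(err);
}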
Here is what the relevant part of my code looks like:
let pako = require('pako');
let streamBuffers = require('stream-buffers');
let fs = require('fs');

let ws = fs.createWriteStream(process.cwd() + 'path-to-file');
let rs = new streamBuffers.ReadableStreamBuffer();

objects.forEach((obj) => {
  console.log(obj);
  rs.on("data", (data) => {
    ws.write(pako.ungzip(data));
  });
  rs.push(obj);
});
You can create a readable stream from an object in S3 with the AWS SDK's createReadStream method and then pipe that through a zlib.Gunzip transform stream:
var AWS = require('aws-sdk');
var zlib = require('zlib');
var fs = require('fs');

var s3 = new AWS.S3({apiVersion: '2006-03-01'});
var params = {Bucket: <bucket>, Key: <key>};
var file = fs.createWriteStream(<path/to/file>);

s3.getObject(params).createReadStream().pipe(zlib.createGunzip()).pipe(file);
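If the goal is to compare the contents with strings rather than write a file, the same gunzip stream can be collected into memory instead; a minimal sketch, assuming each S3 object is a single gzipped text file:

var zlib = require('zlib');

var chunks = [];
s3.getObject(params).createReadStream()
  .pipe(zlib.createGunzip())
  .on('data', function (chunk) { chunks.push(chunk); })
  .on('end', function () {
    var contents = Buffer.concat(chunks).toString('utf8');
    // compare `contents` with your strings here
  })
  .on('error', function (err) { console.log(err); });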
I have this issue with Node and Amazon S3 when it comes to SHA-256 hashing. I'm reading my files from the file system using fs.createReadStream(filename), so I get the file in chunks, and I push each chunk into an array. Each chunk consists of 1 * 1024 * 1024 bytes of data.
When the file has finished being read, in readstream.on('end') I loop through each value in the array and hash each chunk with SHA-256. While looping, I also push an axios promise for each chunk into an array, so that once I've finished looping through all the chunks and hashing each one, I can use Promise.all to send all the requests. The hash of each chunk is sent along with its request as a header.
The challenge I've been facing is that whenever a request is made, the SHA-256 that S3 calculates is completely different from what I have. I've tried to understand and solve this to no avail. Below is my code; what could I be doing wrong?
This is the error I'm getting:
Corrupted chunk received:
File corrupted: Expected SHA jakoz9d12xYjzpWVJQlqYdgPxAuF+LjZ9bQRg0hzmL8=, but calculated SHA 103f77f9b006d9b5912a0da167cf4a8cec60b0be017b8262cd00deb3183f3a8b
const Encryptsha256 = function(chunksTobeHashed) {
var crypto = require('crypto');
var hash = crypto.createHash('sha256').update(chunksTobeHashed).digest('base64')
return hash;
}
const upload = async function(uploadFile) {
var folderPath = uploadFile.filePath
var chunksArray = []
var uploadFileStream = fs.createReadStream(folderPath, { highWaterMark: 1 * 1024 * 1024, encoding:"base64" })
uploadFileStream.on('data', (chunk) => {
chunksArray.push(chunk)
// console.log('chunk is ', chunk)
})
uploadFileStream.on('error', (error) => {
console.log('error is ', error)
})
// file_id: "2fe44d18-fa94b201-2fe44d18b196-f9066e05a81c"
uploadFileStream.on('end', async() => {
//code to get the file id was here, but removed since it's not relevant to this question
var file_id = "2fe44d18-fa94b201-2fe44d18b196-f9066e05a81c"
let promises = [];
for (var i in chunksArray) {
var Content_SHA256 = Encryptsha256(chunksArray[i])
var payload = {
body: chunksArray[i],
}
promises.push(
axios.post(
`${baseURL}/home/url/upload/${file_id}/chunk/${i}`, payload, {
header: {
'Content-SHA256': Content_SHA256,
},
}
)
)
}
Promise.all(promises).then((response) => {
console.log('axios::', response)
})
.catch((error) => {
console.log('request error', error)
})
})
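For reference, hashing the raw bytes and hashing a base64 re-encoding of those same bytes gives completely different SHA-256 values, so whatever bytes I hash locally have to be byte-for-byte what the server hashes. A small standalone check (separate from the code above) makes that easy to see:

const crypto = require('crypto');

const raw = Buffer.from('hello world');      // the raw bytes
const reEncoded = raw.toString('base64');    // 'aGVsbG8gd29ybGQ='

const hashRaw = crypto.createHash('sha256').update(raw).digest('base64');
const hashEncoded = crypto.createHash('sha256').update(reEncoded).digest('base64');

console.log(hashRaw === hashEncoded); // false - different input bytes, different digests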
Using Node.js, I am trying to get an image from a URL and upload that image to another service without saving it to disk. I have the following code that works when saving the file to disk and using fs to create a readable stream. But since I am running this as a cron job on a read-only file system (webtask.io), I want to achieve the same result without saving the file to disk temporarily. Shouldn't that be possible?
request(image.Url)
.pipe(
fs
.createWriteStream(image.Id)
.on('finish', () => {
client.assets
.upload('image', fs.createReadStream(image.Id))
.then(imageAsset => {
resolve(imageAsset)
})
})
)
Do you have any suggestions of how to achieve this without saving the file to disk? The upload client will take the following
client.asset.upload(type: 'file' | 'image', body: File | Blob | Buffer | NodeStream, options = {}): Promise<AssetDocument>
Thanks!
How about passing the buffer down to the upload function? Since as per your statement it'll accept a buffer.
As a side note... This will keep it in memory for the duration of the method execution, so if you call this numerous times you might run out of resources.
request.get(url)
  .on('response', function (res) {
    var data = [];
    res.on('data', function (chunk) {
      data.push(chunk);
    }).on('end', function () {
      var buffer = Buffer.concat(data);
      // Pass the buffer to the upload client
      client.assets.upload('image', buffer);
    });
  });
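Alternatively, since the signature in the question says body can also be a NodeStream, you could skip buffering entirely and hand the request stream straight to the upload call; a sketch, assuming the client really does accept a readable stream:

// request() returns a readable stream of the response body,
// so it can be passed directly if NodeStream bodies are supported
client.assets.upload('image', request.get(url))
  .then(imageAsset => resolve(imageAsset));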
I tried various libraries, and it turns out that node-fetch provides a way to return a buffer. So this code works:
fetch(image.Url)
.then(res => res.buffer())
.then(buffer => client.assets
.upload('image', buffer, {filename: image.Id}))
.then(imageAsset => {
resolve(imageAsset)
})
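node-fetch also exposes the response body as a Node readable stream (res.body), so the same upload could be fed a stream instead of a buffer, assuming the client accepts a NodeStream as the signature in the question suggests:

fetch(image.Url)
  .then(res => client.assets
    // res.body is a readable stream of the image bytes
    .upload('image', res.body, {filename: image.Id}))
  .then(imageAsset => {
    resolve(imageAsset)
  })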
Well, I know it has been a few years since the question was originally asked, but I encountered this problem now, and since I didn't find an answer with a comprehensive example, I made one myself.
I'm assuming that the file path is a valid URL and that the end of it is the file name. I need to pass an API key to this endpoint, and a successful upload sends me back a response with a token.
I'm using node-fetch and form-data as dependencies.
const fetch = require('node-fetch');
const FormData = require('form-data');
const secretKey = 'secretKey';
const downloadAndUploadFile = async (filePath) => {
const fileName = new URL(filePath).pathname.split("/").pop();
const endpoint = `the-upload-endpoint-url`;
const formData = new FormData();
let jsonResponse = null;
try {
const download = await fetch(filePath);
const buffer = await download.buffer();
if (!buffer) {
console.log('file not found', filePath);
return null;
}
formData.append('file', buffer, fileName);
const response = await fetch(endpoint, {
method: 'POST', body: formData, headers: {
...formData.getHeaders(),
"Authorization": `Bearer ${secretKey}`,
},
});
jsonResponse = await response.json();
} catch (error) {
console.log('error on file upload', error);
}
return jsonResponse ? jsonResponse.token : null;
}
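Calling it then looks something like this (the URL is just a placeholder):

downloadAndUploadFile('https://example.com/files/report.pdf')
  .then((token) => {
    if (token) {
      console.log('upload succeeded, token:', token);
    } else {
      console.log('upload failed');
    }
  });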
I am currently in the process of creating a REST API for my personal website. I'd like to include some downloads, and I would like to offer the possibility of selecting multiple ones and downloading them as a zip file.
My first approach was pretty simple: an array of URLs, a request for each of them, zip them, send the zip to the user, delete it. However, I think that this approach is too dirty, considering there are things like streams around, which seem to be a good fit for this.
Now I've been experimenting and am currently struggling with the basic concept of working with streams and events across different scopes.
The following worked:
const r = request(url, options);
r.on('response', function(res) {
res.pipe(fs.createWriteStream('./file.jpg'));
});
From my understanding, r is an incoming stream in this scenario; I listen for the response event on it, and as soon as it occurs, I pipe it to a stream that I use to write to the file system.
My first step was to refactor this so it fits my case better, but I already failed here:
async function downloadFile(url) {
return request({ method: 'GET', uri: url });
}
Now I wanted to use a function which calls "downloadFile()" with different urls and save all those files to the disk using createWriteStream() again:
const urls = ['https://download1', 'https://download2', 'https://download3'];
urls.forEach(element => {
downloadFile(element).then(data => {
data.pipe(fs.createWriteStream('file.jpg'));
});
});
Using the debugger, I found out that the "response" event is non-existent on the data object; maybe that's already the issue? Moreover, I figured out that data.body contains the bytes of my downloaded document (a PDF in this case), so I wonder if I could just stream this to some other place?
After reading some Stack Overflow threads, I found the following module: archiver.
Reading this thread: Dynamically create and stream zip to client
#dankohn suggested an approach like this:
archive
.append(fs.createReadStream(file1), { name: 'file1.txt' })
.append(fs.createReadStream(file2), { name: 'file2.txt' });
This makes me assume I need to be able to extract a stream from my data object to proceed.
Am I on the wrong track here or am I getting something fundamentally wrong?
Using archiver seems to be a valid approach; however, it would be advisable to use streams when feeding large data from the web into the zip archive, since otherwise the whole archive data would need to be held in memory.
This example uses zip-stream, which has a simple API for adding entries directly from streams; for reading a stream from the web, request comes in handy.
Example
// npm install -s express zip-stream request
const request = require('request');
const ZipStream = require('zip-stream');
const express = require('express');
const app = express();
app.get('/archive.zip', (req, res) => {
var zip = new ZipStream()
zip.pipe(res);
var stream = request('https://loremflickr.com/640/480')
zip.entry(stream, { name: 'picture.jpg' }, err => {
if(err)
throw err;
})
zip.finalize()
});
app.listen(3000)
Update: Example for using multiple files
Adding an example which processes the next file in the callback function of zip.entry() recursively.
app.get('/archive.zip', (req, res) => {
var zip = new ZipStream()
zip.pipe(res);
var queue = [
{ name: 'one.jpg', url: 'https://loremflickr.com/640/480' },
{ name: 'two.jpg', url: 'https://loremflickr.com/640/480' },
{ name: 'three.jpg', url: 'https://loremflickr.com/640/480' }
]
function addNextFile() {
var elem = queue.shift()
var stream = request(elem.url)
zip.entry(stream, { name: elem.name }, err => {
if(err)
throw err;
if(queue.length > 0)
addNextFile()
else
zip.finalize()
})
}
addNextFile()
})
Using Async/Await
You can encapsulate it into a promise to use async/await like:
await new Promise((resolve, reject) => {
zip.entry(stream, { name: elem.name }, err => {
if (err) reject(err)
resolve()
})
})
zip.finalize()
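Putting it together, the queue from the example above can then be processed with a plain loop instead of recursion; a sketch, assuming the same zip-stream, request and express setup:

app.get('/archive.zip', async (req, res) => {
  const zip = new ZipStream();
  zip.pipe(res);

  const queue = [
    { name: 'one.jpg', url: 'https://loremflickr.com/640/480' },
    { name: 'two.jpg', url: 'https://loremflickr.com/640/480' }
  ];

  for (const elem of queue) {
    // wait for each entry to be fully appended before starting the next one
    await new Promise((resolve, reject) => {
      zip.entry(request(elem.url), { name: elem.name }, err => {
        if (err) reject(err);
        else resolve();
      });
    });
  }
  zip.finalize();
});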
I'm new to Node. For practice, I thought I'd develop a weather command-line application, but I ran into a problem with the AJAX request. I usually use jQuery's $.ajax, but it doesn't work here (I've tried requiring jquery). I've solved that problem with another module.
Now the problem is: when I write the JSON information to coords.json and then read it with the read-json module, there are "\" and "\n" characters everywhere in the string. I've tried to replace them with a regex and the fs module, but it doesn't rewrite the file... why?
Here is the full code:
// index.js
// modules
const program = require('commander');
const clear = require('clear');
const chalk = require('chalk');
const request = require('ajax-request');
const fs = require('fs');
const json = require('read-data').json;
const writeJson = require('write-json');
// Forecast.io Key
const key = "*************";
const freegeoip = "http://freegeoip.net/json/";
let latitude = 0,
longitude = 0 ;
// forecast.io api url
const url = `https://api.darksky.net/forecast/${key}/${latitude},${longitude}`;
// initialize myData with the freegeoip datas
let myData = request({
url: 'http://freegeoip.net/json/',
method: 'GET',
data: {
format: 'json'
},
}, function(err, res, body) {
writeJson('test.json', body, function(err) {
if (err) console.log(err);
});
});
fs.readFile('test.json', 'utf8', function (err,data) {
let result = data.replace(/[\\~#%&*<>?|\-]/g, '');
fs.writeFile('test.json', result, 'utf8', function (err) {
if (err) return console.log(err);
// if I do this, it's normal JSON
// console.log(result)
});
});
and the output in the file is:
// coords.json
"{\"ip\":\"**.**.**.**\",\"country_code\":\"IT\",\"country_name\":\"Italy\",\"region_code\":\"62\",\"region_name\":\"Latium\",\"city\":\"Rome\",\"zip_code\":\"00119\",\"time_zone\":\"Europe/Rome\",\"latitude\":**.*,\"longitude\":**.**\"metro_code\":0}\n"
but if I print it in the console, it's normal...
I really recommend that you use JSON.parse. It will parse your JSON and put it into a variable you can use:
fs.readFile('test.json', 'utf8', function (err, data) {
  data = JSON.parse(data); // now you can use anything from the JSON
});
The \ characters are there to escape the quotes so that they don't end the string. They shouldn't affect anything, and are actually necessary. Have you tried it without the regex? That could be breaking things if it actually removes the backslashes.
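If the goal is for the file itself to contain plain JSON rather than an escaped string, another option is to parse the body before handing it to write-json; a minimal sketch reusing the request and writeJson setup from the question, assuming the freegeoip body arrives as a JSON string:

request({
  url: 'http://freegeoip.net/json/',
  method: 'GET'
}, function (err, res, body) {
  if (err) return console.log(err);
  var coords = JSON.parse(body);                    // body is a JSON string, so parse it first
  writeJson('test.json', coords, function (err) {   // now the file contains a real object
    if (err) console.log(err);
  });
});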
I'm using the excellent Request library for downloading files in Node for a small command line tool I'm working on. Request works perfectly for pulling in a single file, no problems at all, but it's not working for ZIPs.
For example, I'm trying to download the Twitter Bootstrap archive, which is at the URL:
http://twitter.github.com/bootstrap/assets/bootstrap.zip
The relevant part of the code is:
var fileUrl = "http://twitter.github.com/bootstrap/assets/bootstrap.zip";
var output = "bootstrap.zip";
request(fileUrl, function(err, resp, body) {
  if (err) throw err;
  fs.writeFile(output, body, function(err) {
    console.log("file written!");
  });
});
I've tried setting the encoding to "binary" too but no luck. The actual zip is ~74KB, but when downloaded through the above code it's ~134KB and on double clicking in Finder to extract it, I get the error:
Unable to extract "bootstrap" into "nodetest" (Error 21 - Is a directory)
I get the feeling this is an encoding issue but not sure where to go from here.
Yes, the problem is with the encoding. When you wait for the whole transfer to finish, body is coerced to a string by default. You can tell request to give you a Buffer instead by setting the encoding option to null:
var fileUrl = "http://twitter.github.com/bootstrap/assets/bootstrap.zip";
var output = "bootstrap.zip";
request({url: fileUrl, encoding: null}, function(err, resp, body) {
if(err) throw err;
fs.writeFile(output, body, function(err) {
console.log("file written!");
});
});
Another more elegant solution is to use pipe() to point the response to a file writable stream:
request('http://twitter.github.com/bootstrap/assets/bootstrap.zip')
.pipe(fs.createWriteStream('bootstrap.zip'))
.on('close', function () {
console.log('File written!');
});
A one liner always wins :)
pipe() returns the destination stream (the WriteStream in this case), so you can listen to its close event to get notified when the file was written.
I was looking for a function that requests a zip and extracts it without creating any file on my server. Here is my TypeScript function; it uses the JSZip module and Request:
let bufs: Buffer[] = [];
let buf: Buffer;
request
.get(url)
.on('end', () => {
buf = Buffer.concat(bufs);
JSZip.loadAsync(buf).then((zip) => {
// zip.files contains the list of entries in the archive
// check the JSZip documentation
// Example of getting a text file: zip.file("bla.txt").async("text").then(...)
}).catch((error) => {
console.log(error);
});
})
.on('error', (error) => {
console.log(error);
})
.on('data', (d) => {
bufs.push(d);
})
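Once loadAsync resolves, reading the entries works roughly like this (the entry name is just an example):

JSZip.loadAsync(buf).then((zip) => {
  // list every entry in the archive
  Object.keys(zip.files).forEach((name) => console.log(name));

  // read a single entry as text (the entry name is hypothetical)
  const entry = zip.file('readme.txt');
  if (entry) {
    entry.async('string').then((content) => {
      console.log(content);
    });
  }
});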