Use gunzip on a folder - javascript

Right, so I have a folder full of other folders, which are compressed into .gz files. Inside these folders are text files.
I want to have a program that loops through these text files to see if they contain a specific string, but to do so I need to uncompress them first. I don't want to start messing about with files on disk (unless I can create them temporarily and delete them after); I just want to perform operations on the contents of the .gz folder. I've tried zlib.Gunzip()._outBuffer.toString(), which gives a load of gibberish when used on a compressed folder.
How should I proceed?

Had to do something quite similar recently; here's what worked for me:
Basically you just read the file into a buffer, which you can then pass to the gunzip function. This returns another buffer on which you can invoke toString('utf8') to get the contents as a string, which is exactly what you need:
const util = require('util');
const fs = require('fs');
let { gunzip } = require('zlib');
gunzip = util.promisify(gunzip);

async function getStringFromGzipFile(inputFilePath) {
  const sourceBuffer = await fs.promises.readFile(inputFilePath);
  // gunzip resolves to a Buffer; convert it to a string
  return (await gunzip(sourceBuffer)).toString('utf8');
}

(async () => {
  const stringContent = await getStringFromGzipFile('/path/to/file');
  console.log(stringContent);
})();
EDIT:
If you want to gunzip and extract a directory, you can use tar-fs, which will extract the contents to a specified directory. Once you're done processing the files in it, you can just remove the directory. Here's how you would gunzip and extract a .tar.gz:
const fs = require('fs');
const zlib = require('zlib');
const tar = require('tar-fs');

function gunzipFolder(sourceFile, destination) {
  fs.createReadStream(sourceFile)
    .pipe(zlib.createGunzip())
    .pipe(tar.extract(destination));
}

Related

nodejs extract/decompress all content of .gz compressed files to directory

I have a file in the file system named temp.gz.
The temp gzip file contains files and folders.
I would like to extract them into a folder, like we would normally unzip a file in the OS.
I have tried the code below, but that's not meant for extracting files to a folder.
const unzip = zlib.createUnzip();
var input = fs.createReadStream(inFilename);   // /path/to/temp.gz
var output = fs.createWriteStream(outFilename); // /path/to/output/folder
The above didn't work, and I believe the reason is that it writes to a file while I have provided a directory.
My requirement is to extract the files to a directory.
zlib.gzip(responseData, (err, buffer) => {
  // Calling gunzip method
  zlib.gunzip(buffer, (err, buffer) => {
    console.log(buffer, pathFileName);
    fs.appendFileSync(pathFileName, buffer);
  });
});
Above, I was trying to unzip as per the docs and write the buffer to the output file, but it didn't create a folder as expected and didn't add the files there.
I'm sure I am missing something, but not sure what.
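For what it's worth, zlib alone only handles the compression layer; a single .gz that holds multiple files and folders is almost always a gzipped tar archive, so pairing zlib with a tar extractor (as in the tar-fs answer above) is the usual approach. A minimal sketch under that assumption; the paths are illustrative:

const fs = require('fs');
const zlib = require('zlib');
const tar = require('tar-fs'); // npm install tar-fs

// assumes temp.gz is really a gzipped tar archive (.tar.gz)
fs.createReadStream('/path/to/temp.gz')
  .pipe(zlib.createGunzip())
  .pipe(tar.extract('/path/to/output/folder'))
  .on('finish', () => console.log('done extracting'));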

How do I read a Huge Json file into a single object using NodeJS?

I'm upgrading a backend system that uses require('./file.json') to read a 1 GB JSON file into an object, and then passes that object to other parts of the system to be used.
I'm aware of two ways to read JSON files into an object:
const fs = require('fs');
const rawdata = fs.readFileSync('file.json');
const data = JSON.parse(rawdata);
and
const data = require('./file.json');
This works fine in older versions of Node (12) but not in newer versions (14 or 16).
So I need to find another way to get this 1 GB file.json into const data without running into the ERR_STRING_TOO_LONG / "Cannot create a string longer than 0x1fffffe8 characters" error.
I've seen examples on StackOverflow etc. on how to stream huge JSON files like this and break them down into smaller objects, processing them individually, but this is not what I'm looking for. I need it in one data object, so that the parts of the system that expect a single data object don't have to be refactored to handle a stream.
Note: the top-level object in the JSON file is not an array.
Using big-json solves this problem.
npm install big-json
const fs = require('fs');
const json = require('big-json');

const readStream = fs.createReadStream('file.json');
const parseStream = json.createParseStream();

parseStream.on('data', function(pojo) {
  // => receive reconstructed POJO
});

readStream.pipe(parseStream);
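With big-json the parse stream consumes the file in chunks and emits a single 'data' event carrying the fully reconstructed object, so no intermediate string ever has to hold the whole 1 GB file; that is what sidesteps the 0x1fffffe8-character limit on V8 strings that readFileSync + JSON.parse runs into.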
You need to stream it, i.e. process it in chunks instead of loading it all into memory at a single point in time.
const fs = require("fs");
const stream = fs.createReadStream("file.json");
stream.on("data", (data) => {
console.log(data.toString());
});
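Note that each 'data' chunk here is an arbitrary slice of bytes, not a complete JSON value, so on its own this only helps if the consumer can work with raw text fragments; reconstructing one object from the chunks still requires a streaming parser like big-json above.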

Is it possible to get an image from a message and add it to a folder in Discord.js?

Like, is it possible to do something like !image add [image file] and then add the attachment to a folder? I think I can do that with fs, but I'm not sure how.
You can use the fs functions fs.writeFile() or fs.writeFileSync(). These accept the path of the file to write to, and the data to write. In your case, the data should be a buffer or stream.
// const fs = require('fs');
fs.writeFileSync('./some_dir/some_file_name.extension', data);
To get the data in question, you should access Message#attachments, a collection of all attachments on the message. Assuming you only want the first, you can use Collection#first() to narrow down the results.
const attachment = message.attachments.first();
if (!attachment) {
  // maybe place in some error handling
}
Unfortunately, the MessageAttachment class doesn't actually hold a buffer/stream representing the attachment, only the URL leading to it. This means you'll need a third-party library such as axios or node-fetch.
// const fetch = require('node-fetch');
fetch(attachment.url)
  .then(res => res.buffer())
  .then(buffer => {
    fs.writeFileSync(`./images/${attachment.name}`, buffer);
  });
Make sure to validate that URL to make sure it's an image!
if (!/\.(png|jpe?g|svg)$/.test(attachment.url)) {
  // this attachment isn't an image!
  // we don't want to be downloading .exe files now, do we?
}
Finally, you should also be wary that if two files have the same name, such as image.png, writing the second one will overwrite the first. One way to overcome that issue is to add numerical suffixes to duplicates, such as image.png, image-1.png, image-2.png, etc. That could work out a little like this:
// const path = require('path');
fetch(attachment.url)
  .then(res => res.buffer())
  .then(buffer => {
    const ext = path.extname(attachment.name);
    const base = path.basename(attachment.name, ext);
    let filePath = `./images/${attachment.name}`;
    // increment the suffix every iteration until no file
    // by the same name exists, keeping the extension last
    for (let count = 1; fs.existsSync(filePath); count++) {
      filePath = `./images/${base}-${count}${ext}`;
    }
    fs.writeFileSync(filePath, buffer);
  });

Changing format of the file

Can I change the format of a file using the Native File System API? For example, can I read every .pdf file in a directory and convert it to a .jpeg file?
You can use a library like PDF.js to run the conversion as outlined in this gist. For the actual folder iteration, try this:
const dirHandle = await window.showDirectoryPicker();
for await (const entry of dirHandle.values()) {
  if (entry.kind === 'file' && entry.name.endsWith('.pdf')) {
    const file = await entry.getFile();
    // Convert to image as outlined in gist.
  }
}
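If you then want to save each converted image back into the same directory, the directory handle can create new files too. A minimal sketch, assuming a hypothetical convertPdfToJpeg(file) helper (e.g. built on the PDF.js gist) that resolves to a Blob:

// inside the loop above, after getting `file`
const jpegBlob = await convertPdfToJpeg(file); // hypothetical helper
const newName = entry.name.replace(/\.pdf$/, '.jpeg');
const newHandle = await dirHandle.getFileHandle(newName, { create: true });
const writable = await newHandle.createWritable();
await writable.write(jpegBlob);
await writable.close();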

Writing and reading to a file using streams

The code below writes whatever I type in the console to a file. It also simultaneously reads from the same file and displays whatever is in the file.
Everything I type in the console is saved in the file; I manually went and checked it. But whatever I type doesn't get displayed simultaneously.
const fs = require('fs');
const Wstream = fs.createWriteStream('./testfile.txt', {encoding: 'utf8'});
const Rstream = fs.createReadStream('./testfile.txt', {encoding: 'utf8'});
process.stdin.pipe(Wstream);
Rstream.pipe(process.stdout);
Why isn't this the same as the following?
process.stdin.pipe(process.stdout);
The read stream will be closed once the existing data in ./testfile.txt has been fully piped; it will not wait for additional writes or changes.
You can use fs.watch to listen for file changes, or even better use something easier like tail-stream or node-tail.
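A minimal sketch of the tail approach, assuming node-tail (published on npm as tail); the file must already exist:

const { Tail } = require('tail'); // npm install tail

// emits a 'line' event for every new line appended to the file
const tail = new Tail('./testfile.txt');
tail.on('line', (line) => console.log(line));
tail.on('error', (err) => console.error(err));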
