Unzip a zip file with JavaScript [duplicate] - javascript

I want to display OpenOffice files (.odt and .odp) on the client side in a web browser.
These are zipped files. I can fetch them from the server using Ajax, but I then have to unzip them in JavaScript. I have tried inflate.js (http://www.onicos.com/staff/iz/amuse/javascript/expert/inflate.txt), but without success.
How can I do this?

I wrote an unzipper in Javascript. It works.
It relies on Andy G.P. Na's binary file reader and some RFC1951 inflate logic from notmasteryet. I added the ZipFile class.
working example:
http://cheeso.members.winisp.net/Unzip-Example.htm (dead link)
The source:
http://cheeso.members.winisp.net/srcview.aspx?dir=js-unzip (dead link)
NB: the links are dead; I'll find a new host soon.
Included in the source is a ZipFile.htm demonstration page, and 3 distinct scripts, one for the zipfile class, one for the inflate class, and one for a binary file reader class. The demo also depends on jQuery and jQuery UI. If you just download the js-zip.zip file, all of the necessary source is there.
Here's what the application code looks like in Javascript:
// In my demo, this gets attached to a click event.
// it instantiates a ZipFile, and provides a callback that is
// invoked when the zip is read. This can take a few seconds on a
// large zip file, so it's asynchronous.
var readFile = function(){
    $("#status").html("<br/>");
    var url= $("#urlToLoad").val();
    var doneReading = function(zip){
        extractEntries(zip);
    };
    var zipFile = new ZipFile(url, doneReading);
};
// this function extracts the entries from an instantiated zip
function extractEntries(zip){
    $('#report').accordion('destroy');
    // clear
    $("#report").html('');
    var extractCb = function(id) {
        // this callback is invoked with the entry name, and entry text
        // in my demo, the text is just injected into an accordion panel.
        return (function(entryName, entryText){
            var content = entryText.replace(new RegExp( "\\n", "g" ), "<br/>");
            $("#"+id).html(content);
            $("#status").append("extract cb, entry(" + entryName + ") id(" + id + ")<br/>");
            $('#report').accordion('destroy');
            $('#report').accordion({collapsible:true, active:false});
        });
    }
    // for each entry in the zip, extract it.
    for (var i=0; i<zip.entries.length; i++) {
        var entry = zip.entries[i];
        var entryInfo = "<h4><a>" + entry.name + "</a></h4>\n<div>";
        // contrive an id for the entry, make it unique
        var randomId = "id-"+ Math.floor((Math.random() * 1000000000));
        entryInfo += "<span class='inputDiv'><h4>Content:</h4><span id='" + randomId +
            "'></span></span></div>\n";
        // insert the info for one entry as the last child within the report div
        $("#report").append(entryInfo);
        // extract asynchronously
        entry.extract(extractCb(randomId));
    }
}
The demo works in a couple of steps: the readFile function is triggered by a click and instantiates a ZipFile object, which reads the zip file. There's an asynchronous callback for when the read completes (this usually happens in less than a second for reasonably sized zips). In this demo the callback is held in the doneReading local variable, which simply calls extractEntries, which just blindly unzips all the content of the provided zip file. In a real app you would probably choose some of the entries to extract (allow the user to select, or choose one or more entries programmatically, etc).
The extractEntries function iterates over all entries and calls extract() on each one, passing a callback. Decompression of an entry takes time, maybe 1s or more for each entry in the zipfile, which means asynchrony is appropriate. The extract callback simply adds the extracted content to a jQuery accordion on the page. If the content is binary, then it gets formatted as such (not shown).
It works, but I think that the utility is somewhat limited.
For one thing: It's very slow. Takes ~4 seconds to unzip the 140k AppNote.txt file from PKWare. The same uncompress can be done in less than .5s in a .NET program. EDIT: The Javascript ZipFile unpacks considerably faster than this now, in IE9 and in Chrome. It is still slower than a compiled program, but it is plenty fast for normal browser usage.
For another: it does not do streaming. It basically slurps in the entire contents of the zipfile into memory. In a "real" programming environment you could read in only the metadata of a zip file (say, 64 bytes per entry) and then read and decompress the other data as desired. There's no way to do IO like that in javascript, as far as I know, therefore the only option is to read the entire zip into memory and do random access in it. This means it will place unreasonable demands on system memory for large zip files. Not so much a problem for a smaller zip file.
Also: It doesn't handle the "general case" zip file. There are lots of zip options that I didn't bother to implement in the unzipper, like ZIP encryption, WinZip encryption, zip64, UTF-8 encoded filenames, and so on. (EDIT - it handles UTF-8 encoded filenames now). The ZipFile class handles the basics, though. Some of these things would not be hard to implement. I have an AES encryption class in Javascript; that could be integrated to support encryption. Supporting Zip64 would probably be useless for most users of Javascript, as it is intended to support >4gb zipfiles, and you don't need to extract those in a browser.
I also did not test the case for unzipping binary content. Right now it unzips text. If you have a zipped binary file, you'd need to edit the ZipFile class to handle it properly. I didn't figure out how to do that cleanly. It does binary files now, too.
EDIT - I updated the JS unzip library and demo. It now does binary files, in addition to text. I've made it more resilient and more general - you can now specify the encoding to use when reading text files. Also the demo is expanded - it shows unzipping an XLSX file in the browser, among other things.
So, while I think it is of limited utility and interest, it works. I guess it would work in Node.js.
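For the Node.js case mentioned above, the same two-part structure (a zip reader that walks the archive's directory, plus an RFC 1951 inflater for each entry) can be sketched with Node's built-in zlib module doing the inflate step. This is only an illustrative sketch, not the ZipFile class described in this answer: the function names readZipEntries and extractEntry are made up for the example, and it shares the limitations listed above (no encryption, no zip64, whole archive held in memory).

```javascript
const zlib = require('zlib');

// Minimal zip reader sketch (hypothetical helper names).
// Handles stored (method 0) and deflated (method 8) entries only.
function readZipEntries(buf) {
  // Scan backwards for the End Of Central Directory signature.
  let eocd = -1;
  for (let i = buf.length - 22; i >= 0; i--) {
    if (buf.readUInt32LE(i) === 0x06054b50) { eocd = i; break; }
  }
  if (eocd < 0) throw new Error('not a zip file: EOCD record not found');

  const count = buf.readUInt16LE(eocd + 10); // total number of entries
  let pos = buf.readUInt32LE(eocd + 16);     // offset of the central directory
  const entries = [];
  for (let n = 0; n < count; n++) {
    if (buf.readUInt32LE(pos) !== 0x02014b50) {
      throw new Error('bad central directory header');
    }
    const method = buf.readUInt16LE(pos + 10);
    const compressedSize = buf.readUInt32LE(pos + 20);
    const nameLen = buf.readUInt16LE(pos + 28);
    const extraLen = buf.readUInt16LE(pos + 30);
    const commentLen = buf.readUInt16LE(pos + 32);
    const localOffset = buf.readUInt32LE(pos + 42);
    const name = buf.toString('utf8', pos + 46, pos + 46 + nameLen);
    entries.push({ name, method, compressedSize, localOffset });
    pos += 46 + nameLen + extraLen + commentLen;
  }
  return entries;
}

function extractEntry(buf, entry) {
  const p = entry.localOffset;
  if (buf.readUInt32LE(p) !== 0x04034b50) throw new Error('bad local header');
  // The local header repeats the name and extra field; skip past both.
  const start = p + 30 + buf.readUInt16LE(p + 26) + buf.readUInt16LE(p + 28);
  const data = buf.subarray(start, start + entry.compressedSize);
  if (entry.method === 0) return Buffer.from(data);         // stored
  if (entry.method === 8) return zlib.inflateRawSync(data); // raw deflate
  throw new Error('unsupported compression method ' + entry.method);
}
```

Typical use would be to read the archive into a Buffer (for example with fs.readFileSync), call readZipEntries on it, and then call extractEntry only for the entries you actually need.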

I'm using zip.js and it seems to be quite useful. It's worth a look!
Check the Unzip demo, for example.

I found jszip quite useful. So far I've used it only for reading, but it has create/edit capabilities as well.
Code-wise it looks something like this:
var new_zip = new JSZip();
new_zip.load(file);
new_zip.files["doc.xml"].asText() // this gives you the text in the file
One thing I noticed is that the file seems to have to be in binary stream format (read using the .readAsArrayBuffer of FileReader()); otherwise I was getting errors saying I might have a corrupt zip file.
Edit: Note from the 2.x to 3.0.0 upgrade guide:
The load() method and the constructor with data (new JSZip(data)) have
been replaced by loadAsync().
Thanks user2677034
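Following that upgrade note, a rough equivalent of the snippet above for JSZip 3.x might look like the sketch below. The helper name readTextEntry is hypothetical, and the JSZip constructor is passed in as a parameter so the sketch makes no assumption about how the library is loaded (a script tag global, or require('jszip') in Node.js). It uses loadAsync() in place of load(), and async('string') in place of asText().

```javascript
// Hypothetical helper: resolves with the text of one entry in a zip.
// `JSZip` is the JSZip 3.x constructor, `data` is e.g. an ArrayBuffer
// obtained from FileReader.readAsArrayBuffer().
async function readTextEntry(JSZip, data, entryName) {
  const zip = await JSZip.loadAsync(data); // replaces new JSZip(data) / load()
  const entry = zip.file(entryName);       // null when the entry is absent
  if (entry === null) {
    throw new Error('no such entry in zip: ' + entryName);
  }
  return entry.async('string');            // replaces asText()
}
```

In a browser this could be called as readTextEntry(JSZip, arrayBuffer, "doc.xml").then(function (text) { ... }).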

If you need to support other formats as well, or just need good performance, you can use this WebAssembly library (libarchive.js).
It's promise-based, it uses WebWorkers for threading, and the API is a simple ES module.
How to use
Install with npm i libarchive.js and use it as an ES module.
The library consists of two parts: an ES module and a webworker bundle. The ES module is your interface to the library; use it like any other module. The webworker bundle lives in the libarchive.js/dist folder, so you need to make sure it is available in your public folder, since it will not get bundled if you're using a bundler (it's all bundled up already), and pass the correct path to the Archive.init() method.
import {Archive} from 'libarchive.js/main.js';
Archive.init({
    workerUrl: 'libarchive.js/dist/worker-bundle.js'
});
document.getElementById('file').addEventListener('change', async (e) => {
    const file = e.currentTarget.files[0];
    const archive = await Archive.open(file);
    let obj = await archive.extractFiles();
    console.log(obj);
});
// outputs
{
    ".gitignore": {File},
    "addon": {
        "addon.py": {File},
        "addon.xml": {File}
    },
    "README.md": {File}
}

I wrote "Binary Tools for JavaScript", an open source project that includes the ability to unzip, unrar and untar: https://github.com/codedread/bitjs
Used in my comic book reader: https://github.com/codedread/kthoom (also open source).
HTH!

If anyone's reading images or other binary files from a zip file hosted at a remote server, you can use the following snippet to download the file and create a zip object using the jszip library.
// getStorageUrl is my own helper; it just gets the public url of the zip file.
let url = await getStorageUrl(path)
console.log('public url is', url)
// get the zip file to the client
const res = await axios.get(url, { responseType: 'arraybuffer' })
console.log('zip download status ', res.status)
// load contents into jszip and create an object
const zipObj = await jszip.loadAsync(new Blob([res.data], { type: 'application/zip' }))
$.each(zipObj.files, function (index, zipEntry) {
    console.log('filename', zipEntry.name)
})
Now, using the zipObj, you can access the files and create a src url for one of them.
var fname = 'myImage.jpg'
zipObj.file(fname).async('blob').then((blob) => {
    var blobUrl = URL.createObjectURL(blob)
    // blobUrl can now be used as the src of an image element
})

Related

Filenames containing diacritics are not displayed properly using Javascript in NodeJs

So, after having not programmed for 23 years, I decided to start learning Javascript.
I am trying to write a program to read through my music files and create a HTML page based on the files found in a specific directory.
It goes well until I hit filenames containing diacritics in it (like é, ü, ø etc).
For Example: André Hazes turns into : André Hazes
For Example: Andrea Bocelli & Sarah Brightman - Time to Say Goodbye [Con Te Partirò] (single) turn into Andrea Bocelli & Sarah Brightman - Time to Say Goodbye [Con Te PartiroÌ€] (single)
The link I have created doesn't work anymore
The command I use to create the HTML statement is:
<td>${item.vFilename}</td>
This is the code I use to read the files from the filesystem. I work on a Mac, OS Catalina, so basically a Unix variant.
// List all files in a directory in Node.js recursively in a synchronous fashion
var ReadDirFiles = function(pdir, pfilelist) {
    var files = vFileSystem.readdirSync(pdir, "utf-8");
    var filelist = pfilelist;
    files.forEach(function(file) {
        if (vFileSystem.statSync(pdir + '/' + file).isDirectory()) {
            filelist = ReadDirFiles(pdir + '/' + file, filelist);
        }
        else {
            var vstats = vFileSystem.statSync(pdir + '/' + file);
            // debug info
            // console.log(vstats);
            filelist.push({vFilename: file, vDir: pdir, vBirthtime: formatDate(vstats.birthtime), vSize: vstats.size});
        }
    });
    return filelist;
};
This is the statement I use to write the output to disk and it turns out the problem is in the write statement:
fs.writeFileSync(buildPathHtml.buildPathHtml(), html);
When the output is written back to disk, the conversion of the diacritics happens.
Does anyone know how to handle diacritics?
Try using encoding and decoding functions in your script. There are many functions doing this in Javascript, just use what you need. For a complete encoding/decoding you can use (or only copy & paste) into your script the code suggested in https://www.strictly-software.com/htmlencode/. There is a little encoding Javascript library doing the job (encoder.js).
Maybe your file system isn't utf8 based. I read somewhere it's Western for Mac.
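One possible explanation, not confirmed anywhere in this thread: the garbled names in the question look like decomposed characters (a plain "e" or "o" followed by a combining accent, as macOS may return from readdirSync) whose UTF-8 bytes are being displayed as a single-byte Western encoding. Under that assumption, a minimal sketch would recompose the names with String.prototype.normalize('NFC') and declare the charset in the generated page. The helper names toComposed and buildPage are made up for the example, and outputPath is a placeholder.

```javascript
// Recompose decomposed (NFD) names such as "e" + U+0301 into "é".
function toComposed(name) {
  return name.normalize('NFC');
}

// Wrap table rows in a page that tells the browser the file is UTF-8.
function buildPage(rowsHtml) {
  return '<!DOCTYPE html>\n<html>\n<head><meta charset="utf-8"></head>\n' +
    '<body><table>\n' + rowsHtml + '\n</table></body>\n</html>\n';
}

// "e" followed by a combining acute accent, the decomposed form assumed above.
const decomposed = 'Andre\u0301 Hazes';
const row = '<tr><td>' + toComposed(decomposed) + '</td></tr>';
const html = buildPage(row);

// Writing with an explicit encoding keeps the bytes on disk as UTF-8:
// require('fs').writeFileSync(outputPath, html, 'utf8'); // outputPath is a placeholder
```

If the page already declares UTF-8, the normalize step alone may be enough to make links built from the file names match again.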

Detecting Hard Links in Node.js

How can I tell if a file-system path is a hard link with Node.js? The function fs.lstat gives a stats object that, when given a hard link will return true for stats.isDirectory() and stats.isFile() respectively. fs.lstat doesn't offer up anything to note the difference between a normal file or directory and a linked one.
If my understanding of how linking (ln) works is correct, then a linked file points to the same place on the disk as the original file. This would mean that both the original and linked version are identical, and there is no way to tell the difference between the original file and the linked.
The functionality I'm looking for is as follows:
This is hypothetical pseudo-code for demonstration & communication purposes.
fs.writeFileSync('./file.txt', 'hello world')
fs.linkSync('./file.txt', './link.txt')
fs.isLinkSync('./file.txt') // => false
fs.isLinkSync('./link.txt') // => true
fs.linkChildrenSync('./file.txt') // => ['./link.txt']
fs.linkChildrenSync('./link.txt') // => []
fs.linkParentSync('./link.txt') // => './file.txt'
fs.linkParentSync('./file.txt') // => null
Alright.. just for fun...
You may have an option: finding the files by inode within a certain directory.
First grab the inode ID from the stats object:
fs.stat('./okay.file', function(err, stats){
    var inodeID = stats.ino; // Double check that this is correct
});
You can then iterate over all the files in the folder (see: Get all files in a directory) and check whether each file's inode ID matches. If none does, you can assume there is no link (in that directory, at least).
However, it doesn't look like we can search for a file by its inode ID. See: nodejs open nfs files by inode (or a the fastest way to reopen a file)
fs.lstat: https://nodejs.org/api/fs.html#fs_fs_lstat_path_callback
Stats object: https://nodejs.org/api/fs.html#fs_class_fs_stats
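The directory scan described above can be sketched as follows. This is a rough illustration under some assumptions: a POSIX-like file system where stats.ino and stats.dev are meaningful, and hypothetical helper names (hasOtherHardLinks, findHardLinksInDir) that are not part of the fs module. It also uses stats.nlink, the link count on the Stats object, as a cheap first check. Note that it cannot say which name is the "original", only that several names share one inode.

```javascript
const fs = require('fs');
const path = require('path');

// True when the path is a regular file whose inode has more than one name.
function hasOtherHardLinks(filePath) {
  const stats = fs.lstatSync(filePath);
  return stats.isFile() && stats.nlink > 1;
}

// List the other names in `dir` that refer to the same inode as `filePath`.
function findHardLinksInDir(filePath, dir) {
  const target = fs.statSync(filePath);
  const self = path.resolve(filePath);
  return fs.readdirSync(dir)
    .map((name) => path.join(dir, name))
    .filter((candidate) => {
      if (path.resolve(candidate) === self) return false; // skip the file itself
      const s = fs.lstatSync(candidate);                  // lstat: symlinks are not followed
      return s.isFile() && s.ino === target.ino && s.dev === target.dev;
    });
}
```

Because the scan only looks at one directory, a link count above one with an empty result just means the other names live somewhere else.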
Sorry, but that is not possible: you can't tell the difference between the original and a hard-linked file. They are the same on your Linux system and point to the same inode.

How to create file(.apk) from URL in Jaggery?

I have an application store, and each application has its own URL. I want to download the apks from those URLs to my Jaggery server. Although the code below (my first solution) creates myApp.apk successfully, the resulting file does not work properly.
First I tried the code below:
var url = "http://img.xxx.com/006/someApp.apk";
var data = get(url, {});
var file = new File("myApp.apk");
file.open("w");
file.write(data.data);
file.close();
When I print the data.data value, it looks like
I also tried,
var file = new File("http://img.xxx.com/006/someApp.apk");
file.saveAs("myApp.txt");
Can anyone help me?
.apk files are Android application files, and they are expected to start with PK, because they are actually zip archives!
They're not meant to be unzipped, although you can do it to see some of the application resources (but there are better tools for reverse engineering .apk files, such as Apktool, if that's what you're looking for).
According to the Jaggery documentation, file.write writes the String representation of the object to the file. That is why you are getting an apk file which cannot be installed.
However, you can make it work using copyURLToFile from the Apache commons-io Java library as follows, since Jaggery supports Java itself and all WSO2 products have the Apache commons-io library on their class path.
<%
var JFileUtils = Packages.org.apache.commons.io.FileUtils;
var JUrl = Packages.java.net.URL;
var JFile = Packages.java.io.File;
var url = new JUrl("http://img.xxx.com/006/someApp.apk");
JFileUtils.copyURLToFile(url, new JFile("myApp.apk"));
print("done");
%>
Your file will be stored in the $CARBON_HOME directory by default, unless you specify a relative or absolute path to the file.

How to fetch file content (basically read) a local file in javascript for UIAutomation iOS

Is there a way to read a local file in JavaScript?
MyFolder:
db.csv
Parse.js
I am trying to fetch the contents of db.csv from Parse.js, but in vain.
Can you share some links where I can learn how to read a file?
I am running Instruments in Xcode 5, with test scripts in a .js file, and I have to feed in some values from a .csv file.
For iOS UIAutomation, Apple provides an API for running a task on the target's host:
performTaskWithPathArgumentsTimeout
Using this, we can have a bash script print out the contents of the file we wanted to fetch in the first place.
The bash script can be as simple as this:
#! /bin/bash
FILE_NAME="$1"
cat "$FILE_NAME"
Save it as, for example, FileReader.sh.
And in your automation script,
var target = UIATarget.localTarget();
var host = target.host();
var result = host.performTaskWithPathArgumentsTimeout(executablePath,[filePath,fileName], 15);
UIALogger.logDebug("exitCode: " + result.exitCode);
UIALogger.logDebug("stdout: " + result.stdout);
UIALogger.logDebug("stderr: " + result.stderr);
where
executablePath is the executable used to run the command.
var executablePath = "/bin/sh";
filePath is the location of the FileReader.sh file created above. When executed, it writes the file's content to standard output (which is what we need).
[give full absolute path of the file]
fileName is the actual file to fetch contents from.
[give full absolute path of the file] In my case I had a Contents.csv file, which I had to read.
The last parameter is the timeout in seconds.
Hope this helps others trying to fetch contents (read files) for iOS UIAutomation.
References:
https://stackoverflow.com/a/19016573/344798
https://developer.apple.com/library/iOS/documentation/UIAutomation/Reference/UIAHostClassReference/UIAHost/UIAHost.html
If the file is on the same domain as the site you're in, you'd load it with Ajax. If you're using jQuery, it'd be something like
$.get('db.csv', function(csvContent){
    //process here
});
Just note that the path to the csv file will be relative to the web page you're in, not the JavaScript file.
If you're not using jQuery, you'd have to manually work with an XMLHttpRequest object to do your Ajax call.
And though your question doesn't (seem to) deal with it, if the file is located on a different domain, then you'd have to use either JSONP or CORS.
And, just in case this is your goal, no, you can't, in client side JavaScript open up some sort of Stream and read in a file. That would be a monstrous security vulnerability.
This is a fairly simple function in Illuminator's host functions library:
function readFromFile(path) {
    var result = target.host().performTaskWithPathArgumentsTimeout("/bin/cat", [path], 10);
    // be verbose if something didn't go well
    if (0 != result.exitCode) {
        throw new Error("readFromFile failed: " + result.stderr);
    }
    return result.stdout;
}
If you are using Illuminator, this is host().readFromFile(path).
