File and Directory Entries API broken in Chrome? - javascript

I'm trying to use the File and Directory Entries API to create a file uploader tool that will allow me to drop an arbitrary combination of files and directories into a browser window, to be read and uploaded.
(I'm fully aware that similar functionality can be achieved by using a file input element with webkitdirectory enabled, but I'm testing a use case where the user isn't forced to put everything into a single folder.)
Using the Drag and Drop API, I've managed to read the DataTransfer items and convert them to FileSystemEntry objects using DataTransferItem.webkitGetAsEntry.
From there, I am able to tell whether the entry is a FileSystemFileEntry or a FileSystemDirectoryEntry. My plan, of course, is to recursively walk the directory structure, if any, which I should be able to do using the FileSystemDirectoryReader method readEntries, like this:
handleDrop(event) {
  event.preventDefault();
  event.stopPropagation();
  // assuming I dropped only one directory
  const directory = event.dataTransfer.items[0];
  const directoryEntry = directory.webkitGetAsEntry();
  const directoryReader = directoryEntry.createReader();
  directoryReader.readEntries(function(entries) {
    // callback: the "entries" param is an Array
    // containing the directory entries
  });
}
However, I'm running into the following issue: in Chrome, the readEntries method only returns 100 entries. Apparently this is the expected behavior; the way to obtain the remaining files in the directory is to call readEntries again. However, I'm finding this impossible to do. A subsequent call to the method throws the error:
DOMException: An operation that depends on state cached in an interface object was made but the state had changed since it was read from disk.
Does anyone know a way around this? Is this API hopelessly broken for directories of 100+ files in Chrome? Is this API deprecated? (Not that it was ever "precated".) In Firefox, readEntries returns the whole directory content at once, which is apparently against the spec, but at least it's usable.
Please advise.

Of course, as soon as I had posted this question the answer hit me. What I was trying to do was akin to the following:
handleDrop(event) {
  event.preventDefault();
  event.stopPropagation();
  // assuming I dropped only one directory
  const directory = event.dataTransfer.items[0];
  const directoryEntry = directory.webkitGetAsEntry();
  const directoryReader = directoryEntry.createReader();
  directoryReader.readEntries(function(entries) {
    // callback: the "entries" param is an Array
    // containing the directory entries
  });
  directoryReader.readEntries(function(entries) {
    // calling readEntries a second time
  });
}
The problem with this is that readEntries is asynchronous, so the second call fires while the reader is still "busy" with the first batch (I'm sure lower-level programmers have a better term for that). A better way of achieving what I was trying to do:
handleDrop(event) {
  event.preventDefault();
  event.stopPropagation();
  // assuming I dropped only one directory
  const directory = event.dataTransfer.items[0];
  const directoryEntry = directory.webkitGetAsEntry();
  const directoryReader = directoryEntry.createReader();

  function read() {
    directoryReader.readEntries(function(entries) {
      if (entries.length > 0) {
        // do something with the entries
        read(); // read the next batch
      } else {
        // do whatever needs to be done after
        // all files are read
      }
    });
  }

  read();
}
This way we ensure the FileSystemDirectoryReader is done with one batch before starting the next one.
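For completeness, the same batching loop can be wrapped in a promise so that all entries can be collected with async/await. A minimal sketch (readAllEntries is my own helper name, not part of the API):

// Hypothetical helper: wraps FileSystemDirectoryReader.readEntries in a
// promise and keeps calling it until it returns an empty batch.
function readAllEntries(directoryEntry) {
  const reader = directoryEntry.createReader();
  return new Promise((resolve, reject) => {
    const entries = [];
    function readBatch() {
      reader.readEntries((batch) => {
        if (batch.length === 0) {
          resolve(entries); // empty batch: we have everything
        } else {
          entries.push(...batch);
          readBatch(); // previous batch is done, safe to ask for the next
        }
      }, reject);
    }
    readBatch();
  });
}

// usage: const entries = await readAllEntries(directoryEntry);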

Related

Unzip a zip file with JavaScript [duplicate]

I want to display OpenOffice files (.odt and .odp) at the client side using a web browser.
These files are zipped. Using Ajax I can fetch them from the server, but they are still zipped, so I have to unzip them using JavaScript. I have tried inflate.js (http://www.onicos.com/staff/iz/amuse/javascript/expert/inflate.txt), but without success.
How can I do this?
I wrote an unzipper in Javascript. It works.
It relies on Andy G.P. Na's binary file reader and some RFC1951 inflate logic from notmasteryet. I added the ZipFile class.
working example:
http://cheeso.members.winisp.net/Unzip-Example.htm (dead link)
The source:
http://cheeso.members.winisp.net/srcview.aspx?dir=js-unzip (dead link)
NB: the links are dead; I'll find a new host soon.
Included in the source is a ZipFile.htm demonstration page, and 3 distinct scripts, one for the zipfile class, one for the inflate class, and one for a binary file reader class. The demo also depends on jQuery and jQuery UI. If you just download the js-zip.zip file, all of the necessary source is there.
Here's what the application code looks like in Javascript:
// In my demo, this gets attached to a click event.
// It instantiates a ZipFile, and provides a callback that is
// invoked when the zip is read. This can take a few seconds on a
// large zip file, so it's asynchronous.
var readFile = function() {
  $("#status").html("<br/>");
  var url = $("#urlToLoad").val();
  var doneReading = function(zip) {
    extractEntries(zip);
  };
  var zipFile = new ZipFile(url, doneReading);
};
// this function extracts the entries from an instantiated zip
function extractEntries(zip) {
  $('#report').accordion('destroy');
  // clear
  $("#report").html('');
  var extractCb = function(id) {
    // this callback is invoked with the entry name and entry text;
    // in my demo, the text is just injected into an accordion panel.
    return (function(entryName, entryText) {
      var content = entryText.replace(new RegExp("\\n", "g"), "<br/>");
      $("#" + id).html(content);
      $("#status").append("extract cb, entry(" + entryName + ") id(" + id + ")<br/>");
      $('#report').accordion('destroy');
      $('#report').accordion({collapsible: true, active: false});
    });
  };
  // for each entry in the zip, extract it.
  for (var i = 0; i < zip.entries.length; i++) {
    var entry = zip.entries[i];
    var entryInfo = "<h4><a>" + entry.name + "</a></h4>\n<div>";
    // contrive an id for the entry, make it unique
    var randomId = "id-" + Math.floor((Math.random() * 1000000000));
    entryInfo += "<span class='inputDiv'><h4>Content:</h4><span id='" + randomId +
      "'></span></span></div>\n";
    // insert the info for one entry as the last child within the report div
    $("#report").append(entryInfo);
    // extract asynchronously
    entry.extract(extractCb(randomId));
  }
}
The demo works in a couple of steps: the readFile fn is triggered by a click and instantiates a ZipFile object, which reads the zip file. There's an asynchronous callback for when the read completes (it usually happens in less than a second for reasonably sized zips); in this demo the callback is held in the doneReading local variable, which simply calls extractEntries, which blindly unzips all of the content of the provided zip file. In a real app you would probably choose some of the entries to extract (allow the user to select, or choose one or more entries programmatically, etc.).
The extractEntries fn iterates over all entries and calls extract() on each one, passing a callback. Decompression of an entry takes time, maybe 1s or more per entry in the zipfile, which means asynchrony is appropriate. The extract callback simply adds the extracted content to a jQuery accordion on the page. If the content is binary, then it gets formatted as such (not shown).
It works, but I think that the utility is somewhat limited.
For one thing: It's very slow. Takes ~4 seconds to unzip the 140k AppNote.txt file from PKWare. The same uncompress can be done in less than .5s in a .NET program. EDIT: The Javascript ZipFile unpacks considerably faster than this now, in IE9 and in Chrome. It is still slower than a compiled program, but it is plenty fast for normal browser usage.
For another: it does not do streaming. It basically slurps in the entire contents of the zipfile into memory. In a "real" programming environment you could read in only the metadata of a zip file (say, 64 bytes per entry) and then read and decompress the other data as desired. There's no way to do IO like that in javascript, as far as I know, therefore the only option is to read the entire zip into memory and do random access in it. This means it will place unreasonable demands on system memory for large zip files. Not so much a problem for a smaller zip file.
Also: It doesn't handle the "general case" zip file - there are lots of zip options that I didn't bother to implement in the unzipper - like ZIP encryption, WinZip encryption, zip64, UTF-8 encoded filenames, and so on. (EDIT - it handles UTF-8 encoded filenames now). The ZipFile class handles the basics, though. Some of these things would not be hard to implement. I have an AES encryption class in Javascript; that could be integrated to support encryption. Supporting Zip64 would probably be useless for most users of Javascript, as it is intended to support >4GB zipfiles - no need to extract those in a browser.
I also did not test the case for unzipping binary content. Right now it unzips text. If you have a zipped binary file, you'd need to edit the ZipFile class to handle it properly. I didn't figure out how to do that cleanly. It does binary files now, too.
EDIT - I updated the JS unzip library and demo. It now does binary files, in addition to text. I've made it more resilient and more general - you can now specify the encoding to use when reading text files. Also the demo is expanded - it shows unzipping an XLSX file in the browser, among other things.
So, while I think it is of limited utility and interest, it works. I guess it would work in Node.js.
I'm using zip.js and it seems to be quite useful. It's worth a look!
Check the Unzip demo, for example.
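For reference, reading an archive with the current @zip.js/zip.js package looks roughly like this; the class names follow its documented API, but double-check the project docs before relying on this sketch:

import { ZipReader, BlobReader, TextWriter } from '@zip.js/zip.js';

async function readFirstEntry(blob) {
  const reader = new ZipReader(new BlobReader(blob));
  const entries = await reader.getEntries(); // one entry per archived file
  const text = await entries[0].getData(new TextWriter()); // entry content as text
  await reader.close();
  return text;
}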
I found jszip quite useful. I've used it so far only for reading, but it has create/edit capabilities as well.
Code-wise it looks something like this:
var new_zip = new JSZip();
new_zip.load(file);
new_zip.files["doc.xml"].asText(); // this gives you the text in the file
One thing I noticed is that the file seems to have to be in binary stream format (read using the .readAsArrayBuffer method of FileReader()); otherwise I was getting errors saying I might have a corrupt zip file.
Edit: Note from the 2.x to 3.0.0 upgrade guide:
The load() method and the constructor with data (new JSZip(data)) have
been replaced by loadAsync().
Thanks user2677034
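Under 3.x the same read becomes fully asynchronous; roughly like this (assuming file is the Blob/ArrayBuffer you already have):

JSZip.loadAsync(file).then(function (zip) {
  // async('string') is the 3.x replacement for asText()
  return zip.file('doc.xml').async('string');
}).then(function (text) {
  console.log(text); // the contents of doc.xml
});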
If you need to support other formats as well, or just need good performance, you can use this WebAssembly library.
It's promise-based, it uses WebWorkers for threading, and the API is a simple ES module.
How to use
Install with npm i libarchive.js and use it as an ES module.
The library consists of two parts: the ES module and the webworker bundle. The ES module part is your interface to the library; use it like any other module. The webworker bundle lives in the libarchive.js/dist folder, so you need to make sure it is available in your public folder, since it will not get bundled if you're using a bundler (it's all bundled up already), and specify the correct path to the Archive.init() method.
import {Archive} from 'libarchive.js/main.js';

Archive.init({
  workerUrl: 'libarchive.js/dist/worker-bundle.js'
});

document.getElementById('file').addEventListener('change', async (e) => {
  const file = e.currentTarget.files[0];
  const archive = await Archive.open(file);
  let obj = await archive.extractFiles();
  console.log(obj);
});

// outputs:
// {
//   ".gitignore": {File},
//   "addon": {
//     "addon.py": {File},
//     "addon.xml": {File}
//   },
//   "README.md": {File}
// }
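The values in that output are ordinary File objects, so (still inside the same async change handler) their contents can be read with the standard Blob methods, e.g. for the .gitignore entry shown above:

// File inherits from Blob, so text() / arrayBuffer() are available
const gitignoreText = await obj['.gitignore'].text();
console.log(gitignoreText);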
I wrote "Binary Tools for JavaScript", an open source project that includes the ability to unzip, unrar and untar: https://github.com/codedread/bitjs
Used in my comic book reader: https://github.com/codedread/kthoom (also open source).
HTH!
If anyone's reading images or other binary files from a zip file hosted on a remote server, you can use the following snippet to download and create a zip object using the jszip library.
// this function just gets the public url of the zip file
let url = await getStorageUrl(path);
console.log('public url is', url);
// get the zip file to the client
axios.get(url, { responseType: 'arraybuffer' }).then((res) => {
  console.log('zip download status ', res.status);
  // load contents into jszip and create an object
  jszip.loadAsync(new Blob([res.data], { type: 'application/zip' })).then((zip) => {
    const zipObj = zip;
    $.each(zip.files, function (index, zipEntry) {
      console.log('filename', zipEntry.name);
    });
  });
});
Now, using the zipObj, you can access the files and create a src URL for them.
var fname = 'myImage.jpg';
zipObj.file(fname).async('blob').then((blob) => {
  var blobUrl = URL.createObjectURL(blob);
});
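From there (inside that same .then callback, where blobUrl is in scope), the URL can be used as an image source; revoking it afterwards is just good hygiene. For example:

var img = document.createElement('img');
img.onload = function () {
  URL.revokeObjectURL(blobUrl); // release the blob once the image has loaded
};
img.src = blobUrl;
document.body.appendChild(img);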

Does 'onDidChangeTextDocument()' promise in VScode extension depend on the user's active window to start listening?

I'm a new developer and this is my first Stack Overflow post. I've tried to stick to the format as best as possible. It's a difficult issue for me to explain, so please let me know if there are any problems with this post!
Problem
I'm working on a vscode extension built specifically for Next.js applications and running into issues with an event listener for the onDidChangeTextDocument() method. I'm looking to capture data from a JSON file that will always be located in the root of the project (it is automatically generated/updated on each refresh of the test node server for the Next.js app).
Expected Results
The extension is able to look for updates on the file using onDidChangeTextDocument(). However, the issue I'm facing is on the initial run of the application: in order for the extension to start listening for changes to the JSON file, the user has to be in the JSON file. It's supposed to work no matter what file the user has open in vscode. After the user visits the JSON file while the extension is on, it begins to work from every file in the Next.js project folder.
Reproducing this issue is difficult because it requires an extension, npm package, and a next.js demo app, but the general steps are below. If needed, I can provide code for the rest.
1. Start debug session
2. Open Next.js application
3. Run application in node dev
4. Do not open the root JSON file
What I've Tried
Console logs show we are not entering the onDidChangeTextDocument() block until the user opens the root JSON file.
The file path to the root folder is correctly generated at all times, and prior to the promise being reached.
Is this potentially an async issue? Or is the method somehow dependent on the Active Window of the user to start looking for changes to that document?
Since the file is both created and updated automatically, we've tested for both, and neither are working until the user opens the root JSON file in their vscode.
Relevant code snippet (this will not work alone, but I can provide the rest of the code if necessary):
export async function activate(context: vscode.ExtensionContext) {
  console.log('Congratulations, your extension "Next Step" is now active!');
  setupExtension();
  const output = vscode.window.createOutputChannel('METRICS');

  // this is getting the application's root folder filepath string from its uri
  if (!vscode.workspace.workspaceFolders) {
    return;
  }
  const rootFolderPath = vscode.workspace.workspaceFolders[0].uri.path;

  // this gives us the fileName - we join the root folder URI with the file
  // we are looking for, which is metrics.json
  const fileName = path.join(rootFolderPath, '/metrics.json');

  const generateMetrics = vscode.commands.registerCommand(
    'extension.generateMetrics',
    async () => {
      console.log('Successfully entered registerCommand');
      toggle = true;
      vscode.workspace.onDidChangeTextDocument(async (e) => {
        if (toggle) {
          console.log('Successfully entered onDidChangeTextDocument');
          if (e.document.uri.path === fileName) {
            // this parses our fileName to a URI - we need to do this for
            // when we run openTextDocument below
            const fileUri = vscode.Uri.parse(fileName);
            // open the file at the Uri path and get the text
            const metricData = await vscode.workspace
              .openTextDocument(fileUri)
              .then((document) => {
                return document.getText();
              });
          }
        }
      });
    }
  );
}
Solved this by adding an openTextDocument call inside the registerCommand block, outside of the onDidChangeTextDocument function. This made the extension aware of the metrics.json file without it being open in the user's IDE.
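For anyone who lands here: the listener only fires for documents VS Code is tracking, so forcing the document to load once does the trick. A rough sketch of that fix, reusing the names from the question's code (not the poster's exact implementation):

const generateMetrics = vscode.commands.registerCommand(
  'extension.generateMetrics',
  async () => {
    // Load metrics.json into VS Code's document model up front, so that
    // onDidChangeTextDocument fires for it even if the user never opens it.
    const fileUri = vscode.Uri.file(fileName);
    await vscode.workspace.openTextDocument(fileUri);

    vscode.workspace.onDidChangeTextDocument((e) => {
      if (e.document.uri.path === fileName) {
        const metricData = e.document.getText();
        // ...use metricData...
      }
    });
  }
);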

How do I save an array to a file and manipulate it from within my code?

This is in p5.js, which includes most JavaScript functions!
I am trying to make a save-file for my game. By this I mean: the user presses the save button in my game, it updates an array that is saved in a file included in the game package, and the player keeps playing. How would I do something like this (create files that can be accessed and changed by my code)?
var SM = {
  // save files
  sf1: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]],
  sf2: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]],
  sf3: [[1,0,0,0,0],
        [0,0,0,0,0],
        [0,0,0,0,0]]
};
One more thing (FOR PROCESSING CODERS FROM HERE ON): I tried to use Processing functions like saveStrings() and loadStrings(), but I couldn't get saveStrings() to save to a specific location, nor could I properly load a txt file. Here is the code I used for that:
var result;

function preload() {
  result = loadStrings('assets/nouns.txt');
}

function setup() {
  background(200);
  var ind = floor(random(result.length));
  text(result[ind], 10, 10, 80, 80);
}
I had a folder called assets within the sketch folder, and assets had a txt file called nouns with strings in it (downloaded from saveStrings, then manually moved), but the sketch won't go past the loading screen.
If you are running it from a browser, you can't save or load a file how you want, period. Saving and loading files in browser JavaScript involves user interaction, and they get to pick the file and where it saves.
If you want to save it locally, instead of trying to write it to a file, you should write and read it from localStorage, which you can then do just fine.
// save
localStorage.setItem('saveData', data);
// load
const data = localStorage.getItem('saveData');
If it is somehow a game run directly on the client (out of the browser), like written in Node.js, then you'd want to use the fs functions.
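For that case, a minimal sketch with Node's built-in fs module (synchronous variants for brevity; the save.json file name is arbitrary):

const fs = require('fs');

// save
const saveData = { state: [1, 2, 3], name: 'player' };
fs.writeFileSync('save.json', JSON.stringify(saveData));

// load (fall back to an empty object if no save exists yet)
const loaded = fs.existsSync('save.json')
  ? JSON.parse(fs.readFileSync('save.json', 'utf8'))
  : {};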
To expand a bit, if you have your save data as an object:
const saveData = {
state: [1,2,3],
name: 'player'
};
Then to save it, you would simply call:
localStorage.setItem('saveData', JSON.stringify(saveData));
You'll want to stringify it when you save it so it round-trips properly. To read it back, just use getItem() and parse:
const data = JSON.parse(localStorage.getItem('saveData') || '{}');
(That extra || '{}' bit will handle if it hasn't been saved before and give you an empty object.)
It's actually much easier than trying to write a JavaScript file that you would then read in. Even if you were writing a file, you'd probably want to write it as JSON, not JavaScript.
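Tying that back to the sketch in the question, a save/load pair might look like this (SM is the save-file object from the question; wiring these to a button is up to you):

// persist the whole SM object
function saveGame() {
  localStorage.setItem('saveData', JSON.stringify(SM));
}

// restore it on startup, keeping the defaults if nothing was saved
function loadGame() {
  const stored = localStorage.getItem('saveData');
  if (stored) {
    SM = JSON.parse(stored);
  }
}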
In order to save strings into a file in JavaScript, I would recommend this previous StackOverflow question, which provides a link to a very clear and easy-to-use library for managing files in JavaScript.

store and access image file paths when templating (from cloudinary or other service)

I’m using gulp and nunjucks to automate some basic email templating tasks.
I have a chain of tasks which can be triggered when an image is added to the images folder e.g.:
images compressed
new image name and dimensions logged to json file
json image data then used to populate template when template task is run
So far so good.
I want to be able to define a generic image file path for each template which will then concatenate to each image name (as stored in the json file). So something like:
<img src="{{data.path}}{{data.src}}" >
If I want to nominate a distinct folder to contain the images for each template generated, then cloudinary requires a mandatory unique version component in the file path. So the image path can never be consistent throughout a template.
if your public ID includes folders (elements divided by '/'), the version component is mandatory (but you can make it shorter).
For example:
http://res.cloudinary.com/demo/image/upload/v1312461204/sample_email/hero_image.jpg
http://res.cloudinary.com/demo/image/upload/v1312461207/sample_email/footer_image.jpg
Same folder. Different path.
So it seems I would now need to create a script/task that can log and store each distinct file path (with its unique id generated by cloudinary) for every image any time an image is uploaded or updated and then rerun the templating process to publish them.
This just seems like quite a convoluted process, so if there's an easier approach I'd love to know.
Otherwise, if that really is the required route, it would be great if someone could point me to an example of the kind of script that achieves something similar.
Presumably some hosting services will not have the mandatory unique key, which makes life easier. I have spent some time getting to know cloudinary, and it's a free service with a lot of scope, so I'm reluctant to abandon ship, but I'm open to all suggestions.
Thanks
Note that the version component (e.g., v1312461204) isn't mandatory anymore for most use-cases. The URL could indeed work without it, e.g.:
http://res.cloudinary.com/demo/image/upload/sample_email/hero_image.jpg
Having said that, it is highly recommended to include the version component in the URL in cases where you'd like to update the image with a new one while keeping the exact same public ID. In that case, if you access the exact same URL, you might get a CDN-cached version of the image, which may be the old one.
Therefore, when you upload, you can get the version value from Cloudinary's upload response and store it in your DB; then, the next time you update your image, also update the URL with the new version value.
Alternatively, you can also ask Cloudinary to invalidate the image while uploading. Note that while including the version component "busts" the cache immediately, invalidation may take a while to propagate through the CDN. For more information:
http://cloudinary.com/documentation/image_transformations#image_versions
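To illustrate the suggestion above: the version comes back in the upload response, so it (or simply the full secure_url) can be captured and stored at upload time. A sketch using the Node SDK; the file name and public_id here are just examples:

const cloudinary = require('cloudinary').v2;

cloudinary.uploader.upload('hero_image.jpg',
  { public_id: 'sample_email/hero_image' },
  function (err, result) {
    if (err) { return console.error(err); }
    // result.version and result.secure_url come from the upload response;
    // store either in your DB and use it when templating
    console.log(result.version, result.secure_url);
  });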
This is the solution I came up with. It adapts the generic script I use to upload images from a folder to cloudinary: it now stores the updated file paths returned by cloudinary and generates a json data file that publishes the hosted src details to a template.
I'm sure it could be better semantically, so I welcome any revisions if someone stumbles on this, but it seems to do the job:
// points to the config file where we are defining file paths
var path = require('./gulp.path')();

// IMAGE HOSTING
var fs = require('fs');
var cloudinary = require('cloudinary').v2;
var uploads = {};
var dotenv = require('dotenv');
dotenv.load();

// Finds the images in a specific folder and returns an array
var read = require('fs-readdir-recursive');
// Set location of images
var imagesInFolder = read(path.images);
// The array that will be populated with image src data
var imgData = new Array();

(function uploadImages() {
  // Loop through all images in folder and upload
  for (var i = 0; i < imagesInFolder.length; i++) {
    cloudinary.uploader.upload(path.images + imagesInFolder[i], {folder: path.hosted_folder, use_filename: true, unique_filename: false, tags: 'basic_sample'}, function(err, image) {
      console.log();
      console.log("** Public Id");
      if (err) { console.warn(err); }
      console.log("* Same image, uploaded with a custom public_id");
      console.log("* " + image.public_id);
      // Generate the category title for each image. The category is defined
      // within the image name: it's the first part of the image name,
      // i.e. anything prior to a hyphen.
      var title = image.public_id.substr(image.public_id.lastIndexOf('/') + 1).replace(/\.[^/.]+$/, "").replace(/-.*$/, "");
      console.log("* " + title);
      console.log("* " + image.url);
      // Add the updated src for each image to the output array
      imgData.push({
        [title]: {"src": image.url}
      });
      // Stringify data with no spacing so the .replace regex can easily remove the unwanted curly braces
      var imgDataJson = JSON.stringify(imgData, null, null);
      // Remove the unwanted [] that wraps the json imgData array
      imgDataJson = imgDataJson.substring(1, imgDataJson.length - 1);
      // Replace the unwanted "},{" braces with "," otherwise the output is not valid json
      imgDataJson = imgDataJson.replace(/(},{)/g, ',');
      var outputFilename = "images2-hosted.json";
      // output the hosted image path data to a json file
      // (A separate gulp task is then run to merge and update the new 'src'
      // data into an existing image data json file)
      fs.writeFile(path.image_data_src + outputFilename, imgDataJson, function(err) {
        if (err) {
          console.log(err);
        } else {
          console.log("JSON saved to " + outputFilename);
        }
      });
    });
  }
})();
A gulp task is then used to merge the newly generated json, overriding the existing json data file:
// COMPILE live image hosting data
var gulp = require('gulp');
var notify = require('gulp-notify');
var merge = require('gulp-merge-json');

gulp.task('imageData:comp', function() {
  return gulp
    .src('src/data/images/*.json')
    .pipe(merge('src/data/images.json'))
    .pipe(gulp.dest('./'))
    .pipe(notify({ message: 'imageData:comp task complete' }));
});
