Grabbing partial data from the clipboard in JavaScript

I have copied an Excel table which is about a million rows. When I look at the clipboard on my system, it seems to contain about 250MB of data. However, I only need to grab the styling information from it, for example:
This entire data comes out to (far) less than 1MB of data. Is there a way to read the clipboard as if it were a file or stream, so that I can just do, for example:
clipboard.read(1024)
Otherwise, if I do the straight:
evt.clipboardData.getData('text/html')
And grab the section of data I want after getting the data, it takes me over 10 seconds to do! Whereas I believe the event should only take 0.1s or so, if I'm able to read the clipboard data in a partial manner, as if it were a file.
What can I do here? Is it possible to use FileReader on the clipboard? If so, how could that be done?

The Clipboard.read API cited in the comments can return clipboard contents as a list of ClipboardItem objects, from which you can then obtain a Blob object, which you can then .slice to perform partial reads. (The MDN claims that Clipboard.read returns a DataTransfer object, but this disagrees with the specification, so I assume this is stale information, or simply in error.)
const perm = await navigator.permissions.query({ name: 'clipboard-read' });
switch (perm.state) {
    case 'granted':
    case 'prompt':
        break;
    default:
        throw new Error("clipboard-read permission not granted");
}
const items = await navigator.clipboard.read();
for (const item of items) {
    const blob = await item.getType('text/html');
    const first1M = await blob.slice(0, 1048576).arrayBuffer();
    /* process first1M */
}
However, the Clipboard API is nowhere near universally available as of yet. Firefox ESR 78.9 doesn’t implement it, and by the state of MDN it hardly seems to be on Mozilla’s radar at all. (I haven’t tried other browsers yet; perhaps in Chrome it’s already usable.)
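Once a prefix of the Blob has been read as in the snippet above, the styling information still has to be cut out of it. The following is only a minimal sketch: it assumes the styles live in a `<style>` element near the start of the HTML (an assumption about what Excel places on the clipboard, not something verified here), and `extractStyleBlock` is a hypothetical helper name.

```javascript
// Hypothetical helper: pull the first <style>...</style> block out of a
// partial read of the clipboard's text/html payload.
function extractStyleBlock(prefixBytes) {
  // Decode the bytes as UTF-8. A multi-byte character cut off at the end of
  // the slice becomes U+FFFD instead of throwing.
  const html = new TextDecoder('utf-8').decode(prefixBytes);
  const match = /<style[^>]*>([\s\S]*?)<\/style>/i.exec(html);
  // null means the slice held no complete <style> element, so the caller
  // may want to read a larger prefix.
  return match ? match[1] : null;
}

// Stand-in for `new Uint8Array(await blob.slice(0, 1048576).arrayBuffer())`.
const sample = new TextEncoder().encode(
  '<html><head><style>td { color: red; }</style></head><body><table>'
);
console.log(extractStyleBlock(sample));
```

If the helper returns null, reading a larger slice and trying again is one option; the 1 MB figure is the one used above, not a known upper bound.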

After a lot of research, my conclusion is that this is not possible in JavaScript: the clipboard object has no support for stream manipulation, so you have to read the entire content at once.
However, you can use the native macOS tools (macOS inferred from your picture) for processing the clipboard data: pbcopy and pbpaste. They are extremely fast, orders of magnitude faster than JavaScript, so you can delegate the heavy processing of the text to them.
So, after you copy the 250MB of text, you can slice it, keep only the first n bytes (in this case 1024), and replace the content of the clipboard with that slice, so that it is then available to JavaScript:
pbpaste | head -c 1024 | pbcopy
If you need documentation for any of these terminal commands, you can run man command_name. Extracting the first 1024 bytes of the clipboard took less than a second with this approach.
I tested it with a sample text file of 390MB created with Python using this script:
c = 30000000
with open('sample.txt', 'w') as file:
    file.writelines('a sample code' for i in range(c))

Related

Unzip a zip file with JavaScript [duplicate]

I want to display OpenOffice files, .odt and .odp at client side using a web browser.
These files are zipped files. Using Ajax, I can get these files from server but these are zipped files. I have to unzip them using JavaScript, I have tried using inflate.js, http://www.onicos.com/staff/iz/amuse/javascript/expert/inflate.txt, but without success.
How can I do this?
I wrote an unzipper in Javascript. It works.
It relies on Andy G.P. Na's binary file reader and some RFC1951 inflate logic from notmasteryet. I added the ZipFile class.
working example:
http://cheeso.members.winisp.net/Unzip-Example.htm (dead link)
The source:
http://cheeso.members.winisp.net/srcview.aspx?dir=js-unzip (dead link)
NB: the links are dead; I'll find a new host soon.
Included in the source is a ZipFile.htm demonstration page, and 3 distinct scripts, one for the zipfile class, one for the inflate class, and one for a binary file reader class. The demo also depends on jQuery and jQuery UI. If you just download the js-zip.zip file, all of the necessary source is there.
Here's what the application code looks like in Javascript:
// In my demo, this gets attached to a click event.
// it instantiates a ZipFile, and provides a callback that is
// invoked when the zip is read. This can take a few seconds on a
// large zip file, so it's asynchronous.
var readFile = function(){
$("#status").html("<br/>");
var url= $("#urlToLoad").val();
var doneReading = function(zip){
extractEntries(zip);
};
var zipFile = new ZipFile(url, doneReading);
};
// this function extracts the entries from an instantiated zip
function extractEntries(zip){
$('#report').accordion('destroy');
// clear
$("#report").html('');
var extractCb = function(id) {
// this callback is invoked with the entry name, and entry text
// in my demo, the text is just injected into an accordion panel.
return (function(entryName, entryText){
var content = entryText.replace(new RegExp( "\\n", "g" ), "<br/>");
$("#"+id).html(content);
$("#status").append("extract cb, entry(" + entryName + ") id(" + id + ")<br/>");
$('#report').accordion('destroy');
$('#report').accordion({collapsible:true, active:false});
});
}
// for each entry in the zip, extract it.
for (var i=0; i<zip.entries.length; i++) {
var entry = zip.entries[i];
var entryInfo = "<h4><a>" + entry.name + "</a></h4>\n<div>";
// contrive an id for the entry, make it unique
var randomId = "id-"+ Math.floor((Math.random() * 1000000000));
entryInfo += "<span class='inputDiv'><h4>Content:</h4><span id='" + randomId +
"'></span></span></div>\n";
// insert the info for one entry as the last child within the report div
$("#report").append(entryInfo);
// extract asynchronously
entry.extract(extractCb(randomId));
}
}
The demo works in a couple of steps: The readFile fn is triggered by a click, and instantiates a ZipFile object, which reads the zip file. There's an asynchronous callback for when the read completes (usually happens in less than a second for reasonably sized zips) - in this demo the callback is held in the doneReading local variable, which simply calls extractEntries, which just blindly unzips all the content of the provided zip file. In a real app you would probably choose some of the entries to extract (allow the user to select, or choose one or more entries programmatically, etc).
The extractEntries fn iterates over all entries, and calls extract() on each one, passing a callback. Decompression of an entry takes time, maybe 1s or more for each entry in the zipfile, which means asynchrony is appropriate. The extract callback simply adds the extracted content to a jQuery accordion on the page. If the content is binary, then it gets formatted as such (not shown).
It works, but I think that the utility is somewhat limited.
For one thing: It's very slow. Takes ~4 seconds to unzip the 140k AppNote.txt file from PKWare. The same uncompress can be done in less than .5s in a .NET program. EDIT: The Javascript ZipFile unpacks considerably faster than this now, in IE9 and in Chrome. It is still slower than a compiled program, but it is plenty fast for normal browser usage.
For another: it does not do streaming. It basically slurps in the entire contents of the zipfile into memory. In a "real" programming environment you could read in only the metadata of a zip file (say, 64 bytes per entry) and then read and decompress the other data as desired. There's no way to do IO like that in javascript, as far as I know, therefore the only option is to read the entire zip into memory and do random access in it. This means it will place unreasonable demands on system memory for large zip files. Not so much a problem for a smaller zip file.
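In current browsers a Blob or File can be sliced, so the metadata-first access described above is at least partly available: the End of Central Directory record sits in the last bytes of the archive. What follows is a minimal sketch assuming a plain (non-Zip64) archive; `findEndOfCentralDirectory` is a hypothetical name and not part of the ZipFile class.

```javascript
// Locate the End of Central Directory (EOCD) record in the tail of a zip.
// `tail` is a Uint8Array holding the last bytes of the archive, e.g. from
// `new Uint8Array(await file.slice(-65557).arrayBuffer())` in a browser
// (22 bytes of fixed record plus up to 65535 bytes of archive comment).
function findEndOfCentralDirectory(tail) {
  const view = new DataView(tail.buffer, tail.byteOffset, tail.byteLength);
  // The record is at least 22 bytes long; scan backwards for its signature.
  for (let i = tail.byteLength - 22; i >= 0; i--) {
    if (view.getUint32(i, true) === 0x06054b50) {
      return {
        entryCount: view.getUint16(i + 10, true),           // total entries
        centralDirectorySize: view.getUint32(i + 12, true),  // in bytes
        centralDirectoryOffset: view.getUint32(i + 16, true) // from file start
      };
    }
  }
  return null; // not a zip, or the tail was too short
}

// An empty archive consists of nothing but the 22-byte EOCD record.
const emptyZip = new Uint8Array(22);
emptyZip.set([0x50, 0x4b, 0x05, 0x06], 0);
console.log(findEndOfCentralDirectory(emptyZip));
```

With the offset and size in hand, a second slice can fetch only the central directory, and individual entries can be sliced and inflated on demand.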
Also: It doesn't handle the "general case" zip file - there are lots of zip options that I didn't bother to implement in the unzipper - like ZIP encryption, WinZip encryption, zip64, UTF-8 encoded filenames, and so on. (EDIT - it handles UTF-8 encoded filenames now). The ZipFile class handles the basics, though. Some of these things would not be hard to implement. I have an AES encryption class in Javascript; that could be integrated to support encryption. Supporting Zip64 would probably be useless for most users of Javascript, as it is intended to support >4gb zipfiles - don't need to extract those in a browser.
I also did not test the case for unzipping binary content. Originally it unzipped only text; if you had a zipped binary file, you'd have needed to edit the ZipFile class to handle it properly, and I hadn't figured out how to do that cleanly. (EDIT - it does binary files now, too.)
EDIT - I updated the JS unzip library and demo. It now does binary files, in addition to text. I've made it more resilient and more general - you can now specify the encoding to use when reading text files. Also the demo is expanded - it shows unzipping an XLSX file in the browser, among other things.
So, while I think it is of limited utility and interest, it works. I guess it would work in Node.js.
I'm using zip.js and it seems to be quite useful. It's worth a look!
Check the Unzip demo, for example.
I found jszip quite useful. I've used so far only for reading, but they have create/edit capabilities as well.
Code wise it looks something like this
var new_zip = new JSZip();
new_zip.load(file);
new_zip.files["doc.xml"].asText() // this gives you the text in the file
One thing I noticed is that the file seems to have to be in binary stream format (read using the .readAsArrayBuffer of FileReader()); otherwise I was getting errors saying I might have a corrupt zip file.
Edit: Note from the 2.x to 3.0.0 upgrade guide:
The load() method and the constructor with data (new JSZip(data)) have
been replaced by loadAsync().
Thanks user2677034
If you need to support other formats as well or just need good performance, you can use this WebAssembly library
It's promise-based, it uses WebWorkers for threading, and its API is a simple ES module.
How to use
Install with npm i libarchive.js and use it as an ES module.
The library consists of two parts: an ES module and a webworker bundle. The ES module is your interface to the library; use it like any other module. The webworker bundle lives in the libarchive.js/dist folder, so you need to make sure that it is available in your public folder (it will not get bundled if you're using a bundler, since it's all bundled up already) and pass the correct path to the Archive.init() method.
import {Archive} from 'libarchive.js/main.js';
Archive.init({
workerUrl: 'libarchive.js/dist/worker-bundle.js'
});
document.getElementById('file').addEventListener('change', async (e) => {
const file = e.currentTarget.files[0];
const archive = await Archive.open(file);
let obj = await archive.extractFiles();
console.log(obj);
});
// outputs
{
".gitignore": {File},
"addon": {
"addon.py": {File},
"addon.xml": {File}
},
"README.md": {File}
}
I wrote "Binary Tools for JavaScript", an open source project that includes the ability to unzip, unrar and untar: https://github.com/codedread/bitjs
Used in my comic book reader: https://github.com/codedread/kthoom (also open source).
HTH!
If anyone's reading images or other binary files from a zip file hosted at a remote server, you can use the following snippet to download the file and create a zip object using the jszip library.
// This function just gets the public URL of the zip file.
let url = await getStorageUrl(path)
console.log('public url is', url)
// Declared outside the callbacks so it can be used once loading has finished.
let zipObj
// Get the zip file to the client.
axios.get(url, { responseType: 'arraybuffer' }).then((res) => {
    console.log('zip download status ', res.status)
    // Load the contents into jszip and create an object.
    jszip.loadAsync(new Blob([res.data], { type: 'application/zip' })).then((zip) => {
        zipObj = zip
        $.each(zip.files, function (index, zipEntry) {
            console.log('filename', zipEntry.name)
        })
    })
})
Now, once loading has completed, you can use zipObj to access the files and create a src URL for one of them.
var fname = 'myImage.jpg'
zipObj.file(fname).async('blob').then((blob) => {
    var blobUrl = URL.createObjectURL(blob)
})

Unable to download large data using javascript

I have a large amount of data in the form of a JSON object in JavaScript. I have converted it into a string using JSON.stringify(). Now my use case is to provide this large string to the user as a text file. For this I have written the code below.
HTML code
<button id='text_feed' type="submit">Generate ION Feed</button>
Javascript code
var text = "..."; // huge string
$("#text_feed").click(function() {
    _generateFeed(text);
});
var _generateFeed = function(text) {
    //some code here
    $("#textLink").attr("href",
        "data:attachment/txt," + encodeURIComponent(text))[0].click();
};
Problem: When the string is short, I am able to download the data. But when the string length grows (> 10^5), my page crashes. This occurs because encodeURIComponent(text) is not able to encode such large data.
I also tried window.open("data:attachment/txt," + encodeURIComponent(text));
But again my page crashed for the same reason: encodeURIComponent was unable to encode such a large string.
Another approach: I was also thinking of writing the data into a file using the HTML5 File Writer API, but it is supported only in Chrome, and I need this to work in at least both Firefox and Chrome.
Use Case
I don't want to do multiple downloads by breaking up the data, as I need to have the data in a single file in the end.
My target is to support a string of approximately 10^6 characters. Can anyone help me download/write this amount of data into a single file?
From the OP:
I solved it as below.
var _generateFeed = function(text) {
/*some code here*/
var url = URL.createObjectURL( new Blob( [text], {type:'text/plain'} ) );
$("#textLink").attr("href",url)[0].click();
};
Notes:
URL.createObjectURL() is supported by modern browsers and IE10+; MDN flagged it as experimental technology when this was written.
Objects created using URL.createObjectURL() "must be released by calling URL.revokeObjectURL() when you no longer need them." - MDN
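The second note can be handled by keeping a way to release the URL next to the URL itself. This is only a minimal sketch: `createTextUrl` is a hypothetical helper name, and the `#textLink` selector in the usage note is the one from the code above.

```javascript
// Wrap a string in a Blob and expose an object URL for it together with a
// function that releases the URL again.
function createTextUrl(text) {
  const blob = new Blob([text], { type: 'text/plain' });
  const url = URL.createObjectURL(blob);
  return {
    url,
    size: blob.size, // size in bytes after UTF-8 encoding
    release: () => URL.revokeObjectURL(url)
  };
}

// A string of 10^6 characters, the target size named in the question.
const feed = createTextUrl('x'.repeat(1e6));
console.log(feed.url.startsWith('blob:'), feed.size);
feed.release();
```

In the page from the question this might be used as `$("#textLink").attr("href", feed.url)[0].click();` followed by a delayed `setTimeout(feed.release, 1000)`. Delaying the release is a precaution, on the assumption that revoking the URL immediately could interrupt the download in some browsers.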

Javascript error: disk is full

This relates to the question "Exception 'The disk is full' is thrown when trying to store data in userData object in IE7+", which has been left unanswered:
I am heavily browsing a government website created with Oracle ADF using WatiN.
The website is located in a WPF window --> WindowsFormsHost --> WebBrowser control.
The website makes heavy use of this: http://msdn.microsoft.com/en-us/library/ie/ms531424(v=vs.85).aspx, via the save and load methods.
After 2-3 minutes of browsing, I get the following javascript error during one of the "save" calls:
The disk is full, character #, line ####.
When I get this error, the WebBrowser control is rendered completely useless (no further javascript commands can be executed) and my app must be restarted.
I have tried clearing the browser cache, changing its location, and clearing localStorage, all to no avail.
The PC that reproduces the error has IE10 installed, but via the registry I force IE8 / IE9 mode in the webbrowser control.
Is there any way to get around this problem?
Information on this is very scarce, so any help will be greatly appreciated.
I'm on Linux, so I have no way to test this now, but the files in question are not stored in the browser cache. They are stored in the locations below.
Note: As of IE 10 this no longer holds. See Edit 2 below.
W7+: %HOMEPATH%\AppData\Roaming\Microsoft\Internet Explorer\UserData
# c:\users\*user-name*\...
XP : %HOMEPATH%\Application Data\Microsoft\Internet Explorer\UserData
# c:\Documents and Settings\*user-name*\...
Or, I guess, this one also works (as a shortcut):
%APPDATA%\Microsoft\Internet Explorer\UserData
Modify or delete the files there.
Note that the files are tagged as protected operating system files, so to view in Explorer you have to change view to include these. If you use cmd, as in Command Prompt you have to include the /a flag as in:
dir /a
Edit 1:
Note, that index.dat is the one holding information about allocated size etc. so it won't (probably) help to only delete/move the xml files.
Edit 2:
OK. Had a look at this in Windows 7 running IE 10.
In IE 7 (on XP) the above-mentioned path has an index.dat file that gets updated on save by userData. The file holds various information such as the size of the index file, the number of subfolders, and the size of all files. Then there is an entry for each file with a number identifying the folder, the URL it was saved from, the name of the XML file, dates etc. I wrote a simple VBScript parser for this, but as IE 10 does not use the index.dat file it is a waste.
Under IE 10 there are no longer various index.dat files but a central database file in:
%LOCALAPPDATA%\Microsoft\Windows\WebCache\
On my system the database file is named WebCacheV01.dat; the V part seems to differ between systems and is perhaps an in-house version number rather than a file type version.
The files are tightly locked, and as such, if one wants to poke at them, one solution is to make a shadow copy by using tools such as vscsc, Shadowcopy etc.
Anyhow, hacking WebCacheVxx.dat would need a lot more work, so no attempts on that on my part (for now at least).
But note that the file gets an entry with the path to the old location, so e.g. on write of someElement.save("someStorageName");, WebCacheVxx.dat gets an entry like:
...\AppData\Roaming\Microsoft\Internet Explorer\UserData\DDFFGGHH\someStorageName[1].xml
and a corresponding file is created in the above path.
The local container.dat, however, is not updated.
userData
As for the issue at hand, clearing localStorage will not help, as userData is not part of that API.
Can not find a good example on how to clear userData. Though, one way is to use the console.
Example from testing on this page:
Save some text.
Hit F12 and enter the following to clear the data:
/* ud as an acronym for userData */
var i, at,
ud_name = "oXMLBranch",
ud_id = "oPersistText",
ud = document.getElementById(ud_id);
/* To modify the storage one has to ensure it is loaded. */
ud.load(ud_name);
/* After load, ud should have a xmlDocument where first child should hold
* the attributes for the storage. Attributes as in named entries.
*
* Example: ud.setAttribute("someText", "Some text");
* ud.save(ud_name);
*
* XML-document now has attribute "someText" with value "Some text".
*/
at = ud.xmlDocument.firstChild.attributes;
/* Loop attributes and remove each one from userData. */
for (i = 0; i < at.length; ++i)
ud.removeAttribute(at[i].nodeName);
/* Finally update the file by saving the storage. */
ud.save(ud_name);
Or as a one-liner:
var ud_name = "oXMLBranch", ud_id = "oPersistText", i, at, ud = document.getElementById(ud_id); ud.load(ud_name); at = ud.xmlDocument.firstChild.attributes; for (i = 0; i < at.length; ++i) ud.removeAttribute(at[i].nodeName); ud.save(ud_name);
Eliminating one restriction
There are some issues with this. We can eliminate at least one, by ignoring the ud_id and instead create a new DOM object:
var i, at,
ud_name = "oXMLBranch",
ud = document.createElement('INPUT');
/* Needed to extend the input element with the userData behavior. */
ud.style.behavior = 'url(#default#userdata)';
/* Needed to not get access denied. */
document.body.appendChild(ud);
ud.load(ud_name);
at = ud.xmlDocument.firstChild.attributes;
for (i = 0; i < at.length; ++i)
ud.removeAttribute(at[i].nodeName);
/* Save, or nothing is changed on disk. */
ud.save(ud_name);
/* Clean up the DOM tree */
ud.parentNode.removeChild(ud);
This way one should be able to clear userData by knowing the name of the storage, which should be the same as the file name (excluding [1].xml), or by looking at the page source.
More issues
Testing on the page mentioned above I reach a disk is full limit at 65,506 bytes.
Your problem is likely not that the disk is full, but that a write attempt is done where input data is above limit. You could try to clear the data as mentioned above and see if it continues, else you would need to clear the data about to be written.
Then again this would most likely break the application in question.
In other words, error text should have been something like:
Write of NNN bytes is above limit of NNN and not disk is full.
Tested by attaching to window onerror but unfortunately source of error was not found:
var x1, x2, x3, x4 ;
window.onerror = function () {
x1 = window.event;
x2 = x1.fromElement; // Yield null
x3 = x1.srcElement; // Yield null
x4 = x1.type;
};
End note
Unless the clear userData method solves the issue, I do not know where to go next. I have not found any option to increase the limit via the registry or other means.
But perhaps it gets you a little further.
There might be someone over at Super User that is able to help.

NetUtil.asyncCopy from one file to append to another in Firefox extension

I'm trying to use NetUtil.asyncCopy to append data from one file to the end of another file from a Firefox extension. I have based this code upon a number of examples at https://developer.mozilla.org/en-US/docs/Code_snippets/File_I_O, particularly the 'Copy a stream to a file' example. Given what it says on that page, my code below:
Creates nsIFile objects for the file to copy from and file to append to and initialises these objects with the correct paths.
Creates an output stream to the output file.
Runs the NetUtil.asyncCopy function to copy between the file (which, I believe, behaves as an nsIInputStream) and the output stream.
I run this code as append_text_from_file("~/CopyFrom.txt", "~/AppendTo.txt");, but nothing gets copied across. The Appending Text and After ostream dumps appear on the console, but not the Done or Error dumps.
Does anyone have any idea what I'm doing wrong here? I'm fairly new to both Firefox extensions and javascript (although I am a fairly experienced programmer) - so I may be doing something really silly. If my entire approach is wrong then please let me know - I would have thought that this approach would allow me to append a file easily, and asynchronously, but it may not be possible for some reason that I don't know about.
function append_text_from_file(from_filename, to_filename) {
    var from_file = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
    from_file.initWithPath(from_filename);
    var to_file = Components.classes["@mozilla.org/file/local;1"].createInstance(Components.interfaces.nsILocalFile);
    to_file.initWithPath(to_filename);
    dump("Appending text\n");
    var ostream = FileUtils.openFileOutputStream(to_file, FileUtils.MODE_WRONLY | FileUtils.MODE_APPEND);
    dump("After ostream\n");
    NetUtil.asyncCopy(from_file, ostream, function(aResult) {
        dump("Done\n");
        if (!Components.isSuccessCode(aResult)) {
            // an error occurred!
            dump(aResult);
            dump("Error!\n");
        }
    });
}
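asyncCopy() requires an input stream, not a file.
You can do this:
// Cc and Ci are the usual shorthands for Components.classes and Components.interfaces.
var fstream = Cc["@mozilla.org/network/file-input-stream;1"].createInstance(Ci.nsIFileInputStream);
fstream.init(from_file, 0x01, 4, null); // 0x01 opens the file read-only
NetUtil.asyncCopy(fstream, ostream, function(aResult) {
    // same callback body as in the question
});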
