OK, here we discussed the essence of the problem: in some browsers, such as Chrome and Opera, XMLHttpRequest access to local files is turned off by default.
Now the question is: how do I build an HTML+JavaScript viewer for HTML documents that:
would run locally in any (or most) browsers without additional tuning;
would not use frames;
would be able to work with many different files (5-10k)?
It can't be done in straight HTML/JavaScript if you want to load files via JavaScript using AJAX requests. There are good security reasons not to give scripts in local files access to other files on the local system (see my answer here for more details), so most browsers will not allow this without special user configuration.
So your options are:
Don't load files with Javascript, use frames or another mechanism. If, as you state in the other question, you're shipping all this on CD, you might want to consider using some sort of build system that allows you to create static files using templates and either a database or flat-file content - Jekyll is one option I know of.
Ship an executable along with the files that can either run a local webserver or run HTML files in an application context. I think Appcelerator Titanium might fit the bill.
I'm working on an app that needs to access a collection of external files. It's basically a music player. It works as-expected under a web server, but I also want it to work locally in the browser.
General overview:
index.htm (Small index file with markup, gather external js, css)
index.js (All the app code here)
dir.js (An array of file paths of all music files)
/AHX/ (location of the music files)
ahx.js (music player code)
The two main difficulties for this are:
JavaScript cannot list directory contents, even if it is a child directory. Instead I express file paths as an array of strings.
Loading external files is only possible using XMLHttpRequest, which has security restrictions when running local/offline, but works in other environments (under HTTP or as a Chrome App, perhaps other platforms, not sure).
Oddly, in the latest Firefox, the second difficulty is not an issue anymore. XMLHttpRequest works locally without disabling security.fileuri.strict_origin_policy. I'm not sure if that is standard behavior, but Chrome doesn't appear to allow it.
In any case, my current solution is generating a list of file-paths in a .js file (previously I used a txt file that required XHR), and using XMLHttpRequest to load the music files. This of course means I need to keep the folder structure and the file-path database in sync, using a shell script to rebuild the dir.js file.
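The rebuild step described above could also be done with a small Node.js script instead of a shell script. This is only a sketch under stated assumptions: the function names `listFiles` and `buildDirJs` and the global name `FILE_PATHS` are made up for illustration, and the paths it emits are relative to the scanned folder.

```javascript
// build-dir.js: hypothetical Node.js alternative to the shell script.
const fs = require("fs");
const path = require("path");

// Recursively collect file paths under `dir`, relative to `root`.
// Forward slashes are used so the paths also work as URLs in the browser.
function listFiles(root, dir = root) {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      out.push(...listFiles(root, full));
    } else if (entry.isFile()) {
      out.push(path.relative(root, full).split(path.sep).join("/"));
    }
  }
  return out.sort();
}

// Build the text of dir.js. `varName` is an assumed global variable name.
function buildDirJs(root, varName = "FILE_PATHS") {
  return "var " + varName + " = " + JSON.stringify(listFiles(root), null, 2) + ";\n";
}
```

Calling `fs.writeFileSync("dir.js", buildDirJs("AHX"))` after adding or removing music files would regenerate the list; the folder name would have to be prefixed by the player when it builds the final URL.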
XHR is only supposed to work over HTTP, so the app requires a web server. I want the app to work locally (and not just force the user to install as a Chrome App). So I am asking this question to find alternative methods of reading the data.
One alternative I tried is encoding all 1000 files as base64 strings and storing them in a JS object. This produces a rather large 8MB .js file. It doesn't appear to be slow to load, but I am assuming it isn't exactly efficient... Plus it is a pain to update/maintain.
localStorage, IndexedDB and Web SQL are all options, but there is no way to pre-populate the storage before the app runs. Perhaps utilize File API for a one-time setup of the storage database.
So back to my question: What are some solutions to accessing a large collection of binary files (200+ files, over 6MB etc) locally (i.e. opening the .html file directly)?
Edit: The app in question is on GitHub, to clear up any confusion about my use case. But in general, I'm looking for ways to automatically read these music files from the app locally, without cross-origin errors. Also, here is the 'js-database' version. It stores all 1000 files in an 8MB js file like so:
[{data:"base64-string-of-data-here",path:"original-path-here"}, ...]
In that way it bypasses the need for XHR.
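As a rough sketch of how the player might read such an entry back into binary form: the helper names `decodeEntry` and `findByPath` are hypothetical, and the entry shape `{data, path}` is taken from the line above. `atob` is assumed to be available (modern browsers and recent Node.js versions provide it).

```javascript
// Decode one entry of the shape { data: "<base64>", path: "<path>" }
// into raw bytes, with no network or file access involved.
function decodeEntry(entry) {
  const binary = atob(entry.data); // one character per byte
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Look up an entry by its original path; returns null when it is missing.
function findByPath(db, path) {
  for (const entry of db) {
    if (entry.path === path) return entry;
  }
  return null;
}
```

The resulting `Uint8Array` can then be handed to whatever code previously consumed the XHR response.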
Edit2: A solution using jszip and IndexedDB appears promising. It is not possible to load multiple files from multiple selected folders, but if the directory tree is zipped, jszip can access an array of all files in the format /FOLDER_HERE/FILE_HERE. The paths and binary data can then be imported into IndexedDB in a one-time setup. It also works fine on file:// URLs which is important.
It is also possible that jszip could be used to effectively build/update a large JSON structure of base64 strings of the contents, which doesn't require any setup by the user. This still needs to be tested, though.
Don't take this as a definitive answer; this subject interests me too. If people here don't want to take the time to write a full answer, please comment, as that will be more useful than votes.
From what I have learned from JavaScript resources, you cannot really bypass the security aspect of the question. Even in an open-source project, you should warn users explicitly if you have not taken security into account. People could distribute a modified version of the resources, for example. It depends on what is done with the resources.
If this is for a player, I recommend treating it as a data resource, not as a script resource, for security reasons (as long as you don't eval strings or the like). JSON data could do the job here, but that would require processing the 1000 files. It is not hard to write a script that processes the files, though.
HTML5 file API
I haven't used it yet, so I can only give you one or two links. The downside is that it restricts your player to recent browsers.
https://www.html5rocks.com/en/tutorials/file/dndfiles/
HTML5 File API read as text and binary
(I know, not an answer) Use a library:
Except that in this case, this might be an answer, simply because there is no truly universal way to retrieve data in JavaScript. A good library would add that, plus support for old browsers.
Among these solutions, jQuery's JSONP support, for example, allows dynamic cross-domain GET requests. When what you inject is formatted data (and not script), it is much safer. But keep in mind that you should know in detail what your player does with the binary data, and in which ways it can be a risk.
http://api.jquery.com/jQuery.getJSON/
direct inclusion of script: not recommended
<script src="./sameFolderFile.js"></script>
As for direct script inclusion in a local folder structure, it actually works locally. IE says there is ActiveX content and asks for permission, but it works in Firefox and Chrome. The tag can be added dynamically, but there is a big security risk here: malicious JavaScript code added to the resources will be executed. This can put users at risk.
We are building a web service for data analysis and would like to access netcdf files from the local machine where the browser is running. Javascript offers a file browser, but (for security reasons as I learned) it will automatically upload a file after selection, instead of allowing (read-only) access to it. This presents a show-stopper, because the netcdf files can be HUGE. Note, that the netcdf format and API explicitly allow slicing and extraction of individual variables, which is one reason why the format is so popular.
Now, some research into this issue revealed that the server-client architecture normally doesn't allow access to the local file structure to prevent spying. On the other hand, in HTML5 there is a file API which supports exactly the kind of operations we need -- except that you can access portions of a file by specifying byte ranges, but there is no netcdf API available; hence one would be left again with copying the entire file before being able to slice it on the server.
Of course, the other option is web services such as OpenDAP, which are meant to do exactly what we want, i.e. access parts of a netcdf file over the internet. However, this would require every user to install the OpenDAP server before they could access their local files from within the web service. (Or, at a minimum, they would have to install a web server so that one could access the file via http://localhost/...).
So: does anyone know of a solution to read specific portions of a local netcdf file from a web application? Specifically, are there javascript tools available for this?
Currently you can use netcdfjs, a JavaScript library that can read NetCDF v3 files. Because it's a Node.js package you can run it server side, and regular-size files can be read online; here you have an example.
There are no Javascript tools to read NetCDF, but there are Java libraries: http://www.unidata.ucar.edu/downloads/netcdf/index.jsp.
You could deploy a Java Applet to read the NetCDF file locally. From there you could have the Applet try to process the file, or communicate with a backend service, or even try to call on the Applet from Javascript: http://docs.oracle.com/javase/tutorial/deployment/applet/invokingAppletMethodsFromJavaScript.html.
There are no specific Javascript libs for reading NetCDF, as far as I know. But let me drop some thoughts:
As you say, you can install a local webserver and use a server-side language to execute a program that reads the file on your disk. You could use a command-line tool called ncdump-json. I wrote this software specifically for that purpose.
You say that you don't want to install software like OpenDAP or webservers, but maybe a desktop app, a standalone .exe would be OK for you. Using node.js plus chromium (this is called nw.js) you have a program that is at the same time a nodejs server, and a chromium browser. This way you can write a web application that reads the data (e.g. using the embedded nodejs server and spawning ncdump-json...).
And now the simplest scenario, a 100% client-side solution... I think this can only be achieved using JavaScript... or anything that can "compile to" JavaScript. We need to have the netCDF libraries in JavaScript. Is that possible? I guess so. Emscripten is an LLVM-to-JavaScript compiler, and netCDF is written in C, so it should be possible to compile it with clang/LLVM. It should therefore be possible to use Emscripten to get a JS version of the netCDF library with almost no effort, without having to write (and maintain) a JS port from scratch. If I'm not wrong, the HTML5 File API has a slice method for random access to files, so that should not be a problem.
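To illustrate the byte-range idea, here is a minimal sketch that reads only the first four bytes of a user-selected file and inspects the format signature. The function names `readRange` and `detectNetcdfFormat` are made up for this example; it only distinguishes the classic `CDF` signatures (version bytes 1 and 2) from an HDF5 signature at offset 0, and it does not parse any variables. `Blob.slice` and `Blob.arrayBuffer` are assumed to be available, which rules out old browsers.

```javascript
// Read bytes [start, end) of a File/Blob without loading the whole file.
// Returns a Promise that resolves to a Uint8Array.
function readRange(file, start, end) {
  return file.slice(start, end).arrayBuffer().then((buf) => new Uint8Array(buf));
}

// Classify a file from its first four bytes.
function detectNetcdfFormat(bytes) {
  if (bytes.length < 4) return "unknown";
  const isCdf = bytes[0] === 0x43 && bytes[1] === 0x44 && bytes[2] === 0x46; // "CDF"
  if (isCdf && bytes[3] === 1) return "netCDF classic";
  if (isCdf && bytes[3] === 2) return "netCDF 64-bit offset";
  const isHdf = bytes[0] === 0x89 && bytes[1] === 0x48 && bytes[2] === 0x44 && bytes[3] === 0x46;
  if (isHdf) return "HDF5 (possibly netCDF-4)";
  return "unknown";
}
```

In a page, `readRange(input.files[0], 0, 4).then(detectNetcdfFormat)` would report the format after the user picks a file; the same `readRange` call with other offsets is what a compiled or hand-written reader would build on.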
Hope I helped.
I have a text file that I want to read in my HTML page. Both are in the same directory, and I am not running a server. I intend for my users to use the script offline (it is basically text manipulation based on expressions, preserving newline characters).
I tried an AJAX call, but mostly a cross-origin problem occurred. I know most users will have this security restriction enabled in many browsers, so there is no point in circumventing it only in my own browser.
I want to support many browsers, including old ones such as IE7 and IE8, which do not support the HTML5 FileReader. For the same reason, reading the file with FileSystemObject or ActiveX is not a good option.
Is it possible to read the file after the user selects it as input? Otherwise I would have no option other than using other technologies such as PHP or Java, which would require my users to set them up.
Please excuse me if I am repeating a question, but I am a beginner web developer. I know that reading local files via JavaScript is not possible, but is there any other way?
If you can't support FileReader, then the answer is pretty much no (at least if you want to support a large range of browsers rather than rely on convenient feature x of browser y), unless you increase the requirements for running the application and get some sort of local server going (for instance Node.js, Apache, Tomcat, etc.). But as you said, this will greatly increase the requirements and become cumbersome for users.
You could also rethink what it is you're trying to achieve. What are the contents of the file you want to read ? Can't these contents be part of the HTML file you're serving to your users (i.e. a large JSON Object inside a script-tag ?)
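One way to act on this suggestion is to convert the text file into a small .js file at build time and include it with a normal script tag, which works from file:// without XHR. The sketch below is only an illustration: `toScriptSource` and the variable name `FILE_TEXT` are hypothetical. Newlines survive because `JSON.stringify` escapes them, and the two extra replacements guard against U+2028/U+2029, which older engines reject inside string literals.

```javascript
// Turn arbitrary text into the source of a .js file that defines
// a global string variable holding exactly that text.
function toScriptSource(varName, text) {
  const literal = JSON.stringify(text)
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
  return "var " + varName + " = " + literal + ";\n";
}
```

The output would be saved next to the page (for example as data.js) and loaded with a script tag before the code that uses `FILE_TEXT`.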
On possibility of using node.js:
Node.js is quite easy to install and assuming you are requiring from your users to install it, you can use it as a local server, which is a nodejs script of about two lines in size :). Running it locally would also omit the need to upload anything anywhere as you can directly read from the file system using the fs-object (see sitepoint.com/accessing-the-file-system-in-node-js).
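For reference, a local static file server in Node.js can look roughly like the sketch below. It is somewhat longer than two lines because it refuses paths that escape the served folder; the names `resolveRequestPath` and `startServer` are made up for this example, and no Content-Type headers are set, so a real setup would want more than this.

```javascript
const http = require("http");
const fs = require("fs");
const path = require("path");

// Map a request URL to a file under `root`.
// Returns null if the URL is malformed or points outside `root`.
function resolveRequestPath(root, url) {
  let pathname;
  try {
    pathname = decodeURIComponent(url.split("?")[0]);
  } catch (e) {
    return null;
  }
  const base = path.resolve(root);
  const full = path.resolve(base, "." + pathname);
  return full === base || full.startsWith(base + path.sep) ? full : null;
}

// Serve files from `root` on localhost only.
function startServer(root, port) {
  const server = http.createServer((req, res) => {
    const file = resolveRequestPath(root, req.url);
    if (file === null) {
      res.writeHead(403);
      res.end("Forbidden");
      return;
    }
    fs.readFile(file, (err, data) => {
      if (err) {
        res.writeHead(404);
        res.end("Not found");
        return;
      }
      res.writeHead(200);
      res.end(data);
    });
  });
  return server.listen(port, "127.0.0.1");
}
```

Running `startServer(".", 8080)` from the folder that holds the HTML and text files would make them reachable at http://127.0.0.1:8080/, where AJAX requests are same-origin.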
STILL: from both a design and an ease-of-use point of view, you might want to resort to either using another technology or including the text file content inside the HTML file.
Recently I received a package containing a web page. Inside (besides the normal HTML and JS files) there are some unusual JS files. They look like this:
4A3674A3247236B3C8294D2378462378.cache.js
FE728493278423748230C48234782347.cache.js
compilation-mappings.txt
Inside the .js files I see JavaScript which is obfuscated or minified. The cache.js files are referenced inside compilation-mappings.txt. Are these files generated by some kind of web IDE? Unfortunately I have no way of finding out how this web page was developed.
That is a web project coded in Java and compiled to JS using the GWT project tools.
The GWT compiler does a lot of the work you would have to do manually when coding JS by hand, and some other tasks which are almost impossible in a normal JS project: obfuscation, compression, dead-code removal, different optimizations per browser, renaming of the scripts, code splitting, etc.
What you have in your app is the result of this compilation:
First, you should have a single index.html file, because GWT is used to produce RIAs (Rich Internet Applications), also known as SPIs (Single Page Interfaces).
That HTML file should reference a JavaScript file named application_name.nocache.js. Note the .nocache. part, meaning that the web server should set the appropriate headers so that it is not cached by proxies or browsers. This file is very small because it only has the code to identify the browser and request the next JavaScript file.
This first script knows which NNNN.cache.js each browser has to load. The NNNN prefix is a unique number which is generated when the app is compiled, and it is different for each browser. GWT supports 6 different browser platforms, so normally you would have 6 files like this. Note the .cache. part of the name, meaning that these files can be cached forever. They are large files because they contain all the code of your application.
So the normal workflow of your app is that the browser asks for the index.html file, which can be cached. This file has the script tag to get the small start script application.nocache.js, which should always be requested from the server. It has just the code for loading the most recent permutation for your browser, NNNN.cache.js, which will be downloaded and cached in your browser forever.
You have more info about this stuff here
The goal of this naming convention is that the next time the user goes to the app, index.html and NNNN.cache.js will already be in the cache, and only application.nocache.js, which is really small, will be requested. It guarantees that the user always loads the most recent version of the app, that the browser downloads the code of your app just once, that proxies or cache devices do not break your app when a new version is released, etc.
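On the server side, this convention boils down to choosing a caching header from the file name. The following is a sketch of that rule, not GWT's own code: the function name `cacheControlFor` and the exact header values are assumptions chosen for illustration.

```javascript
// Pick a Cache-Control value from a GWT-style file name.
// Returns null for files the convention says nothing about.
function cacheControlFor(fileName) {
  if (/\.nocache\./.test(fileName)) {
    // Bootstrap script: must be revalidated on every visit.
    return "no-cache, no-store, must-revalidate";
  }
  if (/\.cache\./.test(fileName)) {
    // Compiled permutation: the name changes with the content,
    // so it can be cached for a year (31536000 seconds).
    return "public, max-age=31536000";
  }
  return null;
}
```

A web server or servlet filter would apply the returned value as the `Cache-Control` response header for each static file it serves.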
That said, it is almost impossible to figure out what the code does by inspecting the JavaScript, because of the heavy obfuscation. You need the original .java files to understand the code or make modifications.
I can't say for sure, but often a string will be attached to the name of a javascript file so that when a new version is deployed clients will not use a cached version of the old one.
(i.e., if you have myScript.js and change it, the browser will say "I already have myScript.js, I don't need it." If it goes from being myScript1234.js to myScript1235.js, the browser will go fetch it.)
It is possible the framework in use generated those files as part of its scheme to handle client-side cache issues. Though without knowing more details of what framework they used, there's no way of knowing for sure.
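A common way to produce such names is to derive the extra string from the file's content, so the name changes exactly when the content does. This is a generic sketch of that idea, not what any particular framework does; `versionedName` is a hypothetical helper and the 8-character MD5 prefix is an arbitrary choice.

```javascript
const crypto = require("crypto");

// Insert a short content hash before the file extension,
// e.g. myScript.js -> myScript.<hash>.js
function versionedName(fileName, content) {
  const hash = crypto.createHash("md5").update(content).digest("hex").slice(0, 8);
  const dot = fileName.lastIndexOf(".");
  if (dot === -1) return fileName + "." + hash;
  return fileName.slice(0, dot) + "." + hash + fileName.slice(dot);
}
```

The build step would write the script under the new name and update the reference in the HTML, so browsers fetch it only when the content has changed.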
I have had some thoughts recently on how to handle shared javascript and css files across a web application.
In a web application that I am currently working on, I have quite a large number of different JavaScript and CSS files placed in a folder on the server. Some of the files are reused, while others are not.
In a production site, it's quite stupid to have a high number of HTTP requests and many kilobytes of unnecessary javascript and redundant css being loaded. The solution to that is of course to create one big bundled file per page that only contains the necessary information, which then is minimized and sent compressed (GZIP) to the client.
It's no trouble to bundle JavaScript files and minify them manually if you only do it once, but since the app is continuously maintained and things change, it quickly becomes a headache to do this manually while pushing out updates that change JavaScript and/or CSS files in production.
What's a good approach to handle this? How do you handle this in your application?
I built a library, Combres, that does exactly that, i.e. minify, combine etc. It also automatically detects changes to both local and remote JS/CSS files and pushes the latest versions to the browser. It's free and open-source. Check this article out for an introduction to Combres.
I am dealing with the exact same issue on a site I am launching.
I recently found out about a project named SquishIt (see on GitHub). It is built for the Asp.net framework. If you aren't using asp.net, you can still learn about the principles behind what he's doing here.
SquishIt allows you to create named "bundles" of files and then to render those combined and minified file bundles throughout the site.
CSS files can be categorized and partitioned into logical parts (like common, print, etc.), and then you can use CSS's import feature to load them. Reusing these small files also makes client-side caching possible.
When it comes to JavaScript, I think you can solve this problem on the server side: add multiple script files to the page. You can also generate the script file dynamically on the server, but for client-side caching to work, these parts should have distinct, static addresses.
I wrote an ASP.NET handler some time ago that combines, compresses/minifies, gzips, and caches the raw CSS and Javascript source code files on demand. To bring in three CSS files, for example, it would look like this in the markup...
<link rel="stylesheet" type="text/css"
href="/getcss.axd?files=main;theme2;contact" />
The getcss.axd handler reads in the query string and determines which files it needs to read in and minify (in this case, it would look for files called main.css, theme2.css, and contact.css). When it's done reading in the file and compressing it, it stores the big minified string in server-side cache (RAM) for a few hours. It always looks in cache first so that on subsequent requests it does not have to re-compress.
I love this solution because...
It reduces the number of requests as much as possible
No additional steps are required for deployment
It is very easy to maintain
Only down-side is that all the style/script code will eventually be stored within server memory. But RAM is so cheap nowadays that it is not as big of a deal as it used to be.
Also, one thing worth mentioning: make sure that the query string is not susceptible to any harmful path manipulation (only allow A-Z and 0-9).
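The original handler is ASP.NET, so the following JavaScript is only an illustration of the same two ideas: validating the `files` parameter and caching the combined result. `parseFileList` and `getBundle` are hypothetical names, the file reader is passed in as a function, the check is treated as case-insensitive, and minification and cache expiry are left out.

```javascript
// Split "main;theme2;contact" into names, rejecting anything that is not
// purely alphanumeric (so "../web.config" or "a/b" cannot reach the disk).
function parseFileList(filesParam) {
  const names = String(filesParam).split(";").filter((n) => n.length > 0);
  for (const name of names) {
    if (!/^[A-Za-z0-9]+$/.test(name)) {
      throw new Error("Invalid file name: " + name);
    }
  }
  return names;
}

// In-memory cache keyed by the raw query value.
const bundleCache = new Map();

// Combine the requested CSS files, reading each one via `readFile(name)`.
// A cached bundle is returned without touching the files again.
function getBundle(filesParam, readFile) {
  if (bundleCache.has(filesParam)) return bundleCache.get(filesParam);
  const bundle = parseFileList(filesParam)
    .map((name) => readFile(name + ".css"))
    .join("\n");
  bundleCache.set(filesParam, bundle);
  return bundle;
}
```

Because validation happens before any read, a request such as `files=main;../secret` fails without ever building a file path.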
What you are talking about is called minification.
There are many libraries and helpers for different platforms and languages to help with this. As you did not post what you are using, I can't really point you towards something more relevant to yourself.
Here is one project on google code - minify.
Here is an example of a .NET Http handler that does all of this on the fly.