How to multithread a download in client-side JavaScript

I have very large (50-500GB) files for download, and the single-threaded download offered by the browser (Edge, Chrome, or Firefox) is a painfully slow user experience. I had hoped to speed this up by using multithreading to download chunks of the file, but I keep running into browser sandbox issues.
So far the best approach I've found would be to download all the chunks, stuff them into localStorage, and then save the combined result as a blob, but I'm concerned about the soft limits on storing that much data locally (as well as the performance of stitching all the data back together).
Ideally, someone has already solved this (and my search skills just weren't up to the task of finding it). The only things I have found have been server-side solutions (which have straightforward file system access). Alternatively, I'd welcome another approach that is less likely to trip browser security or quota dialogs and more likely to provide the performance my users are looking for.
Many thanks!

One cannot. Browsers intentionally limit the number of connections to a website. To get around this limitation with today’s browsers requires a plugin or other means to escape the browser sandbox.
Worse, because of a lack of direct file system access, the data from multiple downloads has to be cached and then reassembled into the final file, instead of having multiple writers to the same file (and letting the OS cache handle optimization).
TLDR: Although it is possible to have multiple download threads, the per-host maximum is low (on the order of four to six connections), and the data has to be handled repeatedly. Use a plugin or a dedicated download tool such as curl or an FTP client.
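For what it's worth, here is a minimal sketch of the chunked approach the question describes, using fetch() with Range headers. It assumes the server honours Range requests and exposes Content-Length, and the helper name is made up. Note that it still has to hold every chunk in memory before reassembling the Blob, which is exactly the re-handling problem described above, so it does not help with 50-500GB files:

// Hypothetical helper: downloads `url` in `parts` parallel Range requests.
// Assumes the server honours Range headers and exposes Content-Length.
async function downloadInChunks(url, parts = 4) {
  const head = await fetch(url, { method: 'HEAD' });
  const size = Number(head.headers.get('Content-Length'));
  const chunkSize = Math.ceil(size / parts);

  const chunks = await Promise.all(
    Array.from({ length: parts }, (_, i) => {
      const start = i * chunkSize;
      const end = Math.min(start + chunkSize - 1, size - 1);
      return fetch(url, { headers: { Range: `bytes=${start}-${end}` } })
        .then(res => res.blob());
    })
  );

  // Stitch the pieces back together in memory and hand the result to the user.
  const blob = new Blob(chunks);
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = url.split('/').pop();
  link.click();
}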

Related

Scan and access local file directory in Firefox and IE?

I'm doing some research on whether it's possible for a web app (meant to be used and distributed internally) to scan and read files from a local directory (on the user's machine). I came across a couple of terms, as follows:
NPAPI: no longer supported by the majority of web browsers
ActiveX: IE only
Sandbox: the technology Chrome uses; it doesn't fit the requirement, so I have to look elsewhere
I feel like ActiveX might be the only option, even though I haven't actually written an ActiveX control before (so I'm not sure if it's possible).
Also, the goal is to support more than one web browser, so besides IE I thought Firefox might be capable of meeting the requirement, since no search result so far has said otherwise.
Could someone please give me some pointers? I just need to know whether it's at all possible to build an ActiveX control or Firefox extension to scan and read files from a local directory. If it is, what are the downsides other than the security vulnerability?

Why can't JavaScript send commands to the OS level?

If HTML, CSS, and JavaScript are processed by the user's computer, why can't JavaScript send commands to the OS level? I know if this happened, hackers could exploit a lot of computers but what prevents it from happening?
Simple answer: it's the browser. The browser is like any other program on your computer; given enough permissions, it can do whatever it wants through system calls. It can access your hard drive (and not just the filesystem, I mean block/sector-level access), reading or deleting whatever it wishes; it could even read or edit your MBR. Other fun stuff, such as ejecting the CD tray, putting the OS to sleep or shutting it down, formatting your drives, showing infinite nag screens, or disabling network adapters, could all be done if browser makers chose to expose that functionality through JavaScript.
For example, if Microsoft in some distant future were to expose a system API through a system object analogous to the window object in the current JavaScript spec, you could write a script like <script>system.ejectDrive['cd']</script>, the browser might translate it into the actual WinAPI call mciSendCommand(mPar.wDeviceID, MCI_SET, MCI_SET_DOOR_OPEN, 0), and bingo. That sounds cool, but what if a hacked eBay server sent you code for wiping your D:\ drive clean? Now you can imagine why browser makers take security so seriously.
You might wonder why I'm so obsessed with disk-drive ejection; I chose that example because of its physical nature. A random bit changed in a computer's memory may mean nothing to an average user unless it has some sort of "physical effect", even though that effect may well depend on that bit.
JavaScript can have access to your computer in some cases:
https://nakedsecurity.sophos.com/2016/06/20/ransomware-thats-100-pure-javascript-no-download-required/
Some ransomware arrives as .js files attached to e-mails and tempts users to open them locally.
It is not about JavaScript, it is about the software that runs the script, and how the software interprets it.
With js in browser, it can only do what the browser allows it to.
With Node.JS, you can write "JavaScript" to start your own server.
When run locally in Windows via a double click, it can be very dangerous.
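To illustrate the point that the host matters (this is a sketch, and the path is purely hypothetical): the snippet below fails in a browser, where there is no require and no fs module, but run under Node.js it deletes a real file.

// Fails in a browser (no require/fs); destructive when run under Node.js.
const fs = require('fs');
fs.unlinkSync('C:/Users/someone/important.txt'); // hypothetical path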

How do you detect memory limits in JavaScript?

Can browsers enforce any sort of limit on the amount of data that can be stored in JavaScript objects? If so, is there any way to detect that limit?
It appears that by default, Firefox does not:
var data;
$(document).ready(function() {
    data = [];
    for (var i = 0; i < 100000000000; i++) {
        data.push(Math.random());
    }
});
That continues to consume more and more memory until my system runs out.
Since we can't detect available memory, is there any other way to tell we are getting close to that limit?
Update
The application I'm developing relies on very fast response times to be usable (it's the core selling point). Unfortunately, it also has a very large data set (more than will fit into memory on weaker client machines). Performance can be greatly improved by preemptively loading data strategically (guessing what will be clicked). The fallback to loading the data from the server works when the guesses are incorrect, but the server round trip isn't ideal. Making use of every bit of memory I can makes the application as performant as possible.
Right now, it works to allow the user to "configure" their performance settings (max data settings), but users don't want to manage that. Also, since it's a web application, I have to handle users setting that per computer (since a powerful desktop has a lot more memory than an old iPhone). It's better if it just uses optimal settings for what is available on the systems. But guessing too high can cause problems on the client computer too.
While it might be possible on some browsers, the right approach should be to decide what limit is acceptable for the typical customer and optionally provide a UI to define their limit.
Most heavy web apps get away with about 10MB of JavaScript heap. There does not seem to be an official guideline, but I would imagine that consuming more than 100MB on desktop or 20MB on mobile is not really nice. For anything beyond that, look into local storage, e.g. the FileSystem API (and you can totally make it PERSISTENT).
UPDATE
The reasoning behind this answer is the following. It is almost never the case that the user runs only one application, much less that the browser has only one tab open. Ultimately, consuming all available memory is never a good option, so determining the exact upper boundary is not necessary.
How much memory a user is willing to allocate to a web app is guesswork. E.g. a highly interactive data-analytics tool is quite possible in JS and might need millions of data points. One option is to default to a lower resolution (say, daily instead of per-second measurements) or a smaller window (one day vs. a decade of seconds). But as the user keeps exploring the data set, more and more data will be needed, potentially crippling the underlying OS on the client side.
A good solution is to start from a reasonable initial assumption. Let's open some popular web applications and go to DevTools - Profiles - Heap Snapshots to take a look:
FB: 18.2 MB
GMail: 33 MB
Google+: 53.4 MB
YouTube: 54 MB
Bing Maps: 55 MB
Note: these numbers include DOM nodes and JS Objects on the heap.
It seems, then, that people have come to accept around 50MB of RAM for a useful web site. (Update 2022: nowadays it averages closer to 100MB.) Once you build your DOM tree, fill your data structures with test data and see how much is OK to keep in RAM.
Using similar measurements with device emulation turned on in Chrome, one can see the consumption of the same sites on tablets and phones, BTW.
This is how I arrived at the 100 MB on desktop and 20 MB on mobile numbers. They seemed reasonable too. Of course, for the occasional heavy user it would be nice to have an option to bump the max heap up to 2 GB.
Now, what do you do if pumping all this data from the server every time is too costly?
One thing is to use Application Cache. It does create mild version management headaches but allows you to store around 5 MB of data. Rather than storing data though, it is more useful to keep app code and resources in it.
Beyond that we have three choices:
SQLite (Web SQL) - support was limited and it seems to be abandoned
IndexedDB - the better option, but support is not universal yet (check caniuse.com)
FileSystem API
Of them, the FileSystem API is the most supported and can use a sizable chunk of storage.
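As a rough sketch of the IndexedDB route (the database and store names below are made up, and error handling is omitted), caching a record client-side looks like this:

// Open (or create) a small IndexedDB database and cache one record in it.
const open = indexedDB.open('appCache', 1);

open.onupgradeneeded = () => {
  open.result.createObjectStore('records', { keyPath: 'id' });
};

open.onsuccess = () => {
  const db = open.result;
  const tx = db.transaction('records', 'readwrite');
  tx.objectStore('records').put({ id: 42, payload: 'preloaded data' });
  tx.oncomplete = () => db.close();
};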
In Chrome the answer is Sure!
Go to the console and type:
performance.memory.jsHeapSizeLimit; // will give you the JS heap size
performance.memory.usedJSHeapSize; // how much you're currently using
arr = []; for(var i = 0; i < 100000; i++) arr.push(i);
performance.memory.usedJSHeapSize; // likely a larger number now
I think you'll want something like the following:
const memory = navigator.deviceMemory;
console.log(`This device has at least ${memory}GiB of RAM.`);
You can check out the following: https://developer.mozilla.org/en-US/docs/Web/API/Navigator/deviceMemory
Note: This feature is not supported across all browsers.
Since a web app can't access system-related information (like the available amount of memory), and since you would prefer not asking users to set their performance settings manually, you have to rely on a solution that gets that information about the user's system (available memory) without asking them. Seems impossible? Well, almost...
I suggest the following: make a Java applet that automatically gets the available memory size (e.g. using Runtime.exec(...) with an appropriate command), provided your applet is signed, and returns that information to the server or directly to the web page (with JSObject, see http://docs.oracle.com/javafx/2/api/netscape/javascript/JSObject.html).
However, that assumes all of your users can run a Java applet in their browsers, which is not always the case. Alternatively, you could ask them to install a small piece of software on their machines that measures how much memory your app can use without crashing the browser and sends that information to your server. Of course, you would have to rewrite that little program for every OS and architecture (Windows, Mac, Linux, iPhone, Android...), but that's simpler than rewriting the whole application to gain some performance. It's a sort of in-between solution.
I don't think there is an easy solution. There will be some drawbacks, whatever you choose to do. Remember that web applications don't have the reputation of being fast, so if performance is critical, you should consider writing a traditional desktop application.

Auditing front end performance on web application

I am currently trying to performance tune the UI of a company web application. The application is only ever going to be accessed by staff, so the speed of the connection between the server and client will always be considerably more than if it was on the internet.
I have been using performance-auditing tools such as YSlow and Google Chrome's profiling tool to try to highlight areas that are worth targeting for investigation. However, these tools are written with the internet in mind. For example, the current suggestions from a Google Chrome audit of the application are as follows:
Network Utilization
Combine external CSS (Red warning)
Combine external JavaScript (Red warning)
Enable gzip compression (Red warning)
Leverage browser caching (Red warning)
Leverage proxy caching (Amber warning)
Minimise cookie size (Amber warning)
Parallelize downloads across hostnames (Amber warning)
Serve static content from a cookieless domain (Amber warning)
Web Page Performance
Remove unused CSS rules (Amber warning)
Use normal CSS property names instead of vendor-prefixed ones (Amber warning)
Are any of these bits of advice totally redundant given the connection speed and usage pattern? The users will be using the application frequently throughout the day, so it doesn't matter if the initial hit is large (when they first visit the page and build their cache) so long as a minimal amount of work is done on future page views.
For example, is it worth the effort of combining all of our CSS and JavaScript files? It may speed up the initial page view, but how much of a difference will it really make on subsequent page views throughout the working day?
I've tried searching for this but all I keep coming up with is the standard internet facing performance advice. Any advice on what to focus my performance tweaking efforts on in this scenario, or other auditing tool recommendations, would be much appreciated.
One size does not fit all with these things; the item that immediately jumps out as something that will have a big impact is "leverage browser caching". This reduces bandwidth use, obviously, but also tells the browser it doesn't need to re-parse whatever you've cached. Even if you have plenty of bandwidth, each file you download requires resources from the browser - a thread to manage the download, the parsing of the file, managing memory etc. Reducing that will make the app feel faster.
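As an illustration, if the static assets happen to be served by a Node/Express front end (purely an assumption here; any web server can set the same header), far-future caching is one option away:

// Hypothetical Express setup: tell browsers to keep static assets for a week.
const express = require('express');
const app = express();

app.use('/static', express.static('public', {
  maxAge: '7d' // sets Cache-Control: public, max-age=604800
}));

app.listen(8080);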
GZIP compression is possibly redundant, and potentially even harmful if you really do have unlimited bandwidth - it consumes resources both on the server and the client to compress the data. Not much, and I've never been able to measure - but in theory it might make a difference.
Proxy caching may also help - depending on your company's network infrastructure.
Reducing cookie size may help - not just because of the bandwidth issue, but again managing cookies consumes resources on the client; this also explains why serving static assets from cookie-less domains helps.
However, if you're going to optimize the performance of the UI, you really need to understand where the slow-down is. Y!Slow and Chrome focus on common problems, many of them related to bandwidth and the behaviour of the browser. They don't know if one particular part of the JS is slow, or whether the server is struggling with a particular dynamic page request.
Tools like Firebug help with that - look at what's happening with the network, and whether any assets take longer than you expect. Use the JavaScript profiler to see where you're spending the most time.
Most of these tools provide steps or advice for a one-time check. While that solves a few issues, it does not tell you how your users actually experience your site. Real user monitoring is the right way to measure live user performance. You can use the Navigation Timing API to measure page load time and resource timings.
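As a rough sketch, those timings can be read directly in the page and shipped to your own logging endpoint; the console.log calls below are just placeholders:

// Rough page-load and per-resource timings via the Navigation Timing API.
window.addEventListener('load', () => {
  // loadEventEnd is only filled in after the load handlers finish,
  // so read the timings on the next tick.
  setTimeout(() => {
    const t = performance.timing;
    console.log('Page load:', t.loadEventEnd - t.navigationStart, 'ms');

    // Per-resource timings (scripts, CSS, images, XHR...)
    performance.getEntriesByType('resource').forEach(r => {
      console.log(r.name, Math.round(r.duration), 'ms');
    });
  }, 0);
});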
If you are looking for a hosted service, you can try https://www.atatus.com/ which provides real user monitoring, Ajax monitoring, transaction monitoring, and JavaScript error tracking.
Here is a list of additional services you can use to test website speed:
http://sixrevisions.com/tools/free-website-speed-testing/

Displaying a local gstreamer stream in a browser

I have a camera feed coming into a linux machine using a V4l2 interface as the source for a gstreamer pipeline. I'm building an interface to control the camera, and I would like to do so in HTML/javascript, communicating to a local server. The problem is getting a feed from the gst pipeline into the browser. The options for doing so seem to be:
A loopback from gst to a v4l2 device, which is displayed using Flash's webcam support
Outputting an MJPEG stream which is displayed in the browser
Outputting an RTSP stream which is displayed by Flash
Writing a browser plugin
Overlaying a native X application over the browser
Has anyone had experience solving this problem before? The most important requirement is that the feed be as close to real time as possible. I would like to avoid flash if possible, though it may not be. Any help would be greatly appreciated.
You already thought about multiple solutions. You could also stream Ogg/Vorbis/Theora or VP8 to an Icecast server; see the OLPC GStreamer wiki for examples.
Since you are looking for a python solution as well (according to your tags), have you considered using Flumotion? It's a streaming server written on top of GStreamer with Twisted, and you could integrate it with your own solution. It can stream over HTTP, so you don't need an icecast server.
Depending on the codecs, there are various tweaks to allow low-latency. Typically, with Flumotion, locally, you could get a few seconds latency, and that can be lowered I believe (x264enc can be tweaked to reach less than a second latency, iirc). Typically, you have to reduce the keyframe distance, and also limit the motion-vector estimation to a few nearby frames: that will probably reduce the quality and raise the bitrate though.
What browsers are you targeting? If you can ignore Internet Explorer, you should be able to stream Ogg/Theora and/or WebM video directly to the browser using the HTML5 <video> tag. If you need to support IE as well, though, you're probably reduced to a Flash applet. I just set up a web stream using Flumotion and the free version of Flowplayer http://flowplayer.org/ and it's working very well. Flowplayer has a lot of advanced functionality that I have barely even begun to explore.
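For illustration, assuming the stream is exposed over HTTP at some URL (the address below is made up), attaching it to a page takes only a few lines:

// Hypothetical stream URL; the <video> element handles buffering and decoding.
const video = document.createElement('video');
video.src = 'http://streamserver.local:8800/camera.webm';
video.autoplay = true;
video.controls = true;
document.body.appendChild(video);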
