Run code off the main thread? - javascript

I know it's possible to run js-ctypes off the main thread so it acts async by using ChromeWorker. But ChromeWorkers can't use XPCOM.
So I was wondering if there is a way to run other synchronous stuff off the main thread?
I was hoping to use it for things like nsIZipWriter, nsIToolkitProfileService::Lock/Unlock, etc.

In Javascript, the only way to run off-the-main-thread code is WebWorker/ChromeWorker, which indeed does not have XPCOM access.
Actually, there used to be a way to use XPCOM from workers, and I was initially upset when it got removed again, but now I appreciate that it was the right thing to do: much (most?) of XPCOM is not thread-safe, not even when using what appear to be self-contained instances of XPCOM classes, because in the end many of these things end up calling some non-thread-safe service as part of their implementation. This leads to data and/or memory corruption and eventually to crashes and data loss. The problem was (and is) that it does not always corrupt memory, because there is not always a data race; instead it just causes havoc once every X times you run the code. People would develop and test their code, and it happened to work, or at least looked like it worked, but once more people (a.k.a. the users) started executing the code, crashes started to pile up.
It is possible to run code off the main thread in C++, but it has the same problem: much of XPCOM is not thread-safe, so you need to be very careful about what you run in a different thread, i.e. only access things that were explicitly marked thread-safe, and even with such a marker there might be thread-safety bugs.
So, you cannot use XPCOM in another thread from JS (unless there are dedicated components doing this for you, like nsIAsyncStreamCopier) and even running XPCOM in another thread from C++ requires a lot of knowledge, skill and time to debug things if there are crashes after all.
If you really want to, then things like a zip writer could be implemented reasonably easily in JS and run in a Worker. E.g. the zip format isn't particularly hard to implement, in particular if you don't need actual compression, and OS.File lets you do file I/O from a worker fairly conveniently.
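To give a feel for how little is needed when compression is skipped, here is a minimal sketch of a stored-only (uncompressed) zip writer for a single file. The helper names `crc32` and `storeZip` are hypothetical and not part of any Mozilla API. The sketch assumes an ASCII file name, sizes below 4 GiB, and a fixed 1980-01-01 timestamp; it only builds the archive bytes in memory, so writing them to disk from the worker is left out.

```javascript
// CRC-32 lookup table (reflected polynomial 0xEDB88320), as used by zip.
const CRC_TABLE = new Uint32Array(256).map((_, n) => {
  let c = n;
  for (let k = 0; k < 8; k++) c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  return c >>> 0;
});

function crc32(bytes) {
  let c = 0xffffffff;
  for (const b of bytes) c = CRC_TABLE[(c ^ b) & 0xff] ^ (c >>> 8);
  return (c ^ 0xffffffff) >>> 0;
}

// Build a zip archive holding one stored (method 0) entry.
// Assumption: `name` is ASCII and `data` is a Uint8Array.
function storeZip(name, data) {
  const nameBytes = new TextEncoder().encode(name);
  const crc = crc32(data);
  const DOS_DATE = 0x21; // 1980-01-01; the time field stays 0 (00:00:00)

  const local = new DataView(new ArrayBuffer(30)); // local file header
  local.setUint32(0, 0x04034b50, true);            // signature "PK\3\4"
  local.setUint16(4, 20, true);                    // version needed (2.0)
  local.setUint16(12, DOS_DATE, true);
  local.setUint32(14, crc, true);
  local.setUint32(18, data.length, true);          // compressed size
  local.setUint32(22, data.length, true);          // uncompressed size
  local.setUint16(26, nameBytes.length, true);

  const central = new DataView(new ArrayBuffer(46)); // central directory entry
  central.setUint32(0, 0x02014b50, true);            // signature "PK\1\2"
  central.setUint16(4, 20, true);                    // version made by
  central.setUint16(6, 20, true);                    // version needed
  central.setUint16(14, DOS_DATE, true);
  central.setUint32(16, crc, true);
  central.setUint32(20, data.length, true);
  central.setUint32(24, data.length, true);
  central.setUint16(28, nameBytes.length, true);
  // Offset 42 (local header offset) stays 0: the entry starts the archive.

  const end = new DataView(new ArrayBuffer(22));     // end of central directory
  end.setUint32(0, 0x06054b50, true);                // signature "PK\5\6"
  end.setUint16(8, 1, true);                         // entries on this disk
  end.setUint16(10, 1, true);                        // entries in total
  end.setUint32(12, 46 + nameBytes.length, true);    // central directory size
  end.setUint32(16, 30 + nameBytes.length + data.length, true); // its offset

  const parts = [local, nameBytes, data, central, nameBytes, end].map((p) =>
    p instanceof DataView ? new Uint8Array(p.buffer) : p
  );
  const out = new Uint8Array(parts.reduce((n, p) => n + p.length, 0));
  let pos = 0;
  for (const p of parts) {
    out.set(p, pos);
    pos += p.length;
  }
  return out;
}
```

Because nothing here touches XPCOM or the DOM, a function like this could run inside a ChromeWorker and hand the resulting bytes to whatever file API the worker has access to.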

I think yes you can run sync stuff in an async way.
See: https://developer.mozilla.org/en-US/Add-ons/Code_snippets/Threads

Related

Simulate JS execution to read heap memory

I have a problem where I need to see if a particular JavaScript source code takes a lot of heap space. Ideally I would like to have access to heap memory usage and data type of objects in the heap. The trouble is that it seems I'll have to execute the code to have access to heap mem allocation information.
The code, however, is malicious (heap spray attacks), so I would like to avoid full execution. Is there a way for me to simulate the execution instead? I've read that I can use sbrk or an API hook (MSFT Detours) to get memory usage for a particular process (usually the JS interpreter/engine), but it looks like these use cases actually execute the code.
EDIT:
I would need to access heap memory as part of a pipeline for multiple JS files so it would be ideal having memory info via a command or through an API.
If you use Chrome, you can use the Performance tab of Developer Tools. Just press record, then refresh the page or run your JS script.
If you want to see JS memory, you can also use Chrome's Task Manager (More Tools -> Task Manager).
What does it mean to "simulate execution"?
Generally speaking: JavaScript engines are made to execute JavaScript. For real.
For analyzing malicious code, you'll probably want to look into sandboxing/isolating it as much as possible. In theory, executing it normally in a browser should be enough -- in practice though, security bugs do sometimes exist in browsers, and malicious code will attempt to exploit those, so for this particular purpose that probably won't be enough.
One approach is to add a whole other layer of sandboxing. Find yourself a JavaScript-on-JavaScript interpreter. Or pick a non-JIT-compiling JavaScript engine, and compile it to WebAssembly, and run that from your app. You can then inspect the memory of the WebAssembly instance running the malicious code; this memory is exposed as an ArrayBuffer to your JavaScript app. (I don't have a particular recommendation for such a JS engine, but I'm sure they exist.) It might be a bit of effort to get such a setup going (not sure; haven't tried), but it'd give you perfect isolation from evil code.
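As a rough illustration of the "memory is exposed as an ArrayBuffer" point, the sketch below creates a standalone `WebAssembly.Memory` and scans it from the host side. No JS engine is actually compiled to WebAssembly here: the `fill` call merely stands in for writes that guest code would make, and `countByte` and the 0x90 filler value are hypothetical choices for illustration.

```javascript
// One WebAssembly page is 64 KiB; a real engine instance would export
// (or import) a memory like this one.
const memory = new WebAssembly.Memory({ initial: 1 });

// The host sees the guest's entire linear memory as a plain ArrayBuffer.
const bytes = new Uint8Array(memory.buffer);

// Stand-in for guest activity: pretend the sandboxed code sprayed 1 KiB
// of a repeated filler byte.
bytes.fill(0x90, 0, 1024);

// Hypothetical helper: count how often a byte value occurs in a view.
function countByte(view, value) {
  let count = 0;
  for (const b of view) if (b === value) count++;
  return count;
}

const sprayed = countByte(bytes, 0x90);
```

A real pipeline would need to take a fresh view after the guest grows its memory, since growing a `WebAssembly.Memory` detaches the previous `buffer`.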

Can node prevent an infinite loop?

Since node runs a single-threaded model with an event loop, I wonder how node prevents the entire application from failing if you write code like:
while(true){ doSomething()}
where doSomething is a synchronous function (a blocking piece of code)
Note that it doesn't make any sense to write a function like doSomething, but nothing prevents you from making a mistake.
The problem here is that, since it's single threaded, it won't allow any other parts of the application to run (for instance, a web server would stop accepting new connections) because this function would never end. In a multi-threaded environment you would lose this thread alone.
Is there anything that node can do for you to prevent these kinds of problems?
I wonder how node prevents the entire application to fail if you write an infinite loop
nodejs does not prevent such an infinite loop. It will just run that loop forever or until some resource is exhausted (if the loop is consuming some resource like memory).
If node can't prevent this kind of situations, is this a design fault or there's no way to prevent these kind of problems?
I don't think most people consider it a design fault - though that's purely an opinion and different people may have a different opinion. It is a consequence of the way nodejs was designed which has many other benefits.
The only way to prevent such problems is to not write faulty code that does this. Honestly, it's not too hard to avoid writing this type of code once you're aware that it's an issue to avoid.
The problem here is that, since it's single threaded, it won't allow any other parts of the application to run (for instance, a web server would stop accepting new connections) because this function would never end. In a multi-threaded environment you would lose this thread alone.
Correct. This is something you learn when coding in nodejs. I've never found it a hard thing to avoid. nodejs is a single-threaded, event-driven system, not a multi-threaded system. As such, you program with events, not long-running loops that poll or check conditions. It is a rather straightforward concept to learn and use once you understand this is how nodejs works. It is different from some other environments. But how to use asynchronous operations in nodejs is just something you have to learn to program in that environment. It's not avoidable and is just part of the character of nodejs. There is no way that nodejs could have the type of architecture it has without having to learn this to program in it. If you want a different architecture (for whatever personal reason), then pick a different environment, not nodejs.
The single-threadedness massively simplifies many other things (far, far fewer opportunities for race conditions) and improves scalability in some circumstances (with asynchronous I/O) vs. threaded environments. For situations where you want multiple CPUs to be applied to your problem, it is generally straightforward in node.js to either use the built-in clustering module or to fire up worker processes and feed them work. Data is often shared among multiple processes via some sort of database (either file-based or RAM-based) that handles much of the multi-process synchronization for you.
It doesn't. This seems like less of a question and more an open statement. Node will loop infinitely and all your parallel code will stop running.
It's not possible to detect such an issue from within the node.js program itself. However, a node.js script stuck in an infinite loop will drive CPU usage to 100%, so this can be monitored, and you can use tools to restart the program. I don't recommend doing this; you should fix your infinite loop first, but it's sometimes hard to find the issue in a large codebase. The last time it happened to me, I used a remote debugger to find the infinite loop.
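For code you evaluate yourself (for example, a snippet loaded at runtime), Node's built-in `vm` module accepts a `timeout` option that interrupts a script that runs too long. The sketch below is only a partial safeguard, not something the answers above propose: `runWithTimeout` is a hypothetical helper name, the timeout does not protect ordinary application code that is simply called as a function, and `vm` is not a security sandbox.

```javascript
const vm = require("vm");

// Hypothetical helper: evaluate `source` in a fresh context and give up
// after `timeoutMs` milliseconds of synchronous execution.
function runWithTimeout(source, timeoutMs) {
  try {
    const value = vm.runInNewContext(source, {}, { timeout: timeoutMs });
    return { ok: true, value };
  } catch (err) {
    // Node reports an exceeded timeout with this error code.
    const timedOut = err.code === "ERR_SCRIPT_EXECUTION_TIMEOUT";
    return { ok: false, timedOut, error: err };
  }
}

const fine = runWithTimeout("1 + 2", 50);
const stuck = runWithTimeout("while (true) {}", 50);
```

Here `stuck.timedOut` is set once the loop has blocked the thread for the full timeout, so the event loop is still stalled for that long; the guard bounds the damage rather than preventing it.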

Webpack HMR vs Skewer mode in emacs

I have recently started looking into webpack because of cool features that enable writing true CSS modules, smart bundling and so on, and then there is HMR, which is why I am here. I have seen examples of React Redux projects that made it possible to update JavaScript code without reloading the browser. Wow, I thought that was impossible.
I wanted to know more, especially how it works under the hood, to make it work with my current project which is Vanilla JS.
In the meantime, my interest in functional programming languages brought me to Emacs. I found out that there is a skewer-mode available in the Emacs editor that updates JavaScript and HTML (!) in real time without reloading the browser.
I know that they both use local server to push the changes to the browser and some script on client that somehow updates the code. But how do they preserve the state of application. In terms of React projects its kind of imaginable, because of component based nature of apps, you can just replace component with new one, but I am not sure how do they search for variables and reassign new values to them. Maybe they do use some eval magic. But I am not sure.
So how do they exactly work? Maybe I am looking from the wrong angle, I just don't have a clear picture.
Emacs has live update of HTML too, can webpack HMR do that?
(I don't care much about HTML because I do it in JS. But I think it can explain difference between these two.)
Which is better in doing so?
What are the pros and cons of each, or are they just different parts of the world that can be integrated to become something even better?
Maybe there are even better options that don't need middleware like a local web server, just an editor plugin communicating with some browser extension?
P.S.: I don't mind learning tools that can optimize my work, because it always pays off.
So how do they exactly work?
From the Webpack HMR documentation,
In general the module developer writes handlers that are called when a dependency of this module is updated. He can also write a handler that are called when this module is updated.
Each module type needs update logic written for it.
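As a rough illustration of such handlers, here is a minimal sketch of a self-accepting module that carries its state across a hot update. `module.hot.accept`, `module.hot.dispose` and `module.hot.data` are part of webpack's HMR API; the counter module itself is a hypothetical example. Outside a webpack HMR build `module.hot` is undefined, so the handler block is simply skipped.

```javascript
// `module.hot` only exists when webpack's HMR runtime is active.
const hot = typeof module !== "undefined" ? module.hot : undefined;

// Restore state handed over by the previous version of this module, if any.
const state = { count: hot && hot.data ? hot.data.count : 0 };

function increment() {
  state.count += 1;
  return state.count;
}

if (hot) {
  // Called just before this module is replaced: stash state for the successor.
  hot.dispose((data) => {
    data.count = state.count;
  });
  // Mark this module as able to update itself without a full page reload.
  hot.accept();
}
```

The state survives only because the module explicitly saves and restores it, which is the kind of per-module update logic the documentation refers to.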
From the skewer-mode repository,
Expressions are sent on-the-fly from an editing buffer to be evaluated in the browser, just like Emacs does with an inferior Lisp process in Lisp modes.
Your code is sent to the browser as a string, and runs through a global eval.
Which is better in doing so? What are the pros and cons of each?
If you use libraries that have HMR plugins written for them, it might be worth using this feature. Libraries without HMR hooks will not benefit from it. Webpack's HMR seems extremely complex, and its documentation and its plugins warn about HMR's "experimental" state. Therefore, it is probably not reliable, and thus could be counter-productive to your development. For instance, the reloading modules need to correctly clean up the non-reloading ones. What if some script adds listeners to window or document and doesn't provide a way to remove them?
If you want your text editor to serve as an additional REPL for your browser, then you can use skewer-mode. To effect any change in your application, some part of it must be exposed via a global variable. Maybe you export one global object with a bunch of submodules attached to it, e.g. window.APP = {}, APP.Dialog, APP.Form... or, maybe you just release hundreds of implicit global variables and functions into your environment. You can see changes in your application by evaluating the definitions of these modules / functions / variables, and then evaluating some code that uses them, e.g. by calling a function APP.initialize() which bootstraps your app, or by triggering a redraw in a view library you use (usually by performing a user action like clicking an element).
If your application is not written such that it can be modified in a browser console (e.g. if you use a module compiler like Browserify or Webpack, which wraps your code in one big closure), then you won't be able to do much with skewer-mode. Also, consider whether it would be faster to manually eval code snippets / files and re-run initialization code (and potentially create impossible application state that you will waste time debugging), or to just refresh the page and recreate your previous state.
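The difference between globally exposed code and closure-wrapped code can be sketched with an indirect `eval`, which evaluates a string in the global scope much like code typed into the browser console. The names `APP`, `evalGlobal` and `sealed` are hypothetical, and `globalThis` is used in place of `window` so the sketch also runs outside a browser; this is an illustration of the mechanism, not skewer-mode's actual implementation.

```javascript
// An application that exposes itself through one global object.
globalThis.APP = { greet: () => "hello" };

// Calling eval indirectly, via (0, eval), makes it run in the global scope
// instead of the scope of the calling function.
const evalGlobal = (source) => (0, eval)(source);

// A bundler-style closure: `secret` is not visible from global code.
const sealed = (() => {
  const secret = 1;
  return { get: () => secret };
})();

// Code sent from the editor can redefine anything reachable from a global...
evalGlobal('APP.greet = () => "hello, world"');

// ...but it has no way to name variables hidden inside the closure.
const seenFromGlobal = evalGlobal("typeof secret");
```

After this runs, `APP.greet()` returns the new string, while the closure's `secret` is untouched and reported as undefined from the global side.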
The benefit you gain from either of these tools is heavily reliant on the way your application is structured. I can see them creating pleasant development workflows under exactly the right conditions (what I describe above). Otherwise, they seem too likely to cause harm to be worthwhile.

What exactly are web workers and when to use them

I was reading up something about XMLHttpRequest (Is there any reason to use a synchronous XMLHttpRequest?) here on SO where I read on a thread from 2010 that, with the introduction of 'threads' in HTML5, developers might start to use synchronous APIs. Searching a bit on google, I found the MDN page on web workers.
I have been writing JavaScript and Node for about a year now (assume a beginner), and I have yet to encounter something that makes use of these web workers. Maybe I need to read more code.
Now my question is, even though they seem to be very useful, why aren't they seen much in the wild? Also, what are the general use cases and guidelines when using them? Is it possible to reap the benefits of multithreaded processing in a Nodejs environment? If so, why are all Nodejs APIs still asynchronous?
Thank you.
A web-worker is strictly a clientside thing, so it has nothing to do with Node.js (EDIT: actually, see this module).
You might have heard that JavaScript is strictly single-threaded: if a function is doing some heavy calculation, nothing else is getting done, including animating icons, repainting the window, nothing. Thus, clientside JS should always avoid heavy computation, large loops and anything else that might usurp the thread for more than a fraction of a second.
Web-workers are the solution for that. Each web-worker is running in its own thread, and it can block as much as it wants - it won't affect the normal operation of the web page. The tradeoff is that it cannot have any access to the DOM: the fact that it doesn't affect the rendering means you cannot affect rendering with it. :) If a web-worker wants to render something, it would have to send a message to the main thread to do it.
Implementation-wise, each web-worker needs to be in a separate JS file. The reason why you don't see more of them is probably twofold: the average Joe probably doesn't know how to use them, and they are only needed when you need serious computation and don't want it to block your main thread - which is not that common in the first place, and when it is, the computation is commonly offloaded to the server (on clientside) or to separate processes (in Node.js).
Read more on HTML5 Rocks.
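The message passing described above can be sketched as follows. `countPrimes` and `countPrimesInWorker` are hypothetical names. To keep the example in one piece, the worker is built from a Blob URL rather than a separate worker file, and the worker-spawning function only works in a browser, where `Worker` is defined.

```javascript
// CPU-heavy work: on the main thread this would freeze the page for
// large limits.
function countPrimes(limit) {
  let count = 0;
  for (let n = 2; n <= limit; n++) {
    let prime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) {
        prime = false;
        break;
      }
    }
    if (prime) count++;
  }
  return count;
}

// Browser only: run countPrimes in a worker and resolve with its answer.
function countPrimesInWorker(limit) {
  // Worker script: the function's source plus a message handler.
  const source =
    `${countPrimes}\n` +
    "self.onmessage = (e) => self.postMessage(countPrimes(e.data));";
  const url = URL.createObjectURL(
    new Blob([source], { type: "text/javascript" })
  );
  return new Promise((resolve, reject) => {
    const worker = new Worker(url);
    worker.onmessage = (e) => {
      resolve(e.data); // result computed off the main thread
      worker.terminate();
      URL.revokeObjectURL(url);
    };
    worker.onerror = reject;
    worker.postMessage(limit); // the input is copied to the worker
  });
}
```

In a page, `countPrimesInWorker(5e6).then((n) => { /* update the DOM here */ })` keeps rendering responsive while the count is computed; the DOM update itself still happens on the main thread.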

Ruby plugin for web browser?

Am I correct that if someone wrote a Ruby plugin for a web browser and a user installed that plugin then it would be possible to replace javascript with ruby on the frontend?
Aren't there any plugins for this? Or even for using other languages than javascript on the browser side?
You could use http://ironruby.net/ in a Silverlight Plugin, but I have not a clue about how easy DOM interaction is this way.
But I BEG YOU don't do it! Please, use the Open Web Stack to solve your problems.
If you don't leave your Ruby world of comfort, you will not only hurt your users' experience ("WTF? Why do I need Silverlight for this page?") but you will also get stuck in your small little Ruby world without learning anything new and exciting.
It would be better for both of you, if you'd just go ahead and learn JavaScript.
Because remember: "Learning is a good thing!"
One thing is A FACT: as of 2010 JavaScript does not have a thread stopping "sleep" function (other than the one that just burns CPU cycles).
I have been working with JavaScript for at least a year before posting this comment and I have come to a conclusion that the lack of a thread-stopping sleep function is a real show-stopper for threading related code.
A consequence of the lack of the sleep function is that it's not possible to simulate a Ruby/C#/C++/etc. like threading model in JavaScript, which in turn means that it's not possible to translate any of the threading enabled languages to JavaScript, no matter, what one does, unless the JavaScript is supplemented with a (preferably non-CPU-cycle-burning) sleep function.
If one surfs around, then one can find many comments that state that the sleep function is not even necessary, that the setTimeout is sufficient, etc., but I guess that people, who state that, have not tried to implement a threading framework in JavaScript. (Think of mutexes, critical sections. I refuse to go into a discussion that the critical sections/synchronization are/is not necessary for cases, where widget content consists of multiple data components that form an "atomic whole".)
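For context, JavaScript engines released after this 2010 answer do provide a thread-blocking wait that does not burn CPU cycles: `Atomics.wait` on shared memory. The sketch below wraps it in a hypothetical `sleepSync` helper. It works in Node and in web workers; browsers refuse `Atomics.wait` on the main thread and only enable `SharedArrayBuffer` on cross-origin-isolated pages.

```javascript
// Block the current thread for roughly `ms` milliseconds without spinning.
function sleepSync(ms) {
  // A private 4-byte shared cell that nobody else will ever notify.
  const cell = new Int32Array(new SharedArrayBuffer(4));
  // Wait while cell[0] is still 0, for at most `ms` milliseconds.
  return Atomics.wait(cell, 0, 0, ms);
}

const started = Date.now();
const outcome = sleepSync(30); // "timed-out", since no notify ever arrives
const elapsed = Date.now() - started;
```

Combined with `Atomics.notify` from another thread, the same primitive can serve as a building block for mutex-like constructs between workers.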
The second show-stopper for the whole DOM-model is the implementation that renders DOM elements IN THE BACKGROUND THREAD.
Here's, what happens:
In Javascript:
create_my_awsome_widget_in_DOM();
edit_my_awsome_widget_by_editing_DOM_inside_it()
if_we_are_lucky_we_reach_here_without_crashing_the_app()
As the DOM is rendered in the background (read: in a separate thread), there will be a race condition between the thread that initiated the DOM editing, by making a call to create_my_awsome_widget_in_DOM(), and the DOM rendering. If the rendering thread is "quick enough" to render the DOM before the JavaScript thread calls edit_my_awsome_widget_by_editing_DOM_inside_it(), everything works fine, but if it's the other way around, then the JavaScript starts to modify a region of the DOM that does not (yet) exist.
Essentially it means that due to the background DOM rendering the create_my_awsome_widget_in_DOM() and edit_my_awsome_widget_by_editing_DOM_inside_it() are executed in a random order and obviously the application crashes, if the edit_my_awsome_widget_by_editing_DOM_inside_it() is called before the create_my_awsome_widget_in_DOM().
There might be a way to do it indirectly. Here is the original presentation at RubyConf 2008. The topic:
This talk is about the many paths towards getting ruby running in your web browser. I'll first talk about why this is even a good idea. I'll then talk briefly about each approach I've investigated and the differing amounts of FAIL I encountered with each. Next I'll focus on the most promising contender, rubyjs, a ruby compiler which outputs javascript.
The project rubyjs still exists, but it appears to be dead. The idea probably was a little too crazy.
mruby seems like an interesting option for running ruby in a web browser:
http://qiezi.me/projects/mruby-web-irb/mruby.html
It's not a typical plugin as it does not require installation, it's javascript (compiled from C) running ruby code.
Technically that would be correct, assuming the browser/plugin also provided an extensive API to deal with the DOM and such. I am not aware of any plugins that make this possible, but it's an interesting idea.
