Parallel programming / Synchronization using JavaScript Web Workers

Are there any synchronization primitives like barriers, semaphores, locks, monitors, ... available in JavaScript / Web Workers, or is there a library that lets me use such things (I'm thinking of something like java.util.concurrent in Java)?
Do Workers have obscure properties that differentiate them from threads (can they share memory with the main thread, for example)? Is there a limit on how many workers can be spawned (for security reasons, say)? Is there anything I have to take special care of?

Web workers don't have a concept of shared memory; all messages that are passed between threads are copied. With that being said, you don't have Barriers, Semaphores, Locks, and Monitors, because you don't need them in the web worker model.
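To make the copy semantics concrete, here is a minimal sketch. It assumes a runtime that exposes the global structuredClone function (modern browsers, Node.js 17 or later); postMessage serializes its argument with the same structured clone algorithm. The object and variable names are made up for illustration.

```javascript
// What the sender holds before "posting" the message.
const original = { id: 1, samples: [1, 2, 3] };

// What the receiver would get: a deep copy, not a reference.
const received = structuredClone(original);
received.samples.push(4);

console.log(original.samples.length); // 3: the sender's data is untouched
console.log(received.samples.length); // 4

// Functions cannot be cloned, so they cannot be sent in a message either.
let functionRejected = false;
try {
  structuredClone({ callback() {} });
} catch (err) {
  functionRejected = true;
}
console.log(functionRejected); // true
```

Because each side only ever sees its own copy, there is nothing for a lock to protect in this model.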
Shared memory for workers was proposed back in February 2011, but the status is now wontfix due to developer complexity:
https://lists.webkit.org/pipermail/webkit-unassigned/2011-February/287595.html
There is also a nice blurb about web workers here:
http://blogs.msdn.com/b/ie/archive/2011/07/01/web-workers-in-ie10-background-javascript-makes-web-apps-faster.aspx
Hope this helps

In short: no, there aren't any synchronization primitives in JavaScript, but there is also no need for them, since JavaScript is inherently single-threaded :). Workers can only access their own scope (no DOM manipulation, just calculations) and send messages to the main UI thread, where the normal JS resides. I'm not sure about the maximum number of workers, but there surely is a limit; you could try it out in a browser :)
Hope this helps!

Here you have a library based on jQuery made for that purpose: http://www.megiddo.ch/jcon-q-rency.
Of course the model is not really identical to java.util.concurrent since we are not dealing with the same environment, as explained in the other answers...

Related

How to implement multi threading in Angular?

https://www.npmjs.com/package/threads
It seems to me we can use this package in Angular for running threads.
But I am having difficulty implementing this.
Is there any way to use threading in Angular?
How can I use threads in Angular?

Angular does not have "threads", which by the way can mean many different things, in different contexts, environments, platforms, CPUs, and operating systems. Threads can be a way to accomplish parallelism; or they can be a way to organize your code as a set of concurrent processes; or they can be a way to manage access to shared resources; or any or all of the above.
Angular works in a browser. Browsers run JavaScript. The closest thing we have to threads in our browser world is web workers. To greatly oversimplify, web workers are not light-weight threads; in other words, you wouldn't want to create 100,000 of them. But if you are looking for a simple way to offload some computation away from the main browser task, so that it does not lock up the browser while you are computing, then you are probably interested in web workers.
Web workers do not really need any special library, or wrapping, or scaffolding. They're easy enough to just write directly. However, if you're interested in some ways to facilitate the process of using web workers within an Angular context, then google for "angular web workers".
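As a rough illustration of writing a worker directly, here is a minimal sketch that offloads a CPU-heavy function to a worker built from an inline Blob URL, so that no separate worker file is needed. The names countPrimes and countPrimesInWorker are hypothetical, and the worker part only runs in a browser. A strict Content Security Policy may forbid blob: workers, in which case the worker code would have to live in its own file.

```javascript
// CPU-heavy work we do not want on the main (UI) thread.
function countPrimes(limit) {
  let count = 0;
  for (let n = 2; n <= limit; n++) {
    let prime = true;
    for (let d = 2; d * d <= n; d++) {
      if (n % d === 0) { prime = false; break; }
    }
    if (prime) count++;
  }
  return count;
}

// Browser only: runs countPrimes inside a dedicated worker and
// resolves with the result that the worker posts back.
function countPrimesInWorker(limit) {
  const source = `${countPrimes.toString()}
    self.onmessage = (event) => self.postMessage(countPrimes(event.data));`;
  const url = URL.createObjectURL(new Blob([source], { type: 'text/javascript' }));
  const worker = new Worker(url);

  return new Promise((resolve, reject) => {
    const cleanup = () => { worker.terminate(); URL.revokeObjectURL(url); };
    worker.onmessage = (event) => { cleanup(); resolve(event.data); };
    worker.onerror = (error) => { cleanup(); reject(error); };
    worker.postMessage(limit); // the argument is copied into the worker
  });
}
```

In a component one could then call something like countPrimesInWorker(5000000).then(showResult), where showResult is whatever the application uses to update its state; the UI stays responsive while the worker computes.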
I have no special knowledge of the library you mention. At first glance, it appears to be a way to abstract concurrent algorithms over different threading implementations appropriate for the node.js platform vs. the browser. If you're planning on working in Angular, then most likely the node.js platform part is irrelevant, so this entire library is not anything you should be interested in.

Web Workers - do they create actual threads?

I have always thought that web workers create separate threads, but today I ran into the spec on the W3C website. Below is a citation about web workers:
This allows for thread-like operation with message-passing as the
coordination mechanism.
The question is: if it is thread-like and not an actual thread, what is the advantage (performance-wise) of using this technology?
Any help will be appreciated!

Yes, web workers create actual threads (or processes, the spec is flexible on this). According to the Web Workers specification, when a worker is created the first step is:
Create a separate parallel execution environment (i.e. a separate thread or process or equivalent construct), and run the rest of these steps in that context.
For the purposes of timing APIs, this is the official moment of creation of the worker.
(W3C Web Workers specification section 4.4)
So it is explicitly specified that code running in Web Workers runs in actual threads or processes.
Although it is possible to implement Workers without threads (note the "equivalent construct" language) for use on systems that don't support threads, all browser implementations implement Web Workers as threads.

A web worker runs in a single thread isolated from the main thread. The way workers pass messages around is thread-like, and it works differently depending on whether you're using dedicated workers (which can only be accessed from the script that created them) or shared workers (which can be accessed by any script within the same domain via a port object).
EDIT:
Updated answer to reflect my comment from months ago. While a SINGLE web worker runs in an isolated thread it doesn't mean each additional worker will run in the same thread.

According to MDN,
Web Workers are a mechanism by which a script operation can be made to run in a background thread separate from the main execution thread of a web application. The advantage of this is that laborious processing can be performed in a separate thread, allowing the main (usually the UI) thread to run without being blocked/slowed down.
So, each worker does not create a separate thread, but all workers are running in a single separate thread.
I guess that, as with other things, the implementation and approach may differ from browser to browser.

JavaScript Execution Engine Unspecified?

I started to learn JavaScript recently. I've been working in the creation of applications with Node.js and Angular for a few months now.
One of the main aspects that was puzzling me was how it is possible to write asynchronous code in JavaScript in which I do not have to worry about things like thread synchronization, race conditions, etc.
So, I found a couple of interesting articles ([1], [2]) that explained how I can be guaranteed that any piece of code that I write will always be executed by a single thread at a time. Bottom line, all my asynchronous code is simply scheduled to be executed at some point within an event loop. This sounds pretty much like how an OS scheduler would work on a machine with a single processor, where every process is scheduled to use the processor for a limited amount of time, giving us a fake sense of parallelism. And the callbacks would be like interrupts.
The articles do not provide any particular references, so I thought that the best source on how the JavaScript execution engine works should certainly be the language specification, and so I got myself the latest copy of EcmaScript 5.1.
To my great surprise I discovered that this execution behavior is not specified there. How come? This looks like a fundamental design choice made in all JavaScript execution engines in browsers and in Node. Interestingly, I have not been able to find a place where this is specified for any specific engine. In fact, I have no clue how people find out that this is the way things work, to the point that it is so categorically affirmed in books and blogs like the ones cited above.
So, I have a set of what I consider interesting questions. I would appreciate any answers providing insights, remarks or simply references pointing me in the right direction to understand the following:
Since EcmaScript does not specify that the JavaScript execution engine should work with an event loop, how come many implementations of JavaScript seem to work this way, not only in browsers, but also in Node.js?
Does that mean I could implement a new JavaScript engine which is EcmaScript-compatible and in fact provides true multithreading capabilities with features like synchronization locks, conditions, etc.?
Does this execution model using an event loop preclude me from taking advantage of multiple cores if I want to execute an intense CPU-bound task? I mean, I can surely divide the task into chunks (as explained in one of the articles), but this is still executed serially, not in parallel. So, how could a JavaScript engine take advantage of multiple cores to run my code?
Do you know of any other reputable sources where this behavior for any particular JavaScript engine implementation is formally specified?
How could the code be portable between libraries and engines if we cannot assume a few things about the execution environments?
This looks like a lot of questions, perhaps making this post too broad to be answered. If it gets closed I will try to ask them in separate threads. But they all revolve around the fact that I want to understand better why JavaScript and Node were designed with an event loop, and whether this is specified somewhere (besides the browsers' source code) that I could read to gain a deeper understanding of the designs and decisions taken here and, more importantly, to know exactly what the source of information is for people writing books and posts about it.
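The chunking idea mentioned in the question can be sketched as follows. This is only an illustration under stated assumptions: sumInChunks and its chunkSize parameter are hypothetical names, and setTimeout is used to yield because it exists in both browsers and Node.js. Each chunk runs to completion, then control returns to the event loop so other callbacks can run. The work is still serial; it merely stops blocking everything else.

```javascript
// Sums the integers 1..n in chunks, yielding to the event loop between chunks.
function sumInChunks(n, chunkSize = 100000) {
  return new Promise((resolve) => {
    let total = 0;
    let next = 1;

    function step() {
      const end = Math.min(next + chunkSize - 1, n);
      for (; next <= end; next++) {
        total += next;
      }
      if (next > n) {
        resolve(total);
      } else {
        setTimeout(step, 0); // let other queued callbacks run first
      }
    }

    step();
  });
}

sumInChunks(1000, 100).then((total) => console.log(total)); // 500500
```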

There are certain assumptions/weak references you make which lead you to this conclusion. Some of them are:
ECMAScript ECMA-XXX vs JavaScript vs JavaScriptEngine:
ECMAScript is a language specification published by ECMA International. JavaScript is the most widely used web language that conforms to ECMAScript. For the most part, ECMAScript and JavaScript are synonymous (though remember there is also ActionScript). A JavaScriptEngine is an implementation (interpreter) of the JavaScript language. It is a program in flesh and bones, built from the ground up, unlike ECMAScript, which only describes JavaScript's end goals and behaviour, and unlike JavaScript, the code that is written against the ECMAScript standard. You will find that an engine does more than just conform to the ECMAScript standard. Specification and engine sit at opposite ends of the specification/implementation spectrum. An example of this is ECMA-262/JavaScript/V8.
Event loop in browser vs Event loop in node.JS (JSEngine vs JSEnvironment):
This looks like a fundamental design choice done in all JavaScript execution engines in browsers and in node.
If you are using Node.js you may have used the core libraries fs/net/http. These use event emitters which are hooked into the event loop provided by libuv. This is an extension to the JavaScriptEngine V8, forming the Node.js platform. The event loop here involves objects like threads, sockets, files or abstract requests. But the event loop did not originate here. It was first used in browsers. A browser implements a DOM, which requires events for working with HTML elements. See the DOM specification and the one implemented for Mozilla. They use events and require an event loop built on top of the JSEngine for browser use. Chrome adds a DOM interface to the V8 engine it embeds.
Yes, you will feel this is common, because of the necessary DOM API in all browsers. Node developers brought this novel evented processing to the server with the help of libuv, which provides a non-blocking, asynchronous abstraction for the low-level operations required on a server. As pointed out already, not all server frameworks use an event loop. Take the example of Rhino, which literally uses Java classes for files, sockets (everything). If you actually use core Java IO, file operations are synchronous.
Now answering your questions in order:
explained in point 2 above
Yes, you can. Take a look at Rhino, there are many others. It may be possible in node but node is geared to be a high performance webserver and that might be against its zen.
As I said, the event loop sits on top of the JSEngine. It is a design pattern that works best with I/O. Multi-threaded design works better with high CPU loads. If you want to use multiple cores in Node.js, take a look at the cluster module. For browsers you have web workers.
That varies from engine to engine. And how it is embedded. Browsers will have DOM and therefore event loop. Servers can vary. Check their specifications.
For browser it is possible to make it portable between them to a good extent. No promises for server.

Event loop doesn't have anything to do with javascript itself, it's a part of environment, not js engine. Since javascript was designed primarily to manipulate user interface, it was used heavily with event loop. But event loop is a part of UI implementation, not just in javascript, but in any language.
Yes, you can. But it will not be just engine, more like environment/platform. I think (but not quite sure) that you can use threads and related stuff in Rhino.
Yes, it does. In node this is usually solved by spawning more processes and in browser you can use WebWorkers.
I can't imagine a better source than the specification. If something isn't there, it's just not a part of JavaScript (aka EcmaScript).

I have spent a good amount of time today trying to find the answers to my own questions, guided by some of the comments and other answers left for me here. I share my findings here in case others may consider them useful.
Event-Driven Design in JavaScript for Browsers
The decision to design JavaScript this way seems mostly related to the requirements of the DOM Event Architecture. In this specification we can find explicit requirements related to the implementation of event order and the event loop. The HTML5 specification goes even further: it defines the terms explicitly and states specific requirements for the event loop implementation.
This must certainly have driven the design of the JavaScript execution engines in browsers. In the article Timing and Synchronization in JavaScript, published by Opera, we can clearly see that these requirements are the driving force behind the design of the Opera browser. Also, in another article from Mozilla, named Concurrency Model and Event Loop, we can find a clear explanation of the same event-driven design concepts as implemented by Mozilla (although the document seems outdated).
The use of an event loop to deal with this kind of application is not new.
Handling user input is the most complex aspect of interactive
programming. An application may be sensitive to multiple input
devices, such as mouse and keyboard, and may multiplex these among
multiple input devices (e.g. different windows). Managing this
many-to-many mapping is usually in the province of User Interface
Management Systems (UIMS) toolkits. Since most UIMS are implemented
in sequential languages they must resort to various techniques to
emulate the necessary concurrency. Typically these toolkits use an
event-loop that monitors the stream of input events and maps the
events to call-back functions (or event handlers) provided by the
application programmer.
- John H. Reppy - Concurrent Programming in ML
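The pattern described in the quote can be sketched in a few lines. This is a toy model only, not how any browser or toolkit is actually implemented; MiniEventLoop and the event names are hypothetical. It shows the two properties that matter here: events are queued, and each handler runs to completion before the next event is taken from the queue.

```javascript
class MiniEventLoop {
  constructor() {
    this.queue = [];          // pending events, oldest first
    this.handlers = new Map(); // event type -> list of callbacks
  }

  // Register a callback (event handler) for an event type.
  on(type, handler) {
    if (!this.handlers.has(type)) this.handlers.set(type, []);
    this.handlers.get(type).push(handler);
  }

  // Enqueue an event; nothing runs until the loop picks it up.
  post(type, payload) {
    this.queue.push({ type, payload });
  }

  // Dispatch events one at a time until the queue is empty.
  run() {
    while (this.queue.length > 0) {
      const event = this.queue.shift();
      for (const handler of this.handlers.get(event.type) || []) {
        handler(event.payload);
      }
    }
  }
}

const loop = new MiniEventLoop();
const log = [];
loop.on('click', (where) => {
  log.push(`click at ${where}`);
  loop.post('key', 'Enter'); // queued behind events that are already waiting
});
loop.on('key', (key) => log.push(`key ${key}`));

loop.post('click', 'button');
loop.post('click', 'link');
loop.run();
console.log(log);
// [ 'click at button', 'click at link', 'key Enter', 'key Enter' ]
```

Note that the 'key' events posted from inside a handler do not interrupt anything; they wait their turn, which is why no locking is needed around the shared log array.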
Event loops are present in other famous UI toolkits like Java Swing and WinForms. In Java all UI work must be done within the EventDispatchThread, whereas in WinForms all UI work must be done within the thread that created the Window object. So, even though these languages support true multithreading, they still require all UI code to be run in a single thread of execution.
Douglas Crockford explains the history of the event loop in JavaScript in this great video called Loopage (worth watching).
Event-Driven Design in JavaScript for Node
Now, the decision of using an event-driven design for Node.js is a bit less evident. Crockford gives a good explanation in the video shared above. But also, in the book, The Past, Present and Future of JavaScript, its author Axel Rauschmayer says:
2009—Node.js, JavaScript on the server. Node.js lets you implement
servers that perform well under load. To do so, it uses event-driven
non-blocking I/O and JavaScript (via V8). Node.js creator Ryan Dahl
mentions the following reasons for choosing JavaScript:
“Because it’s bare and does not come with I/O APIs.” [Node.js can thus introduce its own non-blocking APIs.]
“Web developers use it already.” [JavaScript is a widely known language, especially in a web context.]
“DOM API is event-based. Everyone is already used to running without threads and on an event loop.” [Web developers are not scared of
callbacks.]
So, it looks like Ryan Dahl, creator of Node.js, took into account the current design of JavaScript in browsers to decide which should be the implementation of his non-blocking, event-driven solution for Node.js.
The latest implementation of Node.js seems to use a library called libuv, designed for the implementation of this kind of application. This library is a core part of the design of Node. We can find the definition of event loops in its documentation. Evidently this plays an important role in the current implementation of Node.js.
About Other EcmaScript Compatible Engines
The EcmaScript specification does not provide requirements about how the concurrency needs to be handled in JavaScript. Therefore, this is decided by the implementation of the language. Other models of concurrency could easily be used without making the implementation incompatible with the standard.
The two best examples I found were the new Nashorn JavaScript engine, created by Oracle for JDK 8, and the Rhino JavaScript engine, created by Mozilla. Both are EcmaScript compatible, and both allow the creation of Java classes. Nothing in these engines requires the use of event-driven programming to deal with concurrency. These engines have access to the Java class library, and since they run on top of the JVM they probably have access to the other concurrency models offered on this platform.
Consider the following example, taken from JavaScript: The Definitive Guide, which illustrates how to use Rhino JavaScript.
print(x); // Global print function prints to the console
version(170); // Tell Rhino we want JS 1.7 language features
load(filename,...); // Load and execute one or more files of JavaScript code
readFile(file); // Read a text file and return its contents as a string
readUrl(url); // Read the textual contents of a URL and return as a string
spawn(f); // Run f() or load and execute file f in a new thread
runCommand(cmd, // Run a system command with zero or more command-line args
[args...]);
quit() // Make Rhino exit
You can see a new thread can be spawned to run a JavaScript file in an independent thread of execution.
About Event-Driven Design, Multicores and True Concurrency
The best explanation I found on this subject comes from the book JavaScript The Definitive Guide. In this book, David Flanagan explains:
One of the fundamental features of client-side JavaScript is that it
is single-threaded: a browser will never run two event handlers at the
same time, and it will never trigger a timer while an event handler is
running, for example. Concurrent updates to application state or to
the document are simply not possible, and client-side programmers do
not need to think about, or even understand, concurrent programming. A
corollary is that client-side JavaScript functions must not run too
long: otherwise they will tie up the event loop and the web browser
will become unresponsive to user input. This is the reason that Ajax
APIs are always asynchronous and the reason that client-side
JavaScript cannot have a simple, synchronous load() or require()
function for loading JavaScript libraries.
The Web Workers specification very carefully relaxes the
single-threaded requirement for client-side JavaScript. The “workers”
it defines are effectively parallel threads of execution. Web workers
live in a self-contained execution environment, however, with no
access to the Window or Document object and can communicate with the
main thread only through asynchronous message passing. This means that
concurrent modifications of the DOM are still not possible, but it
also means that there is now a way to use synchronous APIs and write
long-running functions that do not stall the event loop and hang the
browser. Creating a new worker is not a heavyweight operation like
opening a new browser window, but workers are not flyweight threads
either, and it does not make sense to create new workers to perform
trivial operations. Complex web applications may find it useful to
create tens of workers, but it is unlikely that an application with
hundreds or thousands of workers would be practical.
What About Node.js True Parallelism?
Node.js is a fast-evolving technology, and perhaps that's why it is difficult to find opinions that are up to date. But basically, since it follows the same event-driven model as the browsers do, it is impossible to simply write a piece of code and expect that it will take advantage of the multiple cores in the server. Since Node.js is implemented using non-blocking technologies, we could assume that every time we do some form of I/O (e.g. read a file, send something through a socket, write to a database, etc.), under the hood the Node engine could be spawning multiple threads and maybe taking advantage of the cores, but our code would still be run serially.
These days, it looks like Node.js clustering is the solution to this problem. There are also some libraries like Node Worker that seem to implement the Web Worker concept in Node. These libraries basically let us spawn new independent processes within Node.js (although I have not experimented with this yet).
What About Portability?
It looks like there is no way, in terms of the concurrency models, to guarantee that all these libraries will play nicely in all environments.
In the realm of browsers they all seem to work similarly, and since Node.js also runs in an event loop, many things may still work, but there are no guarantees that this will work in other engines. I guess this is probably one of the disadvantages of EcmaScript compared to other more extensive specifications like those defining the Java Virtual Machine or the CLR.
Perhaps something will be standardized later. More concurrency ideas for the future of EcmaScript are being discussed today. See the EcmaScript Wiki: Strawman Proposals Communicating Event-Loop Concurrency and Distribution

Anything recent in Concurrency in the JS ecosystem ? Last was 4 months ago

Is there anything like Actors in JavaScript and its ecosystem (Node, CoffeeScript, Backbone, etc.)?
With the widespread use of AJAX it seems perfect for async message-passing.

If you are using Javascript in the browser, take a look at Web workers:
https://developer.mozilla.org/en-US/docs/Web/Guide/Performance/Using_web_workers
From the page:
Dedicated Web Workers provide a simple means for web content to run scripts in background threads
You communicate with the web workers using message passing.

Because JavaScript is traditionally single-threaded, it would be difficult to make Actors or a similar async message-passing technique without exposing some of the internals to the users of the library. If I understand correctly, Actors wait synchronously for messages, and it's just sending which is happening asynchronously. It's much more idiomatic in JavaScript to both read and write asynchronously and use callbacks to deal with the results of the communication.
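As a minimal sketch of what an actor-like construct could look like in this asynchronous style, the following uses a promise chain as the mailbox. It is not taken from any library; createActor and its behavior callback are hypothetical names. Messages are processed strictly one at a time, in the order they were sent, and the actor's state is never touched from outside.

```javascript
// behavior(state, message) returns the next state (or a promise of it).
function createActor(behavior, initialState) {
  let state = initialState;
  let mailbox = Promise.resolve(); // tail of the chain of pending messages

  return {
    // Sending never blocks; the returned promise settles once this
    // particular message has been processed.
    send(message) {
      const done = mailbox
        .then(() => behavior(state, message))
        .then((nextState) => {
          state = nextState;
          return nextState;
        });
      mailbox = done.catch(() => {}); // a failing message must not stall the mailbox
      return done;
    },
  };
}

const counter = createActor(
  (count, message) => (message === 'inc' ? count + 1 : count),
  0
);
counter.send('inc');
counter.send('inc');
const lastResult = counter.send('inc');
lastResult.then((count) => console.log(count)); // 3
```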
Of course, there are ways around this, so this other question and the presentation linked in the top answer and this list of node.js modules for dealing with control flow are decent starting points for how you might go about implementing your own.

Is there a way to do multi-threaded coding in NodeJS?

Based on my understanding, only I/O in NodeJS is non-blocking. If we do, for example, lots of heavy math operations, other users cannot access the server until they are done.
I was wondering if there is a non-blocking way to do heavy computation in NodeJS? Just curious.

If you have long-running calculations that you want to do with Node, then you will need to start up a separate process to handle those calculations. Generally this would be done by creating some number of separate worker processes and passing the calculations off to them. By doing this, you keep the main Node event loop unblocked.
On the implementation side of things, you have two main options.
The first is to manually spin off child processes using Node's child-process API functions. This one is nice because your calculations wouldn't even have to be JavaScript. The child process running the calculations could even be C or something.
Alternatively, the Web Worker specification has several implementations available through NPM if you search for 'worker'. I can't speak to how well these work since I haven't used any, but they are probably the way to go.
Update
I'd go with option one now. The current child process APIs support sending messages and objects between processes easily in a worker-like way, so there is little reason to use a separate worker module.
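A rough sketch of option one, using child_process.fork and its built-in message channel. To keep the example in a single file, the child script is written to a temporary directory first; in a real project it would simply be a separate file. The names childSource and sumInChildProcess are hypothetical, and the summation stands in for whatever heavy calculation is needed.

```javascript
const { fork } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Source of the child script: waits for a number n, sums 1..n, replies.
const childSource = `
process.on('message', (n) => {
  let total = 0;
  for (let i = 1; i <= n; i++) total += i;
  process.send({ n, total });
});
`;

// Runs the calculation in a separate Node process so the parent's
// event loop stays free while the child is busy.
function sumInChildProcess(n) {
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'fork-demo-'));
  const file = path.join(dir, 'child.js');
  fs.writeFileSync(file, childSource);

  return new Promise((resolve, reject) => {
    const child = fork(file);
    child.once('message', (result) => {
      child.kill(); // the open message channel would otherwise keep it alive
      resolve(result);
    });
    child.once('error', reject);
    child.send(n); // messages are serialized, not shared
  }).finally(() => fs.rmSync(dir, { recursive: true, force: true }));
}

sumInChildProcess(100).then((result) => console.log(result.total)); // 5050
```

For many small jobs it would usually make more sense to keep a few long-lived child processes around and reuse them, rather than forking one per calculation as this sketch does.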

You can use Hook.io to run a separate node process for your heavy computation and communicate between the two. Hook.io is particularly useful because it has auto-healing meshes meaning that if one of your hooks (processes) crashes it can be restarted automatically.

Use multiple NodeJS instances and communicate over sockets.

Use multiple node instances and communicate over node-zeromq, HTTP, plain TCP sockets, IPC (e.g. unix domain sockets), JSON-RPC or other means. Or use the web workers API as suggested above.
The multiple instances approach has its advantages and disadvantages. Disadvantages are that there is a burden of starting those instances and implementing own exchange protocols. Advantages are that scaling to many computers (as opposed to many cores/processors within a single computer) is possible.

I think this is way too late, but this is an awesome feature of Node.js that you have to know about.
So far, it is true that the only way to get multithreading is by spawning new processes.
But Node.js has a built-in messaging feature between forked Node processes.
http://nodejs.org/docs/latest/api/child_processes.html#child_process.fork
Great work from the devs: you can pass objects etc. as messages to your children and back.

You can use node cluster module.
https://nodejs.org/api/cluster.html

I would use JXCore - it's a polished nodejs based engine which runs your code but has several options including the multi threading you are searching for. Running this in production is a charm!
Project's source: https://github.com/jxcore/jxcore
Features include:
Support for core Node.JS features
Embeddable Interface
Publish to Mobile Platforms (Android, iOS ..)
Supports Multiple JavaScript Engines
Multi-threading Capabilities
Process Configuration & Monitor
In-memory File System
Application Packaging
Support for the latest JavaScript features (ES6, ASM.JS ...)
Support for Universal Windows Platform (uwp) api
