How is JavaScript translated to bytecode?

I can't find information on the web about how W3C languages compile to machine code. I know the browser must somehow bridge the gap between the web and the processor, but how does it work, and what are the steps until JavaScript is executed on the processor?
Links to scientific documents would be also greatly appreciated.

It's up to the implementation; the specification is the full description of the language and how it's supposed to work, and implementations are free to satisfy that specification in any way they like. Some implementations seem (from the outside) to run it purely as an interpreter in the old sense; others may or may not compile to bytecode; V8 (the JavaScript engine in Chrome, Chromium, Brave, Node.js, and others) used to compile to machine code (twice, for hotspots in the app), but now starts out parsing to bytecode and running it in an interpreter, only compiling hotspots as necessary. (There's also a V8 mode where it only interprets, which they're experimenting with for environments where compiling at runtime isn't an option, such as iOS, where non-Apple apps aren't allowed to allocate executable memory.)
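If you want to watch that pipeline yourself, V8's diagnostic flags can be passed through Node.js. A minimal sketch, assuming a current Node.js/V8 build that ships the --print-bytecode flag (these flags are engine internals, not stable API, so they may differ between versions):

    // hot.js: dump the bytecode V8's interpreter generates for one function.
    // Run with: node --print-bytecode --print-bytecode-filter=add hot.js
    function add(a, b) {
      return a + b;
    }
    // Call it in a loop so the engine has a reason to treat it as a hotspot.
    let sum = 0;
    for (let i = 0; i < 100000; i++) {
      sum = add(sum, 1);
    }
    console.log(sum);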
The V8 team periodically publishes descriptions of how they get the fantastic speed out of V8 that they do. You may find some of that on the V8 blog.
Naturally, you can also kick around the code of any of the open-source implementations. V8 and SpiderMonkey (Mozilla's engine) are the two major open-source ones I know.

This may help: http://www.ecma-international.org/publications/standards/Ecma-262.htm
There is no spec for how to translate into bytecode (that is up to the browser developers), but there are specs about how the language should behave.

For Firefox, there are some specifications on its bytecode:
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Bytecodes
https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Internals/Bytecode
For V8, JavaScript was compiled directly to native code (this describes the older full-codegen pipeline; as noted above, current V8 parses to bytecode first):
http://jayconrod.com/posts/51/a-tour-of-v8-full-compiler

JavaScript (as its name suggests) is a dynamic scripting language, meaning that its code is analysed and executed at runtime by the web browser's JavaScript engine.
It is up to the web browser how it wants to deal with JavaScript. Some may generate an intermediate language or bytecode; some may directly analyse and execute it.
Here are the steps of the simplest way to execute JavaScript (parsing and executing at runtime):
Parsing and Preprocessing (recursive descent or otherwise)
Analysis
Execution
Chrome's JavaScript engine compiles JavaScript to native, platform-specific machine code (for optimum performance). It also has a garbage-collection mechanism.
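To make those three steps concrete, here is a minimal sketch of a toy parse-and-execute loop written in JavaScript itself. It evaluates only "+"-separated integers and is in no way how a real engine is structured, but it shows the stages in miniature:

    // A toy "engine" for expressions like "1 + 2 + 39".
    function run(source) {
      // 1. Parsing: turn source text into a structure (here, a flat token list).
      const tokens = source.split("+").map(t => t.trim());
      // 2. Analysis: validate the program before executing anything.
      const numbers = tokens.map(t => {
        const n = Number(t);
        if (Number.isNaN(n)) throw new SyntaxError("not a number: " + t);
        return n;
      });
      // 3. Execution: walk the structure and compute the result.
      return numbers.reduce((acc, n) => acc + n, 0);
    }
    console.log(run("1 + 2 + 39")); // 42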

In addition to the useful, specific answers already given, the phrase 'adaptive optimisation' is probably worth looking up if performance is your main interest. JavaScript and its interpreters are just the latest instance of systems that need to convert something else to machine code at runtime, so there's plenty of wider reading. The bytecode forms of Pascal, Smalltalk, Java, etc. can easily enough be viewed as an intermediate form in the process of running a defined language on arbitrary hardware; Apple's SquirrelFish explicitly creates a bytecode and uses a JIT compiler on that, for example.


Are Runtime environments and Compilers/Interpreters the same?

I'm just getting started with Node.js and I have quite a bit of background in Python and C++. I got to know that Node.js is a runtime environment, but I'm having a rough time understanding what it actually does to code that makes it different from a compiler. It would be better if someone could explain how specifically runtime environments differ from typical compilers and interpreters.
Let's take it this way:
Node.js is a JavaScript runtime built on Chrome’s V8 JavaScript engine.
V8 is Google's JavaScript engine; the same engine is used by Google Chrome.
There are other JS engines, like SpiderMonkey used by Firefox and JavaScriptCore used by Safari; Edge was originally based on Chakra, but it has been rebuilt to use the V8 engine.
We must understand this relationship before moving on to how V8 works.
A JavaScript engine is independent of the browser in which it's hosted. This key feature enabled the rise of Node.js, and V8 was chosen as the engine that powers Node.js.
Since V8 is independent and open source, we can embed it into our own C++ programs, and Node.js itself is just a program written in C++. Node.js took V8 and enhanced it by adding features that a server needs.
JavaScript is generally considered an interpreted language, but modern JavaScript engines no longer just interpret JavaScript, they compile it.
Since you have a C++ background: C++ performs what is called ahead-of-time (AOT) compilation; the code is transformed into machine code during compilation, before execution.
JavaScript, on the other hand, is internally compiled by V8 with just-in-time (JIT) compilation, which happens during execution. While the code is being executed by the interpreter, the engine keeps track of functions that are called very often, marks them as hot functions, then compiles them to machine code and stores the result.
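You can observe this hot-function behavior from the outside. A small sketch, assuming a Node.js build that exposes V8's --trace-opt flag (the flag and its log format are engine internals and may change):

    // jit.js: run with: node --trace-opt jit.js
    function square(n) {
      return n * n;
    }
    let total = 0;
    for (let i = 0; i < 1e6; i++) {
      total += square(i); // called often enough to be marked hot
    }
    console.log(total);
    // With --trace-opt, V8 logs something like
    // "[marking <JSFunction square> for optimized recompilation, reason: hot and stable]"
    // when it decides to compile square() to machine code.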
A compiler is a program that converts code from one language to another. In Java, for example, we have the Java compiler, javac, which you run on your .java files to compile your code into platform-independent class files (which can be understood and executed by any JVM).
Since you are new to JavaScript, you will encounter transpilers (like Babel) that turn your next-generation JavaScript code into legacy JavaScript code that can be handled by all browsers (even old ones).
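For a feel of what such a transpiler does, compare modern syntax with the kind of output it produces. The legacy version below is typical of transpiler output, not Babel's exact emission:

    // Modern source (arrow function, template literal):
    const greet = name => `Hello, ${name}!`;
    // Roughly what a transpiler targeting old browsers would emit:
    var greetLegacy = function (name) {
      return "Hello, " + name + "!";
    };
    console.log(greet("web"));       // Hello, web!
    console.log(greetLegacy("web")); // Hello, web!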
A runtime is a vaguer concept. It can range from a set of functions for running compiled code on a specific operating system to the whole environment in which your program runs.
In the case of Node.js, it's the environment in which you can run a JavaScript program outside the browser. It took Chrome's V8 engine, which runs JavaScript in the Chrome browser, and made it available everywhere. That's how JavaScript moved from being a client-side programming language that runs only in the browser to a server-side programming language that can run on servers equipped with the Node.js runtime environment.
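A tiny example of what that enables, using Node.js's built-in http module (something browser JavaScript has no equivalent of, since pages can't listen on ports):

    // server.js: run with: node server.js
    const http = require("http");
    const server = http.createServer((req, res) => {
      res.writeHead(200, { "Content-Type": "text/plain" });
      res.end("Hello from JavaScript outside the browser\n");
    });
    server.listen(3000, () => console.log("Listening on http://localhost:3000"));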
A few simple points that might help:
C/C++ code is compiled by the C/C++ compiler into machine code and doesn't need a runtime environment to run (mostly; the run-time libraries aside).
Python code needs the Python interpreter to execute it. As you say you have a background in C++ and Python, you must be familiar with all this nitty-gritty.
JavaScript was meant to run in the browser, and mostly it does. However, some smart folks thought of running it outside the browser, so they created standalone JavaScript execution engines, and Node.js is just one of them. It simply runs JavaScript code outside the browser on its own. The execution is still interpretation of JavaScript code, so it is essentially an interpreter with a whole lot of support functions for managing runtime dependencies, package management, and so on.

What engine does WebAssembly use?

In Chrome, JavaScript runs on the V8 Engine, but what is the engine that runs WebAssembly code?
How is the browser suddenly able to give improved performance with WebAssembly? Is this WebAssembly engine always available in browser, or has it been added to browsers recently?
WebAssembly has only been supported by all major browsers (Chrome, Firefox, Safari, Edge) since November 2017, meaning WebAssembly is not supported by older browser versions. (See the blog post from Mozilla.)
To understand why WebAssembly is faster than JavaScript, there is an excellent series by Lin Clark (link).
The conclusion from the article, quoted:
WebAssembly is faster than JavaScript in many cases because:
fetching WebAssembly takes less time because it is more compact than JavaScript, even when compressed.
decoding WebAssembly takes less time than parsing JavaScript.
compiling and optimizing takes less time because WebAssembly is closer to machine code than JavaScript and already has gone through optimization on the server side.
reoptimizing doesn’t need to happen because WebAssembly has types and other information built in, so the JS engine doesn’t need to speculate when it optimizes the way it does with JavaScript.
executing often takes less time because there are fewer compiler tricks and gotchas that the developer needs to know to write consistently performant code, plus WebAssembly’s set of instructions are more ideal for machines.
garbage collection is not required since the memory is managed manually.
WebAssembly is a new web-standard instruction set that is executed by the browser. Within Chrome, WebAssembly runs within V8: https://v8project.blogspot.com/2016/03/experimental-support-for-webassembly.html?m=1
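From the page's point of view, that engine is reached through the JavaScript WebAssembly API. A minimal sketch; the eight bytes below are the smallest valid module (the "\0asm" magic number plus version 1), so it exports nothing but does exercise the engine's decoder and compiler:

    const bytes = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    WebAssembly.instantiate(bytes).then(({ module, instance }) => {
      // The same engine that runs the surrounding JS compiled this module.
      console.log("instantiated:", instance);
    });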
There are many runtimes for WebAssembly, such as the browser engines (V8, SpiderMonkey, JavaScriptCore).
Also, there are runtimes that implement WASI, such as:
Wasmtime, which uses the Cranelift code generator
WAVM, which uses LLVM
Wasm3, which is an interpreter
GraalVM's WebAssembly implementation, which runs on the JVM
Lunatic, which builds on Wasmtime and native bindings
and more.
WebAssembly can run on many runtimes, and so, in turn, it can operate on many machines and VMs.

Why is client-side web still using an interpreted language? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
To my knowledge, JavaScript is the only language that will execute on the client side after the HTML file has been retrieved from the server. As far as I know, JavaScript is by no means compiled in any way, and thus it is an interpreted language.
The web is becoming more and more popular, so much so that some say mobile and desktop applications will soon cease to exist.
We see new technologies like WebGL that make use of JS.
When I develop for WebGL I have to optimize so much more to achieve a reasonable performance benchmark than I would have to for PC or console.
So why do we still use an interpreted client side language? Is it a security issue, hardware (cross platform) issue or is it simply because it's difficult to introduce such a large change into the web architecture? I know web developers have headaches over even the smallest and simplest changes, like using CSS 3, simply because not everyone has their browser up to date.
Am I correct in thinking that interpreted languages are slower than compiled ones? Am I making sense, or are my assumptions completely incorrect? I am by no means a JS/web expert; please educate me.
To my knowledge JavaScript is the only language that will execute on the client side after the HTML file has been retrieved from the server.
This is wrong. At least in HTML 4.01, the <script> element has a type attribute that allows you to specify any language you want. The HTML 4.01 specification itself has examples in VBScript and Tcl.
Older versions of Internet Explorer, for example, support VBScript as an alternative scripting language to ECMAScript. There are versions of Chrome that support Dart as an alternative scripting language. There was a project that added support for PHP, Perl, Python, Ruby, Tcl and others to Firefox.
You can also use any language that has a compiler that outputs ECMAScript, and almost every language nowadays has that, e.g. there are compilers that can compile C, C++, Java, Scala, Ruby, C♯, F♯, and many many others to ECMAScript. There are also languages that were explicitly designed to be supersets of ECMAScript (e.g. TypeScript), semantically close to ECMAScript (e.g. CoffeeScript), or easily compilable to ECMAScript (e.g. Dart).
As far as I know JavaScript is by no means compiled in any way and thus it is an interpreted language.
This is wrong. All currently existing in-browser ECMAScript execution engines have at least one compiler. Many have several compilers. At least one has no interpreter whatsoever:
V8 is purely compiled; there is never any interpretation going on. It always compiles ECMAScript source code to binary native machine code. The original version had a single compiler; the current version has two. (This was accurate when written; V8 has since added a bytecode interpreter, Ignition, as described earlier in this document.)
SpiderMonkey always compiles ECMAScript to SpiderMonkey bytecode. This bytecode is then first interpreted a couple of times to collect statistics, and then the "hot" parts are compiled to binary native machine code by one of several compilers.
Nitro always compiles ECMAScript to Nitro bytecode. This bytecode is then first interpreted a couple of times to collect statistics, and then the "hot" parts are compiled to binary native machine code by another compiler.
Chakra always compiles ECMAScript to Chakra bytecode. This bytecode is then first interpreted a couple of times to collect statistics, and then the "hot" parts are compiled to binary native machine code by another compiler.
In fact, the terms "interpreted language" and "compiled language" don't even make sense. Languages aren't interpreted or compiled. Languages just are. Compilation and interpretation aren't traits of a language, they are traits of a compiler or interpreter (duh!) A language is a set of mathematical rules and restrictions. Nothing more. The concept of "language" and the concept of "interpretation" live on two completely different levels of abstraction. If English were a typed language, the term "interpreted language" would be a type error.
Every language can be implemented by an interpreter and every language can be implemented by a compiler. There are interpreters for C and C++, there are compilers for ECMAScript, PHP, Python, and Ruby.
Am I correct in thinking that interpreted languages are slower than compiled ones?
No. First off, like I explained above, there simply is no such thing as an interpreted or compiled language. There are only interpreted or compiled implementations of languages. Secondly, it doesn't make sense to say that a language is slower than another language. All you can say is that some particular program when executed by a particular version of a particular implementation in a particular environment on a particular machine runs faster than a different program executed by a different version of a different implementation in a different environment on a potentially different machine.
In general, the performance of typical programs on a particular implementation is mostly a function of how much money, resources, effort, brainpower, research, engineering, and manpower was expended on it, not of any particular trait of the language. The Oracle HotSpot JVM is fast not because it is a compiled implementation, nor because Java is statically typed (in fact, that's completely irrelevant, because the HotSpot JVM executes JVM bytecode; it doesn't even know anything about Java!), but because Oracle and Sun have poured massive amounts of resources into it. Ironically, Java was actually pretty slow until Sun bought a Smalltalk(!!!) company and their VM technology. Yes, that's right: the HotSpot JVM is actually just a slightly modified Smalltalk VM, i.e. a VM for a dynamic language.
In fact, the VM that HotSpot is based on was open-sourced and is also the VM that V8 is based on (which is not surprising, since V8 was developed by some of the same people who developed the HotSpot JVM and the Smalltalk VM it was based upon).
Note that there are two attempts at getting a new language accepted by the browser vendors:
asm.js is a language that is a syntactic and semantic subset of ECMAScript (meaning any asm.js program is also a semantically identical ECMAScript program, and a browser that doesn't know anything about asm.js will simply think it is ECMAScript and execute it as ECMAScript and it will just work) with certain restrictions that make it a good target for compilers (e.g. it is relatively easy to create a compiler that compiles C to asm.js) and at the same time a good source for native code generation (i.e. its semantics are relatively close to the semantics of modern mainstream general-purpose CPUs).
Likewise, WebAssembly is a (binary) language that has basically the same goals as asm.js, except there is no requirement that it be a proper subset of ECMAScript. It is its own independent language, inspired by asm.js, LLVM bitcode, ANDF, CIL, JVM bytecode, Pascal P-Code, and others.
Asm.js already has some browser support, and because it is just a subset of ECMAScript it even works in browsers without support … just more slowly. WebAssembly is gaining traction, although it is still in the experimentation and prototyping phase.
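To see why asm.js "just works" as ECMAScript, here is a minimal sketch of the style (simplified; real asm.js modules have more required structure):

    function AsmModule(stdlib) {
      "use asm"; // engines that recognize the pragma may compile this ahead of time
      function add(a, b) {
        a = a | 0;          // parameter type annotation: int
        b = b | 0;          // parameter type annotation: int
        return (a + b) | 0; // return type annotation: int
      }
      return { add: add };
    }
    var asm = AsmModule(globalThis);
    console.log(asm.add(20, 22)); // 42, with or without asm.js support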
JavaScript is loaded as source code and thus needs to be interpreted.
You couldn't ship much lower-level code, as it would no longer be ubiquitously compatible across devices.
One of the perks of JavaScript is that your code will run on virtually any device's web browser.
If you were to compile this code, it would become architecture / device specific.
Naturally, interpreted languages will run slower than compiled languages, as compiled code can be run blindly by the CPU, whereas interpreted code needs to be checked and run line by line; you can, however, apply optimizations to limit this.
To my knowledge JavaScript is the only language that will execute on the client side after the HTML file has been retrieved from the server.
Not always true; we still have other options, like ActionScript, which runs in Flash Player, or VBScript. But they are out of fashion.
When I develop for WebGL I have to optimize so much more to achieve a reasonable performance benchmark than I would have to for PC or console.
Yes, I think we can do really nice graphics with WebGL, but it still runs in a browser sandbox: WebGL behaves the same way JS does with respect to file access and other core capabilities. Think of it this way: if you develop a massive game like Watch Dogs or GTA, do you think the browser can handle those demands? A PC or console can.
So why do we still use an interpreted client side language? Is it a security issue, hardware (cross platform) issue or is it simply because it's difficult to introduce such a large change into the web architecture?
I guess both. Compiled code runs directly on the machine, so what role would the browser play? We would lose the security layer.
Also, JavaScript has a large community. If we needed to totally change the web architecture, we would need a totally different client, and that client would have to replace all current browsers, which is not really possible. But things change day by day; ES6 is a good start.
I know web developers have headaches over even the smallest and simplest changes, like using CSS 3, simply because not everyone has their browser up to date.
Yes, 100% true. But there is no way around this problem; as developers we must maintain compatibility.
Am I correct in thinking that interpreted languages are slower than compiled ones?
Yes, compiled code will be faster. But recent browsers have good JS engines, like V8, which Chrome uses, and they are really faster than we would predict. The basic thing is that JS was introduced as a lightweight architecture: in those days the server sent the HTML, and JS modified the DOM if required; these days the server only sends the data, and JS creates the DOM, so more of the load is on the JS side. This works fine with the help of good JS engines.
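What "the server only sends the data and JS creates the DOM" looks like in practice, as a small sketch (the data array stands in for a JSON response):

    const data = ["one", "two", "three"]; // imagine this arrived from the server via fetch()
    const list = document.createElement("ul");
    for (const item of data) {
      const li = document.createElement("li");
      li.textContent = item; // the markup is built entirely on the client
      list.appendChild(li);
    }
    document.body.appendChild(list);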

Does Chrome NaCl have anything to do with V8?

Does one enable the other, or does one affect the other?
It seems V8 lets native C++ code host and interact with JavaScript, and NaCl lets you run native code in the browser.
Sorry for the naive question. I'm behind on recent developments in JavaScript and was surprised that modern browsers actually JIT-compile it all!
Short answer: no.
Longer answer,
Chrome ships with the V8 JS engine and uses it to execute JavaScript embedded in web pages. V8 in Chrome cannot be extended to access C++ or vice versa.
NaCl is a toolchain and runtime environment allowing you to compile your existing C++ code into a safe executable and then execute it securely from a web page.
V8 can be used on its own by embedding it inside your own C++ application, extending it as you see fit.
HTH,
John
http://research.google.com/pubs/archive/37204.pdf
It is possible to run (a modified version of) V8 inside NaCl. The code sequences emitted by the JIT must conform to the sandbox safety rules.
It is unlikely to be possible to do the converse.
:-)
-bsy

Javascript Engines Advantages

I am confused about JavaScript engines right now. I know that V8 was a big deal because it compiled JavaScript to native code.
Then I started reading about Mozilla SpiderMonkey, which from what I understand is written in C and can compile JavaScript. So how is this different from V8 and if this is true, why does Firefox not do this?
Finally, does Rhino literally compile the JavaScript to Java byte code so you would get all the speed advantages of Java? If not, why do people not run V8 when writing scripts on their desktops?
There are various approaches to JavaScript execution, even when doing JIT. V8 and Nitro (formerly known as SquirrelFish Extreme) choose to do a whole-method JIT, meaning that they compile all JavaScript code down to native instructions when they encounter script, and then simply execute that as if it was compiled C code. SpiderMonkey uses a "tracing" JIT instead, which first compiles the script to bytecode and interprets it, but monitors the execution, looking for "hot spots" such as loops. When it detects one, it then compiles just that hot path to machine code and executes that in the future.
Both approaches have upsides and downsides. Whole-method JIT ensures that all JavaScript that is executed will be compiled and run as machine code and not interpreted, which in general should be faster. However, depending on the implementation it may mean that the engine spends time compiling code that will never be executed, or may only be executed once, and is not performance critical. In addition, this compiled code must be stored in memory, so this can lead to higher memory usage.
The tracing JIT as implemented in SpiderMonkey can produce extremely specialized code compared to a whole-method JIT, since it has already executed the code and can speculate on the types of variables (such as treating the index variable in a for loop as a native integer), where a whole-method JIT would have to treat the variable as an object because JavaScript is untyped and the type could change (SpiderMonkey will simply "fall off" trace if the assumption fails, and return to interpreting bytecode). However, SpiderMonkey's tracing JIT currently does not work efficiently on code with many branches, as the traces are optimized for single paths of execution. In addition, there's some overhead involved in monitoring execution before deciding to compile a trace, and then switching execution to that trace. Also, if the tracer makes an assumption that is later violated (such as a variable changing type), the cost of falling off trace and switching back to interpreting is likely to be higher than with a whole-method JIT.
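The kind of type speculation described above is easy to picture from the JS side. A sketch of the idea (what any particular engine actually does will vary):

    function sum(arr) {
      let total = 0;
      for (let i = 0; i < arr.length; i++) {
        total += arr[i]; // a JIT may speculate that arr[i] is always a small integer
      }
      return total;
    }
    sum([1, 2, 3, 4]);   // integer-only input: the specialized fast path applies
    sum([1.5, "2", {}]); // mixed types: the assumption fails, and the engine falls
                         // off trace / deoptimizes back to generic code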
V8 is the fastest, because it compiles all JS to machine code.
SpiderMonkey (what FF uses) is fast too, but compiles to an intermediate bytecode, not machine code. That's the major difference from V8. EDIT: Newer Firefox releases come with a newer variant of SpiderMonkey, TraceMonkey. TraceMonkey does JIT compilation of critical parts, and maybe other smart optimizations.
Rhino compiles Javascript into Java classes, thus allowing you to basically write "Java" applications in Javascript. Rhino is also used as a way to interpret JS in the backend and manipulate it, and have complete code understanding, such as reflection. This is used for example by the YUI Compressor.
The reason why Rhino is used instead of V8 all over the place is probably that V8 is relatively new, so a lot of projects were already using Rhino/SpiderMonkey as their JS engine, for example Yahoo widgets. (I assume that's what you're referring to with "scripts on their desktops".)
edit: This link might also give some insight into why SpiderMonkey is so widely adopted.
Which Javascript engine would you embed in your application?
If you want to see how the various in-browser Javascript engines stack up, install Safari 4 (yes it runs on Windows now too!), Chrome V8, Firefox 3.5, and IE 8 (if you are on windows) and run the benchmark:
http://www2.webkit.org/perf/sunspider-0.9/sunspider.html
I believe, as Pointy said above, the new Firefox 3.5 uses TraceMonkey, which also compiles to intermediate code on the fly using some form of JIT. So it should compare to V8 somewhat favorably. At least it won't be 10x slower than V8 like Firefox 3's SpiderMonkey (without JIT) was.
For me, Safari 4.0.3 was 2.5x faster than TraceMonkey in Firefox 3.5.3 on Windows XP. IE8 was much, much slower. I don't have Chrome installed at the moment.
Don't know about Rhino compiling to Java bytecode. If it's still interpreting the dynamic features of JavaScript, such as being able to add attributes to object instances at runtime (e.g. obj.someNewAttribute="someValue", which is allowed in JavaScript), then I'm not so sure it's entirely "compiled" to bytecode, and you might not get any better performance other than not having to compile from JavaScript source text every time your JavaScript runs. Remember that JavaScript allows very dynamic syntax, such as eval("x=10;y=20;z=x*y");, which means you can build up strings of code that get compiled at runtime. That's why I'd think Rhino would be mixed-mode interpreted/compiled even if you did compile to JVM bytecode.
The JVM is still an interpreter, albeit a very good one with JIT support. So I like to think of Rhino-on-JVM as two interpreter layers (interpreter-on-interpreter), or interpreter^2, whereas most other JavaScript engines are written in C and as such should perform more like interpreter^1. Each interpreter layer can add 5-10x performance degradation compared to C or C++ (cf. Perl, Python, or Ruby), but with JIT the performance hit can be much lower, on the order of 2-4x. And the JVM has one of the most robust and mature JIT engines ever.
So your mileage will definitely vary and you probably would benefit from doing some serious benchmarks if you want a real answer for your intended application on your own hardware & OS.
Rhino can't be too awfully slow, since I know lots of folks are using it. I think its main appeal is not its speed, but that it is an easy-to-code, lightweight, embeddable interpreter with hooks into Java libraries, which makes it perfect for scripting/configuration/extensibility of your software project. Some text editors, like UltraEdit, even embed JavaScript as an alternative macro scripting engine. Every programmer seems to be able to stumble through JavaScript pretty easily, so it's easy to pick up as well.
One advantage to Rhino is that it runs pretty much anywhere the JVM runs. In my experience, trying to get stand-alone TraceMonkey or SpiderMonkey to build & run from command line can be a bit painful on systems like Windows. And embedding in your own application can be even more time consuming. But the payback to having an embeddable language would be worth it for a big project, as compared to having to "roll your own" mini scripting solution if that's what you're looking to do.
Scripting with Rhino is really easy if you have Java and the Rhino jar; you just write your JavaScript and run it from the command line. I use it all the time for simple tasks.
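For instance, a script like this runs unchanged under the Rhino shell and can reach straight into Java libraries (the jar name in the comment is an assumption; use whatever Rhino jar you have):

    // listdir.js: run with: java -jar rhino.jar listdir.js
    var dir = new java.io.File(".");  // Rhino bridges directly to java.* classes
    print("Files in " + dir.getAbsolutePath() + ":");
    var names = dir.list();
    for (var i = 0; i < names.length; i++) {
      print("  " + names[i]);
    }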
To respond to the question of native code vs. bytecode:
Native code is faster, and for Google a strategic choice, because they have plans for JS; one of them, at least, is ChromeOS.
A good video on this question, an interview with Lars Bak, the man behind V8, is posted on Channel9 and can be found here.
2022 Update
Both V8 and SpiderMonkey support multi-pass as well as single-pass compilation.
Java's Rhino got deprecated and replaced by a new concept called the Truffle framework. It is part of the GraalVM stack, which is in fact a JVMCI added to the JVM, where CI means compiler interface. That allows the Truffle language implementer framework to implement any language and translate it to Java bytecode, in multi-pass and single-pass modes. The ECMAScript implementation is called GraalJS, and it comes in two flavors:
GraalJS, the 1:1 Rhino replacement including a compatibility mode for incremental adoption, and graal-node, a Node.js fork where V8 is replaced with the GraalJS engine.
As said, Truffle is a polyglot language implementer framework, so this works with any GraalVM-supported language.
Hope that helps and makes some sense. Since the jar versions work best when the whole stack has aligned versions, you should always install the complete stack:
https://github.com/graalvm/graalvm-ce-builds/releases
Download and install that, then use the so-called gu command to add graaljs.
Follow the instructions here: https://github.com/oracle/graaljs
Yes, GraalVM can in general compile single binary executable files from all polyglot languages; this is called AOT compilation. It has faster boot times but less throughput. It can also do JIT, and it can do a mix of both.
