Compiling AssemblyScript to Wasm, performance

I saw that there already is a compiler for compiling TypeScript to WebAssembly (Wasm), here is the link.
I also heard from multiple sources that compiling JS to Wasm wouldn't be feasible, because of JavaScript's dynamic nature and dynamic types.
However, TypeScript does offer typed variables, which JavaScript lacks. And in the future Wasm might even be able to interact with the DOM and other web APIs.
Question:
Would writing applications in TypeScript and compiling them to Wasm offer any performance benefits compared to writing a web application in JavaScript?

The realistic answer is: no. There are a few common misunderstandings about TypeScript. One is that it is less dynamic than JavaScript. That is not true; it is in fact every bit as dynamic as JS, because it encompasses all of JavaScript's semantics (including all the crazy corner cases), and its type system is far too weak and unsound to provide guarantees that an ordinary offline compiler could use for static optimisations. At best, the types can be used as hints that a dynamic VM could try to optimise for first, knowing full well that they might turn out to be incorrect.
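To make the unsoundness concrete, here is a minimal sketch (the function and variable names are made up for illustration). It passes the type checker, yet the value annotated as `number` turns out to be a string at run time, which is why a compiler cannot treat the annotations as guarantees.

```typescript
// Hypothetical example: the annotations promise numbers,
// but nothing enforces that promise at run time.
function add(a: number, b: number): number {
  return a + b;
}

// `any` switches off checking, so a string slips through.
const fromOutside: any = "1";
const sum: number = add(fromOutside, 2);

// The static type says `number`; the run-time value is a string,
// because `+` falls back to string concatenation.
console.log(typeof sum, String(sum)); // prints "string 12"
```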
(Also, I'm not aware of a TypeScript-to-Wasm compiler. You're probably thinking of AssemblyScript, but while that reuses TypeScript's syntax, its semantics is very different.)

You referenced AssemblyScript in your question. AssemblyScript is an extremely strict subset of TypeScript. Don't confuse it with TypeScript itself. The big difference is that TypeScript encompasses all of the dynamic attributes that we all know and love (hate) in JavaScript. AssemblyScript, on the other hand, does not.
Two examples of the big differences: in AssemblyScript you cannot have closures, and you cannot have union types.
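For illustration, here is a short sketch of the two constructs just mentioned, written as ordinary TypeScript with made-up names. It is valid TypeScript; going by the answer above, AssemblyScript would not accept either part.

```typescript
// 1. A closure: the returned function captures the local `count`.
function makeCounter(): () => number {
  let count = 0;
  return () => {
    count += 1;
    return count;
  };
}

// 2. A union type: `value` may be a number or a string.
function describe(value: number | string): string {
  return typeof value === "number" ? `number ${value}` : `string ${value}`;
}

const next = makeCounter();
console.log(next(), next());        // prints "1 2"
console.log(describe(42));          // prints "number 42"
console.log(describe("forty-two")); // prints "string forty-two"
```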

Related

Why does CTRL-click on JavaScript Math function in VSCode show internal TypeScript code?

I wanted to investigate the internals of the Math object in JavaScript as running in Node. I assumed I would get information on what C++ methods the V8 engine called, etc. but I get information showing TypeScript interfaces for the Math object.
Can anyone explain why TypeScript interfaces are being shown when one displays information on V8's implementation of JavaScript's Math object?

The underlying detailed engine implementations of functions wouldn't make sense to the vast majority of script-writers, and they are not standardized: there are different environments, and VSCode wouldn't assume that the file being examined would always be running in V8.
In comparison, the TypeScript documentation for these sorts of built-in functions is standardized, and is relatively simple to understand, even for those who may not have seen it before.

In addition to what @CertainPerformance said:
How engines implement built-in functions is completely up to them: it could be a JavaScript implementation (in which case you could certainly argue that it could be instructive for IDEs like VSCode to show the source), it could be C++ (or whichever language the engine is written in), it could be hand-written assembly; in case of current versions of V8, Math.min is implemented using V8's own DSL ("domain specific language") called "Torque", which despite looking a bit like TypeScript is a totally different beast under the hood (in short: it gets compiled to C++ source, which in turn is compiled to an executable that produces assembly code which is then embedded into the V8 binary).
Compiled (C/C++/Rust/...) binaries in general (in Release builds) don't include information about which functions they consist of (much less those functions' source code), so there's no way for IDEs to get at that information even if they wanted to. (This is why "Debug" builds are a thing.)
If you still want to study V8's implementation, you can find many of the Math builtins here.
So, showing the TypeScript definitions of the signatures of the Math builtins is pretty much the best that a JavaScript IDE can do.
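To give a feel for what such definitions look like, here is a trimmed-down sketch. The interface name `MathSubset` is hypothetical and written only for this example; the declarations that ship with TypeScript are much longer.

```typescript
// Hypothetical, simplified interface in the style of the
// declarations an IDE shows for built-ins.
interface MathSubset {
  /** The ratio of a circle's circumference to its diameter. */
  readonly PI: number;
  /** Returns the smallest of the supplied numbers. */
  min(...values: number[]): number;
  /** Returns the absolute value of a number. */
  abs(x: number): number;
}

// The built-in Math object structurally satisfies the interface,
// whatever language the engine used to implement it.
const math: MathSubset = Math;

console.log(math.min(3, 1, 2)); // prints 1
console.log(math.abs(-5));      // prints 5
```

The interface only describes signatures; it says nothing about how (or in which language) the engine implements them.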

What do those javascript front-end build tools mean when they say "compile" my js codes?

I saw those JavaScript front-end build tools, e.g. webpack, using the word "compile" from time to time. I am not sure what compiling JavaScript code means exactly; it is at least not like compiling C/C++ code.
I think I understand the "build" process in general: bundling all the JS code into one big file, minifying/uglifying the code, and using Babel to transform ES6 syntax (transpiling). But what does compiling mean here? How does it fit into the whole build process, or is it just another name for the whole build process?
Currently, I think it may be just another name for using Babel to transform ES6 syntax.
PS. After reading the SO question "Is Babel a compiler or transpiler?", I believe my question is not the same, because it is not just related to Babel. For example, webpack also uses the term compiler (https://webpack.js.org/api/compiler/), and I do not understand its meaning there!
Browserify uses compiler as well e.g, https://github.com/robrichard/browserify-compile-templates "Compiles underscore templates from HTML script tags into CommonJS in a browserify transform"

It's better to describe the process as "transpilation."
Javascript always executes in a specific environment: in Chrome and Electron, it's the V8 engine; in Firefox, it's SpiderMonkey; etc. Each of these engines supports a specific set of language features and not others. As an example, some engines only support var and do not support const or let. Some support async/await, and others only support Promise.
But web developers know about these other features, and they want to use them, even when they're writing for an engine that doesn't support those features. Why? Most new language features are designed with the goal of making it possible to express complicated concepts in simpler and cleaner ways. This is extremely important, because the number one job of code is to make its purpose clear.
Thus, most language features are essentially syntactic sugar for existing functionality. In those cases, it's always possible to express a routine using both new and old syntax. This is a logical necessity.
A transpiler like Babel can read a script written using advanced syntax, and then re-express the script using a restricted set of language features. Relying on an intermediate representation called an abstract syntax tree, it can produce code that is guaranteed to be functionally equivalent, even though it does the work using very different, more widely-supported control structures.
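As a rough illustration of "re-expressing" a script, the sketch below puts a function written with newer syntax next to a hand-written version restricted to older constructs. The second function is not actual Babel output, both names are hypothetical, and the type annotations are TypeScript-only and would be stripped from the emitted JavaScript.

```typescript
// Newer syntax: const, arrow function, default parameter, template literal.
const greetModern = (name = "world"): string => `Hello, ${name}!`;

// Hand-written equivalent restricted to older constructs
// (illustrative only; not actual Babel output).
var greetLegacy = function (name?: string): string {
  if (name === undefined) {
    name = "world";
  }
  return "Hello, " + name + "!";
};

console.log(greetModern("Ada") === greetLegacy("Ada")); // prints true
console.log(greetModern() === greetLegacy());           // prints true
```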
Perhaps we web developers have gotten lazy in our terminology. When we talk of "compiling" javascript, we aren't usually talking about converting the script to something like bytecode. We're talking about transpilation.
Other kinds of build tasks are also becoming quite common. These days, the front-end is obsessed with a million flavors of "templating," because it's extremely tedious and confusing to describe DOM changes using pure JavaScript, and because application complexity is increasing very rapidly. Some frameworks require you to convert source code to other intermediary forms that are later consumed by the web application at runtime. Others permit devs to describe UI using invented syntaxes that no browser is even attempting to support natively. Which tasks are needed varies by application, depending on which frameworks are being used, the particulars of the application architecture, and the contours of the deployment environment, and that's just a start.
At its foundation, a web page is built using HTML, CSS, and javascript. That much hasn't changed. But today, most serious applications are built almost entirely in javascript (or something very much like it) and sass. Building the application is the process of applying a set of transformations to the source code to yield the final artifacts in those three bedrock languages.
We lump all that stuff under the term "compile."

You pretty much hit the nail on the head. When the compile (or, more appropriately, transpilation) operation happens on a JavaScript project, it can mean a number of things. As you mentioned, these could range from minification and applying polyfills or shims to the literal act of "compiling" the scripts into a single bundle file for platform/browser consumption.
Transpilation, when using supersets of the JavaScript language such as TypeScript, ActionScript, or UnityScript, describes the process of converting the source x-script back into native JavaScript, which can in turn be interpreted by a browser (since the browser doesn't recognize the superset languages).
However, you are absolutely correct. We aren't compiling our JavaScript into binary, but the term gets thrown around a lot, which can lead to confusion. All that said, we are closing in on the age of adoption of WebAssembly and asm.js, which promise to usher in the age of bytecode running in the browser and to bring about some interesting possibilities, but alas... That's a story for another day ;)

You're right when you say these front-end JavaScript tools don't use the word compile in the same sense you're used to from build tools for languages like C/C++. C/C++ compilers turn source code into machine code.
These JavaScript build tools, like Webpack, use the word compile in a sense that's more metaphorical than conventional.
When web build tools use the word compile, they mean that they are transpiling, minifying (a.k.a. uglifying), and bundling the source files so they are better optimized for client browsers and network requests. (Smaller file sizes, better browser compatibility, fewer HTTP requests thanks to bundled assets, etc.)

How does assembly (asm.js) work in the browser?

Asm.js comes from a new category of JavaScript apps: C/C++ applications that’ve been compiled into JavaScript. It’s a subset of JavaScript that’s been spawned by Mozilla’s Emscripten project.
But how does it work, and why would I use it?

Why compile to JavaScript?
JavaScript is the only language which works in all web browsers. Although only JavaScript will run in the browser, you can still write in other languages and compile to JavaScript, thereby allowing that code to run in the browser as well. This is made possible by a technology known as Emscripten.
Emscripten is an LLVM based project that compiles C and C++ into highly performant JavaScript in the asm.js format. In short: near native speeds, using C and C++, inside of the browser. Even better, emscripten converts OpenGL, a desktop graphics API, into WebGL, which is the web variant of that API.
How does asm.js fit into the picture?
Asm.js, short for Assembly JavaScript, is a subset of JavaScript. An asm.js program will behave identically whether it is run in an existing JavaScript engine or an ahead-of-time (AOT) compiling engine that recognizes and optimizes asm.js—except for speed, of course!
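The subset conveys static types through ordinary JavaScript coercions: `x | 0` marks a value as a 32-bit integer and `+x` marks it as a double. The sketch below uses those idioms in two hypothetical functions. It is not a validated asm.js module (there is no `"use asm"` module wrapper, and the type annotations are TypeScript-only), but it shows why such code behaves the same in any engine: the coercions are plain JavaScript operators.

```typescript
// asm.js-style coercions written as ordinary code.
function addInt(a: number, b: number): number {
  a = a | 0;          // treat `a` as a 32-bit integer
  b = b | 0;          // treat `b` as a 32-bit integer
  return (a + b) | 0; // the result is truncated to 32 bits as well
}

function half(x: number): number {
  x = +x;             // treat `x` as a double
  return +(x / 2);
}

console.log(addInt(2, 3));          // prints 5
console.log(addInt(2147483647, 1)); // prints -2147483648 (32-bit wraparound)
console.log(half(5));               // prints 2.5
```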
In terms of speed, it’s difficult to offer a precise measurement of how it compares to native code, but preliminary benchmarks of C programs compiled to asm.js are usually within a factor of 2 slowdown over native compilation with clang, the compiler frontend for C, C++, and Obj-C programming languages. It’s important to note that this is a “best” case for single-threaded programs. More on this limitation of the JavaScript language below.
On the backend, Clang uses LLVM, which is a library for constructing, optimizing and producing intermediate and/or binary machine code (those 0s and 1s again). LLVM can be used as a compiler framework, where you provide the “front end” (parser and lexer such as Clang) and the “back end” (code that converts LLVM representation to actual machine code).
Further reading: Alon Zakai of Mozilla has a fantastic slide deck which goes into further detail about how this all works.
So how cool is asm.js? Well, it has its own Twitter account, @asmjs. While the asm.js site is a bit sparse, it does cover the spec, in addition to having a thorough FAQ. Even better, Mozilla coordinated the Humble Mozilla Bundle in 2014, which allowed you to buy a bunch of games that took advantage of asm.js.
Why not just turn your JavaScript code into asm.js?
JavaScript can’t really be compiled to asm.js and offer much of a benefit, because of its dynamic nature. It’s the same problem as when trying to compile it to C or even to native code: a VM would have to ship alongside it to take care of those non-static aspects. You could write asm.js by hand, however.
If one could already translate standard JavaScript in a fully static manner, there would be no need for asm.js. Asm.js exists for the promise that JavaScript will get faster without any effort from the developer. It would be very difficult for the JIT to understand a dynamic language as well as a static compiler understands a static one.
To better understand this, it is important to comprehend why asm.js offers a performance benefit at all; or why statically-typed languages perform better than dynamically-typed ones. One reason is “run-time type checking takes time,” and a more thought out answer would include the enhanced feasibility of optimizing statically-typed code. A final perk of going from a statically typed language such as C is the fact that the compiler knows the type of each object when it is being compiled.
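The cost of run-time type checking can be pictured with a conceptual sketch. The hypothetical `dynamicAdd` below spells out, for numbers and strings only, the kind of checks that sit behind `a + b` when operand types are unknown until run time; it is a simplification for illustration, not how any particular engine is implemented.

```typescript
// Conceptual sketch: checks needed when types are only known at run time.
function dynamicAdd(a: unknown, b: unknown): number | string {
  if (typeof a === "number" && typeof b === "number") {
    return a + b; // numeric addition
  }
  if (typeof a === "string" || typeof b === "string") {
    return String(a) + String(b); // string concatenation
  }
  throw new TypeError("operand types not covered by this sketch");
}

// If the operand types were guaranteed ahead of time (as in C),
// no checks would be needed and the body is a plain addition.
function staticAdd(a: number, b: number): number {
  return a + b;
}

console.log(dynamicAdd(1, 2));   // prints 3
console.log(dynamicAdd("1", 2)); // prints 12
console.log(staticAdd(1, 2));    // prints 3
```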
Asm.js is a restricted subset of JS that can be easily translated to bytecode. To get this advantage, you would first have to break all the advanced features of JS down to that subset, which is complicated. But JavaScript engines are optimized and designed to translate all those advanced features directly into bytecode, so an intermediate step like asm.js doesn’t offer much of an advantage.
I go into greater detail and pictures in this post.

Managing a large codebase in Node.js

I am about to embark on a Node.js project with a fairly large codebase. I would prefer to keep my code separate from node_modules.
I would ideally like to work with namespaces and folders as I think it would be a nice way to manage things. However, assuming I have read correctly, this would mean that I would have to import any files/"classes" I require using the path to the file which would be quite messy and hard to manage.
What is the de facto method for managing a large amount of code for a Node.js project?

My advice is to use a statically typed language, since such languages provide automatic checks and tooling that help with managing large codebases. That could for example be Dart or TypeScript, both of which can produce JavaScript (CoffeeScript can produce JavaScript too, but it is not statically typed).
This post explains well the benefits (especially compared to JS):
http://arstechnica.com/information-technology/2014/06/why-do-dynamic-languages-make-it-difficult-to-maintain-large-codebases/
Some of the main problems stated in the linked article:
There is no modularization system; there are no classes, interfaces, or even namespaces. These elements are in other languages to help organize large codebases.
The inheritance system (prototype inheritance) is both weak and poorly understood. It is by no means obvious how to correctly build prototypes for deep hierarchies (a captain is a kind of pirate, a pirate is a kind of person, a person is a kind of thing...) in out-of-the-box JavaScript.
There is no encapsulation whatsoever; every property of every object is yielded up to the for-in construct, and is modifiable at will by any part of the program.
There is no way to annotate any restriction on storage; any variable may hold any value.
If you started with JS and don't want to abandon your current code base, you could switch to TypeScript.
Shortly before my JS project reached 5000 lines of code (in about 15 files), I moved it to TypeScript. It took me about 4 hours to get it running again.
This post gives some insights from someone moving Node.js to a TypeScript environment:
http://tech.kinja.com/my-experience-with-typescript-710191610
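As a rough sketch of how TypeScript addresses the quoted complaints (classes, a deep hierarchy, encapsulation, and storage annotations), consider the following. All class and property names are hypothetical and simply echo the captain/pirate/person example from the article.

```typescript
class Person {
  // `private` hides the field from other parts of the program
  // (enforced by the compiler, not at run time).
  private readonly name: string;

  constructor(name: string) {
    this.name = name;
  }

  describe(): string {
    return `${this.name} is a person`;
  }
}

class Pirate extends Person {
  describe(): string {
    return `${super.describe()} and a pirate`;
  }
}

class Captain extends Pirate {
  // The annotation restricts what this property may hold.
  crewSize: number = 0;

  describe(): string {
    return `${super.describe()} who captains a crew of ${this.crewSize}`;
  }
}

const captain = new Captain("Anne");
captain.crewSize = 12;
// captain.crewSize = "twelve"; // rejected by the compiler
// captain.name;                // rejected by the compiler: private

console.log(captain.describe());
// prints "Anne is a person and a pirate who captains a crew of 12"
```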

Is there any way to compile Standard ML to JavaScript taking advantage of MLton?

The only way I could imagine would be using Emscripten, but MLton has no LLVM backend. Is it possible somehow?

I don't think it is, and as I've commented on your other question, I don't see much point in doing so. Many of the optimisations MLton performs are not that relevant on top of an aggressive JIT compiler. On the other hand, you would need to compile not just the program, but also port the MLton runtime to JavaScript. In particular, this involves the memory management system. With the Emscripten route, you probably would need to run MLton's garbage collector nested inside JavaScript. That's usually a terrible idea, especially if you also want to interact with the JS environment in interesting ways, because then you would have to marshal and finalise back and forth across the language boundaries, which tends to imply horrible performance and a high potential for space leaks.
For this use case, the direct SMLtoJS compiler is what you want (although the site seems down right now).
