Benefits of inline caching - javascript

Is there any way to figure out how many (and which) property accesses in a JavaScript program are actually utilizing the inline caching technique, or which ones contribute most to the overall performance improvement from inline caching?

If you refer to the V8 page of the Chrome project, you will find this:
V8 compiles JavaScript source code directly into machine code when it is first executed. There are no intermediate byte codes, no interpreter. Property access is handled by inline cache code that may be patched with other machine instructions as V8 executes.
During initial execution of the code for accessing a property of a given object, V8 determines the object's current hidden class. V8 optimizes property access by predicting that this class will also be used for all future objects accessed in the same section of code and uses the information in the class to patch the inline cache code to use the hidden class. If V8 has predicted correctly the property's value is assigned (or fetched) in a single operation. If the prediction is incorrect, V8 patches the code to remove the optimisation.
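To get a feel for what this means in practice, here is an illustrative sketch (not V8's actual machinery): when every object reaching a property-access site has the same hidden class, the inline cache stays monomorphic and the access is a single patched load; mixing shapes degrades it.

// Objects created with the same property order share a hidden class.
function makePoint(x, y) {
    return { x: x, y: y };
}

function getX(p) {
    return p.x; // this property access is an inline-cache site
}

// Monomorphic: every object passed in has the same hidden class,
// so the inline cache's prediction keeps succeeding.
for (let i = 0; i < 100000; i++) {
    getX(makePoint(i, i + 1));
}

// Different property order or extra properties mean different hidden
// classes; the same site now sees several shapes and gets slower.
getX({ y: 1, x: 2 });
getX({ x: 3, y: 4, z: 5 });

As for measuring it: V8 exposes tracing flags such as --trace-ic that log inline-cache state transitions, though availability and output format vary by V8 version and build.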

Related

Why is accessing myInstance.property1.subproperty2.subproperty3 costly in JavaScript but free in C++?

I have always understood it to be a cost-savings operation in JavaScript to avoid repeatedly referencing a nested property on an object. Instead of writing a.b.c.d over and over, you would favor let x = a.b.c.d; and then use x (what I've often heard colloquially called "caching the reference.")
It recently came up in conversation with a friend that such a thing would be completely unnecessary and foolish in C++.
Is that true? If so, why? I guess it has to do with the difference in the underlying language implementation between a C++ object and a JavaScript object, but what is the difference exactly?
JavaScript objects are closer to C++'s std::map (or std::unordered_map) than they are to C++ classes.
C++ has the advantage of having separate compilation and run steps. The compiler can really take as long as it likes to analyze large chunks of your program and heavily optimize them. When you're writing C++ you aren't really writing a program for the CPU to execute. You're describing the behavior of a program and the compiler will use that description to come up with a program for you. Your browser's JavaScript runtime (likely a JIT compiler) simply doesn't have the time to do the same level of analysis and optimization. It has to compile and run your program quickly enough that users don't perceive any delay. That's not to say a JavaScript runtime won't do any optimization, but it will tend to be more incremental and localized than what a C++ compiler that takes 20 minutes to compile a program can do.
All of the attributes of a C++ class are known at compile time. When you access an attribute of an object in C++, the compiler will resolve that access at compile time, likely to a handful of memory loads or a single function call instruction. Since it's all resolved at compile time, it doesn't matter how deeply an attribute lookup is nested: the runtime performance will be the same. Additionally, the compiler will do that sort of memoization for you, likely by keeping a repeatedly-accessed attribute in a register.
The same is not true of JavaScript. JavaScript objects don't have a defined set of properties. They can be added and removed throughout the lifetime of the object. That means that the JavaScript runtime has to track an object's properties using some sort of associative data structure (likely a hash table). When you request an attribute of an object, the runtime has to look through that data structure to find the value that you want, and it has to do that lookup for every level of nesting.
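A minimal sketch of the "caching the reference" idiom from the question (the names a, b, c, d are illustrative):

// Illustrative nested object.
const a = { b: { c: { d: [1, 2, 3, 4] } } };
let total = 0;

// Without caching: every iteration repeats the lookups b -> c -> d.
for (let i = 0; i < a.b.c.d.length; i++) {
    total += a.b.c.d[i];
}

// With caching: the nested lookup happens once; x is then a direct reference.
const x = a.b.c.d;
for (let i = 0; i < x.length; i++) {
    total += x[i];
}

In C++ the compiler would typically hoist the equivalent of x into a register on its own; in JavaScript the runtime may or may not manage that, so doing it by hand is the safe bet.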

JavaScript compilation in V8

On the home page of V8 (Google's JavaScript engine) we read this:
V8 compiles and executes JavaScript source code
Does it mean that JavaScript is not an interpreted language in V8?
Does V8 use a just-in-time compilation approach for JavaScript?
Edit: There is another existing question which already addresses my first question, but not the second.
Does it mean that JavaScript is not an interpreted language in V8?
The answer to this is "it depends".
Historically V8 has compiled directly to machine code using its "full-codegen" compiler, which produces unoptimized code which uses inline caching to implement most operations such as arithmetic operations, load and stores of variables and properties, etc.
The code generated by full-codegen keeps track of how "hot" each function is, by adjusting a counter when the function is called and when it jumps back to the top of loops.
It also keeps track of the types of the variables used in each expression.
If it determines that a function (or part of a function) is very hot, and it has collected enough type information, it triggers the "Crankshaft" compiler which generates much better code.
However, the V8 developers are actively working on moving to a different system where they start off with an interpreter called "Ignition" and then use a compiler called "Turbofan" to produce optimized code for hot functions.
Here are a couple of posts from the V8 developers blog describing this:
Firing up the Ignition Interpreter
Help us test the future of V8
Does V8 use a just-in-time compilation approach for JavaScript?
Yes, in a number of ways.
Firstly, it has a lazy parsing and lazy compilation mechanism. This means that when it parses a JavaScript source file, it parses the outermost scope eagerly, generating the full-codegen code immediately.
However, for functions defined within the file, it skips over them and just records the name of the function and the location of its source code. It generates a dummy function which simply calls into the V8 runtime to trigger the actual compilation of the function.
Secondly, it has a two-stage compiler pipeline as described above, using either full-codegen + Crankshaft or Ignition + Turbofan.
When the compilation is triggered it will initially generate unoptimized code or ignition bytecode (which it can do very quickly), and then later if the code is hot it triggers an optimized re-compilation (which is much slower but generates much better code).
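If you want to watch this happen, V8 exposes tracing flags that Node.js passes through; for example --trace-opt and --trace-deopt log when functions are optimized and deoptimized (output format varies between versions). A minimal sketch:

// Run with: node --trace-opt --trace-deopt hot.js

function add(a, b) {
    return a + b; // type feedback is collected at this operation
}

// Make the function hot with stable number types; the trace should
// eventually report that `add` was optimized.
let sum = 0;
for (let i = 0; i < 1e6; i++) {
    sum = add(sum, i);
}

// Feeding it a different type can invalidate the optimized code;
// look for a corresponding deopt line in the trace.
add('a', 'b');
console.log(sum);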

Javascript: Where getter/setter values are stored? [duplicate]

I was thinking about this today and I realized I don't have a clear picture here.
Here are some statements I think to be true (please correct me if I'm wrong):
the DOM is a collection of interfaces specified by W3C.
when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
the ECMAScript spec has no reference of browser host objects (DOM, BOM, HTML5 APIs etc.).
how the DOM is actually implemented depends on browser internals and is probably different among most of them.
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?
I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)
All of your bullet points are correct, except:
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.
On Mozilla's JS-to-DOM bridge:
The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.
The specific way the underlying C++ implementation gets called depends on the API and has also changed over time (note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code):
At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
For the core ECMAScript functionality and in some tricky DOM cases the JS engine/the browser uses these APIs directly to define host objects and their behaviors, but it requires a lot of common boilerplate code for e.g. checking parameter types, converting them to the appropriate C++ types, error handling etc.
For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out what they needed to do in every particular case dynamically. This post about quickstubs walks you through one example.
"Quick stubs" mentioned in the previous link is a way to optimize JS->C++ calls time by trading some code size for it: instead of always using generic C++ functions that know how to make any kind of call, the specialized code is automatically generated at the Firefox build time for a pre-defined list of "hot" calls.
Later on the JIT (tracemonkey at that time) was taught to generate the code calling C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (jaegermonkey) work in this regard.
With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?
I'm fuzzy on the details of the last three points in particular, so take them with a grain of salt.
The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).
As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
The DOM is implemented as a language-independent library pretty much in all major browser implementations, which means it's in a different library from the Javascript engine. For example in IE, the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro(JS) and WebCore(DOM). Chrome has V8(JS) and WebCore(DOM), and Firefox has SpiderMonkey/TraceMonkey(JS) and Gecko(DOM).
What this means is that anytime your JS has to access the DOM, it has to reach over to the DOM library - which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is 2 pieces of land connected by a toll bridge, any time you touch the DOM, you must cross over the bridge and cross back - paying a performance toll.
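This is why the usual advice is to cross the bridge as rarely as possible: fetch DOM references once, cache them in JS variables, and loop over plain JS values. A small sketch (the element id is illustrative):

// Costly: re-queries the DOM and re-reads a DOM property on every iteration.
for (let i = 0; i < document.getElementById('items').childNodes.length; i++) {
    // each pass pays the bridge toll several times
}

// Cheaper: cross the bridge once up front, then reuse the references.
const list = document.getElementById('items');
const nodes = list.childNodes;
const len = nodes.length; // snapshot the length in a plain JS variable
for (let i = 0; i < len; i++) {
    // work with nodes[i]
}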
References
Video: Building High Performance Web Applications and Sites
Book: High Performance Javascript (Chapter 3 on the DOM)

Are DOM objects javascript objects?

This is something I cannot find an official answer about. For some, DOM objects are JS objects; for others, they differ. What is the right answer?
By searching Stack Overflow, you may find conflicting opinions.
For example, does the object document.body belong to the DOM API only, or may it be considered part of the JavaScript engine too?
Does JavaScript create an internal representation of it, or does it just communicate with the DOM to access it?
The DOM API is a collection of standards which have implementations in a variety of programming languages.
The DOM available to JavaScript in a browser provides things in the form of JavaScript objects. Large portions of it are written in native code (so are handled by libraries not written in JavaScript but made available through a JavaScript API).
Where JavaScript leaves off and native code begins doesn't really matter; it is an implementation detail and probably varies from browser to browser. The point of having a standard API is that developers using it interact with that API and don't need to worry about how it is implemented under the hood.
Strictly speaking, no. The JavaScript runtime has access to them, and in that capacity they can function as JavaScript objects. But they are defined in a way that is not bound to any particular language, and in most DOM implementations, they're native code. Most DOM implementations take care to make the objects function the same way you'd expect other objects in the chosen language to work, but that's not always the same way that JavaScript objects do: for example, you can't go around adding dynamic properties to objects when you're working in Java.
For most practical purposes, when you're working in the browser or in some other JavaScript runtime, yes. As I stated above, most DOM implementations try to make the DOM objects work the same way as other objects in the language, and for JavaScript, that means making them work like "real" JavaScript objects. Although IE took a while to really get this right (you need IE9+ to take full advantage), these days you can pretty much use DOM objects the same way you'd use any other JavaScript object.
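For instance, in a modern browser a DOM element accepts dynamic "expando" properties just like any plain JavaScript object:

const el = document.createElement('div');

el.myFlag = 42;                    // dynamic (expando) property
console.log(el.myFlag);            // 42
console.log('myFlag' in el);       // true
console.log(el instanceof Object); // true: it participates in the JS object model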
If you inspect the __proto__ chain of document.body, for instance, you will find this:
HTMLBodyElement > HTMLElement > Element > Node > EventTarget > Object
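You can reproduce that chain yourself with a few lines in the browser console:

let proto = Object.getPrototypeOf(document.body);
while (proto !== null) {
    // logs HTMLBodyElement, HTMLElement, Element, Node, EventTarget, Object
    console.log(proto.constructor.name);
    proto = Object.getPrototypeOf(proto);
}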
So yes: in the browser's context, DOM objects are JS objects; the reverse is of course not true.
But the DOM API is not exclusive to JavaScript: it defines interfaces which can be implemented in any language. For instance, Python has a DOM API too, and in that case DOM objects are Python objects.
The DOM objects are not part of the JavaScript language, they are part of the environment that is provided when JavaScript runs in a browser.
When JavaScript runs in another environment, for example in Node.js, then there is no DOM. Instead there are other objects that make up the environment that the script works with.
The DOM objects are there just for JavaScript, so the script works directly with the objects; there is no extra wrapper to make them available to JavaScript.

Why is checking for an attribute using dot notation before removing faster than removing the attribute outright?

I asked this question, and it turned out that when removing an attribute from an element, checking whether the attribute exists first using elem.xxx !== undefined makes the runtime faster. Proof.
Why is it quicker? There's more code to go through and you'll have to encounter the removeAttribute() method whichever way you go about this.
Well, the first thing you need to know is that elem.xxx is not the same as elem.getAttribute() or any other attribute-related method.
elem.xxx is a property of the DOM element, while the attribute lives on the HTML element inside the DOM; the two are similar but different. For example, take this DOM element: <a href="#"> and this code:
// Let's say var a is the <a> tag
a.getAttribute('href'); // == #
a.href; // == http://www.something.com/# (i.e. the complete URL)
But let's take a custom attribute: <a custom="test">
// Let's say var a is the <a> tag
a.getAttribute('custom'); // == test
a.custom; // == undefined
So you can't really compare the speed of the two, since they don't achieve the same result. But one is clearly faster: properties are fast-access data, while attributes go through the get/hasAttribute DOM functions.
Now, why is it faster without the condition? Simply because removeAttribute doesn't care whether the attribute is missing; it already performs that check internally.
So using hasAttribute before removeAttribute is like doing the check twice, and the conditional version is a little slower since it needs to evaluate the condition before running the code.
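A quick way to convince yourself that the guard is redundant:

const div = document.createElement('div');

// No error, even though the attribute was never set:
div.removeAttribute('xxx');

// The guarded version effectively performs the existence check twice,
// once in the if and once inside removeAttribute itself:
if (div.hasAttribute('xxx')) {
    div.removeAttribute('xxx');
}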
I have a suspicion that the reason for the speed boost are trace trees.
Trace trees were first introduced by Andreas Gal and Michael Franz of the University of California, Irvine, in their paper Incremental Dynamic Code Generation with Trace Trees.
In his blog post Tracing the Web Andreas Gal (the co-author of the paper) explains how tracing Just-in-Time compilers works.
To explain tracing JIT compilers as succinctly as possible (since my knowledge of the subject isn't profound), a tracing JIT compiler does the following (a simplified sketch follows the list):
Initially all the code to be run is interpreted.
A count is kept for the number of times each code path is executed (e.g. the number of times the true branch of an if statement is executed).
When the number of times a code path is taken exceeds a predefined threshold, the code path is compiled into machine code to speed up execution (e.g. I believe SpiderMonkey compiles code paths executed more than once).
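Here is the simplified sketch promised above (purely hypothetical; no real engine is structured like this), just to make the counting idea concrete:

const THRESHOLD = 2;
const counts = new Map();

// "Interpret" a code path until it has run THRESHOLD times, then switch
// to a pre-specialized stand-in for the machine code a real tracing JIT
// would have emitted for that path.
function runPath(name, interpreted, compiled) {
    const n = (counts.get(name) || 0) + 1;
    counts.set(name, n);
    return n <= THRESHOLD ? interpreted() : compiled();
}

for (let i = 0; i < 5; i++) {
    runPath('true-branch',
        () => { console.log('interpreting'); return i * 2; },
        () => i * 2); // the "compiled" fast path
}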
Now let's take a look at your code and understand what is causing the speed boost:
Test Case 1: Check
if (elem.hasAttribute("xxx")) {
    elem.removeAttribute("xxx");
}
This code has a code path (i.e. an if statement). Remember that tracing JITs only optimize code paths and not entire functions. This is what I believe is happening:
Since the code is being benchmarked by JSPerf it's being executed more than once (an understatement). Hence it is compiled into machine code.
However it still incurs the overhead of the extra function call to hasAttribute which is not JIT compiled because it's not a part of the conditional code path (the code between the curly braces).
Hence although the code inside the curly braces is fast the conditional check itself is slow because it's not compiled. It is interpreted. The result is that the code is slow.
Test Case 2: Remove
elem.removeAttribute("xxx");
In this test case we don't have any conditional code paths. Hence the JIT compiler never kicks in. Thus the code is slow.
Test Case 3: Check (Dot Notation)
if (elem.xxx !== undefined) {
    elem.removeAttribute("xxx");
}
This is the same as the first test case with one significant difference:
The conditional check is a simple non-equivalence check. Hence it doesn't incur the full overhead of a function call.
Most JavaScript interpreters optimize simple equivalence checks like this by assuming a fixed data type for both the variables. Since the data type of elem.xxx or undefined is not changing every iteration this optimization makes the conditional check even faster.
The result is that the conditional check (although interpreted) does not slow down the compiled code path significantly. Hence this code is the fastest.
Of course, this is just speculation on my part. I don't know the internals of a JavaScript engine, and hence my answer is not canonical. However, I believe it is a good educated guess.
Your proof is incorrect...
elem.class !== undefined always evaluates to false, and thus elem.removeAttribute("class") is never called; therefore, this test will always be quicker.
The correct property on elem to use is className, e.g.:
typeof elem.className !== "undefined"
As Karl-André Gagnon pointed out, accessing a [native] JavaScript property and invoking a DOM function/property are two different operations.
Some DOM properties are exposed as JavaScript properties via the DOM IDL; these are not the same as ad-hoc JS properties and require DOM access. Also, even though the DOM properties are exposed, there is no strict relation with DOM attributes!
For instance, inputElm.value = "x" will not update the DOM attribute, even though the element will display and report an updated value. If the goal is to deal with DOM attributes, the only correct method is to use hasAttribute/setAttribute, etc.
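A sketch of that divergence with a detached input element:

const input = document.createElement('input');
input.setAttribute('value', 'initial');

input.value = 'x'; // updates the property, i.e. what the user would see

console.log(input.value);                 // "x"
console.log(input.getAttribute('value')); // "initial" (the DOM attribute is untouched)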
I've been working on deriving a "fair" micro-benchmark for the different function calls, but it is fairly hard and there are a lot of different optimizations that occur. Here is my best result, which I will use to argue my case.
Note that there is no if or removeAttribute to muddle up the results and I am focusing only on the DOM/JS property access. Also, I attempt to rule out the claim that the speed difference is merely due to a function call and I assign the results to avoid blatant browser optimizations. YMMV.
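A rough sketch of the kind of measurement involved (illustrative, not the original benchmark; absolute numbers and relative gaps vary widely by browser):

const img = document.createElement('img');
img.src = 'x.png';  // DOM-reflected property
img.custom = 'y';   // plain JS (expando) property

function time(label, fn) {
    const t0 = performance.now();
    let sink; // keep the result live so the access isn't optimized away
    for (let i = 0; i < 1e6; i++) sink = fn();
    console.log(label, (performance.now() - t0).toFixed(1) + ' ms', typeof sink);
}

time('JS property ', () => img.custom);              // native JS property
time('DOM property', () => img.src);                 // DOM-backed property
time('DOM function', () => img.getAttribute('src')); // DOM attribute access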
Observations:
Access to a JS property is fast. This is to be expected [1][2].
Calling a function can incur a higher cost than direct property access [1], but it is not nearly as slow as DOM properties or DOM functions. That is, it is not merely a "function call" that makes hasAttribute so much slower.
DOM property access is slower than native JS property access; however, performance differs widely between DOM properties and browsers. My updated micro-benchmark shows a trend that DOM access - be it via a DOM property or a DOM function - may be slower than native JS property access [2].
And going back to the very top: Accessing a non-DOM [JS] property on an element is fundamentally different than accessing a DOM property, much less a DOM attribute, on the same element. It is this fundamental difference, and optimizations (or lack thereof) between the approaches across browsers, that accounts for the observed performance differences.
[1] IE 10 does some clever trick where the fake function call is very fast (I suspect the call has been elided) even though it has abysmal JS property access. However, treating IE as an outlier, or merely as reinforcement that the function call is not what introduces the inherently slower behavior, doesn't detract from my primary argument: it is the DOM access that is fundamentally slower.
[2] I would love to say DOM property access is slower, but Firefox does some amazing optimization of input.value (though not img.src). There is some special magic happening there. Firefox does not optimize the DOM attribute access.
And different browsers may exhibit entirely different results. However, I don't think one has to invoke any "magic" around the if or removeAttribute to isolate what I believe to be the real performance issue: actually using the DOM.
