I am studying the V8 sources.
I have spent three weeks, but I couldn't find out how V8 calls the DOM's functions.
For example:
<script>
document.writeln("Hello V8");
</script>
I want to know the call sequence for the DOM's writeln() function.
Could you explain this or give me some hints?
You could check the V8HTMLDocumentCustom.cpp file, where the .writeln function is found:
void V8HTMLDocument::writelnMethodCustom(const v8::FunctionCallbackInfo<v8::Value>& args)
{
    HTMLDocument* htmlDocument = V8HTMLDocument::toNative(args.Holder());
    htmlDocument->writeln(writeHelperGetString(args), activeDOMWindow()->document());
}
As you can see, several headers are included; some of those includes lead you to other headers, where you find files like V8DOMConfiguration.h.
V8DOMConfiguration.h has some comments:
class V8DOMConfiguration {
public:
    // The following Batch structs and methods are used for setting multiple
    // properties on an ObjectTemplate, used from the generated bindings
    // initialization (ConfigureXXXTemplate). This greatly reduces the binary
    // size by moving from code driven setup to data table driven setup.
What I gather from this is that V8 creates "wrapper worlds" of objects, recreating the DOM for each of them, and then passes data to the active window.
I'm not well versed in V8, but this is a starting point. Maybe someone with deeper knowledge of it can explain it better.
Update
As @Esailija points out, the V8 engine has no DOM available when run without a browser. Since the DOM is part of WebKit/Blink, the references linked above point there. Once the browser has built the DOM, V8 objects are matched with the DOM tree elements. There's a related question about this here: V8 Access to DOM
Related
I was thinking about this today and I realized I don't have a clear picture here.
Here are some statements I think to be true (please correct me if I'm wrong):
the DOM is a collection of interfaces specified by W3C.
when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
the ECMAScript spec makes no reference to browser host objects (DOM, BOM, HTML5 APIs, etc.).
how the DOM is actually implemented depends on browser internals and is probably different among most of them.
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?
I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)
All of your bullet points are correct, except:
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.
On Mozilla's JS-to-DOM bridge:
The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.
The specific way the underlying C++ implementation gets called depends on the API and has also changed over time (note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code):
At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
For the core ECMAScript functionality and in some tricky DOM cases the JS engine/the browser uses these APIs directly to define host objects and their behaviors, but it requires a lot of common boilerplate code for e.g. checking parameter types, converting them to the appropriate C++ types, error handling etc.
For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out what they needed to do in every particular case dynamically. This post about quickstubs walks you through one example.
"Quick stubs" mentioned in the previous link is a way to optimize JS->C++ calls time by trading some code size for it: instead of always using generic C++ functions that know how to make any kind of call, the specialized code is automatically generated at the Firefox build time for a pre-defined list of "hot" calls.
Later on, the JIT (TraceMonkey at that time) was taught to generate code that calls C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (JägerMonkey) work in this regard.
With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?
I'm fuzzy on the details of the last three points in particular, so take them with a grain of salt.
The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).
As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
The DOM is implemented as a language-independent library in pretty much all major browser implementations, which means it lives in a different library from the JavaScript engine. For example, in IE the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro (JS) and WebCore (DOM). Chrome has V8 (JS) and WebCore (DOM), and Firefox has SpiderMonkey/TraceMonkey (JS) and Gecko (DOM).
What this means is that any time your JS has to access the DOM, it has to reach over to the DOM library, which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is two pieces of land connected by a toll bridge: any time you touch the DOM, you must cross over the bridge and back, paying a performance toll each way.
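To make the toll concrete, here is a minimal sketch (the items id is made up for illustration): the first loop crosses the JS/DOM bridge on every pass, while the second pays the toll once and then stays on the JavaScript side.

// Hypothetical markup: <ul id="items">...</ul>
var list = document.getElementById('items');

// Crosses the JS/DOM bridge on every iteration: both the childNodes
// collection and its length live on the DOM side of the bridge.
for (var i = 0; i < list.childNodes.length; i++) {
    // ... process list.childNodes[i] ...
}

// Pays the toll once up front, then loops over cached references.
var children = list.childNodes;
var count = children.length;
for (var j = 0; j < count; j++) {
    // ... process children[j] ...
}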
References
Video: Building High Performance Web Applications and Sites
Book: High Performance Javascript (Chapter 3 on the DOM)
This is something I cannot find an official answer about. For some, DOM objects are JS objects; for others, they differ. What is the right answer?
Searching Stack Overflow turns up conflicting opinions.
For example, does the object document.body belong to the DOM API only, or may it be considered part of the JavaScript engine too?
Does JavaScript create an internal representation of it, or does it just communicate with the DOM to access it?
The DOM API is a collection of standards which have implementations in a variety of programming languages.
The DOM available to JavaScript in a browser provides things in the form of JavaScript objects. Large portions of it are written in native code (so are handled by libraries not written in JavaScript but made available through a JavaScript API).
Where JavaScript leaves off and native code begins doesn't really matter; it is an implementation detail and probably varies from browser to browser. The point of having a standard API is that developers using it interact with that API and don't need to worry about how it is implemented under the hood.
Strictly speaking, no. The JavaScript runtime has access to them, and in that capacity they can function as JavaScript objects. But they are defined in a way that is not bound to any particular language, and in most DOM implementations, they're native code. Most DOM implementations take care to make the objects function the same way you'd expect other objects in the chosen language to work, but that's not always the same way that JavaScript objects do: for example, you can't go around adding dynamic properties to objects when you're working in Java.
For most practical purposes, when you're working in the browser or in some other JavaScript runtime, yes. As I stated above, most DOM implementations try to make the DOM objects work the same way as other objects in the language, and for JavaScript, that means making them work like "real" JavaScript objects. Although IE took a while to really get this right (you need IE9+ to take full advantage), these days you can pretty much use DOM objects the same way you'd use any other JavaScript object.
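For example, here is a quick console experiment (an illustration only, nothing authoritative) showing a DOM node behaving like any other JavaScript object in a modern browser:

var el = document.body;

// Expando (dynamic) property, exactly as on a plain JS object.
el.myCustomFlag = 42;
console.log(el.myCustomFlag); // 42

// It also participates in the ordinary prototype machinery.
console.log(el instanceof Object);     // true
console.log(typeof el.hasOwnProperty); // "function"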
If you inspect the __proto__ chain of document.body, for instance, you will find this:
HTMLBodyElement > HTMLElement > Element > Node > EventTarget > Object
So yes: in the browser's context, DOM objects are JS objects. The reverse is not true, of course.
But the DOM API is not exclusive to JavaScript: it defines interfaces which can be implemented in any language. Python, for instance, has a DOM API too, and in that case DOM objects are Python objects.
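You can verify that chain yourself by walking the prototypes in the browser console, for instance:

var proto = Object.getPrototypeOf(document.body);
while (proto !== null) {
    // Logs: HTMLBodyElement, HTMLElement, Element, Node, EventTarget, Object
    console.log(proto.constructor.name);
    proto = Object.getPrototypeOf(proto);
}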
The DOM objects are not part of the JavaScript language, they are part of the environment that is provided when JavaScript runs in a browser.
When JavaScript runs in another environment, for example in Node.js, then there is no DOM. Instead there are other objects that make up the environment that the script works with.
The DOM objects are there just for JavaScript, so the script works directly with the objects; there is no extra wrapper to make them available to JavaScript.
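A quick, informal way to see the difference between the two environments is to probe for their host objects:

// In a browser, `document` is a host object provided by the environment:
console.log(typeof document); // "object" in a browser, "undefined" in Node.js

// Node.js provides different host objects instead:
console.log(typeof process); // "object" in Node.js, "undefined" in a browser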
I'm currently reading a book about building single page web applications. The application is currently in its very early stages, but the author incorporated two functions into the shell code, stateMap and jqueryMap. stateMap is for placing dynamic information shared across the module...I think I understand that. However, the jqueryMap is used to "cache jquery collections. This function should be in almost every shell and feature module we write. The use of the jqueryMap cache can greatly reduce the number of jQuery document traversals and improve performance."
Is anyone familiar with this technique? Can you explain this further?
Even elementary DOM lookups in JavaScript, by element ID or class name, take the browser considerable time, especially if the document is large.
Consider the following:
<div id="my-window"> ... </div>
A 'normal' jQuery way to locate the above DIV element by its ID would be as follows:
var $my_window = $('#my-window'); // expensive document traversal!
(Note that $('...') always represents a collection in jQuery, even if there is only a single element that matches the selector.)
If, over the lifetime of your page, you need to refer to that DIV multiple times in your script, such repeated lookups consume extra CPU cycles, causing your page to appear slow and ultimately degrading the end-user experience. What the author is suggesting is to perform such expensive lookups only once and store the results in a local cache of sorts, which is just a JavaScript object holding a bunch of references. Those references can then be reused as needed very quickly:
var jqueryMap = {}, setJqueryMap;

setJqueryMap = function () {
    jqueryMap = {
        $my_window: $('#my-window'),
        // store other references here
    };
};
All you need to do is call the setJqueryMap function once when your module loads. Then you can refer to the desired element (or elements) by their 'cached' references:
setJqueryMap();
...
jqueryMap.$my_window // do something with the element
That way the repeated traversals are avoided, making your script perform much faster.
I've created a very simple test case that creates a Backbone view, attaches a handler to an event, and instantiates a user-defined class. I believe that by clicking the "Remove" button in this sample, everything will be cleaned up and there should be no memory leaks.
A jsfiddle for the code is here: http://jsfiddle.net/4QhR2/
// scope everything to a function
function main() {

    function MyWrapper() {
        this.element = null;
    }
    MyWrapper.prototype.set = function(elem) {
        this.element = elem;
    };
    MyWrapper.prototype.get = function() {
        return this.element;
    };

    var MyView = Backbone.View.extend({
        tagName: "div",
        id: "view",
        events: {
            "click #button": "onButton"
        },
        initialize: function(options) {
            // done for demo purposes only, should be using templates
            this.html_text = "<input type='text' id='textbox' /><button id='button'>Remove</button>";
            this.listenTo(this, "all", function() { console.log("Event: " + arguments[0]); });
        },
        render: function() {
            this.$el.html(this.html_text);
            this.wrapper = new MyWrapper();
            this.wrapper.set(this.$("#textbox"));
            this.wrapper.get().val("placeholder");
            return this;
        },
        onButton: function() {
            // assume this gets .remove() called on subviews (if they existed)
            this.trigger("cleanup");
            this.remove();
        }
    });

    var view = new MyView();
    $("#content").append(view.render().el);
}
main();
However, I am unclear how to use Google Chrome's profiler to verify that this is, in fact, the case. There are a gazillion things that show up on the heap profiler snapshot, and I have no idea how to decode what's good/bad. The tutorials I've seen on it so far either just tell me to "use the snapshot profiler" or give me a hugely detailed manifesto on how the entire profiler works. Is it possible to just use the profiler as a tool, or do I really have to understand how the whole thing was engineered?
EDIT: Tutorials like these:
Gmail memory leak fixing
Using DevTools
are representative of some of the stronger material out there, from what I've seen. However, beyond introducing the concept of the three-snapshot technique, I find they offer very little in terms of practical knowledge (for a beginner like me). The 'Using DevTools' tutorial doesn't work through a real example, so its vague and general conceptual description of things isn't overly helpful. As for the 'Gmail' example:
So you found a leak. Now what?
Examine the retaining path of leaked objects in the lower half of the Profiles panel
If the allocation site cannot be easily inferred (i.e. event listeners):
Instrument the constructor of the retaining object via the JS console to save the stack trace for allocations
Using Closure? Enable the appropriate existing flag (i.e. goog.events.Listener.ENABLE_MONITORING) to set the creationStack property during construction
I find myself more confused after reading that, not less. And, again, it's just telling me to do things, not how to do them. From my perspective, all of the information out there is either too vague or would only make sense to someone who already understood the process.
Some of these more specific issues have been raised in @Jonathan Naguin's answer below.
A good workflow to find memory leaks is the three snapshot technique, first used by Loreena Lee and the Gmail team to solve some of their memory problems. The steps are, in general:
Take a heap snapshot.
Do stuff.
Take another heap snapshot.
Repeat the same stuff.
Take another heap snapshot.
Filter objects allocated between Snapshots 1 and 2 in Snapshot 3's "Summary" view.
For your example, I have adapted the code to show this process (you can find it here), delaying the creation of the Backbone View until the click event of the Start button. Now:
Run the HTML (saved locally or using this address) and take a snapshot.
Click Start to create the view.
Take another snapshot.
Click remove.
Take another snapshot.
Filter objects allocated between Snapshots 1 and 2 in Snapshot 3's "Summary" view.
Now you are ready to find memory leaks!
You will notice nodes of a few different colors. Red nodes do not have direct references from JavaScript to them, but are alive because they are part of a detached DOM tree. There may be a node in the tree that is referenced from JavaScript (maybe in a closure or a variable) and is coincidentally preventing the entire DOM tree from being garbage collected.
Yellow nodes, however, do have direct references from JavaScript. Look for yellow nodes in the same detached DOM tree to locate the references from your JavaScript. There should be a chain of properties leading from the DOM window to the element.
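As an illustration of how such a detached tree typically arises (a contrived sketch, not taken from the question's code):

var detached = document.createElement('div');
detached.appendChild(document.createElement('span'));
document.body.appendChild(detached);

// Removing the subtree from the document does not free it: the
// `detached` variable still points into it, so the whole tree shows
// up in a snapshot as red nodes, with this one reference in yellow.
document.body.removeChild(detached);

// Dropping the last JS reference would let the GC reclaim it:
// detached = null;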
In your particular case you can see an HTML div element marked red. If you expand the element you will see that it is referenced by a "cache" function.
Select the row, type $0 in your console, and you will see the actual function and its location:
> $0
function cache( key, value ) {
    // Use (key + " ") to avoid collision with native prototype properties (see Issue #157)
    if ( keys.push( key += " " ) > Expr.cacheLength ) {
        // Only keep the most recent entries
        delete cache[ keys.shift() ];
    }
    return (cache[ key ] = value);
} jquery-2.0.2.js:1166
This is where your element is being referenced. Unfortunately there is not much you can do; it is an internal mechanism of jQuery. But, just for testing purposes, go to the function and change the method to:
function cache( key, value ) {
    return value;
}
Now if you repeat the process you will not see any red node :)
Documentation:
Eliminating memory leaks in Gmail.
Easing JavaScript Memory Profiling In Chrome DevTools.
Here's a tip on memory profiling of a jsfiddle: use the following URL to isolate your jsfiddle result; it removes all of the jsfiddle framework and loads only your result.
http://jsfiddle.net/4QhR2/show/
I was never able to figure out how to use the Timeline and Profiler to track down memory leaks until I read the following documentation. After reading the section entitled 'Object allocation tracker' I was able to use the 'Record Heap Allocations' tool and track down some detached DOM nodes.
I fixed the problem by switching from jQuery event binding to Backbone event delegation. It's my understanding that newer versions of Backbone will automatically unbind the events for you if you call View.remove(). Execute some of the demos yourself; they are set up with memory leaks for you to identify. Feel free to ask questions here if you still don't get it after studying this documentation.
https://developers.google.com/chrome-developer-tools/docs/javascript-memory-profiling
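For reference, the difference between the two binding styles looks roughly like this (a sketch reusing the question's #button id; the handler names are made up):

var LeakyView = Backbone.View.extend({
    initialize: function() {
        // Bound directly through jQuery on an object that outlives the
        // view: View.remove() never unbinds this handler, so it (and the
        // view it closes over) stay reachable after removal -- a leak.
        $(window).on('resize', this.onResize.bind(this));
    },
    onResize: function() { /* ... */ }
});

var CleanView = Backbone.View.extend({
    // Declared in the events hash: Backbone delegates the handler on
    // this.el and unbinds it automatically in View.remove() /
    // undelegateEvents().
    events: { "click #button": "onButton" },
    onButton: function() { /* ... */ }
});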
Basically you need to look at the number of objects in your heap snapshots. If the number of objects increases between two snapshots even though you've disposed of your objects, then you have a memory leak. My advice is to look for event handlers in your code which do not get detached.
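A typical offender looks something like this (a contrived sketch with made-up names): a listener on a long-lived object whose closure captures a node you meant to throw away.

function showPanel() {
    var panel = document.createElement('div');
    document.body.appendChild(panel);

    // The listener is registered on the long-lived document, and its
    // closure captures `panel`, so the node stays reachable as a
    // detached DOM node even after it is removed from the page.
    document.addEventListener('click', function onClick() {
        panel.classList.toggle('active');
    });

    document.body.removeChild(panel);

    // Fix: keep a reference to the handler and detach it when done:
    // document.removeEventListener('click', onClick);
}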
There is an introduction video from Google, which will be very helpful to find JavaScript memory leaks.
https://www.youtube.com/watch?v=L3ugr9BJqIs
You also might want to read :
http://addyosmani.com/blog/taming-the-unicorn-easing-javascript-memory-profiling-in-devtools/
It explains the use of the Chrome developer tools and gives step-by-step advice on how to confirm and locate a memory leak using heap snapshot comparison and the different heap snapshot views available.
You could also look at the Timeline tab in the developer tools. Record the usage of your app and keep an eye on the DOM node and event listener counts.
If the memory graph does indeed indicate a memory leak, then you can use the profiler to figure out what is leaking.
A couple of important notes in regards to identifying memory leaks using Chrome Developer tools:
1) Chrome itself has memory leaks for certain elements, such as password and number fields: https://bugs.chromium.org/p/chromium/issues/detail?id=967438. Avoid using those while debugging, as they pollute your heap snapshot when you search for detached elements.
2) Avoid logging anything to the browser console. Chrome will not garbage-collect objects written to the console, which affects your results. You can suppress output by placing the following code at the beginning of your script/page:
console.log = function() {};
console.warn = console.log;
console.error = console.log;
3) Use heap snapshots and search for "detach" to identify detached DOM elements. By hovering over objects, you get access to all their properties, including id and outerHTML, which may help identify each element.
If the detached elements are still too generic to recognize, assign them unique IDs using the browser console prior to running your test, e.g.:
var divs = document.querySelectorAll("div");
for (var i = 0; i < divs.length; i++) {
    divs[i].id = divs[i].id || "AutoId_" + i;
}
divs = null; // Free memory
Now, when you identify a detached element with, let's say, id="AutoId_49", reload your page, execute the snippet above again, and find the element with id="AutoId_49" using the DOM inspector or document.querySelector(..). Naturally this only works if your page content is predictable.
How I run my tests to identify memory leaks
1) Load the page (with console output suppressed!)
2) Do stuff on the page that could result in memory leaks
3) Use the Developer Tools to take a heap snapshot and search for "detach"
4) Hover over elements to identify them from their id or outerHTML properties
I second the advice to take heap snapshots; they're excellent for detecting memory leaks, and Chrome does an excellent job of snapshotting.
In my research project for my degree, I was building an interactive web application that had to generate a lot of data built up in 'layers'. Many of these layers would be 'deleted' in the UI, but for some reason the memory wasn't being deallocated. Using the snapshot tool, I was able to determine that jQuery had been keeping a reference to the object (the source was a .load() event I was trying to trigger, which kept the reference despite going out of scope). Having this information at hand single-handedly saved my project; the snapshot tool is highly useful when you're using other people's libraries and lingering references stop the GC from doing its job.
EDIT:
It's also useful to plan ahead which actions you're going to perform, to minimize the time spent snapshotting; hypothesize what could be causing the problem and test each scenario, taking snapshots before and after.
Adding my 2 cents here with the tools available in 2021: https://yonatankra.com/how-to-profile-javascript-performance-in-the-browser/
There's a short video version here: https://yonatankra.com/detect-memory-leak-with-chrome-dev-tools
In one article I have seen that it may be good to clear all expandos on the window.unload event to prevent memory leaks.
I cannot understand why to do this.
Doesn't the browser clean up the whole DOM and its associated resources once you leave the page anyway?
Thanks,
burak ozdogan
Hey, great question. The problem is with circular references between JavaScript objects and DOM nodes.
Let's say you have a global JavaScript object which points to a DOM node and the node has an expando property back to the object. When the page unloads, the script engine "nulls-out" the JavaScript object so it no longer points to the DOM node. But, it cannot release the object from memory because there is still a reference to it (from the DOM). Then the script engine terminates.
Expando properties on the DOM are nothing but references to other objects. When the DOM is being cleaned up, it breaks those references but assumes that the objects are still being used. In this example, the DOM waits for the script engine to clean up the objects that belong to it, but the script engine has already terminated.
So, the problem is that the DOM only takes care of the memory that belongs to it and assumes the script engine will do the same.
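In code, the cycle being described looks something like this (an IE-era sketch; the 'box' id and the property names are illustrative):

var controller = {};                        // a JS object owned by the script engine
var node = document.getElementById('box');  // a DOM node owned by the DOM

controller.element = node;  // JS -> DOM reference
node.expando = controller;  // DOM -> JS expando: the cycle now spans
                            // two separate memory managers

// The clean-up the article recommends: break the cycle before leaving.
window.onunload = function () {
    node.expando = null;
    controller.element = null;
};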
I hope this helped.
See: http://msdn.microsoft.com/en-us/library/bb250448%28VS.85%29.aspx