This is something I cannot find an official answer about. For some, DOM objects are JS objects; for others, they differ. What is the right answer?
Searching Stack Overflow, you will find conflicting opinions.
For example, does the object document.body belong only to the DOM API, or may it also be considered part of the JavaScript engine?
Does JavaScript create an internal representation of it, or does it just communicate with the DOM to access it?
The DOM API is a collection of standards which have implementations in a variety of programming languages.
The DOM available to JavaScript in a browser provides things in the form of JavaScript objects. Large portions of it are written in native code (so are handled by libraries not written in JavaScript but made available through a JavaScript API).
Where JavaScript leaves off and native code begins doesn't really matter, it is an implementation detail and probably varies from browser to browser. The point of having a standard API is that developers using it interact with that API and don't need to worry about how it is implemented under the hood.
Strictly speaking, no. The JavaScript runtime has access to them, and in that capacity they can function as JavaScript objects. But they are defined in a way that is not bound to any particular language, and in most DOM implementations, they're native code. Most DOM implementations take care to make the objects function the same way you'd expect other objects in the chosen language to work, but that's not always the same way that JavaScript objects do: for example, you can't go around adding dynamic properties to objects when you're working in Java.
For most practical purposes, when you're working in the browser or in some other JavaScript runtime, yes. As I stated above, most DOM implementations try to make the DOM objects work the same way as other objects in the language, and for JavaScript, that means making them work like "real" JavaScript objects. Although IE took a while to really get this right (you need IE9+ to take full advantage), these days you can pretty much use DOM objects the same way you'd use any other JavaScript object.
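For instance, in any reasonably modern browser the following works just as it would on a plain object (a minimal sketch; the property name myAppData is an arbitrary example, not an API):

var body = document.body;
body.myAppData = { loadedAt: Date.now() };             // expando property, stored right on the DOM object
console.log(body.myAppData.loadedAt);                  // readable like any other property
console.log(body instanceof Object);                   // true
console.log(Object.keys(body).includes('myAppData'));  // true: it is an own, enumerable property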
If you inspect the prototype chain (__proto__) of document.body, for instance, you will find this:
HTMLBodyElement > HTMLElement > Element > Node > EventTarget > Object
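You can confirm that chain yourself in the browser console; one way to walk it:

let proto = Object.getPrototypeOf(document.body);
while (proto !== null) {
  console.log(proto.constructor.name); // HTMLBodyElement, HTMLElement, Element, Node, EventTarget, Object
  proto = Object.getPrototypeOf(proto);
}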
So yes: in the browser's context, DOM objects are JS objects. The reverse is not true, of course.
But the DOM API is not exclusive to JavaScript: it defines interfaces that can be implemented in any language. Python, for instance, has a DOM API too, and in that case DOM objects are Python objects.
The DOM objects are not part of the JavaScript language, they are part of the environment that is provided when JavaScript runs in a browser.
When JavaScript runs in another environment, for example in Node.js, then there is no DOM. Instead there are other objects that make up the environment that the script works with.
In the browser, the DOM objects exist specifically for JavaScript, so a script works with them directly; there is no extra wrapper needed to make them available to JavaScript.
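A quick way to see the difference between environments (a sketch; the exact globals available depend on the host):

// In a browser:
typeof document;   // "object"    – the DOM is part of the host environment
typeof window;     // "object"

// In Node.js (without a DOM library such as jsdom):
typeof document;   // "undefined" – no DOM is provided
typeof process;    // "object"    – Node supplies its own host objects instead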
I was reading the ECMAScript 2019 spec (the same is true in ES6 too), where I found:
Each object in an ECMAScript engine is associated with a set of internal methods that defines its runtime behaviour. These internal methods are not part of the ECMAScript language. They are defined by this specification purely for expository purposes. However, each object within an implementation of ECMAScript must behave as specified by the internal methods associated with it. The exact manner in which this is accomplished is determined by the implementation.
I also found two Stack Overflow questions (question1 and question2), but their answers don't seem to give me the answer I am looking for.
My question is simple. If JavaScript engines decide not to implement some of them, then how do they satisfy this statement from the spec above:
However, each object within an implementation of ECMAScript must behave as specified by the internal methods associated with it.
Let us take an example:
[[GetPrototypeOf]], [[Get]], [[Set]], [[GetOwnProperty]], etc. are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality? Clearly it has to implement them somehow; is it just that it can choose different method names and different method signatures, since the spec does not enforce those?
Where am I wrong?
Similarly for internal slots: if engines don't have internal variables storing that state, how on earth do they maintain the state of an object when asked?
EDIT: I will add more details to clarify my question. Take Object.getPrototypeOf() as an example. This is a public API for the internal behaviour [[GetPrototypeOf]], and there are several possible algorithms for implementing it. The question is not about the possible ways to implement a behaviour; it is about whether an engine can omit that behaviour entirely and still satisfy the spec's overall object behaviour.
V8 developer here. I think this question has mostly been answered already in the comments, so I'll just summarize.
Are internal slot and internal methods actually implemented by JavaScript engines?
Generally not; the engine simply behaves as if its internals were structured in this way. Some parts of an implementation might be very close to the spec's structure, if it's convenient.
One way to phrase it would be: you could implement a JavaScript engine by first faithfully translating the spec text to code (in whichever language you choose to use for your engine), and then you'd be allowed to refactor the invisible internals in any way you want (e.g.: inline functions, or split them up, or organize them as a helper class, or add a fast path or a cache, or generally turn the code inside out, etc). Which isn't surprising, really: as long as the observable behavior remains the same, any program is allowed to refactor its internals. What the ECMAScript spec is making clear at that point is simply that the "internal slots" really are guaranteed to always be internal and not observable.
[[[Get]] etc] are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality?
It's not about refusing to implement something. You can usually implement functionality in many different ways, i.e. with many different ways of structuring your code and your objects. Engines are free to structure their code and objects any way they want, as long as the resulting observable behavior is as specified.
Let us take an example of Object.getPrototypeOf(). This is an API for internal behaviour [[GetPrototypeOf]]
Not quite. Object.getPrototypeOf is a public function that's specified to behave in a certain way. The way the spec describes it is that it must behave as if there were an internal slot [[GetPrototypeOf]].
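From the outside, that observable behaviour is all you can test. For instance (variable names are just illustrative), a Proxy even lets ordinary script hook the point where the spec says [[GetPrototypeOf]] is consulted:

const plain = {};
console.log(Object.getPrototypeOf(plain) === Object.prototype); // true

// For a Proxy, the spec routes [[GetPrototypeOf]] through the trap below.
const proxied = new Proxy({}, {
  getPrototypeOf(target) {
    return Array.prototype;                       // report a different prototype
  }
});
console.log(Object.getPrototypeOf(proxied) === Array.prototype); // true
console.log(proxied instanceof Array);                           // true – instanceof consults it too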
You seem to have trouble imagining an alternative way. Well, in many cases, engines will probably choose to have an implementation that's very close to having those internal slots -- perhaps mapped to fields and methods in a C++ class. But it doesn't have to be that way; for example, instead of class methods, there could be free functions: GetPrototypeImpl(internal::Object object) rather than internal::Object::GetPrototypeImpl(). Or instead of an inheritance/hierarchy structure, the engine could use switch-statements over types.
One of the most common ways in which engines' implementations deviate from the structure defined by the spec's internal slots is by having additional fast paths. Typically, a fast path performs a few checks to see if it is applicable, and then does the simple, common case; if the applicability check fails, it falls back to a slower, more complete implementation, that might be much closer to the spec's structure. Or maybe neither function on its own contains the complete spec'ed behavior: you could have GetPrototypeFromRegularObject and GetPrototypeFromProxy plus a wrapper dispatching to the right one, and those all together behave like the spec's hypothetical system of having a [[GetPrototypeOf]] slot on both proxies and regular objects. All of that is perfectly okay because from the outside you can't see a difference in behavior -- all you can see is Object.getPrototypeOf.
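As a purely illustrative toy model (written in JavaScript for readability, not V8's actual code; every name here is hypothetical), such a dispatching wrapper might look like this:

function getPrototypeFromRegularObject(internalObject) {
  return internalObject.prototypeRef;                 // fast, common case: a direct field read
}
function getPrototypeFromProxy(internalObject) {
  return internalObject.handler.getPrototypeOf(internalObject.target); // slower, more general path
}
function getPrototypeOf(internalObject) {             // the wrapper dispatching to the right one
  return internalObject.isProxy
    ? getPrototypeFromProxy(internalObject)
    : getPrototypeFromRegularObject(internalObject);
}

const regular = { isProxy: false, prototypeRef: null };
console.log(getPrototypeOf(regular));                 // null – behaves as if it had a [[GetPrototypeOf]] method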
One particular example of a fast path is a compiler. If you implemented object behaviors as (private) methods, and loaded and called those methods every time, then your implementation would be extremely slow. Modern engines compile JavaScript functions to bytecode or even machine code, and that code will behave as if you had loaded and called an internal function with the given behavior, but it (usually) will not actually call any such functions. For example, optimized code for an array[index] access should only be a few machine instructions (type check, bounds check, memory load), there should be no call to a [[Get]] involved.
Another very common example is object types. The spec typically uses wording like "if the object has a [[StringData]] internal slot, then ..."; an engine typically replaces that with "if the object's type is what I've chosen for representing strings internally, then ...". Again, the difference is not observable from the outside: Strings behave as if they had a [[StringData]] internal slot, but (in V8 at least) they don't have such a slot, they simply have an appropriate object type that identifies them as strings, and objects with string type know where their character payload is, they don't need any special slot for that.
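A toy sketch of that idea (again with hypothetical names, just to illustrate how a type tag can replace a slot check):

const internalString = { type: 'string', chars: 'hello' };
const internalArray  = { type: 'array',  elements: [1, 2, 3] };

function hasStringData(internalObject) {
  return internalObject.type === 'string';  // no separate [[StringData]] slot is needed
}

console.log(hasStringData(internalString)); // true
console.log(hasStringData(internalArray));  // false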
Edit: forgot to mention: see also https://v8.dev/blog/understanding-ecmascript-part-1 for another way to explain it.
Please tell me where DOM objects are kept. It seems that some part of the browser (the rendering engine, or maybe the browser engine) creates them and keeps them.
For example, if we have an <input> tag in an HTML document, a DOM object is created which inherits properties and methods from HTMLInputElement, HTMLElement, Element, Node, EventTarget, and Object. It is not clear to me where these come from; it seems they do not come from the JavaScript engine.
I cannot figure it out. Some people say that DOM objects are JavaScript objects, but not much more than that. Pure JS objects are created in the JS engine, but the DOM is written in C++ and still lets you use DOM objects just like other JS objects. DOM objects look like other objects from the JS language and work like real JS objects.
How can these DOM objects accept user-defined properties that can be "seen" in the JS engine, if the DOM objects do not "live" in the JS engine?
I am coming to the conclusion that DOM objects are created in the browser engine (e.g. Gecko), which is written in C++, and that they inherit from classes like Node, Element, and HTMLElement.
So it seems that Node and HTMLElement are host objects. In other words, they are objects built by the browser engine (Gecko), not objects of the JS engine; some kind of instances of Gecko's built-in classes.
But what the heck is an Interface Definition Language?
Please show me where I am wrong in my understanding.
Please tell me where DOM objects are kept.
That's an implementation detail that matters little unless you are writing a browser.
How can these DOM objects accept user-defined properties that can be "seen" in the JS engine, if the DOM objects do not "live" in the JS engine?
The browser provides an API that exposes them to the JS engine.
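For example, an expando property set from one script is stored on the very object the browser exposes, so any other script that reaches the same node sees it (a minimal sketch; the property name is arbitrary):

document.body.myCustomFlag = true;       // user-defined ("expando") property on a DOM object

// Later, from any other script on the same page:
console.log(document.body.myCustomFlag); // true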
But what the heck is Interface Definition Language?
A way to describe the API so you know (for example) what methods are available on a type of object.
I was thinking about this today and I realized I don't have a clear picture here.
Here are some statements I think to be true (please correct me if I'm wrong):
the DOM is a collection of interfaces specified by W3C.
when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
the ECMAScript spec has no reference of browser host objects (DOM, BOM, HTML5 APIs etc.).
how the DOM is actually implemented depends on browser internals and is probably different among most of them.
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?
I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)
All of your bullet points are correct, except:
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.
On Mozilla's JS-to-DOM bridge:
The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.
The specific way the underlying C++ implementation gets called depends on the API and has also changed over time (note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code):
At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
For the core ECMAScript functionality and in some tricky DOM cases the JS engine/the browser uses these APIs directly to define host objects and their behaviors, but it requires a lot of common boilerplate code for e.g. checking parameter types, converting them to the appropriate C++ types, error handling etc.
For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out what they needed to do in every particular case dynamically. This post about quickstubs walks you through one example.
"Quick stubs" mentioned in the previous link is a way to optimize JS->C++ calls time by trading some code size for it: instead of always using generic C++ functions that know how to make any kind of call, the specialized code is automatically generated at the Firefox build time for a pre-defined list of "hot" calls.
Later on the JIT (tracemonkey at that time) was taught to generate the code calling C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (jaegermonkey) work in this regard.
With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?
I'm fuzzy on the details of the last three points in particular, so take them with a grain of salt.
The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).
As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
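Conceptually (this is not Gecko's actual code, just a JavaScript sketch of the same idea), that optimization amounts to keeping an id-to-element map so a lookup doesn't have to walk the tree:

// Hypothetical illustration of an id -> element cache.
const idMap = new Map();

function registerElement(el) {
  if (el.id) idMap.set(el.id, el);  // kept up to date as nodes are added and removed
}

function getElementByIdFast(id) {
  return idMap.get(id) || null;     // O(1) lookup instead of a full tree walk
}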
The DOM is implemented as a language-independent library pretty much in all major browser implementations, which means it's in a different library from the Javascript engine. For example in IE, the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro(JS) and WebCore(DOM). Chrome has V8(JS) and WebCore(DOM), and Firefox has SpiderMonkey/TraceMonkey(JS) and Gecko(DOM).
What this means is that anytime your JS has to access the DOM, it has to reach over to the DOM library - which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is 2 pieces of land connected by a toll bridge, any time you touch the DOM, you must cross over the bridge and cross back - paying a performance toll.
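A classic consequence of that toll is the advice to touch the DOM as rarely as possible, for example by caching references and batching work (a sketch; it assumes an element with id="list" exists):

// Crosses the JS/DOM bridge on every iteration (read + write of innerHTML each time):
for (let i = 0; i < 100; i++) {
  document.getElementById('list').innerHTML += '<li>item ' + i + '</li>';
}

// Crosses it far less often: one lookup, one write.
const list = document.getElementById('list');
let html = '';
for (let i = 0; i < 100; i++) {
  html += '<li>item ' + i + '</li>';
}
list.innerHTML = html;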
References
Video: Building High Performance Web Applications and Sites
Book: High Performance Javascript (Chapter 3 on the DOM)
I am writing an HTML5 application that involves a lot of XML manipulation, part of this manipulation involves comparing the versions of two different XML Elements.
What I need is for every Element, Attr, and TextNode (all of which inherit from Node, AFAIK) object that gets created to have associated version information, but still be able to behave like a normal Element, Attr, or TextNode. The current working solution I am using to store the version information, is the following:
Node.prototype.MyAppAnnotation = {
Version : null
};
Now, I understand that augmenting built-in types is considered bad form, but beyond this technique, I'm at a loss for how to get the desired functionality. I don't think I can encapsulate the Node in a wrapper because I need the Node related properties and functions exposed on the wrapper. I might be able to write some sort of pass-through functions for the wrapper, but that seems really clunky.
I feel that because the app I'm writing is an HTML5 app, and as such only has to run on the most modern browsers (all of which support augmenting built-ins), this technique is appropriate. Also, by giving my augmentation object a sufficiently obscure name, I can avoid all naming collisions (except intentional ones). I've also explored an inheritance-based solution using Google's Closure library. However, it appears that because Element, Node and TextNode don't have direct constructors (i.e. they're created off of a Document object), this technique will not work either.
I was wondering if someone could either a) recommend an elegant way of achieving this effect without augmenting Element, or b) provide a compelling reason for why I shouldn't break the "don't augment built-ins" rule in this case.
Many Thanks,
Jarabek
Your idea is theoretically valid, but there's a weird feeling I get when reading about it.
First of all, you don't have to augment any prototypes. If you just do somedomnode.myweirdname = 'foo' it will become a field of that object. That's what JavaScript does ;)
So when there is no version you'll get undefined instead of null.
But if you want to add more functionality or wrap a DOM node in anything, there's a bit of history of doing that. Most of that history is dominated by stuff like jQuery :)
Just create an object that has a field containing the node. And then you can access it really simply:
myobject.node
And create the object with some constructor or just factory function:
var myobject = createDomNodeWrapper(domnode)
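One possible shape for such a factory (a sketch, not a drop-in solution; the names are just illustrative):

function createDomNodeWrapper(domnode) {
  return {
    node: domnode,                      // the real DOM node, with all its properties and methods
    myAppAnnotation: { Version: null }  // per-wrapper metadata, no prototype augmentation needed
  };
}

var wrapper = createDomNodeWrapper(document.body);
wrapper.myAppAnnotation.Version = '1.2';
wrapper.node.appendChild(document.createTextNode('hello')); // use the wrapped node as usual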
Are DOM objects regular Javascript objects? If not, what are they?
No, they're "host objects". They don't necessarily play by all the same rules as native JavaScript objects.
They're in some sense objects, but they're added by the host environment and are not part of the ECMAScript specification.
For example, I don't believe there's anything that requires them to accept expando properties. Or in the case of functions, I don't know that they're required to have an accessible and extendable prototype property.
Also functions may or may not have the typical methods of Function.prototype, like .call() and .apply().
The rules are simply much looser than those of objects defined by the ECMAScript specification, so you can't necessarily rely on the same behavior in all cases.
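One concrete illustration is document.all, a host-provided collection whose behaviour no ordinary JavaScript object can reproduce (this particular quirk is nowadays codified in the HTML spec for compatibility reasons):

console.log(typeof document.all);        // "undefined" – a deliberate lie
console.log(document.all === undefined); // false – it is actually a usable collection
console.log(document.all.length > 0);    // true on any non-empty page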
They are of type HTMLElement (or, more precisely, a specific subtype such as HTMLBodyElement in the case of document.body).
Yes, they are:
> typeof document.body
"object"
> document.body instanceof Object
true
Here is a description of the Document Object Model (DOM) from the Mozilla Developer Network:
The Document Object Model is an API for HTML and XML documents. It provides a structural representation of the document, enabling you to modify its content and visual presentation. Essentially, it connects web pages to scripts or programming languages.
All of the properties, methods, and events available to the web developer for manipulating and creating web pages are organized into objects (e.g., the document object that represents the document itself, the table object that represents an HTML table element, and so forth). Those objects are accessible via scripting languages in most recent web browsers.
The DOM is most often used in conjunction with JavaScript. That is, the code is written in JavaScript, but it uses the DOM to access the web page and its elements. However, the DOM was designed to be independent of any particular programming language, making the structural representation of the document available from a single, consistent API. Though we focus on JavaScript throughout this site, implementations of the DOM can be built for any language.
The World Wide Web Consortium establishes a standard for the DOM, called the W3C DOM. It should, now that the most important browsers correctly implement it, enable powerful cross-browser applications.
The DOM is an API: a set of predefined "objects" attached to the window "namespace".
They can be HTMLElements, but not only, and they are not part of the JavaScript core.
So there is the JavaScript core, there is the DOM, and you can have other APIs as well.
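Roughly, in a browser console (a sketch; exact availability varies by browser):

// JavaScript core (defined by ECMAScript):
typeof Array;                   // "function"
typeof JSON;                    // "object"

// DOM API (provided by the browser, attached to window):
typeof window.document;         // "object"
typeof window.HTMLBodyElement;  // "function"

// Other web APIs that are neither core JavaScript nor the DOM:
typeof window.fetch;            // "function"
typeof window.localStorage;     // "object"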