Why use DOMStringList rather than an array? - javascript

I've recently discovered DOMStringList, which is the type of an IndexedDB database's list of object store names. It seems to be just a simplified version of an array, with only two methods: item() and contains(). There are no useful methods like indexOf, filter, or forEach that you'll find on an Array. Why use this kind of object? What are DOMStringList's advantages?

The existence of DOMStringList is a historical accident. Today, in modern APIs, the same use cases are met by using an Array instance.
It was introduced into web APIs because we needed something array/list-like, that cannot be modified. The "cannot be modified" part is important, because there's no good answer for what would happen to a modifiable array in scenarios like
db.objectStoreNames.push("foo");
db.objectStoreNames.push(notAString);
db.objectStoreNames.shift();
At the time the first API using DOMStringList was introduced, the people designing the API did not know how to make this work with Arrays. So, they designed DOMStringList. It was used for a couple of APIs, namely location.ancestorOrigins and db.objectStoreNames.
But then, the people designing such web APIs figured out how to introduce non-modifiable arrays. This actually took two separate tries:
Introducing the use of frozen Array instances, via the FrozenArray<> Web IDL type. See whatwg/webidl#52, and the linked bug there.
Introducing the use of proxies around Array instances, via the ObservableArray<> Web IDL type. See whatwg/webidl#840, and the linked bug there.
The difference between these two is that frozen Arrays cannot be modified, even by the browser; whereas proxies around Arrays can be modified by the browser. (Or even by the web developer, if the spec in question allows that.)
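The difference can be approximated from page script. This is only a sketch: Object.freeze stands in for what a FrozenArray<> attribute hands out, and a Proxy with a validating set trap stands in for ObservableArray<>. The names storeNames, backing, and observed, and the strings-only rule, are assumptions made for illustration, not something any spec prescribes.

```javascript
// Stand-in for FrozenArray<DOMString>: a real Array that nobody can modify.
const storeNames = Object.freeze(["books", "authors"]);

let frozenError = null;
try {
  storeNames.push("foo"); // throws: the array cannot be extended
} catch (e) {
  frozenError = e;
}
console.log(frozenError instanceof TypeError); // true
console.log(storeNames.indexOf("authors"));    // 1, ordinary Array methods work

// Stand-in for ObservableArray<DOMString>: a Proxy that vets every write.
const backing = ["books"];
const observed = new Proxy(backing, {
  set(target, key, value) {
    // Hypothetical rule: indexed writes must be strings.
    const isIndex = typeof key === "string" && /^\d+$/.test(key);
    if (isIndex && typeof value !== "string") {
      throw new TypeError("only strings are allowed");
    }
    target[key] = value;
    return true;
  },
});

observed.push("authors"); // allowed: the write goes through the set trap

let proxyError = null;
try {
  observed.push(42); // rejected by the trap
} catch (e) {
  proxyError = e;
}
console.log(backing);                         // [ 'books', 'authors' ]
console.log(proxyError instanceof TypeError); // true
```

In both cases the caller holds something for which Array.isArray() returns true, which is the point of the newer types.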
So, can we move everything using DOMStringList to use one of these modern solutions? No. Because there is code in the wild which depends on db.objectStoreNames.item() and db.objectStoreNames.contains() working, and that would break if we moved to actual Array instances, which don't have those methods.
So we might need a third Array wrapper type if we want to fully obliterate the legacy array-like classes from the web platform, and start using true Arrays. It would be a subclass of Array, with an extra method or two, and possibly a proxy wrapped around that. Nobody has yet made moves in that direction.
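To make that idea concrete, here is a minimal sketch of what such a subclass could look like in plain JavaScript. LegacyStringList is a hypothetical name; nothing like it exists in Web IDL or in any browser today, and a real design would also have to settle questions (such as the proxy layer) that this ignores.

```javascript
// Hypothetical class, for illustration only.
class LegacyStringList extends Array {
  // Mirrors DOMStringList.item(): null when the index is out of range.
  item(index) {
    return index >= 0 && index < this.length ? this[index] : null;
  }

  // Mirrors DOMStringList.contains().
  contains(value) {
    return this.includes(value);
  }
}

// Array.from is inherited, so it builds an instance of the subclass.
const names = Object.freeze(LegacyStringList.from(["books", "authors"]));

console.log(names.item(0));             // "books"
console.log(names.item(5));             // null
console.log(names.contains("authors")); // true
console.log(Array.isArray(names));      // true
console.log(names.filter((n) => n.startsWith("b")).length); // 1
```

Old code calling item() and contains() keeps working, while new code gets indexOf, filter, forEach and the rest for free.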
(Other legacy array-like classes, you say? Yes: in addition to DOMStringList, we have TouchList, AnimationNodeList, CSSRuleList, DOMRectList, FileList, ... see this list of classes on the web platform with an item() method, most (but not all) of which are of this sort.)

Related

Why is getting from Map slower than getting from object?

I'm considering migrating my state management layer to using Map versus using a standard object.
From what I've read, Map is effectively a hash table, whereas Objects use hidden classes under the hood. The general advice is that Map is more efficient where properties are likely to be added or removed dynamically.
I set up a little test and to my surprise accessing values in the Object version was faster.
https://jsfiddle.net/mfbx9da4/rk4hocwa/20/
The article also mentions fast and slow properties. Perhaps the reason my code sample in test1 is so fast is because it is using fast properties? This seems unlikely as the object has 100,000 keys. How can I tell if the object is using fast properties or dictionary lookup? Why would the Map version be slower?
And yes, in practice this looks like premature optimization, the root of all evil, etc. However, I'm interested in the internals and curious about best practices for choosing Map over Object.
(V8 developer here.)
Beware of microbenchmarks, they are often misleading.
V8's object system is implemented the way it is because in many cases it turns out to be very fast -- as you can see here.
The primary reason why we recommend using Map for map-like use cases is because of non-local performance effects that the object system can exhibit when certain parts of the machinery get "overloaded". In a small test like the one you have created, you won't see this effect, because nothing else is going on. In a large app (using many objects with many properties in many different usage patterns), it's still not guaranteed (because it depends on what the rest of the app is doing) but there's a good chance that using Maps where appropriate will improve overall performance -- if the overall system previously happened to run into one of the unfortunate situations.
Another reason is that Maps handle deletion of entries much better than Objects do, because that's a use case their implementation explicitly anticipates as common.
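As a small sketch of the "map-like use case" meant here, consider entries keyed by dynamic strings that come and go at runtime. The names sessions and sessionsObj are made up for the example, and it makes no performance claim; it only shows the two styles side by side.

```javascript
// Map-like use case: keys are data, and entries are added and removed often.
const sessions = new Map();
sessions.set("user:1", { lastSeen: 100 });
sessions.set("user:2", { lastSeen: 200 });

sessions.delete("user:1"); // removal is an ordinary Map operation
console.log(sessions.has("user:1")); // false
console.log(sessions.size);          // 1

// The same bookkeeping with a plain object used as a map.
const sessionsObj = Object.create(null); // no inherited keys such as "toString"
sessionsObj["user:1"] = { lastSeen: 100 };
sessionsObj["user:2"] = { lastSeen: 200 };

delete sessionsObj["user:1"];
console.log("user:1" in sessionsObj);         // false
console.log(Object.keys(sessionsObj).length); // 1
```

Both versions produce the same observable result; whether one is faster in a given app is something only a measurement of that app can answer.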
That said, as you already noted, worrying about such details in the abstract is a case of premature optimization. If you have a performance problem, then profile your app to figure out where most time is being spent, and then focus on improving those areas. If you do end up suspecting that the use of objects-as-maps is causing issues, then I recommend changing the implementation in the app itself, and measuring (with the real app!) whether it makes a difference.
(See here for a related, similarly misleading microbenchmark, where even the microbenchmark itself started producing opposite results after minor modifications: Why "Map" manipulation is much slower than "Object" in JavaScript (v8) for integer keys?. That's why we recommend benchmarking with real apps, not with simplistic miniature scenarios.)

Are internal slot and internal methods actually implemented by JavaScript engines?

I was reading ECMA2019 (the same is true in ES6 too), where I found:
Each object in an ECMAScript engine is associated with a set of
internal methods that defines its runtime behaviour. These internal
methods are not part of the ECMAScript language. They are defined by
this specification purely for expository purposes. However, each
object within an implementation of ECMAScript must behave as specified
by the internal methods associated with it. The exact manner in which
this is accomplished is determined by the implementation.
I also found two Stack Overflow questions (question1 and question2), but their answers don't seem to give me the answer I am looking for.
My question is simple. If JavaScript engines decide not to implement some of them, then how do they satisfy this statement from the spec quoted above:
However, each object within an implementation of ECMAScript must
behave as specified by the internal methods associated with it.
Let us take an example:
[[GetPrototypeOf]], [[Get]], [[Set]], [[GetOwnProperty]] etc. are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality? Clearly it has to implement them; it can just choose different method names and signatures, since the spec does not enforce those?
Where am I wrong?
Similarly for internal slots too? If they don't have internal variables storing that state, how on earth will they maintain the state of that object when asked?
EDIT: I will add more details to clarify my question. Let us take an example of Object.getPrototypeOf(). This is an API for internal behaviour [[GetPrototypeOf]] and there are several possible algorithms for implementing it. The question is not about the possible ways to implement a behaviour; it's about whether the behaviour is there at all, while still satisfying the spec's overall object behaviour.
V8 developer here. I think this question has mostly been answered already in the comments, so I'll just summarize.
Are internal slot and internal methods actually implemented by JavaScript engines?
Generally not; the engine simply behaves as if its internals were structured in this way. Some parts of an implementation might be very close to the spec's structure, if it's convenient.
One way to phrase it would be: you could implement a JavaScript engine by first faithfully translating the spec text to code (in whichever language you choose to use for your engine), and then you'd be allowed to refactor the invisible internals in any way you want (e.g.: inline functions, or split them up, or organize them as a helper class, or add a fast path or a cache, or generally turn the code inside out, etc). Which isn't surprising, really: as long as the observable behavior remains the same, any program is allowed to refactor its internals. What the ECMAScript spec is making clear at that point is simply that the "internal slots" really are guaranteed to always be internal and not observable.
[[[Get]] etc] are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality?
It's not about refusing to implement something. You can usually implement functionality in many different ways, i.e. with many different ways of structuring your code and your objects. Engines are free to structure their code and objects any way they want, as long as the resulting observable behavior is as specified.
Let us take an example of Object.getPrototypeOf(). This is an API for internal behaviour [[GetPrototypeOf]]
Not quite. Object.getPrototypeOf is a public function that's specified to behave in a certain way. The way the spec describes it is that it must behave as if there were an internal slot [[GetPrototypeOf]].
You seem to have trouble imagining an alternative way. Well, in many cases, engines will probably choose to have an implementation that's very close to having those internal slots -- perhaps mapped to fields and methods in a C++ class. But it doesn't have to be that way; for example, instead of class methods, there could be free functions: GetPrototypeImpl(internal::Object object) rather than internal::Object::GetPrototypeImpl(). Or instead of an inheritance/hierarchy structure, the engine could use switch-statements over types.
One of the most common ways in which engines' implementations deviate from the structure defined by the spec's internal slots is by having additional fast paths. Typically, a fast path performs a few checks to see if it is applicable, and then does the simple, common case; if the applicability check fails, it falls back to a slower, more complete implementation, that might be much closer to the spec's structure. Or maybe neither function on its own contains the complete spec'ed behavior: you could have GetPrototypeFromRegularObject and GetPrototypeFromProxy plus a wrapper dispatching to the right one, and those all together behave like the spec's hypothetical system of having a [[GetPrototypeOf]] slot on both proxies and regular objects. All of that is perfectly okay because from the outside you can't see a difference in behavior -- all you can see is Object.getPrototypeOf.
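The "two helpers plus a dispatching wrapper" arrangement can be sketched as a toy model. It is written in JavaScript only for readability (a real engine would do this in its implementation language), and the record layout with kind, proto, target, and handler fields is an assumption of the sketch, not how any actual engine represents objects.

```javascript
// Toy model: each "heap object" is a record with a kind tag.
// None of them carries anything called [[GetPrototypeOf]].
const REGULAR = "regular";
const PROXY = "proxy";

function getPrototypeFromRegularObject(obj) {
  return obj.proto; // simple, common case: read a field
}

function getPrototypeFromProxy(obj) {
  // Slower path: use the handler's trap if present, else ask the target.
  const trap = obj.handler.getPrototypeOf;
  return trap ? trap(obj.target) : getPrototypeOf(obj.target);
}

// Wrapper that dispatches to the right helper.
function getPrototypeOf(obj) {
  switch (obj.kind) {
    case REGULAR:
      return getPrototypeFromRegularObject(obj);
    case PROXY:
      return getPrototypeFromProxy(obj);
    default:
      throw new TypeError("unknown object kind");
  }
}

const base = { kind: REGULAR, proto: null };
const child = { kind: REGULAR, proto: base };
const plainProxy = { kind: PROXY, target: child, handler: {} };
const trappedProxy = {
  kind: PROXY,
  target: child,
  handler: { getPrototypeOf: () => null },
};

console.log(getPrototypeOf(child) === base);      // true
console.log(getPrototypeOf(plainProxy) === base); // true
console.log(getPrototypeOf(trappedProxy));        // null
```

A caller of getPrototypeOf cannot tell which helper ran, which is all the spec asks for.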
One particular example of a fast path is a compiler. If you implemented object behaviors as (private) methods, and loaded and called those methods every time, then your implementation would be extremely slow. Modern engines compile JavaScript functions to bytecode or even machine code, and that code will behave as if you had loaded and called an internal function with the given behavior, but it (usually) will not actually call any such functions. For example, optimized code for an array[index] access should only be a few machine instructions (type check, bounds check, memory load), there should be no call to a [[Get]] involved.
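The shape of such a fast path (check applicability, do the cheap thing, otherwise fall back) can be mimicked in a few lines. This is purely structural: loadElement, genericGet, and the slowPathCalls counter are hypothetical names, and real optimized code is machine instructions, not JavaScript calling Array.isArray.

```javascript
// Slow, complete path: stands in for the spec-shaped generic lookup.
let slowPathCalls = 0;
function genericGet(object, key) {
  slowPathCalls++;
  return Reflect.get(object, key);
}

// Fast path: a few applicability checks, then the simple common case.
function loadElement(object, index) {
  if (
    Array.isArray(object) &&                // type check
    Number.isInteger(index) &&
    index >= 0 && index < object.length     // bounds check
  ) {
    return object[index];                   // the "memory load"
  }
  return genericGet(object, index);         // fall back to the full version
}

const list = [10, 20, 30];
console.log(loadElement(list, 1));       // 20, fast path
console.log(loadElement(list, 5));       // undefined, via the fallback
console.log(loadElement({ 1: "x" }, 1)); // "x", via the fallback
console.log(slowPathCalls);              // 2
```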
Another very common example is object types. The spec typically uses wording like "if the object has a [[StringData]] internal slot, then ..."; an engine typically replaces that with "if the object's type is what I've chosen for representing strings internally, then ...". Again, the difference is not observable from the outside: Strings behave as if they had a [[StringData]] internal slot, but (in V8 at least) they don't have such a slot, they simply have an appropriate object type that identifies them as strings, and objects with string type know where their character payload is, they don't need any special slot for that.
Edit: forgot to mention: see also https://v8.dev/blog/understanding-ecmascript-part-1 for another way to explain it.

Portability of Array.prototype.* on array-like objects or even native/host objects

ECMA-262 5.1 says of many Array.prototype functions that they are intentionally generic. They are described in terms of [[Get]], [[Put]], etc. operations on Object, but also require a length property.
So they are allowed to work on built-in objects, like:
obj = {"a":true, "length":10};
Array.prototype.push.call(obj, -1);
console.log(obj); // Object { 10: -1, a: true, length: 11 }
For host objects the standard has a note:
Whether the push function can be applied successfully
to a host object is implementation-dependent.
Is arguments host object? It seems that all DOM objects (such as NodeList) are host objects, and they work in modern browsers.
MDN docs warn about < IE9. What about another browsers? What about nodejs native objects? What about Rhino/Nashorn native objects?
UPDATE @jfriend00: Hm, I didn't think about the [[Put]] operation... In ECMA-262 5.1 I found a special note about this situation:
Host objects may implement these internal methods in any manner
unless specified otherwise; for example, one possibility is that
[[Get]] and [[Put]] for a particular host object indeed fetch and
store property values but [[HasProperty]] always generates false.
However, if any specified manipulation of a host object's
internal properties is not supported by an implementation, that
manipulation must throw a TypeError exception when attempted.
So in bad case you get TypeError!
Since you never really got a complete answer, I'll take a stab at answering some of the questions you posted.
Is arguments host object?
arguments is part of the JavaScript language, not a host object. It has pretty well-defined behavior, which is modified somewhat when running in strict mode. Since arguments does not persist beyond the current function call (not even in a closure) and since it is not meant to be mutable, the usual way of handling the arguments object is to immediately make a copy into a real array. You can then use all the normal array methods on it, and it can persist in a closure to be accessed by a local function.
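A minimal sketch of that copy-first pattern, written in ES5 style to match the era of the question (makeShouter and shout are made-up names):

```javascript
function makeShouter() {
  // Copy the array-like arguments object into a real array right away.
  var args = Array.prototype.slice.call(arguments);

  // The inner function closes over the copy, not over arguments itself.
  return function () {
    return args.map(function (a) {
      return String(a).toUpperCase();
    });
  };
}

var shout = makeShouter("a", "b");
console.log(shout()); // [ 'A', 'B' ]
```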
MDN docs warn about < IE9. What about another browsers?
It's hard to generalize here about a particular browser. Instead, you'd have to examine a specific object in specific versions of a browser. Older versions of IE did have a reputation for having host objects that didn't interoperate as well with JavaScript (in this way), but you'd really have to examine a specific object to know what you could and couldn't do.
What about nodejs native objects?
node.js is a much more pure JavaScript environment than the browser because there is no DOM, no window object, etc. Did you have any specific node.js objects in mind that you wanted to ask about? In my somewhat limited experience with node.js, I'm just seeing actual JS objects, though there are many places where node.js interfaces with the OS, so perhaps there are some non-JS objects in those interfaces (I haven't encountered any yet, but that is a possibility).
So in bad case you get TypeError!
As I said in my comments, using any array method that attempts to modify the array, such as .splice(), is very likely to cause problems with host objects, as many host objects are not meant to be modified directly. Plus, reading the specification and assuming that older browsers all follow it (without extensive testing to verify that) can be dangerous. In particular, older versions of IE are famous for not following the specification. So, again, you can't just assume you would get a TypeError without proper testing.
If you're looking for a general purpose safe way to code, one will never go wrong by copying an array-like host object into an actual array and then using array operations on the actual array. That's guaranteed to be safe. There's a cross-browser polyfill for Array.prototype.slice that works with all browsers for copying into an actual array on the MDN page for .slice(). If you're only supporting IE 9 and up, you don't need the polyfill.
And, one should never assume that any operation that changes the array-like object is generally safe on a host object (there could be specific exceptions, but you'd have to do a lot of testing to be sure). My preference is to write code that I know will be safe and does not require a lot of testing to guarantee that. Copying into an actual array gives me that every time.
If you want to be sure it will work, better convert it to an array first.
To convert an array-like object to an array, you can use ES6 Array.from. At the time of writing, only Firefox 32 supports it, but there is a polyfill.
Alternatively, [].slice.call(arrayLike) will work on most browsers.
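Both conversions can be tried on a hand-made array-like object, which stands in here for a host object such as a NodeList (the arrayLike object is an assumption of the sketch; a real host object may behave differently, which is the whole reason for copying):

```javascript
// Any object with indexed properties and a length is "array-like".
const arrayLike = { 0: "a", 1: "b", length: 2 };

const viaFrom = Array.from(arrayLike);
const viaSlice = [].slice.call(arrayLike);

console.log(Array.isArray(viaFrom), viaFrom);   // true [ 'a', 'b' ]
console.log(Array.isArray(viaSlice), viaSlice); // true [ 'a', 'b' ]

// Mutating the copy never touches the original object.
viaSlice.push("c");
console.log(arrayLike.length); // 2
```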

how are javascript properties added to and looked up on objects?

When adding properties to a JavaScript object are they added in an ordered way (alphabetical etc). And if so does that mean when you lookup a property on a JavaScript object that a quick algorithm is used like a binary tree search? I did a search for this and just found lots of explanations for prototype inheritance which I already understand I'm just interested in how a property is looked up within a single level of the prototype chain.
That entirely depends on the implementation. Google's V8 engine probably does it differently than Firefox's JägerMonkey. And they almost certainly do it differently than IE6. Looking up a property in an object is just an interface (a fairly common Map interface, as programmers would call it). The only thing JavaScript guarantees you is the methods of the interface, no details about implementation, and that's a good thing. It could be a hash table (probably) or it could be a linked list (less likely, but possible) or it could even be a binary search tree.
The point is that we don't know how it's implemented, nor should we. And you should make no assumptions about the implementation. As is common with abstraction in programming, just assume it's magic. :)
Here is a high-level description of how V8 does it using hidden classes: it looks up the property value at the fixed offset provided by the definition of the hidden class. It also confirms that most other implementations use a dictionary-type data structure.
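The hidden-class idea can be illustrated with a deliberately simplified toy model. Shape and ToyObject are invented names, and this is in no way V8's actual data layout; it only shows how "property name to fixed offset" tables can be shared between objects that gain the same properties in the same order.

```javascript
// Toy "hidden class": maps property names to fixed offsets.
class Shape {
  constructor(offsets = new Map()) {
    this.offsets = offsets;       // property name -> slot index
    this.transitions = new Map(); // property name -> successor Shape
  }

  // Adding a property moves to a successor shape, created once and reused.
  withProperty(name) {
    if (!this.transitions.has(name)) {
      const next = new Map(this.offsets);
      next.set(name, next.size);  // next free offset
      this.transitions.set(name, new Shape(next));
    }
    return this.transitions.get(name);
  }
}

const EMPTY_SHAPE = new Shape();

class ToyObject {
  constructor() {
    this.shape = EMPTY_SHAPE;
    this.slots = [];              // values stored by offset, not by name
  }

  set(name, value) {
    if (!this.shape.offsets.has(name)) {
      this.shape = this.shape.withProperty(name);
    }
    this.slots[this.shape.offsets.get(name)] = value;
  }

  get(name) {
    const offset = this.shape.offsets.get(name);
    return offset === undefined ? undefined : this.slots[offset];
  }
}

const p1 = new ToyObject();
p1.set("x", 1);
p1.set("y", 2);

const p2 = new ToyObject();
p2.set("x", 10);
p2.set("y", 20);

console.log(p1.get("y"));           // 2
console.log(p2.get("x"));           // 10
console.log(p1.shape === p2.shape); // true: same properties, same order
```

A dictionary-style implementation would instead keep one name-to-value table per object, which is the other approach the description mentions.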

Why is it frowned upon to modify JavaScript object's prototypes?

I've come across a few comments here and there about how it's frowned upon to modify a JavaScript object's prototype? I personally don't see how it could be a problem. For instance extending the Array object to have map and include methods or to create more robust Date methods?
The problem is that prototype can be modified in several places. For example one library will add map method to Array's prototype and your own code will add the same but with another purpose. So one implementation will be broken.
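A minimal sketch of such a collision, using a made-up method name (lastItem is hypothetical, as are the two "libraries"); the sketch removes the method again at the end so nothing is left on the built-in prototype:

```javascript
// "Library A" adds a helper to every array: return the last value.
Array.prototype.lastItem = function () {
  return this[this.length - 1];
};

// "Library B", loaded later, picks the same name with a different meaning:
// return the last index.
Array.prototype.lastItem = function () {
  return this.length - 1;
};

// Code written against library A now silently gets the wrong answer.
const result = [5, 6, 7].lastItem();
console.log(result); // 2, not 7

// Clean up so the sketch leaves no trace on Array.prototype.
delete Array.prototype.lastItem;
```

No error is thrown at any point; the second assignment simply wins, which is what makes this kind of bug hard to track down.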
Mostly because of namespace collisions. I know the Prototype framework has had many problems with keeping their names different from the ones included natively.
There are two major methods of providing utilities to people..
Prototyping
Adding a function to an Object's prototype. MooTools and Prototype do this.
Advantages:
Super easy access.
Disadvantages:
Can use a lot of system memory. While modern browsers just fetch an instance of the property from the constructor, some older browsers store a separate instance of each property for each instance of the constructor.
Not necessarily always available.
What I mean by "not available" is this:
Imagine you have a NodeList from document.getElementsByTagName and you want to iterate through them. You can't do..
document.getElementsByTagName('p').map(function () { ... });
..because it's a NodeList, not an Array. The above will give you an error something like: Uncaught TypeError: [object NodeList] doesn't have method 'map'.
I should note that there are very simple ways to convert NodeLists and other Array-like Objects into real arrays.
Collecting
Creating a brand new global variable and stock piling utilities on it. jQuery and Dojo do this.
Advantages:
Always there.
Low memory usage.
Disadvantages:
Not placed quite as nicely.
Can feel awkward to use at times.
With this method you still couldn't do..
document.getElementsByTagName('p').map(function () { ... });
..but you could do..
jQuery.map(document.getElementsByTagName('p'), function () { ... });
..but as pointed out by Matt, in usual use, you would do the above with..
jQuery('p').map(function () { ... });
Which is better?
Ultimately, it's up to you. If you're OK with the risk of being overwritten/overwriting, then I would highly recommend prototyping. It's the style I prefer and I feel that the risks are worth the results. If you're not as sure about it as me, then collecting is a fine style too. They both have advantages and disadvantages but, all in all, they usually produce the same end result.
As bjornd pointed out, monkey-patching is a problem only when there are multiple libraries involved. Therefore it's not a good practice if you are writing reusable libraries. However, it still remains the best technique out there to iron out cross-browser compatibility issues when using host objects in JavaScript.
See this blog post from 2009 (or the Wayback Machine original) for a real incident when prototype.js and json2.js are used together.
There is an excellent article from Nicholas C. Zakas explaining why this practice is not something that should be in the mind of any programmer during a team or customer project (maybe you can do some tweaks for educational purpose, but not for general project use).
Maintainable JavaScript: Don’t modify objects you don’t own:
https://www.nczonline.net/blog/2010/03/02/maintainable-javascript-dont-modify-objects-you-down-own/
In addition to the other answers, an even more permanent problem that can arise from modifying built-in objects is that if the non-standard change gets used on enough sites, future versions of ECMAScript will be unable to define prototype methods using the same name. See here:
This is exactly what happened with Array.prototype.flatten and Array.prototype.contains. In short, the specification was written up for those methods, their proposals got to stage 3, and then browsers started shipping them. But, in both cases, it was found that there were ancient libraries which patched the built-in Array object with their own methods with the same names as the new methods, but with different behavior; as a result, websites broke, the browsers had to back out of their implementations of the new methods, and the specification had to be edited. (The methods were renamed.)
For example, there is currently a proposal for String.prototype.replaceAll. If you ship a library which gets widely used, and that library monkeypatches a custom non-standard method onto String.prototype.replaceAll, the replaceAll name will no longer be usable by the specification-writers; it will have to be changed before browsers can implement it.
