I read a few questions and answers about JavaScript dictionary implementations, but they don't meet my requirements:
the dictionary must be able to take objects as keys
the values must be accessible by the []-operator
So I came up with the idea to overwrite the valueOf-method in Object.prototype, as follows:
Object.__id__ = 0;
Object.prototype.valueOf = function() {
    if (!this.__id__)
        this.__id__ = ++Object.__id__;
    return "__id__" + this.__id__;
};
Object.prototype.toString = Object.prototype.valueOf;
// test
var x = {p1: "5"};
var y = [6];
var z = {};
z[x] = "7";
z[y] = "8";
console.log(z[x], z[y]);
I tested this in Google Chrome and it seems to work well, but I'm a bit sceptical about whether it has drawbacks, since it was so easy to implement.
Considering that the valueOf method is not used for other purposes in the whole code, do you think there are any disadvantages?
It's an interesting idea. I suggest my jshashtable. It meets your first requirement but not the second. I don't really see the advantage of insisting on using the square bracket property access notation: do you have a particular requirement for it?
With jshashtable, you can provide a hashing function to the Hashtable constructor. This function is passed an object to be used as a key and must return a string; you could use a function not dissimilar to what you have there, without having to touch Object.prototype.
There are some disadvantages to your idea:
Your valueOf method will show up in a for...in loop over any native object;
You have no way of determining which keys should be considered equal, which is something you may want to do. Instead, all keys will be considered unique.
This won't work with host objects (i.e. objects provided by the environment, such as DOM elements)
It is an interesting question, because I had so far assumed that any object can be used as an index (but never tried with associative arrays). I don't know enough about the inner workings of JavaScript to be sure, but I'd bet that valueOf is used somewhere else by JavaScript, even if not in your code. You might run into seemingly inexplicable problems later. At least, I'd restrict myself to a new class and leave Object alone ;) Or, you explicitly call your hashing function, calling it myHash() or whatever and calling z[x.myHash()] which adds clutter but would let me, personally, sleep better ;) I can't resist thinking there's a more JavaScript-aware solution to this, so consider all of these ugly workarounds ;)
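The explicit-hash variant could look roughly like the sketch below. It uses a standalone function rather than a method, so nothing is added to Object.prototype. The name myHash comes from the answer above; nextId and the non-enumerable __id__ property are assumptions made for illustration, and the sketch would not work on frozen or non-extensible objects.

```javascript
// Hypothetical helper: returns a stable string id for an object
// without modifying Object.prototype.
var nextId = 0;
function myHash(obj) {
    if (!Object.prototype.hasOwnProperty.call(obj, "__id__")) {
        // defineProperty defaults to non-enumerable, so the id
        // stays out of for...in loops and Object.keys.
        Object.defineProperty(obj, "__id__", { value: ++nextId });
    }
    return "__id__" + obj.__id__;
}

var x = {p1: "5"};
var y = [6];
var z = {};
z[myHash(x)] = "7";
z[myHash(y)] = "8";
console.log(z[myHash(x)], z[myHash(y)]); // 7 8
```

The cost is the extra call at every access, which is the clutter mentioned above.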
If you came upon this question looking for a JS dictionary where objects are keys, look at Map (see also "Map vs Object in JavaScript").
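As a minimal sketch of that suggestion, here is the test from the question rewritten with an ES2015 Map. Keys are compared by identity, and access goes through get/set rather than the [] operator, so the second requirement is still not met.

```javascript
var x = {p1: "5"};
var y = [6];
var z = new Map();
z.set(x, "7");
z.set(y, "8");
console.log(z.get(x), z.get(y));   // 7 8
// A structurally equal but distinct object is a different key.
console.log(z.get({p1: "5"}));     // undefined
```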
I was analyzing some third-party JavaScript libraries and came across an approach in which people create quick references to core prototypes. Is there any performance benefit of doing this? Can anyone explain this with an example?
var ArrayProto = Array.prototype, ObjProto = Object.prototype, FuncProto = Function.prototype;
// Create quick reference variables for speed access to core prototypes.
var
    push = ArrayProto.push,
    slice = ArrayProto.slice,
    concat = ArrayProto.concat,
    toString = ObjProto.toString,
    hasOwnProperty = ObjProto.hasOwnProperty;
Is there any performance benefit of doing this?
A very small one, yes, for two/three reasons:
When you reference an identifier (for instance, Array or ArrayProto or push), the JavaScript engine first looks in the current lexical environment and then, if it's not found, the next one out, and the next one out, etc., until it reaches the global lexical environment. I assume the code you're referring to is within a scoping function. So because those are locals within the scoping function, they're found right away, rather than the JavaScript engine having to traverse up to the global environment to find them.
Array.prototype not only requires looking up Array, but also the prototype property on Array. It doesn't take any appreciable time, but it doesn't take zero time, either.
(Sort of #2 repeated) Looking up Array.prototype.push also requires looking up push on Array.prototype. Again, not appreciable, but again, not zero, either.
So the combination of those can make a very small performance difference, using a local push rather than Array.prototype.push (and so on).
More likely, though, the author did it because it made for less typing, rather than as a performance enhancement. :-)
Re an example: It's frequently useful to use a function like Array.prototype.slice on an object that isn't an array. In fact, until ES2015's Array.from, it was one of the canonical ways to turn an array-like object (such as the collection returned from querySelectorAll) into a true array (more in my answer here).
So given the setup in your question, if I have an array-like list:
var list = document.querySelectorAll("some-selector-here");
instead of doing this to get that list as an array:
var trueArray = Array.prototype.slice.call(list);
I can do this instead:
var trueArray = slice.call(list);
Since slice is likely to be in the current lexical environment or the one just outside it, it's found fairly quickly (point #1 above), and then we're done, rather than having to look up prototype on Array (point #2 above) and then look up slice on Array.prototype (point #3 above).
So it's very slightly faster; but again, primarily, it's shorter and less error-prone to type.
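For an example that runs outside a browser, the same pattern can be sketched with the arguments object, which is also array-like. The helper name toArray is hypothetical, and the scoping function stands in for the library wrapper assumed above.

```javascript
var toArray = (function () {
    // Local reference, looked up once when the scoping function runs.
    var slice = Array.prototype.slice;

    // `arguments` is array-like but not a true array.
    return function () {
        return slice.call(arguments);
    };
}());

var trueArray = toArray("a", "b", "c");
console.log(Array.isArray(trueArray), trueArray.length); // true 3
```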
If you're going to use Array.prototype methods a lot in your function, you might want to store the function reference in a variable for reuse. This would provide a small performance benefit.
So I learned a bit about the hidden class concept in V8. It is said that you should declare all properties in the constructor (if using prototype-based "pseudo classes") and that you should not delete them or add new ones outside of the constructor. So far, so good.
1) But what about properties where you know the type (that you also shouldn't change) but not the (initial) value?
For example, is it sufficient to do something like this:
var Foo = function () {
    this.myString;
    this.myNumber;
}
... and assign concrete values later on, or would it be better to assign a "bogus" value upfront, like this:
var Foo = function () {
    this.myString = "";
    this.myNumber = 0;
}
2) Another thing is with objects. Sometimes I just know that an object won't have a fixed structure, but I want to use it as a hash map. Is there any (non-verbose) way to tell the compiler I want to use it this way, so that it isn't optimized (and deopted later on)?
Update
Thanks for your input! So after reading your comments (and more on the internet) I consider these points as "best practices":
Do define all properties of a class in the constructor (also applies for defining simple objects)
You have to assign something to these properties, even if that's just null or undefined - just stating this.myString; is apparently not enough
Because you have to assign something anyway, I think assigning a "bogus" value when you can't assign the final value immediately cannot hurt, so that the compiler "knows" ASAP what type you want to use. So, for example, this.myString = "";
In the case of objects, do assign the whole structure if you know it beforehand, and again assign dummy values to its properties if you don't know them immediately. Otherwise, for example when intending to use the object as a hash map, just do: this.myObject = {};. I think it's not worth indicating to the compiler that this should be a hash map. If you really want to do this, I found a trick that assigns a dummy property to this object and deletes it immediately afterwards. But I won't do this.
As for smaller Arrays it's apparently recommended (reference: https://www.youtube.com/watch?v=UJPdhx5zTaw&feature=youtu.be&t=25m40s) to preallocate them especially if you know the final size, so for example: this.myArray = new Array(4);
Don't delete properties later on! Just null them if needed
Don't change types after assigning! This will add another hidden class and hurt performance. I think that's best practice anyway. The only case where I have different types is for certain function arguments. In that case I usually convert them to the same target type.
Same applies if you keep adding additional properties later on.
That being said, I also think doing this will lead to cleaner and more organized code, and it also helps with documentation.
So one little thing I am unsure about remains: what if I define properties in a function (for example a kind of configure() method) called within the constructor?
Re 1): Just reading properties, like in your first snippet, does not do anything to the object. You need to assign them to create the properties.
But for object properties it doesn't actually matter much what values you initialise them with, as long as you do initialise them. Even undefined should be fine.
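The difference between reading and assigning can be checked directly. In this sketch the constructor names ReadsOnly and Assigns are made up for illustration; it only shows which properties exist afterwards and says nothing about how a particular engine lays out its hidden classes.

```javascript
function ReadsOnly() {
    this.myString;   // expression statement: reads undefined, creates nothing
    this.myNumber;
}

function Assigns() {
    this.myString = undefined;   // assignment creates the property
    this.myNumber = undefined;
}

console.log(Object.keys(new ReadsOnly())); // []
console.log(Object.keys(new Assigns()));   // [ 'myString', 'myNumber' ]
```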
The concrete values are much more relevant for arrays, where you want to make sure to create them with the right elements (and without any holes!) because the VM tries to keep them homogeneous. In particular, never use the Array constructor, because that creates just holes.
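What "just holes" means can be seen at the language level, independent of any engine internals. A small sketch, with holey and packed as illustrative names:

```javascript
var holey = new Array(4);    // length 4, but no elements exist
var packed = [0, 0, 0, 0];   // length 4, every index is a real element

console.log(holey.length, packed.length); // 4 4
console.log(0 in holey, 0 in packed);     // false true

// forEach skips holes, so the callback never runs for `holey`.
var visited = 0;
holey.forEach(function () { visited++; });
console.log(visited);                     // 0
```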
Re 2): There are ways to trick the VM into using a dictionary representation, but they depend on VM and version and aren't really reliable. In general, it is best to avoid using objects as maps altogether. Since ES6, there is a proper Map class.
(Let us suppose that there is a good reason for wishing this. See the end of the question if you want to read the good reason.)
I would like to obtain the same result as a for in loop, but without using that language construct. By result I mean only an array of the property names (I don't need to reproduce the behavior that would happen if I modify the object while iterating over it).
To put the question into code, I'd like to implement this function without for in:
function getPropertiesOf(obj) {
    var props = [];
    for (var prop in obj)
        props.push(prop);
    return props;
}
From my understanding of the ECMAScript 5.1 specification about the for in statement and the Object.keys method, it seems the following implementation should be correct:
function getPropertiesOf(obj) {
    var props = [];
    // No prototype, so inherited names such as "toString" or
    // "constructor" do not count as already seen.
    var alreadySeen = Object.create(null);
    // Handle primitive types
    if (obj === null || obj === undefined)
        return props;
    obj = Object(obj);
    // For each object in the prototype chain:
    while (obj !== null) {
        // Add own enumerable properties that have not been seen yet
        var enumProps = Object.keys(obj);
        for (var i = 0; i < enumProps.length; i++) {
            var prop = enumProps[i];
            if (!alreadySeen[prop])
                props.push(prop);
        }
        // Add all own properties (including non-enumerable ones)
        // in the alreadySeen set.
        var allProps = Object.getOwnPropertyNames(obj);
        for (var j = 0; j < allProps.length; j++)
            alreadySeen[allProps[j]] = true;
        // Continue with the object's prototype
        obj = Object.getPrototypeOf(obj);
    }
    return props;
}
The idea is to explicitly walk the prototype chain, and use Object.keys to get the own properties of each object in the chain. We exclude property names already seen in previous objects in the chain, including when they were seen as non-enumerable. This method should even respect the additional guarantee mentioned on MDN:
The Object.keys() method returns an array of a given object's own
enumerable properties, in the same order as that provided by a
for...in loop [...].
(emphasis is mine)
I played a bit with this implementation, and I haven't been able to break it.
So the question:
Is my analysis correct? Or am I overlooking a detail of the spec that would make this implementation incorrect?
Do you know another way to do this, that would match the implementation's specific order of for in in all cases?
Remarks:
I don't care about ECMAScript < 5.1.
I don't care about performance (it can be disastrous).
Edit: to satisfy @lexicore's curiosity (but not really part of the question), the good reason is the following. I develop a compiler to JavaScript (from Scala), and the for in language construct is not part of the things I want to support directly in the intermediate representation of my compiler. Instead, I have a "built-in" function getPropertiesOf which is basically what I show as first example. I'm trying to get rid of as many builtins as possible by replacing them by "user-space" implementations (written in Scala). For performance, I still have an optimizer that sometimes "intrinsifies" some methods, and in this case it would intrinsify getPropertiesOf with the efficient first implementation. But to make the intermediate representation sound, and work when the optimizer is disabled, I need a true implementation of the feature, no matter the performance cost, as long as it's correct. And in this case I cannot use for in, since my IR cannot represent that construct (but I can call arbitrary JavaScript functions on any objects, e.g., Object.keys).
From the specification's point of view, your analysis is correct only under the assumption that a particular implementation defines a specific order of enumeration for the for-in statement:
If an implementation defines a specific order of enumeration for the
for-in statement, that same enumeration order must be used in step 5
of this algorithm.
See the last sentence here.
So if an implementation does not provide such specific order, then for-in and Object.keys may return different things. Well, in this case even two different for-ins may return different things.
Interestingly, the whole story reduces to the question of whether two for-ins will give the same results if the object has not changed. Because if that is not the case, how could you test for "the same" anyway?
In practice, this will most probably be true, but I could also easily imagine that an object could rebuild its internal structure dynamically, between for-in calls. For instance, if certain property is accessed very often, the implementation may restructure the hash table so that access to that property is more efficient. As far as I can see, the specification does not prohibit that. And it is also not-so-unreasonable.
So the answer to your question is: no, there is no guarantee according to the specification, but it will probably still work in practice.
Update
I think there's another problem. Where is the order of properties between the members of the prototype chain defined? You may get the "own" properties in the right order, but are they merged exactly the way you merge them? For instance, why the child's properties first and the parent's next?
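Since the specification leaves the order open, the only practical check is an empirical one in the engine at hand. The sketch below compares for-in against a condensed version of the chain walk from the question; the names viaForIn, viaChainWalk, parent and child are made up for illustration, and the seen set uses Object.create(null) as a precaution against inherited names. It proves nothing about other engines.

```javascript
// Reference behaviour: whatever for...in reports.
function viaForIn(obj) {
    var props = [];
    for (var prop in obj)
        props.push(prop);
    return props;
}

// Condensed chain walk: own enumerable keys first, then the prototype's.
function viaChainWalk(obj) {
    var props = [];
    var seen = Object.create(null);
    for (var o = Object(obj); o !== null; o = Object.getPrototypeOf(o)) {
        Object.keys(o).forEach(function (prop) {
            if (!seen[prop])
                props.push(prop);
        });
        Object.getOwnPropertyNames(o).forEach(function (prop) {
            seen[prop] = true;
        });
    }
    return props;
}

var parent = {a: 1, b: 2};
var child = Object.create(parent);
child.a = 10;   // shadows parent.a
child.c = 30;

var walked = viaChainWalk(child);
var looped = viaForIn(child);
// Same set of names; whether the order also matches is up to the engine.
var sameNames = walked.slice().sort().join() === looped.slice().sort().join();
var sameOrder = walked.join() === looped.join();
console.log(sameNames, sameOrder);
```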
I frequently get an array of an object's keys using:
Object.keys(someobject)
I'm comfortable doing this. I understand that Object is the Object constructor function, and keys() is a method of it, and that keys() will return a list of keys on whatever object is given as the first parameter. My question is not how to get the keys of an object - please do not reply with non-answers explaining this.
My question is, why isn't there a more predictable keys() or getKeys() method, or keys instance variable available on Object.prototype, so I can have:
someobject.keys()
or as an instance variable:
someobject.keys
And return the array of keys?
Again, my intention is to understand the design of JavaScript, and what purpose the somewhat unintuitive mechanism of fetching keys serves. I don't need help getting keys.
I suppose they don't want too many properties on Object.prototype since your own properties could shadow them.
The more they include, the greater the chance for conflict.
It would be very clumsy to get the keys for this object if keys was on the prototype...
var myObj = {
    keys: ["j498fhfhdl89", "1084jnmzbhgi84", "jf03jbbop021gd"]
};
var keys = Object.prototype.keys.call(myObj);
An example of how introducing potentially shadowed properties can break code.
There seems to be some confusion as to why it's a big deal to add new properties to Object.prototype.
It's not at all difficult to conceive of a bit of code in existence that looks like this...
if (someObject.keys) {
    someObject.keys.push("new value");
} else {
    someObject.keys = ["initial value"];
}
Clearly this code would break if you added a keys function to Object.prototype. someObject.keys would then always resolve to the inherited function, so the check always passes and the call to push fails. The code was written on the assumption that no such inherited property exists.
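The breakage can be reproduced without touching the real Object.prototype. In this sketch, futureProto is a hypothetical stand-in for an Object.prototype that has gained a keys method, and addKey wraps the legacy snippet above.

```javascript
// Stand-in for a future Object.prototype with a keys() method.
var futureProto = {
    keys: function () { return Object.keys(this); }
};

// The legacy code from above, wrapped in a function.
function addKey(someObject) {
    if (someObject.keys) {
        someObject.keys.push("new value");
    } else {
        someObject.keys = ["initial value"];
    }
}

var today = {};
addKey(today);
console.log(today.keys);   // [ 'initial value' ]

var tomorrow = Object.create(futureProto);
var failure = null;
try {
    addKey(tomorrow);      // keys is now the inherited function
} catch (e) {
    failure = e;
}
console.log(failure instanceof TypeError);   // true
```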
Hindsight is 20/20
If you're wondering why keys wasn't part of the original language, so that people would at least be accustomed to coding around it... well I guess they didn't find it necessary, or simply didn't think of it.
There are many possible methods and syntax features that aren't included in the language. That's why we have revisions to the specification, in order to add new features. For example, Array.prototype.forEach is a late addition. But they could add it to Array.prototype, because it doesn't break proper uses of Array.
It's not a realistic expectation that a language should include every possible feature in its 1.0 release.
Since Object.keys does nothing more than return an Array of an Object's enumerable own properties, it's a non-essential addition, that could be achieved with existing language features. It should be no surprise that it wasn't present earlier.
Conclusion
Adding keys to Object.prototype most certainly would break legacy code.
In a tremendously popular language like JavaScript, backward compatibility is most certainly going to be an important consideration. Adding new properties to Object.prototype at this point could prove to be disastrous.
I guess an answer to your question is "Because the committee decided so", but I can already hear you ask "why?" before the end of this sentence.
I read about this recently but I can't find the source right now. What it boiled down to was that in many cases you would have to use Object.prototype.keys.call(myobject) anyway, because of the likelihood of myobject.keys already being used in the object for something else.
I think you will find this archived mail thread interesting, where for example Brendan Eich discusses some aspects of the new methods in ECMAScript 5.
Update:
While digging in the mail-archive I found this:
Topic: Should Object.keys be repositioned as Object.prototype.keys
Discussion: Allen argued that this isn't really a meta layer operation
as it is intended for use in application layer code as an alternative
to for..in for getting a list of enumerable property names. As a
application layer method it belongs on Object.prototype rather than on
the Object constructor. There was general agreement in principle, but
it was pragmatically argued by Doug and Mark that it is too likely
that a user defined object would define its own property named "keys"
which would shadow Object.prototype.keys making it inaccessible for
use on such objects.
Action: Leave it as Object.keys.
Feel free to make your own, but the more properties you add to the Object prototype, the higher the chance you'll collide. These collisions will most likely break any third-party JavaScript library, and any code that relies on a for...in loop.
Object.prototype.keys = function () {
    return Object.keys(this);
};
for (var key in {}) {
    // now i'm getting 'keys' in here, wtf?
}
var something = {
    keys: 'foo'
};
something.keys(); // error
I need to make a JavaScript object that behaves as an associative array, but with some functions that are called before getting and setting properties.
For example, the task may be like this: we should make an object, that would contain a squared value of a key, like this:
obj.two should be equal to 4,
obj.four should be equal to 16,
obj['twenty one'] should be equal to 441.
This is just an example. I actually need to override the set operation too. The get and set operations would go to a database, and they would not necessarily take strings as keys, but objects of any type, from which a DB query would be built.
How would I do that a) with as few third-party libraries as possible and b) so that it works on as many platforms as possible?
I am new to JS. I've found that JS has no associative arrays and instead relies on the ability to define objects on the fly with arbitrary properties. I googled and had the idea to use or even override lookupgetter (and setter), defining a new getter/setter on the fly, but I couldn't find out whether the interpreter would call this method every time it encounters a new key. Anyway, it looks like I wouldn't be able to use anything except strings or maybe numbers as keys.
In Java, I would just implement java.util.Map.
Please help me: how would I do the same in JavaScript?
edit
I think I will get what I want if I manage to override [[Get]] and [[Put]] methods mentioned here http://interglacial.com/javascript_spec/a-8.html#a-8.6.2.1
For your example, doesn't this do what you want:
var myObj = {};
myObj["two"] = 4;
myObj["four"] = 16;
myObj["twenty one"] = 441;
alert(myObj["four"]); // says 16
Or are you trying to say that the object should magically calculate the squares for you?
JavaScript object keys are strings. If you try to use a number as a key JavaScript basically converts it to a string first.
Having said that, you can use objects as keys if you define a meaningful toString method on them. But of course meaningful is something that happens on a case by case basis and only you will know what needs to be done for your case.
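As a minimal sketch of that idea, assume a made-up Point type for which two points with the same coordinates should count as the same key. Only the string produced by toString is stored; the original object cannot be recovered from the key.

```javascript
// Hypothetical key type with a meaningful toString.
function Point(x, y) {
    this.x = x;
    this.y = y;
}
Point.prototype.toString = function () {
    return "Point(" + this.x + "," + this.y + ")";
};

var labels = {};
labels[new Point(1, 2)] = "first";

// A different instance with equal coordinates maps to the same key.
console.log(labels[new Point(1, 2)]);   // first
console.log(Object.keys(labels));       // [ 'Point(1,2)' ]
```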
You can also define objects that maintain their own internal data structures which you access via object methods. I think explaining that is beyond the scope of this post. Google "javascript module pattern" for some pointers to get you started.
See http://ejohn.org/blog/javascript-getters-and-setters/
Also this particular answer: Javascript getters and setters for dummies?
edit
According to Does JavaScript have the equivalent of Python's __getattribute__? and Is there an equivalent of the __noSuchMethod__ feature for properties, or a way to implement it in JS? there is no nice way of accomplishing exactly what the OP wants. Getters and setters are not useful because you must know the name of what you're looking for in advance.
My recommendation would thus be to do something like:
var database = {};
database.cache = {};
database.get = function(key) {
    // INSERT CUSTOM LOGIC to recognize "forty-two"
    // Fetch from the database only on a cache miss.
    if (!(key in database.cache))
        database.cache[key] = fetch_data_from_database(key);
    return database.cache[key];
};
database.put = function(key, value) {
    database.cache[key] = value;
    send_data_to_database(key, value);
};
I decided that the most correct way to implement this is to use Harmony:Proxies. It doesn't work on all platforms, but it lets me implement this in the most seamless way, and it may be supported on more platforms in the future.
This page contains an example that I used as a template to do what I want:
http://wiki.ecmascript.org/doku.php?id=harmony:proxies
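Proxies were later standardized in ES2015 as the Proxy constructor, whose API differs from the draft on that wiki page. A rough sketch of the squared-value example using the standardized form follows; the words lookup table, the writes log and the name squares are assumptions made for illustration, and property keys are still limited to strings and symbols.

```javascript
// Hypothetical word-to-number table standing in for real lookup logic.
var words = { two: 2, four: 4, "twenty one": 21 };
var writes = [];   // records what the set trap saw

var squares = new Proxy({}, {
    // Runs on every property read, including keys never seen before.
    get: function (target, key) {
        if (typeof key === "string" &&
                Object.prototype.hasOwnProperty.call(words, key)) {
            return words[key] * words[key];
        }
        return target[key];
    },
    // Runs on every property write; a database call could go here.
    set: function (target, key, value) {
        writes.push(String(key) + "=" + value);
        target[key] = value;
        return true;
    }
});

console.log(squares.two, squares.four, squares["twenty one"]); // 4 16 441
squares.other = 5;
console.log(squares.other, writes);   // 5 [ 'other=5' ]
```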