I'm currently developing a little game in JavaScript and I'm using Codacy to review my code and help me clean it up.
One of the most frequent errors is Generic Object Injection Sink (security/detect-object-injection).
It happens when I access a value in an array using a variable, as in this example:
function getValString(value)
{
var values = ["Misérable", "Acceptable", "Excellente", "Divine"];
return values[value];
}
This function is used to display on screen the value's string of an item. It receives a "value" which can be 0, 1, 2 or 3 and returns the string of the value.
Now here's my problem :
Codacy is telling me that use of var[var] should be prohibited because it causes security issues and since I'm rather new to Javascript, I was wondering why and what are the good practices in that kind of situation.
The security issue present here is that the stringified value of value may be accessing a property that is inherited from the Object's __proto__ hierarchical prototype, and not an actual property of the object itself.
For example, consider the scenario when value is a string literal of "constructor".
const property = "constructor";
const object = [];
const value = object[property];
The result of value in this context will resolve to the Array() function - which is inherited as part of the Object's prototype, not an actual property of the object variable. Furthermore, the object being accessed may have overridden any of the default inherited Object.prototype properties, potentially for malicious purposes.
This behavior can be partially prevented by doing an object.hasOwnProperty(property) conditional check to ensure the object actually has this property. For example:
const property = "constructor";
const object = [];
if (object.hasOwnProperty(property)) {
const value = object[property];
}
Note that if we suspect the object being accessed might be malicious or overridden the hasOwnProperty method, it may be necessary to use the Object hasOwnProperty inherited from the prototype directly: Object.prototype.hasOwnProperty.call(object, property)
Of course, this assumes that our Object.prototype has not already been tampered with.
This is not necessarily the full picture, but it does demonstrate a point.
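Applied to the function from the question, a guarded lookup could be sketched roughly as follows. The helper name `safeGet` and the bounds check are my own illustrative assumptions, not something a linter prescribes:

```javascript
// Hypothetical helper: read a property only if the object itself owns it.
function safeGet(object, property) {
  return Object.prototype.hasOwnProperty.call(object, property)
    ? object[property]
    : undefined;
}

// For arrays, an explicit bounds check on an integer index is stricter still.
function getValString(value) {
  const values = ["Misérable", "Acceptable", "Excellente", "Divine"];
  if (Number.isInteger(value) && value >= 0 && value < values.length) {
    return values[value];
  }
  return undefined; // anything else, such as "constructor", is rejected
}
```

With this, safeGet([], "constructor") no longer reaches the inherited Array function, and getValString only ever returns one of the four strings.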
Check out the following resources, which explain in more detail why this is an issue and offer some alternative solutions:
https://github.com/nodesecurity/eslint-plugin-security/blob/master/docs/the-dangers-of-square-bracket-notation.md
Securely set unknown property (mitigate square bracket object injection attacks) utility function
By itself this is not a bad practice, because you do want to develop a system and make it secure. It's difficult to imagine a higher security risk to a system than one which causes the nonexistence of that system.
Yet not being allowed to use a variable to dynamically create, use, or update an index leaves you with little more than hard-coding every index you use to refer to items of an array or members of an object.
Forbidding variable indexes reduces your options so much that it threatens the existence of any system you may want to create in JavaScript. Let's look at some of the use cases:
Numbered loops:
for (let index = 0; index < arr.length; index++) {
//do whatever with arr[index]
}
Of course, this is true of while loops as well.
in loops
for (let index in variable) {
//do whatever with variable[index]
}
of loops
for (let item of variable) {
// do whatever with item
}
finding dynamically a value
This is virtually used in quasi infinitely many ways, all the examples above are specific cases of this. Example:
function getItem(arr, index) {
return arr[index];
}
summary
The fear of exploits because of dynamic indexing is like the fear of a meteor hitting the exact place one is in at the exact time one is there. Of course we cannot exclude it, but one cannot live in constant fear of low-probability catastrophes. Similarly, programming is impossible under unreasonable, paranoid fears. So, instead of rejecting dynamic indexing altogether because exploits are possible, we must look at the actual exploits that could occur. If we are not allowed to use dynamic indexes, then whatever system we are developing, unless it is simple as pie, will never come into existence. So whatever threats we are afraid of should be guarded against by other means.
Example: You retrieve values from a data source that has a field for a credit card IBAN. Yes, if that is shown to a user who is not the owner, that's a high risk. But you should protect against this by making the IBAN unreachable through the mere use of an index supplied by external sources, such as POST requests sent by a user's browser.
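One way to guard against that kind of access is an allow-list checked before the dynamic lookup. This is only a sketch; the field names, the `PUBLIC_FIELDS` set and the `readPublicField` helper are made up for illustration:

```javascript
// Hypothetical allow-list of fields an external caller may read.
const PUBLIC_FIELDS = new Set(["name", "city"]);

function readPublicField(record, field) {
  if (!PUBLIC_FIELDS.has(field)) {
    return undefined; // "iban", "constructor", etc. are never exposed
  }
  return record[field]; // dynamic index, but only for vetted keys
}

// Sample record with placeholder data.
const record = { name: "Alice", city: "Paris", iban: "XX00 0000" };
```

Here readPublicField(record, "name") returns the name, while a request for "iban" gets nothing, so the dynamic index stays usable without exposing the sensitive field.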
Notes about 'Not a duplicate':
I've been told this is a duplicate of What is the use of Symbol in javascript ECMAScript 6?. Well, it doesn't seem right to me. The code they've given is this:
const door = {};
// library 1
const cake1 = Symbol('cake');
door[cake1] = () => console.log('chocolate');
// library 2
const cake2 = Symbol('cake');
door[cake2] = () => console.log('vanilla');
// your code
door[cake1]();
door[cake2]();
The only thing that makes this work is that cake1 and cake2 are different (unique) names. But the developer has explicitly given these; there is nothing offered by Symbol which helps here.
For example if you change cake1 and cake2 to cake and run it, it will error:
Uncaught SyntaxError: Identifier 'cake' has already been declared
If you're already having to manually come up with unique identifiers then how is Symbol helping?
If you execute this in your console:
Symbol('cake') === Symbol('cake');
It evaluates to false. So they're unique. But in order to actually use them, you're now having to come up with 2 key names (cake1 and cake2) which are unique. This has to be done manually by the developer; there's nothing in Symbol or JavaScript in general which will help with that. You're basically creating a unique identifier using Symbol but then having to assign it manually to...a unique identifier that you've had to come up with as a developer.
With regards to the linked post they cite this as an example which does not use Symbol:
const door = {};
// from library 1
door.cake = () => console.log('chocolate');
// from library 2
door.cake = () => console.log('vanilla');
// your code
door.cake();
They try to claim this is a problem and will only log "vanilla". Well clearly that's because door.cake isn't unique (it's declared twice). The "fix" is as simple as using cake1 and cake2:
door.cake1 = () => console.log('chocolate');
door.cake2 = () => console.log('vanilla');
door.cake1(); // Outputs "chocolate"
door.cake2(); // Outputs "vanilla"
That will now work and log both "chocolate" and "vanilla". In this case Symbol hasn't been used at all, and indeed has no bearing on that working. It's simply a case that the developer has assigned a unique identifier but they have done this manually and without using Symbol.
Original question:
I'm taking a course in JavaScript and the presenter is discussing Symbol.
At the beginning of the video he says:
The thing about Symbols is that every single one is unique and this makes them very valuable in terms of things like object property identifiers.
However he then goes on to say:
1. They are not enumerable in for...in loops.
2. They cannot be used in JSON.stringify. (It results in an empty object).
In the case of point (2) he gives this example:
console.log(JSON.stringify({key: 'prop'})); // object without Symbol
console.log(JSON.stringify({[Symbol('sym1')]: 'prop'})); // object using Symbol
This logs {"key":"prop"} and {} to the console respectively.
How does any of this make Symbol "valuable" in terms of being unique object keys or identifiers?
In my experience two very common things you'd want to do with an object are to enumerate it, or to convert its data to JSON to send via ajax or some such method.
I can't understand what the purpose of Symbol is at all, but especially why you would want to use them for making object identifiers, given that doing so will prevent you from doing those things later.
Edit - the following was part of the original question - but is a minor issue in comparison to the actual purpose of Symbol with respect to unique identifiers:
If you needed to send something like {[Symbol('sym1')]: 'prop'} to a backend via ajax what would you actually need to do in this case?
I replied to your comment in the other question, but since this is open I'll try to elaborate.
You are getting variable names mixed up with Symbols, which are unrelated to one another.
The variable name is just an identifier to reference a value. If I create a variable and then set it to something else, both of those refer to the same value (or in the case of non-primitives in JavaScript, the same reference).
In that case, I can do something like:
const a = Symbol('a');
const b = a;
console.log(a === b); // true
That's because there is only 1 Symbol created and the reference to that Symbol is assigned to both a and b. That isn't what you would use Symbols for.
Symbols are meant to provide unique keys which are not the same as a variable name. Keys are used in objects (or similar). I think the simplicity of the other example may be causing the confusion.
Let us imagine a more complex example. Say I have a program that lets you create an address book of people. I am going to store each person in an object.
const addressBook = {};
const addPerson = ({ name, ...data }) => {
addressBook[name] = data;
};
const listOfPeople = [];
// new user is added in the UI
const newPerson = getPersonFromUserEntry();
listOfPeople.push(newPerson.name);
addPerson(newPerson);
In this case, I would use listOfPeople to display a list and when you click it, it would show the information for that user.
Now, the problem is, since I'm using the person's name, that isn't truly unique. If I have two "Bob Smith" entries added, the second will override the first and clicking the UI from "listOfPeople" will take you to the same one for both.
Now, instead of doing that, let's use a Symbol in addPerson(), return it, and store it in listOfPeople.
const addressBook = {};
const addPerson = ({ name, ...data }) => {
const symbol = Symbol(name);
addressBook[symbol] = data;
return symbol;
};
const listOfPeople = [];
// new user is added in the UI
const newPerson = getPersonFromUserEntry();
listOfPeople.push(addPerson(newPerson));
Now, every entry in listOfPeople is totally unique. If you click the first "Bob Smith" and use that symbol to look him up you'll get the right one. Ditto for the second. They are unique even though the base of the key is the same.
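To see this without a UI, here is a self-contained sketch with two made-up entries in place of getPersonFromUserEntry(). I also keep the name inside the stored data, which is my own tweak so the record is complete on lookup:

```javascript
const addressBook = {};

const addPerson = ({ name, ...data }) => {
  const key = Symbol(name); // unique even when the name repeats
  addressBook[key] = { name, ...data };
  return key;
};

// Two hypothetical people who share a name.
const first = addPerson({ name: "Bob Smith", city: "Leeds" });
const second = addPerson({ name: "Bob Smith", city: "Perth" });
```

Looking up addressBook[first] and addressBook[second] gives two separate records, whereas keying by name would have kept only the second one.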
As I mentioned in the other answer, the use-case for Symbol is actually fairly narrow. It is really only when you need to create a key you know will be wholly unique.
Another scenario where you might use it is if you have multiple independent libraries adding code to a common place. For example, the global window object.
If my library exports something to window named "getData" and someone has a library that also exports a "getData" one of us is going to override the other if they are loaded at the same time (whoever is loaded last).
However, if I want to be safer, instead of doing:
window.getData = () => {};
I can instead create a Symbol (whose reference I keep track of) and then call my getData() with the symbol:
window[getDataSymbol]();
I can even export that Symbol to users of my library so they can use that to call it instead.
(Note, all of the above would be fairly poor naming, but again, just an example.)
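A runnable sketch of that idea, using a plain object as a stand-in for window so it works outside a browser (the names `sharedGlobal` and `getDataSymbol` are just for illustration):

```javascript
// `sharedGlobal` stands in for `window` so the sketch runs anywhere.
const sharedGlobal = {};

// Some other library claims the string key.
sharedGlobal.getData = () => "other library";

// My library registers under a Symbol key instead.
const getDataSymbol = Symbol("getData");
sharedGlobal[getDataSymbol] = () => "my library";
```

Both functions now coexist: sharedGlobal.getData() still belongs to the other library, and sharedGlobal[getDataSymbol]() reaches mine.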
Also, as someone mentioned in the comments, these Symbols are not for sharing between systems. If I call Symbol('a') that is totally unique to my system. I can't share it with anyone else. If you need to share between systems you have to make sure you are enforcing key uniqueness.
As a very practical example of what kind of problem Symbols solve, take AngularJS's use of $ and $$:
AngularJS Prefixes $ and $$: To prevent accidental name collisions with your code, AngularJS prefixes names of public objects with $ and names of private objects with $$. Please do not use the $ or $$ prefix in your code.
https://docs.angularjs.org/api
You'll sometimes have to deal with objects that are "yours", but that Angular adds its own $ and $$ prefixed properties to, simply as a necessity for tracking certain states. The $ are meant for public use, but the $$ you're not supposed to touch. If you want to serialise your objects to JSON or such, you need to use Angular's provided functions which strip out the $-prefixed properties, or you need to otherwise be aware of dealing with those properties correctly.
This would be a perfect case for Symbols. Instead of adding public properties to objects which are merely differentiated by a naming convention, Symbols allow you to add truly private properties which only your code can access and which don't interfere with anything else. In practice Angular would define a Symbol once somewhere which it shares across all its modules, e.g.:
export const PRIVATE_PREFIX = Symbol('$$');
Any other module now imports it:
import { PRIVATE_PREFIX } from 'globals';
function foo(userDataObject) {
userDataObject[PRIVATE_PREFIX] = { foo: 'bar' };
}
It can now safely add properties to any and all objects without worrying about name clashes and without having to advise the user about such things, and the user doesn't need to worry about Angular adding any of its own properties since they won't show up in enumeration or JSON serialisation. Code normally needs access to the PRIVATE_PREFIX constant to reach these properties (the exception is deliberate reflection through Object.getOwnPropertySymbols), and if that constant is properly scoped, that's only Angular-related code.
Any other library or code could also add its own Symbol('$$') to the same object, and it would still not clash because they're different symbols. That's the point of Symbols being unique.
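As a small sketch of that last point, with two hypothetical keys created by unrelated pieces of code:

```javascript
const frameworkKey = Symbol("$$"); // hypothetical framework-private key
const pluginKey = Symbol("$$");    // hypothetical key from unrelated code

const userData = { name: "Bob" };
userData[frameworkKey] = { dirty: true };
userData[pluginKey] = { cached: false };

// Only the user's own string key is listed here.
const visibleKeys = Object.keys(userData);
```

Each key reads back its own value, and visibleKeys contains only "name", so neither addition gets in the user's way.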
(Note that this Angular use is hypothetical, I'm just using its use of $$ as a starting point to illustrate the issue. It doesn't mean Angular actually does this in any way.)
To expand on @samanime's excellent answer, I'd just like to really put emphasis on how Symbols are most commonly used by real developers.
Symbols prevent key name collision on objects.
Let's inspect the following page from MDN on Symbols. Under "Properties", you can see some built-in Symbols. We'll look at the first one, Symbol.iterator.
Imagine for a second that you're designing a language like JavaScript. You've added special syntax like for..of and would like to allow developers to define their own behavior when their special object or class is iterated over using this syntax. Perhaps for..of could check for a special function defined on the object/class, named iterator:
const myObject = {
iterator: function() {
console.log("I'm being iterated over!");
}
};
However, this presents a problem. What if some developer, for whatever reason, happens to name their own function property iterator:
const myObject = {
iterator: function() {
//Iterate over and modify a bunch of data
}
};
Clearly this iterator function is only meant to be called to perform some data manipulation, probably very infrequently. And yet if some consumer of this library were to think myObject is iterable and use for..of on it, JavaScript will go right ahead and call that function, thinking it's supposed to return an iterator.
This is called a name collision and even if you tell every developer very firmly "don't name your object properties iterator unless it returns a proper iterator!", someone is bound to not listen and cause problems.
Even if you don't think just that one example is worthy of this whole Symbol thing, just look at the rest of the list of well-known symbols. replace, match, search, hasInstance, toPrimitive... So many possible collisions! Even if every developer is made to never use these as keys on their objects, you're really restricting the set of usable key names and therefore developer freedom to implement things how they want.
Symbols are the perfect solution for this. Take the above example, but now JavaScript doesn't check for a property named "iterator", but instead for a property with a key exactly equal to the unique Symbol Symbol.iterator. A developer wishing to implement their own iterator function writes it like this:
const myObject = {
[Symbol.iterator]: function() {
console.log("I'm being iterated over!");
}
};
...and a developer wishing to simply not be bothered and use their own property named iterator can do so completely freely without any possible hiccups.
This is a pattern developers of libraries may implement for any unique key they'd like to check for on an object, the same way the JavaScript developers have done it. This way, the problem of name collisions and needing to restrict the valid namespace for properties is completely solved.
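For completeness, here is a sketch of an object that returns a proper iterator from its Symbol.iterator method while also carrying an unrelated property called iterator. The `playlist` object and its contents are invented for the example:

```javascript
const playlist = {
  tracks: ["intro", "verse", "outro"],

  // An ordinary property named "iterator" is left alone by for...of.
  iterator: "just a string",

  // for...of looks up this Symbol-keyed method and expects an iterator back.
  [Symbol.iterator]() {
    let position = 0;
    const tracks = this.tracks;
    return {
      next: () =>
        position < tracks.length
          ? { value: tracks[position++], done: false }
          : { value: undefined, done: true },
    };
  },
};

const seen = [];
for (const track of playlist) seen.push(track);
```

The loop collects the three tracks, and the string-keyed iterator property is never consulted.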
Comment from the asker:
The bit which confused me on the linked OP is they've created 2 variables with the names cake1 and cake2. These names are unique and the developer has had to determine them so I didn't understand why they couldn't assign the variable to the same name, as a string (const cake1 = 'cake1'; const cake2 = 'cake2'). This could be used to make 2 unique key names since the strings 'cake1' !== 'cake2'. Also the answer says for Symbol you "can't share it" (e.g. between libraries) so what use is that in terms of avoiding conflict with other libraries or other developers code?
The linked OP I think is misleading - it seems the point was supposed to be that both symbols have the value "cake" and thus you technically have two duplicate property keys with the name "cake" on the object which normally isn't possible. However, in practice the capability for symbols to contain values is not really useful. I understand your confusion there, again, I think it was just another example of avoiding key name collision.
About the libraries, when a library is published, it doesn't publish the value generated for the symbol at runtime, it publishes code which, when added to your project, generates a completely unique symbol different than what the developers of the library had. However, this means nothing to users of the library. The point is that you can't save the value of a symbol, transfer it to another machine, and expect that symbol reference to work when running the same code. To reiterate, a library has code to create a symbol, it doesn't export the generated value of any symbols.
What's the purpose of Symbol in terms of unique object identifiers?
Well,
Symbol( 'description' ) !== Symbol( 'description' )
How does any of this make Symbol "valuable" in terms of being unique object keys or identifiers?
In a visitor pattern or chain of responsibility, some logic may attach additional metadata to any object (imagine validation or ORM metadata), and that metadata does not persist.
If you needed to send something like {[Symbol('sym1')]: 'prop'} to a backend via ajax what would you actually need to do in this case?
I can assure you that you won't need to do that. You would use { sym1: 'prop' } instead.
Now, this page even has a note about it
Note: If you are familiar with Ruby's (or another language) that also has a feature called "symbols", please don’t be misguided. JavaScript symbols are different.
As I said, they are useful for runtime metadata and not effective data.
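A rough sketch of what I mean by runtime metadata. The `VALIDATION` key and the `order` object are invented for the example:

```javascript
const VALIDATION = Symbol("validation"); // hypothetical metadata key

const order = { id: 7, total: 42 };
order[VALIDATION] = { checkedAt: "runtime only" };

// for...in skips the Symbol-keyed property.
const enumerated = [];
for (const key in order) enumerated.push(key);

// So does JSON.stringify, so the metadata never reaches the backend.
const payload = JSON.stringify(order);

// The metadata is still reachable by code that goes looking for it.
const symbols = Object.getOwnPropertySymbols(order);
```

The payload contains only id and total, while the metadata stays on the object for as long as the program holds it.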
The WeakSet is supposed to store elements by weak reference. That is, if an object is not referenced by anything else, it should be cleaned from the WeakSet.
I have written the following test:
var weakset = new WeakSet(),
numbers = [1, 2, 3];
weakset.add(numbers);
weakset.add({name: "Charlie"});
console.log(weakset);
numbers = undefined;
console.log(weakset);
Even though my [1, 2, 3] array is not referenced by anything, it's not being removed from the WeakSet. The console prints:
WeakSet {[1, 2, 3], Object {name: "Charlie"}}
WeakSet {[1, 2, 3], Object {name: "Charlie"}}
Why is that?
Plus, I have one more question. What is the point of adding objects to WeakSets directly, like this:
weakset.add({name: "Charlie"});
Are those Traceur's glitches or am I missing something?
And finally, what is the practical use of WeakSet if we cannot even iterate through it nor get the current size?
it's not being removed from the WeakSet. Why is that?
Most likely because the garbage collector has not yet run. However, you say you are using Traceur, so it just might be that they're not properly supported. I wonder how the console can show the contents of a WeakSet anyway.
What is the point of adding objects to WeakSets directly?
There is absolutely no point of adding object literals to WeakSets.
What is the practical use of WeakSet if we cannot even iterate through it nor get the current size?
All you can get is one bit of information: Is the object (or generically, value) contained in the set?
This can be useful in situations where you want to "tag" objects without actually mutating them (setting a property on them). Lots of algorithms contain some sort of "if x was already seen" condition (a JSON.stringify cycle detection might be a good example), and when you work with user-provided values the use of a Set/WeakSet would be advisable. The advantage of a WeakSet here is that its contents can be garbage-collected while your algorithm is still running, so it helps to reduce memory consumption (or even prevents leaks) when you are dealing with lots of data that is lazily (possibly even asynchronously) produced.
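As an illustration of the "already seen" idea, here is a minimal sketch of cycle detection with a WeakSet. The `hasCycle` helper is my own and is not how JSON.stringify is actually implemented:

```javascript
// Hypothetical helper: reports whether an object graph refers back to itself.
function hasCycle(root) {
  const onPath = new WeakSet(); // objects on the current path from the root

  function visit(value) {
    if (typeof value !== "object" || value === null) return false;
    if (onPath.has(value)) return true; // reached an ancestor again
    onPath.add(value);
    const found = Object.values(value).some((child) => visit(child));
    onPath.delete(value); // leaving this branch; shared siblings are fine
    return found;
  }

  return visit(root);
}

const tree = { child: { leaf: 1 } };
const loop = { name: "loop" };
loop.self = loop;
```

Nothing is written onto the visited objects themselves, which is the "tagging without mutating" point made above.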
This is a really hard question. To be completely honest I had no idea in the context of JavaScript so I asked in esdiscuss and got a convincing answer from Domenic.
WeakSets are useful for security and validation reasons, for example if you want to be able to isolate a piece of JavaScript. They allow you to tag an object to indicate it belongs to a special set of objects.
Let's say I have a class ApiRequest:
class ApiRequest {
constructor() {
// bring object to a consistent state, use platform code you have no direct access to
}
makeRequest() {
// do work
}
}
Now, I'm writing a JavaScript platform. My platform allows you to run JavaScript to make calls, and to make those calls you need an ApiRequest. I only want you to make ApiRequests with the objects I give you so you can't bypass any constraints I have in place.
However, at the moment nothing is stopping you from doing:
ApiRequest.prototype.makeRequest.call(null, args); // make request as function
Object.create(ApiRequest.prototype).makeRequest(); // no initialization
function Foo(){}; Foo.prototype = ApiRequest.prototype; new Foo().makeRequest(); // no super
And so on. Note that you can't keep a normal list or array of ApiRequest objects, since that would prevent them from being garbage collected. Other than a closure, anything can be achieved with public methods like Object.getOwnPropertyNames or Object.getOwnPropertySymbols. So you one up me and do:
const requests = new WeakSet();
class ApiRequest {
constructor() {
requests.add(this);
}
makeRequest() {
if(!requests.has(this)) throw new Error("Invalid access");
// do work
}
}
Now, no matter what I do - I must hold a valid ApiRequest object to call the makeRequest method on it. This is impossible without a WeakMap/WeakSet.
So in short - WeakMaps are useful for writing platforms in JavaScript. Normally this sort of validation is done on the C++ side but adding these features will enable moving and making things in JavaScript.
(Of course, everything a WeakSet does a WeakMap that maps values to true can also do, but that's true for any map/set construct)
(Like Bergi's answer suggests, there is never a reason to add an object literal directly to a WeakMap or a WeakSet)
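To check that the guard really blocks the bypass attempts listed earlier, here is a self-contained sketch. The `isRejected` probe and the "ok" return value are placeholders I added for the demonstration:

```javascript
const requests = new WeakSet();

class ApiRequest {
  constructor() {
    requests.add(this); // only properly constructed instances get tagged
  }
  makeRequest() {
    if (!requests.has(this)) throw new Error("Invalid access");
    return "ok"; // placeholder for the real work
  }
}

// Hypothetical probe: runs an attempt and reports whether it was rejected.
function isRejected(attempt) {
  try {
    attempt();
    return false;
  } catch (error) {
    return true;
  }
}

const bypassBlocked = isRejected(() =>
  Object.create(ApiRequest.prototype).makeRequest()
);
const detachedBlocked = isRejected(() =>
  ApiRequest.prototype.makeRequest.call(null)
);
```

Both bypass attempts throw, while new ApiRequest().makeRequest() goes through.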
By definition, WeakSet has only three key functionalities
Weakly link an object into the set
Remove a link to an object from the set
Check if an object has already been linked to the set
Sounds pretty familiar?
In some applications, developers may need a quick way to iterate through a series of data that is polluted by lots and lots of redundancy, picking only the items that have not been processed before (unique). A WeakSet could help. See the example below:
var processedBag = new WeakSet();
var nextObject = getNext();
while (nextObject !== null){
// Check if already processed this similar object?
if (!processedBag.has(nextObject)){
// If not, process it and memorize
process(nextObject);
processedBag.add(nextObject);
}
nextObject = getNext();
}
One of the best data structures for the application above is a Bloom filter, which works well for massive data sizes. However, you can use a WeakSet for this purpose as well.
A "weak" set or map is useful when you need to keep an arbitrary collection of things but you don't want their presence in the collection to prevent those things from being garbage-collected if memory gets tight. (If garbage collection does occur, the "reaped" objects will silently disappear from the collection, so you can actually tell if they're gone.)
They are excellent, for example, for use as a look-aside cache: "have I already retrieved this record, recently?" Each time you retrieve something, put it into the map, knowing that the JavaScript garbage collector will be the one responsible for "trimming the list" for you, and that it will automatically do so in response to prevailing memory conditions (which you can't reasonably anticipate).
The only drawback is that these types are not "enumerable." You can't iterate over a list of entries – probably because this would likely "touch" those entries and so defeat the purpose. But, that's a small price to pay (and you could, if need be, "code around it").
WeakSet is a simplification of WeakMap for where your value is always going to be boolean true. It allows you to tag JavaScript objects so to only do something with them once or to maintain their state in respect to a certain process. In theory as it doesn't need to hold a value it should use a little less memory and perform slightly faster than WeakMap.
var [touch, untouch] = (() => {
var seen = new WeakSet();
return [
value => seen.has(value) || (seen.add(value), !1),
value => !seen.has(value) || (seen.delete(value), !1)
];
})();
function convert(object) {
if(touch(object)) return;
extend(object, yunoprototype); // Made up.
};
function unconvert(object) {
if(untouch(object)) return;
del_props(object, Object.keys(yunoprototype)); // Never do this IRL.
};
Your console was probably incorrectly showing the contents due to the fact that the garbage collection did not take place yet. Therefore since the object wasn't garbage collected it would show the object still in weakset.
If you really want to see if a weakset still has a reference to a certain object then use the WeakSet.prototype.has() method. This method, as the name implies, returns a boolean indicating whether the object still exists in the weakset.
Example:
var weakset = new WeakSet(),
numbers = [1, 2, 3];
weakset.add(numbers);
weakset.add({name: "Charlie"});
console.log(weakset.has(numbers));
numbers = undefined;
console.log(weakset.has(numbers));
Let me answer the first part, and try to avoid confusing you further.
The garbage collection of dereferenced objects is not observable! It would be a paradox, because you need an object reference to check if it exists in a map. But don't trust me on this, trust Kyle Simpson:
https://github.com/getify/You-Dont-Know-JS/blob/1st-ed/es6%20%26%20beyond/ch5.md#weakmaps
The problem with a lot of explanations I see here, is that they re-reference a variable to another object, or assign it a primitive value, and then check if the WeakMap contains that object or value as a key. Of course it doesn't! It never had that object/value as a key!
So the final piece to this puzzle: why does inspecting the WeakMap in a console still show all those objects there, even after you've removed all of your references to those objects? Because the console itself keeps persistent references to those Objects, for the purpose of being able to list all the keys in the WeakMap, because that is something that the WeakMap itself cannot do.
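A short sketch of the pitfall described above, where the variable is reassigned and the check ends up asking about a different value. The `keptReference` variable is added here only to make the contrast visible:

```javascript
const weakset = new WeakSet();
let numbers = [1, 2, 3];
weakset.add(numbers);

const keptReference = numbers; // a second reference to the same array
numbers = undefined;

// This asks about `undefined`, which was never added to the set.
const askedAboutUndefined = weakset.has(numbers);

// This asks about the array itself, which is still reachable.
const askedAboutArray = weakset.has(keptReference);
```

The first check is false and the second is true, and neither says anything about garbage collection.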
While searching for use cases of WeakSet, I found these points:
"The WeakSet is weak, meaning references to objects in a WeakSet are held weakly.
If no other references to an object stored in the WeakSet exist, those objects can be garbage collected."
##################################
They are black boxes: we only get any data out of a WeakSet if we have both the WeakSet and a value.
##################################
Use Cases:
1 - to avoid bugs
2 - it can be very useful in general to avoid any object to be visited/setup twice
Reference: https://esdiscuss.org/topic/actual-weakset-use-cases
3 - The contents of a WeakSet can be garbage collected.
4 - Possibility of lowering memory utilization.
Reference: https://www.geeksforgeeks.org/what-is-the-use-of-a-weakset-object-in-javascript/
##################################
Example on Weakset: https://exploringjs.com/impatient-js/ch_weaksets.html
I advise you to learn more about the weak concept in JS: https://blog.logrocket.com/weakmap-weakset-understanding-javascript-weak-references/
I am working through a memory issue with one of our webapps. I am using Chrome's heap profiler. I want to make sure I understand something very explicitly, as I'm making assumptions based on this information.
My question is about the # symbol in the heap profile. I want to be crystal clear that equal object ids imply the same object:
a.objectId == b.objectId implies a is the same as b
a.objectId != b.objectId implies a is NOT the same as b
Therefore if I have two objects that I expected to actually be the same thing, yet their object id differs, this implies an unexpected copy occurred? This implies I can go and figure out in my code where I might be creating unnecessary duplicates?
The documentation appears to say this, but they don't quite say it explicitly, going on to say why they have an object id, not what it represents.
This is an object ID. Displaying an object's address makes no sense, as objects are moved during garbage collections. Those object IDs are real IDs — that means, they persist among multiple snapshots taken and are unique. This allows precise comparison between heap states. Maintaining those IDs adds an overhead to GC cycles, but it is only initiated after the first heap snapshot was taken — no overhead if heap profiles aren't used.
I get that. But I need to fit this back into my C programmer head. I realize, even with native heaps, pointer values can change over time. Can I effectively treat object ids as pointer addresses unique over time?
So I ran some test code in Chrome and the answer appears to be yes, the same object id implies identical objects. If object id differs, this implies a copy or different object altogether.
I profiled the following snippet of code which can be found in this github repo:
(function heapTest(doc) {
'use strict';
function clone(obj) {
return JSON.parse(JSON.stringify(obj));
}
var b = {'grandchild-key-2': 5};
var a = {'child-key-1': b};
doc.child1 = a;
doc.child1_again = a;
doc.child1_copy = clone(a);
})(document);
The heap profiler confirms the two references share object ids, the copy receives a new object id.
In short, this behaves like I expect. Multiple references to the same object receive the same object id. Copies refer to a different object and receive a different object id.
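The same distinction can be checked in code with ===, which compares object identity. This sketch mirrors the profiled snippet but uses a plain `holder` object instead of document so it runs anywhere; it says nothing about the profiler's internal ids beyond what I observed above:

```javascript
function clone(obj) {
  return JSON.parse(JSON.stringify(obj));
}

const holder = {}; // stands in for `document`
const a = { "child-key-1": { "grandchild-key-2": 5 } };

holder.child1 = a;
holder.child1_again = a;        // second reference to the same object
holder.child1_copy = clone(a);  // structurally equal, but a new object
```

holder.child1 === holder.child1_again is true, while holder.child1 === holder.child1_copy is false even though the contents match.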
I recently asked a question about LocalStorage. Using JSON.parse(localStorage.item) and JSON.parse(localStorage['item']) weren't working to return NULL when the item hadn't been set yet.
However, JSON.parse(localStorage.getItem('item')) did work. And it turns out, JSON.parse(localStorage.testObject || null) also works.
One of the comments basically said that localStorage.getItem() and localStorage.setItem() should always be preferred:
The getter and setter provide a consistent, standardised and
crossbrowser compatible way to work with the LS api and should always
be preferred over the other ways. -Christoph
I've come to like using the shorthand dot and bracket notations for localStorage, but I'm curious to know others' take on this. Is localStorage.getItem('item') better than localStorage.item or localStorage['item'] OR as long as they work are the shorthand notations okay?
Both direct property access (localStorage.foo or localStorage['foo']) and using the functional interface (localStorage.getItem('foo')) work fine. Both are standard and cross-browser compatible.* According to the spec:
The supported property names on a Storage object are the keys of each key/value pair currently present in the list associated with the object, in the order that the keys were last added to the storage area.
They just behave differently when no key/value pair is found with the requested name. For example, if key 'foo' does not exist, var a = localStorage.foo; will result in a being undefined, while var a = localStorage.getItem('foo'); will result in a having the value null. As you have discovered, undefined and null are not interchangeable in JavaScript. :)
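To see why the difference matters for the JSON.parse case from the question, here is a small sketch that needs no browser. It feeds JSON.parse the two "missing" values directly. `parseOrError` is a hypothetical helper written only for this illustration.

```javascript
// Hypothetical helper: run JSON.parse and report whether it threw.
function parseOrError(raw) {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (e) {
    return { ok: false, error: e.name };
  }
}

// What property access yields for a missing key: undefined.
// JSON.parse converts it to the string "undefined", which is not valid JSON.
var fromMissingProperty = parseOrError(undefined);

// What getItem() yields for a missing key: null.
// JSON.parse converts it to the string "null", which parses to null.
var fromMissingGetItem = parseOrError(null);
```

This also explains why the `localStorage.testObject || null` form works: it turns the undefined into null before parsing.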
EDIT: As Christoph points out in his answer, the functional interface is the only way to reliably store and retrieve values under keys equal to the predefined properties of localStorage. (There are six of these: length, key, setItem, getItem, removeItem, and clear.) So, for instance, the following will always work:
localStorage.setItem('length', 2);
console.log(localStorage.getItem('length'));
Note in particular that the first statement will not affect the property localStorage.length (except perhaps incrementing it if there was no key 'length' already in localStorage). In this respect, the spec seems to be internally inconsistent.
However, the following will probably not do what you want:
localStorage.length = 2;
console.log(localStorage.length);
Interestingly, the first is a no-op in Chrome, but is synonymous with the functional call in Firefox. The second will always log the number of keys present in localStorage.
* This is true for browsers that support web storage in the first place. (This includes pretty much all modern desktop and mobile browsers.) For environments that simulate local storage using cookies or other techniques, the behavior depends on the shim that is used. Several polyfills for localStorage can be found here.
The question is already quite old, but since I have been quoted in the question, I think I should say two words about my statement.
The Storage object is rather special: it is an object that provides access to a list of key/value pairs, so it is not an ordinary object or array.
For example, it has a length attribute which, unlike an array's length, is read-only and returns the number of keys in the storage.
With an array you can do:
var a = [1,2,3,4];
a.length // => 4
a.length = 2;
a // => [1,2]
Here we have the first reason to use the getters/setters. What if you want to set an item called length?
localStorage.length = "foo";
localStorage.length // => 0
localStorage.setItem("length","foo");
// the "length" key is now only accessible via the getter method:
localStorage.length // => 1
localStorage.getItem("length") // => "foo"
With other members of the Storage object it's even more critical, since they are writable and you can accidentally overwrite methods like getItem. Using the API methods prevents any of these possible problems and provides a consistent interface.
Another interesting point is the following paragraph in the spec (emphasis mine):
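The separation between stored keys and the object's own members can be sketched without a browser. `createMemoryStorage` below is a hypothetical in-memory stand-in, not the real Storage implementation. It only mimics the part of the behaviour described above, namely that keys set through setItem live in their own list and that length counts them.

```javascript
// Hypothetical stand-in for a Storage object (assumption: only the
// getItem/setItem/removeItem/length subset is modelled).
function createMemoryStorage() {
  var data = new Map();
  return {
    getItem: function (key) {
      key = String(key);
      return data.has(key) ? data.get(key) : null; // null for missing keys
    },
    setItem: function (key, value) {
      data.set(String(key), String(value)); // keys and values are strings
    },
    removeItem: function (key) {
      data.delete(String(key));
    },
    get length() {
      return data.size; // read-only count of stored keys
    }
  };
}

var storage = createMemoryStorage();

// Storing a key named "length" does not touch the length member.
storage.setItem('length', 'foo');

var storedLength = storage.getItem('length'); // the stored value
var keyCount = storage.length;                // the number of keys
var missing = storage.getItem('no-such-key'); // absent key
```

In a page you would pass window.localStorage to the same calls. The point is only that going through getItem and setItem never collides with a member name.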
The setItem() and removeItem() methods must be atomic with respect to failure. In the case of failure, the method does nothing. That is, changes to the data storage area must either be successful, or the data storage area must not be changed at all.
Theoretically there should be no difference between the getters/setters and the [] access, but you never know...
I know it's an old post, but since nobody has mentioned performance, I set up some JsPerf tests to benchmark it. Besides being a coherent interface and much easier to read, getItem and setItem were also consistently faster than dot notation or brackets in my tests.
Here are my tests on JsPerf
As was mentioned, there is almost no difference except in the case of a nonexistent key. The difference in performance varies depending on which browser/OS you are using, but it is not really that different.
I suggest you use the standard interface, simply because it is the recommended way of using it.
(Let us suppose that there is a good reason for wishing this. See the end of the question if you want to read the good reason.)
I would like to obtain the same result as a for in loop, but without using that language construct. By result I mean only an array of the property names (I don't need to reproduce the behavior that would happen if I modify the object while iterating over it).
To put the question into code, I'd like to implement this function without for in:
function getPropertiesOf(obj) {
var props = [];
for (var prop in obj)
props.push(prop);
return props;
}
From my understanding of the ECMAScript 5.1 specification about the for in statement and the Object.keys method, it seems the following implementation should be correct:
function getPropertiesOf(obj) {
var props = [];
// No prototype, so inherited names such as "constructor" are not falsely "seen"
var alreadySeen = Object.create(null);
// Handle primitive types
if (obj === null || obj === undefined)
return props;
obj = Object(obj);
// For each object in the prototype chain:
while (obj !== null) {
// Add own enumerable properties that have not been seen yet
var enumProps = Object.keys(obj);
for (var i = 0; i < enumProps.length; i++) {
var prop = enumProps[i];
if (!alreadySeen[prop])
props.push(prop);
}
// Add all own properties (including non-enumerable ones)
// in the alreadySeen set.
var allProps = Object.getOwnPropertyNames(obj);
for (var i = 0; i < allProps.length; i++)
alreadySeen[allProps[i]] = true;
// Continue with the object's prototype
obj = Object.getPrototypeOf(obj);
}
return props;
}
The idea is to explicitly walk the prototype chain and use Object.keys to get the own enumerable properties of each object in the chain. We exclude property names already seen in previous objects in the chain, including when they were seen as non-enumerable. This method should even respect the additional guarantee mentioned on MDN:
The Object.keys() method returns an array of a given object's own
enumerable properties, in the same order as that provided by a
for...in loop [...].
(emphasis is mine)
I played a bit with this implementation, and I haven't been able to break it.
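For reference, a check of this kind could look like the sketch below. It is a compact restatement of the two approaches, not a proof. `viaForIn`, `viaKeys`, `sameList`, and the `parent`/`child` objects are names made up for this test, and the lookup set is created with Object.create(null) so that inherited names like "constructor" cannot interfere.

```javascript
// Reference behaviour: the for-in loop itself.
function viaForIn(obj) {
  var props = [];
  for (var prop in obj) props.push(prop);
  return props;
}

// Candidate: explicit prototype-chain walk with Object.keys.
function viaKeys(obj) {
  var props = [];
  var seen = Object.create(null); // prototype-less lookup table
  if (obj === null || obj === undefined) return props;
  obj = Object(obj);
  while (obj !== null) {
    Object.keys(obj).forEach(function (prop) {
      if (!seen[prop]) props.push(prop);
    });
    // Non-enumerable own properties still shadow inherited ones.
    Object.getOwnPropertyNames(obj).forEach(function (prop) {
      seen[prop] = true;
    });
    obj = Object.getPrototypeOf(obj);
  }
  return props;
}

// Order-sensitive comparison of two arrays of property names.
function sameList(x, y) {
  return JSON.stringify(x) === JSON.stringify(y);
}

var parent = { a: 1, hidden: 2, constructor: 3 };
var child = Object.create(parent);
child.b = 4;
// A non-enumerable own property that shadows an enumerable inherited one.
Object.defineProperty(child, 'hidden', { value: 5, enumerable: false });

var agreesOnChild = sameList(viaForIn(child), viaKeys(child));
var agreesOnArray = sameList(viaForIn(['x', 'y']), viaKeys(['x', 'y']));
var agreesOnString = sameList(viaForIn('ab'), viaKeys('ab'));
```

Agreement on a handful of objects in one engine only shows that no counterexample was found there. It says nothing about what the specification guarantees.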
So the question:
Is my analysis correct? Or am I overlooking a detail of the spec that would make this implementation incorrect?
Do you know another way to do this, that would match the implementation's specific order of for in in all cases?
Remarks:
I don't care about ECMAScript < 5.1.
I don't care about performance (it can be disastrous).
Edit: to satisfy @lexicore's curiosity (but not really part of the question), the good reason is the following. I develop a compiler to JavaScript (from Scala), and the for in language construct is not part of the things I want to support directly in the intermediate representation of my compiler. Instead, I have a "built-in" function getPropertiesOf which is basically what I show as the first example. I'm trying to get rid of as many builtins as possible by replacing them with "user-space" implementations (written in Scala). For performance, I still have an optimizer that sometimes "intrinsifies" some methods, and in this case it would intrinsify getPropertiesOf with the efficient first implementation. But to make the intermediate representation sound, and to make it work when the optimizer is disabled, I need a true implementation of the feature, no matter the performance cost, as long as it's correct. And in this case I cannot use for in, since my IR cannot represent that construct (but I can call arbitrary JavaScript functions on any objects, e.g., Object.keys).
From the specification's point of view, your analysis is correct only under the assumption that a particular implementation defines a specific order of enumeration for the for-in statement:
If an implementation defines a specific order of enumeration for the
for-in statement, that same enumeration order must be used in step 5
of this algorithm.
See the last sentence here.
So if an implementation does not provide such specific order, then for-in and Object.keys may return different things. Well, in this case even two different for-ins may return different things.
Interestingly, the whole story reduces to the question of whether two for-ins will give the same results if the object was not changed. Because if that is not the case, how could you test for "the same" anyway?
In practice, this will most probably be true, but I could also easily imagine that an object could rebuild its internal structure dynamically between for-in calls. For instance, if a certain property is accessed very often, the implementation may restructure the hash table so that access to that property is more efficient. As far as I can see, the specification does not prohibit that, and it is not that unreasonable either.
So the answer to your question is: no, there is no guarantee according to the specification, but it will probably still work in practice.
Update
I think there's another problem. Where is the order of properties between the members of the prototype chain defined? You may get the "own" properties in the right order, but are they merged exactly the way you do it? For instance, why the child's properties first and the parent's next?