My question is the same as Are Javascript arrays sparse? with one difference...
Are JavaScript arrays sparse, as implemented in Node.js (and/or V8)? I had assumed the were, but then I did the following test:
var testArray = [];
testArray[10000] = 'test';
console.log(testArray);
Returned is 10,000 blank elements, with 'test' at the end. Is this because of the way the debugging output works, or does Node.js actually allocate memory for the undefined array elements when adding new elements?
What you are seeing is not an actual array, just a representation. The console checks for the presence of a length and a splice property and prints the numeric properties of the object, which is why jQuery collections appear to be arrays when logged.
function RandomThingThatIsDefinitelyNotAnArray() {
this.splice = function() {};
this.length = 10;
this[5] = 5;
}
console.log(new RandomThingThatIsDefinitelyNotAnArray());
// shows [undefined, undefined, undefined, undefined, undefined, 5]
jsfFiddle for a live demo.
Node's util.inspect function specifically checks for sparse arrays to print them like that. In fact, it has a test that specifically checks for it. I know this because my full rewrite of node's inspector ultimately didn't make it into core, and lead to my UltraREPL.
Related
I'm trying to learn Javascript - here's my issue:
In the w3schools.com javascript array examples, they show the sequent example:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
An array "person" has been defined, but then they proceed to add some elements whit a "named" index. Then tries to print the HTML document the 0th element and the number of elements of the array, like you would do with a standard array.
The description says:
If you use a named index when accessing an array, JavaScript will
redefine the array to a standard object, and some array methods and
properties will produce undefined or incorrect results.
In fact, person[0] and person.length return respectively "undefined" and "0". Even is person was initially defined as an array, by inserting new named indexes elements, the array should be redefined as an object. But when i try do use the Array.isArray() method for checking it, it returns true:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
document.getElementById('test').innerHTML = Array.isArray(person);// returns true
So, why? if, as specified by the tutorial, this has been effectively redefined as a standard object, and the ECMAScript 5 has added the .isArray() method for checking if something is an array and nothing else, shouldn't this return false insted of true?
I'm sure i missed something. If i define person like this:
person = {};
then it returns false, as expected. What is happening here? I just wanted to understand arrays a little bit more, this confuses me. Is this just a broken array, but still an array?
Here's the example (without the Array.isarray() bit, just the default): https://www.w3schools.com/js/tryit.asp?filename=tryjs_array_associative_2
First of all I want to note that the example you took from the w3schools page on arrays, is from the "Associative Arrays" section, which has this important introduction:
Many programming languages support arrays with named indexes.
Arrays with named indexes are called associative arrays (or hashes).
JavaScript does not support arrays with named indexes.
In JavaScript, arrays always use numbered indexes.
This puts the example into context, because it really makes no sense to define a variable as an array and then use string keys. But this was an example to illustrate the point.
Does an Array become an Object?
That JavaScript still considers the variable to be an array is as expected. It becomes an array at the moment of assignment of [], and that does not change by adding properties to that object. Yes, arrays are objects. They just have additional capabilities.
The array did not lose any of its array-like capabilities, but those features just don't work on those string properties, ... only on numerical ones (more precisely, the non-negative integer ones).
You loosely quoted the following statement from w3schools:
If you use named indexes, JavaScript will redefine the array to a standard object.
That is wrong information and leads to your misunderstanding. There is no redefinition happening. When you add properties to any object, then the object does not change "type". It remains an instance of what it was before... An array remains an array, a date object remains a date, a regex object remains a regex, even if you assign other properties to it. But non-numerical properties do not "count" for an array: the length will remain unchanged when you add such properties. The length only reveals something about the numerical properties of the object.
This quote is yet another illustration of what the JavaScript community thinks about w3schools.com, i.e. that it is not the most reliable reference, even though it has its value for learning the language.
Example of adding useful properties to arrays
Having said the above, there are cases where you may intentionally want to make use of such properties on arrays. Let's for example think of an array of words that is sorted:
const arr = ["apple", "banana", "grapefruit", "orange", "pear"];
Now let's add something to this array that denotes that it is currently sorted:
arr.isSorted = true;
We could imagine a function that would allow one to add a value to this array, but which also verifies if the array is still sorted:
function addFruit(arr, fruit) {
if (arr.length && fruit < arr[arr.length-1]) {
arr.sorted = false;
}
arr.push(fruit);
}
Then after having added several values, it would maybe be interesting to verify whether the array needs sorting:
if (!arr.sorted) arr.sort();
So this extra property helps to avoid executing an unnecessary sort. But for the rest the array has all the functionality as if it did not have that extra property.
An object that is set up as an array and then filled as an object becomes a member of both classes. Methods of the Array class will apply to its 'array-ness':
Array.isArray(person);
returns true. Methods of the Object class will apply to its 'object-ness':
typeof(person);
returns object. When it could be either one, the 'array-ness' will prevail, because the variable was first defined as an array:
console.log(person);
will put Array [ ] on the console, because it runs the Array class's logging method. It is displayed as an empty array, since it has no numbered elements, but you could add some:
person[2]=66;
and then console.log would log Array [ <2 empty slots>, 66 ].
I think the polyfill implementation of isArray() will clear your doubt by some extent.
#Polyfill
I saw this for the first time (or noticed it for the first time) today during a maintenance of a colleagues code.
Essentially the "weird" part is the same as if you try to run this code:
var arr = [];
arr[0] = "cat"; // this adds to the array
arr[1] = "mouse"; // this adds to the array
arr.length; // returns 2
arr["favoriteFood"] = "pizza"; // this DOES NOT add to the array. Setting a string parameter adds to the underlying object
arr.length; // returns 2, not 3
Got this example from nfiredly.com
I don't know what the technical term for this "case" is so I haven't been able to find any additional information about it here or on Google but it strikes me very peculiar that this "behaviour" can at all exists in JavaScript; a kind of "mix" between Arrays and Objects (or Associative Arrays).
It states in the above code snippet that that Setting a string parameter adds to the underlying object and thus not affect the length of the "array" itself.
What is this kind of pattern?
Why is it at all possible? Is it a weird JS quirk or is it deliberate?
For what purpose would you "mix" these types?
It's possible because arrays are objects with some special behaviors, but objects nevertheless.
15.4 Array Objects
However, if you start using non-array properties on an array, some implementations may change the underlying data structure to a hash. Then array operations might be slower.
In JavaScript, arrays, functions and objects are all objects. Arrays are objects created with Array constructor function.
E.g.
var a = new Array();
Or, using shortcut array literal,
var a = [];
Both are the same. They both create objects. However, special kind of object. It has a length property and numeric properties with corresponding values which are the array elements.
This object (array) has methods like push, pop etc. which you can use to manipulate the object.
When you add a non-numeric property to this array object, you do not affect its length. But, you do add a new property to the object.
So, if you have
var a = [1];
a.x = 'y';
console.log(Object.keys(a)); // outputs ["0", "x"]
console.log(a); // outputs [1];
Consider the following example:
Code:
var array = [];
array[4] = "Hello World";
Result:
[undefined, undefined, undefined, undefined, "Hello World"]
It looks rather inefficient to be able to just declare where in the array you want your value to reside. Think in terms of big arrays (100,000+ indexes).
Is this actually an inefficient use of arrays in JavaScript, or are arrays handled in such a way that n indexes aren't actually declared undefined? (i.e. is this just pretty printed to illustrate the empty indexes?)
Note: The suggested duplicate of this question is wrong. This is not concerning zero-based indexes. I am already aware that arrays start from 0!
Your assumption is correct. JS arrays are objects, and an array declaration, like
a = ['a','b','c']
is just a shortcut for
a = {
"0": "a",
"1": "b",
"2": "c"
}
Respectively, a=[]; a[4]='foo' is the same as
a = {
"4": "foo"
}
The bunch of undefineds you see in the console is just an artifact of how the console dumps arrays, they don't actually exist.
The only difference between arrays and ordinary objects is that arrays have a "length" property which is handled in a special way. length is always equal to the max of numeric indexes assigned so far plus one:
a = [];
a[100] = 'x';
a.length; // 101
NB: I'm talking about an "ideal" JS implementation as described in the standard, specific implementations might provide under-the-hood optimizations and store arrays in a different way.
JavaScript arrays work that way. When you declare a new array:
var arr = [];
The length will be 0. And there will be no elements.
And if you insert at the 5th index:
arr[5] = 10;
Then, JavaScript engine fills the previous numeric indices with undefined values. Thereby making the array length to be 6 and not 1. If you see the array contents, it would be:
arr => array(
undefined,
undefined,
undefined,
undefined,
undefined,
10
)
I have already asked a question about this, but I am not able to find it. The question is: Wrong representation of JavaScript Array Length.
Copying the accepted answer from the above said question:
The .length is defined to be one greater than the value of the largest numeric index. (It's not just "numeric"; it's 32-bit integer values, but basically numbered properties.)
Conversely, setting the .length property to some numeric value (say, 6 in your example) has the effect of deleting properties whose property name is a number greater than or equal to the value you set it to.
It will all depend on the JavaScript interpreter as to exactly how the internal structure is handled — which would define whether it is handled in an inefficient way or not.
My simplified assumption is that actually causing the array to be printed out is the only point where those undefineds are discovered. Basically meaning that it is only through access that those particular indexes are found to have a value of undefined (i.e. not allocated). Basically the following would be different (in my thinking):
var a = []; a[100] = 1;
to that of:
var a = []; for ( var i=0; i<=99; i++ ) { a[i] = undefined; } a[100] = 1;
The latter would have all 100 indexes taking up recorded space. Whereas the former would only have one index recorded.
Basically until an index is defined, it won't take up stored space. The internal array structure will just record the indexes that are defined and the largest index.
The reality I'm sure is probably more involved, take this overview of the v8 engine (I can't link directly to the Array section because someone messed up their anchor IDs ;). It seems there are two types of storage for the v8 engine. One designed to handle arrays that are linear, and well defined, and the other to handle the type we are talking about (whereby it uses a hash) which won't work in an linear indexed manner, it will use "hashed" key locations instead.
http://www.html5rocks.com/en/tutorials/speed/v8/#toc-topic-numbers
I've had a bit of a wakeup to the nature of JavaScript array indexes recently. Pursuing it, I found the following (I'm working with Node.js in interpretive mode here):
var x=[];
x['a']='a';
console.log(x); // Yields [ a: 'a' ]
console.log(x.length); // yields 0 not 1
x[1]=1;
console.log(x); // Yields [ , 1, a: 'a' ]
console.log(x.length); // Yields 2 not 3 (one for empty 0 space, one for the occupied 1 space)
Is a: 'a' really what it looks like - an object property embedded in an array - and, thus, isn't counted in the array property .length?
In JavaScript, arrays are just objects with some special properties, such as an automatic length property, and some methods attached (such as sort, pop, join, etc.). Indeed, a will not be counted in your array, since the length property of an array only stores the amount of elements with a property name that can be represented with a 32-bit positive integer.
And since arrays always automatically define every numbered element up to the highest element with a positive 32-bit int property name, this effectively means the length property stores 1 higher than the element with the highest 32-bit integer as a property name. Thanks #Felix Kling for correcting me about this in the comments.
Adding properties such as a is not forbidden at all, but you should watch out with them, since it might be confusing when reading your code.
There's also a difference in walking through the elements in the array:
To walk through all the numbered elements:
for (var i=0; i<myArray.length; i++) {
//do something
}
To walk through every property that's not built-in:
for (var i in myArray) {
//do something
}
Note that this loop will also include anything that's included from Array.prototype that's not a built-in method. So, if you were to add Array.prototype.sum = function() {/*...*/};, it will also be looped through.
To find out if the object you're using is indeed an array, and not just an object, you could perform the following test:
if (Object.prototype.toString.call(myObject) === '[object Array]') {
//myObject is an array
} else if (typeof myObject === 'object') {
//myObject is some other kind of object
}
See #artem's comment: myObject instanceof Array might not always work correctly.
That's correct, and it's a good illustration of the fact that that new Array you've created is really just a special kind of Object. In fact, typeof [] is 'object'! Setting a named property on it (which might be written more cleanly here as x.a = 'a') is really setting a new property to the object wrapper around your "real" array (numbered properties, really). They don't affect the length property for the same reason that the Array.isArray method doesn't.
Yes, that's correct. This works because pretty much everything in JavaScript is an object, including arrays and functions, which means that you can add your own arbitrary string properties. That's not you say that you should do this.
Arrays ([]) should be indexed using nonnegative integers. Objects ({}) should be "indexed" using strings.
I'm writing a Google Chrome extension, in JavaScript, and I want to use an array to store a bunch of objects, but I want the indexes to be specific non-consecutive ID numbers.
(This is because I need to be able to efficiently look up the values later, using an ID number that comes from another source outside my control.)
For example:
var myObjects = [] ;
myObjects[471] = {foo: "bar"} ;
myObjects[3119] = {hello: "goodbye"}
When I do console.log(myObjects), in the console I see the entire array printed out, with all the thousands of 'missing' indexes showing undefined.
My question is: does this matter? Is this wasting any memory?
And even if it's not wasting memory, surely whenever I loop over the array, it wastes CPU if I have to manually skip over every missing value?
I tried using an object instead of an array, but it seems you can't use numbers as object keys. I'm hoping there's a better way to achieve this?
First of all, everyone, please learn that what the for-in statement does is called enumeration (though it's an IterationStatement) in order to differentiate from iteration. This is very important, because it leads to confusion especially among beginners.
To answer the OP's question: It doesn't take up more space (test) (you could say it's implementation dependent, but we're talking about a Google Chrome Extension!), and it isn't slower either (test).
Yet my advice is: Use what's appropriate!
In this situation: use objects!
What you want to do with them is clearly a hashing mechanism, keys are converted to strings anyway so you can safely use object for this task.
I won't show you a lot of code, other answers do it already, I've just wanted to make things clear.
// keys are converted to strings
// (numbers can be used safely)
var obj = {}
obj[1] = "value"
alert(obj[1]) // "value"
alert(obj["1"]) // "value"
Note on sparse arrays
The main reason why a sparse array will NOT waste any space is because the specification doesn't say so. There is no point where it would require property accessors to check if the internal [[Class]] property is an "Array", and then create every element from 0 < i < len to be the value undefined etc. They just happen to be undefined when the toString method is iterating over the array. It basically means they are not there.
11.2.1 Property Accessors
The production MemberExpression : MemberExpression [ Expression ] is evaluated as follows:
Let baseReference be the result of evaluating MemberExpression.
Let baseValue be GetValue(baseReference).
Let propertyNameReference be the result of evaluating Expression.
Let propertyNameValue be GetValue(propertyNameReference).
Call CheckObjectCoercible(baseValue).
Let propertyNameString be ToString(propertyNameValue).
If the syntactic production that is being evaluated is contained in strict mode code, let strict be true, else let strict be false.
Return a value of type Reference whose base value is baseValue and whose referenced name is propertyNameString, and whose strict mode flag is strict.
The production CallExpression : CallExpression [ Expression ] is evaluated in exactly the same manner, except that the contained CallExpression is evaluated in step 1.
ECMA-262 5th Edition (http://www.ecma-international.org/publications/standards/Ecma-262.htm)
You can simply use an object instead here, having keys as integers, like this:
var myObjects = {};
myObjects[471] = {foo: "bar"};
myObjects[3119] = {hello: "goodbye"};
This allows you to store anything on the object, functions, etc. To enumerate (since it's an object now) over it you'll want a different syntax though, a for...in loop, like this:
for(var key in myObjects) {
if(myObjects.hasOwnProperty(key)) {
console.log("key: " + key, myObjects[key]);
}
}
For your other specific questions:
My question is: does this matter? Is this wasting any memory?
Yes, it wastes a bit of memory for the allocation (more-so for iterating over it) - not much though, does it matter...that depends on how spaced out the keys are.
And even if it's not wasting memory, surely whenever I loop over the array, it wastes CPU if I have to manually skip over every missing value?
Yup, extra cycles are used here.
I tried using an object instead of an array, but it seems you can't use numbers as object keys. I'm hoping there's a better way to achieve this?
Sure you can!, see above.
I would use an object to store these. You can use numbers for properties using subscript notation but you can't using dot notation; the object passed as the key using subscript notation has toString() called on it.
var obj = {};
obj[471] = {foo: "bar"} ;
As I understand it from my reading of Crockford's "The Good Parts," this does not particularly waste memory, since javascript arrays are more like a special kind of key value collection than an actual array. The array's length is defined not as the number of addresses in the actual array, but as the highest-numbered index in the array.
But you are right that iterating through all possible values until you get to the array's length. Better to do as you mention, and use an object with numeric keys. This is possible by using the syntax myObj['x']=y where x is the symbol for some integer. e.g. myObj['5']=poodles Basically, convert your index to a string and you're fine to use it as an object key.
It would be implementation dependent, but I don't think you need to worry about wasted memory for the "in between" indices. The developer tools don't represent how the data is necessarily stored.
Regarding iterating over them, yes, you would be iterating over everything in between when using a for loop.
If the sequential order isn't important, then definitely use a plain Object instead of an Array. And yes, you can use numeric names for the properties.
var myObjects = {} ;
myObjects["471"] = {foo: "bar"} ;
myObjects["3119"] = {hello: "goodbye"};
Here I used Strings for the names since you said you were having trouble with the numbers. They ultimately end up represented as strings when you loop anyway.
Now you'll use a for-in statement to iterate over the set, and you'll only get the properties you've defined.
EDIT:
With regard to console.log() displaying indices that shouldn't be there, here's an example of how easy it is to trick the developer tools into thinking you have an Array.
var someObj = {};
someObj.length = 11;
someObj.splice = function(){};
someObj[10] = 'val';
console.log(someObj);
Clearly this is an Object, but Firebug and the Chrome dev tools will display it as an Array with 11 members.
[undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, "val"]
So you can see that the console doesn't necessarily reflect how/what data is actually stored.
I would simply use a constant prefix, to avoid such problems.
var myObjects = {};
myObjects['objectId_'+365] = {test: 3};
will default to Js-Objects.
You could attempt to do something like this to make it loud and clear to the JIST compiler that this is a more objecty-ish array like so:
window.SparseArray = function(len){
if (typeof len === 'number')
if (0 <= len && len <= (-1>>>0))
this.length = len;
else
new Array(len); // throws an Invalid array length type error
else
this.push.apply(this, arguments);
}
window.SparseArray.prototype = Array.prototype