This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Are Javascript arrays sparse?
I am learning JavaScript at the moment and have been reading some simple introductions and tutorials. While looking at the Array object I stumbled upon some details, which strike me as very odd, coming from other languages like C/Java/Scala/...
So let's assume we define an array like this:
var arr = ['foo','bar','qux']
We now assign
arr[5] = 'baz'
which results in our array looking like this:
arr
>> ["foo", "bar", "qux", undefined, undefined, "baz"]
And the length is as expected
arr.length
>> 6
JavaScript has kindly expanded our array to the needed length - six - and the new items are set to undefined - except for the one we actually assigned a value to.
From a low-level point of view this is horrible memory-wise. Typically an array would be a contiguous range in memory, so making an array bigger generally involves copying the whole array to a new memory location of sufficient size. This is a very costly operation.
Now, I do realize that this is likely not what JavaScript engines are doing, as copying around arrays would be crazy expensive and the memory space would be wasted on all these 'undefined' values.
Can someone tell me what actually happens behind the door?
Are arrays actually some sort of linked list?
Are the 'undefined' array items actually there?
How expensive is it to work with large arrays that are mostly filled with 'undefined'?
In the first version of JavaScript, there were no arrays. They were later introduced as a sub-class of that "mother of all objects": Object. You can test this quite easily by doing this:
var foo = [1,2,3,4];
for (var n in foo)
{//check if n is equal (value and type) to itself, coerced to a number
console.log(n === +(n) ? 'Number' : 'String');
}
This will log String, time and time again. Internally, all numeric keys are converted to strings. The length property is simply kept at one more than the highest index. Nothing more. When you display your array, the object is iterated, and for each key, the same rules apply as for any object: first the instance is scanned, then the prototype(s)... so if we alter our code a bit:
var foo = [1,2,3,4];
foo[9] = 5;
for (var n in foo)
{
if (foo.hasOwnProperty(n))
{//check if current key is an array property
console.log(n === +(n) ? 'Number' : 'String');
}
}
You'll notice the array has only 5 own properties. Reading keys 4-8 yields undefined because no corresponding property was found in the instance or in any of the underlying prototypes. In short: arrays aren't really arrays, but objects that behave similarly.
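As a rough sketch of the same point without a loop, Object.keys lists only the own enumerable property names. The variable names sparseDemo and ownKeys are made up for this illustration:

```javascript
// Hypothetical demo array with the same shape as the example above.
const sparseDemo = [1, 2, 3, 4];
sparseDemo[9] = 5;

// Object.keys returns own enumerable property names, always as strings.
const ownKeys = Object.keys(sparseDemo);

console.log(ownKeys);           // ["0", "1", "2", "3", "9"]
console.log(sparseDemo.length); // 10
console.log(4 in sparseDemo);   // false: key 4 was never created
```

The length is 10 even though only five keys exist, which is the gap between "highest index plus one" and "number of properties".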
As Tim remarked, you can have an array instance with an undefined property that does exist within that object:
var foo = [1,2,undefined,3];
console.log(foo[2] === undefined);//true
console.log(foo[99] === undefined);//true
But again, there is a difference:
console.log((foo.hasOwnProperty('2') && foo[2] === undefined));//true
console.log((foo.hasOwnProperty('99') && foo[99] === undefined));//false
RECAP, your three main questions:
Arrays are objects that allow you to reference their properties with numeric indexes
The undefined values are not there, they're merely the default return value when JS scans an object and the prototypes and can't find what you're looking for: "Sorry, what you ask me is undefined in my book." is what it says.
Working with largely undefined arrays doesn't affect the size of the object itself, but accessing an undefined key might be very, very marginally slower, because the prototypes have to be scanned, too.
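A small sketch of what "mostly undefined" means in practice, assuming a standards-compliant engine. The names mostlyEmpty and visited are invented for the example. Built-in iteration methods such as forEach call the callback only for indexes that actually exist:

```javascript
// Hypothetical array with two real elements and a large gap between them.
const mostlyEmpty = [];
mostlyEmpty[0] = "first";
mostlyEmpty[100000] = "last";

// Count how many times forEach actually invokes the callback.
let visited = 0;
mostlyEmpty.forEach(() => {
  visited++;
});

console.log(mostlyEmpty.length); // 100001
console.log(visited);            // 2
```

This says nothing about how a given engine lays the array out in memory; it only shows that the missing entries are not treated as elements.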
Update:
Just quoting the Ecma std:
15.4 Array Objects
Array objects give special treatment to a certain class of property names. A property name P (in the form of a
String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to
2^32 − 1. A property whose property name is an array index is also called an element. Every Array object has a
length property whose value is always a nonnegative integer less than 2^32. The value of the length
property is numerically greater than the name of every property whose name is an array index; whenever a
property of an Array object is created or changed, other properties are adjusted as necessary to maintain this
invariant. Specifically, whenever a property is added whose name is an array index, the length property is
changed, if necessary, to be one more than the numeric value of that array index; and whenever the length
property is changed, every property whose name is an array index whose value is not smaller than the new
length is automatically deleted. This constraint applies only to own properties of an Array object and is
unaffected by length or array index properties that may be inherited from its prototypes.
An object, O, is said to be sparse if the following algorithm returns true:
1. Let len be the result of calling the [[Get]] internal method of O with argument "length".
2. For each integer i in the range 0≤i<ToUint32(len)
a. Let elem be the result of calling the [[GetOwnProperty]] internal method of O with argument
ToString(i).
b. If elem is undefined, return true.
3. Return false.
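The algorithm quoted above can be approximated in plain JavaScript. This is only a sketch: isSparse is a hypothetical helper name, and Object.getOwnPropertyDescriptor stands in for the internal [[GetOwnProperty]] method.

```javascript
// Hypothetical helper that mirrors the quoted "sparse" algorithm step by step.
function isSparse(obj) {
  // Step 1 plus ToUint32: read length and coerce it to an unsigned 32-bit integer.
  const len = obj.length >>> 0;
  // Step 2: look for any index below len that has no own property.
  for (let i = 0; i < len; i++) {
    if (Object.getOwnPropertyDescriptor(obj, String(i)) === undefined) {
      return true;
    }
  }
  // Step 3: every index below len exists as an own property.
  return false;
}

console.log(isSparse([1, 2, 3]));    // false
console.log(isSparse([1, , 3]));     // true: index 1 is a hole
console.log(isSparse([undefined]));  // false: the property exists, its value is undefined
```

The last call is the interesting one: an element explicitly set to undefined does not make the array sparse.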
Arrays are just an ordered list of objects. In JavaScript everything is an object, so arrays are not really arrays as we know them :)
You can find little internals here.
For your doubts about working with large arrays... Well, remember that the less calculation you do "client-side", the faster your page will be.
Answers:
An array in JavaScript is just the same as an object (i.e. an unordered collection of properties) with a magic length property and extra prototype methods (push() etc.)
No, the undefined items are not there. JavaScript has an in operator that tests for the existence of a property, which you can use to prove this. So for the following array: var arr = ['foo']; arr[2] = 'bar';, 2 in arr returns true and 1 in arr returns false.
A sparse array should take up no more memory than a dense array whose length is the number of properties actually defined in your sparse array. It will only be more expensive to work with a sparse array when you iterate over its undefined properties.
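A short sketch of the in test from the answers above, extended with delete to show that a hole can also be created after the fact. The variable name holey is made up for the example:

```javascript
// Same shape as the array described above.
const holey = ["foo"];
holey[2] = "bar";

console.log(2 in holey);   // true
console.log(1 in holey);   // false
console.log(holey[1]);     // undefined, although no property 1 exists

// delete removes the property but leaves length alone.
delete holey[0];
console.log(0 in holey);   // false
console.log(holey.length); // 3
```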
Most JavaScript implementations implement arrays as some flavor of binary tree or hash table with the array index as the key, so a large range of undefined entries does not use up any memory.
I was told that array entries come in 2 parts, [value, pointer]. So the pointer of arr[2] is null. When you add a 5 it changes the address from null to point to number 3, which points to number 4, which points to number 5, which is null (so end of array).
I'm not sure how true this is as I've never actually checked it. But it seems to make sense.
So you can't do the maths like on a C-type array (i.e. to get to value 4 just do starting memory point + 4x (object size in memory)) but you can do it by following the array piece by piece.
I played around with the Object.assign function and found out that i can create an array object fusion kinda?
the code looks like this:
var list = [1, 2, 3]
Object.assign(list, {property: true})
the list appears in brackets when i log it and i can still access the 1 with list[0], BUT I can also access the true by logging list.property and when i log typeof list i get "object".
I'm really confused rn, can someone explain to me what list is and why this happens?
Thank you :)
list is an array, which is an object. Standard arrays in JavaScript are objects, not the contiguous blocks of memory divided into fixed-size elements that they are in most programming languages. They have their own literal form, they inherit from Array.prototype, and they have a length property that adjusts itself if you increase their size (and that can remove entries if you assign to it), but other than that they're just objects. You can add non-entry properties to them, as you've discovered, because they're just objects.
Here's a simpler example:
const a = [];
a.example = "hi there";
console.log(a.example); // "hi there"
In fact, what you think of as array indexes are actually property names, and they're strings. Example:
const a = ["zero"];
for (const index in a) {
console.log(index + " (" + typeof index + ")");
}
A property name is considered an array index if it meets certain criteria:
An integer index is a String-valued property key that is a canonical numeric String (see 7.1.21) and whose numeric value is either +0 or a positive integer ≤ 2^53 - 1. An array index is an integer index whose numeric value i is in the range +0 ≤ i < 2^32 - 1.
We typically write them as numbers, and JavaScript engines optimize, but in specification terms they're strings. :-)
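To make the criteria concrete, here is a minimal sketch of an array-index test. The helper name isArrayIndex is hypothetical, and the check uses the older ES5 wording (ToString(ToUint32(P)) equals P, and the value is not 2^32 - 1), which to my understanding accepts the same string keys as the definition quoted above:

```javascript
// Hypothetical helper: does this string property name count as an array index?
function isArrayIndex(key) {
  const asUint32 = key >>> 0; // >>> 0 performs the ToUint32 conversion
  // The round trip must reproduce the key exactly, and 2^32 - 1 is excluded.
  return String(asUint32) === key && asUint32 !== 4294967295;
}

console.log(isArrayIndex("0"));          // true
console.log(isArrayIndex("01"));         // false: not the canonical form
console.log(isArrayIndex("-1"));         // false
console.log(isArrayIndex("1.5"));        // false
console.log(isArrayIndex("4294967294")); // true: the largest array index
console.log(isArrayIndex("4294967295")); // false: 2^32 - 1 is excluded

// A key that fails the test is an ordinary property and leaves length alone.
const probe = [];
probe["4294967295"] = "x";
console.log(probe.length); // 0
```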
the list appears in brackets when i log it
Typically, when you log something with a console, the console looks at it to see what it is and then bases its output on that. With most consoles, when you log an array, they just log the array entries without showing you the other properties. (That's also what you get when you use JSON.stringify, since JSON arrays can't have non-entry properties.) But the property is there, as you saw. You can also see it in for-in loops, since for-in loops through the properties of the object, not the entries of the array:
const list = [1, 2, 3];
Object.assign(list, {property: true});
for (const key in list) {
console.log(`${key} = ${list[key]}`);
}
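As a quick sketch of the JSON.stringify remark above (the variable name tagged is made up), the extra property survives on the object but is dropped from the JSON output:

```javascript
// Same construction as in the question, under a different name.
const tagged = [1, 2, 3];
Object.assign(tagged, { property: true });

console.log(JSON.stringify(tagged)); // [1,2,3]
console.log(Object.keys(tagged));    // ["0", "1", "2", "property"]
console.log(tagged.length);          // 3
```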
I'm trying to learn Javascript - here's my issue:
In the w3schools.com JavaScript array examples, they show the following example:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
An array "person" has been defined, but then they proceed to add some elements with a "named" index. Then they try to print the 0th element and the number of elements of the array to the HTML document, like you would do with a standard array.
The description says:
If you use a named index when accessing an array, JavaScript will
redefine the array to a standard object, and some array methods and
properties will produce undefined or incorrect results.
In fact, person[0] and person.length return "undefined" and "0" respectively. Even if person was initially defined as an array, by inserting new named-index elements, the array should be redefined as an object. But when I try to use the Array.isArray() method to check it, it returns true:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
document.getElementById('test').innerHTML = Array.isArray(person);// returns true
So, why? If, as specified by the tutorial, this has been effectively redefined as a standard object, and ECMAScript 5 added the .isArray() method for checking whether something is an array and nothing else, shouldn't this return false instead of true?
I'm sure i missed something. If i define person like this:
person = {};
then it returns false, as expected. What is happening here? I just wanted to understand arrays a little bit more, this confuses me. Is this just a broken array, but still an array?
Here's the example (without the Array.isarray() bit, just the default): https://www.w3schools.com/js/tryit.asp?filename=tryjs_array_associative_2
First of all I want to note that the example you took from the w3schools page on arrays, is from the "Associative Arrays" section, which has this important introduction:
Many programming languages support arrays with named indexes.
Arrays with named indexes are called associative arrays (or hashes).
JavaScript does not support arrays with named indexes.
In JavaScript, arrays always use numbered indexes.
This puts the example into context, because it really makes no sense to define a variable as an array and then use string keys. But this was an example to illustrate the point.
Does an Array become an Object?
That JavaScript still considers the variable to be an array is as expected. It becomes an array at the moment of assignment of [], and that does not change by adding properties to that object. Yes, arrays are objects. They just have additional capabilities.
The array did not lose any of its array-like capabilities, but those features just don't work on those string properties, ... only on numerical ones (more precisely, the non-negative integer ones).
You loosely quoted the following statement from w3schools:
If you use named indexes, JavaScript will redefine the array to a standard object.
That is wrong information and leads to your misunderstanding. There is no redefinition happening. When you add properties to any object, then the object does not change "type". It remains an instance of what it was before... An array remains an array, a date object remains a date, a regex object remains a regex, even if you assign other properties to it. But non-numerical properties do not "count" for an array: the length will remain unchanged when you add such properties. The length only reveals something about the numerical properties of the object.
This quote is yet another illustration of what the JavaScript community thinks about w3schools.com, i.e. that it is not the most reliable reference, even though it has its value for learning the language.
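A minimal sketch of the "no redefinition" point, using made-up variable names (people, when). Adding a named property changes neither what the object is nor its length, and the array keeps working as an array afterwards:

```javascript
// An array with a named property added to it.
const people = [];
people.firstName = "John";

console.log(Array.isArray(people)); // true: still an array
console.log(people.length);         // 0: named properties do not count

// Array behaviour is intact.
people.push("Jane");
console.log(people.length);         // 1
console.log(people[0]);             // "Jane"

// The same holds for other built-in object types.
const when = new Date(0);
when.label = "epoch";
console.log(when instanceof Date);  // true
```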
Example of adding useful properties to arrays
Having said the above, there are cases where you may intentionally want to make use of such properties on arrays. Let's for example think of an array of words that is sorted:
const arr = ["apple", "banana", "grapefruit", "orange", "pear"];
Now let's add something to this array that denotes that it is currently sorted:
arr.isSorted = true;
We could imagine a function that would allow one to add a value to this array, but which also verifies if the array is still sorted:
function addFruit(arr, fruit) {
    // Appending a value smaller than the current last one breaks the order
    if (arr.length && fruit < arr[arr.length-1]) {
        arr.isSorted = false;
    }
    arr.push(fruit);
}
Then after having added several values, it would maybe be interesting to verify whether the array needs sorting:
if (!arr.isSorted) {
    arr.sort();
    arr.isSorted = true;
}
So this extra property helps to avoid executing an unnecessary sort. But for the rest the array has all the functionality as if it did not have that extra property.
An object that is set up as an array and then filled as an object becomes a member of both classes. Methods of the Array class will apply to its 'array-ness':
Array.isArray(person);
returns true. Methods of the Object class will apply to its 'object-ness':
typeof(person);
returns object. When it could be either one, the 'array-ness' will prevail, because the variable was first defined as an array:
console.log(person);
will put Array [ ] on the console, because it runs the Array class's logging method. It is displayed as an empty array, since it has no numbered elements, but you could add some:
person[2]=66;
and then console.log would log Array [ <2 empty slots>, 66 ].
I think the polyfill implementation of isArray() will clear up your doubt to some extent.
#Polyfill
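The polyfill itself did not survive in this answer, so what follows is a sketch of the commonly cited fallback rather than the exact code the answer referred to. The function name isArrayFallback is hypothetical. It relies on Object.prototype.toString, which reports the internal kind of the object regardless of extra properties:

```javascript
// Hypothetical standalone version of the usual Array.isArray fallback.
function isArrayFallback(value) {
  return Object.prototype.toString.call(value) === "[object Array]";
}

// Install it only where the native method is missing.
if (!Array.isArray) {
  Array.isArray = isArrayFallback;
}

const person = [];
person.firstName = "John";

console.log(isArrayFallback(person)); // true: named properties change nothing
console.log(isArrayFallback({}));     // false
```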
I just want to understand how Javascript arrays work but I have a complicated problem here.
First I created my array:
var arr = [];
And set some elements in it:
arr[5] = "a thing";
arr[2] = undefined;
I thought that I should have an array of size 2, because I only have two objects at 2 specific indexes. So I tested it with the .length property of arrays:
document.write(arr.length + "<br>");
The result, interestingly, is 6. But it should contain only two items. How can its size be 6? It is probably related to the last index that I used, here arr[5] = "a thing";
I then tried to loop over it:
var size = 0;
for(var x in arr){
size++;
}
And the size variable is now 2. So, what I learned from this: if I use a for in loop, I will calculate how many properties are in it, not its last index.
But if I try to document.write(arr[4]) (which is not set yet), it writes undefined.
So why is arr[2] counted in the for..in loop, but not arr[4]?
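Let me answer my own question: I was assuming that an element that reads as undefined is the same as an element that was never set. It isn't: arr[2] exists as a property whose value is undefined, while arr[4] does not exist at all. But this is JavaScript, we need to play by its own rules :)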
jsFiddle and snippet below.
var arr = [];
arr[5] = "a thing";
arr[2] = undefined;
document.write(arr.length + "<br>");
var size = 0;
for(var x in arr){
size++;
}
document.write(size + "<br>");
document.write(arr[4] + "<br>");
Note: Array indexes are nothing but properties of Array objects.
Quoting MDN's Relationship between length and numerical properties section,
When setting a property on a JavaScript array when the property is a valid array index and that index is outside the current bounds of the array, the engine will update the array's length property accordingly.
Quoting ECMA Script 5 Specification of Array Objects,
whenever a property is added whose name is an array index, the length property is changed, if necessary, to be one more than the numeric value of that array index; and whenever the length property is changed, every property whose name is an array index whose value is not smaller than the new length is automatically deleted
So, when you set a value at index 5, JavaScript engine adjusts the length of the Array to 6.
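Both directions of the quoted rule can be seen in a short sketch. The variable name letters is invented for the example:

```javascript
const letters = ["a", "b", "c"];

// Assigning past the end bumps length to index + 1.
letters[5] = "f";
console.log(letters.length); // 6

// Shrinking length deletes every element at or beyond the new length.
letters.length = 2;
console.log(letters);        // ["a", "b"]
console.log(5 in letters);   // false
console.log(2 in letters);   // false
```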
Quoting ECMA Script 5 Specification of Array Objects,
A property name P (in the form of a String value) is an array index if and only if ToString(ToUint32(P)) is equal to P and ToUint32(P) is not equal to 2^32−1.
So, in your case 2 and 4 are valid indexes but only 2 is defined in the array. You can confirm that like this
arr.hasOwnProperty(2)
The other indexes are not defined in the array yet. So, your array object is called a sparse array object.
So why is arr[2] counted in the for..in loop while arr[4] is not?
The for..in loop enumerates all the enumerable properties of the object. In your case, only 2 and 5 exist as properties of the array, so they are counted and 4 is not.
But, when you print arr[4], it prints undefined, because JavaScript will return undefined, if you try to access a property which is not defined in an object. For example,
console.log({}['name']);
// undefined
Similarly, since 4 is not yet defined in the arr, undefined is returned.
While we are on this subject, you might want to read these answers as well,
Why doesn't the length of the array change when I add a new property?
JavaScript 'in' operator for undefined elements in Arrays
There’s a difference between a property that has the value undefined and a property that doesn’t exist, illustrated here using the in operator:
var obj = {
one: undefined
};
console.log(obj.one === undefined); // true
console.log(obj.two === undefined); // true
console.log('one' in obj); // true
console.log('two' in obj); // false
When you try to get the value of a property that doesn’t exist, you still get undefined, but that doesn’t make it exist.
Finally, to explain the behaviour you see: a for in loop will only loop over keys where that key is in the object (and is enumerable).
length, meanwhile, is just adjusted to be one more than whatever index you assign if that index is greater than or equal to the current length.
To remove undefined values from an array, try using .filter(). Note that filter(Boolean) drops every falsy value (0, "", null, NaN, false), not just undefined:
var arr = [];
arr[5] = "a thing";
arr[2] = undefined;
arr = arr.filter(Boolean);
document.write(arr.length);
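A hedged sketch comparing two predicates on a made-up array (mixed). Boolean as the callback discards every falsy value, whereas an explicit comparison removes only undefined. In both cases filter skips holes on its own:

```javascript
// Hypothetical input mixing falsy values with one real string.
const mixed = [0, undefined, "", "a thing", null, false];

// Boolean(v) is false for every falsy value, so almost everything goes.
const truthyOnly = mixed.filter(Boolean);

// Comparing against undefined keeps the other falsy values.
const definedOnly = mixed.filter((v) => v !== undefined);

console.log(truthyOnly);  // ["a thing"]
console.log(definedOnly); // [0, "", "a thing", null, false]
```

Which one is appropriate depends on whether 0, the empty string and the like are meaningful values in the data.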
It all comes down the idea of how space is handled by machines. Let's start with the simplest idea of:
var arr =[];
This in turn creates a location where you can now store information. As @Mike 'Pomax' Kamermans pointed out: this location is a special JavaScript object that in turn functions as a collection of keys and values, like so:
arr[key] = value;
Now moving on through your code:
arr[5] = "a thing";
The machine now understands that you are giving a value to the 6th position/5th key (as an array's first position is 0). So you wind up with something that looks like this:
[,,,,,"a thing"];
Those commas represent empty positions (elisions, as @RobG pointed out) in your array.
Same thing happens when you declare:
arr[2] = undefined;
[,,undefined,,,"a thing"];
So when you're iterating over the array using "for (var x in arr)" you're visiting only the positions in this array that are populated, hence 2.
By contrast, when you check the length of the array, you're asking how many positions exist in the array, which is 6.
Finally, JavaScript reports an empty position in an array as undefined, which is why arr[4] is output as such.
Hope that answered your question.
JavaScript arrays, at least in the first version, were plain object with a length property. Most of the weird behaviour you experienced is a consequence of this.
Result interesting, it is 6. But it must contain two data, how its
size can be 6? It is probably related with the latest index that I
used here arr[5] = "a thing";
It results in 6 because the length is always 1 higher than the highest index, even if there are actually fewer items in the array.
So why arr[2] is counted in for..in loop and not arr[4] is not counted?
because when you are doing:
arr[2] = undefined;
You are actually adding a key called 2 to the array object. As a result, arr[2] is counted in the for in loop, while arr[4] is ignored.
The assignment sets a property on the array, so when you use a for (var i in arr) loop, you only see properties that have been set (even if you set them to undefined). When you assign a new integer-keyed property such as arr[6], the array updates its length to 7. The memory underlying the array may or may not be reallocated accordingly, but it will be there for you when you go to use it, unless your system is out of memory.
Edited according to RobG's comment about what ECMA-262 says.
Consider the following example:
Code:
var array = [];
array[4] = "Hello World";
Result:
[undefined, undefined, undefined, undefined, "Hello World"]
It looks rather inefficient to be able to just declare where in the array you want your value to reside. Think in terms of big arrays (100,000+ indexes).
Is this actually an inefficient use of arrays in JavaScript, or are arrays handled in such a way that n indexes aren't actually declared undefined? (i.e. is this just pretty printed to illustrate the empty indexes?)
Note: The suggested duplicate of this question is wrong. This is not concerning zero-based indexes. I am already aware that arrays start from 0!
Your assumption is correct. JS arrays are objects, and an array declaration, like
a = ['a','b','c']
is just a shortcut for
a = {
"0": "a",
"1": "b",
"2": "c"
}
Respectively, a=[]; a[4]='foo' is the same as
a = {
"4": "foo"
}
The bunch of undefineds you see in the console is just an artifact of how the console dumps arrays, they don't actually exist.
The only difference between arrays and ordinary objects is that arrays have a "length" property which is handled in a special way. length is always equal to the max of numeric indexes assigned so far plus one:
a = [];
a[100] = 'x';
a.length; // 101
NB: I'm talking about an "ideal" JS implementation as described in the standard, specific implementations might provide under-the-hood optimizations and store arrays in a different way.
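To illustrate the comparison, here is a sketch with made-up names (asArray, asObject). Both hold exactly one property named "4"; only the array maintains a length:

```javascript
// An array with a single element at index 4.
const asArray = [];
asArray[4] = "foo";

// A plain object with the same single property.
const asObject = { "4": "foo" };

console.log(Object.keys(asArray));  // ["4"]
console.log(Object.keys(asObject)); // ["4"]
console.log(asArray.length);        // 5
console.log(asObject.length);       // undefined
```

The array also inherits from Array.prototype, so the equivalence holds for the stored properties rather than for the two objects as a whole.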
JavaScript arrays work that way. When you declare a new array:
var arr = [];
The length will be 0. And there will be no elements.
And if you insert at the 5th index:
arr[5] = 10;
Then the JavaScript engine leaves the previous numeric indices empty (reading them yields undefined), making the array length 6 and not 1. If you look at the array contents, you would see:
arr => array(
undefined,
undefined,
undefined,
undefined,
undefined,
10
)
I have already asked a question about this, but I am not able to find it. The question is: Wrong representation of JavaScript Array Length.
Copying the accepted answer from the above said question:
The .length is defined to be one greater than the value of the largest numeric index. (It's not just "numeric"; it's 32-bit integer values, but basically numbered properties.)
Conversely, setting the .length property to some numeric value (say, 6 in your example) has the effect of deleting properties whose property name is a number greater than or equal to the value you set it to.
It will all depend on the JavaScript interpreter as to exactly how the internal structure is handled — which would define whether it is handled in an inefficient way or not.
My simplified assumption is that actually causing the array to be printed out is the only point where those undefineds are discovered. Basically meaning that it is only through access that those particular indexes are found to have a value of undefined (i.e. not allocated). Basically the following would be different (in my thinking):
var a = []; a[100] = 1;
to that of:
var a = []; for ( var i=0; i<=99; i++ ) { a[i] = undefined; } a[100] = 1;
The latter would have all 100 indexes taking up recorded space. Whereas the former would only have one index recorded.
Basically until an index is defined, it won't take up stored space. The internal array structure will just record the indexes that are defined and the largest index.
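The difference between the two snippets above can at least be observed at the property level. This sketch uses invented names (withHole, withUndefined) and makes no claim about what any particular engine allocates:

```javascript
// Only index 100 is ever assigned.
const withHole = [];
withHole[100] = 1;

// Every index from 0 to 99 is explicitly assigned undefined first.
const withUndefined = [];
for (let i = 0; i <= 99; i++) {
  withUndefined[i] = undefined;
}
withUndefined[100] = 1;

console.log(withHole.length === withUndefined.length); // true, both 101
console.log(Object.keys(withHole).length);             // 1
console.log(Object.keys(withUndefined).length);        // 101
console.log(50 in withHole, 50 in withUndefined);      // false true
```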
The reality, I'm sure, is more involved; take this overview of the V8 engine (I can't link directly to the Array section because someone messed up their anchor IDs ;). It seems there are two types of storage in the V8 engine: one designed to handle arrays that are linear and well defined, and the other to handle the type we are talking about, which won't work in a linear indexed manner and uses "hashed" key locations instead.
http://www.html5rocks.com/en/tutorials/speed/v8/#toc-topic-numbers
I've had a bit of a wakeup to the nature of JavaScript array indexes recently. Pursuing it, I found the following (I'm working with Node.js in interactive mode here):
var x=[];
x['a']='a';
console.log(x); // Yields [ a: 'a' ]
console.log(x.length); // yields 0 not 1
x[1]=1;
console.log(x); // Yields [ , 1, a: 'a' ]
console.log(x.length); // Yields 2 not 3 (one for empty 0 space, one for the occupied 1 space)
Is a: 'a' really what it looks like - an object property embedded in an array - and, thus, isn't counted in the array property .length?
In JavaScript, arrays are just objects with some special properties, such as an automatic length property, and some methods attached (such as sort, pop, join, etc.). Indeed, a will not be counted in your array, since the length property of an array only takes into account properties whose name can be represented as a non-negative 32-bit integer.
It does not count them, though: length is always one higher than the highest such property name, whether or not the elements below it have been defined. Thanks @Felix Kling for correcting me about this in the comments.
Adding properties such as a is not forbidden at all, but you should watch out with them, since it might be confusing when reading your code.
There's also a difference in walking through the elements in the array:
To walk through all the numbered elements:
for (var i=0; i<myArray.length; i++) {
//do something
}
To walk through every property that's not built-in:
for (var i in myArray) {
//do something
}
Note that this loop will also include anything that's included from Array.prototype that's not a built-in method. So, if you were to add Array.prototype.sum = function() {/*...*/};, it will also be looped through.
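A sketch of that pitfall and the usual guard. Extending Array.prototype here is for demonstration only and is undone afterwards; the names nums, allKeys and ownOnly are made up:

```javascript
// Demo only: add an enumerable method to every array, as described above.
Array.prototype.sum = function () {
  return this.reduce((total, n) => total + n, 0);
};

const nums = [1, 2, 3];

// A bare for-in loop also reports the inherited "sum" key.
const allKeys = [];
for (const key in nums) {
  allKeys.push(key);
}

// Filtering on own properties keeps only what lives on the array itself.
const ownOnly = [];
for (const key in nums) {
  if (Object.prototype.hasOwnProperty.call(nums, key)) {
    ownOnly.push(key);
  }
}

// Undo the prototype change so nothing else is affected.
delete Array.prototype.sum;

console.log(allKeys); // ["0", "1", "2", "sum"]
console.log(ownOnly); // ["0", "1", "2"]
```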
To find out if the object you're using is indeed an array, and not just an object, you could perform the following test:
if (Object.prototype.toString.call(myObject) === '[object Array]') {
//myObject is an array
} else if (typeof myObject === 'object') {
//myObject is some other kind of object
}
See @artem's comment: myObject instanceof Array might not always work correctly.
That's correct, and it's a good illustration of the fact that that new Array you've created is really just a special kind of Object. In fact, typeof [] is 'object'! Setting a named property on it (which might be written more cleanly here as x.a = 'a') is really setting a new property to the object wrapper around your "real" array (numbered properties, really). They don't affect the length property for the same reason that the Array.isArray method doesn't.
Yes, that's correct. This works because pretty much everything in JavaScript is an object, including arrays and functions, which means that you can add your own arbitrary string properties. That's not to say that you should do this.
Arrays ([]) should be indexed using nonnegative integers. Objects ({}) should be "indexed" using strings.