How best to do a JavaScript array with non-consecutive indexes? - javascript

I'm writing a Google Chrome extension, in JavaScript, and I want to use an array to store a bunch of objects, but I want the indexes to be specific non-consecutive ID numbers.
(This is because I need to be able to efficiently look up the values later, using an ID number that comes from another source outside my control.)
For example:
var myObjects = [] ;
myObjects[471] = {foo: "bar"} ;
myObjects[3119] = {hello: "goodbye"}
When I do console.log(myObjects), in the console I see the entire array printed out, with all the thousands of 'missing' indexes showing undefined.
My question is: does this matter? Is this wasting any memory?
And even if it's not wasting memory, surely whenever I loop over the array, it wastes CPU if I have to manually skip over every missing value?
I tried using an object instead of an array, but it seems you can't use numbers as object keys. I'm hoping there's a better way to achieve this?

First of all, everyone, please learn that what the for-in statement does is called enumeration (though it's an IterationStatement) in order to differentiate from iteration. This is very important, because it leads to confusion especially among beginners.
To answer the OP's question: It doesn't take up more space (test) (you could say it's implementation dependent, but we're talking about a Google Chrome Extension!), and it isn't slower either (test).
Yet my advice is: Use what's appropriate!
In this situation: use objects!
What you want to do with them is clearly a hashing mechanism, keys are converted to strings anyway so you can safely use object for this task.
I won't show you a lot of code, other answers do it already, I've just wanted to make things clear.
// keys are converted to strings
// (numbers can be used safely)
var obj = {}
obj[1] = "value"
alert(obj[1]) // "value"
alert(obj["1"]) // "value"
Note on sparse arrays
The main reason a sparse array will NOT waste any space is that the specification doesn't require it to. At no point does it require property accessors to check whether the internal [[Class]] property is "Array" and then create every element for 0 <= i < len with the value undefined. The missing elements just happen to read as undefined when the toString method iterates over the array. It basically means they are not there.
11.2.1 Property Accessors
The production MemberExpression : MemberExpression [ Expression ] is evaluated as follows:
Let baseReference be the result of evaluating MemberExpression.
Let baseValue be GetValue(baseReference).
Let propertyNameReference be the result of evaluating Expression.
Let propertyNameValue be GetValue(propertyNameReference).
Call CheckObjectCoercible(baseValue).
Let propertyNameString be ToString(propertyNameValue).
If the syntactic production that is being evaluated is contained in strict mode code, let strict be true, else let strict be false.
Return a value of type Reference whose base value is baseValue and whose referenced name is propertyNameString, and whose strict mode flag is strict.
The production CallExpression : CallExpression [ Expression ] is evaluated in exactly the same manner, except that the contained CallExpression is evaluated in step 1.
ECMA-262 5th Edition (http://www.ecma-international.org/publications/standards/Ecma-262.htm)
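A minimal sketch of what "they are not there" means in practice, reusing the two IDs from the question. The variable names are illustrative only, and the sketch shows which properties exist, not how a particular engine lays them out in memory.

```javascript
// Only two indexes are ever assigned; everything else is a hole.
var sparse = [];
sparse[471] = { foo: "bar" };
sparse[3119] = { hello: "goodbye" };

// The holes are not properties of the array at all.
var definedKeys = Object.keys(sparse); // the two assigned indexes, as strings
var zeroExists = 0 in sparse;          // is there a property named "0"?
var lengthValue = sparse.length;       // highest index plus one

// forEach skips holes, so the callback runs once per assigned index.
var visits = 0;
sparse.forEach(function () { visits += 1; });

console.log(definedKeys, zeroExists, lengthValue, visits);
```

So a loop written with forEach (or Object.keys) touches only the assigned entries, while a plain counting for loop would still step through every index up to length.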

You can simply use an object instead here, having keys as integers, like this:
var myObjects = {};
myObjects[471] = {foo: "bar"};
myObjects[3119] = {hello: "goodbye"};
This allows you to store anything on the object, functions, etc. To enumerate (since it's an object now) over it you'll want a different syntax though, a for...in loop, like this:
for(var key in myObjects) {
if(myObjects.hasOwnProperty(key)) {
console.log("key: " + key, myObjects[key]);
}
}
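One detail to keep in mind when enumerating: the keys come back as strings, even though they were assigned as numbers. A small sketch, assuming you want the IDs back as numbers (the ids array is only for illustration):

```javascript
var myObjects = {};
myObjects[471] = { foo: "bar" };
myObjects[3119] = { hello: "goodbye" };

var ids = [];
for (var key in myObjects) {
  if (Object.prototype.hasOwnProperty.call(myObjects, key)) {
    // key is a string such as "471" here, so convert it back if needed
    ids.push(Number(key));
  }
}

// Numeric and string access reach the same property.
var sameProperty = myObjects[471] === myObjects["471"];

console.log(ids, sameProperty);
```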
For your other specific questions:
My question is: does this matter? Is this wasting any memory?
Yes, it wastes a bit of memory for the allocation (more so when iterating over it), though not much. Whether it matters depends on how spaced out the keys are.
And even if it's not wasting memory, surely whenever I loop over the array, it wastes CPU if I have to manually skip over every missing value?
Yup, extra cycles are used here.
I tried using an object instead of an array, but it seems you can't use numbers as object keys. I'm hoping there's a better way to achieve this?
Sure you can! See above.

I would use an object to store these. You can use numbers for properties using subscript notation but you can't using dot notation; the object passed as the key using subscript notation has toString() called on it.
var obj = {};
obj[471] = {foo: "bar"} ;

As I understand it from my reading of Crockford's "The Good Parts," this does not particularly waste memory, since JavaScript arrays are more like a special kind of key-value collection than an actual array. The array's length is defined not as the number of elements actually stored, but as one more than the highest-numbered index in the array.
But you are right that a loop would iterate through all possible values until it gets to the array's length. Better to do as you mention, and use an object with numeric keys. This is possible by using the syntax myObj['x'] = y, where x stands for some integer, e.g. myObj['5'] = poodles. Basically, convert your index to a string and you're fine to use it as an object key.

It would be implementation dependent, but I don't think you need to worry about wasted memory for the "in between" indices. The developer tools don't represent how the data is necessarily stored.
Regarding iterating over them, yes, you would be iterating over everything in between when using a for loop.
If the sequential order isn't important, then definitely use a plain Object instead of an Array. And yes, you can use numeric names for the properties.
var myObjects = {} ;
myObjects["471"] = {foo: "bar"} ;
myObjects["3119"] = {hello: "goodbye"};
Here I used Strings for the names since you said you were having trouble with the numbers. They ultimately end up represented as strings when you loop anyway.
Now you'll use a for-in statement to iterate over the set, and you'll only get the properties you've defined.
EDIT:
With regard to console.log() displaying indices that shouldn't be there, here's an example of how easy it is to trick the developer tools into thinking you have an Array.
var someObj = {};
someObj.length = 11;
someObj.splice = function(){};
someObj[10] = 'val';
console.log(someObj);
Clearly this is an Object, but Firebug and the Chrome dev tools will display it as an Array with 11 members.
[undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, undefined, "val"]
So you can see that the console doesn't necessarily reflect how/what data is actually stored.

I would simply use a constant prefix, to avoid such problems.
var myObjects = {};
myObjects['objectId_'+365] = {test: 3};
This way the keys are unambiguously plain string property names on an ordinary JS object.

You could attempt something like this to make it loud and clear to the JIT compiler that this is a more object-like array:
window.SparseArray = function (len) {
  if (typeof len === 'number') {
    // (-1 >>> 0) is 4294967295, the largest valid array length
    if (0 <= len && len <= (-1 >>> 0)) {
      this.length = len;
    } else {
      new Array(len); // throws a RangeError: Invalid array length
    }
  } else {
    this.push.apply(this, arguments);
  }
};
window.SparseArray.prototype = Array.prototype;

Related

Is there an identity index value in JavaScript?

In JavaScript, values of objects and arrays can be indexed like the following: objOrArray[index]. Is there an identity "index" value?
In other words:
Is there a value of x that makes the following always true?
let a = [1, 2, 3, 4];
/* Is this true? */ a[x] == a
let b = { a: 1, b: 2, c: 3 };
/* Is this true? */ b[x] == b
Definition of an identity in this context: https://en.wikipedia.org/wiki/Identity_function
There is no such thing built-in, as there is rarely a need for it (and sometimes even a need against it).[0] It is nevertheless possible to roll your own ‘identity’ key:
const self = Symbol('self');
Object.defineProperty(Object.prototype, self, {
enumerable: false,
get() { "use strict"; return this; }
});
This will work on all primitives (other than null and undefined) and most JavaScript objects: that is, other than proxies or those that bypass the usual prototype chain by means of e.g. Object.create(null). Any object later in the prototype chain will also be able to disable the functionality, e.g. by doing { [self]: void 0 }; all these caveats mean that x[self] === x is by no means a universal law. But this is probably the best you can do.
Modifying Object.prototype is usually considered a bad idea, but the above manages to avoid most of its badness: adding the property at a symbol key (and making it explicitly non-enumerable as well) prevents it from unexpectedly showing up in iterations and lookups that walk the prototype chain, helping ensure no code should be impacted that does not specifically look for this property.
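A short sketch of how the key behaves, including the two caveats mentioned above. It repeats the definition so that it runs on its own; configurable: true is an assumption added here only so the property can be removed again at the end.

```javascript
const self = Symbol("self");
Object.defineProperty(Object.prototype, self, {
  enumerable: false,
  configurable: true, // assumption: lets us undo the change below
  get() { "use strict"; return this; }
});

const list = [1, 2, 3];
const worksOnArray = list[self] === list;
// In a strict-mode getter, `this` is the primitive itself, not a wrapper object.
const worksOnPrimitive = "abc"[self] === "abc";

// Caveat 1: objects without Object.prototype in their chain never see the getter.
const bare = Object.create(null);
const bareResult = bare[self];

// Caveat 2: an own property under the same key shadows the getter.
const shadowed = { [self]: void 0 };
const shadowedResult = shadowed[self];

delete Object.prototype[self]; // undo the modification

console.log(worksOnArray, worksOnPrimitive, bareResult, shadowedResult);
```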
0 Even if such a feature existed, it would not be a good solution for the asker’s original use case: a ‘cut to 50 characters or take the whole string if shorter’ operation can be expressed as s.description.substring(0, s.description.length > 50 ? 50 : void 0) (or in fact just s.description.substring(0, 50)). It wouldn’t be any easier to express even with such a feature: depending on the condition, you still need to invoke the substring method, not just look it up, but not invoke the ‘self’ non-method. And given that you need to append an ellipsis at the end in the former case, you would still have to perform the condition check outside the substring call, making any shorthand rather ineffective. All that said, tricks like described in this answer do find some real use.
The indexing operation doesn't have an identity element. The domain and range of indexing is not necessarily the same -- the domain is arrays and objects, but the range is any type of object, since array elements and object properties can hold any type. If you have an array of integers, the domain is Array, while the range is Integer, so it's not possible for there to be an identity. a[x] will always be an integer, which can never be equal to the array itself.
And even if you have an array of arrays, there's no reason to expect any of the elements to be a reference to the array itself. It's possible to create self-referential arrays like this, but most are not. And even if it is, the self-reference could be in any index, so there's no unique identity value.
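To make the last point concrete, here is a small sketch of two self-referential arrays (the names are illustrative). Each contains itself, but at a different index, so no single index value could serve as an identity:

```javascript
const nested = [[1, 2], [3, 4]];
nested.push(nested);                    // the array now contains itself at index 2
const selfAtTwo = nested[2] === nested;

const other = [];
other[0] = other;                       // here the self-reference sits at index 0
const selfAtZero = other[0] === other;

// The index that works for one array does not work for the other.
const zeroWorksForNested = nested[0] === nested;

console.log(selfAtTwo, selfAtZero, zeroWorksForNested);
```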

Why does the isArray() JavaScript method in this example return true, if the array has been redefined as a standard object?

I'm trying to learn Javascript - here's my issue:
In the w3schools.com JavaScript array examples, they show the following example:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
An array "person" has been defined, but then they proceed to add some elements with a "named" index. Then they try to print to the HTML document the 0th element and the number of elements of the array, like you would do with a standard array.
The description says:
If you use a named index when accessing an array, JavaScript will
redefine the array to a standard object, and some array methods and
properties will produce undefined or incorrect results.
In fact, person[0] and person.length return "undefined" and "0" respectively. Even if person was initially defined as an array, by inserting new elements with named indexes, the array should have been redefined as an object. But when I try to use the Array.isArray() method to check it, it returns true:
var person = [];
person["firstName"] = "John";
person["lastName"] = "Doe";
person["age"] = 46;
document.getElementById("demo").innerHTML =
person[0] + " " + person.length;
document.getElementById('test').innerHTML = Array.isArray(person);// returns true
So, why? If, as specified by the tutorial, this has effectively been redefined as a standard object, and ECMAScript 5 added the Array.isArray() method for checking whether something is an array and nothing else, shouldn't this return false instead of true?
I'm sure I missed something. If I define person like this:
person = {};
then it returns false, as expected. What is happening here? I just wanted to understand arrays a little bit more, this confuses me. Is this just a broken array, but still an array?
Here's the example (without the Array.isArray() bit, just the default): https://www.w3schools.com/js/tryit.asp?filename=tryjs_array_associative_2
First of all I want to note that the example you took from the w3schools page on arrays, is from the "Associative Arrays" section, which has this important introduction:
Many programming languages support arrays with named indexes.
Arrays with named indexes are called associative arrays (or hashes).
JavaScript does not support arrays with named indexes.
In JavaScript, arrays always use numbered indexes.
This puts the example into context, because it really makes no sense to define a variable as an array and then use string keys. But this was an example to illustrate the point.
Does an Array become an Object?
That JavaScript still considers the variable to be an array is as expected. It becomes an array at the moment of assignment of [], and that does not change by adding properties to that object. Yes, arrays are objects. They just have additional capabilities.
The array did not lose any of its array-like capabilities, but those features just don't work on those string properties, ... only on numerical ones (more precisely, the non-negative integer ones).
You loosely quoted the following statement from w3schools:
If you use named indexes, JavaScript will redefine the array to a standard object.
That is wrong information and leads to your misunderstanding. There is no redefinition happening. When you add properties to any object, then the object does not change "type". It remains an instance of what it was before... An array remains an array, a date object remains a date, a regex object remains a regex, even if you assign other properties to it. But non-numerical properties do not "count" for an array: the length will remain unchanged when you add such properties. The length only reveals something about the numerical properties of the object.
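A minimal sketch of these statements, using the person array from the question (the result variable names are only for illustration):

```javascript
var person = [];
person["firstName"] = "John";
person["age"] = 46;

var stillArray = Array.isArray(person); // adding properties does not change the type
var lengthAfterNamed = person.length;   // named properties are not counted
var firstName = person.firstName;       // but they are stored, as on any object

person.push("x");                       // array features keep working as before
var lengthAfterPush = person.length;

console.log(stillArray, lengthAfterNamed, firstName, lengthAfterPush);
```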
This quote is yet another illustration of what the JavaScript community thinks about w3schools.com, i.e. that it is not the most reliable reference, even though it has its value for learning the language.
Example of adding useful properties to arrays
Having said the above, there are cases where you may intentionally want to make use of such properties on arrays. Let's for example think of an array of words that is sorted:
const arr = ["apple", "banana", "grapefruit", "orange", "pear"];
Now let's add something to this array that denotes that it is currently sorted:
arr.sorted = true;
We could imagine a function that would allow one to add a value to this array, but which also verifies if the array is still sorted:
function addFruit(arr, fruit) {
if (arr.length && fruit < arr[arr.length-1]) {
arr.sorted = false;
}
arr.push(fruit);
}
Then after having added several values, it would maybe be interesting to verify whether the array needs sorting:
if (!arr.sorted) arr.sort();
So this extra property helps to avoid executing an unnecessary sort. But for the rest the array has all the functionality as if it did not have that extra property.
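Putting the pieces together, a runnable sketch of the whole flow might look like this. The fruit values and the step that resets the flag after sorting are assumptions added for illustration:

```javascript
function addFruit(arr, fruit) {
  // If the new value sorts before the current last one, the order is broken.
  if (arr.length && fruit < arr[arr.length - 1]) {
    arr.sorted = false;
  }
  arr.push(fruit);
}

const fruits = ["apple", "banana", "pear"];
fruits.sorted = true;

addFruit(fruits, "plum");    // sorts after "pear", so the flag stays true
const sortedAfterPlum = fruits.sorted;

addFruit(fruits, "cherry");  // sorts before "plum", so the flag flips
const sortedAfterCherry = fruits.sorted;

if (!fruits.sorted) {
  fruits.sort();
  fruits.sorted = true;      // assumption: reset the flag once re-sorted
}

console.log(fruits.join(","), fruits.length);
```

Note that the sorted property does not count towards length and is untouched by sort().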
An object that is set up as an array and then filled as an object becomes a member of both classes. Methods of the Array class will apply to its 'array-ness':
Array.isArray(person);
returns true. Methods of the Object class will apply to its 'object-ness':
typeof(person);
returns object. When it could be either one, the 'array-ness' will prevail, because the variable was first defined as an array:
console.log(person);
will put Array [ ] on the console, because it runs the Array class's logging method. It is displayed as an empty array, since it has no numbered elements, but you could add some:
person[2]=66;
and then console.log would log Array [ <2 empty slots>, 66 ].
I think the polyfill implementation of isArray() will clear up your doubt to some extent.
#Polyfill
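The polyfill itself is not reproduced above. As an assumption about what the answer had in mind, the commonly used fallback checks the object's built-in class tag through Object.prototype.toString; the function name isArrayFallback below is hypothetical.

```javascript
// Hypothetical stand-in for an Array.isArray polyfill.
function isArrayFallback(value) {
  return Object.prototype.toString.call(value) === "[object Array]";
}

var person = [];
person["firstName"] = "John";

// The tag is fixed when the array is created; adding properties does not change it.
var personIsArray = isArrayFallback(person);
var plainIsArray = isArrayFallback({});

console.log(personIsArray, plainIsArray);
```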

Why are array offsets considered own properties?

In my serialization code, I stumbled across a stinky issue: as I loop through generic object properties, it also serializes array indexes, which is really not the plan. I serialize this data later on without saving the indexes in the stream.
[1].hasOwnProperty("0") // true
So my question is, why are array indexes considered own properties by the hasOwnProperty method? Is there even a way to tell property from array offset? A generic way that also works for TypedArray, HTMLElementCollection and whatever else?
Of course, this can be done, but it stinks:
for(var i in this) {
if(this.hasOwnProperty(i) &&
// If object is an array, we ignore the number offsets as they're not meant to be object properties
(typeof this.length!="number" || !(i<this.length) || i.length==0)) {
// ... serialize this[i] here ...
}
}
And yeah, the i.length==0 is there because you can actually do this:
var obj = {};
obj[""] = "something";
console.log(obj);
Yeah, you're welcome, enjoy your nightmares.
Arrays are objects, just slightly specialised. And as you have discovered, the indexes of an array are just properties called 0, 1, 2 etc.
On a really simple level, the length property just finds the highest numeric property and adds one.
You could make a slightly simpler way of filtering the keys, along the lines of
for (var key in obj) {
if (isNaN(+key) && obj.hasOwnProperty(key)) {
// key is an own, non-numeric property
doSomething();
}
}
Depends if you want to include the numeric properties of objects. It would be perfectly valid to do a = {'0': 'value'}, which is for the purpose of this exercise the same as b = ['value']. Although b has a length property and a does not, also b has all the other functions that come from being an array.
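The isNaN(+key) test also drops keys such as "" or "1.5", which are not array indexes. A stricter sketch, assuming the goal is to skip only canonical array indexes ("0", "1", ... up to 2^32 - 2), could look like this. Both helper names are hypothetical, and like the loop above it would also skip index-like keys on plain objects.

```javascript
// Hypothetical helper: true only for canonical array index strings.
function isArrayIndex(key) {
  var n = key >>> 0; // convert to an unsigned 32-bit integer
  return String(n) === key && n !== 4294967295;
}

// Hypothetical helper: own enumerable keys that are not array indexes.
function ownNonIndexKeys(obj) {
  var keys = [];
  for (var key in obj) {
    if (Object.prototype.hasOwnProperty.call(obj, key) && !isArrayIndex(key)) {
      keys.push(key);
    }
  }
  return keys;
}

var list = ["a", "b"];
list.label = "letters";
list[""] = "empty key";

var nonIndexKeys = ownNonIndexKeys(list);
console.log(nonIndexKeys);
```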

JavaScript array sizing and undefined values

Consider the following example:
Code:
var array = [];
array[4] = "Hello World";
Result:
[undefined, undefined, undefined, undefined, "Hello World"]
It looks rather inefficient to be able to just declare where in the array you want your value to reside. Think in terms of big arrays (100,000+ indexes).
Is this actually an inefficient use of arrays in JavaScript, or are arrays handled in such a way that n indexes aren't actually declared undefined? (i.e. is this just pretty printed to illustrate the empty indexes?)
Note: The suggested duplicate of this question is wrong. This is not concerning zero-based indexes. I am already aware that arrays start from 0!
Your assumption is correct. JS arrays are objects, and an array declaration, like
a = ['a','b','c']
is just a shortcut for
a = {
"0": "a",
"1": "b",
"2": "c"
}
Respectively, a=[]; a[4]='foo' is the same as
a = {
"4": "foo"
}
The bunch of undefineds you see in the console is just an artifact of how the console dumps arrays, they don't actually exist.
The only difference between arrays and ordinary objects is that arrays have a "length" property which is handled in a special way. length is always equal to the max of numeric indexes assigned so far plus one:
a = [];
a[100] = 'x';
a.length; // 101
NB: I'm talking about an "ideal" JS implementation as described in the standard, specific implementations might provide under-the-hood optimizations and store arrays in a different way.
JavaScript arrays work that way. When you declare a new array:
var arr = [];
The length will be 0. And there will be no elements.
And if you insert at the 5th index:
arr[5] = 10;
Then the JavaScript engine treats the previous numeric indices as empty (reading them yields undefined), which makes the array length 6 and not 1. If you print the array contents, you would see:
arr => array(
undefined,
undefined,
undefined,
undefined,
undefined,
10
)
I have already asked a question about this, but I am not able to find it. The question is: Wrong representation of JavaScript Array Length.
Copying the accepted answer from the above said question:
The .length is defined to be one greater than the value of the largest numeric index. (It's not just "numeric"; it's 32-bit integer values, but basically numbered properties.)
Conversely, setting the .length property to some numeric value (say, 6 in your example) has the effect of deleting properties whose property name is a number greater than or equal to the value you set it to.
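Both directions of this rule can be seen in a short sketch (the variable names are illustrative):

```javascript
var arr = [];
arr[5] = 10;
var lengthAfterAssign = arr.length; // largest index (5) plus one

arr[2] = "kept";
arr.length = 3;                     // removes every index >= 3

var fiveStillThere = 5 in arr;      // index 5 was deleted by the assignment above
var twoStillThere = 2 in arr;       // index 2 is below the new length and survives

console.log(lengthAfterAssign, fiveStillThere, twoStillThere);
```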
It will all depend on the JavaScript interpreter as to exactly how the internal structure is handled — which would define whether it is handled in an inefficient way or not.
My simplified assumption is that actually causing the array to be printed out is the only point where those undefineds are discovered. Basically meaning that it is only through access that those particular indexes are found to have a value of undefined (i.e. not allocated). Basically the following would be different (in my thinking):
var a = []; a[100] = 1;
to that of:
var a = []; for ( var i=0; i<=99; i++ ) { a[i] = undefined; } a[100] = 1;
The latter would have all 100 indexes taking up recorded space. Whereas the former would only have one index recorded.
Basically until an index is defined, it won't take up stored space. The internal array structure will just record the indexes that are defined and the largest index.
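The difference between the two variants can be observed from script, as in the sketch below. It only shows which indexes exist as properties; how much memory an engine actually reserves for either array is implementation dependent.

```javascript
var sparse = [];
sparse[100] = 1;

var filled = [];
for (var i = 0; i <= 99; i++) { filled[i] = undefined; }
filled[100] = 1;

// Both report the same length and read undefined at index 50...
var sameLength = sparse.length === filled.length;
var bothUndefined = sparse[50] === undefined && filled[50] === undefined;

// ...but only the filled array actually has a property at index 50.
var sparseHas50 = 50 in sparse;
var filledHas50 = 50 in filled;
var sparseKeyCount = Object.keys(sparse).length;
var filledKeyCount = Object.keys(filled).length;

console.log(sameLength, bothUndefined, sparseHas50, filledHas50, sparseKeyCount, filledKeyCount);
```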
The reality, I'm sure, is more involved; take this overview of the V8 engine (I can't link directly to the Array section because someone messed up their anchor IDs). It seems there are two types of storage in the V8 engine: one designed to handle arrays that are linear and well defined, and the other to handle the type we are talking about, which won't be stored in a linear indexed manner but will use "hashed" key locations instead.
http://www.html5rocks.com/en/tutorials/speed/v8/#toc-topic-numbers

JavaScript arrays: string indexed items

I've had a bit of a wakeup to the nature of JavaScript array indexes recently. Pursuing it, I found the following (I'm working with Node.js in interpretive mode here):
var x=[];
x['a']='a';
console.log(x); // Yields [ a: 'a' ]
console.log(x.length); // yields 0 not 1
x[1]=1;
console.log(x); // Yields [ , 1, a: 'a' ]
console.log(x.length); // Yields 2 not 3 (one for empty 0 space, one for the occupied 1 space)
Is a: 'a' really what it looks like - an object property embedded in an array - and, thus, isn't counted in the array property .length?
In JavaScript, arrays are just objects with some special properties, such as an automatic length property, and some methods attached (such as sort, pop, join, etc.). Indeed, a will not be counted in your array, since the length property of an array only stores the amount of elements with a property name that can be represented with a 32-bit positive integer.
More precisely, the length property stores one more than the highest array index in use (a non-negative 32-bit integer property name); the indexes below it do not all have to be defined. Thanks @Felix Kling for correcting me about this in the comments.
Adding properties such as a is not forbidden at all, but you should watch out with them, since it might be confusing when reading your code.
There's also a difference in walking through the elements in the array:
To walk through all the numbered elements:
for (var i=0; i<myArray.length; i++) {
//do something
}
To walk through every property that's not built-in:
for (var i in myArray) {
//do something
}
Note that this loop will also include anything that's included from Array.prototype that's not a built-in method. So, if you were to add Array.prototype.sum = function() {/*...*/};, it will also be looped through.
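A sketch of that effect, and of filtering it out with hasOwnProperty. Extending Array.prototype is done here only to illustrate the point, and the method is removed again at the end.

```javascript
// Adding an enumerable method to Array.prototype (for illustration only).
Array.prototype.sum = function () {
  return this.reduce(function (a, b) { return a + b; }, 0);
};

var myArray = [1, 2, 3];
var allKeys = [];
var ownKeys = [];
for (var k in myArray) {
  allKeys.push(k); // includes the inherited "sum"
  if (Object.prototype.hasOwnProperty.call(myArray, k)) {
    ownKeys.push(k); // only the array's own indexes
  }
}

delete Array.prototype.sum; // clean up

console.log(allKeys, ownKeys);
```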
To find out if the object you're using is indeed an array, and not just an object, you could perform the following test:
if (Object.prototype.toString.call(myObject) === '[object Array]') {
//myObject is an array
} else if (typeof myObject === 'object') {
//myObject is some other kind of object
}
See @artem's comment: myObject instanceof Array might not always work correctly.
That's correct, and it's a good illustration of the fact that that new Array you've created is really just a special kind of Object. In fact, typeof [] is 'object'! Setting a named property on it (which might be written more cleanly here as x.a = 'a') is really setting a new property to the object wrapper around your "real" array (numbered properties, really). They don't affect the length property for the same reason that the Array.isArray method doesn't.
Yes, that's correct. This works because pretty much everything in JavaScript is an object, including arrays and functions, which means that you can add your own arbitrary string properties. That's not to say that you should do this.
Arrays ([]) should be indexed using nonnegative integers. Objects ({}) should be "indexed" using strings.
