I was searching for a sorted map in JavaScript and found the collections.js implementation, SortedMap. I also see that they have something called SortedArrayMap.
In the description of SortedArrayMap it says:
A map of key value pairs, sorted by key, backed by an array.
(Source)
And in the description of SortedMap it says:
A map with entries sorted by key.
What exactly does "backed by an array" mean, and what is the difference between it and the normal SortedMap?
As stated on http://www.collectionsjs.com/map, maps work like dictionaries but also accept objects as keys.
Reading the Dict documentation on http://www.collectionsjs.com/dict we understand that:
A dictionary is a specialized map. The keys are required to be strings.
An array is simply a data structure where you can insert anything using a simple push(thingToInsert). Items inserted into an Array are just values without a key to point to them.
At first glance, SortedArrayMap uses binary search to keep its entries in order and is backed by an array, which means it uses an array as its underlying data structure for storing the keys. In other words, it is a map built on the SortedArray class.
SortedMap, by contrast, is a general key/value collection in which you access values by their key rather than by their position in an array.
Anyway, if you want to dive a bit deeper into this topic, you can find a very good explanation on the official website of collections.js.
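To make "backed by an array" concrete, here is a minimal sketch of a map that keeps its entries in a sorted array and finds keys by binary search. The class name TinySortedArrayMap and all of its methods are hypothetical and written for illustration only; this is not the collections.js source, and it assumes keys that can be compared with `<`.

```javascript
// Hypothetical illustration only; not the collections.js implementation.
class TinySortedArrayMap {
  constructor() {
    this.entries = []; // array of [key, value] pairs, kept sorted by key
  }

  // Binary search: returns the index of key, or the index where it belongs.
  indexOf(key) {
    let low = 0;
    let high = this.entries.length;
    while (low < high) {
      const mid = (low + high) >>> 1;
      if (this.entries[mid][0] < key) low = mid + 1;
      else high = mid;
    }
    return low;
  }

  set(key, value) {
    const i = this.indexOf(key);
    if (i < this.entries.length && this.entries[i][0] === key) {
      this.entries[i][1] = value; // key already present: overwrite
    } else {
      this.entries.splice(i, 0, [key, value]); // insert, shifting later entries
    }
  }

  get(key) {
    const i = this.indexOf(key);
    return i < this.entries.length && this.entries[i][0] === key
      ? this.entries[i][1]
      : undefined;
  }

  keys() {
    return this.entries.map(([k]) => k);
  }
}

const m = new TinySortedArrayMap();
m.set("pear", 3);
m.set("apple", 1);
m.set("mango", 2);
console.log(m.keys()); // ["apple", "mango", "pear"]
console.log(m.get("mango")); // 2
```

In this sketch a lookup takes a logarithmic number of comparisons, while an insert may have to shift every later entry, which is the usual trade-off of an array-backed sorted structure.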
If we can make key/value pairs with plain javascript objects, then what is the reason to use the new ES6 Map object?
When should I use one and when the other? Is Map limited to values or can it contain functions as well?
1. Anything can be used as a key in a map.
2. Maps are ordered, and that allows for iteration.
3. Combining 1 and 2, when you iterate over a map, you'll get a useful array of key-value pairs!
Check out the Map.prototype.forEach() documentation.
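A short sketch of what iterating a Map looks like; the variable names and sample data are made up for illustration.

```javascript
const ages = new Map();
ages.set("zoe", 31);
ages.set("adam", 27);

// Spreading a Map yields [key, value] pairs in insertion order (not key order).
const pairs = [...ages];
console.log(pairs); // [["zoe", 31], ["adam", 27]]

// forEach passes the value first, then the key.
const seen = [];
ages.forEach((value, key) => {
  seen.push(key + "=" + value);
});
console.log(seen); // ["zoe=31", "adam=27"]
```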
Source: Another good question/answer exchange. Worth marking this one as a duplicate.
Update:
Adding to this answer to address the question directly:
You should use a map whenever you need to associate things together or preserve insertion order (common data structures need this).
You can use an object when you don't need to do this, but they just do different things.
Update 2:
OP asked if functions are okay too. Yes, values can be functions, and so can keys! Check it out:
let x = new Map();
let y = () => {
  console.log('derp');
};
x.set(y, "HI");
console.log(x.get(y)); // will log "HI"
For more info, check out the source of this quote, in a great chapter of Eloquent JavaScript:
"Every value has a type that determines its role. There are six basic types of values in JavaScript: numbers, strings, Booleans, objects, functions, and undefined values."
Also, the main differences between Map and Object, from MDN, under the header "Objects and Maps Compared":
An Object has a prototype, so there are default keys in the map. However, this can be bypassed using map = Object.create(null).
The keys of an Object are Strings, where they can be any value for a Map.
You can get the size of a Map easily while you have to manually keep track of size for an Object.
Again, the keys can be any value!
Maps are ordered.
Map keys do not have to be strings
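The differences listed above can be checked directly. This is a small sketch with made-up variable names, not code taken from MDN.

```javascript
const a = { id: 1 };
const b = { id: 2 };

// Plain object: non-string keys are converted to strings, so a and b collide.
const obj = {};
obj[a] = "first";
obj[b] = "second";
console.log(Object.keys(obj)); // ["[object Object]"]

// Map: a and b remain two distinct keys, and size is built in.
const map = new Map();
map.set(a, "first");
map.set(b, "second");
console.log(map.size); // 2
console.log(map.get(a)); // "first"

// A normal object inherits keys from its prototype; Object.create(null) does not.
const bare = Object.create(null);
console.log("toString" in {}); // true
console.log("toString" in bare); // false
```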
This is more a general question about the inner workings of the language. I was wondering how JavaScript gets the value of an index. For example, when you write array[index], does it loop through the array until it finds it, or does it use some other means? The reason I ask is that I have written some code where I am looping through arrays to match values and find points on a grid, and I am wondering whether performance would improve by creating an array like array[gridX][gridY], or whether it would make no difference. What I am doing now is going through a flat array of objects with grid points as properties, like this:
var array = [{x:1,y:3}];
I then loop through and use those coordinates within the object properties to identify and use the values contained in the object.
My thought is that a multidimensional grid would let me access them more directly, as I can specify a grid point by saying array[1][3] instead of looping through and doing:
for ( var a = 0; a < array.length; a += 1 ){
  if( array[a].x === 1 && array[a].y === 3 ){
    return array[a];
  }
}
or something of the like.
Any insight would be appreciated, thanks!
For example when you write array[index] does it loop through the array till it finds it? or by some other means?
This is implementation defined. Javascript can have both numeric and string keys and the very first Javascript implementations did do this slow looping to access things.
However, nowadays most browsers are more efficient and store arrays in two parts, a packed array for numeric indexes and a hash table for the rest. This means that accessing a numeric index (for dense arrays without holes) is O(1) and accessing string keys and sparse arrays is done via hash tables.
I am wondering if performance would be increased by creating an array like array[gridX][gridY] or if it will make a difference. what I am doing now is going through a flat array of objects with gridpoints as properties like this array[{x:1,y:3}]
Go with the two-dimensional array. It's a much simpler solution and is most likely going to be efficient enough for you.
Another reason to do this is that when you use an object as an array index, what actually happens is that the object is converted to a string and then that string is used as a hash table key. So array[{x:1,y:3}] is actually array["[object Object]"]. If you really wanted, you could override the toString method so that not all grid points serialize to the same value, but I don't think it's worth the trouble.
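Both points can be seen in a few lines. The grid dimensions and the label property below are made up for illustration.

```javascript
// Build a 2D grid so a cell can be read directly as grid[x][y].
const width = 4;
const height = 5;
const grid = [];
for (let x = 0; x < width; x += 1) {
  grid[x] = new Array(height).fill(null);
}
grid[1][3] = { x: 1, y: 3, label: "treasure" };
console.log(grid[1][3].label); // "treasure"

// Using an object as an index converts it to a string first.
const flat = [];
flat[{ x: 1, y: 3 }] = "a";
flat[{ x: 2, y: 2 }] = "b"; // same string key, so this overwrites "a"
console.log(flat["[object Object]"]); // "b"
console.log(flat.length); // 0, because the key is not a numeric index
```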
Whether it's an array or an object, the underlying structure in any modern JavaScript engine is a hashtable. Need to prove it? Allocate an array of 1000000000 elements and notice the speed and lack of memory growth. JavaScript arrays are a special case of Object that provides a length property and restricts the keys to integers, but it's sparse.
So, you are really chaining hashtables together. When you nest tables, as in a[x][y], you are creating multiple hashtables, and it will require multiple visits to resolve an object.
But which is faster? Here is a jsperf testing the speed of allocation and access, respectively:
http://jsperf.com/hash-tables-2d-versus-1d
http://jsperf.com/hash-tables-2d-versus-1d/2
On my machines, the nested approach is faster.
Intuition is no match for the profiler.
Update: It was pointed out that in some limited instances, arrays really are arrays underneath. But since arrays are specialized objects, you'll find that in these same instances, objects are implemented as arrays as well (i.e., {0:'hello', 1:'world'} is internally implemented as an array). But this shouldn't frighten you from using arrays with trillions of elements, because that special case will be discarded once it no longer makes sense.
To answer your initial question, in JavaScript, arrays are nothing more than a specialized type of object. If you set up a new Array like this:
var someArray = new Array(1, 2, 3);
You end up with an Array object with a structure that looks more-or-less, like this (Note: this is strictly in regards to the data that it is storing . . . there is a LOT more to an Array object):
someArray = {
  0: 1,
  1: 2,
  2: 3
}
What the Array object does add to the equation, though, is built in operations that allow you to interact with it in the [1, 2, 3] concept that you are used to. push() for example, will use the array's length property to figure out where the next value should be added, creates the value in that position, and increments the length property.
However (getting back to the original question), there is nothing in the array structure that is any different when it comes to accessing the values that it stores than any other property. There is no inherent looping or anything like that . . . accessing someArray[0] is essentially the same as accessing someArray.length.
In fact, the only reason that you have to access standard array values using the someArray[N] format is that, the stored array values are number-indexed, and you cannot directly access object properties that begin with a number using the "dot" technique (i.e., someArray.0 is invalid, someArray[0] is not).
Now, admittedly, that is a pretty simplistic view of the Array object in JavaScript, but, for the purposes of your question, it should be enough. If you want to know more about the inner workings, there is TONS of information to be found here: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array
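The object-like behaviour described in this answer can be observed directly. This sketch only shows what the language exposes; it says nothing about how a particular engine stores the data internally.

```javascript
const someArray = [1, 2, 3];

// Indices are property names; the numeric and string forms name the same property.
console.log(someArray[0] === someArray["0"]); // true
console.log(Object.keys(someArray)); // ["0", "1", "2"]

// push() stores the value at index `length`, and length grows by one.
someArray.push(4);
console.log(someArray.length); // 4
console.log(someArray[3]); // 4

// Assigning past the end also updates length and leaves holes in between.
someArray[9] = 10;
console.log(someArray.length); // 10
console.log(5 in someArray); // false
```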
Single-dimension hash table with direct access:
var array = {
  x1y3: 999,
  x10y100: 0
};
function getValue(x, y) {
  return array['x' + x + 'y' + y];
}
I am currently in the process of writing a GUI which fundamentally allows users to edit/populate/delete a number of settings files, where the settings are stored in JSON, using AJAX.
I have limited experience with JavaScript (I have little experience with anything beyond MATLAB, to be frank); however, I find myself restructuring my settings structure because of the semantics of working with an object containing more objects, rather than an array of objects. In C# I would do this using a KeyValuePair, but the JSON structure prevents me from doing what I'd really like to do here. I was wondering whether there is an accepted convention for doing this in JavaScript which I should adopt now, rather than making these changes and finding that I cause more issues than I solve.
The sample data structure, which has similar requirements to many of my structures, accepts any number of years, and within these any number of events, and within these a set number of values.
Here is the previous structure:
{"2013":
{
"someEventName":
{
"data1":"foo",
"data2":"bar",
...},
...},
...}
Here is my ideal structure, where the year/event name operates as a key of type string for a value of type Array:
["2013":
[
"someEventName":
{
"data1":"foo",
"data2":"bar",
...},
...],
...]
As far as I am aware, this would be invalid JSON notation, so here is my proposed structure:
[{"Key":"2013",
"Value":
[{"Key":"someEventName",
"Value":
{
"data1":"foo",
"data2":"bar",
...}
},
...]
},
...]
My proposed "test" for whether something should be an object containing objects or an array of objects is "does my sub-structure take a fixed, known number of objects?" If yes, design as object containing objects; if no, design as array of objects.
I am required to filter through this structure frequently to find data/values, and I don't envisage ever exploiting the index functionality that using an array brings, however pushing and removing data from an array is much more flexible than to an object and it feels like using an object containing objects deviates from the class model of OOP; on the other hand, the methods for finding my data by "Key" all seem simpler if it is an object containing objects, and I don't envisage myself using Prototype methods on these objects anyway so who cares about breaking OOP.
Response 1
In the previous structure to add a year, for example, the code would be OBJ["2014"]={}; in the new structure it would be OBJ.push({"Key":"2014", "Value":{}}); both of these solutions are similarly lacking in their complexity.
Deleting is similarly trivial in both cases.
However, if I want to manipulate the value of an event, say, using a function, and I pass a pointer to that object to the function and try to supersede the whole object in the reference, it won't work: I am forced to copy the original event (using jQuery or worse) and reinsert it at the parent level. With a "Value" attribute, I can overwrite the whole value element however I like, provided I pass the entire {"Key":"", "Value":""} object to the function. It's an awful lot cleaner in this situation for me to use the array of objects method.
I am also basing this change to arrays on the wealth of other responses on stackoverflow which encourage the use of them instead of objects.
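The reassignment problem described in Response 1 can be reproduced in a few lines. The function names below are hypothetical, and the last helper, replaceIn, is an alternative not mentioned in the question: it gets the same effect on the nested-object structure by passing the parent and the key.

```javascript
// Reassigning a parameter only rebinds the local name; the caller's object is untouched.
function replaceEvent(event) {
  event = { data1: "new", data2: "values" };
}

// Writing through a wrapper's property is visible to the caller.
function replaceValue(pair) {
  pair.Value = { data1: "new", data2: "values" };
}

const settings = { "2013": { someEventName: { data1: "foo", data2: "bar" } } };
replaceEvent(settings["2013"].someEventName);
console.log(settings["2013"].someEventName.data1); // "foo"

const pair = { Key: "someEventName", Value: { data1: "foo", data2: "bar" } };
replaceValue(pair);
console.log(pair.Value.data1); // "new"

// Alternative for nested objects: pass the parent and the key instead of the child.
function replaceIn(parent, key) {
  parent[key] = { data1: "new", data2: "values" };
}
replaceIn(settings["2013"], "someEventName");
console.log(settings["2013"].someEventName.data1); // "new"
```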
If all you're going to do is iterate over your objects, then an array of objects makes more sense. If these are settings and people are going to need to look up a specific one, then the original object notation is better. The original allows people to write code like
var foo = settings['2013'][someEventName].data1
whereas getting that data out of the array of objects would require iterating through them to find the one with the key 2013, which, depending on the length of the list, could cause performance issues.
Pushing new data to the object is as simple as
settings['2014'] = {...}
and deleting data from an object is also simple
delete settings['2014']
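For comparison, here is the same lookup against both structures from the question. The variable names are made up, and the "..." placeholders from the question are simply omitted.

```javascript
const settingsObject = {
  "2013": { someEventName: { data1: "foo", data2: "bar" } },
};
const settingsArray = [
  {
    Key: "2013",
    Value: [{ Key: "someEventName", Value: { data1: "foo", data2: "bar" } }],
  },
];

// Object of objects: direct property lookup at every level.
const fromObject = settingsObject["2013"]["someEventName"].data1;

// Array of objects: every level needs a linear search for the matching Key.
const yearEntry = settingsArray.find((entry) => entry.Key === "2013");
const eventEntry = yearEntry.Value.find((entry) => entry.Key === "someEventName");
const fromArray = eventEntry.Value.data1;

console.log(fromObject, fromArray); // "foo" "foo"

// Adding and removing a year on the object form.
settingsObject["2014"] = {};
delete settingsObject["2014"];
console.log(Object.keys(settingsObject)); // ["2013"]
```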
Objects in JavaScript contain key-value pairs. The cost of a typical pair (using DevTools Profiler) is a reference to the key name, 8 bytes, and the cost of the object: 4 bytes for small ints, 8 bytes for numbers, references, etc.
The cost of the keys adds up, especially in arrays with millions of objects.
Is there an asm.js sort of way to use typed arrays for arrays of identical objects?
Yes, I know this seems like a pain, but for a particular project, this may be required.
The sort of approach I'm thinking of is using a template JS object whose keys describe the offset into a typed array for each key's value, along with its type. For arrays of these objects, there'd be multiple of these object spans.
Thus two questions:
1 - Are my assumptions correct .. and there is no optimization in chrome/modern browsers that optimize the key costs? Possibly with constraints used here: http://www.2ality.com/2013/08/protecting-objects.html
2 - If so, is there a library for handling typed arrays as objects? Or any articles or gists etc?
If you have millions of objects, all with the same set of known keys and you have so many of them that memory is an issue, then you probably don't want to store your data as javascript objects at all.
You probably want to think about this like a database problem. You want:
A semi-compact storage format
A way to find the right record in the storage format (this depends upon what the data is and how you need to access it).
A way to read the semi-compact storage format and turn it into a live javascript object only when you want to actually use that object in your code.
A way to write changes back to the semi-compact storage format.
For example, if you had seven keys (e.g. fields) and three were numbers and four were strings and one of the numbers was your lookup key, then you could do this:
Create three typed arrays (one for each numeric value)
Create one regular array. This will hold all the string values concatenated together with a unique separator character.
Create a master key lookup object
As you read your data in, presumably from multiple ajax calls, you do the following:
Note the length of one of your arrays (they are all the same length) as this will be your new record number.
Add the lookup key to the master key object and set the value of the key to be the record number.
Add each numeric value to each typed array (each will be the record number index in the array)
Concat all the string values together with a separator between them, in order by key, and then put the concatenated string value into the string array.
Now you have a semi-compact storage format and a key lookup means. When you want to look up a value, you use the master key lookup object. The value for the key will be the record number, which is an index into the other arrays. You can create two functions: one that will find a record and return a javascript object form of the record (all data in key/value pairs on the object), and another that will write an object (that might have changed, though the master key can't change) back to your storage format.
This makes a few assumptions about your data: that you have one master key that won't change and that you use for lookup; that you can find a separator to join all the string values together and split them apart later; that you know what the keys are when you go to store all this; and that the objects generally all have the same keys.
If any of those assumptions are not true, then the design would have to be adapted to deal with that, but hopefully you get the idea of using something other than a giant array of objects to store your data and then constituting a given object only when you need to work with that record's data.
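The scheme above can be sketched roughly as follows. Everything here is an assumption made for illustration: the RecordStore name, the record shape { id, x, y, name, kind }, the fixed capacity (typed arrays cannot grow, so a counter stands in for "the length of one of your arrays"), the use of a Map as the master key lookup, and the "\u0001" separator.

```javascript
// Hypothetical record shape: { id, x, y, name, kind }, where id is the lookup key.
const SEP = "\u0001"; // assumed never to occur inside the string fields

class RecordStore {
  constructor(capacity) {
    this.count = 0; // next free record number
    this.ids = new Uint32Array(capacity);
    this.xs = new Float64Array(capacity);
    this.ys = new Float64Array(capacity);
    this.strings = []; // one joined string per record
    this.index = new Map(); // master key -> record number
  }

  add(record) {
    if (this.count >= this.ids.length) throw new RangeError("store is full");
    const n = this.count++;
    this.index.set(record.id, n);
    this.ids[n] = record.id;
    this.write(n, record);
    return n;
  }

  // Write the non-key fields of a record into the compact storage.
  write(n, record) {
    this.xs[n] = record.x;
    this.ys[n] = record.y;
    this.strings[n] = [record.name, record.kind].join(SEP);
  }

  // Build a live object only when it is actually needed.
  get(id) {
    const n = this.index.get(id);
    if (n === undefined) return undefined;
    const [name, kind] = this.strings[n].split(SEP);
    return { id, x: this.xs[n], y: this.ys[n], name, kind };
  }

  // Write a changed object back; the master key itself must not change.
  update(record) {
    const n = this.index.get(record.id);
    if (n === undefined) throw new Error("unknown id " + record.id);
    this.write(n, record);
  }
}

const store = new RecordStore(1000);
store.add({ id: 42, x: 1.5, y: -2, name: "alpha", kind: "npc" });
const rec = store.get(42);
rec.x = 9;
store.update(rec);
console.log(store.get(42)); // { id: 42, x: 9, y: -2, name: "alpha", kind: "npc" }
```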
I'm building an Entity System for a game, and basically I'm not sure whether I should use simple objects (dictionaries) or arrays to store the entities/components by their id.
My biggest issue is that I didn't want a dynamic entity id. If the id was just a string (using the dictionary to store entities), then it would always be valid and I could use storage[id] to get to that entity.
If I used arrays, I thought, the id's of entities, which would represent an index in the storage array, would change. Consider this array of entities:
[
Entity,
Entity, //This one is being removed.
Entity
];
If I was to remove the second entity in that array, I thought that the id required to access the third entity would have to change to the id (index) of the (now removed) second entity. That's because I thought about deleting in terms of splice()ing.
But, I could use the delete expression to turn the element (an entity) into an undefined! And, if it's true that arrays in Javascript are actually just objects, and objects logically have infinitely many undefined values, does that mean that undefined values inside arrays don't use up memory?
Initially, I thought that arrays were implemented in a way that they were aligned in memory, and that an index was just an offset from the first element, and by this logic I thought that undefined values would use at least the memory of a pointer (because I, actually, thought that pointers to elements are aligned, not elements themselves).
So, if I stored 10k+ entities in this array, and deleted half of them, would the 5k undefined's use any memory at all?
Also, when I do a for entity in array loop, would these undefined elements be passed?
Also, where can I find resources to see how arrays are actually supposed to be implemented in Javascript? All I can find are general explanations of arrays and tutorials on how to use them, but I want to find out all about these little quirks that can prove important in certain situations. Something like a "Javascript quirks" site would be great.
Arrays are not just objects. In particular the length property is very magic.
Of course, a JavaScript engine is allowed to represent the array internally in any way it chooses, as long as the external API remains the same. For instance, if you set randomly separated values then they may be stored as a hash, but if you set consecutive values then they may be optimised into an array.
for ... in does not enumerate properties that are not set. This includes an array literal that skips values e.g. [true, , false], which will only enumerate indices 0 and 2.
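The difference between delete, an explicit undefined, and splice() can be checked directly. The sketch uses strings as stand-in entities and shows only the observable behaviour, not memory usage, which depends on the engine.

```javascript
const entities = ["a", "b", "c"];

// delete leaves a hole: the other indices and length are unchanged.
delete entities[1];
console.log(entities.length); // 3
console.log(1 in entities); // false
console.log(entities[2]); // "c"

// for...in visits only the indices that are still set (as strings).
const visited = [];
for (const i in entities) {
  visited.push(i);
}
console.log(visited); // ["0", "2"]

// An explicit undefined is a real property and is enumerated.
const explicit = ["a", undefined, "c"];
console.log(1 in explicit); // true

// splice removes the slot and shifts later elements down, changing their indices.
const spliced = ["a", "b", "c"];
spliced.splice(1, 1);
console.log(spliced); // ["a", "c"]
```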