ES6 Hashmap with 2 keys - javascript

JavaScript ES6 introduced Map, which is implemented with a hash table. Since hash table lookup time is on average O(1), Map seems to be a good choice for storing randomly accessed data.
However, JavaScript does not have a data structure like C++'s struct that can be used as a Map key to enable "multiple-key mapping". The closest things are Objects, but two instances do not compare equal even if their contents are the same.
If I want to store a 2D or 3D tile-based game map using the Map type, is there a way to easily access the blocks given their coordinates? Of course strings like "1,2,3" (representing x,y,z) would work, but is there a way to use integers as keys?
And if I must fall back to using assembled string coordinates, would the performance decrease a lot?
EDIT: I want to use a hash table because there may be "holes" in the maps and tiles may be created randomly in the middle of nowhere.

What you are asking for is just a multidimensional Array. If you are going to use only integer keys, there is absolutely no benefit to using a Map.
const map = [];
for (let x = 0; x < 2; x++) {
  let xArr = [];
  map.push(xArr);
  for (let y = 0; y < 2; y++) {
    let yArr = [];
    xArr.push(yArr);
    for (let z = 0; z < 2; z++) {
      yArr.push(`${x},${y},${z}`);
    }
  }
}
console.log(map[1][1][0]); // "1,1,0"
console.log(map);

I just ran a performance test comparing storage in an Array, a nested Object, and an Object with string keys. The result surprised me: the fastest is the Object with string keys.
https://jsperf.com/multiple-dimension-sparse-matrix
Array OPS 0.48 ±5.19% 77% slower //a[z][y][x]
Nested Object OPS 0.51 ±16.65% 77% slower //a[z][y][x]
String Object OPS 2.96 ±29.77% fastest //a["x,y,z"]

I extended Daniel's performance test by adding a "left shifting" case, where the coordinates are left-shifted and combined into a single integer key (as long as we don't overflow the number limits). I got the following results:
Array OPS 0.90 ±4.68% 91% slower //a[z][y][x]
Nested Object OPS 0.86 ±3.25% 92% slower //a[z][y][x]
String Object OPS 3.59 ±16.90% 68% slower //a["x,y,z"]
Left Shift OPS 10.68 ±11.69% fastest //a[(x<<N1)+(y<<N2)+z]
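For illustration, here is a minimal sketch of that bit-packing idea, assuming each coordinate fits in 10 bits (0-1023), so all three fit comfortably in one 32-bit integer:

// Pack (x, y, z) into a single integer key; assumes 0 <= x, y, z <= 1023,
// since JavaScript bitwise operators work on 32-bit integers.
const key = (x, y, z) => (x << 20) | (y << 10) | z;

const tiles = new Map();
tiles.set(key(1, 2, 3), "grass");
console.log(tiles.get(key(1, 2, 3))); // "grass"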

Other than joining the keys into a string like you mentioned (which is perfectly fine IMO), another option is to use multiple nested Maps. For example, for a 3d structure, accessing 1,2,3 (axes x, y, and z, let's say) could be done with
bigMap.get(1).get(2).get(3)
where bigMap (containing the 3D structure) holds a Map value for each x slice, each of those Maps has keys corresponding to each y slice, and each y-slice Map maps z coordinates to the information stored at that point.
But while this is a possibility, your idea of an object or Map indexed by comma-joined coordinates is perfectly fine and probably more understandable at a glance IMO.
Remember that object property lookup is O(1) just like Map lookup - there's no performance hit if you use a single object (or Map) indexed by strings.
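For completeness, a minimal sketch of the nested-Map approach, with hypothetical setTile/getTile helpers that create the intermediate Maps on demand (which also handles the sparse "holes" case from the question):

const bigMap = new Map();

// Hypothetical helper: creates the intermediate y- and z-slice Maps lazily.
function setTile(x, y, z, value) {
  let ySlice = bigMap.get(x);
  if (!ySlice) bigMap.set(x, ySlice = new Map());
  let zSlice = ySlice.get(y);
  if (!zSlice) ySlice.set(y, zSlice = new Map());
  zSlice.set(z, value);
}

// Hypothetical helper: returns undefined for coordinates that were never set.
function getTile(x, y, z) {
  const ySlice = bigMap.get(x);
  const zSlice = ySlice && ySlice.get(y);
  return zSlice && zSlice.get(z);
}

setTile(1, 2, 3, "water");
console.log(getTile(1, 2, 3)); // "water"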

Related

JS Map from Object.entries memory usage

I'm trying to find out the memory usage of JavaScript Map objects that have ~20k entries - the key is a string, and the value is an array of 1-2 strings.
I have created 2 Maps, with 17k and 22k entries. They both have the same memory usage in the Chrome profiler - how?
Why do the Map objects differ in size only after the Objects from which they are created get removed from the scope?
Also, I know how hashmaps work, but if someone knows how the JS Map can preserve order, please let me know (maybe just a linked list?).
function createMap() {
  var obj = JSON.parse(bigStringRepresentingTheJSON);
  return new Map(Object.entries(obj));
}
Looking at the profiler, I see that both Maps take 917KB - how can that be?
But the Objects (obj) from which they were made take 786KB and 1,572KB - which is reasonable.
So I thought maybe each Map holds a pointer to the obj from which it was created, and that's why they don't differ in size? Then I used the createMap function so that obj gets garbage collected. Only then do the Map objects take 1.9MB and 2.3MB, which is to be expected.
Scaling a Map is complicated. Also, a hashmap is more efficient if data size << table size, as hashes then collide less often. Therefore it makes sense that a Map allocates more than needed; it probably allocates a fixed-size hash table, then grows it if needed.
Whether that's done, and how large the Map is, depends entirely on the implementation, though.
So I thought maybe Map holds a pointer to obj from which it was created, and thats why they don't differ in size?
No, the Map only holds references to the values in obj.
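As for the order question asked above: the ECMAScript spec requires Map to iterate its entries in insertion order. As I understand it, V8 implements this with a hash table whose entries live in an insertion-ordered backing store rather than a linked list, but that is an implementation detail. The observable behavior is easy to check:

const m = new Map();
m.set("b", 1);
m.set("a", 2);
// Iteration follows insertion order, not key order:
console.log([...m.keys()]); // ["b", "a"]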

Memory overhead of typed arrays vs strings

I am trying to reduce the memory usage of a JavaScript web application that stores a lot of information in memory in the form of a large number of small strings. When I changed the code to use Uint8Array instead of String, I noticed that memory usage went up.
For example, consider the following code that creates many small strings:
// (1000000 strings) x (10 characters)
var a=[];
for (let i=0; i<1000000; i++)
a.push("a".repeat(10).toUpperCase());
If you put it in an empty page and let the memory usage settle for a few seconds, it settles at 70 MiB on Google Chrome. On the other hand, the following code:
// (1000000 arrays) x (10 bytes)
var a=[];
for (let i=0; i<1000000; i++)
a.push(new Uint8Array(10));
uses 233 MiB of memory. An empty page without any code uses about 20 MiB. On the other hand, if I create a small number of large strings/arrays, the difference becomes smaller and in the case of a single string/array with 10000000 characters/entries, the memory usage is virtually identical.
So why do typed arrays have such a large memory overhead?
V8 developer here. Your conclusion makes sense: If you compare characters in a string to elements in a Uint8Array, the string will have less overhead. TypedArrays are great at providing fast access to typed elements; however having a large number of small TypedArrays is not memory efficient.
The difference is in the object header size for strings and typed arrays.
For a string, the object header is:
hidden class pointer
hash
length
payload
where the payload is rounded up to pointer size alignment, so 16 bytes in this case.
For a Uint8Array, you need the following:
hidden class pointer
properties pointer (unused)
elements pointer (see below)
array buffer pointer (see below)
offset into array buffer
byte length
length of view into array buffer
length (user-visible)
embedder field #1
embedder field #2
array buffer: hidden class pointer
array buffer: properties pointer (unused)
array buffer: elements pointer (see below)
array buffer: byte length
array buffer: backing store
array buffer: allocation base
array buffer: allocation length
array buffer: bit field (internal flags)
array buffer: embedder field #1
array buffer: embedder field #2
elements object: hidden class pointer
elements object: length (of the backing store)
elements object: base pointer (of the backing store)
elements object: offset to data start
elements object: payload
where, again, the payload is rounded up to pointer size alignment, so consumes 16 bytes here.
In summary, each string consumes 5*8 = 40 bytes, each typed array consumes 26*8 = 208 bytes. That does seem like a lot of overhead; the reason is due to the various flexible options that TypedArrays provide (they can be overlapping views into ArrayBuffers, which can be allocated directly from JavaScript, or shared with WebGL and whatnot, etc).
(It's not about "optimizing memory allocation" nor being "better at garbage collecting strings" -- since you're holding on to all the objects, GC does not play a role.)
The typed arrays are not supposed to be used that way.
If you want high memory efficiency, use just one typed array to hold all of your integers, instead of a huge number of small arrays each holding a few, for low-level reasons.
Those low-level reasons come down to how much overhead is needed to hold one object in memory, and that quantity depends on aspects like immutability and garbage collection. As shown above, holding one typed array has higher overhead than holding one simple string; that's why you should pay that price only once.
You should take advantage of:
// 'a': one million small strings, each paying the string object overhead
var a = [];
for (let i = 0; i < 1000000; i++) a.push("1");

// 'b': a single typed array holding all ten million bytes
var b = new Uint8Array(10000000);
for (let i = 0; i < 10000000; i++) b[i] = 1;

// 'b' is more memory efficient than 'a': pay the Uint8Array overhead once
// and save the per-string allocation overhead.
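As a further hedged illustration of the one-big-array idea, the i-th logical record can be viewed in place with subarray, which creates a view over the same buffer rather than a copy (RECORD_SIZE and record are made-up names for this sketch):

const RECORD_SIZE = 10;                              // bytes per logical record
const pool = new Uint8Array(1000000 * RECORD_SIZE);  // one allocation for all records

// Returns a live view of record i; no bytes are copied.
function record(i) {
  return pool.subarray(i * RECORD_SIZE, (i + 1) * RECORD_SIZE);
}

record(42)[0] = 7;                   // writes through to the shared buffer
console.log(pool[42 * RECORD_SIZE]); // 7

Note that each subarray call still allocates a small view object, so in hot loops it is better to index into the pool directly.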

Using arraybuffers to store canvas rendering data

Canvas Performance
Lately I've been creating a lot of animations in canvas, and as canvas has no events you need to create your own event system based on the coordinates; in short, you need a collision-detection function. As most such implementations are very long, I rewrote my own, and in doing so I understood how simple it is. So I wrote some sort of game code.
Basically, canvas games are a lot of temporary arrays of numbers, where in most cases a number between 0 and 64, 128, or 256 would be enough, perhaps reduced and used as a multiplier:
hitpoints = max * (number / 256);
So I was thinking: what if I store these values in an ArrayBuffer?
var array = new Float64Array(8);
array[0] = 10;
//....
Example (I do it this way... if you know something better, feel free to tell me):
// Usable ammunition
var ammo = [
  ['missiles', dmg, range, radius, red, green, blue, duration],
  // all apart from the first are numbers between 0 and 255
  ['projectile', dmg, range, radius, red, green, blue, duration]
];
// Flying ammunition
// This is created inside the canvas animation every time you shoot.
var tempAmmo = [
  [id, timestarted, x, y, timepassed] // these are all integers;
                                      // the largest is probably the timestamp
];
// I could add the ammo numbers to tempAmmo (which is faster):
// [id, timestarted, x, y, timepassed, dmg, range, radius, red, green, blue, duration]
Then I do the same double-array structure for enemies and flyingEnemies.
Wouldn't it be better to store everything in ArrayBuffers?
What I think (correct me if I'm wrong): ArrayBuffers are binary data, so they should be faster for rendering and smaller in memory.
Now, if these two assumptions are correct, how do I properly create an array structure like the one described, perhaps choosing the proper array type?
Note: in my case I'm using a two-dimensional array, and obviously I don't want to use objects.
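One hedged sketch of how such a structure could be laid out: give every projectile a fixed stride in a single typed array and address its fields by offset. All names below (STRIDE, spawnShot, the field offsets) are made up for illustration; Float64Array is used so that timestamps fit, at the cost of wasting space on the small 0-255 fields:

// Field offsets within one record (hypothetical layout)
const ID = 0, TIME_STARTED = 1, X = 2, Y = 3, TIME_PASSED = 4,
      DMG = 5, RANGE = 6, RADIUS = 7, RED = 8, GREEN = 9, BLUE = 10, DURATION = 11;
const STRIDE = 12;      // fields per record
const MAX_SHOTS = 1024; // capacity of the pool

// One flat typed array instead of many small nested arrays.
const shots = new Float64Array(MAX_SHOTS * STRIDE);

function spawnShot(slot, id, x, y) {
  const base = slot * STRIDE;
  shots[base + ID] = id;
  shots[base + TIME_STARTED] = Date.now();
  shots[base + X] = x;
  shots[base + Y] = y;
}

spawnShot(0, 1, 320, 240);
console.log(shots[0 * STRIDE + X]); // 320

If the 0-255 fields dominate and memory matters more than timestamp precision, the same layout works with a Uint8Array plus a separate array for the timestamps.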

Can reducing index length in Javascript associative array save memory

I am trying to build a large Array (22,000 elements) of associative-array elements in JavaScript. Do I need to worry about the length of the indices with regard to memory usage?
In other words, which of the following options saves memory? or are they the same in memory consumption?
Option 1:
var student = new Array();
for (i = 0; i < 22000; i++)
  student[i] = {
    "studentName": token[0],
    "studentMarks": token[1],
    "studentDOB": token[2]
  };
Option 2:
var student = new Array();
for (i = 0; i < 22000; i++)
  student[i] = {
    "n": token[0],
    "m": token[1],
    "d": token[2]
  };
I tried to test this in Google Chrome DevTools, but the numbers are too inconsistent to make a decision. My best guess is that because the indices repeat, the browser can optimize memory usage by not repeating them for each student[i], but that is just a guess.
Edit:
To clarify, the problem is the following: a large array containing many small associative arrays. Does using long or short indices matter when it comes to memory requirements?
Edit 2:
The 3N array approach that was suggested in the comments, and that Joseph Myers refers to, is creating one array 'var student = []' of size 3*22000, and then using student[0] for the first student's name, student[1] for marks, student[2] for DOB, and so on for subsequent students.
Thanks.
The difference is insignificant, so the answer is no. This sort of thing would barely even fall under micro-optimization. You should always opt for the most readable solution in such dilemmas: the cost of maintaining the code from your second option outweighs any performance gain (if any) you could get from it.
What you should do, though, is use the literal for creating an array:
[] instead of new Array() (just a side note).
A better approach to solve your problem would probably be to find a way to load the data in parts, implementing some kind of pagination (I assume you're not doing heavy computations on the client).
The main analysis of associative arrays' computational cost has to do with performance degradation as the number of elements stored increases, but there are some results available about performance loss as the key length increases.
In Algorithms in C by Sedgewick, it is noted that for some key-based storage systems the search cost does not grow with the key length, and for others it does. All of the comparison-based search methods depend on key length--if two keys differ only in their rightmost bit, then comparing them requires time proportional to their length. Hash-based methods always require time proportional to the key length (in order to compute the hash function).
Of course, the key takes up storage space within the original code and/or at least temporarily in the execution of the script.
The kind of storage used for JavaScript may vary between browsers, but in a resource-constrained environment, using smaller keys would have an advantage, albeit likely one still too small to notice; still, there are surely cases where the advantage would be worthwhile.
P.S. My library just got in two new books that I ordered in December about the latest computational algorithms, and I can check them tomorrow to see if there are any new results about key length impacting the performance of associative arrays / JS objects.
Update: Keys like studentName take 2% longer on a Nexus 7 and 4% longer on an iPhone 5. This is negligible to me. I averaged 500 runs of creating a 30,000-element array with each element containing an object { a: i, b: 6, c: 'seven' } vs. 500 runs using an object { studentName: i, studentMarks: 6, studentDOB: 'seven' }. On a desktop computer, the program still runs so fast that the processor's frequency / number of interrupts, etc., produce varying results and the entire program finishes almost instantly. Once every few runs, the big key size actually goes faster (because other variations in the testing environment affect the result more than 2-4%, since the JavaScript timer is based on clock time rather than CPU time.) You can try it yourself here: http://dropoff.us/private/1372219707-1-test-small-objects-key-size.html
Your 3N array approach (using array[0], array[1], and array[2] for the contents of the first object; and array[3], array[4], and array[5] for the second object, etc.) works much faster than any object method. It's five times faster than the small object method and five times faster plus 2-4% than the big object method on a desktop, and it is 11 times faster on a Nexus 7.
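A minimal sketch of that 3N layout, with made-up data standing in for the tokens:

// One flat array, three slots per student:
// [name0, marks0, dob0, name1, marks1, dob1, ...]
var students = [];
for (var i = 0; i < 22000; i++) {
  students[3 * i]     = "name" + i;     // studentName
  students[3 * i + 1] = 75;             // studentMarks
  students[3 * i + 2] = "2000-01-01";   // studentDOB
}

// Read back student 5's marks:
console.log(students[3 * 5 + 1]); // 75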

Find in Multidimensional Array

I have a multidimensional array such as:
[
  {"EventDate": "20110421221932", "LONGITUDE": "-75.61481666666670", "LATITUDE": "38.35916666666670", "BothConnectionsDown": false},
  {"EventDate": "20110421222228", "LONGITUDE": "-75.61456666666670", "LATITUDE": "38.35946666666670", "BothConnectionsDown": false}
]
Is there any plugin available to search for a combination of LONGITUDE and LATITUDE?
Thanks in advance
for (var i in VehCommLost) {
  var item = VehCommLost[i];
  if (item.LONGITUDE == 1 && item.LATITUDE == 2) {
    // gotcha: found the matching point
    break;
  }
}
This is a JSON string. Which programming language are you using alongside JS?
By the way, try parseJSON.
Are the latitudes and longitudes completely random, or are they points along a path, so there is some notion of sequence?
If there is some ordering of the points in the array, perhaps a search algorithm could be faster.
For example:
if the inner array is up to 10,000 elements, test item 5000;
if that value is too high, focus on 1-4999;
if too low, focus on 5001-10000; else 5000 is the right answer;
repeat until the range shrinks to the vicinity, making a straight loop through the remaining values quick enough (see the sketch below).
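A hedged sketch of that binary search, assuming the array is sorted ascending by LONGITUDE (the values are strings, so they are coerced with parseFloat):

// Returns the index of the element whose LONGITUDE equals target, or -1.
function searchByLongitude(points, target) {
  var lo = 0, hi = points.length - 1;
  while (lo <= hi) {
    var mid = (lo + hi) >> 1;
    var lon = parseFloat(points[mid].LONGITUDE);
    if (lon < target) lo = mid + 1;
    else if (lon > target) hi = mid - 1;
    else return mid; // exact hit
  }
  return -1; // not found; lo is where it would be inserted
}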
After sleeping on it, it seems to me most likely that the solution to your problem lies in recasting the problem.
Since it is a requirement of the system that you are able to find a point quickly, I'd suggest that a large array is the wrong data structure to support that requirement. It may be necessary to have an array, but perhaps there could also be another mechanism to make the search rapid.
As I understand it you are trying to locate points near a known lat-long.
What if, in addition to the array, you had a hash keyed on lat-long, with the value being an array of indexes into the huge array?
Latitude and Longitude can be expressed at different degrees of precision, such as 141.438754 or 141.4
The precision relates to the size of the grid square.
With some knowledge of the business domain, it should be possible to select a reasonably-sized grid such that several points fit inside but not too many to search.
So the hash is keyed on lat-long coords such as '141.4#23.2', with the value being a smaller array of indexes [3456,3478,4579,6344]; using the indexes, we can easily access the items in the large array.
Suppose we need to find 141.438754#23.217643: we can reduce the precision to '141.4#23.2' and see if there is an array for that grid square.
If not, widen the search to the (3*3-1 =) 8 adjacent grid squares one unit away.
If still nothing, widen to the (5*5-9 =) 16 grid squares two units away. And so on...
Depending on how the original data is stored and processed, it may be possible to generate the hash server-side, which would be preferable. If you needed to generate the hash client-side, it might be worth doing if you reused it for many searches, but would be kind of pointless if you used the data only once.
Could you comment on the possibility of recasting the problem in a different way, perhaps along these lines?
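To make the suggestion concrete, here is a minimal sketch of such a grid hash, assuming a 0.1-degree grid and the VehCommLost array from the earlier snippet (gridKey and buildGrid are made-up names; note that toFixed rounds rather than truncates, which just shifts the square boundaries):

// Key a grid square by reduced-precision lon-lat, e.g. "-75.6#38.4".
function gridKey(lon, lat) {
  return lon.toFixed(1) + "#" + lat.toFixed(1);
}

// Map each grid square to the indexes of the points inside it.
function buildGrid(points) {
  var grid = {};
  for (var i = 0; i < points.length; i++) {
    var key = gridKey(parseFloat(points[i].LONGITUDE),
                      parseFloat(points[i].LATITUDE));
    (grid[key] || (grid[key] = [])).push(i);
  }
  return grid;
}

var grid = buildGrid(VehCommLost);
// Candidates near a query point: check its own square first,
// then widen to the 8 adjacent squares if it is empty.
var nearby = grid[gridKey(-75.61481666666670, 38.35916666666670)] || [];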
