What is the more efficient way (in terms of memory consumed) to store an array of length 10,000 where each element has 4 integer properties:
Option 1: Array of objects
var array = [];
array[0] = {p1:1, p2:1, p3:1, p4:1}
or
Option 2: Four arrays of integers
var p1 = [], p2 = [], p3 = [], p4 = [];
p1[0] = 1;
p2[0] = 1;
p3[0] = 1;
p4[0] = 1;
Option 2. 4 objects (arrays are objects too) vs. 10001 objects.
Four arrays of 10,000 elements each is probably better in terms of memory, because you're only storing four complex objects (the arrays) plus 40,000 integers, whereas the other way you are storing 10,000 objects plus 40,000 integers (4 per object).
My guess would be that, purely from a bits and bytes standpoint, a single multi-dimensional array would have the smallest footprint:
var p = [];
p[0] = [1,1,1,1];
I actually tested both options with the Google Chrome task manager, which shows information about the open tabs (Shift+Esc), and while this might not be 100% accurate, it does show significant differences:
For the first option, creating an array with 10,000 elements, each being an object with 4 properties as you specified, the memory usage jumped by about 10MB after initiating the array.
The second option, creating 4 arrays with 10,000 elements each, made the memory usage jump by about 5MB.
Some of that memory usage jump might be related to the processing of the creation itself and internal browser overhead, but the point is, as expected, that creating objects adds more overhead for the data you are storing.
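The answer doesn't include the exact test code; a minimal sketch of the two initializations being compared could look like this (sizes and property names taken from the question):
// Option 1: 10,000 objects with 4 integer properties each
var objArray = [];
for (var i = 0; i < 10000; i++) {
  objArray[i] = {p1: 1, p2: 1, p3: 1, p4: 1};
}

// Option 2: four parallel arrays of 10,000 integers each
var p1 = [], p2 = [], p3 = [], p4 = [];
for (var j = 0; j < 10000; j++) {
  p1[j] = 1;
  p2[j] = 1;
  p3[j] = 1;
  p4[j] = 1;
}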
JavaScript ES6 introduced Map, which is implemented with a hash table. Since hash table lookup time is on average O(1), Map seems to be a good choice for storing randomly accessed data.
However, JavaScript does not have a data structure like struct in C++ that could be used as a Map key to enable "multiple-key mapping". The closest thing is an Object, but two Object instances are not equal to each other even if their contents are the same.
If I want to save a 2D or 3D tile based game map using the Map type, is there a way to easily access the blocks given the coordinates? Of course strings like "1,2,3" (representing x,y,z) would work, but is there a way that we can use integers as keys?
And if I must fall back to using assembled string coordinates, would the performance decrease a lot?
EDIT: I want to use a hash table because there may be "holes" in the maps and tiles may be created randomly in the middle of nowhere.
What you are asking for is just a multidimensional Array. If you are going to use only integer keys, there is absolutely no benefit to using a Map.
const map = [];
for (let x = 0; x < 2; x++) {
  let xArr = [];
  map.push(xArr);
  for (let y = 0; y < 2; y++) {
    let yArr = [];
    xArr.push(yArr);
    for (let z = 0; z < 2; z++) {
      yArr.push(`${x},${y},${z}`);
    }
  }
}
console.log(map[1][1][0]);
console.log(map);
I just made a performance test comparing storage in an Array, a nested Object, and an Object with string keys. The result surprised me: the fastest is the Object with string keys.
https://jsperf.com/multiple-dimension-sparse-matrix
Array OPS 0.48 ±5.19% 77% slower //a[z][y][x]
Nested Object OPS 0.51 ±16.65% 77% slower //a[z][y][x]
String Object OPS 2.96 ±29.77% fastest //a["x,y,z"]
I changed Daniels' performance test by adding a "Left shifting" case, where we left-shift the coordinates into a single number (as long as we don't exceed the number limits). I got the following results:
Array OPS 0.90 ±4.68% 91% slower //a[z][y][x]
Nested Object OPS 0.86 ±3.25% 92% slower //a[z][y][x]
String Object OPS 3.59 ±16.90% 68% slower //a["x,y,z"]
Left Shift OPS 10.68 ±11.69% fastest //a[(x<<N1)+(y<<N2)+z]
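A minimal sketch of the left-shift key, assuming 10 bits per axis (so coordinates from 0 to 1023); the actual shift amounts N1 and N2 are not given in the test:
const a = {};

// Pack three small non-negative integers into one numeric key.
function key (x, y, z) {
  return (x << 20) + (y << 10) + z;   // stays within the 32-bit range for x, y, z <= 1023
}

a[key(1, 2, 3)] = "tile at 1,2,3";
console.log(a[key(1, 2, 3)]);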
Other than joining the keys into a string like you mentioned (which is perfectly fine IMO), another option is to use multiple nested Maps. For example, for a 3d structure, accessing 1,2,3 (axes x, y, and z, let's say) could be done with
bigMap.get(1).get(2).get(3)
where bigMap (containing the 3D structure) has a Map value for each x slice, each of those Maps has a Map value for each y slice, and each y-slice Map holds the information at each z coordinate.
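A minimal sketch of building and reading such a nested structure (the setBlock/getBlock helpers are my own names, not from the answer):
const bigMap = new Map();

function setBlock (x, y, z, value) {
  if (!bigMap.has(x)) bigMap.set(x, new Map());
  const xSlice = bigMap.get(x);
  if (!xSlice.has(y)) xSlice.set(y, new Map());
  xSlice.get(y).set(z, value);
}

function getBlock (x, y, z) {
  const xSlice = bigMap.get(x);
  const ySlice = xSlice && xSlice.get(y);
  return ySlice && ySlice.get(z);
}

setBlock(1, 2, 3, "grass");
console.log(getBlock(1, 2, 3)); // "grass"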
But while this is a possibility, your idea of an object or Map indexed by comma-joined coordinates is perfectly fine and probably more understandable at a glance IMO.
Remember that object property lookup is O(1) just like Map lookup - there's no performance hit if you use a single object (or Map) indexed by strings.
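For comparison, a minimal sketch of the comma-joined-key approach with a single Map (a plain object indexed the same way would work too):
const tiles = new Map();

function setTile (x, y, z, value) {
  tiles.set(`${x},${y},${z}`, value);
}

function getTile (x, y, z) {
  return tiles.get(`${x},${y},${z}`);
}

setTile(1, 2, 3, "stone");
console.log(getTile(1, 2, 3)); // "stone"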
Wanted to share a simple experiment I ran, using node.js v6.11.0 under Win 10.
Goal. Compare arrays vs. objects in terms of memory occupied.
Code. Each of the functions reference, twoArrays, matrix and objects stores two series of random numbers of the same size, but organizes the data a bit differently.
reference creates two arrays of fixed size and fills them with numbers.
twoArrays fills two arrays via push (so the interpreter doesn't know the final size).
objects creates one array via push; each element is an object containing two numbers.
matrix creates a two-row matrix, also using push.
const SIZE = 5000000;
let s = [];
let q = [];
function rand () { return Math.floor(Math.random() * 10); }

function reference (size = SIZE) {
  s = new Array(size).fill(0).map(a => rand());
  q = new Array(size).fill(0).map(a => rand());
}

function twoArrays (size = SIZE) {
  s = [];
  q = [];
  let i = 0;
  while (i++ < size) {
    s.push(rand());
    q.push(rand());
  }
}

function matrix (size = SIZE) {
  s = [];
  let i = 0;
  while (i++ < size) s.push([rand(), rand()]);
}

function objects (size = SIZE) {
  s = [];
  let i = 0;
  while (i++ < size) s.push({s: rand(), q: rand()});
}
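The answer doesn't show the measurement itself; a minimal harness might look like this (assuming Node.js is started with --expose-gc so that global.gc is available, and taking heapUsed as the recorded figure, which the answer doesn't specify):
// run with: node --expose-gc memtest.js
objects();   // or reference(), twoArrays(), matrix() -- one per run
global.gc();
global.gc();
console.log(Math.round(process.memoryUsage().heapUsed / 1024 / 1024) + ' MB');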
Result. After running each function separately in a fresh environment, and after calling global.gc() a few times, the Node.js process was occupying the following memory sizes:
reference: 84 MB
twoArrays: 101 MB
objects: 249 MB
matrix: 365 MB
theoretical: assuming that each number takes 8 bytes, the size should be 5*10^6*2*8 ~ 80 MB
We see that reference resulted in the lightest memory structure, which is kind of obvious.
twoArrays takes a bit more memory. I think this is because those arrays are dynamic and the interpreter allocates memory in chunks, as soon as the next push operation exceeds the preallocated space. Hence the final memory allocation covers more than 5*10^6 numbers.
objects is interesting. Although each object has a fixed shape, it seems the interpreter doesn't treat it that way and allocates much more space for each object than necessary.
matrix is also quite interesting: obviously, when small arrays are defined explicitly in the code, the interpreter allocates more memory than required.
Conclusion. If your aim is a high-performance application, try to use arrays. They are also fast and offer O(1) time for random access. If the nature of your project requires objects, you can quite often simulate them with arrays as well (in case the number of properties in each object is fixed).
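As a rough sketch of what simulating fixed-shape objects with a flat array could look like (the field layout and helper names are my own illustration, not part of the experiment above):
// Store each record {s, q} in two consecutive slots of one flat array.
const FIELDS = 2;
const data = new Array(SIZE * FIELDS);

function setRecord (i, sVal, qVal) {
  data[i * FIELDS] = sVal;
  data[i * FIELDS + 1] = qVal;
}

function getS (i) { return data[i * FIELDS]; }
function getQ (i) { return data[i * FIELDS + 1]; }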
Hope this is useful; I would like to hear what people think, or maybe there are links to some more thorough experiments...
I am trying to reduce the memory usage of a JavaScript web application that stores a lot of information in memory in the form of a large number of small strings. When I changed the code to use Uint8Array instead of String, I noticed that memory usage went up.
For example, consider the following code that creates many small strings:
// (1000000 strings) x (10 characters)
var a=[];
for (let i=0; i<1000000; i++)
a.push("a".repeat(10).toUpperCase());
If you put it in an empty page and let the memory usage settle for a few seconds, it ends up at 70 MiB in Google Chrome. On the other hand, the following code:
// (1000000 arrays) x (10 bytes)
var a=[];
for (let i=0; i<1000000; i++)
a.push(new Uint8Array(10));
uses 233 MiB of memory. An empty page without any code uses about 20 MiB. On the other hand, if I create a small number of large strings/arrays, the difference becomes smaller and in the case of a single string/array with 10000000 characters/entries, the memory usage is virtually identical.
So why do typed arrays have such a large memory overhead?
V8 developer here. Your conclusion makes sense: If you compare characters in a string to elements in a Uint8Array, the string will have less overhead. TypedArrays are great at providing fast access to typed elements; however having a large number of small TypedArrays is not memory efficient.
The difference is in the object header size for strings and typed arrays.
For a string, the object header is:
hidden class pointer
hash
length
payload
where the payload is rounded up to pointer size alignment, so 16 bytes in this case.
For a Uint8Array, you need the following:
hidden class pointer
properties pointer (unused)
elements pointer (see below)
array buffer pointer (see below)
offset into array buffer
byte length
length of view into array buffer
length (user-visible)
embedder field #1
embedder field #2
array buffer: hidden class pointer
array buffer: properties pointer (unused)
array buffer: elements pointer (see below)
array buffer: byte length
array buffer: backing store
array buffer: allocation base
array buffer: allocation length
array buffer: bit field (internal flags)
array buffer: embedder field #1
array buffer: embedder field #2
elements object: hidden class pointer
elements object: length (of the backing store)
elements object: base pointer (of the backing store)
elements object: offset to data start
elements object: payload
where, again, the payload is rounded up to pointer size alignment, so consumes 16 bytes here.
In summary, each string consumes 5*8 = 40 bytes (three header words plus the payload rounded up to 16 bytes), and each typed array consumes 26*8 = 208 bytes (ten words for the view, ten for the ArrayBuffer, four for the elements object, plus the 16-byte payload). That does seem like a lot of overhead; the reason is the various flexible options that TypedArrays provide (they can be overlapping views into ArrayBuffers, which can be allocated directly from JavaScript, or shared with WebGL, and so on).
(It's not about "optimizing memory allocation" nor being "better at garbage collecting strings" -- since you're holding on to all the objects, GC does not play a role.)
The typed arrays are not supposed to be used that way.
If you want high memory efficiency, use just one typed array to hold all of your integer numbers, instead of a huge number of small typed arrays. The reason is low-level: holding one object in memory has a fixed overhead, and how large it is depends on a few aspects such as immutability and garbage collection. In this case, holding one typed array has higher overhead than holding one simple string; that's why you should pay that price only once.
You should take advantage of:
var a = [];
for (let i = 0; i < 1000000; i++) a.push("1");

var b = new Uint8Array(10000000);
for (let i = 0; i < 1000000; i++) b[i] = 1;

// 'b' is more memory efficient than 'a': you pay the price of the Uint8Array only once
// and save the memory wasted on per-string allocation overhead
I have to load a good chunk of data from my API, and I can choose the format in which I get the data. My question is about performance: which format is the fastest to load in a query and also the fastest to read in JavaScript?
I can have a two-dimensional array:
[0][0] = true;
[0][1] = false;
[1][2] = true;
[...]
etc etc..
Or I can have an array of objects:
[
{ x: 0, y: 0, data: true},
{ x: 0, y: 1, data: false},
{ x: 1, y: 2, data: true},
[...]
etc etc..
]
I couldn't find any benchmark for this comparison for a GET request with a huge amount of data. If there is anything anywhere, I would love to read it!
The second part of the question is to read the data. I will have a loop that will need to get the value for each coordinate.
I assume that looking up the coordinate directly in a two-dimensional array would be faster than searching through the objects on every loop iteration. Or maybe I am wrong?
Which of the two formats would be the fastest to load and read?
Thanks.
For the first part of your question regarding the GET request, I imagine the array would be slightly quicker to load, but depending on your data, it could very well be negligible. I'm basing that on the fact that, if you take out the white space, the example data you have for each member of the array is 12 bytes, while the example data for the similar object is 20 bytes. If that were true for your actual data, theoretically there would be only 3/5 of the data to transfer, but unless you're getting a lot of data it's probably not going to make a noticeable difference.
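If you want to sanity-check that size difference for your own data, one rough way is to compare the lengths of the serialized JSON for both formats (a sketch using the sample values from the question; the numbers will depend entirely on your real payload):
const grid = [
  [true, false],
  [false, false, true]
];

const points = [
  { x: 0, y: 0, data: true },
  { x: 0, y: 1, data: false },
  { x: 1, y: 2, data: true }
];

console.log(JSON.stringify(grid).length);    // character count of the 2D-array payload
console.log(JSON.stringify(points).length);  // character count of the array-of-objects payload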
To answer the second part of your question: the performance of any code is going to depend significantly on the details of your specific use case. For most situations, I think the most important point is:
Objects are significantly more readable and user-friendly
That said, when performance/speed is an issue and/or a high priority, which it sounds like could be the case for you, there are definitely things to consider. While it relates to writing data instead of reading it, I found this good comparison of the performance of arrays vs objects that brought up some interesting points. Running those tests multiple times using Chrome 45.0.2454.101 32-bit on Windows 7 64-bit, I found these points to generally be true:
Arrays will always be close to the fastest, if not the fastest
If the length of the object is known/can be hard coded, it's possible to make their performance close to and sometimes better than arrays
In the test linked above, this code using objects ran at 225 ops/sec in one of my tests:
var sum = 0;
for (var x in obj) {
  sum += obj[x].payload;
}
Compared to this code using arrays that ran at 13,620 ops/sec in the same test:
var sum = 0;
for (var x = 0; x < arr.length; ++x) {
  sum += arr[x].payload;
}
Important to note, however, is that this code using objects with a hard coded length ran at 14,698 ops/sec in the same test, beating each of the above:
var sum = 0;
for (var x = 0; x < 10000; ++x) {
  sum += obj[x].payload;
}
All of that said, it probably depends on your specific use case what will have the best performance, but hopefully this gives you some things to consider.
I've seen little utility routines in various languages that, for a desired array capacity, will compute an "ideal size" for the array. These routines are typically used when it's okay for the allocated array to be larger than the capacity. They usually work by computing an array length such that the allocated block size (in bytes) plus a memory allocation overhead is the smallest exact power of 2 needed for a given capacity. Depending on the memory management scheme, this can significantly reduce memory fragmentation as memory blocks are allocated and then freed.
JavaScript allows one to construct arrays with predefined length. So does the concept of "ideal size" apply? I can think of four arguments against it (in no particular order):
JS memory management systems work in a way that would not benefit from such a strategy
JS engines already implement such a sizing strategy internally
JS engines don't really keep arrays as contiguous memory blocks, so the whole idea is moot (except for typed arrays)
The idea applies, but memory management is so engine-dependent that no single "ideal size" strategy would be workable
On the other hand, perhaps all of those arguments are wrong and a little utility routine would actually be effective (as in: make a measurable difference in script performance).
So: Can one write an effective "ideal size" routine for JavaScript arrays?
Arrays in JavaScript are, at their core, objects. They merely act like arrays through an API. Initializing an array with an argument merely sets its length property to that value.
If the only argument passed to the Array constructor is an integer between 0 and 2^32 - 1 (inclusive), this returns a new JavaScript array with length set to that number. (Array, MDN)
Also, there is no separate array "type". An array is of Object type; it is an Array Object (ECMA 5.1).
As a result, there will be no difference in memory usage between using
var one = new Array();
var two = new Array(1000);
aside from the length property. When tested in a loop using Chrome's memory timeline, this checks out as well: creating 1000 of each of those results in roughly 2.2 MB of allocation on my machine.
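A minimal sketch of the kind of loop behind that timeline comparison (my reconstruction; the answer only describes it):
// Compare heap allocation in Chrome's memory timeline before and after each loop.
var one = [];
for (var i = 0; i < 1000; i++) one.push(new Array());

var two = [];
for (var j = 0; j < 1000; j++) two.push(new Array(1000));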
You'll have to measure performance because there are too many moving parts: the VM, the engine, and the browser; then the virtual memory (the platform, Windows/Linux, the physically available memory and mass storage devices, HDD/SSD); and, obviously, the current load (other open web pages or, if server-side, other applications).
I see little use in such an effort. Any ideal size for performance may just not be ideal anymore when another tab loads in the browser or the page is loaded on another machine.
The best thing I see to improve here is development time: write less and deploy your website more quickly.
I know this question and its answer were about memory usage. BUT although there might be no difference in allocated memory size between calling the two constructors (with and without the size parameter), there is a difference in performance when filling the array. Chrome's engine apparently performs some pre-allocation, as suggested by this code run under the Chrome profiler:
<html>
<body>
<script>
function preAlloc() {
  var a = new Array(100000);
  for (var i = 0; i < a.length; i++) {
    a[i] = i;
  }
}

function noAlloc() {
  var a = [];
  var length = 100000;
  for (var i = 0; i < length; i++) {
    a[i] = i;
  }
}

function repeat(func, count) {
  var i = 0;
  while (i++ < count) {
    func();
  }
}
</script>
Array performance test
<script>
// 2413 ms scripting
repeat(noAlloc, 10000);
repeat(preAlloc, 10000);
</script>
</body>
</html>
The profiler shows that the function without the size parameter took 28 s to allocate and fill the 100,000-item array 1,000 times, while the function with the size parameter in the Array constructor took under 7 seconds.