Are there ideal array sizes in JavaScript? - javascript

I've seen little utility routines in various languages that, for a desired array capacity, will compute an "ideal size" for the array. These routines are typically used when it's okay for the allocated array to be larger than the capacity. They usually work by computing an array length such that the allocated block size in bytes, plus the allocator's per-block overhead, rounds up to an exact power of 2 that is at least as large as the requested capacity. Depending on the memory management scheme, this can significantly reduce memory fragmentation as memory blocks are allocated and then freed.
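For concreteness, here is a sketch of the kind of routine I mean (the element size and allocation overhead are assumptions passed in by the caller; JavaScript does not expose the allocator's real numbers):
function idealCapacity(desiredCapacity, bytesPerElement, allocOverhead) {
  var needed = desiredCapacity * bytesPerElement + allocOverhead;
  var block = 1;
  while (block < needed) {
    block *= 2;                 // round the block up to the next power of two
  }
  // largest capacity whose data plus overhead still fits in that block
  return Math.floor((block - allocOverhead) / bytesPerElement);
}
// e.g. idealCapacity(1000, 8, 16) === 1022  (an 8192-byte block minus 16 bytes overhead)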
JavaScript allows one to construct arrays with predefined length. So does the concept of "ideal size" apply? I can think of four arguments against it (in no particular order):
JS memory management systems work in a way that would not benefit from such a strategy
JS engines already implement such a sizing strategy internally
JS engines don't really keep arrays as contiguous memory blocks, so the whole idea is moot (except for typed arrays)
The idea applies, but memory management is so engine-dependent that no single "ideal size" strategy would be workable
On the other hand, perhaps all of those arguments are wrong and a little utility routine would actually be effective (as in: make a measurable difference in script performance).
So: Can one write an effective "ideal size" routine for JavaScript arrays?

Arrays in JavaScript are, at their core, objects. They merely act like arrays through an API. Initializing an array with a numeric argument merely sets the length property to that value.
If the only argument passed to the Array constructor is an integer between 0 and 2^32 - 1 (inclusive), this returns a new JavaScript array with length set to that number. - Array, MDN
Also, there is no array "type". An array has the type Object; it is an Array object as defined in ECMA-262 5.1.
As a result, there will be no difference in memory usage between using
var one = new Array();
var two = new Array(1000);
aside from the length property. When tested in a loop using Chrome's memory timeline, this checks out as well: creating 1000 of each of those results in roughly 2.2 MB of allocation on my machine.
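For reference, a rough sketch of the kind of loop behind that comparison (the names are mine): take a heap snapshot before and after each call and compare the deltas.
var holders = [];
function allocateMany(useLength) {
  for (var i = 0; i < 1000; i++) {
    holders.push(useLength ? new Array(1000) : new Array());
  }
}
// allocateMany(false);  // ~2.2 MB on the machine above
// allocateMany(true);   // roughly the same, aside from the length property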

You'll have to measure, because there are too many moving parts: the VM, the engine, and the browser; then the virtual memory (the platform, Windows/Linux, the physical memory available, and the mass storage devices, HD/SSD); and, obviously, the current load (other open web pages or, server-side, other applications).
I see little use in such an effort. Any size that is ideal for performance may no longer be ideal once another tab loads in the browser or the page is loaded on another machine.
The best thing to improve here is development time: write less and deploy your website more quickly.

I know this question and the answer were about memory usage. BUT although there might be no difference in the allocated memory size between calling the two constructors (with and without the size parameter), there is a difference in performance when filling the array. The Chrome engine obviously performs some pre-allocation, as suggested by this code run in the Chrome profiler:
<html>
  <body>
    <script>
      function preAlloc() {
        var a = new Array(100000);
        for (var i = 0; i < a.length; i++) {
          a[i] = i;
        }
      }

      function noAlloc() {
        var a = [];
        var length = 100000;
        for (var i = 0; i < length; i++) {
          a[i] = i;
        }
      }

      function repeat(func, count) {
        var i = 0;
        while (i++ < count) {
          func();
        }
      }
    </script>
    Array performance test
    <script>
      // 2413 ms scripting
      repeat(noAlloc, 10000);
      repeat(preAlloc, 10000);
    </script>
  </body>
</html>
The profiler shows that the function without the size parameter took 28 s to allocate and fill the 100,000-item array 1000 times, while the function that passes the size to the Array constructor took under 7 seconds.
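For reference, roughly the same comparison can be made without the profiler by wrapping the calls above in console.time (my addition, not part of the profiled page; the numbers will differ from the profiler's "scripting" figures):
console.time('noAlloc');
repeat(noAlloc, 10000);
console.timeEnd('noAlloc');

console.time('preAlloc');
repeat(preAlloc, 10000);
console.timeEnd('preAlloc');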

Related

Are two for loops slower than a bigger one?

Which piece of code is faster
Number 1:
for (var i = 0; i < 50; i++) {
  // run code in here
}
for (var i = 0; i < 50; i++) {
  // run more code in here
}
Or number 2:
for (var i = 0; i < 100; i++) {
  // run all the code in here
}
Thanks!
As already pointed out in another answer, both loops yield the same O(N) scaling behaviour (both for whatever happens in the loop body and for scaling the loop lengths 50 and 100, respectively). The point usually is the proportionality factor that accompanies the term, i.e. the c in c·N.
On many (most) real CPU systems used for performance-relevant computation, there are caches and pipelines for the data manipulated inside the loops. The answer to the question then depends on the details of your loop bodies (will all the data read/written in the loops fit into some level of cache, or will the second 50-iteration loop miss all existing cache entries and fetch the data from memory again?). Additionally, branch prediction (for the loop exit/repeat branch as well as for branches inside the loop) has a complicated influence on the actual performance.
Taking all relevant details into account exactly is a field of computational science in its own right. One should analyse the concrete example (what do the loops actually do?) and, before that, whether this loop is relevant at all.
Some heuristic may nevertheless be helpful:
If i is an iterator (and not only a repetition counter), the two 1..50 loops might be working on the same data.
If it is possible to handle each element in both loop bodies (which only works if there are no dependencies between the second loop and the state of other elements after the first loop), it is usually more efficient to touch each index only once (see the sketch below).
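A minimal sketch of what is meant; doA and doB are hypothetical stand-ins for the two loop bodies:
function doA(x) { return x * 2; }
function doB(x) { return x + 1; }

var data = [];
for (var n = 0; n < 50; n++) data.push(n);

// Variant 1: two loops - the second pass may re-fetch data that has left the cache.
for (var i = 0; i < data.length; i++) data[i] = doA(data[i]);
for (var j = 0; j < data.length; j++) data[j] = doB(data[j]);

// Variant 2: one fused loop - each element is touched while it is still "hot".
for (var k = 0; k < data.length; k++) data[k] = doB(doA(data[k]));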
It would depend on the logic inside of them (nested loops etc.). Theoretically, they'd run the same, as they're both linear (both are 100 iterations in total), so the Big O time complexity is O(N), where N is the size of the loop.
Big O Time Complexity

node.js arrays vs. objects from utilized memory point of view

Wanted to share a simple experiment I ran, using node.js v6.11.0 under Win 10.
Goal. Compare arrays vs. objects in terms of memory occupied.
Code. Each of the functions reference, twoArrays, matrix and objects creates two arrays of the same size, containing random numbers. They just organize the data a bit differently.
reference creates two arrays of fixed size and fills them with numbers.
twoArrays fills two arrays via push (so the interpreter doesn't know the final size).
objects creates one array via push, each element is an object containing two numbers.
matrix creates a two-row matrix, also using push.
const SIZE = 5000000;
let s = [];
let q = [];

function rand() { return Math.floor(Math.random() * 10); }

function reference(size = SIZE) {
  s = new Array(size).fill(0).map(a => rand());
  q = new Array(size).fill(0).map(a => rand());
}

function twoArrays(size = SIZE) {
  s = [];
  q = [];
  let i = 0;
  while (i++ < size) {
    s.push(rand());
    q.push(rand());
  }
}

function matrix(size = SIZE) {
  s = [];
  let i = 0;
  while (i++ < size) s.push([rand(), rand()]);
}

function objects(size = SIZE) {
  s = [];
  let i = 0;
  while (i++ < size) s.push({s: rand(), q: rand()});
}
Result. After running each function separately in a fresh environment, and after calling global.gc() a few times, the Node.js process was occupying the following amounts of memory:
reference: 84 MB
twoArrays: 101 MB
objects: 249 MB
matrix: 365 MB
theoretical: assuming that each number takes 8 bytes, the size should be 5*10^6 * 2 * 8 bytes ≈ 80 MB
We see that reference resulted in the lightest memory structure, which is kind of obvious.
twoArrays takes a bit more memory. I think this is because the arrays there are dynamic and the interpreter allocates memory in chunks as soon as the next push operation exceeds the preallocated space. Hence the final allocation covers more than 5*10^6 slots per array.
objects is interesting. Although each object has a fixed shape, it seems the interpreter doesn't treat it that way and allocates much more space for each object than necessary.
matrix is also quite interesting: obviously, with explicit small arrays created in the code, the interpreter allocates more memory than required.
Conclusion. If your aim is a high-performance application, try to use arrays. They are also fast and have O(1) time for random access. If the nature of your project requires objects, you can quite often simulate them with arrays as well (if the number of properties in each object is fixed).
Hope this is useful; I'd like to hear what people think, or maybe there are links to some more thorough experiments...

Can reducing index length in Javascript associative array save memory

I am trying to build a large Array (22,000 elements) of Associative Array elements in JavaScript. Do I need to worry about the length of the indices with regards to memory usage?
In other words, which of the following options saves memory? or are they the same in memory consumption?
Option 1:
var student = new Array();
for (i = 0; i < 22000; i++)
  student[i] = {
    "studentName": token[0],
    "studentMarks": token[1],
    "studentDOB": token[2]
  };
Option 2:
var student = new Array();
for (i = 0; i < 22000; i++)
  student[i] = {
    "n": token[0],
    "m": token[1],
    "d": token[2]
  };
I tried to test this in Google Chrome DevTools, but the numbers are too inconsistent to make a decision. My best guess is that because the keys repeat, the browser can optimize memory usage by not storing them again for each student[i], but that is just a guess.
Edit:
To clarify, the problem is the following: a large array containing many small associative arrays. Does it matter, for memory requirements, whether long or short keys are used?
Edit 2:
The 3N array approach that was suggested in the comments, and that #Joseph Myers is referring to, means creating one array, var student = [], with a size of 3*22000, and then using student[0] for the first name, student[1] for the first marks, student[2] for the first DOB, and so on.
Thanks.
The difference is insignificant, so the answer is no. This sort of thing would barely even qualify as micro-optimization. You should always opt for the most readable solution when facing such dilemmas: the cost of maintaining code written as in your second option outweighs any performance gain you might get from it (if there is any).
What you should do, though, is use the literal for creating an array: [] instead of new Array() (just a side note).
A better approach to your problem would probably be to find a way to load the data in parts, implementing some kind of pagination (I assume you're not doing heavy computations on the client).
The main analysis of associative arrays' computational cost has to do with performance degradation as the number of elements stored increases, but there are some results available about performance loss as the key length increases.
In Algorithms in C by Sedgewick, it is noted that for some key-based storage systems the search cost does not grow with the key length, and for others it does. All of the comparison-based search methods depend on key length--if two keys differ only in their rightmost bit, then comparing them requires time proportional to their length. Hash-based methods always require time proportional to the key length (in order to compute the hash function).
Of course, the key takes up storage space within the original code and/or at least temporarily in the execution of the script.
The kind of storage used for JavaScript may vary between browsers, but in a resource-constrained environment using smaller keys would have an advantage, though likely one still too small to notice; surely there are some cases where the advantage would be worthwhile.
P.S. My library just got in two new books that I ordered in December about the latest computational algorithms, and I can check them tomorrow to see if there are any new results about key length impacting the performance of associative arrays / JS objects.
Update: Keys like studentName take 2% longer on a Nexus 7 and 4% longer on an iPhone 5. This is negligible to me. I averaged 500 runs of creating a 30,000-element array with each element containing an object { a: i, b: 6, c: 'seven' } vs. 500 runs using an object { studentName: i, studentMarks: 6, studentDOB: 'seven' }. On a desktop computer, the program still runs so fast that the processor's frequency / number of interrupts, etc., produce varying results and the entire program finishes almost instantly. Once every few runs, the big key size actually goes faster (because other variations in the testing environment affect the result more than 2-4%, since the JavaScript timer is based on clock time rather than CPU time.) You can try it yourself here: http://dropoff.us/private/1372219707-1-test-small-objects-key-size.html
Your 3N array approach (using array[0], array[1], and array[2] for the contents of the first object; array[3], array[4], and array[5] for the second object, etc.) works much faster than any object method. It's five times faster than the small-object method, and five times faster plus 2-4% compared to the big-object method on a desktop, and it is 11 times faster on a Nexus 7.
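For illustration, a minimal sketch of that flat 3N layout (the helper names are mine, not from the original test):
var N = 22000;
var flat = new Array(3 * N);

function setStudent(i, name, marks, dob) {
  flat[3 * i]     = name;
  flat[3 * i + 1] = marks;
  flat[3 * i + 2] = dob;
}
function getStudentName(i)  { return flat[3 * i]; }
function getStudentMarks(i) { return flat[3 * i + 1]; }
function getStudentDOB(i)   { return flat[3 * i + 2]; }

setStudent(0, "Ada", 97, "1815-12-10");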

Performance of assigning values to array

It is often said here on SO that profiling is the first step in optimizing JavaScript, and the suggested tools are the profilers in Chrome and Firefox. The problem with those is that they report, in a somewhat opaque way, the time each function took, and I have not gained a good understanding from them. The most helpful approach would be a profiler that reports how many times each line is executed and, if at all possible, the time spent on each line. That way the bottlenecks could be seen precisely. But until such a tool is implemented or found, we have two options:
1) make our own counter that measures both the time and how many times a certain code block or line is executed
2) learn to understand which methods are slow and which are not
For option 2, jsperf.com is of great help. I have tried to learn about optimizing arrays and made a speed test on jsperf.com. The following image shows the results in the 5 main browsers; it revealed some bottlenecks I didn't know about earlier.
The main findings were:
1) Assigning values to arrays is significantly slower than assigning to normal variables, regardless of which method is used for the assignment.
2) Preinitializing and/or prefilling an array before performance-critical loops can improve speed significantly.
3) Math trigonometric functions are not that slow compared to pushing values into arrays(!)
Here are the explanations of every test:
1. non_array (100%):
The variables were given a predefined value this way:
var non_array_0=0;
var non_array_1=0;
var non_array_2=0;
...
and in the timed region they were called this way:
non_array_0=0;
non_array_1=1;
non_array_2=2;
non_array_3=3;
non_array_4=4;
non_array_5=5;
non_array_6=6;
non_array_7=7;
non_array_8=8;
non_array_9=9;
The above is an array-like set of variables, but there seems to be no way to iterate over them or refer to them the way one can with an array. Or is there?
Nothing in this test is faster than assigning a number to a variable.
2. non_array_non_pre (83.78%)
Exactly the same as test 1, but the variables were not preinitialized or prefilled. The speed is 83.78% of the speed of test 1. In every tested browser the prefilled variables were faster than the non-prefilled ones. So initialize (and possibly prefill) variables outside any speed-critical loops.
The test code is here:
var non_array_non_pre_0=0;
var non_array_non_pre_1=0;
var non_array_non_pre_2=0;
var non_array_non_pre_3=0;
var non_array_non_pre_4=0;
var non_array_non_pre_5=0;
var non_array_non_pre_6=0;
var non_array_non_pre_7=0;
var non_array_non_pre_8=0;
var non_array_non_pre_9=0;
3. pre_filled_array (19.96%):
Arrays are evil! When we throw away normal variables (tests 1 and 2) and bring arrays into the picture, the speed decreases significantly. Although we apply all the optimizations (preinitialize and prefill the array) and then assign values directly without looping or pushing, the speed drops to 19.96 percent. This is very sad and I really don't understand why it happens. This was one of the main shocks of this test for me. Arrays are so important, and I have not found a way to do many things without them.
The test data is here:
pre_filled_array[0]=0;
pre_filled_array[1]=1;
pre_filled_array[2]=2;
pre_filled_array[3]=3;
pre_filled_array[4]=4;
pre_filled_array[5]=5;
pre_filled_array[6]=6;
pre_filled_array[7]=7;
pre_filled_array[8]=8;
pre_filled_array[9]=9;
4. non_pre_filled_array (8.34%):
This is the same as test 3, but the array members are not preinitialized or prefilled; the only optimization was declaring the array beforehand: var non_pre_filled_array = [];
The speed decreases by 58.23% compared to the preinitialized test 3. So preinitializing and/or prefilling the array more than doubles the speed.
The test code is here:
non_pre_filled_array[0]=0;
non_pre_filled_array[1]=1;
non_pre_filled_array[2]=2;
non_pre_filled_array[3]=3;
non_pre_filled_array[4]=4;
non_pre_filled_array[5]=5;
non_pre_filled_array[6]=6;
non_pre_filled_array[7]=7;
non_pre_filled_array[8]=8;
non_pre_filled_array[9]=9;
5. pre_filled_array[i] (7.10%):
Now to the loops. This is the fastest looping method in this test. The array was preinitialized and prefilled.
The speed drop compared to the inline version (test 3) is 64.44%. This is such a remarkable difference that I would say: do not loop if you don't need to. If the array is small (how small has to be tested separately), using inline assignments instead of a loop is wiser.
And because the speed drop is so huge and we really do need loops, it is wise to find a better looping method, e.g. while(i--); a sketch follows the test code below.
The test code is here:
for (var i = 0; i < 10; i++) {
  pre_filled_array[i] = i;
}
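For comparison, the while(i--) variant mentioned above might look like this (a sketch, not part of the jsperf test; it fills the same prefilled array in reverse order):
var i = pre_filled_array.length;
while (i--) {
  pre_filled_array[i] = i;
}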
6. non_pre_filled_array[i] (5.26%):
If we do not preinitialize and prefill the array, the speed decreases by another 25.96%. Again, preinitializing and/or prefilling before speed-critical loops is wise.
The code is here:
for (var i = 0; i < 10; i++) {
  non_pre_filled_array[i] = i;
}
7. Math calculations (1.17%):
Every test has to have some reference point, and mathematical functions are considered slow. The test consisted of ten "heavy" Math calculations, and here comes the other thing that struck me in this test: look at the speeds of tests 8 and 9, where ten integers are pushed into an array in a loop. Calculating these 10 Math functions is more than 30% faster than pushing ten integers into an array in a loop. So it may be easier to convert some array pushes into preinitialized non-array variables and keep the trigonometry.
Of course, if there are hundreds or thousands of calculations per frame, it is wise to use e.g. sqrt instead of sin/cos/tan, taxicab distances for distance comparisons, and diamond angles (t-radians) for angle comparisons, but the main bottleneck can still be elsewhere: looping is slower than inlining, pushing is slower than direct assignment with preinitialization and/or prefilling, and code logic, drawing algorithms, and DOM access can be slow. Not everything can be optimized in JavaScript (we have to see something on the screen!), but everything that is easy and significant is wise to do.
Someone here on SO has said that code is for humans and that readable code is more essential than fast code, because maintenance is the biggest cost. That is an economic viewpoint, but I have found that code optimization can achieve both: elegance and readability as well as performance. And if a 5% performance boost is achieved and the code becomes more straightforward, it gives a good feeling!
The code is here:
non_array_0=Math.sqrt(10435.4557);
non_array_1=Math.atan2(12345,24869);
non_array_2=Math.sin(35.345262356547);
non_array_3=Math.cos(232.43575432);
non_array_4=Math.tan(325);
non_array_5=Math.asin(3459.35498534536);
non_array_6=Math.acos(3452.35);
non_array_7=Math.atan(34.346);
non_array_8=Math.pow(234,222);
non_array_9=9374.34524/342734.255;
8. pre_filled_array.push(i) (0.8%):
Push is evil! Push combined with a loop is baleful evil! For some reason this is a very slow way to assign values into an array. Test 5 (direct assignments in a loop) is nearly 9 times faster than this method, and both methods do exactly the same thing: assign the integers 0-9 into a preinitialized and prefilled array. I have not tested whether this push-for-loop evilness is due to the pushing, the looping, the combination of both, or the loop count. There are other examples on jsperf.com that give conflicting results. It is wiser to test with your actual data and decide from that. This test may not carry over to data other than what was used here.
And here is the code:
for (var i = 0; i < 10; i++) {
  pre_filled_array.push(i);
}
9. non_pre_filled_array.push(i) (0.74%):
The last and slowest method in this test is the same as test 8, but the array is not prefilled. It is a little slower than test 8, but the difference is not significant (7.23%). Let's take an example and compare this slowest method to the fastest: the speed of this method is 0.74% of the speed of method 1, which means method 1 is 135 times faster. So think carefully whether arrays are needed at all in a particular use case. If it is only one or a few pushes, the total speed difference is not noticeable; on the other hand, if there are only a few pushes, they are very simple and elegant to convert to non-array variables.
This is the code:
for (var i = 0; i < 10; i++) {
  non_pre_filled_array.push(i);
}
And finally the obligatory SO question:
Because the speed difference between non-array variable assignments and array assignments seems to be so huge according to this test, is there any method to get the speed of non-array variable assignments together with the dynamics of arrays?
I cannot use var variable_$i = 1 in a loop in such a way that $i is replaced by some integer. I have to use variable[i] = 1, which is significantly slower than var variable1 = 1, as the test showed. This may be critical only when there are large arrays, and in many cases there are.
EDIT:
I made a new test to confirm the slowness of arrays access and tried to find faster way:
http://jsperf.com/read-write-array-vs-variable
Array reads and/or writes are significantly slower than using normal variables. If several operations are performed on an array member, it is wiser to store the member's value in a temporary variable, perform the operations on that temp variable, and finally store the value back into the array member (a short sketch follows below). And although the code becomes longer, it is significantly faster to do those operations inline than in a loop.
Conclusion: arrays vs. normal variables are analogous to disk vs. memory. Memory access is usually faster than disk access, and normal variable access is faster than array access. Maybe chaining operations is also faster than using intermediate variables, but that makes the code a little less readable.
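As a sketch of that temp-variable advice (an illustrative example, not taken from the jsperf test):
var arr = [1, 2, 3, 4];
for (var i = 0; i < arr.length; i++) {
  var v = arr[i];        // one array read
  v = v * 2;
  v = v + 1;
  v = v * v;
  arr[i] = v;            // one array write
}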
Assigning values to arrays is significantly slower than assigning to normal variables. Arrays are evil! This is very sad and I really don't understand why this occurs. Arrays are so important!
That's because normal variables are statically scoped and can be (and are) easily optimised. The compiler/interpreter will learn their type, and might even avoid repeated assignments of the same value.
These kinds of optimisations will be done for arrays as well, but they are not as easy and take longer to kick in. There is additional overhead when resolving the property reference, and since JavaScript arrays are auto-growing lists the length needs to be checked as well.
Prepopulating the arrays will help to avoid reallocations for capacity changes, but for your little arrays (length=10) it shouldn't make much difference.
Is there any method to get the speed of non-array variable assignments and the dynamics of arrays?
No. Dynamics do cost, but they are worth it - as are loops.
You will hardly ever be in a situation where you need such a micro-optimisation; don't try it. The only thing I can think of is fixed-size loops (n <= 4) when dealing with ImageData, where inlining is applicable.
Push is evil!
Nope, only your test was flawed. The jsperf snippets are executed in a timed loop without setup and teardown, and only there were you resetting the size. Your repeated pushes were producing arrays with lengths in the hundreds of thousands, with the corresponding memory (re)allocations. See the console at http://jsperf.com/pre-filled-array/11.
Actually push is just as fast as property assignment. Good measurements are rare, but those that are done properly show varying results across different browser engine versions, changing rapidly and unexpectedly. See How to append something to an array?, Why is array.push sometimes faster than array[n] = value? and Is there a reason JavaScript developers don't use Array.push()? - the conclusion is that you should use what is most readable / appropriate for your use case, not what you think could be faster.

Performance Memory Management in JavaScript

Let me start with the questions, and then fill in the reasons/background.
Question: Are there any memory profiling tools for JavaScript?
Question: Has anybody tested performance memory management in JavaScript already?
I would like to experiment with performance memory management in JavaScript. In C/C++/Assembly I was able to allocate a region of memory in one giant block, then map my data structures to that area. This had several performance advantages, especially for math heavy applications.
I know I cannot allocate memory and map my own data structures in JavaScript (or Java for that matter). However, I can create a stack/queue/heap with some predetermined number of objects, for example Vector objects. When crunching numbers I often need just a few such objects at any one time, but generate a large number over time. By reusing the old vector objects I can avoid the create/delete time, unnecessary garbage collection time, and potentially large memory footprint while waiting for garbage collection. I also hypothesize that they will all stay fairly close in memory because they were created at the same time and are being accessed frequently.
I would like to test this, but I am coming up short on memory profiling tools. I tried Firebug, but it does not tell you how much memory the JavaScript engine is currently allocating.
I was able to code a simple test for CPU performance (see below). I compared a queue of 10 "Vector" objects to using new/delete each time. To make certain I wasn't just using empty data, I gave each Vector six floating-point properties, a three-value array (floats), and an 18-character string. Each time I created a vector, with either method, I would set all the values to 0.0.
The results were encouraging. The explicit management method was initially faster, but the JavaScript engine did some caching and caught up after the test had run a couple of times. The most interesting part was that Firebug crashed when I tried to run the standard new/delete on 10 million objects, but it worked just fine with my queue method.
If I can find memory profiling tools, I would like to test this on different structures (array, heap, queue, stack). I would also like to test it on a real application, perhaps a super simple ray tracer (quick to code, can test very large data sets with lots of math for nice profiling).
And yes, I did search before creating this question. Everything I found was either a discussion of memory leaks in JavaScript or a discussion of GC vs. Explicit Management.
Thanks,
JB
Standard Method
function setBaseVectorValues(vector) {
  vector.x = 0.0;
  vector.y = 0.0;
  vector.z = 0.0;
  vector.theta = 0.0;
  vector.phi = 0.0;
  vector.magnitude = 0.0;
  vector.color = [0.0, 0.0, 0.0];
  vector.description = "a blank base vector";
}

function standardCreateObject() {
  var vector = new Object();
  setBaseVectorValues(vector);
  return vector;
}

function standardDeleteObject(obj) {
  delete obj;
}

function testStandardMM(count) {
  var start = new Date().getTime();
  for (i = 0; i < count; i++) {
    obj = standardCreateObject();
    standardDeleteObject(obj);
  }
  var end = new Date().getTime();
  return "Time: " + (end - start);
}
Managed Method
I used the JavaScript queue from http://code.stephenmorley.org/javascript/queues/
function newCreateObject() {
  var vector = allocateVector();
  setBaseVectorValues(vector);
  return vector;
}

function newDeleteObject(obj) {
  queue.enqueue(obj);
}

function newInitObjects(bufferSize) {
  queue = new Queue();
  for (i = 0; i < bufferSize; i++) {
    queue.enqueue(standardCreateObject());
  }
}

function allocateVector() {
  var vector;
  if (queue.isEmpty()) {
    vector = new Object();
  } else {
    vector = queue.dequeue();
  }
  return vector;
}

function testNewMM(count) {
  start = new Date().getTime();
  newInitObjects(10);
  for (i = 0; i < count; i++) {
    obj = newCreateObject();
    newDeleteObject(obj);
    obj = null;
  }
  end = new Date().getTime();
  return "Time: " + (end - start) + " Vectors Available: " + queue.getLength();
}
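For completeness, a hypothetical driver to run the two methods side by side (not part of the original test code; the count is arbitrary):
var count = 1000000;
console.log("standard: " + testStandardMM(count));
console.log("pooled:   " + testNewMM(count));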
The Chrome inspector has a decent JavaScript profiling tool. I'd try that...
I have never seen such a tool but, in actuality, JavaScript [almost] never runs independently; it is [almost] always hosted within another application (e.g. your browser). It does not really matter how much memory is associated with your specific data structures; what matters is how the overall memory consumption of the host application is affected by your scripts.
I would recommend finding a generic memory profiling tool for your OS and pointing it at your browser. Run a single page and profile the browser's change in memory consumption before and after triggering your code.
The only exception to what I said above that I can think of right now is Node.js... If you are using Node then you can use process.memoryUsage().
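For example, a minimal sketch of reading heap usage around a test in Node.js (the allocation loop is just an illustration):
const before = process.memoryUsage().heapUsed;
const pool = [];
for (let i = 0; i < 100000; i++) pool.push({ x: 0, y: 0, z: 0 });
const after = process.memoryUsage().heapUsed;
console.log('delta ~ ' + ((after - before) / 1024 / 1024).toFixed(1) + ' MB');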
Edit: Oooo... After some searching, it appears that Chrome has some sweet tools as well. (+1 for Michael Berkompas). I still stand by my original statement, that it is actually more important to see how the memory usage of the browser process itself is affected, but the elegance of the Chrome tools is impressive.
