JavaScript performance: array of objects, preassignment vs direct use

I have a question about how speed is affected by the way object data in an array is used: accessing it directly versus first assigning it to simple variables.
I have an array of, for example, 1000 elements.
Every array item is an object with, say, 10 properties.
Finally, I use some of these properties to do 10 calculations.
So I have APPROACH1
var nn = my_array.length;
var a1,a2,a3,a4 ... a10;
var cal1,cal2,.. cal10
for (var x=0;x<nn;x++)
{ // assignment
a1=my_array[x].data1;
..
a10 =my_array[x].data10;
// calculations
cal1 = a1*a10 +a2*Math.abs(a3);
...
cal10 = (a8-a7)*4 +Math.sqrt(a9);
}
And APPROACH2
var nn = my_array.length;
for (var x=0;x<nn;x++)
{
// calculations
cal1 = my_array[x].data1*my_array[x].data10 +my_array[x].data2*Math.abs(my_array[x].data3);
...
cal10 = (my_array[x].data8-my_array[x].data7)*4 +Math.sqrt(my_array[x].data9);
}
Is assigning the values a1 ... a10 from my_array and then doing the calculations faster than doing the calculations directly on the my_array[x] properties, or is it the other way around?
I don't know how the 'JS compiler' works.

The short answer is: it depends on your JavaScript engine. There is no right or wrong here, only "this has worked in the past" and "this doesn't seem to speed things up any more".
TL;DR: If I could not run a jsperf test, I would go with the "Cached example" shown below.
A general rule of thumb is (read: was) that if you are going to use an array element more than once, it could be faster to cache it in a local variable, and if you are going to use an object property more than once, it should also be cached.
Example:
You have this code:
// Data generation (not discussed here)
function GetLotsOfItems() {
var ret = [];
for (var i = 0; i < 1000; i++) {
ret[i] = { calc1: i * 4, calc2: i * 10, calc3: i / 5 };
}
return ret;
}
// Your calculation loop
var myArray = GetLotsOfItems();
for (var i = 0; i < myArray.length; i++) {
var someResult = myArray[i].calc1 + myArray[i].calc2 + myArray[i].calc3;
}
Depending on your browser (read: this REALLY depends on your browser and its JavaScript engine), you could make this faster in a number of different ways.
You could, for example, cache the element being used in the calculation loop.
Cached example:
// Your cached calculation loop
var myArray = GetLotsOfItems();
var element;
var arrayLen = myArray.length;
for (var i = 0; i < arrayLen ; i++) {
element = myArray[i];
var someResult = element.calc1 + element.calc2 + element.calc3;
}
You could also take this a step further and run it like this:
var myArray = GetLotsOfItems();
var element;
for (var i = myArray.length; i--;) { // Start at last element, travel backwards to the start
element = myArray[i];
var someResult = element.calc1 + element.calc2 + element.calc3;
}
What happens here is that the loop starts at the last element. The condition i-- first tests whether i is non-zero and only AFTER that lowers it by one, so the body also runs with i == 0 (with --i as the condition, the body of a 1000-element loop would only see 999 down to 1). In modern code this is usually slower, because the array is read backwards, and reading an array in forward order usually allows run-time or compile-time optimization (which is automatic, so you don't need to do anything for it to work). Depending on your JavaScript engine this might not apply, and the backwards loop could be faster.
In my experience, however, it runs slower in Chrome than the second, "kinda-optimized" version. I have not tested this in jsperf, but in a CSP solver I wrote 2 years ago I ended up caching array elements, but not properties, and running my loops from 0 to length.
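The difference between i-- and --i as a loop condition is easy to check in isolation. A minimal sketch (the helper names visitPostDecrement and visitPreDecrement are made up for illustration) that records which indices each form visits:

```javascript
// Hypothetical helpers, not part of the original answer.

// Records the indices visited when the loop condition is `i--`.
function visitPostDecrement(length) {
  var visited = [];
  for (var i = length; i--;) {
    visited.push(i);
  }
  return visited;
}

// Records the indices visited when the loop condition is `--i`.
// Only safe for length >= 1: starting from 0, `--i` never becomes falsy.
function visitPreDecrement(length) {
  var visited = [];
  for (var i = length; --i;) {
    visited.push(i);
  }
  return visited;
}

console.log(visitPostDecrement(4)); // [3, 2, 1, 0]
console.log(visitPreDecrement(4));  // [3, 2, 1]
```

The pre-decrement form skips index 0, which is why the answer uses i--.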
You should (in most cases) write your code in a way that makes it easy to read and maintain. Caching array elements is, in my opinion, as easy to read as (if not easier than) non-cached access. It might be faster (it is, at least, not slower), and it is quicker to write if you use an IDE with autocomplete for JavaScript :P
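Since the outcome depends on the engine, the only reliable answer comes from measuring. Below is a rough timing sketch, assuming made-up helper names (makeItems, sumDirect, sumCached, timeIt) and the same object shape as the example data above. It is not a rigorous benchmark: Date.now() is coarse and JIT warm-up is ignored.

```javascript
// Hypothetical test data and helper names, chosen for illustration.
function makeItems(count) {
  var items = [];
  for (var i = 0; i < count; i++) {
    items.push({ calc1: i * 4, calc2: i * 10, calc3: i / 5 });
  }
  return items;
}

// Reads myArray[i] three times per iteration.
function sumDirect(myArray) {
  var total = 0;
  for (var i = 0; i < myArray.length; i++) {
    total += myArray[i].calc1 + myArray[i].calc2 + myArray[i].calc3;
  }
  return total;
}

// Caches the element and the length in local variables.
function sumCached(myArray) {
  var total = 0;
  var element;
  var arrayLen = myArray.length;
  for (var i = 0; i < arrayLen; i++) {
    element = myArray[i];
    total += element.calc1 + element.calc2 + element.calc3;
  }
  return total;
}

// Runs fn repeatedly and returns the elapsed wall-clock time in milliseconds.
function timeIt(fn, data, repetitions) {
  var start = Date.now();
  for (var r = 0; r < repetitions; r++) {
    fn(data);
  }
  return Date.now() - start;
}

var items = makeItems(1000);
console.log("direct:", timeIt(sumDirect, items, 10000), "ms");
console.log("cached:", timeIt(sumCached, items, 10000), "ms");
```

Both functions return a sum so that the engine cannot discard the loop body as dead code. Run the comparison several times and in each browser you care about before drawing conclusions.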

Related

Is it worth it to convert array into set to search in NodeJS

I would like to know if it is worth converting an array into a Set in order to search it in NodeJS.
My use case is that this search is done lots of times, but not necessarily on big sets of data (the array can go up to ~2000 items from time to time).
I am looking for a specific id in a list.
Which approach is better:
const isPresent = (myArray, id) => {
return myArray.some((arrayElement) => arrayElement.id === id); // some() already returns a boolean
}
or
const mySet = new Set(myArray.map((arrayElement) => arrayElement.id)); // store the ids, not the objects
const isPresent = (mySet, id) => {
return mySet.has(id);
}
I know that theoretically the second approach is better, as it is O(1) versus O(n) for the first approach. But can the instantiation of the Set offset the gain on small arrays?
@jonrsharpe - particularly for your case, I found that converting an array of 2k items to a Set by itself takes ~1.15ms. No doubt searching a Set is faster than searching an Array, but in your case this additional conversion can be a little costly.
You can run the code below in your browser console to check. new Set(arr) takes almost ~1.2ms.
var arr = [], set = new Set(), n = 2000;
for (let i = 0; i < n; i++) {
arr.push(i);
};
console.time('Set');
set = new Set(arr);
console.timeEnd('Set');
Adding elements to a Set is comparatively costly.
The code below measures the time required to insert items into an array and into a Set. It shows that Array insertion is faster than Set insertion.
var arr = [], set = new Set(), n = 2000;
console.time('Array');
for (let i = 0; i < n; i++) {
arr.push(i);
};
console.timeEnd('Array');
console.time('Set');
for (let i = 0; i < n; i++) {
set.add(i);
};
console.timeEnd('Set');
I ran the following code to analyze the speed of locating an element in the array and in the Set, and found that the Set is 8-10 times faster than the array.
You can copy-paste this code into your browser console to analyze it further.
var arr = [], set = new Set(), n = 100000;
for (let i = 0; i < n; i++) {
arr.push(i);
set.add(i);
}
var result;
console.time('Array');
result = arr.indexOf(12313) !== -1;
console.timeEnd('Array');
console.time('Set');
result = set.has(12313);
console.timeEnd('Set');
So for your case array.some is better!
I will offer a different upside for using a Set: your code is now more semantic, and it is easier to see what it does.
Other than that, this post has a nice comparison - Javascript Set vs. Array performance - but make your own measurements if you really feel that this is your bottleneck. Don't optimise things that are not your bottleneck!
My own heuristic is to use an isPresent-like utility for nicer code, but if the check is done in a loop I always construct a Set beforehand.
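The "construct a Set before the loop" heuristic can be sketched as follows. The sample data and the names idSet and idsToCheck are assumptions for illustration; only the id field comes from the question.

```javascript
// Hypothetical sample data; the `id` field matches the question's objects.
const myArray = [{ id: 1 }, { id: 7 }, { id: 42 }];

// Build the Set of ids once, outside the loop (one O(n) pass).
const idSet = new Set(myArray.map((arrayElement) => arrayElement.id));

// Each membership check inside the loop is then a single Set lookup.
const idsToCheck = [7, 8, 42];
const found = idsToCheck.filter((id) => idSet.has(id));

console.log(found); // [7, 42]
```

The construction cost is paid once and amortized over all lookups, so this only pays off when the same array is searched repeatedly without changing in between.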

RangeError: too many arguments provided for a function call

I got a nice solution to get HTML Comments from the HTML Node Tree
var findComments = function(el) {
var arr = [];
for (var i = 0; i < el.childNodes.length; i++) {
var node = el.childNodes[i];
if (node.nodeType === 8) {
arr.push(node);
} else {
arr.push.apply(arr, findComments(node));
}
}
return arr;
};
var commentNodes = findComments(document);
// whatever you were going to do with the comment...
console.log(commentNodes[0].nodeValue);
from this thread.
All I did was add this small loop to print out all the nodes.
var arr = [];
var findComments = function(el) {
for (var i = 0; i < el.childNodes.length; i++) {
var node = el.childNodes[i];
if (node.nodeType === 8) {
arr.push(node);
} else {
arr.push.apply(arr, findComments(node));
}
}
return arr;
};
var commentNodes = findComments(document);
//I added this
for (var counter = arr.length; counter > 0; counter--) {
console.log(commentNodes[counter].nodeValue);
}
I keep getting this Error Message:
RangeError: too many arguments provided for a function call debugger
eval code:9:13
EDIT: I had a typo while pasting; I changed the code from i-- to counter--.
See this note in the MDN docs about using apply to merge arrays:
Do not use this method if the second array (moreVegs in the example) is very large, because the maximum number of parameters that one function can take is limited in practice. See apply() for more details.
The other note, from the apply page:
But beware: in using apply this way, you run the risk of exceeding the JavaScript engine's argument length limit. The consequences of applying a function with too many arguments (think more than tens of thousands of arguments) vary across engines (JavaScriptCore has hard-coded argument limit of 65536), because the limit (indeed even the nature of any excessively-large-stack behavior) is unspecified. Some engines will throw an exception. More perniciously, others will arbitrarily limit the number of arguments actually passed to the applied function. To illustrate this latter case: if such an engine had a limit of four arguments (actual limits are of course significantly higher), it would be as if the arguments 5, 6, 2, 3 had been passed to apply in the examples above, rather than the full array.
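One way to stay clear of the argument limit is to append items without spreading them into a single call. A minimal sketch, assuming a made-up helper name pushAll:

```javascript
// Hypothetical helper: appends every item of `source` to `target`
// one at a time, so no call ever receives a huge argument list.
function pushAll(target, source) {
  // Read the length once, so the loop still ends if source === target.
  var count = source.length;
  for (var i = 0; i < count; i++) {
    target.push(source[i]);
  }
  return target;
}

var big = new Array(200000).fill(0);
var merged = pushAll([1, 2, 3], big);
console.log(merged.length); // 200003
```

A plain loop makes one push call per item, so the size of the second array no longer matters.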
As arrays start at index 0, the last item in the array is actually at arr.length - 1.
You can fix it with:
for (var counter = arr.length - 1; counter >= 0; counter--)
Notice that I've used arr.length - 1 and counter >= 0, as zero is the first index of the array.
Adding the for loop is not the only thing you changed (and see the other answer about fixing that loop too). You also moved the declaration of arr from inside the function to outside, making arr relatively global.
Because of that, each recursive call to findComments() works on the same array, and the .apply() call pushes the entire contents back onto the end of the array every time. After a while, its length exceeds the limit of the runtime.
The original function posted at the top of your question has arr declared inside the function. Each recursive call therefore has its own local array to work with. In a document with a lot of comment nodes, it could still hit that RangeError, however.
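Both problems (the shared array being re-appended to itself and the reliance on apply) go away if one result array is created once and handed down through the recursion. This is a sketch under assumptions: collectComments and fakeDocument are made-up names, and the plain-object tree only stands in for real DOM nodes so the code runs outside a browser.

```javascript
// Hypothetical variant of findComments: the result array is created once
// and passed down, so no recursive call needs push.apply.
function collectComments(el, found) {
  found = found || [];
  for (var i = 0; i < el.childNodes.length; i++) {
    var node = el.childNodes[i];
    if (node.nodeType === 8) { // 8 is the nodeType of comment nodes
      found.push(node);
    } else {
      collectComments(node, found);
    }
  }
  return found;
}

// Stand-in tree with the same properties the function reads from DOM nodes.
var fakeDocument = {
  nodeType: 9,
  childNodes: [
    { nodeType: 8, nodeValue: "first", childNodes: [] },
    { nodeType: 1, childNodes: [
      { nodeType: 8, nodeValue: "second", childNodes: [] }
    ] }
  ]
};

var comments = collectComments(fakeDocument);
for (var counter = comments.length - 1; counter >= 0; counter--) {
  console.log(comments[counter].nodeValue);
}
```

In a browser, collectComments(document) would be the equivalent call.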

V8 doesn't optimize function after 'manually' doing typed array .set()s

I have the following function (I'm posting it entirely because all the code parts might be relevant):
function buildUploadBuffers(cmds, len, vertexUploadBuffer, matrixUploadBuffer)
{
var vertexOffset = 0;
var matrixOffset = 0;
var quadLen = 24; //96 bytes for each quads, /4 since we're stepping 4 bytes at a time
var matLen = 16; //64/4, 4 rows of 4x floats with 4 bytes each, again stepping 4 bytes at a time
for (var i = 0; i < len; ++i)
{
var cmd = cmds[i];
var cmdQuads = cmd._numQuads;
var source = cmd._quadU32View;
var slen = cmdQuads * quadLen;
vertexUploadBuffer.set(source,vertexOffset);
vertexOffset += slen;
var mat = cmd._stackMatrixMat;
for(var j=0;j<cmdQuads * 4;++j)
{
matrixUploadBuffer.set(mat, matrixOffset);
matrixOffset += matLen;
}
}
}
It retrieves some typed arrays from each cmd in the cmds array and uses them to set values in typed-array upload buffers.
This function is optimized fine. However, 'len' here is quite large and the data copied from each source typed array is quite small, and I have tested and profiled in the past that manually writing out the "set()"s can be significantly faster than relying on the compiler to optimize them correctly. Furthermore, you can sometimes merge computations (such as in the second loop here, where I copy the same thing 4 times to different places; for simplicity this is omitted in the following code, since it doesn't change the results).
Doing this with the function above turns it into this:
function buildUploadBuffers(cmds, len, vertexUploadBuffer, matrixUploadBuffer)
{
var vertexOffset = 0;
var matrixOffset = 0;
var quadLen = 24; //96/4 since we're stepping 4 bytes at a time
var matLen = 16; //64/4
for (var i = 0; i < len; ++i)
{
var cmd = cmds[i];
var cmdQuads = cmd._numQuads;
var source = cmd._quadU32View;
var slen = cmdQuads * quadLen;
for(var j=0;j<slen; ++j)
{
vertexUploadBuffer[vertexOffset + j] = source[j];
}
vertexOffset += slen;
var mat = cmd._stackMatrixMat;
for(var j=0;j<cmdQuads * 4;++j)
{
for(var k=0;k<matLen; ++k)
{
matrixUploadBuffer[matrixOffset + k] = mat[k];
}
matrixOffset += matLen;
}
}
}
However, this second function is not optimized ("optimized too many times") despite doing essentially the same thing.
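That the two versions produce the same buffer contents can be checked on a small scale. A minimal sketch with made-up sample data and helper names (sources, copyWithSet, copyManually); it says nothing about how V8 optimizes either form:

```javascript
// Hypothetical sample data: three small source views copied back to back.
var sources = [
  new Uint32Array([1, 2, 3]),
  new Uint32Array([4, 5]),
  new Uint32Array([6, 7, 8, 9])
];

// Copy using TypedArray.prototype.set.
function copyWithSet(views, target) {
  var offset = 0;
  for (var i = 0; i < views.length; ++i) {
    target.set(views[i], offset);
    offset += views[i].length;
  }
  return target;
}

// Copy element by element, as in the hand-written version.
function copyManually(views, target) {
  var offset = 0;
  for (var i = 0; i < views.length; ++i) {
    var source = views[i];
    for (var j = 0; j < source.length; ++j) {
      target[offset + j] = source[j];
    }
    offset += source.length;
  }
  return target;
}

var viaSet = copyWithSet(sources, new Uint32Array(9));
var viaLoop = copyManually(sources, new Uint32Array(9));
console.log(Array.from(viaSet).join(",") === Array.from(viaLoop).join(",")); // true
```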
Running v8 with deopt traces produces the following suspicious statements (these are repeated several times in the output, until finally the compiler says no thanks and stops optimizing):
[compiling method 0000015F320E2B59 JS Function buildUploadBuffers
(SharedFunctionInfo 0000002ACC62A661) using Crankshaft OSR]
[optimizing 0000015F320E2B59 JS Function buildUploadBuffers
(SharedFunctionInfo 0000002ACC62A661) - took 0.070, 0.385, 0.093 ms]
[deoptimizing (DEOPT eager): begin 0000015F320E2B59 JS Function
buildUploadBuffers (SharedFunctionInfo 0000002ACC62A661) (opt #724)
#28, FP to SP delta: 280, caller sp: 0x2ea1efcb50]
;;; deoptimize at 4437: Unknown map in polymorphic access
So it seems that the function gets deoptimized because of polymorphic access somewhere. Needless to say, the types contained in cmds are not always the same. They can be one of two concrete types that share the same prototype one step up the chain ('base class') where all the queried attributes come from (numQuads, quadU32View etc.).
Further questions and observations:
Why would it not just fail optimization with the first function that just uses .set() then? I'm accessing the same properties on the same objects. I'd think polymorphic access would break it in either case.
Type info of the function seems to be fine? When optimizing it the debug output says
ICs with typeinfo: 23/23 (100%), generic ICs: 0/23 (0%)]
Assuming there's nothing weird going on and the fact that the cmds can be one of two different types is indeed the culprit, how can I help out the optimizer here? The data in those cmds that I need from them is always the same, so there should be some way to package it up better for the optimizer, right? Maybe put a "quadData" object inside each cmd that contains numQuads, quadU32View, etc.? (just stabbing in the dark here)
Something that's very weird: Commenting out either of the two inner loops (or both at the same time of course) leads to the function getting optimized again. Is the function getting too long for the optimizer or something?
Because of the above point, I figured something might be weird with the (j) loop variable, so I tried using different ones for the different loops, which didn't change anything.
edit: Sure enough, the function optimizes again after I (e.g.) take out the second inner loop (uploading the matrix) and put it into a separate function. Interestingly enough this separate function is then seemingly inlined perfectly and I got the performance improvement I hoped for. Still makes me wonder what's going on here that prevents optimization. Just for completeness here's the thing that now optimizes well (and performs better, by about 25%):
function uploadMatrix(matrixUploadBuffer, mat, matLen, numVertices, matrixOffset)
{
for(var j=0;j<numVertices;++j)
{
for(var k=0;k<matLen; ++k)
{
matrixUploadBuffer[matrixOffset + k] = mat[k];
}
matrixOffset += matLen;
}
return matrixOffset;
}
function buildUploadBuffers(cmds, len, vertexUploadBuffer, matrixUploadBuffer)
{
var vertexOffset = 0;
var matrixOffset = 0;
var quadLen = 24; //96/4 since we're stepping 4 bytes at a time
var matLen = 16; //64/4
for (var i = 0; i < len; ++i)
{
var cmd = cmds[i];
var cmdQuads = cmd._numQuads;
var source = cmd._quadU32View;
var slen = cmdQuads * quadLen;
for(var j=0;j<slen; ++j)
{
vertexUploadBuffer[vertexOffset + j] = source[j];
}
vertexOffset += slen;
var mat = cmd._stackMatrixMat;
matrixOffset = uploadMatrix(matrixUploadBuffer,mat, matLen, cmdQuads *4, matrixOffset);
}
}
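The "quadData" idea floated in the question could look roughly like the sketch below. Everything here is an assumption for illustration: makeQuadData, SpriteCmd, TextCmd and uploadVertices are made-up names, and it is not verified that this layout avoids the deoptimization. The intent is only that the hot loop reads from objects that were all created the same way, regardless of the concrete command type.

```javascript
// Hypothetical sketch of the "quadData" idea; all names are made up,
// and it is not verified that this avoids the deoptimization.
function makeQuadData(numQuads, quadU32View, stackMatrixMat) {
  // Always the same properties in the same order, whatever the command type.
  return {
    numQuads: numQuads,
    quadU32View: quadU32View,
    stackMatrixMat: stackMatrixMat
  };
}

// Two different concrete command types that both carry a quadData object.
function SpriteCmd(quadData) { this.kind = "sprite"; this.quadData = quadData; }
function TextCmd(quadData) { this.text = ""; this.size = 12; this.quadData = quadData; }

// The copy loop only touches quadData objects, never the commands themselves.
function uploadVertices(quadDatas, vertexUploadBuffer) {
  var vertexOffset = 0;
  var quadLen = 24; // 96 bytes per quad, stepping 4 bytes at a time
  for (var i = 0; i < quadDatas.length; ++i) {
    var data = quadDatas[i];
    var slen = data.numQuads * quadLen;
    for (var j = 0; j < slen; ++j) {
      vertexUploadBuffer[vertexOffset + j] = data.quadU32View[j];
    }
    vertexOffset += slen;
  }
  return vertexOffset;
}

var cmds = [
  new SpriteCmd(makeQuadData(1, new Uint32Array(24).fill(1), new Float32Array(16))),
  new TextCmd(makeQuadData(2, new Uint32Array(48).fill(2), new Float32Array(16)))
];
var quadDatas = cmds.map(function (cmd) { return cmd.quadData; });
var buffer = new Uint32Array(72);
console.log(uploadVertices(quadDatas, buffer)); // 72
```

Whether this helps would have to be confirmed with the same deopt traces used above.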

Optimizing a for loop in JS (push inside)

I have the following loop. The length is around 1500 points, but this snippet might get called multiple times on a page load (6-7 times).
buffer[xname] = [xname];
buffer[yname] = [yname];
for (var i = 0; i < rawdata.length; i++) {
buffer[xname].push( rawdata[i][0] );
buffer[yname].push( rawdata[i][1] );
}
I need to do this operation in the browser (it is used to condition the data before plotting them).
Currently this makes the browser very slow.
I tried to use a setTimeout() to ease the event loop a bit. That works but it takes seconds.
Is there any way to make this loop faster? Maybe some sort of mapping?
You can halve the number of loop iterations by filling the arrays from both ends at once:
buffer[xname] = [xname];
buffer[yname] = [yname];
var dataLength = rawdata.length;
for (var i = 0; i < dataLength / 2; i++) {
  var j = dataLength - i - 1; // mirrored index, counted from the end
  // The +1 keeps the series name that is stored at index 0.
  buffer[xname][i + 1] = rawdata[i][0];
  buffer[yname][i + 1] = rawdata[i][1];
  buffer[xname][j + 1] = rawdata[j][0];
  buffer[yname][j + 1] = rawdata[j][1];
}
Each iteration now does twice the work, so I am not sure whether switching from push to direct assignment makes enough of a difference to change the overall execution time.
Thanks to @royhowie:
Why is array.push sometimes faster than array[n] = value?
If you have control over the source of rawdata, you might want to consider changing it so it can be used without additional processing.
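If the data has to be reshaped in the browser after all, a single pass that caches each row and writes into preallocated arrays is one option. This is a sketch with a made-up helper name (splitColumns); whether it is measurably faster than push depends on the engine and should be profiled.

```javascript
// Hypothetical helper: splits [x, y] pairs into two named columns in one pass.
function splitColumns(rawdata, xname, yname) {
  var n = rawdata.length;
  // Index 0 holds the series name, as in the original snippet.
  var xs = new Array(n + 1);
  var ys = new Array(n + 1);
  xs[0] = xname;
  ys[0] = yname;
  for (var i = 0; i < n; i++) {
    var point = rawdata[i]; // cache the row instead of indexing rawdata twice
    xs[i + 1] = point[0];
    ys[i + 1] = point[1];
  }
  return { x: xs, y: ys };
}

var columns = splitColumns([[1, 10], [2, 20], [3, 30]], "time", "value");
console.log(columns.x); // ["time", 1, 2, 3]
console.log(columns.y); // ["value", 10, 20, 30]
```

The results can then be stored as buffer[xname] = columns.x and buffer[yname] = columns.y.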

use Array.indexOf for relatively big arrays

Is there a way to achieve indexOf functionality, to find out if a string is in an array, relatively fast for very big arrays? When my array grows beyond 40,000 values, my app freezes for a few seconds.
Consider the following code:
var arr = [];
function makeWord()
{
var text = "";
var possible = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
for( var i=0; i < 5; i++ )
text += possible.charAt(Math.floor(Math.random() * possible.length));
return text;
}
function populateArr(){
for (var i=0;i<40000;i++){
arr[i] = makeWord();
}
console.log("finished populateArr");
}
function checkAgainst(){
for (var i=0;i<40000;i++){
var wordToSearch = makeWord();
if (isFound(wordToSearch)){
console.log("found "+wordToSearch);
}
}
console.log("finished checkAgainst");
}
function isFound(wordToSearch){
//return $.inArray(wordToSearch,arr) > -1;
return arr.indexOf(wordToSearch) > -1;
}
populateArr();
checkAgainst();
FIDDLE here
In this code I'm populating an array arr with 40k random strings. Then, in checkAgainst, I'm creating 40,000 other random strings, and each one is checked to see if it is found in arr. This makes Chrome freeze for about 2 seconds. Opening the profiler in Chrome DevTools, I see that isFound is obviously expensive in terms of CPU. Even if I lower the number of loop iterations in checkAgainst to just 4000, it still freezes for about a second or so.
In reality, I have a Chrome extension and an array of keywords that grows to about 10k strings. Then I have to use Array.indexOf to see if chunks of 200 other keywords are in that array. This makes my page freeze every once in a while, and from this example I suspect this is the cause. Ideas?
Try using keys in an object instead:
var arr = {};
function makeWord() // unchanged
function populateArr(){
for (var i=0;i<40000;i++){
arr[makeWord()] = true;
}
console.log("finished populateArr");
}
function checkAgainst() // unchanged
function isFound(wordToSearch){
return arr[wordToSearch] === true; // always a boolean; ignores inherited keys such as "constructor"
}
populateArr();
checkAgainst();
If you then need the array of words, you can use Object.keys(arr)
Alternatively, combine the two: have an array and an object. Use the object to look up if a word is in the array, or the array to get the words themselves. This would be a classic compromise, trading memory usage for time.
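The combined structure can be sketched as a small wrapper. Note the swap: this version uses a Set for the lookup side instead of the plain object shown above, which avoids collisions with inherited keys such as "constructor". The name createWordList and its methods are made up for illustration.

```javascript
// Hypothetical wrapper: keeps the words in an array (for ordering and
// iteration) and in a Set (for fast membership checks).
function createWordList() {
  var words = [];
  var lookup = new Set();
  return {
    add: function (word) {
      if (!lookup.has(word)) {
        lookup.add(word);
        words.push(word);
      }
    },
    has: function (word) {
      return lookup.has(word);
    },
    toArray: function () {
      return words.slice(); // copy, so callers cannot desync the two structures
    }
  };
}

var list = createWordList();
list.add("apple");
list.add("pear");
list.add("apple"); // duplicate, ignored
console.log(list.has("pear")); // true
console.log(list.has("plum")); // false
console.log(list.toArray());   // ["apple", "pear"]
```

Every word is stored twice, which is the memory-for-time trade mentioned above.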
