Chrome for loop optimization - javascript

So I was curious what would be faster for iterating through an array, the normal for loop or forEach so I executed this code in the console:
var arr = [];
arr.length = 10000000;
//arr.fill(1);
for (var i_1 = 0; i_1 < arr.length; i_1++) {
    arr[i_1] = 1;
}
//////////////////////////////////
var t = new Date();
var sum = 0;
for (var i = 0; i < arr.length; i++) {
    var a = arr[i];
    if (a & 1) {
        sum += a;
    } else {
        sum -= a;
    }
}
console.log(new Date().getTime() - t.getTime());
console.log(sum);

t = new Date();
sum = 0;
arr.forEach(function (value, i, array) {
    var a = value;
    if (a & 1) {
        sum += a;
    } else {
        sum -= a;
    }
});
console.log(new Date().getTime() - t.getTime());
console.log(sum);
Now the results in Chrome are 49ms for the for loop and 376ms for the forEach loop, which seems fine. But the results in Firefox and IE (and Edge) are a lot different: in both other browsers the first loop takes ~15 seconds (yes, seconds) while the forEach takes "only" ~4 seconds.
My question is: can someone tell me the exact reason Chrome is so much faster?
I tried all kinds of operations inside the loops; the results were always in Chrome's favor by a mile.

Disclaimer: I do not know the specifics of V8 in Chrome or of the engines in Firefox / Edge, but there are some very general insights. Since V8 compiles JavaScript to native code, let's see what it potentially could do:
Very crudely: a variable like your var i can be modelled as a fully general JavaScript variable that can hold any type of value, from numbers to objects (say, a pointer to a struct Variable), or the compiler can deduce the actual type (say, an int in C++) from your JS and compile it as exactly that. The latter uses less memory, exploits caching, involves less indirection, and can potentially be as fast as a for loop in C++. V8 probably does this.
The above holds for your array as well: maybe it compiles to a memory-efficient array of ints stored contiguously in memory; maybe it is an array of pointers to general objects.
Temporary variables can be removed.
The second loop could be optimized by inlining the function call; maybe this is done, maybe it isn't.
The point being: all JS interpreters / compilers can potentially exploit these optimizations. Whether they do depends on many factors: the trade-off between compilation and execution time, the way the JS is written, and so on.
V8 seems to optimize a lot here; Firefox / Edge maybe don't in this example. Knowing precisely why requires in-depth understanding of each interpreter / compiler.
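A rough way to poke at the "array of ints vs. array of pointers to general objects" point from JS is a sketch like the one below. It is an assumption about V8 internals (its "elements kinds") that assigning a non-number forces a more generic representation that the array never recovers from, so treat the timings as illustrative only:

```javascript
// Same summation over an all-integer array vs. one that once held an object,
// which (in V8, as an assumption) transitions it to a more generic elements kind.
function sumOdd(arr) {
  var sum = 0;
  for (var i = 0; i < arr.length; i++) {
    var a = arr[i];
    if (a & 1) sum += a; else sum -= a;
  }
  return sum;
}

var N = 1000000;
var packed = [];
for (var i = 0; i < N; i++) packed[i] = 1;

var mixed = [];
mixed[0] = {}; // may transition the backing store to a generic representation
for (var i = 0; i < N; i++) mixed[i] = 1;

var t0 = Date.now();
var s1 = sumOdd(packed);
var t1 = Date.now();
var s2 = sumOdd(mixed);
var t2 = Date.now();
console.log('packed:', t1 - t0, 'ms; mixed:', t2 - t1, 'ms');
```

Both arrays end up holding the same million 1s, so the sums agree; only the internal representation (and, possibly, the timing) differs.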

The for loop is the fastest of the iteration constructs in every browser, but comparing browsers, IE is the slowest at iterating for loops. My best recommendation is to try jsperf.com for this kind of optimization work. The V8 engine implementation is the reason: after Chrome split from WebKit, it stripped out more than 10k lines of code in the first few days.

Related

Does putting intermediary math results into variables improve performance in javascript?

I have been playing around with the Riemann zeta function.
I want to optimize execution time as much as possible here, so I put the intermediary results in temporary variables. But testing revealed that I get no performance boost from this, at least not noticeably.
function zeta(z, limit) {
    var zres = new Complex(0, 0);
    for (var x = 1; x <= limit; x++) {
        var ii = z.imaginary * Math.log(1 / x);
        var pp = Math.pow(1 / x, z.real);
        zres.real += pp * Math.cos(ii);
        zres.imaginary += pp * Math.sin(ii);
    }
    return zres;
}
My question is: Even though I couldn't measure a difference in execution time, what's theoretically faster? Calculating ii and pp once and handing them over as variables, or calculating them twice and not wasting time with the declaration?
Putting things in (local) variables on its own will usually not have a major effect on performance. If anything it could increase the pressure on the register allocator (or equivalent) and slightly reduce performance.
Avoiding calculating expressions multiple times by putting the result into a local variable can improve performance if the just-in-time compiler (or runtime) isn't smart enough to do the equivalent optimization (i.e. compute the value only once and re-use the computation result each times the expression is used).
There's really no universally applicable rule here. You need to benchmark and optimize on the specific system you want best performance on.
Instantiating a variable is surely faster than a math operation (like Math.log or Math.pow), so it is better to instantiate them. If you want to keep the loop's local scope from wasting a very little extra time on variable initialization and collection, you can declare pp and ii outside the loop. This time is negligible compared to all the other operations.
function zeta(z, limit) {
    var zres = new Complex(0, 0);
    var ii, pp;
    for (var x = 1; x <= limit; x++) {
        ii = z.imaginary * Math.log(1 / x);
        pp = Math.pow(1 / x, z.real);
        zres.real += pp * Math.cos(ii);
        zres.imaginary += pp * Math.sin(ii);
    }
    return zres;
}
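To make the function runnable on its own, here is a minimal stand-in for the Complex class (an assumption; the question does not show its definition), plus a sanity check: ζ(2) should approach π²/6 ≈ 1.6449 as the limit grows.

```javascript
// Minimal stand-in for the Complex class used by zeta() (hypothetical; the
// question does not show the real definition).
function Complex(real, imaginary) {
  this.real = real;
  this.imaginary = imaginary;
}

function zeta(z, limit) {
  var zres = new Complex(0, 0);
  for (var x = 1; x <= limit; x++) {
    var ii = z.imaginary * Math.log(1 / x);
    var pp = Math.pow(1 / x, z.real);
    zres.real += pp * Math.cos(ii);
    zres.imaginary += pp * Math.sin(ii);
  }
  return zres;
}

// Partial sum of 1/x^2 up to 100000 is within ~1e-5 of pi^2/6.
var z2 = zeta(new Complex(2, 0), 100000);
console.log(z2.real);
```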

Why does adding in an immediately invoked lambda make my JavaScript code 2x faster?

I'm optimizing a compiler that targets JavaScript, and found a very interesting, if frustrating, case:
function add(n, m) {
    return n === 0 ? m : add(n - 1, m) + 1;
}
var s = 0;
for (var i = 0; i < 100000; ++i) {
    s += add(4000, 4000);
}
console.log(s);
It takes 2.3s to complete on my machine[1]. But if I make a very small change:
function add(n, m) {
    return (() => n === 0 ? m : add(n - 1, m) + 1)();
}
var s = 0;
for (var i = 0; i < 100000; ++i) {
    s += add(4000, 4000);
}
console.log(s);
It completes in 1.1s. Notice the only difference is the addition of an immediately invoked lambda, (() => ...)(), around the return of add. Why does this added call make my program two times faster?
[1] MacBook Pro 13" 2020, 2.3 GHz Quad-Core Intel Core i7, Node.js v15.3.0
Interesting! From looking at the code, it seems fairly obvious that the IIFE-wrapped version should be slower, not faster: in every loop iteration, it creates a new function object and calls it (which the optimizing compiler will eventually avoid, but that doesn't kick in right away), so generally just does more work, which should be taking more time.
The explanation in this case is inlining.
A bit of background: inlining one function into another (instead of calling it) is one of the standard tricks that optimizing compilers perform in order to achieve better performance. It's a double-edged sword though: on the plus side, it avoids calling overhead, and can often enable further optimizations, such as constant propagation, or elimination of duplicate computation (see below for an example). On the negative side, it causes compilation to take longer (because the compiler does more work), and it causes more code to be generated and stored in memory (because inlining a function effectively duplicates it), and in a dynamic language like JavaScript where optimized code typically relies on guarded assumptions, it increases the risk of one of these assumptions turning out to be wrong and a large amount of optimized code having to be thrown away as a result.
Generally speaking, making perfect inlining decisions (not too much, not too little) requires predicting the future: knowing in advance how often and with which parameters the code will be executed. That is, of course, impossible, so optimizing compilers use various rules/"heuristics" to make guesses about what might be a reasonably good decision.
One rule that V8 currently has is: don't inline recursive calls.
That's why in the simpler version of your code, add will not get inlined into itself. The IIFE version essentially has two functions calling each other, which is called "mutual recursion" -- and as it turns out, this simple trick is enough to fool V8's optimizing compiler and make it sidestep its "don't inline recursive calls" rule. Instead, it happily inlines the unnamed lambda into add, and add into the unnamed lambda, and so on, until its inlining budget runs out after ~30 rounds. (Side note: "how much gets inlined" is one of the somewhat-complex heuristics and in particular takes function size into account, so whatever specific behavior we see here is indeed specific to this situation.)
In this particular scenario, where the involved functions are very small, inlining helps quite a bit because it avoids call overhead. So in this case, inlining gives better performance, even though it is a (disguised) case of recursive inlining, which in general often is bad for performance. And it does come at a cost: in the simple version, the optimizing compiler spends only 3 milliseconds compiling add, producing 562 bytes of optimized code for it. In the IIFE version, the compiler spends 30 milliseconds and produces 4318 bytes of optimized code for add. That's one reason why it's not as simple as concluding "V8 should always inline more": time and battery consumption for compiling matters, and memory consumption matters too, and what might be acceptable cost (and improve performance significantly) in a simple 10-line demo may well have unacceptable cost (and potentially even cost overall performance) in a 100,000-line app.
Now, having understood what's going on, we can get back to the "IIFEs have overhead" intuition, and craft an even faster version:
function add(n, m) {
    return add_inner(n, m);
}
function add_inner(n, m) {
    return n === 0 ? m : add(n - 1, m) + 1;
}
On my machine, I'm seeing:
simple version: 1650 ms
IIFE version: 720 ms
add_inner version: 460 ms
Of course, if you implement add(n, m) simply as return n + m, then it terminates in 2 ms -- algorithmic optimization beats anything an optimizing compiler could possibly accomplish :-)
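A quick check that the recursive add really is just addition, so the direct n + m form is a safe algorithmic replacement: the recursion peels n down to 0 and adds 1 back on the way out, once per level.

```javascript
function add(n, m) {
  return n === 0 ? m : add(n - 1, m) + 1;
}

// add(n, m) === n + m for non-negative integer n.
console.log(add(4000, 4000)); // 8000
console.log(add(0, 7));       // 7
```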
Appendix: Example for benefits of optimization. Consider these two functions:
function Process(x) {
    return (x ** 2) + InternalDetail(x, 0, 2);
}
function InternalDetail(x, offset, power) {
    return (x + offset) ** power;
}
(Obviously, this is silly code; but let's assume it's a simplified version of something that makes sense in practice.)
When executed naively, the following steps happen:
evaluate temp1 = (x ** 2)
call InternalDetail with parameters x, 0, 2
evaluate temp2 = (x + 0)
evaluate temp3 = temp2 ** 2
return temp3 to the caller
evaluate temp4 = temp1 + temp3
return temp4.
If an optimizing compiler performs inlining, then as a first step it will get:
function Process_after_inlining(x) {
    return (x ** 2) + ((x + 0) ** 2);
}
which allows two simplifications: x + 0 can be folded to just x, and then the x ** 2 computation occurs twice, so the second occurrence can be replaced by reusing the result from the first:
function Process_with_optimizations(x) {
    let temp1 = x ** 2;
    return temp1 + temp1;
}
So comparing with the naive execution, we're down to 3 steps from 7:
evaluate temp1 = (x ** 2)
evaluate temp2 = temp1 + temp1
return temp2
I'm not predicting that real-world performance will go from 7 time units to 3 time units; this is just meant to give an intuitive idea of why inlining can help reduce computational load by some amount.
Footnote: to illustrate how tricky all this stuff is, consider that replacing x + 0 with just x is not always possible in JavaScript, even when the compiler knows that x is always a number: if x happens to be -0, then adding 0 to it changes it to +0, which may well be observable program behavior ;-)
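The -0 caveat from the footnote can be observed directly: adding 0 really does change -0 to +0, and Object.is can tell the two apart.

```javascript
var x = -0;
// IEEE 754: (-0) + (+0) rounds to +0, so the "useless" x + 0 is observable.
console.log(Object.is(x, -0));     // true
console.log(Object.is(x + 0, -0)); // false
console.log(Object.is(x + 0, 0));  // true
```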

Golang vs JavaScript (v8/node.js) map performance

Out of curiosity I wrote some trivial benchmarks comparing the performance of golang maps to JavaScript (v8/node.js) objects used as maps and am surprised at their relative performance. JavaScript objects appear to perform roughly twice as fast as go maps (even including some minor performance edges for go)!
Here is the go implementation:
// map.go
package main
import "fmt"
import "time"
func elapsedMillis(t0, t1 time.Time) float64 {
    n0, n1 := float64(t0.UnixNano()), float64(t1.UnixNano())
    return (n1 - n0) / 1e6
}

func main() {
    m := make(map[int]int, 1000000)
    t0 := time.Now()
    for i := 0; i < 1000000; i++ {
        m[i] = i     // Put.
        _ = m[i] + 1 // Get, use, discard.
    }
    t1 := time.Now()
    fmt.Printf("go: %fms\n", elapsedMillis(t0, t1))
}
And here is the JavaScript:
#!/usr/bin/env node
// map.js
function elapsedMillis(hrtime0, hrtime1) {
    var n0 = hrtime0[0] * 1e9 + hrtime0[1];
    var n1 = hrtime1[0] * 1e9 + hrtime1[1];
    return (n1 - n0) / 1e6;
}

var m = {};
var t0 = process.hrtime();
for (var i = 0; i < 1000000; i++) {
    m[i] = i;         // Put.
    var _ = m[i] + 1; // Get, use, discard.
}
var t1 = process.hrtime();
console.log('js: ' + elapsedMillis(t0, t1) + 'ms');
Note that the go implementation has a couple of minor potential performance edges in that:
Go maps integers to integers directly, whereas JavaScript will convert the integer keys to string property names.
Go makes its map with an initial capacity equal to the benchmark size, whereas JavaScript grows from its default capacity.
However, despite the potential performance benefits listed above the go map usage seems to perform at about half the rate of the JavaScript object map! For example (representative):
go: 128.318976ms
js: 48.18517ms
Am I doing something obviously wrong with go maps or somehow comparing apples to oranges?
I would have expected go maps to perform at least as well - if not better than JavaScript objects as maps. Is this just a sign of go's immaturity (1.4 on darwin/amd64) or does it represent some fundamental difference between the two language data structures that I'm missing?
[Update]
Note that if you explicitly use string keys (e.g. via s := strconv.Itoa(i) and var s = ''+i in Go and JavaScript, respectively) then their performance is roughly equivalent.
My guess is that the very high performance from v8 is related to a specific optimization in that runtime for objects whose keys are consecutive integers (e.g. by substituting an array implementation instead of a hashtable).
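One way to poke at that guess from the JavaScript side is to run the same loop with integer keys and with explicitly stringified keys; the elements-kind explanation is an assumption about V8 internals, so the timings are only suggestive:

```javascript
// Same put/get workload over a plain object, varying only how keys are made.
function bench(label, makeKey) {
  var m = {};
  var acc = 0;
  var t0 = Date.now();
  for (var i = 0; i < 1000000; i++) {
    var k = makeKey(i);
    m[k] = i;        // Put.
    acc += m[k] + 1; // Get, use, accumulate.
  }
  console.log(label + ': ' + (Date.now() - t0) + 'ms');
  return acc;
}

var a = bench('integer keys', function (i) { return i; });
var b = bench('string keys', function (i) { return 'k' + i; });
// Both variants do identical arithmetic, so the accumulators must agree.
console.log(a === b); // true
```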
I'm voting to close since there is likely nothing to see here...
Your benchmark is a bit synthetic, just like any benchmark. Just out of curiosity, try
for i := 0; i < 1000000; i += 9 {
in the Go implementation. You may be surprised.

Which way of storing and operating on bitfields in javascript is the fastest? (200k+ bits)

I am profiling my javascript code intended to be used on embedded browser on Android (PhoneGap).
Basically I need a very large bitfield (200k+ bits) for my calculations.
I've tried putting them into an array of unsigned integers, with each item storing 32 bits. This indeed reduced memory usage, but made execution drastically slow (over 30 seconds just to iterate over and reverse all bits in the bitfield, on a modern PC!).
Then I made a good old-fashioned array of bools. This increased memory usage (but it was still less than 15 MB on Android for the entire PhoneGap framework around my code). Profiling showed me that the initial step in my algorithm, setting all elements of the bitfield to 1 (a simple for loop), takes half of the execution time (~1.5 seconds on PC, more than a few minutes on Android). I can rewrite my code so the default value is 0 rather than 1 (reversing all the conditions), but I still don't know how to set such a large array to all 0s quickly.
Edit adding my code, as requested:
var count = 200000;
var myArr = [];
myArr.length = count;
for (var i = 0; i < count; i++) {
    myArr[i] = true;
}
Could someone show me how to clear a very large array quickly, or is there a faster way to store and operate on large bitfields in JavaScript?
See if this is a faster way to create the array:
var myArray = [true];
var desiredLength = 200000;
while (myArray.length < desiredLength) {
    myArray = myArray.concat(myArray);
}
if (myArray.length > desiredLength) {
    myArray.splice(desiredLength);
}
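A quick sanity check that the doubling approach produces the intended array: it doubles 1, 2, 4, ... past the target length, then trims the excess.

```javascript
var myArray = [true];
var desiredLength = 200000;
while (myArray.length < desiredLength) {
  myArray = myArray.concat(myArray); // double the array each pass
}
if (myArray.length > desiredLength) {
  myArray.splice(desiredLength);     // trim the overshoot (262144 -> 200000)
}
console.log(myArray.length); // 200000
console.log(myArray.every(function (v) { return v === true; })); // true
```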
I've added a few more test cases to the jsperf page that Asad linked in his comment. By far the fastest in my browser (Chrome 23.0.1271.101 on Mac OS X 10.8.2) is this one:
var count = 200000;
var myArr = [];
for (var i = 0; i < count; i++) {
    myArr.push(true);
}
Why pre-fill the array in the first place? Use undefined to your advantage: remember that undefined acts as a falsy value, so it will behave exactly like 0/false in a boolean check.
var myArray = new Array(200000);
if (myArray[1]) {
    // I am a truthy value
} else {
    // I am a falsy value
}
So when you initialize the array this way, there is no reason to prefill! That means no extra processing, and you take advantage of the sparse array!
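If the bits really must all start as 1, another option (a sketch of my own, not from the answers above, and it relies on typed-array support with TypedArray.prototype.fill, which only newer engines have) is to pack the bitfield into a Uint8Array; setting or clearing everything is then a single native fill over 25,000 bytes instead of 200,000 per-element writes:

```javascript
// Bitfield over a Uint8Array: bit i lives in byte (i >> 3), position (i & 7).
function Bitfield(nbits) {
  this.bytes = new Uint8Array((nbits + 7) >> 3);
}
Bitfield.prototype.setAll = function (on) {
  this.bytes.fill(on ? 0xFF : 0x00); // one native call, no per-bit loop
};
Bitfield.prototype.get = function (i) {
  return (this.bytes[i >> 3] >> (i & 7)) & 1;
};
Bitfield.prototype.set = function (i, v) {
  if (v) this.bytes[i >> 3] |= 1 << (i & 7);
  else this.bytes[i >> 3] &= ~(1 << (i & 7));
};

var bf = new Bitfield(200000);
bf.setAll(true);
bf.set(12345, false);
console.log(bf.get(12345)); // 0
console.log(bf.get(12346)); // 1
```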

In JavaScript, why is a "reverse while" loop an order of magnitude faster than "for"?

In these benchmarks, http://jsperf.com/the-loops, Barbara Cassani showed that a "reverse while" loop is way faster,
while (iterations > 0) {
    a = a + 1;
    a = a - 1;
    iterations--;
}
than a usual "for" loop:
for (i = 0; i < iterations; i++) {
    a = a + 1;
    a = a - 1;
}
Why?
Update
Okay, forget about it, there is a bug in the test: iterations = 100 is executed only once per page, so after the while loop decrements it to 0 we never really enter the loops again. Sorry.
Except for the big bug in the initial test, here are the results:
for vs while makes no difference
but > or < are better than !==
http://jsperf.com/the-loops/15
It is because of the specifics of the internals of each JavaScript engine. Don't rely on this for optimization, because you can't logically count on it always being faster as engines change. For example, check out the last revision of the test you linked and note that the difference is much smaller, if it exists at all, on recent browsers.

Categories

Resources