Improve performance of code coming out of eval - javascript

I heard "eval" function in javascript/node.js is evil, but it's needed in our application which takes a string sent to it by a sister program in certain format and evaluate it and record the result. Yes, we can trust the string it's going to be evaluated.
The problem is performance. The following code takes 552ms. However, if I replace the eval(...) with a plain function add2(a,b) { return a+b }, it takes only 12ms.
The question is: how do we improve the performance of code generated by eval? I'd appreciate any ideas.
eval('function add2(a,b) { return a+b}')
let start = Date.now();
let total = 0;
for (let i = 0; i < 10000000; i++) {
  total += add2(i, 1);
}
console.log(`took ${Date.now() - start} total=${total}`)
Update 1:
Thanks to CertainPerformance, who pointed out that the same code runs fast in the Chrome browser.
The above snippet (with eval but without the (() => { ... })() wrapper) took 75ms; with the wrapper it took 12ms.
The V8 JavaScript engine in my Chrome browser is version 10.3.174.14.
I installed Node.js v18.5.0, which uses V8 version 10.2.154.4-node.8; however, it's not much of an improvement over Node v10.15.1, which uses V8 version 6.8.275.32-node.12.
By the way, you can use the command node -p process.versions.v8 to find the V8 engine version for node.js
Since Node.js v18.5.0 is the latest release, I can't find a Node build with V8 engine 10.3.174.14, so I have to wait for the next release. I doubt the two V8 versions (10.3.174.14 vs 10.2.154.4-node.8) would account for such a huge performance difference (12ms vs ~500ms). My guess is that Node.js has some inefficiency of its own, especially when it comes to the "evil" function eval --- not always evil, but it needs to be used with precaution.

There are two issues here:
1. eval sometimes produces a slow function for some reason (fix: use new Function)
2. Changing variables at the top level is slow when done a whole lot (fix: declare frequently changing variables inside a block)
If you put total inside an IIFE, so that reassigning it doesn't change a top-level value, this micro-benchmark improves to the desired handful of milliseconds.
Below runs in 10-20ms on my machine:
eval('function add2(a,b) { return a+b}')
let start = Date.now();
(() => {
  let total = 0;
  for (let i = 0; i < 10000000; i++) {
    total += add2(i, 1);
  }
  console.log(`took ${Date.now() - start} total=${total}`)
})();
For some reason, the evaled function appears to be slow if created inside a block. Below runs in 1000+ms on my machine:
(() => {
  eval('function add2(a,b) { return a+b}')
  let start = Date.now();
  let total = 0;
  for (let i = 0; i < 10000000; i++) {
    total += add2(i, 1);
  }
  console.log(`took ${Date.now() - start} total=${total}`)
})();
Using new Function instead creates a dramatic improvement (below runs in 10-20ms on my machine):
(() => {
  const add2 = new Function('a', 'b', 'return a + b');
  let start = Date.now();
  let total = 0;
  for (let i = 0; i < 10000000; i++) {
    total += add2(i, 1);
  }
  console.log(`took ${Date.now() - start} total=${total}`)
})();
So, consider changing your serializing algorithm to pass values such that the function can be re-created with new Function instead of with eval.
If you can't change the source that's sending the function string, and the parameters aren't too complicated, you could use a regular expression to parse the function string into its parameters and body, and then pass that to new Function.
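As a rough sketch, that parsing approach could look like the following. The regular expression here is an assumption: it only handles simple function declarations with plain (non-destructured, non-defaulted) parameter lists.
function compileFunctionString(source) {
  // Capture the parameter list and the body of a declaration like
  // "function add2(a,b) { return a+b }".
  const match = source.match(/^function\s*\w*\s*\(([^)]*)\)\s*\{([\s\S]*)\}\s*$/);
  if (!match) throw new Error('Unrecognized function string');
  const params = match[1].split(',').map(p => p.trim()).filter(Boolean);
  return new Function(...params, match[2]);
}

const add2 = compileFunctionString('function add2(a,b) { return a+b }');
console.log(add2(2, 3)); // 5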
Another example of the phenomenon, without dynamic functions in the mix: see how the IIFE version below runs significantly faster.
const start = Date.now();
let i = 0;
while (i < 5e8) {
  i = i + 1;
}
console.log(Date.now() - start);
(() => {
  const start = Date.now();
  let i = 0;
  while (i < 5e8) {
    i = i + 1;
  }
  console.log(Date.now() - start);
})();
That said, micro-benchmarks like these are generally a very poor indicator of whether there will actually be a problem when your code runs in the real world. Don't take them to mean much.
I heard "eval" function in javascript/node.js is evil
It has its place. Though, here, new Function would be more appropriate (and is a lot faster in certain circumstances). The general technique of serializing a function to a string, to be parsed back into an actual function in a different environment, isn't unheard of - it's the same technique that browser-automation tools like Puppeteer make extensive use of.

Related

Attribute vs constant access in javascript

I'm working on high-performance web components, and I wonder whether it's worth assigning an object attribute's value to a constant before accessing it multiple times.
I mean turning this:
let counter = 0;
for (let i = 0, len = parentObject.myObject.items.length; i < len; i++) {
  // items is an array of integers
  counter += parentObject.myObject.items[i];
}
Into this:
let counter = 0;
const { myObject } = parentObject;
const { items } = myObject;
for (let i = 0, len = items.length; i < len; i++) {
  counter += items[i];
}
In Python this change would have a noticeable impact on performance. However, the tests I have made (code at https://gist.github.com/Edorka/fbfb0778c859d8f518f0508414d3e6a2) show no difference:
caseA total 124999750000
Execution time (hr): 0s 1.88101ms
caseB total 124999750000
Execution time (hr): 0s 1.117547ms
I wonder whether I'm doing my tests wrong or whether the VM has some optimization for this case that I'm not aware of.
UPDATE: Following @George Jempty's suggestion, I made a quick adaptation on jsPerf at https://jsperf.com/attribute-vs-constants, but the results keep being quite erratic.
Nested property access is one of the most frequently executed operations in JavaScript. You can expect it to be heavily optimized.
Indeed, the V8 engine caches object properties at run time, so the performance benefit of caching manually would be negligible.
Live demo on jsperf.com
Conclusion: don't worry about it!

Performance of array includes vs mapping to an Object and accessing it in JavaScript

According to the fundamentals of CS, searching an unsorted list has to occur in O(n) time, whereas direct access into a hash map occurs in O(1) time.
So is it more performant to map the array into a dictionary and then access elements directly, or should I just use includes? This question is specifically about JavaScript, because I believe it comes down to core implementation details of how includes() and {} are implemented.
let y = [1,2,3,4,5]
y.includes(3)
or...
let y = {
  1: true,
  2: true,
  3: true,
  4: true,
  5: true
}
5 in y
It's true that object lookup occurs in constant time - O(1) - so using object properties instead of an array is one option, but if you're just trying to check whether a value is included in a collection, it would be more appropriate to use a Set, which is a (generally unordered) collection of values that can also be looked up in constant time. (Using a plain object instead would require you to have values in addition to your keys, which you don't care about - so use a Set instead.)
const set = new Set(['foo', 'bar']);
console.log(set.has('foo')); // true
console.log(set.has('baz')); // false
This is useful when you have to look up multiple values against the same Set. But building the Set (just like adding properties to an object) is O(N), so if you're only going to look up a single value once, neither this nor the object technique provides any benefit, and you may as well just use an array includes test.
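To illustrate the tradeoff, here's a small sketch (with made-up values): build the Set once and reuse it for many O(1) lookups; for a one-off check, a plain includes is just as appropriate.
// Pay the O(n) construction cost once...
const allowed = new Set(['foo', 'bar', 'baz']);
// ...then each membership check is O(1), which pays off over many lookups.
const inputs = ['foo', 'qux', 'bar', 'quux'];
console.log(inputs.filter(value => allowed.has(value))); // ['foo', 'bar']
// For a single, one-off check, an array test is fine:
console.log(['foo', 'bar', 'baz'].includes('qux')); // false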
Updated 04/29/2020
As the commenter rightly pointed out, it would seem V8 was optimizing out the Array includes calls. An updated version that assigns the result to a var and uses it produces more expected results. In that case Object address is fastest, followed by Set has, and in a distant third is Array includes (on my system / browser).
All the same, I do stand by my original point, that if making micro-optimizations it is worth testing assumptions. Just make sure your tests are valid ;)
Original
Well. Despite the obvious expectation that Object address and Set has should outperform Array includes, benchmarks against Chrome indicate that implementation trumps expectation.
In the benches I ran against Chrome Array includes was far and away the best performer.
I also tested locally with Node and got more expected results: Object address wins, followed closely by Set has, while Array includes was marginally slower than both.
Bottom line is, if you're making micro-optimizations (not recommending that) it's worth benchmarking rather than assuming which might be best for your particular case. Ultimately it comes down to the implementation, as your question implies. So optimizing for the target platform is key.
Here's the results I got:
Node (12.6.0):
ops for Object address 7804199
ops for Array includes 5200197
ops for Set has 7178483
Chrome (75.0):
https://jsbench.me/myjyq4ixs1/1
This isn't necessarily a direct answer to the question, but here is a related performance test I ran quickly in my Chrome dev tools:
function getRandomInt(max) {
  return Math.floor(Math.random() * max);
}

var arr = [1,2,3];
var t = performance.now();
for (var i = 0; i < 100000; i++) {
  var x = arr.includes(getRandomInt(3));
}
console.log(performance.now() - t);

var t = performance.now();
for (var i = 0; i < 100000; i++) {
  var n = getRandomInt(3);
  var x = n == 1 || n == 2 || n == 3;
}
console.log(performance.now() - t);
VM44:9 9.100000001490116
VM44:16 5.699999995529652
I find the Array includes syntax nice to look at, so I wanted to know whether its performance was likely to be an issue the way I use it, for example for checking whether a variable is one of a set of enums. It doesn't seem to have much impact in situations like this, with a short list. Then I ran this:
function getRandomInt(max) {
  return Math.floor(Math.random() * max);
}

var t = performance.now();
for (var i = 0; i < 100000; i++) {
  var x = [1,2,3].includes(getRandomInt(3));
}
console.log(performance.now() - t);

var t = performance.now();
for (var i = 0; i < 100000; i++) {
  var n = getRandomInt(3);
  var x = n == 1 || n == 2 || n == 3;
}
console.log(performance.now() - t);
VM83:8 12.600000001490116
VM83:15 4.399999998509884
So the way I actually use it (and like looking at it) performs quite a bit worse, though it still isn't very significant unless it runs a few million times. Using it inside an Array.filter that may run a lot, such as a React-Redux selector, may therefore not be a great idea, which is exactly what I was about to do when I decided to test this.
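One way to keep the readable includes style without paying for a fresh array literal on every call (worth benchmarking for your own case; the selector below is hypothetical) is to hoist the array out of the hot path:
// Hypothetical redux-style selector: the enum array is created once at
// module scope rather than once per filter callback invocation.
const VALID_STATUSES = [1, 2, 3];
const selectValidItems = items =>
  items.filter(item => VALID_STATUSES.includes(item.status));

console.log(selectValidItems([{ status: 1 }, { status: 9 }, { status: 3 }]));
// [ { status: 1 }, { status: 3 } ]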

Why is using a generator function slower than filling and iterating an array in this example?

A Tale of Two Functions
I have one function that fills an array up to a specified value:
function getNumberArray(maxValue) {
  const a = [];
  for (let i = 0; i < maxValue; i++) {
    a.push(i);
  }
  return a;
}
And a similar generator function that instead yields each value:
function* getNumberGenerator(maxValue) {
  for (let i = 0; i < maxValue; i++) {
    yield i;
  }
}
Test Runner
I've written this test for both these scenarios:
function runTest(testName, numIterations, funcToTest) {
  console.log(`Running ${testName}...`);
  let dummyCalculation;
  const startTime = Date.now();
  const initialMemory = process.memoryUsage();
  const iterator = funcToTest(numIterations);
  for (let val of iterator) {
    dummyCalculation = numIterations - val;
  }
  const finalMemory = process.memoryUsage();
  // note: formatNumbers can be found here: https://jsfiddle.net/onz1ozjq/
  console.log(formatNumbers `Total time: ${Date.now() - startTime}ms`);
  console.log(formatNumbers `Rss: ${finalMemory.rss - initialMemory.rss}`);
  console.log(formatNumbers `Heap Total: ${finalMemory.heapTotal - initialMemory.heapTotal}`);
  console.log(formatNumbers `Heap Used: ${finalMemory.heapUsed - initialMemory.heapUsed}`);
}
Running the Tests
Then when running these two like so:
const numIterations = 999999; // 999,999
console.log(formatNumbers `Running tests with ${numIterations} iterations...\n`);
runTest("Array test", numIterations, getNumberArray);
console.log("");
runTest("Generator test", numIterations, getNumberGenerator);
I get results similar to this:
Running tests with 999,999 iterations...
Running Array test...
Total time: 105ms
Rss: 31,645,696
Heap Total: 31,386,624
Heap Used: 27,774,632
Running Function generator test...
Total time: 160ms
Rss: 2,818,048
Heap Total: 0
Heap Used: 1,836,616
Note: I am running these tests on node v4.1.1 on Windows 8.1. I am not using a transpiler and I'm running it by doing node --harmony generator-test.js.
Question
The increased memory usage with an array is obviously expected... but why am I consistently getting faster results for an array? What's causing the slowdown here? Is a yield just an expensive operation? Or is there something wrong with the method I'm using to check this?
The terribly unsatisfying answer is probably this: your ES5 function relies on features that (with the exceptions of let and const) have been in V8 since it was released in 2008 (and presumably for some time before that, since what became V8 originated as part of Google's web crawler). Generators, on the other hand, have only been in V8 since 2013. So not only has the ES5 code had seven years to be optimized while the ES6 code has had only two, but almost nobody (compared to the many millions of sites using code just like your ES5 code) is using generators in V8 yet, which means there has been very little opportunity to discover, or incentive to implement, optimizations for them.
If you really want a technical answer as to why generators are comparatively slow in Node.js, you'll probably have to dive into the V8 source yourself, or ask the people who wrote it.
In the OP's example, the generator will always be slower
While JS engine authors will continue working to improve performance, there are some underlying structural realities that virtually guarantee that, for the OP's test case, building the array will always be faster than using an iterator.
Consider that a generator function returns a generator object.
A generator object will, by definition, have a next() function, and calling a function in Javascript means adding an entry to your call stack. While this is fast, it's likely never going to be as fast as direct property access. (At least, not until the singularity.)
So if you are going to iterate over every single element in a collection, then a for loop over a simple array, which accesses elements via direct property access, is always going to be faster than a for loop over an iterator, which accesses each element via a call to the next() function.
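To make that overhead concrete, a for...of loop is roughly sugar for repeated next() calls, so every element costs at least one function invocation. This is a simplified sketch of the desugaring; real engines may optimize it further.
function* numbers() { yield 1; yield 2; yield 3; }

// Roughly what `for (const val of numbers()) { ... }` expands to:
const iterator = numbers()[Symbol.iterator]();
let result = iterator.next(); // one function call per element
while (!result.done) {
  const val = result.value;
  console.log(val);
  result = iterator.next();
}

// Versus a plain indexed loop over an array: direct property access, no calls.
const arr = [1, 2, 3];
for (let i = 0; i < arr.length; i++) {
  console.log(arr[i]);
}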
As I'm writing this in January of 2022, running Chrome 97, the generator function is 60% slower than the array function using the OP's example.
Performance is use-case-dependent
It's not difficult to imagine scenarios where the generator would be faster. The major downside to the array function is that it must build the entire collection before the code can start iterating over the elements, whether or not you need all the elements.
Consider a basic search operation which will only access, on average, half the elements of the collection. In this scenario, the array function exposes its "Achilles' heel": it must build an array with all the results, even though half will never be seen. This is where a generator has the potential to far outstrip the array function.
To demonstrate this, I slightly modified the OP's use-case. I made the elements of the array slightly more expensive to calculate (with a little division and square root logic) and modified the loop to terminate at about the halfway mark (to mimic a basic search).
Setup
function getNumberArray(maxValue) {
  const a = [];
  for (let i = 0; i < maxValue; i++) {
    const half = i / 2;
    const double = half * 2;
    const root = Math.sqrt(double);
    const square = Math.round(root * root);
    a.push(square);
  }
  return a;
}
function* getNumberGenerator(maxValue) {
  for (let i = 0; i < maxValue; i++) {
    const half = i / 2;
    const double = half * 2;
    const root = Math.sqrt(double);
    const square = Math.round(root * root);
    yield square;
  }
}
let dummyCalculation;
const numIterations = 99999;
const searchNumber = numIterations / 2;
Generator
const iterator = getNumberGenerator(numIterations);
for (let val of iterator) {
  dummyCalculation = numIterations - val;
  if (val > searchNumber) break;
}
Array
const iterator = getNumberArray(numIterations);
for (let val of iterator) {
  dummyCalculation = numIterations - val;
  if (val > searchNumber) break;
}
With this code, the two approaches are neck-and-neck. After repeated test runs, the generator and array functions trade first and second place. It's not difficult to imagine that if the elements of the array were even more expensive to calculate (for example, cloning a complex object, making a REST callout, etc), then the generator would win easily.
Considerations beyond performance
While recognizing that the OP's question is specifically about performance, I think it's worth calling out that generator functions were not primarily developed as a faster alternative to looping over arrays.
Memory efficiency
The OP has already acknowledged this, but memory efficiency is one of the main benefits that generators provide over building arrays. Generators can build objects on the fly and then discard them when they are no longer needed. In its most ideal implementation, a generator need only hold one object in memory at a time, while an array must hold all of them simultaneously.
For a very memory-intensive collection, a generator would allow the system to build objects as they are needed and then reclaim that memory when the calling code moves on to the next element.
Representation of non-static collections
Generators don't have to resolve the entire collection, which frees them up to represent collections that might not exist entirely in memory.
For example, a generator can represent collections where the logic to fetch the "next" item is time-consuming (such as paging over the results of a database query, where items are fetched in batches) or state-dependent (such as iterating over a collection where operations on the current item affect which item is "next") or even infinite series (such as a fractal function, random number generator or a generator returning all the digits of π). These are scenarios where building an array would be either impractical or impossible.
One could imagine a generator that returns procedurally generated level data for a game based on a seed number, or even to represent a theoretical AI's "stream of consciousness" (for example, playing a word association game). These are interesting scenarios that would not be possible to represent using a standard array or list, but where a loop structure might feel more natural in code.
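As a minimal sketch of that idea, an infinite series is trivial to express as a generator but impossible to materialize as an array:
// An infinite Fibonacci sequence; the caller decides how much to realize.
function* fibonacci() {
  let [a, b] = [0, 1];
  while (true) {
    yield a;
    [a, b] = [b, a + b];
  }
}

const firstTen = [];
for (const n of fibonacci()) {
  firstTen.push(n);
  if (firstTen.length === 10) break;
}
console.log(firstTen); // [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]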
FYI this question is ancient in internet terms and generators have caught up (at least when tested in Chrome) https://jsperf.com/generator-vs-loops1
Try replacing the 'let' in the generator function with a function scoped 'var'. It seems that the 'let' inside the loop incurs a lot of overhead. Only use let if you absolutely have to.
In fact, running this benchmark now, generators are ~2x faster.
I've modified the code slightly (moved let i) and here is the full gist: https://gist.github.com/antonkatz/6b0911c85ddadae39c434bf8ac32b340
On my machine, these are the results:
Running Array...
Total time: 4,022ms
Rss: 2,728,939,520
Heap Total: 2,726,199,296
Heap Used: 2,704,236,368
Running Generator...
Total time: 2,541ms
Rss: 851,968
Heap Total: 0
Heap Used: -5,073,968
I was very curious myself and could not find a proper answer. Thanks @David for providing the test code.

global variables javascript, which is faster "varname" or "window.varname"

var testvar = 'boat';
function testA() {
  console.log(testvar);
}
function testB() {
  console.log(window.testvar);
}
I know that if I don't put window. in front of my global variable, JavaScript searches outward through the scopes, from testA onward, until it finds testvar. So does writing window.testvar make it faster, because I'm telling JavaScript directly which scope to look in? Or slower, because I'm first telling it to look up the window object and then the variable on it?
Try both of the snippets below separately and see the results for yourself. This may not be the most accurate test case; however, by avoiding all other manipulation and doing a simple assignment inside a long enough for loop, it ought to be accurate enough.
I have to say I was also surprised to see that Chrome consistently reported about 20% faster execution for the second snippet, the one that does not specify window.
CODE 1
// window.testvar testcase
window.testvar = 'Hi there! I am a testvar!';
var tmp;
var start = new Date();
for (var i = 0; i < 1000000; i++) {
  tmp = window.testvar;
}
var stop = new Date();
console.log('This took exactly ' + (stop.getTime() - start.getTime()) + ' milliseconds!');
RESULTS:
1695ms
1715ms
1737ms
1704ms
1695ms
CODE 2
// direct testvar testcase
testvar = 'Hi there! I am a testvar!';
var tmp;
var start = new Date();
for (var i = 0; i < 1000000; i++) {
  tmp = testvar;
}
var stop = new Date();
console.log('This took exactly ' + (stop.getTime() - start.getTime()) + ' milliseconds!');
RESULTS:
1415ms
1450ms
1422ms
1428ms
1450ms
Tested in Chrome 20.0.1132.47.
Vedaant's jsPerf was not helpful: it was only creating the functions, not executing them. Try this one instead: http://jsperf.com/epictest/9. It too shows that not specifying window is faster. I also added a test to show that copying the global to a local variable first is dramatically faster. Vary the loop counter to see that you win for anything more than a single reference to the global.
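The local-copy pattern that test demonstrates looks roughly like this (a sketch of the idea, not the exact jsPerf code):
window.testvar = 'Hi there! I am a testvar!';

function sumLengths(iterations) {
  // Copy the global into a local once; subsequent reads resolve in the
  // innermost scope instead of reaching out to the global object each time.
  var local = window.testvar;
  var total = 0;
  for (var i = 0; i < iterations; i++) {
    total += local.length;
  }
  return total;
}

console.log(sumLengths(1000000));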
Chrome has a useful javascript CPU profiler. Just create a loop to run the function several thousand times and start the profiler. I'm guessing that the difference is very small but this would be a good way to know for sure.
I just made a jsPerf test for you, check it out at: http://jsperf.com/epictest. It seems that
function testA() {
  console.log(testvar);
}
is a bit faster.

Benchmark javascript execution with callback functions

I have some JavaScript whose execution time I'm trying to benchmark.
The problem is that the for loop completes quickly, while the Item.save() calls it started are not yet complete.
Any suggestions on how to time this so that the full execution time of the loop's contents is taken into account?
Thank you!
var start = new Date().getTime();
var store = new Item(); // `Item` is a model class defined elsewhere in the application
for (var i = 0; i < 500; i++) {
  var item = {};
  item.name = 5;
  item.id = 10;
  item.set = [];
  store.save(item, function (err, res) {
    console.log(res);
  });
}
var elapsed = new Date().getTime() - start;
console.log(elapsed);
EDIT: This is on a nodejs server.
Just use Chrome's profiling tools. They give you total insight into exactly how much CPU time every function call on your page is taking up:
http://code.google.com/chrome/devtools/docs/cpu-profiling-files/two_profiles.png
For Node, you can try node-inspector's experimental profiler.
The best way to handle this would be to modify the Item.save() function to take in the start time and then do your comparison at the very end. Or, implement a success callback on Item.save() and count completions.
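A sketch of that callback-counting approach, assuming (as in the question) an Item class whose save(item, callback) invokes its callback exactly once per call:
var start = new Date().getTime();
var store = new Item(); // `Item` defined elsewhere in the application
var pending = 500;

for (var i = 0; i < 500; i++) {
  var item = { name: 5, id: 10, set: [] };
  store.save(item, function (err, res) {
    // Stop the clock only once every save has called back.
    if (--pending === 0) {
      console.log('all saves done in ' + (new Date().getTime() - start) + 'ms');
    }
  });
}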
The answer is simple: create a jsPerf test case. It allows running asynchronous or “deferred” tests.
Alternatively, you could use Benchmark.js and set up a deferred test case manually.
Don’t simply compare two new Date timestamps, as that only works for synchronous tests. (Also, this is not an accurate way of measuring things across all browsers and devices.)
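For reference, a manually deferred Benchmark.js case might look like the sketch below (assuming the same Item/store setup as in the question):
// npm install benchmark
var Benchmark = require('benchmark');
var suite = new Benchmark.Suite();

suite.add('Item.save', {
  defer: true, // tells Benchmark.js this test finishes asynchronously
  fn: function (deferred) {
    store.save({ name: 5, id: 10, set: [] }, function (err, res) {
      deferred.resolve(); // mark this timed iteration as complete
    });
  }
})
.on('cycle', function (event) {
  console.log(String(event.target));
})
.run({ async: true });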
