I've been tinkering with a JavaScript chess engine for a while. Yeah yeah, I know (chuckles), not the best platform for that sorta thing. It's a bit of a pet project; I'm enjoying the academic exercise and am intrigued by the challenge of approaching compiled-language speeds. There are other quirky challenges in JavaScript, like the lack of 64bit integers, that make it unfit for chess, but paradoxically interesting, too.
A while back I realized that it was extremely important to be careful with constructs, function parameters, etc. Everything matters in chess programming, but it seems that a lot matters when working with JIT compilers (V8 Turbofan) via Javascript in Chrome.
Via some traces, I'm seeing some eager DEOPTs that I'm having trouble figuring out how to avoid.
DEOPT eager, wrong map
The code that's referenced by the trace:
if (validMoves.length) { ...do some stuff... }
The trace points directly to the validMoves.length argument of the IF conditional. validMoves is only ever an empty array [] or an array of move objects [{Move},{Move},...]
Would an empty array [] kick off a DEOPT?
Incidentally, I have lots of lazy and soft DEOPTs, but if I understand correctly, these are not so crucial and are just part of how V8 wraps its head around my code before ultimately optimizing it. In --trace-opt, the functions with soft and lazy DEOPTs do seem to eventually be optimized by Turbofan, and perhaps don't hurt performance much in the long run. (For that matter, the eager DEOPT'ed functions seem to eventually get reoptimized, too.) Is this a correct assessment?
Lastly, I have found at times that by breaking up functions that have shown DEOPTs, into multiple smaller function calls, I've had notable performance gains. From this I've inferred that the larger more complex functions are having trouble getting optimized and that by breaking them up, the smaller compartmentalized functions are being optimized and thus feeding my gains. Does that sound reasonable?
the lack of 64bit integers
Well, there are BigInts now :-)
(But in most engines/cases they're not suitable for high-performance operations yet.)
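To make the BigInt remark concrete, here is a minimal sketch of a 64-bit bitboard held in a BigInt, next to the older workaround of two 32-bit halves. The square mapping and the names RANK_2, pushPawns and toHalves are assumptions made up for illustration; they are not taken from the engine in the question.

```javascript
// A 64-bit bitboard as a BigInt (assumed mapping: bit 0 = a1, bit 63 = h8).
const RANK_2 = 0x000000000000ff00n;

// Shift every set bit one rank up; asUintN wraps the result back into 64 bits,
// so bits pushed past the top of the board are dropped.
function pushPawns(bitboard) {
  return BigInt.asUintN(64, bitboard << 8n);
}

// The same board split into two unsigned 32-bit halves stored as plain numbers,
// which is the usual workaround when BigInt is too slow.
function toHalves(bitboard) {
  return {
    lo: Number(bitboard & 0xffffffffn),
    hi: Number(bitboard >> 32n),
  };
}
```

Whether the BigInt version is fast enough for a search loop is something only a measurement can settle; the two-halves form keeps everything in ordinary numbers at the cost of more bookkeeping.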
Would an empty array [] kick off a DEOPT?
Generally no. There are, however, different internal representations of arrays, so that may or may not be what's going on there.
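One plausible reading of the "wrong map" message, offered as an assumption rather than a diagnosis: V8 tracks an "elements kind" per array, and an array's map changes when its elements kind changes. An empty [] literal that never receives an element stays in the small-integer kind, while an array that has had objects pushed into it has transitioned to the generic kind. The sketch below uses hypothetical generateMoves and hasMoves helpers to show where the two representations could come from.

```javascript
// Hypothetical move generator; the names and move fields are illustrative only.
function generateMoves(count) {
  const moves = []; // an empty literal starts out as PACKED_SMI_ELEMENTS in V8
  for (let i = 0; i < count; i++) {
    // The first push of an object transitions the array to PACKED_ELEMENTS.
    moves.push({ from: i, to: i + 8 });
  }
  // With count === 0 the array never transitioned, so it still has the
  // map of a small-integer array.
  return moves;
}

// If this function was optimized after only seeing non-empty move lists,
// an empty list would present a map it has not seen before.
function hasMoves(validMoves) {
  return validMoves.length > 0;
}
```

If that is what is happening, it is harmless: the function gets reoptimized with both maps in its feedback.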
[lazy, soft, eager...] Is this a correct assessment?
Generally yes. Usually you don't have to worry about deopts, especially for long-running programs that experience a few deopts early on. This is true for all the adjectives that --trace-deopt reports -- those are all just internal details. ("eager" and "lazy" are direct opposites of each other and simply indicate whether the activation of the function that had to be deoptimized was top-of-stack or not. "soft" is a particular reason for a deopt, namely a lack of type feedback, and V8 choosing to deoptimize instead of generating "optimized" code despite lack of type feedback, which wouldn't be very optimized at all.)
There are very few cases where you, as a JavaScript developer, might want to care about deopts. One example is when you've encountered a case where the same deopt happens over and over again. That's a bug in V8 when it happens; these "deopt loops" are rare, but occasionally they do occur. If you have found such a case, please file a bug with repro instructions.
Another case is when every CPU cycle matters, especially during startup / in short-running applications, and some costly function gets deoptimized for a reason that might be avoidable. That doesn't seem to be your case though.
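For anyone who wants to see a deopt in a trace, here is a small sketch that is likely, though not guaranteed, to produce one when run with node --trace-opt --trace-deopt. The add function and the call pattern are made up for illustration; which kind of deopt is reported, and whether one happens at all, depends on the V8 version.

```javascript
// Hypothetical hot function: during warm-up it only ever sees small integers.
function add(a, b) {
  return a + b;
}

let total = 0;
for (let i = 0; i < 100000; i++) {
  total = add(total, 1); // consistent number + number feedback
}

// A call with previously unseen types contradicts the assumptions baked into
// any optimized code for add, so V8 may throw that code away here and
// reoptimize later with the wider feedback.
const label = add("move-", "e2e4");
```

A single deopt like this early in a long-running program is exactly the kind that does not need attention.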
[breaking up functions...] Does that sound reasonable?
Breaking up functions can be beneficial, yes; especially if the functions you started with were huge. Generally, functions of all sizes get optimized; obviously larger functions take longer to optimize. This is a tricky area with no simple answers; if functions are too small then that's not helpful for performance either. V8 will perform some inlining, but the decisions are based on heuristics that naturally aren't always perfect. In my experience, manually splitting functions can in particular pay off for long-running loops (where you'd put the loop into its own function).
EDIT: to elaborate on the last point as requested, here's an example: instead of
function big() {
  for (...) {
    // long-running loop
  }
  /* lots more stuff... */
}
You'd split it as:
function loop() {
  for (...) {
    // same loop as before
  }
}
function outer() {
  loop();
  /* same other stuff as before */
}
For a short loop, this is totally unnecessary, but if significant time is spent in the loop and the overall size of the function is large, then this split allows optimization to happen in more fine-grained chunks and with fewer ("soft") deopts.
And to be perfectly clear: I only recommend doing this if you are seeing a particular problem (e.g.: --trace-opt telling you that your biggest function is optimized two or more times, taking a second each time). Please don't walk away from reading this answer thinking "everyone should always split their functions", that's not at all what I'm saying. In extreme cases of huge functions, splitting them can be beneficial.
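A runnable version of the same split, with made-up names (sumSquares, report) and a trivial loop body standing in for the real work, might look like this. It is only a sketch of the shape of the refactoring, not a claim that this particular loop needs it.

```javascript
// The hot loop lives in its own small function, so it can be optimized
// independently of whatever surrounds the call.
function sumSquares(values) {
  let sum = 0;
  for (let i = 0; i < values.length; i++) {
    sum += values[i] * values[i];
  }
  return sum;
}

// The (in reality much larger) surrounding function only calls the loop.
function report(values) {
  const sum = sumSquares(values);
  /* lots more stuff would go here */
  return "sum of squares: " + sum;
}
```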
Let's say we have the following function:
const x = a => a;
const result = x('hello')
Do we have any guarantees in Google V8 / Firefox Quantum that x will be optimized to const result = 'hello'?
Why am I asking?
Please see my answer. Sometimes the only way to infer a type in TypeScript is to write a simple function.
type Validation<T> = T
const x = <T>(arg:Validation<T>)=>arg
It adds overhead. So I'm curious: can I use this kind of technique for type inference without worrying about function overhead?
(V8 developer here.)
Generally speaking: there are no guarantees about what will or won't get optimized. An engine's optimization decisions also change over time: the goal is not to optimize as much as possible (because optimization itself has a cost that isn't always worth it); the goal is to optimize the right things at the right time. That may well mean that an engineering team decides to make their engine optimize a little less, for overall better performance (or less jankiness, or less memory consumption, or whatever).
What will likely happen in this specific case: it depends (as usual). If that function's definition and its call site are top-level code (executed a single time), it is very likely that it won't get optimized -- because optimizing such code is usually not worth it. You won't be able to measure the difference, but if you were able to measure it, you'd see that not optimizing is faster for code that runs only once.
If, on the other hand, this identity function is called from "hot" code, where that hot code itself is selected for optimization after a while, then it's very likely that the optimizing compiler will inline the function, and then (trivially) optimize it away.
If the definition of the identity function is executed repeatedly, then (at least before/unless inlining happens) this will unnecessarily create several function objects (because JavaScript functions have object identity). That's an inefficiency that's easy to avoid (so I'd avoid it, personally); but again: whether it really matters, i.e. whether it has a measurable effect, depends on how often the code is executed.
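The point about object identity can be checked directly. In this sketch, makeIdentityEachTime and getSharedIdentity are hypothetical helpers: the first re-evaluates the arrow function expression on every call, the second hands out one function defined once.

```javascript
// Re-evaluating the arrow function expression creates a new function object
// on every call, even though the code is identical.
function makeIdentityEachTime() {
  return (a) => a;
}

// Defined once: every caller shares the same function object.
const identity = (a) => a;
function getSharedIdentity() {
  return identity;
}

const distinct = makeIdentityEachTime() !== makeIdentityEachTime(); // true
const shared = getSharedIdentity() === getSharedIdentity(); // true
```

Hoisting the definition out of repeatedly executed code avoids the extra allocations regardless of what the optimizer later decides to inline.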
In short: it's probably fine not to worry about function overhead in such a case; however there is no guarantee that calls to the identity function will get optimized away.
(Taking a step back, looking at the broader background: I don't know much about TypeScript's advanced features, but my gut feeling is that some sort of plugin for the TS compiler might be a more elegant way to enforce particular typechecks for literals. If strings are constructed at runtime, then TS's checks won't help anyway, just like the rest of TS's type checking system.)
Question for the v8 developers/experts.
Is it correct to assume that V8 will completely eliminate dead code structured like this:
module1.js
export const DEBUG = false
module2.js
import { DEBUG } from './module1.js'
if (DEBUG) {
  // dead code eliminated?
}
Please no comments like "the overhead of an 'if' check is very small and you should XXX instead of asking this question". I just want to know if V8 is capable of this (yes/no, preferably with a few details of course).
Thank you!
V8 developer here. The answer is, as so often, "it depends".
V8's optimizing compiler supports dead code elimination, so yes, under the right circumstances, a conditional branch that can never be taken will get eliminated.
That said, in the specific example you posted, the top-level code won't get optimized (probably -- depends on what else is in there), so in that case no, the if (DEBUG) check will be compiled (to unoptimized bytecode) and executed -- once, because executing it once is way faster than first trying to optimize (and possibly eliminate) it.
Another thing to consider is that V8 compiles functions "lazily", i.e. on demand. That means if you have an entire function that never gets called (e.g. because its only call site is in an if (DEBUG)-block and DEBUG is false), then that function won't even get compiled to bytecode, much less optimized code. That isn't dead code elimination in the traditional meaning of the term, but one could say that it's even better :-)
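The two situations described above can be put side by side in a short sketch. Here DEBUG is a local constant standing in for the imported one, and dumpState and sum are hypothetical names chosen for illustration.

```javascript
const DEBUG = false; // stands in for the constant imported from module1.js

// Only reachable from inside if (DEBUG) blocks. With DEBUG === false it is
// never called, so under lazy compilation it should never be compiled at all.
function dumpState(values) {
  console.log("state:", values);
}

// If this function becomes hot and gets optimized, the compiler is in a
// position to drop the branch; until then the check simply runs and is cheap.
function sum(values) {
  let total = 0;
  for (let i = 0; i < values.length; i++) {
    if (DEBUG) dumpState(values);
    total += values[i];
  }
  return total;
}
```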
In conclusion: if you have a little DEBUG-code sprinkled over your app, it's totally fine to leave it in. Either it'll be in rarely executed paths, in which case the cost of executing the checks doesn't matter; or it'll be in hot paths, in which case V8 will optimize it and eliminate the conditionals. However, if you have lots of such code, then removing it would have two advantages: download size, and parsing time. When JavaScript code arrives in the browser, the engine has no choice but to look at every single byte of it at least briefly (if only to figure out which functions there are, and which parts of the code are in the top-level and must be executed immediately), and the fewer bytes there are, the more quickly that step completes. Parsing is fast, but parsing half as much is even faster!
Running in-browser Javascript in Chrome 79 for Windows :: From similar threads, it sounds like self-time includes only time to run the in-line code within a particular function, and excludes any time spent running sub functions.
But, in practice, I have noticed that some functions in my app that make a lot of sub-calls seem to have an inordinate amount of self-time compared to other similarly-sized functions with little or no sub-calls. (i.e. I'm comparing two functions with relatively similar operations, as well as number of ops). The self-time of these two functions can vary by 10x.
I'm wondering if self-time includes time to prepare for those calls, etc?
Perhaps, it's possible that some of that higher self-time of the function with sub-calls is due to later optimization by V8 and therefore during the sample time of the profiler, I'm comparing the self-time of an optimized function vs a not-yet-optimized function, which could run 100x slower prior to optimization. Maybe this is the culprit?
self-time includes only time to run the in-line code within a particular function, and excludes any time spent running sub functions
Yes, "self time" is the number of tick samples that occurred in the given function.
I'm wondering if self-time includes time to prepare for those calls, etc?
"time to prepare calls" is not measured separately.
I have noticed that some functions in my app that make a lot of sub-calls seem to have an inordinate amount of self-time
I would guess that what you're observing is caused by inlining. When a function gets optimized, and the compiler decides to inline one or more called functions, then the profiler afterwards can't possibly distinguish which instructions originally came from where (part of the reason why inlining can be beneficial is because it can allow elimination of redundancies, which naturally blurs the lines of which original function a given instruction "belongs to"). Does that make sense?
If you want to exclude the effects of inlining when profiling, you can turn off inlining. In Node/V8, you can run with --noturbo-inlining. (FWIW, in C/C++ this is true as well, where GCC/Clang understand -fno-inline.) Note that turning off inlining changes the performance characteristics of your app, so it can yield misleading results (specifically: it could be that without inlining you'll observe a performance issue that simply goes away when inlining is turned on); but it can also be helpful for pinpointing what is slow.
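As a small illustration of how inlining can shift self time, consider the sketch below; square and sumOfSquares are made-up names. If the optimizer inlines square into the loop, samples that "belong" to square would show up as self time of sumOfSquares, and running the same script with the flag mentioned above should move them back.

```javascript
// Tiny callee: a typical inlining candidate.
function square(x) {
  return x * x;
}

// Caller with a hot loop. After inlining, the multiplication executes inside
// this function's optimized code, so a sampling profiler attributes it here.
function sumOfSquares(n) {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += square(i);
  }
  return total;
}

const result = sumOfSquares(1000);
```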
I was writing some JavaScript code in which I needed to show and hide some sections of a web page. I ended up with functions like these:
function hideBreakPanel() {
  $('section#break-panel').addClass('hide');
}
function hideTimerPanel() {
  $('section#timer-panel').addClass('hide');
}
function showBreakPanel() {
  resetInputValues();
  $('section#break-panel').removeClass('hide');
}
function showTimerPanel() {
  resetInputValues();
  $('section#timer-panel').removeClass('hide');
}
My question is related to code quality and refactoring. When is it better to have simple functions like these, and when is it better to invoke a JavaScript/jQuery function directly? I suppose that the latter approach has better performance, but in this case performance is not a problem as it is a really simple site.
I think you're fine with having functions like these; after all, hideBreakPanel might later involve something more than applying a class to an element. The only thing I'd point out is to try to minimize the amount of repeated code in those functions. Don't worry about the added function call overhead: unless you're doing this in a performance-critical scenario, the runtime couldn't care less.
One way you could arrange the functions to avoid repeating yourself:
function hidePanel(name) {
  $('section#' + name + '-panel').addClass('hide');
}
function showPanel(name) {
  resetInputValues();
  $('section#' + name + '-panel').removeClass('hide');
}
If you absolutely must have a shorthand, you can then do:
function hideBreakPanel() {
  hidePanel("break");
}
Or even
var hideBreakPanel = hidePanel.bind(hidePanel, "break");
This way you encapsulate common functionality in a function, and you won't have to update all your hide functions to amend the way hiding is done.
My question is related to code quality and refactoring. When is it
better to have simple functions like these, and when is it better to
invoke a JavaScript/jQuery function directly? I suppose that the latter
approach has better performance, but in this case performance is not a
problem as it is a really simple site.
Just from a general standpoint, you can get into a bit of trouble later if you have a lot of one-liner functions, or multiple lines of code crammed into one, when the goal is merely syntactic sugar and a very personal definition of clarity (which can be quite transient and change like fashion trends).
It's because the quality that gives code longevity is often, above all, familiarity and, to a lesser extent, centralization (less branches of code to jump through). Being able to recognize and not absolutely loathe code you wrote years later (not finding it bizarre/alien, e.g.) often favors those qualities that reduce the number of concepts in the system, and flow down towards very idiomatic use of languages and libraries. There are human metrics here beyond formal SE metrics like just being motivated to keep maintaining the same code.
But it's a balancing act. If the motivation to seek these shorter and sweeter function calls has more to do with concepts beyond syntax like having a central place to modify and extend and instrument the behavior, to improve safety in otherwise error-prone code, etc., then even a bunch of one-liner functions could start to become of great aid in the future. The key in that case to keep the familiarity is to make sure you (and your team if applicable) have plenty of reuse for such functions, and incorporate it into the daily practices and standards.
Idiomatic code tends to be quite safe here because we tend to be saturated by examples of it, keeping it familiar. Any time you start going deep on the end of establishing proprietary interfaces, we risk losing that quality. Yet proprietary interfaces are definitely needed, so the key is to make them count.
Another kind of esoteric view is that functions that depend on each other tend to age together. An image processing function that just operates on very simple types provided by a language tends to age well. We can find, for example, C functions of this sort that are still relevant and easily-applicable today that date back all the way to the 80s. Such code stays familiar. If it depends on a very exotic pixel and color library and math routines outside of the norm, then it tends to age a lot more quickly (loses the familiarity/applicability), because that image processing routine now ages with everything it depends on. So again, always with an eye towards tightrope-balancing and trade-offs, it can sometimes be useful to avoid the temptation to venture too far outside the norms, and avoid coupling your code to too many exotic interfaces (especially ones that serve little more than sugar). Sometimes the slightly-more verbose form of code that favors reducing the number of concepts and more directly uses what is already available in the system can be preferable.
Yet, as is often the case, it depends. But these might be some less frequently-mentioned qualities to keep in mind when making all of your decisions.
If resetInputValues() returns undefined (i.e. returns nothing) or any other falsy value, you could refactor it to:
function togglePanel(type, toHide) {
  // toggleClass only forces add/remove when the state is a real boolean, so coerce it
  $('section#' + type + '-panel').toggleClass('hide', Boolean(toHide || resetInputValues()));
}
Use e.g. togglePanel('break') for showBreakPanel() and togglePanel('break', true) for hideBreakPanel().
Are there any good tutorials on how to write fast, efficient code for v8 (specifically, for node.js)?
What structures should I avoid using? What are the idioms that v8 optimises well?
From my experience:
It does inlining
Function call overhead is minimal (inlining)
What is expensive is to pass huge strings to functions, since those need to be copied and from my experience V8 isn't always as smart as it could be in this case
Scope lookup is expensive (surprise)
Don't do tricks. E.g. I have a binary encoder for JS objects where I cranked out some extra performance with bit shifting (instead of Math.floor); the latest Crankshaft (yes, alpha, but still) runs that code 30% slower
Don't use magic: eval, arguments.callee, etc. Those pretty much kill any optimization since the code can no longer be inlined
Some of the new ES5 stuff e.g. .bind() is really slow in V8 at the moment
Somehow new Object() and new Array() are a bit faster currently (MICROoptimization; unless you're writing some crazy encoder, stick with {} and [])
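On the bit-shifting trick from the list above: apart from any speed difference, bitwise operators are not a drop-in replacement for Math.floor, which is one more reason to be wary of them. The values below are just illustrative inputs.

```javascript
// Negative numbers: Math.floor rounds down, bitwise OR truncates toward zero.
const viaFloor = Math.floor(-1.5); // -2
const viaOr = -1.5 | 0; // -1

// Large numbers: bitwise operators convert to a signed 32-bit integer first,
// so anything at or above 2^31 wraps around.
const bigFloor = Math.floor(2 ** 31 + 0.5); // 2147483648
const bigOr = (2 ** 31 + 0.5) | 0; // -2147483648
```

For non-negative values below 2^31 the two agree, which is why the trick appears to work in many encoders.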
My rules:
Write good code
Write working code
Write code that works in strict mode (support still has to land, but when it does further optimization can be applied by V8)
If you're a JS expert and you're already applying all good practices to your code, there's hardly anything you can do to improve performance.
If you encounter performance issues:
Verify them
Change the code / algorithm
And as a last resort: Write a C++ extension (and watch every commit to ry/node on GitHub since nobody cares whether some internal changes break your build)
The docs give a great answer: http://code.google.com/apis/v8/design.html
Understanding V8 is a set of slides from nodecamp.eu and gives some very interesting tips. In particular, I found the notes on avoiding "dictionary mode" useful, i.e. it helps if you keep the "shape" of objects constant and don't add arbitrary properties to them.
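A minimal sketch of what keeping the shape constant can look like in practice, assuming a hypothetical createMove factory (the property names are invented for illustration): every object is created with the same properties in the same order, and later code only assigns to properties that already exist.

```javascript
// Every move is created with the same properties in the same order, with
// placeholders for values that are not known yet.
function createMove(from, to) {
  return { from: from, to: to, captured: null, promotion: null };
}

const quiet = createMove(12, 28);
const capture = createMove(28, 35);

// Assigning to an existing property does not add anything to the object.
capture.captured = "p";

// Patterns to avoid on hot objects: bolting on properties later or deleting
// them, e.g. capture.note = "..." or delete quiet.promotion.
```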
You should also run node with --crankshaft --trace-opt --trace-bailout (the --crankshaft is only needed on 64-bit platforms e.g. OS X) to see whether V8 is "bailing" on JITing certain functions. There is a ton of other trace options including --trace-gc and various other GC tracing, which can be useful for optimisation.
Let me know if you have any specific questions about the slides above as they're a bit concise. :-) They're not mine but I've done some research about the areas they cover.