JavaScript: performance constraints of `delete` keyword

I'm trying to better learn how JS works under the hood and I've heard in the past that the delete keyword (specifically node.js or browsers using V8) results in poor performance, so I want to see if I can figure out what the benefits/detriments are for using that keyword.
I believe the reasoning for not using delete is that removing a property leads to a rebuilding of hidden class transitions and thus a recompiling of the inline cache. However, I believe it is also true that the object prototype will no longer enumerate that property, so if the object is used heavily the upfront cost may eventually pay off.
Are my assumptions about the tradeoffs correct?
If they are correct, is one factor more important than the other (e.g. is rebuilding the IC much more expensive than many prototype enumerations)?

V8 developer here. Short answer: "it depends".
Having an unused property doesn't hurt; there is no general "enumeration cost" unless you actually perform explicit enumerations. In other words, an "enumeration cost" only exists if you find yourself doing something like this:
for (var p in object) {
  if (p === old_property_that_I_could_have_deleted) continue;
  /* process other properties... */
}
The key reason why it's hard to give a concrete answer (or to provide a canonical example where an effect would be measurable) is because the effects are non-local: they depend both on what exactly you're doing with the object in question, and on what the rest of your app is doing. Deleting a property from one object may well cause operations on other objects to become slower. Or faster. It depends.
To take a step back and look at the high-level situation: JavaScript as a language sort of assumes that objects are represented as dictionaries. Deleting an entry in a dictionary should be perfectly fine, which is why it makes sense that the delete operator exists. In practice, it turns out that an engine can achieve huge performance improvements for read-heavy apps, which is by far the most common case, if it does not store objects as dictionaries, but instead more like something that resembles C/C++ structs. However, such an object representation is (1) generally hard/inefficient to do when properties get deleted, and (2) the engine may well interpret even the first deletion of a property as a hint that the programmer wants this particular object to behave like a dictionary, so it might switch the internal representation over. If a fast-to-modify dictionary is what you wanted, then that's fine (it will provide a benefit even); however if you wanted the object to remain in slow-to-modify/fast-to-read mode, you would perceive the transition to fast-to-modify/slow-to-read dictionary mode as a performance problem.
Thankfully there is a great solution nowadays: when you want a dictionary, use a Map or Set. Engines can (and usually will) assume that you'll want to delete entries from these, so the implementations are optimized for making that possible without negative side effects; in particular no hidden classes are involved.
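As a minimal sketch of the "use a Map when you want a dictionary" advice: the session store, its keys, and the helper names below are made up for illustration; only Map's own set/delete/has/size come from the language.

```javascript
// Hypothetical session store: keys come and go at runtime, so a Map fits.
const sessions = new Map();

function addSession(id, user) {
  sessions.set(id, user);
}

function removeSession(id) {
  // Map.prototype.delete returns true if the key existed, false otherwise.
  return sessions.delete(id);
}

addSession("a1", { name: "Ada" });
addSession("b2", { name: "Bob" });
const removed = removeSession("a1");
```

Entries are removed with sessions.delete(id) rather than the delete operator, so no plain object has a property deleted.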
A few remarks on your assumptions: deleting a property makes an object (mostly) leave the system of hidden class transitions; no transitions will be rebuilt. There is no single global "inline cache"; there are many inline caches sprinkled all over your functions. They don't get rebuilt; they just transition to slower and slower modes the more different cases they have to handle. (That's generally how caching works: caching a single case provides huge speedups; on the other end of the scale, if you have as many different cases as executions, then a cache just wastes time and memory without providing any benefit.) Again, the effect of dictionary-mode objects depends on the overall situation: an inline cache dealing with (mostly) dictionary-mode objects typically exhibits performance somewhere in between (1) an inline cache that only has to deal with objects sharing the single same hidden class, and (2) an inline cache that has to deal with hundreds or thousands of different hidden classes.
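To make the "many inline caches, each handling more or fewer cases" point concrete, here is a rough sketch. The function and data are hypothetical. How the engine classifies these objects internally is not observable from script, so the comments describe what may happen rather than what is guaranteed. The results are the same either way; only the work the engine has to do at the p.x load site can differ.

```javascript
// Hypothetical hot function: the property load `p.x` is one inline-cache site.
function getX(p) {
  return p.x;
}

// Same properties added in the same order: these objects can share one hidden class.
const uniform = [{ x: 1, y: 2 }, { x: 3, y: 4 }, { x: 5, y: 6 }];

// Different property sets and orders, plus one object that had a property deleted,
// so the same load site has to handle several different cases.
const deleted = { x: 9, y: 10, tmp: 0 };
delete deleted.tmp;
const mixed = [{ x: 1 }, { y: 2, x: 3 }, { z: 0, x: 5 }, deleted];

function sumX(points) {
  let total = 0;
  for (let i = 0; i < points.length; i++) total += getX(points[i]);
  return total;
}

const uniformSum = sumX(uniform);
const mixedSum = sumX(mixed);
```

Whether the difference is measurable depends, as said above, on what the rest of the app is doing.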


Why is getting from Map slower than getting from object?

I'm considering migrating my state management layer to using Map versus using a standard object.
From what I've read, Map is effectively a hash table whereas Objects use hidden classes under the hood. The general advice is that where properties are likely to be dynamically added or removed, it's more efficient to use a Map.
I set up a little test and to my surprise accessing values in the Object version was faster.
The article also mentions fast and slow properties. Perhaps the reason my code sample in test1 is so fast is because it is using fast properties? This seems unlikely as the object has 100,000 keys. How can I tell if the object is using fast properties or dictionary lookup? Why would the Map version be slower?
And yes, in practice, looks like a premature optimization, root of all evil ... etc etc. However, I'm interested in the internals and curious to know of best practices of choosing Map over Object.
(V8 developer here.)
Beware of microbenchmarks, they are often misleading.
V8's object system is implemented the way it is because in many cases it turns out to be very fast -- as you can see here.
The primary reason why we recommend using Map for map-like use cases is because of non-local performance effects that the object system can exhibit when certain parts of the machinery get "overloaded". In a small test like the one you have created, you won't see this effect, because nothing else is going on. In a large app (using many objects with many properties in many different usage patterns), it's still not guaranteed (because it depends on what the rest of the app is doing) but there's a good chance that using Maps where appropriate will improve overall performance -- if the overall system previously happened to run into one of the unfortunate situations.
Another reason is that Maps handle deletion of entries much better than Objects do, because that's a use case their implementation explicitly anticipates as common.
That said, as you already noted, worrying about such details in the abstract is a case of premature optimization. If you have a performance problem, then profile your app to figure out where most time is being spent, and then focus on improving those areas. If you do end up suspecting that the use of objects-as-maps is causing issues, then I recommend changing the implementation in the app itself, and measuring (with the real app!) whether it makes a difference.
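For the "measure in the real app" step, a small timing wrapper around an actual operation can be enough to compare before and after. This is only a sketch: timeIt and updateState are hypothetical names, and updateState stands in for whatever real code path you suspect. performance.now() is available as a global in browsers and in recent Node.js versions.

```javascript
// Hypothetical helper: runs `fn` once and reports how long it took in milliseconds.
function timeIt(label, fn) {
  const start = performance.now();
  const result = fn();
  const elapsedMs = performance.now() - start;
  return { label, result, elapsedMs };
}

// Stand-in for a real operation in the app (an assumption for this sketch).
function updateState(store, n) {
  for (let i = 0; i < n; i++) store.set("key" + i, i);
  return store.size;
}

const report = timeIt("updateState with Map", () => updateState(new Map(), 1000));
```

A single run like this is noisy; a profiler remains the better first tool, and the wrapper is mainly useful for checking a specific change under realistic load.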
(See here for a related, similarly misleading microbenchmark, where even the microbenchmark itself started producing opposite results after minor modifications: Why "Map" manipulation is much slower than "Object" in JavaScript (v8) for integer keys?. That's why we recommend benchmarking with real apps, not with simplistic miniature scenarios.)

Avoid garbage collection in high performance JavaScript applications [duplicate]

I have a fairly complex Javascript app, which has a main loop that is called 60 times per second. There seems to be a lot of garbage collection going on (based on the 'sawtooth' output from the Memory timeline in the Chrome dev tools) - and this often impacts the performance of the application.
So, I'm trying to research best practices for reducing the amount of work that the garbage collector has to do. (Most of the information I've been able to find on the web regards avoiding memory leaks, which is a slightly different question - my memory is getting freed up, it's just that there's too much garbage collection going on.) I'm assuming that this mostly comes down to reusing objects as much as possible, but of course the devil is in the details.
The app is structured in 'classes' along the lines of John Resig's Simple JavaScript Inheritance.
I think one issue is that some functions can be called thousands of times per second (as they are used hundreds of times during each iteration of the main loop), and perhaps the local working variables in these functions (strings, arrays, etc.) might be the issue.
I'm aware of object pooling for larger/heavier objects (and we use this to a degree), but I'm looking for techniques that can be applied across the board, especially relating to functions that are called very many times in tight loops.
What techniques can I use to reduce the amount of work that the garbage collector must do?
And, perhaps also - what techniques can be employed to identify which objects are being garbage collected the most? (It's a fairly large codebase, so comparing snapshots of the heap has not been very fruitful.)
A lot of the things you need to do to minimize GC churn go against what is considered idiomatic JS in most other scenarios, so please keep in mind the context when judging the advice I give.
Allocation happens in modern interpreters in several places:
When you create an object via new or via literal syntax [...], or {}.
When you concatenate strings.
When you enter a scope that contains function declarations.
When you perform an action that triggers an exception.
When you evaluate a function expression: (function (...) { ... }).
When you perform an operation that coerces to Object like Object(myNumber) or
When you call a builtin that does any of these under the hood, like Array.prototype.slice.
When you use arguments to reflect over the parameter list.
When you split a string or match with a regular expression.
Avoid doing those, and pool and reuse objects where possible.
Specifically, look out for opportunities to:
Pull inner functions that have no or few dependencies on closed-over state out into a higher, longer-lived scope. (Some code minifiers like Closure compiler can inline inner functions and might improve your GC performance.)
Avoid using strings to represent structured data or for dynamic addressing. Especially avoid repeatedly parsing using split or regular expression matches since each requires multiple object allocations. This frequently happens with keys into lookup tables and dynamic DOM node IDs. For example, lookupTable['foo-' + x] and document.getElementById('foo-' + x) both involve an allocation since there is a string concatenation. Often you can attach keys to long-lived objects instead of re-concatenating. Depending on the browsers you need to support, you might be able to use Map to use objects as keys directly.
Avoid catching exceptions on normal code-paths. Instead of try { op(x) } catch (e) { ... }, do if (!opCouldFailOn(x)) { op(x); } else { ... }.
When you can't avoid creating strings, e.g. to pass a message to a server, use a builtin like JSON.stringify which uses an internal native buffer to accumulate content instead of allocating multiple objects.
Avoid using callbacks for high-frequency events, and where you can, pass as a callback a long-lived function (see 1) that recreates state from the message content.
Avoid using arguments since functions that use that have to create an array-like object when called.
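Two of the items above can be sketched together: hoisting a callback into a longer-lived scope, and keying a lookup by the objects themselves instead of by a concatenated string. All names here (labelByNode, registerNode, addWeight, totalWeight) are hypothetical, and the sketch assumes Map is available in your target environments.

```javascript
// Long-lived lookup keyed by the objects themselves, so no 'foo-' + id string
// has to be built for each lookup.
const labelByNode = new Map();

function registerNode(node, label) {
  labelByNode.set(node, label);
}

function labelOf(node) {
  return labelByNode.get(node);
}

// Hoisted callback: created once, not re-created on every call to totalWeight().
let runningTotal = 0;
function addWeight(node) {
  runningTotal += node.weight;
}

function totalWeight(nodes) {
  runningTotal = 0;
  nodes.forEach(addWeight);
  return runningTotal;
}

const nodes = [{ weight: 2 }, { weight: 5 }];
registerNode(nodes[0], "first");
registerNode(nodes[1], "second");
```

The trade-off is the one mentioned at the top of this answer: state shared through an outer variable is less idiomatic than a closure and is not re-entrant.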
I suggested using JSON.stringify to create outgoing network messages. Parsing input messages using JSON.parse obviously involves allocation, and lots of it for large messages. If you can represent your incoming messages as arrays of primitives, then you can save a lot of allocations. The only other builtin around which you can build a parser that does not allocate is String.prototype.charCodeAt. A parser for a complex format that only uses that is going to be hellish to read though.
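As an illustration of a charCodeAt-based parser, here is a sketch for a deliberately trivial, made-up format: comma-separated non-negative integers. It avoids split and regular expressions on the normal path; the only allocation is the Error on malformed input.

```javascript
// Sums comma-separated non-negative integers, e.g. "12,7,30",
// reading one character code at a time instead of splitting the string.
function sumCsvInts(text) {
  let total = 0;
  let current = 0;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code === 44) {
      // 44 is ','
      total += current;
      current = 0;
    } else if (code >= 48 && code <= 57) {
      // 48..57 are '0'..'9'
      current = current * 10 + (code - 48);
    } else {
      throw new Error("Unexpected character at index " + i);
    }
  }
  return total + current;
}
```

Even at this size the readability cost is visible, which is the point made above about more complex formats.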
The Chrome developer tools have a very nice feature for tracing memory allocation. It's called the Memory Timeline. This article describes some details. I suppose this is what you're talking about re the "sawtooth"? This is normal behavior for most GC'ed runtimes. Allocation proceeds until a usage threshold is reached triggering a collection. Normally there are different kinds of collections at different thresholds.
Garbage collections are included in the event list associated with the trace along with their duration. On my rather old notebook, ephemeral collections are occurring at about 4Mb and take 30ms. This is 2 of your 60Hz loop iterations. If this is an animation, 30ms collections are probably causing stutter. You should start here to see what's going on in your environment: where the collection threshold is and how long your collections are taking. This gives you a reference point to assess optimizations. But you probably won't do better than to decrease the frequency of the stutter by slowing the allocation rate, lengthening the interval between collections.
The next step is to use the Profiles | Record Heap Allocations feature to generate a catalog of allocations by record type. This will quickly show which object types are consuming the most memory during the trace period, which is equivalent to allocation rate. Focus on these in descending order of rate.
The techniques are not rocket science. Avoid boxed objects when you can do with an unboxed one. Use global variables to hold and reuse single boxed objects rather than allocating fresh ones in each iteration. Pool common object types in free lists rather than abandoning them. Cache string concatenation results that are likely reusable in future iterations. Avoid allocation just to return function results by setting variables in an enclosing scope instead. You will have to consider each object type in its own context to find the best strategy. If you need help with specifics, post an edit describing details of the challenge you're looking at.
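A free-list pool of the kind mentioned above might look roughly like this. The vector type and the function names are hypothetical; a real pool would also need a policy for how large the free list may grow.

```javascript
// Hypothetical pool for short-lived 2D vectors.
const freeVectors = [];

function acquireVector(x, y) {
  // Reuse a released object if one is available, otherwise allocate a new one.
  const v = freeVectors.length > 0 ? freeVectors.pop() : { x: 0, y: 0 };
  v.x = x;
  v.y = y;
  return v;
}

function releaseVector(v) {
  // The caller must not use `v` after releasing it.
  freeVectors.push(v);
}

const first = acquireVector(1, 2);
releaseVector(first);
const second = acquireVector(3, 4); // reuses the object that `first` referred to
```

The main risk is using an object after it has been released, which the language will not catch for you.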
I advise against perverting your normal coding style throughout an application in a shotgun attempt to produce less garbage. This is for the same reason you should not optimize for speed prematurely. Most of your effort plus much of the added complexity and obscurity of code will be meaningless.
As a general principle you'd want to cache as much as possible and do as little creating and destroying for each run of your loop.
The first thing that pops in my head is to reduce the use of anonymous functions (if you have any) inside your main loop. Also it'd be easy to fall into the trap of creating and destroying objects that are passed into other functions. I'm by no means a javascript expert, but I would imagine that this:
var options = {var1: value1, var2: value2, ChangingVariable: value3};
function loopfunc() {
    // do something
}
$.each(listofthings, loopfunc);
options.ChangingVariable = newvalue;
would run much faster than this:
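$.each(listofthings, function () {
    // do something on the list
});
// someOtherFunction is a placeholder name for whatever consumes the options
someOtherFunction({
    var1: value1,
    var2: value2,
    ChangingVariable: newvalue
});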
Is there ever any downtime for your program? Maybe you need it to run smoothly for a second or two (e.g. for an animation) and then it has more time to process? If this is the case I could see taking objects that would normally be garbage collected throughout the animation and keeping a reference to them in some global object. Then when the animation ends you can clear all the references and let the garbage collector do its work.
Sorry if this is all a bit trivial compared to what you've already tried and thought of.
I'd make one or few objects in the global scope (where I'm sure garbage collector is not allowed to touch them), then I'd try to refactor my solution to use those objects to get the job done, instead of using local variables.
Of course it couldn't be done everywhere in the code, but generally that's my way to avoid garbage collector.
P.S. It might make that specific part of code a little bit less maintainable.
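A sketch of the long-lived scratch object idea: the function writes its result into an object allocated once, instead of returning a fresh one on every call. The names scratch and minMax are hypothetical.

```javascript
// Module-level scratch object, allocated once and reused on every call.
const scratch = { min: 0, max: 0 };

// Writes its result into `out` instead of returning a fresh { min, max } object.
function minMax(values, out) {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < min) min = v;
    if (v > max) max = v;
  }
  out.min = min;
  out.max = max;
  return out;
}

const r1 = minMax([3, 9, 1], scratch);
const range1 = r1.max - r1.min; // read the values before the next call overwrites them
const r2 = minMax([5, 6], scratch);
```

This is where the maintainability cost shows up: r1 and r2 are the same object, so any result that is still needed has to be copied out before the next call.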

Why is caching values in objects taking more time?

As I have learnt, it's better to cache the values in objects which we need repeatedly. For example, doing
var currentObj = myobject.myinnerobj.innermostobj[i]
and using 'currentObj' for further operations is better for performance than writing out the full myobject.myinnerobj.innermostobj[i] expression everywhere, for example in loops. I am told it saves the script from looking up inside the objects every time.
I have around 1000 lines of code, the only change I did to it with the intention of improving performance is this (at many locations) and the total time taken to execute it increased from 190ms to 230ms. Both times were checked using firebug 1.7 on Firefox 4.
Is what I learnt true (meaning either I am overusing it or mis-implemented it)? Or are there any other aspects to it that I am unaware of..?
There is an initial cost for creating the variable, so you have to use the variable a few times (depending on the complexity of the lookup, and many other things) before you see any performance gain.
Also, how Javascript is executed has changed quite a bit in only a few years. Nowadays most browsers compile the code in some form, which changes what's performant and what's not. It's likely that the performance gain from caching references is less now than when the advice was written.
The example you have given appears to simply be Javascript, not jQuery. Since you are using direct object property references and array indices to existing Javascript objects, there is no lookup involved. So in your case, adding var currentObj... could potentially increase overhead by the small amount needed to instantiate currentObj. Though this is likely very minor, and not uncommon for convenience and readability in code, in a long loop, you could possibly see the difference when timing it.
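To see the two styles side by side, here is a small sketch using the property names from the question; the data and the function names sumDirect and sumCached are made up. Both functions compute the same result, and which one is faster (if either) has to be measured in the engine you care about.

```javascript
const myobject = {
  myinnerobj: { innermostobj: [{ a: 1, b: 2 }, { a: 3, b: 4 }] }
};

// Writes out the full property chain every time.
function sumDirect(obj) {
  let total = 0;
  for (let i = 0; i < obj.myinnerobj.innermostobj.length; i++) {
    total += obj.myinnerobj.innermostobj[i].a + obj.myinnerobj.innermostobj[i].b;
  }
  return total;
}

// Keeps the intermediate references in local variables.
function sumCached(obj) {
  const list = obj.myinnerobj.innermostobj;
  let total = 0;
  for (let i = 0; i < list.length; i++) {
    const currentObj = list[i];
    total += currentObj.a + currentObj.b;
  }
  return total;
}
```

The cached version is arguably easier to read inside a loop, which is often a better reason to use it than any expected speedup.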
The caching you are probably thinking of has to do with jQuery objects, e.g.
var currentObj = $('some_selector');
Running a jQuery selector involves a significant amount of processing because it must look through the entire DOM (or some subset of it) to resolve the selector. So doing this, versus running the selector each time you refer to something, can indeed save a lot of overhead. But that's not what you're doing in your example.
See this fiddle:
In firefox and chrome (didn't test IE) -- the time is identical in pretty much any scenario.
"Is what I learnt true (meaning either I am overusing it or mis-implemented it)? Or are there any other aspects to it that I am unaware of..?"
It's not obvious if either is the case because you didn't post a link to your code.
I think most of your confusion comes from the fact that JavaScript developers are mainly concerned with caching DOM objects. DOM object lookups are substantially more expensive than looking up something like myobj.something.something2. I'd hazard a guess that most of what you've been reading about the importance of caching are examples like this (since you mentioned jQuery):
var myButton = $('#my_button');
In such cases, caching the DOM references can pay dividends in speed on pages with a complex DOM. With your example, it'd probably just reduce the readability of the code by making you have to remember that currentObj is just an alias to another object. In a loop, that'd make sense, but elsewhere, it wouldn't be worth having another variable to remember.

javascript constructs to avoid?

I have been writing a JS algorithm. It's blazing fast in chrome and dog slow in FF. In the chrome profiler, I spend <10% in a method, in FF the same method is 30% of the execution time. Are there javascript constructs to avoid because they are really slow in one browser or another?
One thing I have noticed is that things like simple variable declaration can be expensive if you do it enough. I sped up my algorithm noticeably by not doing things like
var x = y.x;
and just doing
for example.
As you've found, different things are issues in different implementations. In my experience, barring doing really stupid things, there's not much point worrying about optimizing your JavaScript code to be fast until/unless you run into a specific performance problem when testing on your target browsers. Such simple things as the usual "count down to zero" optimization (for (i = length - 1; i >= 0; --i) instead of for (i = 0; i < length; ++i)) aren't even reliable across implementations. So I tend to stick to writing code that's fairly clear (because I want to be nice to whoever has to maintain it, which is frequently me), and then worry about optimization if and when.
That said, looking through the Google article that tszming linked to in his/her answer reminded me that there are some performance things that I tend to keep in mind when writing code originally. Here's a list (some from that article, some not):
When you're building up a long string out of lots of fragments, surprisingly you usually get better performance if you build up an array of the fragments and then use the Array#join method to create the final string. I do this a lot if I'm building a large HTML snippet that I'll be adding to a page.
The Crockford private instance variable pattern, though cool and powerful, is expensive. I tend to avoid it.
with is expensive and easily misunderstood. Avoid it.
Memory leaks are, of course, expensive eventually. It's fairly easy to create them on browsers when you're interacting with DOM elements. See the article for more detail, but basically, hook up event handlers using a good library like jQuery, Prototype, Closure, etc. (because that's a particularly prone area and the libraries help out), and avoid storing DOM element references on other DOM elements (directly or indirectly) via expando properties.
If you're building up a significant dynamic display of content in a browser, innerHTML is a LOT faster in most cases than using DOM methods (createElement and appendChild). This is because parsing HTML into their internal structures efficiently is what browsers do, and they do it really fast, using optimized, compiled code writing directly to their internal data structures. In contrast, if you're building a significant tree using the DOM methods, you're using an interpreted (usually) language talking to an abstraction that the browser then has to translate to match its internal structures. I did a few experiments a while back, and the difference was about an order of magnitude (in favor of innerHTML). And of course, if you're building up a big string to assign to innerHTML, see the tip above: best to build up fragments in an array and then use join.
Cache the results of known-slow operations, but don't overdo it, and only keep things as long as you need them. Keep in mind the cost of retaining a reference vs. the cost of looking it up again.
I've repeatedly heard people say that accessing vars from a containing scope (globals would be the ultimate example of this, of course, but you can do it with closures in other scopes) is slower than accessing local ones, and certainly that would make sense in a purely interpreted, non-optimized implementation because of the way the scope chain is defined. But I've never actually seen it proved to be a significant difference in practice. (Link to simple quick-and-dirty test) Actual globals are special because they're properties of the window object, which is a host object and so a bit different than the anonymous objects used for other levels of scope. But I expect you already avoid globals anyway.
Here's an example of #6. I actually saw this in a question related to Prototype a few weeks back:
for (i = 0; i < $$('.foo').length; ++i) {
    if ($$('.foo')[i].hasClass("bar")) { // I forget what this actually was
        $$('.foo')[i].setStyle({/* ... */});
    }
}
In Prototype, $$ does an expensive thing: It searches through the DOM tree looking for matching elements (in this case, elements with the class "foo"). The code above is searching the DOM three times on each loop: First to check whether the index is in bounds, then when checking whether the element has the class "bar", and then when setting the style.
That's just crazy, and it'll be crazy regardless of what browser it's running on. You clearly want to cache that lookup briefly:
list = $$('.foo');
for (i = 0; i < list.length; ++i) {
    if (list[i].hasClass("bar")) { // I forget what this actually was
        list[i].setStyle({/* ... */});
    }
}
...but taking it further (such as working backward to zero) is pointless, it may be faster on one browser and slower on another.
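For the first tip in the list above (collecting fragments in an array and joining them), a minimal sketch could look like this. buildList is a hypothetical helper, and it assumes the items are already safe to insert as HTML. Whether join actually beats plain concatenation varies by engine, so treat it as something to measure rather than a rule.

```javascript
// Builds an HTML list from fragments collected in an array.
// Assumes `items` contains strings that are already escaped or trusted.
function buildList(items) {
  const parts = ["<ul>"];
  for (let i = 0; i < items.length; i++) {
    parts.push("<li>", items[i], "</li>");
  }
  parts.push("</ul>");
  return parts.join("");
}

const html = buildList(["a", "b"]);
```

The resulting string can then be assigned to innerHTML in one step, as described in the innerHTML tip.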
Here you go:
I don't think this is really a performance thing, but something to avoid for sure unless you really know what's happening is:
var a = something.getArrayOfWhatever();
for (var element in a) {
    // aaaa! no!! please don't do this!!!
}
In other words, using the for ... in construct on arrays should be avoided. Even when iterating through object properties it's tricky.
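A short sketch of why for ... in is a poor fit for arrays: it walks enumerable property names (as strings), not elements, so any extra property on the array shows up too. The array contents and the extra property are made up for illustration.

```javascript
const letters = ["x", "y"];
letters.extra = "not an element"; // any enumerable property shows up in for...in

// for...in yields property names as strings, including "extra".
const forInKeys = [];
for (const key in letters) forInKeys.push(key);

// An index-based loop only visits the actual elements.
const indexed = [];
for (let i = 0; i < letters.length; i++) indexed.push(letters[i]);
```

Here forInKeys ends up holding the strings "0", "1" and "extra", while indexed holds just the two elements. Properties added to Array.prototype by a library would show up in the for ... in loop as well.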
Also, my favorite thing to avoid is to avoid omission of var when declaring local variables!

