Javascript memory leaks after unloading a web page

Javascript memory leaks after unloading a web page - javascript

I have been reading up to try to make sense of memory leaks in browsers, esp. IE. I understand that the leaks are caused by a mismatch in garbage collection algorithms between the Javascript engine and the DOM object tree, and will persist past. What I don't understand is why (according to some statements in the articles I'm reading) the memory is not reclaimed after the page is unloaded by the browser. Navigating away from a webpage should put all the DOM and javascript objects out of scope at that point, shouldn't it?

Here's the problem. IE has a separate garbage collector for the DOM and for javascript. They can't detect circular references between the two.
What we used to was to clean up all event handlers from all nodes at page unload. This could, however, halt the browser while unloading. This only addressed the case where the circular reference was caused by event handlers. It could also be caused by adding direct references from DOM nodes to js objects which had a reference to the DOM node itself.
Another good thing to remember is that if you are deleting nodes, it's a good idea to remove the handlers yourself first. Ext-js has a Ext.destroy method that does just that (if you've set the handlers using ext).
Example
// Leaky code to wrap HTML elements that allows you to find the custom js object by adding
//a reference as an "expando" property
function El(node) {
this.dom = node;
node.el = this;
}
Then Microsoft hacked IE so it removed all event handlers and expando properties when unloading internally, therefore it's much faster than doing it with js. This fix seemed to fix our memory problems, but not all problems as there are people still having the problem.
MS's description of the problem
MS releases patch that "fixes" memory leaks:
Blog about fixed memory leaks
IE still has some problems
At our company, we use ext-js. By always setting event handlers using ext-js, which has a an internal clean up routine, we have not experienced memory leaks. In reality, memory usage grows but stops at about 250Mb for a machine with 4Gb of RAM. We don't think that's too bad since we load about 2Mb(uncompressed) of js files and all the elements on the page are dynamic.
There's a lot to be said about this and we've researched this extensively where I work. Feel free to ask a more specific question. I may be able to help you.

The best thing I ever read about Javascript memory leaks was written by Doulgas Crockford.
To answer your question, yes, the browser absolutely should unload all the objects (and most importantly, event handlers) at the appropriate time. If it did, it wouldnt have leaks :)

You don't have to make sense of them -- they are bugs in broswers and being fixed from versions to versions.

Related

Why does this JavaScript cause a memory leak?

The following code is regarded as causing a memory leak because element maintains a reference to function bar and bar maintains a reference to element via closure (if I understand correctly).
Why does this cause a memory leak? Does it only cause a leak when element is a DOM node?
function foo(element, a, b) {
element.onclick = function bar() { /* uses a and b */ };
}

This is called a Javascript closure and is an expected feature of Javascript. This is not a memory leak in any modern browser.
If the DOM element represented by element is removed from the DOM, then the onclick handler will be garbage collected and then the closure itself will be garbage collected.
During the lifetime of element, the variables a and b will be part of the closure. This is an expected language feature as they are part of the closure that this code creates. When and if, element is deleted from the DOM and it is garbage collected, the closure and its references a and b will be eligible for garbage collection too.
There are some old browsers that did not always handle this properly, but that is generally not considered a design consideration any more for modern browsers. Further, it only caused an issue of consequence if you ran code like this over and over (removing DOM elements each time too) such that a large enough amount of memory was consumed by things that weren't being garbage collected when they should have been. This typically only happened in long running single page applications that had a lot of dynamic DOM stuff going on. But, as I said with modern browsers, this is no longer an issue as the browser's garbage collector handles this situation now.

This code only caused a memory leak in certain old versions of Internet Explorer. Internet Explorer 8 made some changes to memory management which mitigated the problem:
https://msdn.microsoft.com/en-us/library/dd361842(v=vs.85).aspx
As all of the affected versions of Internet Explorer are now thoroughly obsolete, this is no longer an issue you need to be concerned with.

jQuery .remove() vs Node.removeChild() and external maps to DOM nodes

The jQuery API documentation for jQuery .remove() mentions:
In addition to the elements themselves, all bound events and jQuery
data associated with the elements are removed.
I assume "bound events" here means "event handlers"; documentation for the similar .empty() says:
To avoid memory leaks, jQuery removes other constructs such as data
and event handlers from the child elements before removing the
elements themselves.
It does sound like leaks would ensue if one were to not use these functions and use Node.removeChild() (or ChildNode.remove()) instead.
Is this true for modern browsers?
If so, why exactly can't properties and event handlers be collected once the node is removed?
If not, do we still need to use .data()? Is it only good to retrieve HTML5 data- attributes?
Documentation for jQuery.data() (lower-level function) says:
The jQuery.data() method allows us to attach data of any type to DOM
elements in a way that is safe from circular references and therefore
free from memory leaks. jQuery ensures that the data is removed when
DOM elements are removed via jQuery methods, and when the user leaves
the page.
This sounds an awful lot like a solution to the old IE DOM/JS circular leak pattern which, AFAIK, is solved in all browsers today.
However, a comment in the jQuery src/data.js code (snapshot) says:
Provide a clear path for implementation upgrade to WeakMap in 2014
Which suggests that the idea of storing data strictly associated to a DOM node outside of the DOM using a separate data store with a map is still considered in the future.
Is this just for backward-compatibility, or is there more to it?
Answers provided to other questions like this one also seem to imply that the sole reason for an external map is to avoid cyclic refs between DOM objects and JS objects, which I consider irrelevant in the context of this question (unless I'm mistaken).
Furthermore, I've seen plugins that now set properties on relevant DOM nodes directly (e.g. selectize.js) and it doesn't seem to bother anyone. Is this an OK practice? It certainly looks that way, as it makes removing entire DOM trees very easy. No need to walk it down, no need to clean up any external data store, just detach it from the parent node, lose the reference, and let the garbage collector do its thing.
Further notes, context and rationale to the question:
This kind of capability is especially interesting for frameworks that manage views (e.g. Durandal), which often times have to replace entire trees that represent said views in their architecture. While most of them certainly support jQuery explicitly, this solution does not scale at all. Every component that uses a similar data store must also be cleaned up. In the case of Durandal, it seems they (at least in one occurrence, the dialog plugin - snapshot) rely on Knockout's .removeNode() (snapshot) utility function, which in turn uses jQuery's internal cleanData(). That's, IMHO, a prime example of horrible special-casing (I'm not sure it even works as it is now if jQuery is used in noConflict mode, which it is in most AMD setups).
This is why I'd love to know if I can safely ignore all of this or if we'll have to wait for Web Components in order to regain our long-lost sanity.

"It does sound like leaks would ensue if one were to not use these functions and use Node.removeChild() (or ChildNode.remove()) instead.
Is this true for modern browsers?
If so, why exactly can't properties and event handlers be collected once the node is removed?"
Absolutely. The data (including event handlers) associated with an element is held in a global object held at jQuery.cache, and is removed via a serial number jQuery puts on the element.
When it comes time for jQuery to remove an element, it grabs the serial number, looks up the entry in jQuery.cache, manually deletes the data, and then removes the element.
Destroy the element without jQuery, you destroy the serial number and the only association to the element's entry in the cache. The garbage collector has no knowledge of what the jQuery.cache object is for, and so it can't garbage collect entries for nodes that were removed. It just sees it as a strong reference to data that may be used in the future.
While this was a useful approach for old browsers like IE6 and IE7 that had serious problems with memory leaks, modern implements have excellent garbage collectors that reliably find things like circular references between JavaScript and the DOM. You can have some pretty nasty circular references via object properties and closures, and the GC will find them, so it's really not such a worry with those browsers.
However, since jQuery holds element data in the manner it does, we now have to be very careful when using jQuery to avoid jQuery-based leaks. This means never use native methods to remove elements. Always use jQuery methods so that jQuery can perform its mandatory data cleanup.
"Furthermore, I've seen plugins that now set properties on relevant DOM nodes directly (e.g. selectize.js) and it doesn't seem to bother anyone. Is this an OK practice?"
I think it is for the most part. If the data is just primitive data types, then there's no opportunity for any sort of circular references that could happen with functions and objects. And again, even if there are circular references, modern browsers handle this nicely. Old browsers (especially IE), not so much.
"This is why I'd love to know if I can safely ignore all of this or if we'll have to wait for Web Components in order to regain our long-lost sanity."
We can't ignore the need to use jQuery specific methods when destroying nodes. Your point about external frameworks is a good one. If they're not built specifically with jQuery in mind, there can be problems.
You mention jQuery's $.noConflict, which is another good point. This easily allows other frameworks/libraries to "safely" be loaded, which may overwrite the global $. This opens the door to leaks IMO.
AFAIK, $.noConflict also enables one to load multiple versions of jQuery. I don't know if there are separate caches, but I would assume so. If that's the case, I would imagine we'd have the same issues.
If jQuery is indeed going to use WeakMaps in the future as the comment you quoted suggests, that will be a good thing and a sensible move. It'll only help in browsers that support WeakMaps, but it's better than nothing.
"If not, do we still need to use .data()? Is it only good to retrieve HTML5 data- attributes?"
Just wanted to address the second question. Some people think .data() should always be used for HTML5 data- attributes. I don't because using .data() for that will import the data into jQuery.cache, so there's more memory to potentially leak.
I can see it perhaps in some narrow cases, but not for most data. Even with no leaks, there's no need to have most data- stored in two places. It increases memory usage with no benefit. Just use .attr() for most simple data stored as data- attributes.

In order to provide some of its features, jQuery has its own storage for some things. For example, if you do
$(elem).data("greeting", "hello");
Then, jQuery, will store the key "greeting" and the data "hello" on its own object (not on the DOM object). If you then use .removeChild(elem) to remove that element from the DOM and there are no other references to it, then that DOM element will be freed by the GC, but the data that you stored with .data() will not. This is a memory leak as the data is now orphaned forever (while you're on that web page).
If you use:
$(elem).remove();
or:
$(some parent selector).empty()
Then, jQuery will not only remove the DOM elements, but also clean up its extra shadow data that it keeps on items.
In addition to .data(), jQuery also keeps some info on event handlers that are installed which allows it to perform operations that the DOM by itself can't do such as $(elem).off(). That data also will leak if you don't dispose of an object using jQuery methods.
In a touch of irony, the reason jQuery doesn't store data as properties on the DOM elements themselves (and uses this parallel storage) is because there are circumstances where storing certain types of data on the DOM elements can itself lead to memory leaks.
As for the consequences of all this, most of the time it is a negligible issue because it's a few bytes of data that is recovered by the browser as soon as the user navigates to a new page.
The kinds of things that could make it become material are:
If you have a very dynamic web page that is constantly creating and removing DOM elements thousands of times and using jQuery features on those objects that store side data (jQuery event handlers, .data() on those elements, then any memory leak per operation could add up over time and become material.
If you have a very long running web page (e.g. a single page app) that stays on screen for very long periods of time and thus over time the memory leaks could accumulate.

Count all objects and variables in use

Is it possible to count created objects and variables in javascript?
I am using Google Chrome to analyse my web app. But to debug and find the objects that causes "Memory Leak" is not so easy (at least for me). So I want to know all objects and variables that are created on the current page so I can know if they are removed.

No, you can't do that in Chrome (or any other major browser). You can use Chrome's "memory" page (chrome://memory/) to get some idea what's going on, but it's not down to the object level, and it's important to understand that garbage collection does not happen synchronously or immediately. The browser / JavaScript engine may well allocate memory, use it for some JavaScript objects, and then later correctly understand that those objects aren't used anymore, but keep the memory handy for future use.
Instead, what you can do is study how JavaScript works in detail, which tells you what will (usually) be kept in memory, and why. Understand how closures work (disclosure: that's a post on my anemic little blog), and understand how IE doesn't handle circular references between DOM elements and JavaScript objects well (specifically, it doesn't clean them up well when nothing refers to either of them anymore, which is otherwise not normally a problem). And in general, don't worry too much about it until/unless you have a specific issue to address. (Which absolutely happens, but not as much as people sometimes think.)

Do i need to delete dom fragments or will the garbage collector remove them?

This maybe a bit of a silly question. I'm assuming that the garbage collector disposes of any dangling variables after a function ends execution, but I was wondering if this also applies to DOM fragments.
If I create a DOM fragment or any unattached node for that matter, will the garbage collector delete it after the function has finished execution?
//would this create a memory leak?
setInterval(function exampleScope() {
var unusedDiv = document.createElement('div');
}, 10);
I know this example is useless but its the simplest form of the pattern I'm worried about. I just want to know I'm doing the right thing. I've been hard at work building a very high performance JavaScript game engine, Red Locomotive. I don't want to add any memory leaks.

There is an IE 7 memory leak with DOM elements in event handlers: jQuery leaks solved, but why?
With other browser you should be fine. See Freeing memory used by unattached DOM nodes in Javascript.
If you worry much about memory leaks and care for your tech illiterate IE users, you should read this: Understanding and Solving Internet Explorer Leak Patterns

Well, it's not 100% conclusive but I made a quick JSFiddle to experiment with this. This is a tight loop that runs your code above 100000000 times. Using the Chrome Task Manager memory usage jumped from 47.9MB to 130MB and stayed fairly constant until completed, where it dropped down to 60MB ish.
http://jsfiddle.net/u7yPM/
This suggests that the DOM nodes are definitely being garbage collected, else the memory usage would continue increasing for the whole run of the test.
EDIT: I ran this on Chrome 14.

Can jQuery.data cause a memory leak?

Would the following piece of code create a memory leak.
According to the jQuery documentation use of the data function avoids memory leaks. It would be useful to confirm whether the following is safe.
var MyClass = function(el) {
// Store reference of element in object.
this.element = $(el);
};
// Store reference of object in element.
$('#something').data('obj', new MyClass('#something'));

Obviously the code as it stands would take up extra memory as long as the DOM element is still connected to the DOM. But I'm guessing you're asking whether it would continue using extra memory after the DOM element is no longer in use.
Update: Thanks to Joey's answer (which he has since deleted), I spent some time reading up on memory leaks in javascript, and it appears my assumptions in the paragraph below are incorrect. Because DOM elements don't use pure garbage collection, a circular reference like this would normally prevent both the DOM element and the javascript object from ever being released. However, I believe the remainder of this answer is still correct.
Without a deep knowledge of how javascript engines implement garbage collection, I can't speak authoritatively on the topic. However, my general understanding of garbage collection makes me think that your code would be "safe" in the sense that after the #something element is removed from the DOM, the resulting MyClass object would only have a reference to an object that has no other connections. The graph algorithms of the garbage collector should be able to identify that the DOM element and its MyClass object are "floating in space" and unconnected to everything else.
Furthermore, jQuery goes out of its way to strip data and events that are associated with a given DOM element once it is removed from the DOM. From the documentation:
jQuery ensures that the data is removed when DOM elements are removed via jQuery methods, and when the user leaves the page.
So assuming you use jQuery consistently, you would only have a one-way reference once the object is removed from the DOM anyway, which makes it that much easier possible for the garbage collector to know it can get rid of these objects.
So as long as you don't have something else referencing the MyClass object once the DOM element is removed, you shouldn't have a memory leak.

I suppose it depends on the Javascritp engine.
You have have the question precisely enought to perform a test. I added a long string in the object and ran the potential leak in a large loop.
As a result, I don't think in leaks in IE8 nor in Chrome.
But I could not reproduce these leakeage patterns either.

This can lead to a memory leak.
the theory of jQuery.data method may use A Data inner class to cache data for the dom element.
of course,when you remove the cache data,jQuery will unreference the data.
but the inner cache is a increasing array,when you you it ,it will go upon.
so ,in the end,there will be very big cache array,which will lead memeory leak.
In a long run web app,this may leak memory crash.

The data attribute only stores string values.

Develop Reference

JavaScript is the programming language of the Web.