I want to understand what WeakRef and finalizers are in ES2021, with a really simple example, and where to use them.
I know that WeakRef is a class that lets developers create weak references to objects, and that FinalizationRegistry (the "finalizer") lets you register callback functions that are invoked when an object is garbage collected. For example:
const myWeakRef = new WeakRef({
name: 'Cache',
size: 'unlimited'
})
// Log the value of "myWeakRef":
console.log(myWeakRef.deref())
As always, MDN's docs help.
A WeakRef object contains a weak reference to an object, which is called its target or referent. A weak reference to an object is a reference that does not prevent the object from being reclaimed by the garbage collector. In contrast, a normal (or strong) reference keeps an object in memory. When an object no longer has any strong references to it, the JavaScript engine's garbage collector may destroy the object and reclaim its memory. If that happens, you can't get the object from a weak reference anymore.
In almost every other part of JS, if some object (A) holds a reference to another object (B), B will not be garbage-collected until A can be fully garbage-collected as well. For example:
// top level
const theA = {};
(() => {
// private scope
const theB = { foo: 'foo' };
theA.obj = theB;
})();
In this situation, theB will never be garbage collected (unless theA.obj gets reassigned) because theA on the top level contains a property that holds a reference to theB; it's a strong reference, which prevents garbage collection.
A WeakRef, on the other hand, provides a wrapper with access to an object while not preventing garbage collection of that object. Calling deref() on the WeakRef will return you the object if it hasn't been garbage collected yet. If it has been GC'd, .deref() will return undefined.
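As a minimal sketch of the deref() behavior described above: the helper name readName and its fallback string are made up for illustration. Because target is still strongly referenced here, deref() has to return it; the fallback branch only matters once the target has really been collected.

```javascript
// A strong reference keeps the target alive, so deref() must return it.
const target = { name: 'Cache' };
const ref = new WeakRef(target);

// Hypothetical helper: read a property through a WeakRef, with a fallback
// for the case where the target has already been collected.
function readName(weakRef, fallback = '(collected)') {
  const obj = weakRef.deref(); // the object, or undefined after GC
  return obj === undefined ? fallback : obj.name;
}

console.log(readName(ref)); // "Cache" while `target` is still strongly held
```

The important habit is to always check the result of deref() before using it.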
FinalizationRegistry deals with a similar issue:
A FinalizationRegistry object lets you request a callback when an object is garbage-collected.
You first define the registry with the callback you want to run, and then you call .register on the registry with the object you want to observe. This lets you find out that something has been garbage collected; the callback runs at some point after collection, not at a precise moment. For example, the following will log Just got GCd! some time after obj gets reclaimed:
console.log('script starting...');
const r = new FinalizationRegistry(() => {
console.log('Just got GCd!');
});
(() => {
// private closure
const obj = {};
r.register(obj);
})();
You can also pass a value when calling .register that gets passed to the callback when the object gets collected.
const r = new FinalizationRegistry((val) => {
console.log(val);
});
r.register(obj, 'the object named "obj"')
will log the object named "obj" when it gets GC'd.
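Here is a small sketch of passing a held value, plus the optional third argument to register (an unregister token) so a registration can be cancelled later. The names track, untrack, cleanupLog and session are invented for this example. The cleanup callback itself may run late or never, so only the unregister part is deterministic.

```javascript
const cleanupLog = [];
const registry = new FinalizationRegistry((heldValue) => {
  // Runs some time after the target is collected, if it runs at all.
  cleanupLog.push(heldValue);
});

// Hypothetical helpers wrapping register/unregister.
function track(obj, label) {
  // The third argument is the unregister token; the object itself is used here.
  registry.register(obj, label, obj);
}

function untrack(obj) {
  return registry.unregister(obj); // true if a registration was removed
}

const session = { id: 1 };
track(session, 'session #1');
console.log(untrack(session)); // true
console.log(untrack(session)); // false
```

Note that the held value must not be the target itself, otherwise the registry would keep the target alive.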
All this said, there is rarely a need for these tools. As MDN says:
Correct use of FinalizationRegistry takes careful thought, and it's best avoided if possible. It's also important to avoid relying on any specific behaviors not guaranteed by the specification. When, how, and whether garbage collection occurs is down to the implementation of any given JavaScript engine. Any behavior you observe in one engine may be different in another engine, in another version of the same engine, or even in a slightly different situation with the same version of the same engine. Garbage collection is a hard problem that JavaScript engine implementers are constantly refining and improving their solutions to.
Best to let the engine itself deal with garbage collection automatically whenever possible, unless you have a really good reason to care about it yourself.
The main use of weak references is to implement caches or mappings to large objects. In many scenarios, we don't want to keep a lot of memory for a long time saving this rarely used cache or mappings. We can allow the memory to be garbage collected soon and later if we need it again, we can generate a fresh cache. If the variable is no longer reachable, the JavaScript garbage collector automatically removes it.
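A rough sketch of that cache idea, assuming the cached values are objects (WeakRef targets cannot be primitives). The names getOrCreate and build are hypothetical: the cache hands back the existing object while it is still alive and rebuilds it otherwise.

```javascript
// Map from key to WeakRef; the Map does not keep the values alive.
const cache = new Map();

function getOrCreate(key, create) {
  const cached = cache.get(key)?.deref();
  if (cached !== undefined) return cached; // still alive: reuse it
  const fresh = create(key);               // collected or never built: rebuild
  cache.set(key, new WeakRef(fresh));
  return fresh;
}

// Usage: while `first` is strongly held, the second lookup reuses it.
let builds = 0;
const build = (key) => {
  builds++;
  return { key, rows: new Array(1000).fill(key) };
};
const first = getOrCreate('report', build);
const second = getOrCreate('report', build);
console.log(first === second, builds); // true 1
```

Dead WeakRef entries stay in the Map until the key is requested again, so a long-lived cache would probably also want a FinalizationRegistry to prune them.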
const callback = () => {
const aBigObj = {
name: "Hello world"
};
console.log(aBigObj);
};
(async function(){
await new Promise((resolve) => {
setTimeout(() => {
callback();
resolve();
}, 2000);
});
})();
When executing the above code, it prints the object containing "Hello world" after 2 seconds. Depending on how the callback() function is used, aBigObj may stay in memory for as long as something still references it.
Let us hold aBigObj through a weak reference instead. The WeakRef is created once, outside callback(), so that every call looks at the same target.
const aBigObj = new WeakRef({ name: "Hello world" });
const callback = () => {
  // deref() returns undefined once the target has been collected
  console.log(aBigObj.deref()?.name);
};
callback(); // Guaranteed to print "Hello world"
(async function () {
  await new Promise((resolve) => {
    setTimeout(() => {
      callback(); // Very likely still prints "Hello world"
      resolve();
    }, 2000);
  });
  await new Promise((resolve) => {
    setTimeout(() => {
      callback(); // No guarantee that "Hello world" is printed
      resolve();
    }, 5000);
  });
})();
The synchronous call to callback() will surely print the value of name: the target of a WeakRef is guaranteed to stay alive until the end of the turn of the event loop in which the WeakRef was created.
But there is no guarantee that the setTimeout() callbacks print "Hello world"; they may print undefined instead. The object might have been swept by the garbage collector in the meantime. Since garbage collection works differently in different browsers, we cannot guarantee the output. That is also why we use WeakRef in situations like managing a cache, where a missing value can simply be rebuilt.
This is a little bit tricky to explain, but I'll give it a try:
In a node.js server application I would like to deal with data objects that can be used in more than one place at once. The main problem is, that these objects are only referred to by an object id and are loaded from the database.
However, as soon as an object is already loaded into one scope, it should not be loaded a second time when requested, but instead the same object should be returned.
This leads me to the question of garbage collection: As soon as an object is no longer needed in any scope, it should be released completely to prevent having the whole database in the server's memory all the time. But here starts the problem:
There are two ways I can think of to create such a scenario: Either use a global object reference (which prevents any object from being collected) or, really duplicate these objects but synchronize them in a way that each time a property in one scope gets changed, inform the other instances about that change.
Again, therefore each instance would have to register an event handler, which in turn is pointing back to that instance thus preventing it from being collected again.
Did anyone come up with a solution for such a scenario I just didn't realize? Or is there any misconception in my understanding of the garbage collector?
What I want to avoid is manual reference counting for every object in memory. Every time an object is removed from any collection, I would have to adapt the reference count manually (there is not even a destructor or "reference decreased" event in JS).
Using the weak module, I implemented a WeakMapObj that works like we originally wanted WeakMap to work. It allows you to use a primitive for the key and an object for the data and the data is retained with a weak reference. And, it automatically removes items from the map when their data is GCed. It turned out to be fairly simple.
const weak = require('weak');
class WeakMapObj {
constructor(iterable) {
this._map = new Map();
if (iterable) {
for (let array of iterable) {
this.set(array[0], array[1]);
}
}
}
set(key, obj) {
if (typeof obj === "object" && obj !== null) {
let ref = weak(obj, this.delete.bind(this, key));
this._map.set(key, ref);
} else {
// not an object, can just use regular method
this._map.set(key, obj);
}
}
// get the actual object reference, not just the proxy
get(key) {
let obj = this._map.get(key);
if (obj) {
return weak.get(obj);
} else {
return obj;
}
}
has(key) {
return this._map.has(key);
}
clear() {
return this._map.clear();
}
delete(key) {
return this._map.delete(key);
}
}
I was able to test it in a test app and confirm that it works as expected when the garbage collector runs. FYI, just making one or two objects eligible for garbage collection did not cause the garbage collector to run in my test app. I had to forcefully call the garbage collector to see the effect. I assume that would not be an issue in a real app. The GC will run when it needs to (which may only run when there's a reasonable amount of work to do).
You can use this more generic implementation as the core of your object cache where an item will stay in the WeakMapObj only until it is no longer referenced elsewhere.
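For anyone who wants to repeat the forced-GC test mentioned above: in Node.js the collector can be triggered manually when the process is started with the --expose-gc flag. This is a small sketch; the helper name tryForceGc is made up, and it simply reports whether gc() was available.

```javascript
// Returns true if a collection was requested, false if gc() is not exposed.
function tryForceGc() {
  if (typeof globalThis.gc === 'function') {
    globalThis.gc(); // only defined when Node.js runs with --expose-gc
    return true;
  }
  return false;
}

console.log(tryForceGc() ? 'gc() was called' : 'run with: node --expose-gc script.js');
```

Even after a forced collection, cleanup callbacks are delivered asynchronously, so a test has to wait a tick before checking the map.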
Here's an implementation that keeps the map entirely private so it cannot be accessed from outside of the WeakMapObj methods.
const weak = require('weak');
function WeakMapObj(iterable) {
// private instance data
const map = new Map();
this.set = function(key, obj) {
if (typeof obj === "object" && obj !== null) {
// replace obj with a weak reference
obj = weak(obj, this.delete.bind(this, key));
}
map.set(key, obj);
}
// add methods that have access to "private" map
this.get = function(key) {
let obj = map.get(key);
if (obj) {
obj = weak.get(obj);
}
return obj;
}
this.has = function(key) {
return map.has(key);
}
this.clear = function() {
return map.clear();
}
this.delete = function(key) {
return map.delete(key);
}
// constructor implementation
if (iterable) {
for (let array of iterable) {
this.set(array[0], array[1]);
}
}
}
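On current runtimes the same idea can probably be built without the weak module, using the built-in WeakRef and FinalizationRegistry. The following is only a sketch under that assumption, not a drop-in replacement: the class name WeakValueMap is invented, values must be objects, and keys are assumed to be primitives such as IDs (the key is used as the held value, which must not be the target itself).

```javascript
class WeakValueMap {
  #map = new Map();
  #registry = new FinalizationRegistry((key) => {
    // Only drop the entry if it still points at a dead target; the key may
    // have been set again with a new live object in the meantime.
    const ref = this.#map.get(key);
    if (ref && ref.deref() === undefined) this.#map.delete(key);
  });

  set(key, obj) {
    this.#map.set(key, new WeakRef(obj)); // throws TypeError for primitives
    this.#registry.register(obj, key);
    return this;
  }

  get(key) {
    return this.#map.get(key)?.deref(); // undefined if missing or collected
  }

  has(key) {
    return this.get(key) !== undefined;
  }

  delete(key) {
    return this.#map.delete(key);
  }
}

const users = new WeakValueMap();
const alice = { id: 42 };
users.set('alice', alice);
console.log(users.get('alice') === alice); // true while `alice` is strongly held
```

Unlike the weak-module version, has() here reports false for an entry whose object has been collected but not yet pruned.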
Sounds like a job for a Map object used as a cache storing the object as the value (along with a count) and the ID as the key. When you want an object, you first look up its ID in the Map. If it's found there, you use the returned object (which will be shared by all). If it's not found there, you fetch it from the database and insert it into the Map (for others to find).
Then, to make it so that the Map doesn't grow forever, the code that fetches something from the Map would also need to release the object when it is done with it. When the refCnt goes to zero upon a release, you would remove the object from the Map.
This can be made entirely transparent to the caller by creating some sort of cache object that contains the Map and has methods for getting an object or releasing an object and it would be entirely responsible for maintaining the refCnt on each object in the Map.
Note: you will likely have to write the code that fetches it from the DB and inserts it into the Map carefully in order to not create a race condition, because fetching from the database is likely asynchronous and you could get multiple callers all not finding it in the Map and all in the process of getting it from the database. How to avoid that race condition depends upon the exact database you have and how you're using it. One possibility is for the first caller to insert a placeholder in the Map so subsequent callers will know to wait for some promise to resolve before the object is inserted in the Map and available to them to use.
Here's a general idea for how such an ObjCache could work. You call cache.get(id) when you want to retrieve an item. This always returns a promise that resolves to the object (or rejects if there's an error getting it from the DB). If the object is in the cache already, the promise it returns will be already resolved. If the object is not in the cache yet, the promise will resolve when it has been fetched from the DB. This works even when multiple parts of your code request an object that is "in the process" of being fetched from the DB. They all get the same promise that is resolved with the same object when the object has been retrieved from the DB. Every call to cache.get(id) increases the refCnt for that object in the cache.
You then call cache.release(id) when a given piece of code is done with an object. That will decrement the internal refCnt and remove the object from the cache if the refCnt hits zero.
class ObjCache {
constructor() {
this.cache = new Map();
}
get(id) {
let cacheItem = this.cache.get(id);
if (cacheItem) {
++cacheItem.refCnt;
if (cacheItem.obj) {
// already have the object
return Promise.resolve(cacheItem.obj);
}
else {
// object is pending, return the promise
return cacheItem.promise;
}
} else {
// not in the cache yet
cacheItem = {refCnt: 1, promise: null, obj: null}; // reuse the outer variable instead of shadowing it
let p = myDB.get(id).then(function(obj) {
// replace placeholder promise with actual object
cacheItem.obj = obj;
cacheItem.promise = null;
return obj;
});
// set placeholder as promise for others to find
cacheItem.promise = p;
this.cache.set(id, cacheItem);
return p;
}
}
release(id) {
let cacheItem = this.cache.get(id);
if (cacheItem) {
if (--cacheItem.refCnt === 0) {
this.cache.delete(id);
}
}
}
}
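Since every cache.get(id) has to be paired with a cache.release(id), one way to make that hard to forget is a small wrapper. This is just a sketch: withCached and the stubCache used to demonstrate it are hypothetical, and the wrapper only assumes an object with the get/release methods described above.

```javascript
// Runs `use(obj)` with a cached object and always releases it afterwards.
async function withCached(cache, id, use) {
  const pending = cache.get(id); // the reference count is incremented here
  try {
    const obj = await pending;
    return await use(obj);
  } finally {
    cache.release(id); // balanced even if the load or `use` failed
  }
}

// Stand-in cache for demonstration; it only counts outstanding references.
const stubCache = {
  refs: 0,
  get(id) {
    this.refs++;
    return Promise.resolve({ id });
  },
  release(id) {
    this.refs--;
  },
};

withCached(stubCache, 7, (obj) => obj.id).then((id) => {
  console.log(id, stubCache.refs); // 7 0
});
```

The release sits in a finally block because get() counts the reference even when the database load later rejects.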
Ok, for anyone who faces similar problems, I found a solution. jfriend00 pushed me towards this solution by mentioning WeakMaps which were not exactly the solution themselves, but pointed my focus on weak references.
There is an npm module simply called weak that will do the trick. It holds a weak reference to an object and safely returns an empty object once the object was garbage collected (thus, there is a way to identify a collected object).
So I created a class called WeakCache using a DataObject:
const weak = require('weak');
class DataObject{
constructor( objectID ){
this.objectID = objectID;
this.dataLoaded = new Promise(function(resolve, reject){
loadTheDataFromTheDatabase(function(data, error){ // some pseudo db call
if (error)
{
reject(error);
return;
}
resolve(data);
});
});
}
loadData(){
return this.dataLoaded;
}
}
class WeakCache{
constructor(){
this.cache = {};
}
getDataObjectAsync( objectID, onObjectReceived ){
if (this.cache[objectID] === undefined || this.cache[objectID].loadData === undefined){ // object was not cached yet or dereferenced, recreate it
this.cache[objectID] = weak(new DataObject( objectID ), function(){
// Remove the reference from the cache once the object has been collected
delete this.cache[this.objectID];
}.bind({cache:this.cache, objectID:objectID}));
}
this.cache[objectID].loadData().then(onObjectReceived);
}
}
This class is still in progress, but at least this is one way it could work. The only downside to this (but this is true for all database-based data, pun alert!, therefore not such a big deal) is that all data access has to be asynchronous.
What will happen here, is that the cache at some point may hold an empty reference to every possible object id.
I just stumbled on the IndexedDB example on MDN which contains the following:
function openDb() {
var req = indexedDB.open(DB_NAME, DB_VERSION);
req.onsuccess = function (evt) {
// Better use "this" than "req" to get the result
// to avoid problems with garbage collection.
// db = req.result;
db = this.result;
};
// Rest of code omitted for brevity
}
What is the problem with the garbage collector that should better be avoided?
This advice looks weird: the object that the req variable refers to (the same one this would refer to), as well as the anonymous function objects (which are held by the onsuccess, onerror and onupgradeneeded properties), would all become garbage collectible at the same time, as soon as the request has completed and the callbacks have been invoked.
Technically - req represents another reference to the object; practically it cannot cause any "problems with garbage collection".
To summarize: it's neither an "optimisation" nor "micro optimisation", both would perform equally.
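To make the point concrete without a browser, here is a sketch with a stand-in request object. The function openFake and its succeed() method are invented for this example; a real IDBOpenDBRequest is assumed to behave the same way in the respect that matters, namely that the handler runs with the request as this.

```javascript
// A stand-in for a request object: it calls its handler with itself as `this`.
function openFake() {
  const req = {
    result: 'fake-db',
    onsuccess: null,
    succeed() {
      this.onsuccess({ target: this });
    },
  };
  return req;
}

function openDb() {
  const seen = {};
  const req = openFake();
  req.onsuccess = function (evt) {
    // `this`, `req` and `evt.target` all name the same object.
    seen.sameObject = this === req && evt.target === req;
    seen.db = this.result;
  };
  req.succeed();
  return seen;
}

console.log(openDb()); // { sameObject: true, db: 'fake-db' }
```

Since both names point at one object, reading the result through either of them gives the collector nothing different to work with.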
As far as I can tell if you reference 'req', 'openDb' invocation scope is tied to 'onsuccess' (as parent scope) so you create a closure. If - on the other hand - you reference only 'this', the 'openDb' invocation scope can be discarded as soon as you exit the function.
What might be causing confusion is that the object referred to by 'req' lives beyond the lifetime of 'openDb'; it is not used exclusively from within that function.
If I can't avoid using the "new" keyword in my Node.js app, how can I efficiently mark the object for garbage collection? I create a new object with a fairly high level constructor (by that, I mean new myObj() actually creates several new Something() objects deeper down in the process).
Let's say my constructor looks a little like the following:
function User(name, age) {
this.something = x;
this.greet = function() { return "hello"; };
this.somethingElse = new OtherThing();
return this;
}
Then I create a new User within another function:
function createUserAndGreet(name, age) {
var user1 = new User(name, age);
user1.greet();
// now I don't need user1 anymore
}
(btw, I know this code sucks but I think it gets my point across)
After I have got him to greet, how can I then get rid of him and free up the memory he occupied?
So if I just set user1 to null with user1 = null; after I'm finished with him, will he be gone forever and GC will come and clean him up? Will that also delete the new OtherThing() that was created? Or is there something else I need to do? I hear that using new is notorious for memory leaks so I just want to know if there's a way around this.
The GC takes care of everything that is not referenced. Your code is "GC-Compatible" as is, because after the call to createUserAndGreet(), all references are destroyed.
The only things you have to care about regarding GC are values attached directly to the global object (window in a browser, global in Node.js); these are not collected for as long as the global keeps referencing them. Your variables are scoped inside createUserAndGreet(), and this scope is destroyed when the function call ends.
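A short sketch of the difference, with invented names (greetOnce, greetAndRetain, retained): the first function leaves nothing reachable behind, while the second keeps every user reachable through a module-level array, which is the kind of thing that actually causes leaks.

```javascript
const retained = []; // module-level, reachable for the life of the program

function greetOnce(name) {
  const user = { name, greet() { return `hello ${this.name}`; } };
  return user.greet(); // `user` is unreachable once this returns
}

function greetAndRetain(name) {
  const user = { name, greet() { return `hello ${this.name}`; } };
  retained.push(user); // still reachable through `retained`, so not collectible
  return user.greet();
}

console.log(greetOnce('Ann'));       // hello Ann
console.log(greetAndRetain('Bob'));  // hello Bob
console.log(retained.length);        // 1
```

Setting a local variable to null before the function returns changes nothing here; what matters is whether anything long-lived still points at the object.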
I've been trying to track down any issues with garbage collection within my application code. I've stripped it down to pure knockout code and there seems to be an issue collecting created objects depending on how a computed property is created.
Please see the following JS fiddle: http://jsfiddle.net/SGXJG/
Open Chrome profiler.
Take a heap snapshot
Click Make All
Take another heap snapshot
Compare the snapshots
Test2 & Test3 still remain in memory while Test1 is correctly collected.
Please see the following code for the viewmodel:
function ViewModel() {
this.test1 = null;
this.test2 = null;
this.test3 = null;
}
ViewModel.prototype = {
makeAll: function () {
this.make1();
this.make2();
this.make3();
},
make1: function () {
this.test1 = new Test1();
this.test1.kill();
delete this.test1;
},
make2: function () {
this.test2 = new Test2();
this.test2.kill();
delete this.test2;
},
make3: function () {
this.test3 = new Test3();
this.test3.kill();
delete this.test3;
},
};
ko.applyBindings(new ViewModel());
And here are the three tests classes:
function Test1() {
var one = this.one = ko.observable();
var two = this.two = ko.observable();
this.three = ko.computed(function () {
return one() && two();
});
}
Test1.prototype = {
kill: function () {
this.three.dispose();
}
};
function Test2() {
this.one = ko.observable();
this.two = ko.observable();
this.three = ko.computed(function () {
return this.one() && this.two();
}, this);
}
Test2.prototype = {
kill: function () {
this.three.dispose();
}
};
function Test3() {
var self = this;
self.one = ko.observable();
self.two = ko.observable();
self.three = ko.computed(function () {
return self.one() && self.two();
});
self.kill = function () {
self.three.dispose();
};
}
The difference is that the Test1 'three' computed does not use this or self to reference the 'one' and 'two' observable properties. Can someone explain what's happening here? I guess there's something in the way the closures contain the object reference, but I don't understand why.
Hopefully I've not missed anything. Let me know if I have and many thanks for any responses.
Chrome, like all modern browsers, uses a mark-and-sweep algorithm for garbage collection. From MDN:
This algorithm assumes the knowledge of a set of objects called roots (In JavaScript, the root is the global object). Periodically, the garbage-collector will start from these roots, find all objects that are referenced from these roots, then all objects referenced from these, etc. Starting from the roots, the garbage collector will thus find all reachable objects and collect all non-reachable objects.
The garbage collector doesn't run right away, which is why you still see the objects in Chrome's snapshot even though you've dereferenced them (edit: As mentioned here, running the heap snapshot first runs the garbage collector. Possibly it's not processing everything and so doesn't clear the objects; see below.)
One thing that seems to generally trigger the garbage collector is to create new objects. We can verify this using your example. After going through the steps in your question, click on "Make1" and take another heap snapshot. You should see that Test2 is gone. Now do it again, and you'll see that Test3 is also gone.
Some more notes:
Chrome doesn't clean up everything during garbage collection. "[Chrome]...processes only part of the object heap in most garbage collection cycles." (google.com) Thus we see that it takes a couple runs of the garbage collector to clear everything.
As long as all external references to an object are cleared, the object will be cleaned up by the garbage collector eventually. In your example, you can even remove the dispose calls, and all the objects will be cleared.
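The reachability rule can be illustrated with a toy model. This is not how V8 is implemented, just a sketch of mark-and-sweep over a hand-written graph; findReachable, findGarbage and the node names are made up. It also shows that two objects referencing each other are still collected once nothing reachable points at them.

```javascript
// Mark phase: collect every node reachable from the roots.
function findReachable(graph, roots) {
  const marked = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const node = stack.pop();
    if (marked.has(node)) continue;
    marked.add(node);
    for (const next of graph[node] ?? []) stack.push(next);
  }
  return marked;
}

// Sweep phase: everything that was not marked is garbage.
function findGarbage(graph, roots) {
  const marked = findReachable(graph, roots);
  return Object.keys(graph).filter((node) => !marked.has(node));
}

const graph = {
  global: ['viewModel'],
  viewModel: [],       // test2 was deleted from the view model
  test2: ['three'],    // test2 -> its computed
  three: ['test2'],    // computed -> test2 (a cycle)
};

console.log(findGarbage(graph, ['global'])); // [ 'test2', 'three' ]
```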
I think this is a classical loop reference problem.
Let's call:
var test2 = new Test2();
now test2.three holds a reference of test2! Because you literally asked knockout to bind a function(){...} with that "this" object, the test2 object.
Since test2 naturally holds a reference of test2.three, you now got a loop reference between the two objects!
You can see this is same for your Test3.
But for Test1, let's call:
var test1 = new Test1();
test1.three holds references to two objects (test1.one and test1.two), and test1 holds three references (test1.one, test1.two and test1.three), so there is no loop reference.
In some other languages, like Java and Objective-C, the language supports weak references to deal with this kind of issue. When this answer was written, weak references were not yet available in JavaScript; WeakRef was only added in ES2021.
+1 thanks for your question! It gave my brain some spin, helped me to understand Javascript more :)
I think the problem is that you use && in your code; it will return a boolean, probably "true".
this.three = ko.computed(function () {
// !
return this.one() && this.two();
},
So this.three == true and not self.one + self.two if this was the intention?
and when you dispose
this.three.dispose();
You just get rid of a boolean.
Is there a reason why you have an extra "this" in "function Test2()"?
I'm quite new to Javascript and I was just reading following article.
you can define an ajax connection once, and reuse it multiple times, and start and stop it later on. Here's an example:
var myAjaxRequest = A.io.request('test.html', {
method: 'POST',
data: {
key1: 'value1'
}
});
Now later on, if I want to make that same ajax call again, all I have to do is call:
myAjaxRequest.start();
What if I had a very frequently used auction page and I wanted to use the myAjaxRequest connection for all actions a user performs from their browser? What are the rules for the lifetime of the myAjaxRequest instance? I suppose it is destroyed on page refresh, but is there anything else that destroys it? Let's say that the object is created within a YUI sandbox, though that shouldn't matter.
It's a shame this was answered in comments because nobody gets closure (sorry, terrible pun). @Šime Vidas and @WaiLam deserve the credit but I will at least attempt to craft an answer:
While you have a reference to the object (through the variable myAjaxRequest) it will remain in memory until the document is unloaded. If you assign null to your variable (myAjaxRequest = null), and there are no other references to the object, then the garbage collector will reclaim the memory used to store it.
A reference can exist even if myAjaxRequest is a local variable within a function. The function can return a reference to the local variable, for example as an object property, e.g.:
function sandbox () {
var myAjaxRequest = A.io.request(/* constructor... */);
return {
myRequest: myAjaxRequest
};
}
var mySandbox = sandbox();
mySandbox.myRequest.start();
or it can return a reference through a closure (excellent explanation here), e.g:
function sandbox () {
var myAjaxRequest = A.io.request(/* constructor... */);
return {
getRequest: function () {
return myAjaxRequest;
}
};
}
var mySandbox = sandbox();
mySandbox.getRequest().start();
As long as you have a reference to your object it will not be garbage collected. You can safely call the start method until the page is unloaded.