I have researched on the net about the benefits of immutablejs over Object.freeze() but didn't find anything satisfying!
My question is: why should I use this library and work with non-native data structures when I can freeze a plain old JavaScript object?
I don't think you understood what immutablejs offers. It's not a library which just turns your objects immutable, it's a library around working with immutable values.
Without simply repeating their docs and mission statement, I'll state two things it provides:
Types. They implemented (immutable) infinite ranges, stacks, ordered sets, lists, ...
All of their types are implemented as Persistent Data Structures.
I lied, here's a quote of their mission statement:
Immutable data cannot be changed once created, leading to much simpler application development, no defensive copying, and enabling advanced memoization and change detection techniques with simple logic. Persistent data presents a mutative API which does not update the data in-place, but instead always yields new updated data.
I urge you to read the articles and videos they link to and more about Persistent Data Structures (since they're the thing immutablejs is about), but I'll summarise in a sentence or so:
Let's imagine you're writing a game and you have a player which sits on a 2d plane. Here, for instance, is Bob:
var player = {
  name: 'Bob',
  favouriteColor: 'moldy mustard',
  x: 4,
  y: 10
};
Since you drank the FP koolaid you want to freeze the player (brrr! hope Bob got a sweater):
var player = Object.freeze({
  name: 'Bob',
  ...
});
And now enter your game loop. On every tick the player's position is changed. We can't just update the player object since it's frozen, so we copy it over:
function movePlayer(player, newX, newY) {
  return Object.freeze(Object.assign({}, player, { x: newX, y: newY }));
}
That's fine and dandy, but notice how much useless copying we're doing: on every tick we create a new object, iterate over one of our objects, and then assign some new values on top of it. On every tick, on every one of your objects. That's quite a mouthful.
Immutable wraps this up for you:
var player = Immutable.Map({
  name: 'Bob',
  ...
});
function movePlayer(player, newX, newY) {
  return player.set('x', newX).set('y', newY);
}
And through the ノ*✧゚ magic ✧゚*ヽ of persistent data structures they promise to do the least amount of operations possible.
There is also the difference of mindsets. When working with "a plain old [frozen] javascript object", the default assumption everywhere is mutability, and you have to go the extra mile to achieve meaningful immutability (that is, immutability which acknowledges that state exists). That's part of the reason freeze exists: when you try to do otherwise, things panic. With Immutablejs, immutability is of course the default assumption, and it has a nice API on top of it.
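To make the "things panic" part concrete, here's a minimal sketch with a plain frozen object (no library involved):
'use strict';

var frozen = Object.freeze({ x: 4 });

try {
  frozen.x = 5;                          // throws a TypeError in strict mode; silently ignored otherwise
} catch (e) {
  console.log(e instanceof TypeError);   // true
}

console.log(frozen.x);                   // 4, the write never happened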
That's not to say all's pink and rosy with a cherry on top. Of course, everything has its downsides, and you shouldn't cram Immutable everywhere just because you can. Sometimes just freezing an object is Good Enough. Heck, most of the time that's more than enough. It's a useful library which has its niche, just don't get carried away with the hype.
According to my benchmarks, immutable.js is optimized for write operations, faster than Object.assign(); however, it is slower for read operations. So the decision depends on the type of your application and its read/write ratio. The following is a summary of the benchmark results:
-- Mutable
Total elapsed = 103 ms = 50 ms (read) + 53 ms (write).
-- Immutable (Object.assign)
Total elapsed = 2199 ms = 50 ms (read) + 2149 ms (write).
-- Immutable (immutable.js)
Total elapsed = 1690 ms = 638 ms (read) + 1052 ms (write).
-- Immutable (seamless-immutable)
Total elapsed = 91333 ms = 31 ms (read) + 91302 ms (write).
-- Immutable (immutable-assign (created by me))
Total elapsed = 2223 ms = 50 ms (read) + 2173 ms (write).
Ideally, you should profile your application before introducing any performance optimization; however, immutability is one of those design decisions that must be made early. When you start using immutable.js, you need to use it throughout your entire application to get the performance benefits, because interop with plain JS objects using fromJS() and toJS() is very costly.
PS: Just found out that a deep-frozen array (1000 elements) becomes very slow to update, about 50 times slower, so you should use deep freeze in development mode only. Benchmark results:
-- Immutable (Object.assign) + deep freeze
Total elapsed = 45903 ms = 96 ms (read) + 45807 ms (write).
Neither of them makes the object deeply immutable on its own.
However, using Object.freeze you'll have to create the new instances of the object / array by yourself, and they won't have structural sharing. So every change will require deeply copying everything, and the old collection will be garbage collected.
immutablejs, on the other hand, will manage the collections, and when something changes, the new instance will reuse the parts of the old instance that haven't changed, so there is less copying and garbage collecting.
There are a couple of major differences between Object.freeze() and immutable.js.
Let's address the performance cost first. Object.freeze() is shallow. It will make the object immutable, but the nested properties and methods inside said object can still be mutated. The Object.freeze() documentation addresses this and even goes on to provide a "deepFreeze" function, which is even more costly in terms of performance. Immutable.js, on the other hand, will make the object as a whole (nested properties, methods, etc.) immutable at a lower cost.
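For reference, a deep freeze along those lines looks roughly like this (a sketch in the spirit of the docs' example, not an exact copy of it):
function deepFreeze(obj) {
  Object.getOwnPropertyNames(obj).forEach(function (name) {
    var value = obj[name];
    if (value && typeof value === 'object') {
      deepFreeze(value);              // freeze nested objects/arrays first
    }
  });
  return Object.freeze(obj);
}

var config = deepFreeze({ server: { port: 8080 } });
config.server.port = 9090;            // fails silently (throws in strict mode)
console.log(config.server.port);      // 8080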
Additionally, should you ever need to clone an immutable variable, Object.freeze() will force you to create an entirely new variable, while Immutable.js can reuse the existing immutable variable to create the clone more efficiently. Here's an interesting quote about this from this article:
"Immutable methods like .set() can be more efficient than cloning
because they let the new object reference data in the old object: only
the changed properties differ. This way you can save memory and
performance versus constantly deep-cloning everything."
In a nutshell, Immutable.js makes logical connections between the old and new immutable variables, thus improving the performance of cloning and the space frozen variables take in memory. Object.freeze() sadly does not - every time you clone a new variable from a frozen object you basically write all the data anew, and there is no logical connection between the two immutable variables even if (for some odd reason) they hold identical data.
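You can see that connection directly (a small sketch; the map's contents are made up):
var Immutable = require('immutable');

var a = Immutable.Map({
  profile: Immutable.Map({ name: 'Bob' }),
  score: 1
});
var b = a.set('score', 2);

console.log(a.get('profile') === b.get('profile'));  // true, the unchanged branch is shared
console.log(a.get('score'), b.get('score'));         // 1 2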
So in terms of performance, especially if you constantly make use of immutable variables in your program, Immutable.js is a great choice. However, performance is not everything and there are some big caveats to using Immutable.js. Immutable.js uses its own data structures, which makes debugging, or even just logging data to the console, a royal pain. It also might lead to a loss of basic JavaScript functionality (for example, you cannot use ES6 destructuring with it). The Immutable.js documentation is infamously hard to understand (because it was originally written for use only within Facebook itself), requiring a lot of web-searching even when simple issues arise.
I hope this covers the most important aspects of both approaches and helps you decide which will work best for you.
Object.freeze does not do any deep freezing natively, I believe that immutable.js does.
The same with any library -- why use underscore, jquery, etc etc.
People like re-using the wheels that other people built :-)
The biggest reason that comes to mind, outside of having a functional API that helps with immutable updates, is the structural sharing utilized by Immutable.js. If you have an application that needs enforced immutability (i.e., you're using Redux), then if you're only using Object.freeze you're going to be making a copy for every 'mutation'. This isn't really efficient over time, since it will lead to GC thrashing. With Immutable.js, you get structural sharing baked in (as opposed to having to implement an object pool / a structural sharing model of your own), since the data structures returned from Immutable are tries. This means that all mutations are still referenced within the data structure, so GC thrashing is kept to a minimum. More about this is on Immutable.js's doc site (and there is a great video going into more depth by the creator, Lee Byron):
https://facebook.github.io/immutable-js/
Related
I don't know why I can't find an answer to this, but, with state managers requiring immutability, doesn't this come at a significant performance slowdown for large states? I am building an app for fun that wouldn't have arrays large enough to cause a problem (n would max out around a thousand or so, maybe, in extreme cases, a couple thousand).
But let's say that the app had to keep a gigantic store of information in memory for some reason. Every time you have to edit one object, that would mean you have to completely reconstruct the array.
I'm just thinking that maybe in the future I would need states that store such an amount of data, although I can't necessarily think of a concrete example where you would require a state manager to handle it for you. I'm just curious whether there is any case where this weird hypothetical of having a huge state actually happens.
On a side note, the only reference I could find to immutability affecting speed was about it increasing speed, but that was in the context of reference equality being used for comparison. But wouldn't mutating data have nothing to do with whether you can compare the reference of the object or not?
You're misunderstanding how immutable updates work.
A correct immutable update doesn't require deep-cloning all objects. It's more like a nested shallow clone. Only the set of objects that actually have to be updated get copied, not all of them.
If I want to update state.items[5].completed, I need to make a copy of the item object at index 5, the items array, and state. All the other objects in the items array, and all the other sections of state, remain untouched.
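Concretely, that nested shallow clone can look like this (a sketch; the exact state shape and the toggleItem helper are made up):
function toggleItem(state, index) {
  return {
    ...state,                                          // new top-level state object
    items: state.items.map(function (item, i) {        // new items array
      if (i !== index) return item;                    // untouched items are reused by reference
      return { ...item, completed: !item.completed };  // copy only the item that changes
    })
  };
}

// e.g. nextState = toggleItem(state, 5) to update state.items[5].completed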
There certainly is some amount of cost to copying those objects, but for the most part it's not meaningful unless they're exceptionally large.
So far, most of the "starter boilerplates" and some posts about react / redux I've seen encourage the usage of immutable.js to address mutability. I personally rely on Object.assign or spread operators to handle this, hence I don't really see the advantage of immutable.js, as it adds additional learning and shifts a bit away from the vanilla js techniques used for handling mutability. I was trying to find valid reasons for a switch, but wasn't able to, hence I am asking here to see why it is so popular.
This is all about efficiency.
Persistent Data Structures
A persistent data structure keeps previous versions of itself when it is mutated by always yielding a new data structure. To avoid expensive cloning, only the difference from the previous data structure is stored, whereas the intersection is shared between them. This strategy is called structural sharing. Hence persistent data structures are much more efficient than cloning with Object.assign or the spread operator.
Drawbacks of persistent data structures in Javascript
Unfortunately, Javascript doesn't support persistent data structures natively. That is the reason immutable.js exists and why its objects differ greatly from plain old Javascript objects. This leads to more verbose code and a lot of conversions between persistent data structures and native Javascript data structures.
The crucial question
When do the benefits of immutable.js's structural sharing (efficiency) exceed its disadvantages (verbosity, conversions)?
I guess the library pays off only in large projects with numerous and extensive objects and collections, when cloning of whole data structures and garbage collection gets more expensive.
I have created performance benchmarks for multiple immutable libraries; the script and results are located inside immutable-assign (GitHub project). They show that immutable.js is optimized for write operations, faster than Object.assign(); however, it is slower for read operations. The following is a summary of the benchmark results:
-- Mutable
Total elapsed = 50 ms (read) + 53 ms (write) = 103 ms.
-- Immutable (Object.assign)
Total elapsed = 50 ms (read) + 2149 ms (write) = 2199 ms.
-- Immutable (immutable.js)
Total elapsed = 638 ms (read) + 1052 ms (write) = 1690 ms.
-- Immutable (seamless-immutable)
Total elapsed = 31 ms (read) + 91302 ms (write) = 91333 ms.
-- Immutable (immutable-assign (created by me))
Total elapsed = 50 ms (read) + 2173 ms (write) = 2223 ms.
Therefore, whether to use immutable.js or not will depend on the type of your application, and its read to write ratio. If you have lots of write operations, then immutable.js will be a good option.
Premature optimization is the root of all evil
Ideally, you should profile your application before introducing any performance optimization; however, immutability is one of those design decisions that must be made early. When you start using immutable.js, you need to use it throughout your entire application to get the performance benefits, because interop with plain JS objects using fromJS() and toJS() is very costly.
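To illustrate that interop cost (a small sketch with made-up data):
var Immutable = require('immutable');

var raw = { user: { name: 'Bob' }, items: [1, 2, 3] };

var state = Immutable.fromJS(raw);                  // deep conversion, walks the whole tree
var next = state.setIn(['user', 'name'], 'Alice');  // cheap persistent update
var back = next.toJS();                             // another full deep conversion

// Doing fromJS()/toJS() per action or per render negates the structural-sharing
// win, so keep your data as Immutable collections end to end where you can.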
I think the main advantage of ImmutableJs is in its data structures and speed. Sure, it also enforces immutability, but you should be doing that anyways, so that's just an added benefit.
For example, say you have a very large object in your reducer and you want to change a very small part of that object. Because of immutability, you can't change the object directly, but you must create a copy of the object. You do that by copying everything (in ES6 using the spread operator).
So what's the problem? Copying very large objects is very slow. Data structures in ImmutableJS do something called structural sharing where you really only change the data you want. The other data that you aren't changing is shared between the objects, so it doesn't get copied.
The result is highly efficient data structures with fast writes.
ImmutableJS also offers easy comparisons for deep objects. For example
const a = Immutable.Map({ a: Immutable.Map({ a: 'a', b: 'b'}), b: 'b'});
const b = Immutable.Map({ a: Immutable.Map({ a: 'a', b: 'b'}), b: 'b'});
console.log(a.equals(b)) // true
Without this, you'd need some sort of deep comparison function, which would also take a lot of time, whereas here the root nodes contain a hash of the entire data structure (don't quote me on this, it's how I remember it), so comparisons are always O(1), i.e. instant, regardless of object size.
This can be especially useful in the React shouldComponentUpdate method, where you can just compare the props using this equals function, which runs instantaneously.
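A minimal sketch of that (React class component; the data prop and its contents are made up and assumed to be an Immutable.List of strings):
import React from 'react';

class TodoList extends React.Component {
  shouldComponentUpdate(nextProps) {
    // .equals() is a cheap deep comparison thanks to the hashed, shared structure
    return !this.props.data.equals(nextProps.data);
  }

  render() {
    return React.createElement(
      'ul',
      null,
      this.props.data.toArray().map(function (text, i) {
        return React.createElement('li', { key: i }, text);
      })
    );
  }
}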
Of course, there are also downsides: if you mix immutablejs structures and regular objects, it can be hard to tell what's what. Also, your codebase gets littered with the ImmutableJS syntax, which is different from regular Javascript.
Another downside is that if you aren't going to use deeply nested objects, it is going to be a bit slower than plain old js, since the data structures do have some overhead.
Just my 2 cents.
The advantage of immutable.js is that it enforces an immutable redux store (you might forget to copy, or do a sloppy job of protecting your store from modifications) and simplifies the reducers a lot compared to Object.assign or spread operators (both are shallow, by the way!).
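For a feel of what "simplifies the reducers" means, here's a rough sketch of the same nested update both ways (the state shape and action field are made up, and the Immutable version assumes state is an Immutable.Map):
// Plain objects: every level has to be spread by hand, and it's easy to miss one.
function plainReducer(state, action) {
  return {
    ...state,
    user: {
      ...state.user,
      address: { ...state.user.address, city: action.city }
    }
  };
}

// Immutable.js: one call, and forgetting to copy simply isn't possible.
function immutableReducer(state, action) {
  return state.setIn(['user', 'address', 'city'], action.city);
}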
That being said I've used redux+immutable in a big Angular 1.x project and there are downsides: performance issues, it's not clear what is immutable and what is not, Angular cannot use immutable.js structures in ng-repeat, etc. Not sure about React though, would love to hear opinions.
immutable.js gives you a lot of utilities that always return a new object.
example:
var Immutable = require('immutable');
var map1 = Immutable.Map({a:1, b:2, c:3});
var map2 = map1.set('b', 50);
map1.get('b'); // 2
map2.get('b'); // 50
For just react / redux I personally think that Object.assign is powerful enough, but in some cases the use of that library could save you a few lines of code.
It is not only about performance gains via persistent data structures. Immutability is a highly desirable quality in itself, since it completely eliminates any bugs caused by accidental mutations. Plus, it's nice to have sane data structures with a convenient API in Javascript, something actually usable, and immutable.js is way ahead of vanilla objects and arrays, even with the recent additions of ES6 and ES7. It's like jQuery - it's so popular because its API is really great and easy to use compared to the vanilla DOM API (which is just a nightmare). Sure, you can try to stay immutable with vanilla objects, but then you have to remember to Object.assign and array-spread EVERY FREAKING TIME, and with immutable.js you simply can't go wrong: any mutations here are explicit.
Coming from Java, Javascript object reminds me of HashMap in Java.
Javascript:
var myObject = {
  firstName: "Foo",
  lastName: "Bar",
  email: "foo@bar.com"
};
Java:
HashMap<String, String> myHashMap = new HashMap<String, String>();
myHashMap.put("firstName", "Foo");
myHashMap.put("lastName", "Bar");
myHashMap.put("email", "foo@bar.com");
In a Java HashMap, the hashCode() of the key determines the bucket location (entry list) used for storage and retrieval. The majority of the time, basic operations such as put() and get() take constant time, until a hash collision occurs; then these basic operations become O(n), because the collided entries are stored in a linked list.
My question is:
How does Javascript store objects?
What is the performance of these operations?
Will there ever be collisions or other scenarios which degrade the performance like in Java?
Thanks!
Javascript looks like it stores things in a map, but that's typically not the case. You can access most properties of an object as if they were an index in a map, and assign new properties at runtime, but the backing code is much faster and more complicated than just using a map.
There's nothing requiring VMs not to use a map, but most try to detect the structure of the object and create an efficient in-memory representation for that structure. This can lead to a lot of optimizations (and deopts) while the program is running, and is a very complicated situation.
This blog post, linked in the question comments by @Zirak, has quite a good discussion of the common structures and when VMs may switch from a struct to a map. It can often seem unpredictable, but it is largely based on a set of heuristics within the VM and how many different objects it believes it has seen. That is largely related to the properties (and their types) of return values, and tends to be centered around each function (especially constructor functions).
There are a few questions and articles that dig into the details (but are hopefully still understandable without a ton of background):
slow function call in V8 when using the same key for the functions in different objects
Why is getting a member faster than calling hasOwnProperty?
http://mrale.ph/blog/2013/08/14/hidden-classes-vs-jsperf.html (and the rest of this blog)
The performance varies greatly, based on the above. Worst case should be a map access, best case is a direct memory access (perhaps even a deref).
There are a large number of scenarios that can have performance impacts, especially given how the JITter and VM will create and destroy hidden classes at runtime, as they see new variations on an object. Suddenly encountering a new variant of an object that was presumed to be monomorphic before can cause the VM to switch back to a less-optimal representation and stop treating the object as an in-memory struct, but the logic around that is pretty complicated and well-covered in this blog post.
You can help by making sure objects created from the same constructor tend to have very similar structures, and making things as predictable as possible (good for you, maintenance, and the VM). Having known properties for each object, set types for those properties, and creating objects from constructors when you can should let you hit most of the available optimizations and have some awfully quick code.
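As a small sketch of what "similar structures from the same constructor" means in practice (the Point constructor is made up for illustration):
// Keep object shapes predictable so the VM's hidden classes stay monomorphic.
function Point(x, y) {
  // always initialise the same properties, in the same order, with the same types
  this.x = x;
  this.y = y;
}

var a = new Point(1, 2);
var b = new Point(3, 4);   // same hidden class as `a`, so property access stays fast

// Adding a property after construction gives `c` a different shape and can push
// call sites that see both shapes onto the slower polymorphic/deopt paths:
var c = new Point(5, 6);
c.label = 'special';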
Before my re-entry in JavaScript (and related) I've done lots of ActionScript 3 and there they had a Dictionary object that had weak keys just like the upcoming WeakMap; but the AS3 version still was enumerable like a regular generic object while the WeakMap specifically has no .keys() or .values().
The AS3 version allowed us to rig some really interesting and useful constructs, but I feel the JS version is somewhat limited. Why is that?
If the Flash VM could do it, then what is keeping browsers from doing the same? I read how it would be 'non-deterministic', but that is sort of the point, right?
Finally found the real answer: http://tc39wiki.calculist.org/es6/weak-map/
A key property of Weak Maps is the inability to enumerate their keys. This is necessary to prevent attackers observing the internal behavior of other systems in the environment which share weakly-mapped objects. Should the number or names of items in the collection be discoverable from the API, even if the values aren't, WeakMap instances might create a side channel where one was previously not available.
It's a tradeoff. If you introduce object <-> object dictionaries that support enumerability, you have two options with relation to garbage collection:
Consider the key entry a strong reference that prevents garbage collection of the object that's being used as a key.
Make it a weak reference that allows its keys to be garbage collected whenever every other reference is gone.
If you do #1, you make it extremely easy to shoot yourself in the foot by leaking large objects into memory all over the place. On the other hand, if you go with option #2, your key dictionary becomes dependent on the state of garbage collection in the application, which will inevitably lead to impossible-to-track-down bugs.
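For what it's worth, option #2 without enumeration is essentially what WeakMap gives you today; a small sketch:
const metadata = new WeakMap();

let node = { id: 1 };                    // imagine a DOM node or any object you don't own
metadata.set(node, { clicks: 0 });

console.log(metadata.has(node));         // true
console.log(metadata.get(node).clicks);  // 0

// There is no metadata.keys() / .values() / .forEach(): you can only look up
// keys you already hold. Once `node` becomes unreachable, its entry can be
// garbage collected without the API ever revealing that it happened.
node = null;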
I have significant garbage collection pauses. I'd like to pinpoint the objects most responsible for this collection before I try to fix the problem. I've looked at the heap snapshot on Chrome, but (correct me if I am wrong) I cannot seem to find any indicator of what is being collected, only what is taking up the most memory. Is there a way to answer this empirically, or am I limited to educated guesses?
In Chrome, take two heap snapshots in the Profiles panel: one before doing the action you want to check and one after.
Now click on the second snapshot.
On the bottom bar you will see a select box with the option "Summary". Change it to "Comparison".
Then, in the select box next to it, select the snapshot you want to compare against (it should automatically select snapshot 1).
As the result you will get a table with the data you need, i.e. "New" and "Deleted" objects.
With newer Chrome releases there is a new tool available that is handy for this kind of task:
The "Record Heap Allocations" profiling type. The regular "Heap SnapShot" comparison tool (as explained in Rafał Łużyński answers) cannot give you that kind of information because each time you take a heap snapshot, a GC run is performed, so GCed objects are never part of the snapshots.
However with the "Record Heap Allocations" tool constantly all allocations are being recorded (that's why it may slow down your application a lot when it is recording). If you are experiencing frequent GC runs, this tool can help you identify places in your code where lots of memory is allocated.
In conjunction with the Heap SnapShot comparison you will see that most of the time a lot more memory is allocated between two snapshots, than you can see from the comparison. In extreme cases the comparison will yield no difference at all, whereas the allocation tool will show you lots and lots of allocated memory (which obviously had to be garbage collected in the meantime).
Unfortunately, the current version of the tool does not show you where the allocation took place, but it will show you what has been allocated and how it was retained at the time of the allocation. From the data (and possibly the constructors) you will, however, be able to identify your objects and thus the place where they are being allocated.
If you're trying to choose between a few likely culprits, you could modify the object definitions to attach themselves to the global scope (as a list under document or something).
Then this will stop them from being collected, which may make the program faster (they're not being reclaimed) or slower (because they build up and get checked by the mark-and-sweep every time). So if you see a change in performance, you may have found the problem.
One alternative is to look at how many objects of each type are being created (set up a counter in the constructor). If they're getting collected a lot, they're also being created just as frequently.
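Something along these lines (a rough sketch; the Particle constructor and the logging interval are made up for illustration):
var particleCount = 0;

function Particle(x, y) {
  particleCount++;   // bump on every allocation
  this.x = x;
  this.y = y;
}

// Log periodically; if the count climbs quickly while memory stays flat,
// these objects are being created (and collected) just as quickly.
setInterval(function () {
  console.log('particles created so far:', particleCount);
}, 5000);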
Take a look at https://developers.google.com/chrome-developer-tools/docs/heap-profiling
especially Containment View
The Containment view is essentially a "bird's eye view" of your application's objects structure. It allows you to peek inside function closures, to observe VM internal objects that together make up your JavaScript objects, and to understand how much memory your application uses at a very low level.
The view provides several entry points:
DOMWindow objects — these are objects considered as "global" objects for JavaScript code;
GC roots — actual GC roots used by VM's garbage collector;
Native objects — browser objects that are "pushed" inside the JavaScript virtual machine to allow automation, e.g. DOM nodes, CSS rules (see the next section for more details).