objects in javascript - javascript

Primitive values are stored in a stack in javascript but objects are stored in a heap. I understand why to store primitives in stack but any reason why objects are stored in heaps?

Actually, in JavaScript even primitives are stored in the heap rather than on a stack (see note below the break below, though). When control enters a function, an execution context (an object) for that call to the function is created, which has a variable object. All vars and arguments to the function (plus a couple of other things) are properties of that anonymous variable object, exactly like other properties of named objects. A call stack is used, but the spec doesn't require the stack be used for "local" variable storage, and JavaScript's closures would make using a stack a'la C, C++, etc. for that impractical. Details in the spec.
Instead, a chain (linked list) is used. When you refer to an unqualified symbol, the interpreter checks the variable object for the current execution context to see if it has a property for that name. If so, it gets used; if not, the next variable object in the scope chain is checked (note that this is in the lexical order, not the call order like a call stack), and so on until the global execution context is reached (the global execution context has a variable object just like any other execution context does). The variable object for the global EC is the only one we can directly access in code: this points to it in global scope code (and in any function called without this being explicitly set). (On browsers, we have another way of accessing it directly: The global variable object has a property called window that it uses to point to itself.)
Re your question of why objects are stored in the heap: Because they can be created and released independently of one another. C, C++, and others that use a stack for local variables can do so because variables can (and should) be destroyed when the function returns. A stack is a nice efficient way to do that. But objects aren't created an destroyed in that straightforward a way; three objects created at the same time can have radically different lifecycles, so a stack doesn't make sense for them. And since JavaScript's locals are stored on objects and those objects have a lifecycle that's (potentially) unrelated to the function returning...well, you get the idea. :-) In JavaScript, the stack is pretty much just for return addresses.
However, it's worth noting that just because things are as described above conceptually, that doesn't mean that an engine has to do it that way under the hood. As long as it works externally as described in the spec, implementations (engines) are free to do what they like. I understand that V8 (Google's JavaScript engine, used in Chrome and elsewhere) does some very clever things, like for instance using the stack for local variables (and even local object allocations within the function) and then only copying those out into the heap if necessary (e.g., because the execution context or individual objects on it survive the call). You can see how in the majority of cases, this would minimize heap fragmentation and reclaim memory used for temporaries more aggressively and efficiently than relying on GC, because the execution context associated with most function calls doesn't need to survive the call. Let's look at an example:
function foo() {
var n;
n = someFunctionCall();
return n * 2;
}
function bar() {
var n;
n = someFunction();
setCallback(function() {
if (n === 2) {
doThis();
}
else {
doThat();
}
});
}
In the above, an engine like V8 that aggressively optimizes can detect that the conceptual execution context for a call to foo never needs to survive when foo returns. So V8 would be free to allocate that context on the stack, and use a stack-based mechanism for cleanup.
In contrast, the execution context created for a call to bar has to stick around after bar returns, because there's a closure (the anonymous function we passed into setCallback) relying on it. So when compiling bar (because V8 compiles to machine code on-the-fly), V8 may well use a different strategy, actually allocating the context object in the heap.
(If either of the above had used eval in any way, by the way, it's likely V8 and other engines don't even attempt any form of optimization, because eval introduces too many optimization failure modes. Yet another reason not to use eval if you don't have to, and you almost never have to.)
But these are implementation details. Conceptually, things are as described above the break.

The size of objects can grow dynamically. Therefore, you would need to adjust their memory requirements. That is why, they are stored in heap.

Both primitive values and objects are always stored in some other object - they are properties of some object.
There is not one primitive value / object that is not a property of another object. (The only exception here is the global object).

Related

How v8 handle stack allocated variable in closure?

I read a lot of articles that say "v8 uses stack for allocating primitive like numbers".
I also ready about the CG works only for the heap allocation.
But if I combine the stack allocated variables with the closure, who is in changed to free the stack allocated variable?
For example:
function foo() {
const b = 5;
return function bar(x) {
return x * b
}
}
// This invocation allocate in the stack the variable `b`
// in the head the code of `bar`
const bar = foo()
// here the `b` should be freed
// here `b` is used, so should not be free
bar()
how it works?
How can bar function point to b if b lives in the stack?
How is the [[Environment]] built here?
(V8 developer here.)
I don't know where this myth is coming from that "primitives are allocated on the stack". It's generally false: the regular case in JavaScript is that everything is allocated on the heap, primitive or not.
There may be implementation-specific special cases where some heap allocations can be optimized out and/or replaced by stack allocations, but that's the exception, not the rule; and it's never directly observable (i.e. never changes behavior, only performance), because that's the general rule for all internal optimizations.
To dive deeper, we need to distinguish two concepts: variables themselves, and the things they point at.
A variable can be thought of as a pointer. In other words, it's not in itself the "container" or "space" where an object is allocated; instead it's a reference that points at an object that's allocated elsewhere. All variables have the same size (1 pointer), the things they point at can vary wildly in size. One illustrating consequence is that the same variable can point at different things over time, for example you could have a loop over an array where element = array[i] points at each array element in turn.
In modern, high-performance JS engines, function-local variables are usually stored on the stack (regardless of what they point at!). That's because this is both fast and convenient. So while this is technically still an implementation-specific exception to the rule that everything is allocated on the heap, it's a fairly common exception.
As you rightly observe, storing variables on the stack doesn't work if they need to survive the function that created them. Therefore, JavaScript engines perform analysis passes to find out which variables are referenced from nested closures, and store these variables on the heap right away, in order to allow them to stay around as long as they are needed.
I wouldn't be surprised if an engine that prefers simplicity over performance chose to always store all variables on the heap, so it wouldn't have to distinguish several cases.
Regarding the value that the variable points at: that's always on the heap, regardless of its type or primitive-ness (with exceptions to the rule, see below).
var a = true --> true is on the heap.
var b = "hello" --> "hello" is on the heap.
var c = 42.2 --> 42.2 is on the heap.
var d = 123n --> 123n is on the heap.
var e = new Object(); --> the object is on the heap.
Again, there are engine-specific cases where heap allocations can be optimized out under the right circumstances. For example, V8 (inspired by some other VMs) has a well-known trick where it can store small integers ("Smis") directly in the pointer using a tag bit, so in this case the pointer doesn't actually point at a value, the pointer is the value so to speak. An alternative trick is called "NaN-boxing", it's used e.g. by Spidermonkey and has the effect that all JS Numbers can be stored directly in the pointer (or technically the other way round: everything's a Number in this approach, and pointers are stored as special numbers).
As another example, once a function gets hot enough for optimization, an optimizing compiler may be able to figure out that a given object isn't accessible outside the function and hence doesn't need to be allocated at all; if necessary some of the object's properties will be held in registers or on the stack for the part of the function where they are needed.
So, to summarize the above:
"All primitives are allocated on the stack" is incorrect. Most primitives are allocated on the heap.
Sometimes, an engine can avoid allocations (of both primitives and objects), which may or may not mean that the respective value is briefly held on the stack (it could also be eliminated entirely, or only ever held in registers). Such optimizations never change observable behavior; in cases where doing the optimization would affect behavior, the optimization can't be applied.
Variables, regardless of what they refer to, are stored on the heap or on the stack or not at all, depending on the requirements of the situation.

Where are values stored in JavaScript?

I've heard that in JavaScript, primitive types are stored on the stack and that objects are stored on the heap. Is that true for all cases even for values inside of a function's execution scope? Moreover, are all globally scoped variables and functions stored on the global object (window in the browser) in JavaScript and is that considered to be the "heap" or part of the heap? Or are the primitive types themselves stored on the stack and reference types on the heap and then the identifiers are added as properties to the global objects and point to those values on the stack/heap?
There is no heap and no stack. JavaScript's memory model is very abstract. How things end up in your computers memory is totally up to the engine. Given that modern engines perform a lot of optimizations, values might end up in different regions of the memory even in different optimization stages, so one really can't tell.

Does a Javascript closure retain the entire parent lexical environment or only the subset of values the closure references? [duplicate]

This question already has answers here:
About closure, LexicalEnvironment and GC
(3 answers)
Closed 3 years ago.
Consider the following example:
function makeFunction() {
let x = 3;
let s = "giant string, 100 MB in size";
return () => { console.log(x); };
}
// Are both x and s held in memory here
// or only x, because only x was referred to by the closure returned
// from makeFunction?
let made = makeFunction();
// Suppose there are no further usages of makeFunction after this point
// Let's assume there's a thorough GC run here
// Is s from makeFunction still around here, even though made doesn't use it?
made();
So if I close around just one variable from a parent lexical environment, is that variable kept around or is every sibling variable in its lexical environment also kept around?
Also, what if makeFunction was itself nested inside another outer function, would that outer lexical environment be retained even though neither makeFunction nor makeFunction's return value referred to anything in that outer lexical environment?
I'm asking for performance reasons - do closures keep a bunch of stuff around or only what they directly refer to? This impacts memory usage and also resource usage (e.g. open connections, handles, etc.).
This would be mostly in a NodeJS context, but could also apply in the browser.
V8 developer here. This is a bit complicated ;-)
The short answer is: closures only keep around what they need.
So in your example, after makeFunction has run, the string referred to by s will be eligible for garbage collection. Due to how garbage collection works, it's impossible to predict when exactly it'll be freed; "at the next garbage collection cycle". Whether makeFunction runs again doesn't matter; if it does run again, a new string will be allocated (assuming it was dynamically computed; if it's a literal in the source then it's cached). Whether made has already run or will run again doesn't matter either; what matters is that you have a variable referring to it so you could run it (again). Engines generally can't predict which functions will or won't be executed in the future.
The longer answer is that there are some footnotes. For one thing, as comments already pointed out, if your closure uses eval, then everything has to be kept around, because whatever source snippet is eval'ed could refer to any variable. (What one comment mentioned about global variables that could be referring to eval is not true though; there is a semantic difference for "global eval", a.k.a. "indirect eval": it cannot see local variables. Which is usually considered an advantage for both performance and debuggability -- but even better is to not use eval at all.)
The other footnote is that somewhat unfortunately, the tracking is not as fine-grained as it could be: each closure will keep around what any closure needs. We have tried fixing this, but as it turns out finer-grained tracking causes more memory consumption (for metadata) and CPU consumption (for doing the work) and is therefore usually not worth it for real code (although it can have massive impact on artificial tests stressing precisely this scenario). To give an example:
function makeFunction() {
let x = 3;
let s = "giant string, 100 MB in size";
let short_lived = function() { console.log(s.length); }
// short_lived(); // Call this or don't, doesn't matter.
return function long_lived() { console.log(x); };
}
let long_lived = makeFunction();
With this modified example, even though long_lived only uses x, short_lived does use s (even if it's never called!), and there is only one bucket for "local variables from makeFunction that are needed by some closure", so that bucket keeps both x and s alive. But as I said earlier: real code rarely runs into this issue, so this is usually not something you have to worry about.
Side note:
and also resource usage (e.g. open connections, handles, etc.)
As a very general statement (i.e., in any language or runtime environment, regardless of closures or whatnot), it's usually advisable not to rely on garbage collection for resource management. I recommend to free your resources manually and explicitly as soon as it is appropriate to free them.

Do variables declared inside callback functions remain in memory or are they destroyed upon end of callback execution?

I have this code snippet; It basically connects to mongo database and goes through documents. On each iteration it searches for messages array of objects. It creates variable of messages, and then loops through it. Does declared variable (var msg) inside callback function remain in memory or is it destroyed upon end of callback function execution? Would there be any difference if var msg was actually declared as let msg? Is there a way to discard whole scope from the memory?
MongoClient.connect(mongoUrl, (err, db) => {
assert.equal(null,err);
var collection_data = db.collection('threadContents').find();
collection_data.on('data', (doc) => {
var msg = doc.messages;
for (var variable in msg) {
console.log(msg);
}//forin(msg)
});//collection_data.on
});//mongo.connect
In theory - yes.
In practice - it's not that simple...
In your example, var msg does not "leak" outside of it's original scope, so it will most likely get destroyed when callback finishes it's job.
One thing to note - the destruction of this object doesn't have to happen immediately - JS Engines are mostly Garbage Collected, so this piece of memory can stay on the stack for some time, but it won't be reachable anymore.
If you would declare this variable in outer scope, it could possibly stay in the memory if that scope stayed in memory (so, other pieces of code could access that variable). You can read about this behavior in Closures section at MDN.
Another thing to note is usage of console.log. In general, non-primitive values (like Objects or Arrays, which actually are "special" objects), will be accessible by reference, not by value. Therefore, if your var msg is a non-primitive, it will most likely persist in the memory until you clear your console. Primitive values would be copied, so strictly speaking they would still persist in the memory, but probably in another place in the memory (although JIT engines could probably try to optimize that and don't copy the memory if it's not necessary).
Node.js (and all javascript engines) rely on tracing garbage collection, which means that allocated memory is deallocated in a non-deterministic way, i.e. you cannot predict exactly when it will happen. This usually does not affect you, but uif you really want some chunk of memory or some other resource to be deallocated in a predictable way, you have to use some ad-hoc technique.
In your specific example 'msg' will be deleted, but don't expect it to be deleted straight away.

Why do you need to pass in arguments to a self executing function in javascript if the variable is global?

I was looking at the code for the underscore.js library (jQuery does the same thing) and just wanted some clarification on why the window object is getting passed into the self executing function.
For example:
(function() { //Line 6
var root = this; //Line 12
//Bunch of code
}).call(this); //Very Bottom
Since this is global, why is it being passed into the function? Wouldn't the following work as well? What issues would arrise doing it this way?
(function() {
var root = this;
//Bunch of code
}).call();
I suspect the reason is ECMAScript 5 strict mode.
In non-strict mode, this IIFE
(function() {
console.log(this); // window or global
})();
logs the window object (or the global object, if you're on a node.js server), because the global object is supplied as this to functions that don't run as a method of an object.
Compare that result to
"use strict";
(function() {
console.log(this); // undefined
})();
In strict mode, the this of a bare-invoked function is undefined. Therefore, the Underscore authors use call to supply an explicit this to the anonymous function so that it is standardized across strict and non-strict mode. Simply invoking the function (as I did in my examples) or using .call() leads to an inconsistency that can be solved with .call(this).
jQuery does the same thing, and you can find several answers to your question on SO by searching along those lines.
There are two reasons to have an always-available global variable like window in local scope, both of which are for the sake of fairly minor optimizations:
Loading time. Looking up a local variable takes a tiny bit less time than looking up a global variable, because the interpreter doesn't have to look as far up the context tree. Over the length of a function the size of Underscore or jQuery, that can theoretically add up to a less-trivial amount of time on slower machines.
Filesize. Javascript minification relies on the fact that variables can be named anything as long as they're consistent. myLongVariableName becomes a in the minified version. The catch, of course, is that this can only apply to variables defined inside the script; anything it's pulling from outside has to keep the same name to work, because if you shrink window down to pr the interpreter won't know what the hell you're talking about. By shadowing it with a reference to itself, Minify can do things like:
(function(A){ B(A, {...}); A.doStuff(); //etc });})(window)
and every time you would otherwise have put window (6 characters) you now have A (1 character). Again, not major, but in a large file that needs to explicitly call window on a regular basis for scope management it can be important, and you've already got an advantage by a few characters if you use it twice. And in large, popular libraries that get served by limited servers and to people who might have lousy Internet plans, every byte counts.
Edit after reading comments on question:
If you look at what function.call() actually does, your second snippet wouldn't actually work - the this being passed in isn't an argument, it's an explicit calling context for the function, and you have to provide it to .call(). The line at the beginning of the file serves the two purposes above. The reason for setting the context explicitly with call is futureproofing - right now, anonymous functions get the global context, but theoretically that could change, be discouraged in a future ECMA specification, or behave oddly on a niche browser. For something that half the Internet uses, it's best to be more explicit and avoid the issue entirely. The this/window choice at the top level I think must have something to do with worrying about non-PC devices (i.e. mobile) potentially using a different name for their top-level object.

Categories

Resources