I'm implementing Immutable.js in a set of React components with props containing objects, where Immutable has not been used before.
While I was rewriting all { someProp.someValue } to { someProp.get("someValue") } (since Immutable don't have the props directly accessable), I am just wondering if it could have been possible to have the object properties directly accessable while the object is still immutable.
I guess the reason is that if they where directly accessable, they would be mutable because that's how javascript objects work. However, could it not be possible to freeze the objects (in browsers supporting it of course), and still have the mutating methods (.set, .map etc) create copies instead of changing the object itself the way Immutable work?
Is that technically possible, and is there any library doing something like this allready?
"Is that technically possible"
Yes, the implementation you described would solve that particular problem, but Immutable does more than just provide data structures you're not allowed to change directly:
Immutable provides Persistent Immutable List, Stack, Map, OrderedMap, Set, OrderedSet and Record. They are highly efficient on modern JavaScript VMs by using structural sharing via hash maps tries and vector tries as popularized by Clojure and Scala, minimizing the need to copy or cache data.
Immutable also provides a lazy Seq, allowing efficient chaining of collection methods like map and filter without creating intermediate representations. Create some Seq with Range and Repeat.
Emphasis mine.
So, if you have a huge set of nested immutable Maps and Lists, and you change one element (that is: you create a new structure with mostly the same data but one item different), Immutable will be able to share a lot of the data between the two data structures, making the operation faster and consuming less memory.
However, this means that the someValue value isn't necessarily stored at the top of the data structure, waiting for you to access it; Immutable has to reach into internals to find it for you, thus the need for get.
I suspect you could create a very cool and friendly API using Symbols (to hide the underlying data structures) and Proxies (to intercept proprety accessors), but browser support is practically non-existent.
"and is there any library doing something like this already?"
The closest thing I can think of at the moment is React's immutable helpers. The update syntax is a bit noisy in order to support a wide range of operations, but what you get out the other end is are plain JavaScript objects and arrays.
Dealing with immutable data in JavaScript is more difficult than in languages designed for it, like Clojure. However, we've provided a simple immutability helper, update(), that makes dealing with this type of data much easier, without fundamentally changing how your data is represented.
The helper sort of assumes the objects you're passing in are implicitly immutable, but there's no reason you couldn't deeply freeze them as well.
Related
I was reading ECMA2019 (the same is true in ES6 too), where I found:
Each object in an ECMAScript engine is associated with a set of
internal methods that defines its runtime behaviour. These internal
methods are not part of the ECMAScript language. They are defined by
this specification purely for expository purposes. However, each
object within an implementation of ECMAScript must behave as specified
by the internal methods associated with it. The exact manner in which
this is accomplished is determined by the implementation.
I also found these Stack Overflow question1 and question2 and that their answers don't seem to give me the answer I am looking for.
My question is simple. If JavaScript engines decide not to implement some of them, then how would they ensure this statement of above spec -
However, each object within an implementation of ECMAScript must
behave as specified by the internal methods associated with it.
Let us take an example:
[[GetPrototypeOf]] , [[Get]] , [[Set]] , [[GetOwnProperty]] etc are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality? Clearly they have to implement it, just that they can choose to have different method name and different method signature as it is not enforced by spec on them?
Where am I wrong?
Similarly for internal slots too? If they don't have internal variables storing that state, how on earth will they maintain the state of that object when asked?
EDIT : I will add more details to clarify my question. Let us take an example of Object.getPrototypeOf(). This is an API for internal behaviour [[GetPrototypeOf]] and there are possible algorithm for implementing it. The question is not possible ways to implement it a behaviour - its about having a behaviour or not ! and still satisfying the spec overall object behaviour.
V8 developer here. I think this question has mostly been answered already in the comments, so I'll just summarize.
Are internal slot and internal methods actually implemented by JavaScript engines?
Generally not; the engine simply behaves as if its internals were structured in this way. Some parts of an implementation might be very close to the spec's structure, if it's convenient.
One way to phrase it would be: you could implement a JavaScript engine by first faithfully translating the spec text to code (in whichever language you choose to use for your engine), and then you'd be allowed to refactor the invisible internals in any way you want (e.g.: inline functions, or split them up, or organize them as a helper class, or add a fast path or a cache, or generally turn the code inside out, etc). Which isn't surprising, really: as long as the observable behavior remains the same, any program is allowed to refactor its internals. What the ECMAScript is making clear at that point is simply that the "internal slots" really are guaranteed to always be internal and not observable.
[[[Get]] etc] are essential internal methods. If a JavaScript engine refuses to implement them, how does it achieve this functionality?
It's not about refusing to implement something. You can usually implement functionality in many different ways, i.e. with many different ways of structuring your code and your objects. Engines are free to structure their code and objects any way they want, as long as the resulting observable behavior is as specified.
Let us take an example of Object.getPrototypeOf(). This is an API for internal behaviour [[GetPrototypeOf]]
Not quite. Object.getPrototypeOf is a public function that's specified to behave in a certain way. The way the spec describes it is that it must *behave as if there were an internal slot [[GetPrototypeOf]].
You seem to have trouble imagining an alternative way. Well, in many cases, engines will probably choose to have an implementation that's very close to having those internal slots -- perhaps mapped to fields and methods in a C++ class. But it doesn't have to be that way; for example, instead of class methods, there could be free functions: GetPrototypeImpl(internal::Object object) rather than internal::Object::GetPrototypeImpl(). Or instead of an inheritance/hierarchy structure, the engine could use switch-statements over types.
One of the most common ways in which engines' implementations deviate from the structure defined by the spec's internal slots is by having additional fast paths. Typically, a fast path performs a few checks to see if it is applicable, and then does the simple, common case; if the applicability check fails, it falls back to a slower, more complete implementation, that might be much closer to the spec's structure. Or maybe neither function on its own contains the complete spec'ed behavior: you could have GetPrototypeFromRegularObject and GetPrototypeFromProxy plus a wrapper dispatching to the right one, and those all together behave like the spec's hypothetical system of having a [[GetPrototypeOf]] slot on both proxies and regular objects. All of that is perfectly okay because from the outside you can't see a difference in behavior -- all you can see is Object.getPrototypeOf.
One particular example of a fast path is a compiler. If you implemented object behaviors as (private) methods, and loaded and called those methods every time, then your implementation would be extremely slow. Modern engines compile JavaScript functions to bytecode or even machine code, and that code will behave as if you had loaded and called an internal function with the given behavior, but it (usually) will not actually call any such functions. For example, optimized code for an array[index] access should only be a few machine instructions (type check, bounds check, memory load), there should be no call to a [[Get]] involved.
Another very common example is object types. The spec typically uses wording like "if the object has a [[StringData]] internal slot, then ..."; an engine typically replaces that with "if the object's type is what I've chosen for representing strings internally, then ...". Again, the difference is not observable from the outside: Strings behave as if they had a [[StringData]] internal slot, but (in V8 at least) they don't have such a slot, they simply have an appropriate object type that identifies them as strings, and objects with string type know where their character payload is, they don't need any special slot for that.
Edit: forgot to mention: see also https://v8.dev/blog/understanding-ecmascript-part-1 for another way to explain it.
I'm trying to better learn how JS works under the hood and I've heard in the past that the delete keyword (specifically node.js or browsers using V8) results in poor performance, so I want to see if I can figure out what the benefits/detriments are for using that keyword.
I believe the reasoning for not using delete is that removing a property leads to a rebuilding of hidden class transitions and thus a recompiling of the inline cache. However, I believe it is also true that the object prototype will no longer enumerate that property, so if the object is used heavily the upfront cost may eventually pay off.
So:
Are my assumptions about the tradeoffs correct?
If they are correct, is one factor more important than the other (e.g. is rebuilding the IC much more expensive than many prototype enumerations)?
V8 developer here. Short answer: "it depends".
Having an unused property doesn't hurt; there is no general "enumeration cost" unless you actually perform explicit enumerations. In other words, an "enumeration cost" only exists if you find yourself doing something like this:
for (var p in object) {
if (p === old_property_that_I_could_have_deleted) continue;
/* process other properties... */
}
The key reason why it's hard to give a concrete answer (or to provide a canonical example where an effect would be measurable) is because the effects are non-local: they depend both on what exactly you're doing with the object in question, and on what the rest of your app is doing. Deleting a property from one object may well cause operations on other objects to become slower. Or faster. It depends.
To take a step back and look at the high-level situation: JavaScript as a language sort of assumes that objects are represented as dictionaries. Deleting an entry in a dictionary should be perfectly fine, which is why it makes sense that the delete operator exists. In practice, it turns out that an engine can achieve huge performance improvements for read-heavy apps, which is by far the most common case, if it does not store objects as dictionaries, but instead more like something that resembles C/C++ structs. However, such an object representation is (1) generally hard/inefficient to do when properties get deleted, and (2) the engine may well interpret even the first deletion of a property as a hint that the programmer wants this particular object to behave like a dictionary, so it might switch the internal representation over. If a fast-to-modify dictionary is what you wanted, then that's fine (it will provide a benefit even); however if you wanted the object to remain in slow-to-modify/fast-to-read mode, you would perceive the transition to fast-to-modify/slow-to-read dictionary mode as a performance problem.
Thankfully there is a great solution nowadays: when you want a dictionary, use a Map or Set. Engines can (and usually will) assume that you'll want to delete entries from these, so the implementations are optimized for making that possible without negative side effects; in particular no hidden classes are involved.
A few remarks on your assumptions: deleting a property makes an object (mostly) leave the system of hidden class transitions, no transitions will be rebuilt. There is no single global "inline cache", there are many inline caches sprinkled all over your functions. They don't get rebuilt, they just transition to slower and slower modes the more different cases they have to handle. (That's generally how caching works: caching a single case provides huge speedups; on the other end of the scale if you have as many different cases as executions, then a cache just wastes time and memory without providing any benefit.) Again the effect of dictionary-mode objects depends on the overall situation: an inline cache dealing with (mostly) dictionary-mode objects typically exhibits performance somewhere in between (1) an inline cache that only has to deal with objects sharing the single same hidden class, and (2) an inline cache that has to deal with hundreds or thousands of different hidden classes.
I'm currently working with a personal library that's accumulated quite a number of "helper" functions, which are used throughout my architecture for various purposes. Back when there were only a few of them, I kept them in a single file and object so I could access them like so:
tools.parseSomething(obj);
This has been terribly handy, and keeps the code still somewhat organised and readable. The problem is that the file (and object) containing these methods is growing to an enormous size, and needs a cleanup. I was considering creating separate files for "categories" of functions and placing them in those, so they'd be accessed like:
tools.env.getEnvironmentInfo();
My concern with this approach is not the readability so much, but the performance of the look-up. From what I've read recently, object look-ups are no longer the major bottle-neck they used to be, but I still want my library to be as efficient as possible (readability second).
I also considered having the separate files, but add all of the functions to the parent object so they're still accessible in the original way, but stored separately. The object containing the functions is "static" and exists as a single instance.
My question is, with regards to what I've explained, what would be the most efficient method of storing and using a large amount of helper functions? By efficiency I mean performance, which is the sole area of concern for me at the moment.
I'd agree that premature optimization is going on here, however that wasn't your question.
I'd assert that instead of thinking in terms of loosely related scriptlets that 'do kind of related things', I'd suggest an richer object model that can be stateful and perhaps expose the behaviors you need while hiding some optimizations you feel are necessary like caching lookups, etc.
So instead of getEnvironmentInfo, why not have an Environment instance that has, for example, already fetched some parameters and stored internally so subsequent calls are made faster. From there you can create new environments or whatever other behaviors suit you.
This is just good programming practice in whatever language.
After dabbling with javascript for a while, I became progressively convinced that OOP is not the right way to go, or at least, not extensively. Having two or three levels of inheritance is ok, but working full OOP like one would do in Java seems just not fitting.
The language supports compositing and delegation natively. I want to use just that. However, I am having trouble replicating certain benefits from OOP.
Namely:
How would I check if an object implements a certain behavior? I have thought of the following methods
Check if the object has a particular method. But this would mean standardizing method names and if the project is big, it can quickly become cumbersome, and lead to the java problem (object.hasMethod('emailRegexValidatorSimpleSuperLongNotConflictingMethodName')...It would just move the problem of OOP, not fix it. Furthermore, I could not find info on the performance of looking up if methods exist
Store each composited object in an array and check if the object contains the compositor. Something like: object.hasComposite(compositorClass)...But that's also not really elegant and is once again OOP, just not in the standard way.
Have each object have an "implements" array property, and leave the responsibility to the object to say if it implements a certain behavior, whether it is through composition or natively. Flexible and simple, but requires to remember a number of conventions. It is my preferred method until now, but I am still looking.
How would I initialize an object without repeating all the set-up for composited objects? For example, if I have an "textInput" class that uses a certain number of validators, which have to be initialized with variables, and a class "emailInput" which uses the exact same validators, it is cumbersome to repeat the code. And if the interface of the validators change, the code has to change in every class that uses them. How would I go about setting that easily? The API I am thinking of should be as simple as doing object.compositors('emailValidator','lengthValidator','...')
Is there any performance loss associated with having most of the functions that run in the app go through an apply()? Since I am going to be using delegation extensively, basic objects will most probably have almost no methods. All methods will be provided by the composited objects.
Any good resource? I have read countless posts about OOP vs delegation, and about the benefits of delegation, etc, but I can't find anything that would discuss "javascript delegation done right", in the scope of a large framework.
edit
Further explanations:
I don't have code yet, I have been working on a framework in pure OOP and I am getting stuck and in need of multiple inheritance. Thus, I decided to drop classes totally. So I am now merely at theoretical level and trying to make sense out of this.
"Compositing" might be the wrong word; I am referring to the composite pattern, very useful for tree-like structures. It's true that it is rare to have tree structures on the front end (well, save for the DOM of course), but I am developing for node.js
What I mean by "switching from OOP" is that I am going to part from defining classes, using the "new" operator, and so on; I intend to use anonymous objects and extend them with delegators. Example:
var a = {};
compositor.addDelegates(a,["validator", "accessManager", "databaseObject"]);
So a "class" would be a function with predefined delegators:
function getInputObject(type, validator){
var input = {};
compositor.addDelegates(input,[compositor,renderable("input"+type),"ajaxed"]);
if(validator){input.addDelegate(validator);}
return input;
}
Does that make sense?
1) How would I check if an object implements a certain behavior?
Most people don't bother with testing for method existance like this.
If you want to test for methods in order to branch and do different things if its found or not then you are probably doing something evil (this kind of instanceof is usually a code smell in OO code)
If you are just checking if an object implements an interface for error checking then it is not much better then not testing and letting an exception be thrown if the method is not found. I don't know anyone that routinely does this checking but I am sure someone out there is doing it...
2) How would I initialize an object without repeating all the set-up for composited objects?
If you wrap the inner object construction code in a function or class then I think you can avoid most of the repetition and coupling.
3) Is there any performance loss associated with having most of the functions that run in the app go through an apply()?
In my experience, I prefer to avoid dealing with this unless strictly necessary. this is fiddly, breaks inside callbacks (that I use extensively for iteration and async stuff) and it is very easy to forget to set it correctly. I try to use more traditional approaches to composition. For example:
Having each owned object be completely independent, without needing to look at its siblings or owner. This allows me to just call its methods directly and letting it be its own this.
Giving the owned objects a reference to their owner in the form of a property or as a parameter passed to their methods. This allows the composition units to access the owner without depending on having the this correctly set.
Using mixins, flattening the separate composition units in a single level. This has big name clash issues but allows everyone to see each other and share the same "this". Mixins also decouples the code from changes in the composition structure, since different composition divisions will still flatten to the same mixed object.
4) Any good resources?
I don't know, so tell me if you find one :)
I just read this question: are there dictionaries in javascript like python?
One of the answers said that you can use JavaScript objects like Python dictionaries. Is that true? What is the performance of a key lookup in an object? Is it O(1)? Is adding a key to the object also constant time (hashing)?
The V8 design docs imply lookups will be at least this fast, if not faster:
Most JavaScript engines use a dictionary-like data structure as
storage for object properties - each property access requires a
dynamic lookup to resolve the property's location in memory. This
approach makes accessing properties in JavaScript typically much
slower than accessing instance variables in programming languages like
Java and Smalltalk. In these languages, instance variables are located
at fixed offsets determined by the compiler due to the fixed object
layout defined by the object's class. Access is simply a matter of a
memory load or store, often requiring only a single instruction.
To reduce the time required to access JavaScript properties, V8 does
not use dynamic lookup to access properties. Instead, V8 dynamically
creates hidden classes behind the scenes. [...] In V8, an object changes
its hidden class when a new property is added.
It sounds like adding a new key might be slightly slower, though, due to the hidden class creation.
Yes, you can assume that adding a key, and later using it for access are effectively constant time operations.
Under the hood the JS engine may apply some techniques to optimize subsequent lookups, but for the purposes of any algorithm, you can assume O(1).
Take a look at a similar question How does JavaScript VM implements Object property access? and the answer. Here I described the optimization technique used by JS engines and how it affects the key lookup performance. I hope these details help you write more efficient JS code.