V8: implement equality test - javascript

How can I redefine == operator in V8 for my own classes? For example:
var v = Foo.BAR;
var other = getBar(); // returns a new instance of the same as Foo.BAR
assert(v == other); // I want true
The functions are defined in C++ with V8, not directly in JS. I know it's possible as it has been done for the String class.

V8 developer here.
I know it's possible as it has been done for the String class.
Of course a JavaScript engine can and does define what all the operators do -- that is its job. So I wouldn't say that the == operator has been redefined for strings; it has merely been defined.
If you're willing to modify V8, then you can change the behavior of the == operator. But that's going to be a lot of work, because there isn't just one place where it's defined: you'll have to touch the C++ runtime (start by looking at v8::internal::Object::Equals), the Ignition interpreter (look for TestEquals in src/interpreter/interpreter-generator.cc), and the Turbofan compiler (grep for kJSEqual in src/compiler/ and adapt how it's handled in the various phases, most notably JSTypedLowering::ReduceJSEqual but there are probably other places you'll have to touch as well).
Be aware that this is a massive project; IMHO it is not advisable to go down this path. A particular difficulty will be to get the information you need (specifically, "is this object an instance of one of the classes in question?") to all the places where you'll need it; I don't have a good suggestion for how to accomplish that. Another challenge is that porting your changes to new V8 versions will be quite time-consuming maintenance work.
My recommendation would be to go for a .equals function, defined on precisely the classes that should have it. That's clean and simple, easily maintainable/adaptable, and unsurprising to any other JavaScript developer (including your own future self) reading your code.
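A minimal sketch of the .equals approach, written in plain JavaScript for illustration. The Foo class, its name field, and getBar are hypothetical stand-ins here; in the question they would be defined in C++ through the V8 API.

```javascript
// Hypothetical JavaScript stand-in for a class that would really be defined in C++ via V8.
class Foo {
  constructor(name) {
    this.name = name;
  }

  // Value equality: true when `other` is a Foo wrapping the same name.
  equals(other) {
    return other instanceof Foo && other.name === this.name;
  }
}

Foo.BAR = new Foo("BAR");

// Hypothetical factory returning a fresh instance equivalent to Foo.BAR.
function getBar() {
  return new Foo("BAR");
}

const v = Foo.BAR;
const other = getBar();
const sameByOperator = v == other;    // == compares object identity
const sameByMethod = v.equals(other); // equals compares the wrapped value
```

With this shape, the check in the question becomes assert(v.equals(other)) and no engine changes are needed.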

Related

Do the most current JavaScript/ECMAScript compilers optimize out unnecessary variable assignment when returning the value from a function call?

Say we are inside an object that implements file handling. I want to write the code for easier readability.
Example of code where it can be difficult to tell the return type, especially when there are multiple nested function calls:
function create() {
return doCreateAction();
}
This example is more readable by introducing a clarifying variable:
function create() {
var fileHandle = doCreateAction();
return fileHandle;
}
In theory, the second version could perform identically because the compiler has to store the result from doCreateAction() temporarily anyway (probably inside some hidden, anonymous, short-lived temp variable). Is this code any slower when assigning to a named variable?
I would say either they do optimize the variable out, or it's not worth bothering; and that in either case you have bigger fish to fry. :-) But there is an interesting aspect to this in relation to tail calls.
But first, in terms of simple performance: Empirically, this simplistic, synthetic test suggests that the performance of the function doesn't vary depending on whether there's a variable. Also note that a minifier will likely remove that variable for you before the JavaScript engine gets a look in, if you use a decent minifier.
Moving on to tail calls: As you may know, as of ES2015 in strict mode the specification requires tail-call optimization (TCO), which means that when function A returns the result of calling function B, rather than having B return its result to A which then returns it to the caller, A passes control directly to B which then returns the result to the caller. This is more efficient in several ways (avoids creating another frame on the stack, avoids a jump).
Now, it may not matter because development of TCO in JavaScript engines is at least stalled if not dead. The V8 team developed an early version but abandoned it, SpiderMonkey doesn't have it either; as far as I know, only JavaScriptCore in Safari does TCO. But if I read the spec correctly (no mean feat), your first example has doCreateAction in the tail position and so can be optimized via TCO, but your second does not.
So there could be implications in that regard, if and when TCO is ever implemented widely and if, when it is, implementations go slightly beyond the spec for cases like this where clearly it is, in effect, a tail call.
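To make the tail-position distinction concrete, here is a small sketch of the two forms from the question. The body of doCreateAction is a hypothetical placeholder; whether either form is actually optimized depends on the engine.

```javascript
"use strict";

// Hypothetical stand-in for the question's doCreateAction().
function doCreateAction() {
  return { path: "/tmp/example.txt" };
}

// The call is the operand of `return`, so it is in tail position.
function createDirect() {
  return doCreateAction();
}

// The call's result is stored first, so the call itself is not in tail position.
function createViaVariable() {
  var fileHandle = doCreateAction();
  return fileHandle;
}
```

Both functions return the same value; the only difference is whether the spec's tail-call rules apply to the inner call.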
I used to be fairly strict about using a variable in that situation for debugging purposes; moderately-recent versions of Chrome's devtools make it unnecessary for that purpose however (and of course, a minifier will remove it anyway): If you step into a return, you see the return value in the local scope list of variables. Of course, that's only useful if you're using Chrome's devtools (Firefox's, for instance, don't do this [yet?]).

Writing reusable javascript modules & libraries.

I have been using javascript for a while now, and have authored my first content that has been used by other people.
The main reaction has been that my content does not play well with other code.
Unfortunately javascript does not have a lot of the normal tools for creating non-conflicting libraries, like namespaces and classes.
So what are the basic standards and tools for writing non-conflicting libraries in JS?
Javascript is a beautiful and broken programming language. Many programmers coming from other languages find its nature quite confusing, if not downright annoying.
Javascript is missing a lot of the tools classical languages use to create clean classes and interfaces. But this does not mean that you can't write great libraries in JS, it just means that you need to learn how to use the tools it offers.
IMO the best resources on the subject of good modular code are:
Douglas Crockford's JavaScript: The Good Parts
Adequately Good's Module Pattern: In-Depth
Paul Irish's 10 Things I Learned from the jQuery Source
In all of these, the issue of conflicting code is addressed with at least the following two practices.
IIFE Wrapper
(function(dependency , undefined ) {
...dostuff...
})(dependency)
Wrapping your library in an IIFE (immediately invoked function expression) is incredibly useful because it creates a closure immediately. This keeps you from over-populating the global namespace.
In addition, the above code passes in the library's dependencies as parameters. This both improves performance and reduces side effects. For example, jQuery passes in window via: (function(window){})(window)
Last but not least, we declare the parameter undefined but never pass a value for it. This is often referred to as the idiot test. If someone changed undefined somewhere else in their code, it could easily cause all kinds of trouble for your library. (function(undefined) {})() fixes this: because no argument is supplied, undefined inside the function holds the real undefined value and works as intended.
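The wrapper pattern described above can be sketched end to end as follows. The library name mathUtils and its functions are hypothetical and only illustrate the shape: one dependency passed in, one undeclared-on-purpose undefined parameter, private state in the closure, and a single exported object.

```javascript
// `mathUtils` is a hypothetical library name used only for illustration.
var mathUtils = (function (math, undefined) {
  var callCount = 0; // private: not visible outside the closure

  function clampToZero(n) {
    callCount++;
    return math.max(0, n);
  }

  function isMissing(value) {
    // `undefined` here is the parameter that was never passed,
    // so it holds the real undefined value.
    return value === undefined;
  }

  function getCallCount() {
    return callCount;
  }

  // Only this object escapes the closure.
  return {
    clampToZero: clampToZero,
    isMissing: isMissing,
    getCallCount: getCallCount
  };
})(Math);
```

Only mathUtils is added to the enclosing scope; callCount stays unreachable from outside.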
Conflict Handler
var _myLibrary = window.myLibrary; // Backs up whatever myLibrary's old value was
myLibrary = function(){...dostuff...};
// Attached to the function itself (not its prototype) so that
// myLibrary.getConflict() can be called directly
myLibrary.getConflict = function() {
return window.myLibrary === myLibrary ?
_myLibrary || false : false;
};
Conflict Methods are very important when you do not know what other libraries yours will be used with. The above method is similar to jQuery's 'noConflict'.
In short, getConflict returns the overwritten myLibrary variable. If nothing was overwritten, then it returns false.
Having it return false is extremely useful as it can be used in an if statement like so.
if(myLibrary.getConflict()){
var foo = Object.create(myLibrary.getConflict());
}
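For comparison, here is a rough sketch of a noConflict-style handler that also restores the previous global, which is how jQuery's noConflict behaves. The root variable and the placeholder library objects are assumptions made so the example runs outside a browser as well.

```javascript
// `root` stands in for the global object (window in a browser).
var root = typeof window !== "undefined" ? window : globalThis;

// Simulate another script that claimed the name first.
root.myLibrary = { name: "someone else's library" };

(function (root) {
  var previous = root.myLibrary; // back up the old value

  var myLibrary = { name: "my library" };

  // Give the global name back and hand our library to the caller.
  myLibrary.noConflict = function () {
    root.myLibrary = previous;
    return myLibrary;
  };

  root.myLibrary = myLibrary;
})(root);

// The consumer picks a private name and releases the global one.
var mine = root.myLibrary.noConflict();
```

After the call, root.myLibrary points at the earlier library again and mine holds ours.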

The dangers of overwriting JavaScript object and functions

The nature of JavaScript allows for its native objects to be completely re-written. I want to know if there is any real danger in doing so!
Here are some examples of native JavaScript objects
Object
Function
Number
String
Boolean
Math
RegExp
Array
Let's assume that I want to model these to follow a similar pattern that you might find in Java (and some other OOP languages), so that Object defines a set of basic functions, and each other object inherits it (this would have to be explicitly defined by the user, unlike Java, where everything naturally derives from object)
Example:
Object = null;
function Object() {
Object.prototype.equals = function(other) {
return this === other;
}
Object.prototype.toString = function() {
return "Object";
}
Object.equals = function(objA, objB) {
return objA === objB;
}
}
Boolean = null;
function Boolean() {
}
extend(Boolean, Object); // Assume extend is an inheritance mechanism
Foo = null;
function Foo() {
Foo.prototype.bar = function() {
return "Foo.bar";
}
}
extend(Foo, Object);
In this scenario, Object and Boolean now have new implementations. In this respect, what is likely to happen? Am I likely to break things further down the line?
Edit:
I read somewhere that frameworks such as MooTools and Prototype have a similar approach to this, is this correct?
Monkey patching builtin classes like that is a controversial topic. I personally don't like doing that for 2 reasons:
Builtin classes live in the global scope. This means that if two different modules try to add methods with the same name to the global classes then they will conflict, leading to subtle bugs. Even more subtly, if a future version of a browser decides to implement a method with the same name you are also in trouble.
Adding things to the prototypes of common classes can break code that uses for-in loops without a hasOwnProperty check (people new to JS often do that to objects and arrays, since for-in kind of looks like a foreach loop). If you aren't 100% sure that the code you use is using for-in loops safely then monkeypatching Object.prototype could lead to problems.
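A short sketch of the for-in problem, assuming an equals method is patched onto Object.prototype the way the question proposes. The settings object is hypothetical; the patch is removed at the end so the example leaves nothing behind.

```javascript
// Patch Object.prototype the way the question proposes.
Object.prototype.equals = function (other) {
  return this === other;
};

var settings = { width: 10, height: 20 };

// Unguarded for-in also visits the inherited, enumerable `equals`.
var unsafeKeys = [];
for (var key in settings) {
  unsafeKeys.push(key);
}

// Guarded for-in only keeps the object's own properties.
var safeKeys = [];
for (var ownKey in settings) {
  if (Object.prototype.hasOwnProperty.call(settings, ownKey)) {
    safeKeys.push(ownKey);
  }
}

// Undo the patch.
delete Object.prototype.equals;
```

Any third-party code that loops like the first version would suddenly see an extra "equals" key.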
That said, there is one situation where I find monkeypatching builtins acceptable and that is adding features from new browsers on older browsers (like, for example, the forEach method for arrays). In this case you avoid conflicts with future browser versions and aren't likely to catch anyone by surprise. But even then, I would still recommend using a shim from a third party instead of coding it on your own, since there are often many tricky corner cases that are hard to get right.
There's some level of preference here, but my personal take is that this sort of thing has the potential to become a giant intractable mess.
For example, you start with two projects, A and B, that each decide to implement all sorts of awesome useful fluent methods on String.
Project A has decided that String needs an isEmpty function that returns true if a string is zero-length or is only whitespace.
Project B has decided that String needs an isEmpty function that returns true if a string is zero-length, and an isEmptyOrWhitespace function that returns true if a string is zero-length or is only whitespace.
Now you have a project that wants to use some code from Project A and some code from Project B. Both of them make extensive use of their custom isEmpty functions. Do you have any chance of successfully joining the two? Probably not. You are in a cluster arrangement, so to speak.
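The clash can be reproduced in a few lines. This is only a sketch of the scenario: the two definitions below are hypothetical versions of what Project A and Project B might ship, and the patch is removed at the end.

```javascript
// Project A: empty means zero-length or whitespace only.
String.prototype.isEmpty = function () {
  return this.trim().length === 0;
};
var resultUnderA = "   ".isEmpty();

// Project B is loaded later and silently replaces A's definition:
// empty means zero-length only.
String.prototype.isEmpty = function () {
  return this.length === 0;
};
var resultUnderB = "   ".isEmpty();

// Undo the patch.
delete String.prototype.isEmpty;
```

The same call gives a different answer depending on load order, and all of Project A's code now runs against Project B's semantics.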
Note that this is all very different from extension methods in C#, where you at least have to import the containing static class's namespace to get the extension method, there's no runtime conflict, and you could reasonably consume from A and B in the same project as long as you didn't import their extensions namespace (hoping that they had the foresight to put their extension classes in a separate namespace for exactly this reason).
The worst case in JS that I know of along these lines is undefined. You can define it.
You're allowed to do things like undefined = 'blah';.... at which point, you can no longer rely on if(x === undefined). Which could easily break something elsewhere in your code (or, of course, in a third party lib you may be using).
That's completely bonkers, but it definitely shows the dangers of arbitrarily overwriting built-in objects.
See also: http://wtfjs.com/2010/02/15/undefined-is-mutable
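A small sketch of the same hazard as it still exists today. Since ES5 the global undefined is read-only, but the name can still be shadowed inside a function. The function names are hypothetical; void 0 is a standard expression that always evaluates to the real undefined value.

```javascript
function brokenCheck(x) {
  var undefined = "blah"; // legal: shadows the global inside this function
  return x === undefined; // now compares against "blah"
}

function robustCheck(x) {
  var undefined = "blah";
  return x === void 0; // void 0 is always the real undefined value
}
```

brokenCheck reports a missing argument as present and the string "blah" as missing, while robustCheck is unaffected by the shadowing.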
For a slightly more sane example, take the Sahi browser testing tool. This tool allows you to write automated scripts for the browser to test your site. (similar to Selenium). One problem with doing that is if your site uses alert() or confirm(), the script would stop running while it waits for user input. Sahi gets around this by overwriting these functions with its own stub functions.
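The general idea of replacing blocking dialogs with stubs can be sketched like this. This is not Sahi's actual implementation; page is a hypothetical stand-in for the browser's window object, and the recording array is an assumption added for illustration.

```javascript
// `page` is a hypothetical stand-in for the browser's window object.
var page = {
  alert: function (message) {
    throw new Error("would block waiting for the user: " + message);
  },
  confirm: function (message) {
    throw new Error("would block waiting for the user: " + message);
  }
};

// Replace the blocking dialogs with stubs that record the call and return at once.
var recordedDialogs = [];
page.alert = function (message) {
  recordedDialogs.push("alert: " + message);
};
page.confirm = function (message) {
  recordedDialogs.push("confirm: " + message);
  return true; // behave as if the user clicked OK
};

// Code under test now runs without pausing.
page.alert("Saved");
var accepted = page.confirm("Delete?");
```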
I avoid overriding the default behavior of the inherent objects. It has bitten me a few times, while other times I was fine. A library you can look at for an example is Sugar.js. It's a great library that some folks love, but I generally avoid it simply because it extends the behavior of existing JavaScript objects, much like what you are doing.
I think however that you will find that this is purely opinion and style.

The disadvantages of JavaScript prototype inheritance, what are they?

I recently watched Douglas Crockford's JavaScript presentations, where he raves about JavaScript prototype inheritance as if it is the best thing since sliced white bread. Considering Crockford's reputation, it may very well be.
Can someone please tell me what is the downside of JavaScript prototype inheritance? (compared to class inheritance in C# or Java, for example)
In my experience, a significant disadvantage is that you can't mimic Java's "private" member variables by encapsulating a variable within a closure, but still have it accessible to methods subsequently added to the prototype.
i.e.:
function MyObject() {
var foo = 1;
this.bar = 2;
}
MyObject.prototype.getFoo = function() {
// can't access "foo" here!
}
MyObject.prototype.getBar = function() {
return this.bar; // OK!
}
This confuses OO programmers who are taught to make member variables private.
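One common workaround, sketched below, is to define the accessor inside the constructor so that it closes over the variable. The trade-off is that such a method is created once per instance instead of being shared on the prototype. The getFoo placement is the only change to the example above.

```javascript
function MyObject() {
  var foo = 1; // private: only code inside this constructor call can see it
  this.bar = 2;

  // Privileged method: created per instance and closes over `foo`.
  this.getFoo = function () {
    return foo;
  };
}

// Prototype methods are shared, but can only reach public properties.
MyObject.prototype.getBar = function () {
  return this.bar;
};

var obj = new MyObject();
```

Here obj.getFoo() can read foo, while obj.foo itself does not exist.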
Things I miss when sub-classing an existing object in Javascript vs. inheriting from a class in C++:
1. No standard (built-into-the-language) way of writing it that looks the same no matter which developer wrote it.
2. Writing your code doesn't naturally produce an interface definition the way the class header file does in C++.
3. There's no standard way to do protected and private member variables or methods. There are some conventions for some things, but again different developers do it differently.
4. There's no compiler step to tell you when you've made foolish typing mistakes in your definition.
5. There's no type-safety when you want it.
Don't get me wrong, there are a zillion advantages to the way javascript prototype inheritance works vs C++, but these are some of the places where I find javascript works less smoothly.
4 and 5 are not strictly related to prototype inheritance, but they come into play when you have a significant sized project with many modules, many classes and lots of files and you wish to refactor some classes. In C++, you can change the classes, change as many callers as you can find and then let the compiler find all the remaining references for you that need fixing. If you've added parameters, changed types, changed method names, moved methods, etc., the compiler will show you where you need to fix things.
In Javascript, there is no easy way to discover all possible pieces of code that need to be changed without literally executing every possible code path to see if you've missed something or made some typo. While this is a general disadvantage of javascript, I've found it particularly comes into play when refactoring existing classes in a significant-sized project. I've come near the end of a release cycle in a significant-sized JS project and decided that I should NOT do any refactoring to fix a problem (even though that was the better solution) because the risk of not finding all possible ramifications of that change was much higher in JS than C++.
So, consequently, I find it's riskier to make some types of OO-related changes in a JS project.
I think the main danger is that multiple parties can override one another's prototype methods, leading to unexpected behavior.
This is particularly dangerous because so many programmers get excited about prototype "inheritance" (I'd call it extension) and therefore start using it all over the place, adding methods left and right that may have ambiguous or subjective behavior. Ultimately, if left unchecked, this kind of "prototype method proliferation" can lead to very difficult-to-maintain code.
A popular example would be the trim method. It might be implemented something like this by one party:
String.prototype.trim = function() {
// remove all ' ' characters from left & right
}
Then another party might create a new definition, with a completely different signature, taking an argument which specifies the character to trim. Suddenly all the code that passes nothing to trim has no effect.
Or another party reimplements the method to strip ' ' characters and other forms of white space (e.g., tabs, line breaks). This might go unnoticed for some time but lead to odd behavior down the road.
Depending on the project, these may be considered remote dangers. But they can happen, and from my understanding this is why libraries such as Underscore.js opt to keep all their methods within namespaces rather than add prototype methods.
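A rough sketch of the namespaced alternative: the helper lives on a library-owned object instead of String.prototype, so it cannot collide with the built-in trim or with another party's definition. The textUtils namespace and trimChar function are hypothetical names, not part of Underscore.js.

```javascript
// `textUtils` is a hypothetical namespace owned by one library.
var textUtils = {
  // Remove every leading and trailing occurrence of the single character `ch`.
  trimChar: function (str, ch) {
    var start = 0;
    var end = str.length;
    while (start < end && str.charAt(start) === ch) {
      start++;
    }
    while (end > start && str.charAt(end - 1) === ch) {
      end--;
    }
    return str.slice(start, end);
  }
};

var trimmedDashes = textUtils.trimChar("--a-b--", "-");
var trimmedNative = "  x ".trim(); // the built-in is left untouched
```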
(Update: Obviously, this is a judgment call. Other libraries--namely, the aptly-named Prototype--do go the prototype route. I'm not trying to say one way is right or wrong, only that this is the argument I've heard against using prototype methods too liberally.)
I miss being able to separate interface from implementation. In languages with an inheritance system that includes concepts like abstract or interface, you could e.g. declare your interface in your domain layer but put the implementation in your infrastructure layer. (Cf. onion architecture.) JavaScript's inheritance system has no way to do something like this.
I'd like to know if my intuitive answer matches up with what the experts think.
What concerns me is that if I have a function in C# (for the sake of discussion) that takes a parameter, any developer who writes code that calls my function immediately knows from the function signature what sort of parameters it takes and what type of value it returns.
With JavaScript "duck-typing", someone could inherit one of my objects and change its member functions and values (Yes, I know that functions are values in JavaScript) in almost any way imaginable so that the object they pass in to my function bears no resemblance to the object I expect my function to be passed.
I feel like there is no good way to make it obvious how a function is supposed to be called.

What is Javascript missing?

Javascript is an incredible language and libraries like jQuery make it almost too easy to use.
What should the original designers of Javascript have included in the language, or what should we be pressuring them into adding to future versions?
Things I'd like to see:-
Some kind of compiled version of the language, so we programmers can catch more of our errors earlier, as well as providing a faster solution for browsers to consume.
optional strict types (eg, being able to declare a var as a float and keep it that way).
I am no expert on Javascript, so maybe these already exist, but what else should be there? Are there any killer features of other programming languages that you would love to see?
Read Javascript: The Good Parts from the author of JSLint, Douglas Crockford. It's really impressive, and covers the bad parts too.
One thing I've always longed for and ached for is some support for hashing. Specifically, let me track metadata about an object without needing to add an expando property on that object.
Java provides Object.hashCode() which, by default, is typically derived from the object's identity (often described as the underlying memory address); Python provides id(obj) to get the object's identity and hash(obj), which is customizable; etc. Javascript provides nothing for either.
For example, I'm writing a Javascript library that tries to unobtrusively and gracefully enhance some objects you give me (e.g. your <li> elements, or even something unrelated to the DOM). Let's say I need to process each object exactly once. So after I've processed each object, I need a way to "mark it" as seen.
Ideally, I could make my own hashtable or set (either way, implemented as a dictionary) to keep track:
var processed = {};
function process(obj) {
var key = obj.getHashCode();
if (processed[key]) {
return; // already seen
}
// process the object...
processed[key] = true;
}
But since that's not an option, I have to resort to adding a property onto each object:
var SEEN_PROP = "__seen__";
function process(obj) {
if (obj[SEEN_PROP]) { // or simply obj.__seen__
return; // already seen
}
// process the object...
obj[SEEN_PROP] = true; // or obj.__seen__ = true
}
But these objects aren't mine, so this makes my script obtrusive. The technique is effectively a hack to work around the fact that I can't get a reliable hash key for any arbitrary object.
Another workaround is to create wrapper objects for everything, but often you need a way to go from the original object to the wrapper object, which requires an expando property on the original object anyway. Plus, that creates a circular reference which causes memory leaks in IE if the original object is a DOM element, so this isn't a safe cross-browser technique.
For developers of Javascript libraries, this is a recurring issue.
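Since this answer was written, ES2015 added WeakSet and WeakMap, which cover this use case without touching the objects. A minimal sketch, assuming an engine that supports WeakSet; processOnce and processedCount are hypothetical names.

```javascript
// Objects are tracked by identity; entries do not prevent garbage collection.
var processed = new WeakSet();
var processedCount = 0;

function processOnce(obj) {
  if (processed.has(obj)) {
    return false; // already seen
  }
  // process the object...
  processedCount++;
  processed.add(obj);
  return true;
}

var item = { label: "first" };
var firstRun = processOnce(item);
var secondRun = processOnce(item);
```

The object itself is never modified, so the script stays unobtrusive.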
What should the original designers of Javascript have included in the language, or what should we be pressuring them into adding to future versions?
They should have got together and decided together what to implement, rather than competing against each other with slightly different implementations of the language (naming no names), to prevent the immense headache that has ensued for every developer over the past 15 years.
The ability to use arrays/objects as keys without string coercion might've been nice.
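The coercion problem, and the Map type that ES2015 later added to address it, can be sketched as follows. The key and variable names are hypothetical.

```javascript
var keyA = { id: 1 };
var keyB = { id: 2 };

// Plain objects coerce keys to strings, so both keys become "[object Object]".
var plain = {};
plain[keyA] = "a";
plain[keyB] = "b"; // overwrites the entry written for keyA

// Map compares object keys by identity, with no string coercion.
var byIdentity = new Map();
byIdentity.set(keyA, "a");
byIdentity.set(keyB, "b");
```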
Javascript is missing a name that differentiates it from a language it is nothing like.
There are a few little things it could do better.
Choice of + for string concatenation was a mistake. An & would have been better.
It can be frustrating that for( x in list ) iterates over indices, as it makes it difficult to use a literal array. Newer versions have a solution.
Proper scoping would be nice. v1.7 is adding this, but it looks clunky.
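For the iteration and scoping points, here is a brief sketch using features that were standardized later in ES2015 (for...of and let). The variable names are hypothetical.

```javascript
var list = ["a", "b", "c"];

// for...in yields the indices, as strings.
var fromForIn = [];
for (var index in list) {
  fromForIn.push(index);
}

// for...of yields the values.
var fromForOf = [];
for (const item of list) {
  fromForOf.push(item);
}

// `let` is block scoped: each iteration gets its own binding of n.
var callbacks = [];
for (let n = 0; n < 3; n++) {
  callbacks.push(function () {
    return n;
  });
}
var captured = callbacks.map(function (fn) {
  return fn();
});
```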
The way to do 'private' and 'protected' variables in an object is a little bit obscure and hard to remember as it takes advantage of closures and how they affect scoping. Some syntactic sugar to hide the mechanics of this would be fabulous.
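Later versions of the language did add syntax for this. A minimal sketch using private class fields, assuming an engine that supports them (they were standardized in ES2022); the Counter class is a hypothetical example.

```javascript
class Counter {
  #count = 0; // private field: only code inside the class body can access it

  increment() {
    this.#count += 1;
    return this.#count;
  }

  get value() {
    return this.#count;
  }
}

const counter = new Counter();
const afterFirst = counter.increment();
```

There is still no built-in equivalent of protected.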
To be honest, many of the problems I routinely trip over are actually DOM quirks, not JavaScript per se. The other big problem, of course, is that recent versions of JavaScript have interesting and useful things, like generators. Unfortunately, most browsers are stuck at 1.5. Apparently only Firefox is forging ahead.
File IO is missing.... though some would say it doesn't really need it...
