What is Javascript missing? - javascript

Javascript is an incredible language and libraries like jQuery make it almost too easy to use.
What should the original designers of Javascript have included in the language, or what should we be pressuring them into adding to future versions?
Things I'd like to see:
Some kind of compiled version of the language, so we programmers can catch more of our errors earlier, as well as providing a faster solution for browsers to consume.
Optional strict types (e.g., being able to declare a var as a float and keep it that way).
I am no expert on Javascript, so maybe these already exist, but what else should be there? Are there any killer features of other programming languages that you would love to see?

Read Javascript: The Good Parts by the author of JSLint, Douglas Crockford. It's really impressive, and it covers the bad parts too.

One thing I've always longed for and ached for is some support for hashing. Specifically, let me track metadata about an object without needing to add an expando property on that object.
Java provides Object.hashCode() which, by default, is based on something like the underlying memory address; Python provides id(obj) to get the memory address and hash(obj), which is customizable; etc. Javascript provides nothing for either.
For example, I'm writing a Javascript library that tries to unobtrusively and gracefully enhance some objects you give me (e.g. your <li> elements, or even something unrelated to the DOM). Let's say I need to process each object exactly once. So after I've processed each object, I need a way to "mark it" as seen.
Ideally, I could make my own hashtable or set (either way, implemented as a dictionary) to keep track:
var processed = {};
function process(obj) {
    var key = obj.getHashCode();
    if (processed[key]) {
        return; // already seen
    }
    // process the object...
    processed[key] = true;
}
But since that's not an option, I have to resort to adding a property onto each object:
var SEEN_PROP = "__seen__";
function process(obj) {
    if (obj[SEEN_PROP]) { // or simply obj.__seen__
        return; // already seen
    }
    // process the object...
    obj[SEEN_PROP] = true; // or obj.__seen__ = true
}
But these objects aren't mine, so this makes my script obtrusive. The technique is effectively a hack to work around the fact that I can't get a reliable hash key for any arbitrary object.
Another workaround is to create wrapper objects for everything, but often you need a way to go from the original object to the wrapper object, which requires an expando property on the original object anyway. Plus, that creates a circular reference which causes memory leaks in IE if the original object is a DOM element, so this isn't a safe cross-browser technique.
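To make that concrete, here is roughly what the wrapper-object workaround looks like (the helper and property names are invented for illustration). Note that the back-reference is itself an expando property, and the object and its wrapper now point at each other, which is the circular reference mentioned above:
var wrappers = [];
function getWrapper(obj) {
    if (obj.__wrapper__) {        // expando back-reference to find the wrapper again
        return obj.__wrapper__;
    }
    var wrapper = { target: obj, seen: false };
    obj.__wrapper__ = wrapper;    // circular reference: obj -> wrapper -> obj
    wrappers.push(wrapper);
    return wrapper;
}
function process(obj) {
    var wrapper = getWrapper(obj);
    if (wrapper.seen) {
        return; // already seen
    }
    // process the object...
    wrapper.seen = true;
}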
For developers of Javascript libraries, this is a recurring issue.

What should the original designers of Javascript have included in the language, or what should we be pressuring them into adding to future versions?
They should have gotten together and decided what to implement, rather than competing against each other with slightly different implementations of the language (naming no names), to prevent the immense headache that has ensued for every developer over the past 15 years.

The ability to use arrays/objects as keys without string coercion might've been nice.
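For instance, with a plain object every key is coerced to a string, so two distinct objects collide on the same key:
var map = {};
var a = { id: 1 };
var b = { id: 2 };
map[a] = "first";
map[b] = "second";
// Both keys were coerced to the string "[object Object]",
// so the second assignment overwrote the first:
console.log(map[a]);           // "second"
console.log(Object.keys(map)); // ["[object Object]"]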

Javascript is missing a name that differentiates it from a language it is nothing like.

There are a few little things it could do better.
Choice of + for string concatenation was a mistake. An & would have been better.
It can be frustrating that for (x in list) iterates over indices rather than values, as it makes it difficult to use with a literal array. Newer versions have a solution. (Both points are illustrated in the snippet after this list.)
Proper scoping would be nice. v1.7 is adding this, but it looks clunky.
The way to do 'private' and 'protected' variables in an object is a little bit obscure and hard to remember as it takes advantage of closures and how they affect scoping. Some syntactic sugar to hide the mechanics of this would be fabulous.
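The concatenation and for...in points in a nutshell:
// + does double duty as addition and concatenation,
// so a stray string silently changes the meaning:
console.log(1 + 2);   // 3
console.log("1" + 2); // "12"
// for...in iterates over indices (as strings), not values:
var list = [10, 20, 30];
for (var x in list) {
    console.log(x); // "0", "1", "2" -- not 10, 20, 30
}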
To be honest, many of the problems I routinely trip over are actually DOM quirks, not JavaScript per se. The other big problem, of course, is that recent versions of JavaScript have interesting and useful things, like generators. Unfortunately, most browsers are stuck at 1.5. Apparently only Firefox is forging ahead.

File I/O is missing, though some would say it doesn't really need it.

Related

JavaScript: performance constraints of `delete` keyword

I'm trying to better learn how JS works under the hood and I've heard in the past that the delete keyword (specifically node.js or browsers using V8) results in poor performance, so I want to see if I can figure out what the benefits/detriments are for using that keyword.
I believe the reasoning for not using delete is that removing a property leads to a rebuilding of hidden class transitions and thus a recompiling of the inline cache. However, I believe it is also true that the object prototype will no longer enumerate that property, so if the object is used heavily the upfront cost may eventually pay off.
So:
Are my assumptions about the tradeoffs correct?
If they are correct, is one factor more important than the other (e.g. is rebuilding the IC much more expensive than many prototype enumerations)?
V8 developer here. Short answer: "it depends".
Having an unused property doesn't hurt; there is no general "enumeration cost" unless you actually perform explicit enumerations. In other words, an "enumeration cost" only exists if you find yourself doing something like this:
for (var p in object) {
    if (p === old_property_that_I_could_have_deleted) continue;
    /* process other properties... */
}
The key reason why it's hard to give a concrete answer (or to provide a canonical example where an effect would be measurable) is because the effects are non-local: they depend both on what exactly you're doing with the object in question, and on what the rest of your app is doing. Deleting a property from one object may well cause operations on other objects to become slower. Or faster. It depends.
To take a step back and look at the high-level situation: JavaScript as a language sort of assumes that objects are represented as dictionaries. Deleting an entry in a dictionary should be perfectly fine, which is why it makes sense that the delete operator exists. In practice, it turns out that an engine can achieve huge performance improvements for read-heavy apps, which is by far the most common case, if it does not store objects as dictionaries, but instead more like something that resembles C/C++ structs. However, such an object representation is (1) generally hard/inefficient to do when properties get deleted, and (2) the engine may well interpret even the first deletion of a property as a hint that the programmer wants this particular object to behave like a dictionary, so it might switch the internal representation over. If a fast-to-modify dictionary is what you wanted, then that's fine (it will provide a benefit even); however if you wanted the object to remain in slow-to-modify/fast-to-read mode, you would perceive the transition to fast-to-modify/slow-to-read dictionary mode as a performance problem.
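A hedged sketch of that trade-off (the internal transitions aren't observable from script, and exactly when an engine switches representations is an implementation detail):
// Objects created with the same shape can share a hidden class,
// which keeps property reads fast:
function makePoint(x, y) {
    return { x: x, y: y };
}
var p1 = makePoint(1, 2);
var p2 = makePoint(3, 4);
// Deleting a property may be taken as a hint to switch p1 to
// dictionary mode, so later reads on p1 can become slower than on p2:
delete p1.x;
// If you want to keep the struct-like representation, one commonly
// suggested alternative is to assign undefined instead, at the cost of
// the property still showing up in for...in loops (see the example above):
p2.x = undefined;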
Thankfully there is a great solution nowadays: when you want a dictionary, use a Map or Set. Engines can (and usually will) assume that you'll want to delete entries from these, so the implementations are optimized for making that possible without negative side effects; in particular no hidden classes are involved.
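For example, a Map makes delete-heavy bookkeeping straightforward:
const cache = new Map();
cache.set("user:1", { name: "Ada" });
cache.set("user:2", { name: "Grace" });
// Deleting entries from a Map is an expected, optimized operation;
// no hidden classes are involved:
cache.delete("user:1");
console.log(cache.has("user:1")); // false
console.log(cache.size);          // 1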
A few remarks on your assumptions: deleting a property makes an object (mostly) leave the system of hidden class transitions, no transitions will be rebuilt. There is no single global "inline cache", there are many inline caches sprinkled all over your functions. They don't get rebuilt, they just transition to slower and slower modes the more different cases they have to handle. (That's generally how caching works: caching a single case provides huge speedups; on the other end of the scale if you have as many different cases as executions, then a cache just wastes time and memory without providing any benefit.) Again the effect of dictionary-mode objects depends on the overall situation: an inline cache dealing with (mostly) dictionary-mode objects typically exhibits performance somewhere in between (1) an inline cache that only has to deal with objects sharing the single same hidden class, and (2) an inline cache that has to deal with hundreds or thousands of different hidden classes.

The dangers of overwriting JavaScript object and functions

The nature of JavaScript allows for its native objects to be completely re-written. I want to know if there is any real danger in doing so!
Here are some examples of native JavaScript objects
Object
Function
Number
String
Boolean
Math
RegExp
Array
Let's assume that I want to model these to follow a similar pattern to what you might find in Java (and some other OOP languages), so that Object defines a set of basic functions and every other object inherits from it (this would have to be explicitly defined by the user, unlike Java, where everything naturally derives from Object).
Example:
Object = null;
function Object() {
    Object.prototype.equals = function(other) {
        return this === other;
    }
    Object.prototype.toString = function() {
        return "Object";
    }
    Object.equals = function(objA, objB) {
        return objA === objB;
    }
}
Boolean = null;
function Boolean() {
}
extend(Boolean, Object); // Assume extend is an inheritance mechanism
Foo = null;
function Foo() {
    Foo.prototype.bar = function() {
        return "Foo.bar";
    }
}
extend(Foo, Object);
In this scenario, Object and Boolean now have new implementations. In this respect, what is likely to happen? Am I likely to break things further down the line?
Edit:
I read somewhere that frameworks such as MooTools and Prototype have a similar approach to this; is this correct?
Monkey patching builtin classes like that is a controversial topic. I personally don't like doing that for two reasons:
Builtin classes are effectively global, shared by every script on the page. This means that if two different modules try to add methods with the same name to the global classes, they will conflict, leading to subtle bugs. Even more subtly, if a future version of a browser decides to implement a method with the same name, you are also in trouble.
Adding things to the prototypes of common classes can break code that uses for-in loops without a hasOwnProperty check (people new to JS often do that to objects and arrays, since for-in kind of looks like a foreach loop). If you aren't 100% sure that the code you use is using for-in loops safely then monkeypatching Object.prototype could lead to problems.
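A minimal illustration of the second point:
Object.prototype.describe = function() {
    return "I am " + this;
};
var scores = { alice: 10, bob: 12 };
for (var key in scores) {
    console.log(key); // "alice", "bob"... and "describe", which isn't real data
}
for (var key in scores) {
    if (scores.hasOwnProperty(key)) {
        console.log(key); // only "alice" and "bob"
    }
}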
That said, there is one situation where I find monkeypatching builtins acceptable and that is adding features from new browsers on older browsers (like, for example, the forEach method for arrays). In this case you avoid conflicts with future browser versions and aren't likely to catch anyone by surprise. But even then, I would still recommend using a shim from a third party instead of coding it on your own, since there are often many tricky corner cases that are hard to get right.
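For what it's worth, the feature-detection pattern behind such a shim looks roughly like this; a real shim handles more corner cases, which is exactly why the third-party recommendation stands:
// Only add the method when the browser doesn't already provide it,
// so future native implementations are never overridden:
if (!Array.prototype.forEach) {
    Array.prototype.forEach = function(callback, thisArg) {
        for (var i = 0; i < this.length; i++) {
            if (i in this) { // skip holes in sparse arrays
                callback.call(thisArg, this[i], i, this);
            }
        }
    };
}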
There's some level of preference here, but my personal take is that this sort of thing has the potential to become a giant intractable mess.
For example, you start with two projects, A and B, that each decide to implement all sorts of awesome useful fluent methods on String.
Project A has decided that String needs an isEmpty function that returns true if a string is zero-length or is only whitespace.
Project B has decided that String needs an isEmpty function that returns true if a string is zero-length, and an isEmptyOrWhitespace function that returns true if a string is zero-length or is only whitespace.
Now you have a project that wants to use some code from Project A and some code from Project B. Both of them make extensive use of their custom isEmpty functions. Do you have any chance of successfully joining the two? Probably not. You are in a cluster arrangement, so to speak.
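A sketch of that collision (both implementations are invented for illustration):
// Project A: isEmpty also treats whitespace-only strings as empty.
String.prototype.isEmpty = function() {
    return this.replace(/\s+/g, "").length === 0;
};
// Project B: isEmpty means strictly zero-length.
String.prototype.isEmpty = function() {
    return this.length === 0;
};
// Whichever file loads last wins, silently changing the other project's behavior:
console.log("   ".isEmpty()); // false here, but Project A's code expects true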
Note that this is all very different than extension methods in C#, where you at least have to import the containing static class's namespace to get the extension method, there's no runtime conflict, and could reasonably consume from A and B in the same project as long as you didn't import their extensions namespace (hoping that they had the foresight to put their extension classes in a separate namespace for exactly this reason).
The worst case in JS that I know of along these lines is undefined. You can define it.
You're allowed to do things like undefined = 'blah'; ... at which point you can no longer rely on if (x === undefined), which could easily break something elsewhere in your code (or, of course, in a third-party lib you may be using).
That's completely bonkers, but it definitely shows the dangers of arbitrarily overwriting built-in objects.
See also: http://wtfjs.com/2010/02/15/undefined-is-mutable
For a slightly more sane example, take the Sahi browser testing tool. This tool allows you to write automated scripts for the browser to test your site. (similar to Selenium). One problem with doing that is if your site uses alert() or confirm(), the script would stop running while it waits for user input. Sahi gets around this by overwriting these functions with its own stub functions.
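Conceptually the override looks something like this (not Sahi's actual code, just the general idea):
// Replace the blocking dialogs with non-blocking stubs so an
// automated script never stalls waiting for user input:
var recordedAlerts = [];
window.alert = function(message) {
    recordedAlerts.push(message); // remember the message for later assertions
};
window.confirm = function(message) {
    return true; // pretend the user always clicked OK
};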
I avoid overriding the default behavior of the built-in objects. It's bitten me a few times, while other times I was fine. A library you can look at for an example is Sugar.js. It's a great library that some folks love, but I generally avoid it simply because it extends the behavior of existing JavaScript objects, much like what you are doing.
I think however that you will find that this is purely opinion and style.

How to "correctly" create an object which inherits from Element?

I am writing an HTML5 application that involves a lot of XML manipulation, part of this manipulation involves comparing the versions of two different XML Elements.
What I need is for every Element, Attr, and TextNode (all of which inherit from Node, AFAIK) object that gets created to have associated version information, but still be able to behave like a normal Element, Attr, or TextNode. The current working solution I am using to store the version information, is the following:
Node.prototype.MyAppAnnotation = {
    Version : null
};
Now, I understand that augmenting built-in types is considered bad form, but beyond this technique, I'm at a loss for how to get the desired functionality. I don't think I can encapsulate the Node in a wrapper because I need the Node related properties and functions exposed on the wrapper. I might be able to write some sort of pass-through functions for the wrapper, but that seems really clunky.
I feel that because the app I'm writing is an HTML5 app, and as such only has to run on the most modern browsers (all of which support the augmentation of built-ins), this technique is appropriate. Also, by providing a sufficiently obscure name to my augmentation object, I can avoid all naming collisions (except for intentional collisions). I've also explored an inheritance-based solution using Google's Closure library. However, it appears that because Element, Node and TextNode don't have direct constructors (i.e. they're created off of a Document object), this technique will not work either.
I was wondering if someone could either a) recommend an elegant way of achieving this effect without augmenting Element, or b) provide a compelling reason for why I shouldn't break the "don't augment built-ins" rule in this case.
Many Thanks,
Jarabek
Your idea is theoretically valid, but there's a weird feeling I get when reading about it.
First of all - you don't have to augment any prototypes. If you just do somedomnode.myweirdname='foo' it will become a field of that object. That's what javascript does ;)
So when there is no version you'll get undefined instead of null.
But, if you want to add more functionality or wrap dom node in anything - there's a bit of history of doing that. Most of that history is dominated by stuff like jQuery :)
Just create an object that has a field containing the node. And then you can access it really simply:
myobject.node
And create the object with some constructor or just factory function:
var myobject = createDomNodeWrapper(domnode)
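A minimal sketch of such a factory, carrying the version information from the question (the field names are arbitrary):
function createDomNodeWrapper(domnode) {
    return {
        node: domnode, // the real Node, untouched, with all its normal behavior
        version: null  // the app-specific metadata lives on the wrapper instead
    };
}
// Usage:
var myobject = createDomNodeWrapper(document.createElement("li"));
myobject.version = "1.2";
myobject.node.textContent = "hello";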

The disadvantages of JavaScript prototype inheritance, what are they?

I recently watched Douglas Crockford's JavaScript presentations, where he raves about JavaScript prototype inheritance as if it is the best thing since sliced white bread. Considering Crockford's reputation, it may very well be.
Can someone please tell me what is the downside of JavaScript prototype inheritance? (compared to class inheritance in C# or Java, for example)
In my experience, a significant disadvantage is that you can't mimic Java's "private" member variables by encapsulating a variable within a closure, but still have it accessible to methods subsequently added to the prototype.
i.e.:
function MyObject() {
    var foo = 1;
    this.bar = 2;
}
MyObject.prototype.getFoo = function() {
    // can't access "foo" here!
}
MyObject.prototype.getBar = function() {
    return this.bar; // OK!
}
This confuses OO programmers who are taught to make member variables private.
Things I miss when sub-classing an existing object in Javascript vs. inheriting from a class in C++:
1. No standard (built-into-the-language) way of writing it that looks the same no matter which developer wrote it.
2. Writing your code doesn't naturally produce an interface definition the way the class header file does in C++.
3. There's no standard way to do protected and private member variables or methods. There are some conventions for some things, but again different developers do it differently.
4. There's no compiler step to tell you when you've made foolish typing mistakes in your definition.
5. There's no type-safety when you want it.
Don't get me wrong, there are a zillion advantages to the way javascript prototype inheritance works vs C++, but these are some of the places where I find javascript works less smoothly.
4 and 5 are not strictly related to prototype inheritance, but they come into play when you have a significant-sized project with many modules, many classes and lots of files and you wish to refactor some classes. In C++, you can change the classes, change as many callers as you can find and then let the compiler find all the remaining references for you that need fixing. If you've added parameters, changed types, changed method names, moved methods, etc., the compiler will show you where you need to fix things.
In Javascript, there is no easy way to discover all possible pieces of code that need to be changed without literally executing every possible code path to see if you've missed something or made a typo. While this is a general disadvantage of javascript, I've found it particularly comes into play when refactoring existing classes in a significant-sized project. I've come near the end of a release cycle in a significant-sized JS project and decided that I should NOT do any refactoring to fix a problem (even though that was the better solution) because the risk of not finding all possible ramifications of that change was much higher in JS than in C++.
So, consequently, I find it's riskier to make some types of OO-related changes in a JS project.
I think the main danger is that multiple parties can override one another's prototype methods, leading to unexpected behavior.
This is particularly dangerous because so many programmers get excited about prototype "inheritance" (I'd call it extension) and therefore start using it all over the place, adding methods left and right that may have ambiguous or subjective behavior. Ultimately, if left unchecked, this kind of "prototype method proliferation" can lead to very difficult-to-maintain code.
A popular example would be the trim method. It might be implemented something like this by one party:
String.prototype.trim = function() {
    // remove all ' ' characters from left & right
    return this.replace(/^ +| +$/g, "");
}
Then another party might create a new definition, with a completely different signature, taking an argument which specifies the character to trim. Suddenly all the code that passes nothing to trim has no effect.
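For instance, the second definition might look like this (invented for illustration); every existing call that passes no argument now quietly does nothing useful:
// A second party redefines trim with a different signature:
String.prototype.trim = function(ch) {
    var pattern = new RegExp("^" + ch + "+|" + ch + "+$", "g");
    return this.replace(pattern, "");
};
// Existing callers pass no argument, so ch is undefined and the regex
// looks for the literal text "undefined" -- the string comes back unchanged:
console.log("  hello  ".trim()); // "  hello  "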
Or another party reimplements the method to strip ' ' characters and other forms of white space (e.g., tabs, line breaks). This might go unnoticed for some time but lead to odd behavior down the road.
Depending on the project, these may be considered remote dangers. But they can happen, and from my understanding this is why libraries such as Underscore.js opt to keep all their methods within namespaces rather than add prototype methods.
(Update: Obviously, this is a judgment call. Other libraries--namely, the aptly-named Prototype--do go the prototype route. I'm not trying to say one way is right or wrong, only that this is the argument I've heard against using prototype methods too liberally.)
I miss being able to separate interface from implementation. In languages with an inheritance system that includes concepts like abstract or interface, you could e.g. declare your interface in your domain layer but put the implementation in your infrastructure layer. (Cf. onion architecture.) JavaScript's inheritance system has no way to do something like this.
I'd like to know if my intuitive answer matches up with what the experts think.
What concerns me is that if I have a function in C# (for the sake of discussion) that takes a parameter, any developer who writes code that calls my function immediately knows from the function signature what sort of parameters it takes and what type of value it returns.
With JavaScript "duck-typing", someone could inherit one of my objects and change its member functions and values (Yes, I know that functions are values in JavaScript) in almost any way imaginable so that the object they pass in to my function bears no resemblance to the object I expect my function to be passed.
I feel like there is no good way to make it obvious how a function is supposed to be called.

javascript constructs to avoid?

I have been writing a JS algorithm. It's blazing fast in Chrome and dog slow in FF. In the Chrome profiler I spend <10% of the time in a method; in FF the same method is 30% of the execution time. Are there JavaScript constructs to avoid because they are really slow in one browser or another?
One thing I have noticed is that things like simple variable declaration can be expensive if you do it enough. I sped up my algorithm noticeably by not doing things like
var x = y.x;
dosomething(x);
and just doing
dosomething(y.x)
for example.
As you've found, different things are issues in different implementations. In my experience, barring doing really stupid things, there's not much point worrying about optimizing your JavaScript code to be fast until/unless you run into a specific performance problem when testing on your target browsers. Such simple things as the usual "count down to zero" optimization (for (i = length - 1; i >= 0; --i) instead of for (i = 0; i < length; ++i)) aren't even reliable across implementations. So I tend to stick to writing code that's fairly clear (because I want to be nice to whoever has to maintain it, which is frequently me), and then worry about optimization if and when.
That said, looking through the Google article that tszming linked to in his/her answer reminded me that there are some performance things that I tend to keep in mind when writing code originally. Here's a list (some from that article, some not):
1. When you're building up a long string out of lots of fragments, surprisingly you usually get better performance if you build up an array of the fragments and then use the Array#join method to create the final string. I do this a lot if I'm building a large HTML snippet that I'll be adding to a page. (A quick sketch follows this list.)
2. The Crockford private instance variable pattern, though cool and powerful, is expensive. I tend to avoid it.
3. with is expensive and easily misunderstood. Avoid it.
4. Memory leaks are, of course, expensive eventually. It's fairly easy to create them on browsers when you're interacting with DOM elements. See the article for more detail, but basically, hook up event handlers using a good library like jQuery, Prototype, Closure, etc. (because that's a particularly prone area and the libraries help out), and avoid storing DOM element references on other DOM elements (directly or indirectly) via expando properties.
5. If you're building up a significant dynamic display of content in a browser, innerHTML is a LOT faster in most cases than using DOM methods (createElement and appendChild). This is because parsing HTML into their internal structures efficiently is what browsers do, and they do it really fast, using optimized, compiled code writing directly to their internal data structures. In contrast, if you're building a significant tree using the DOM methods, you're using an interpreted (usually) language talking to an abstraction that the browser then has to translate to match its internal structures. I did a few experiments a while back, and the difference was about an order of magnitude (in favor of innerHTML). And of course, if you're building up a big string to assign to innerHTML, see the tip above — best to build up fragments in an array and then use join.
6. Cache the results of known-slow operations, but don't overdo it, and only keep things as long as you need them. Keep in mind the cost of retaining a reference vs. the cost of looking it up again.
7. I've repeatedly heard people say that accessing vars from a containing scope (globals would be the ultimate example of this, of course, but you can do it with closures in other scopes) is slower than accessing local ones, and certainly that would make sense in a purely interpreted, non-optimized implementation because of the way the scope chain is defined. But I've never actually seen it proved to be a significant difference in practice. (Link to simple quick-and-dirty test) Actual globals are special because they're properties of the window object, which is a host object and so a bit different than the anonymous objects used for other levels of scope. But I expect you already avoid globals anyway.
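Before the bigger example below, a quick sketch of the array-join pattern from item 1 (the markup is invented):
// Build the fragments in an array, then join once at the end,
// instead of repeatedly concatenating onto a growing string:
var rows = [{ name: "one" }, { name: "two" }, { name: "three" }];
var parts = [];
for (var i = 0; i < rows.length; ++i) {
    parts.push("<li>", rows[i].name, "</li>");
}
var html = "<ul>" + parts.join("") + "</ul>";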
Here's an example of #6. I actually saw this in a question related to Prototype a few weeks back:
for (i = 0; i < $$('.foo').length; ++i) {
    if ($$('.foo')[i].hasClass("bar")) { // I forget what this actually was
        $$('.foo')[i].setStyle({/* ... */});
    }
}
In Prototype, $$ does an expensive thing: It searches through the DOM tree looking for matching elements (in this case, elements with the class "foo"). The code above is searching the DOM three times on each loop: First to check whether the index is in bounds, then when checking whether the element has the class "bar", and then when setting the style.
That's just crazy, and it'll be crazy regardless of what browser it's running on. You clearly want to cache that lookup briefly:
list = $$('.foo');
for (i = 0; i < list.length; ++i) {
    if (list[i].hasClass("bar")) { // I forget what this actually was
        list[i].setStyle({/* ... */});
    }
}
...but taking it further (such as working backward to zero) is pointless; it may be faster on one browser and slower on another.
Here you go:
http://code.google.com/intl/zh-TW/speed/articles/optimizing-javascript.html
I don't think this is really a performance thing, but something to avoid for sure unless you really know what's happening is:
var a = something.getArrayOfWhatever();
for (var element in a) {
    // aaaa! no!! please don't do this!!!
}
In other words, using the for ... in construct on arrays should be avoided. Even when iterating through object properties it's tricky.
Also, my favorite thing to avoid is omitting var when declaring local variables!
