Is "monkey patching" really that bad? [closed] - javascript

Some languages like Ruby and JavaScript have open classes which allow you to modify interfaces of even core classes like numbers, strings, arrays, etc. Obviously doing so could confuse others who are familiar with the API but is there a good reason to avoid it otherwise, assuming that you are adding to the interface and not changing existing behavior?
For example, it might be nice to add an Array.map implementation to web browsers which don't implement ECMAScript 5th edition (and if you don't need all of jQuery). Or your Ruby arrays might benefit from a "sum" convenience method which uses "inject". As long as the changes are isolated to your systems (e.g. not part of a software package you release for distribution) is there a good reason not to take advantage of this language feature?

Monkey-patching, like many tools in the programming toolbox, can be used both for good and for evil. The question is where, on balance, such tools tend to be most used. In my experience with Ruby the balance weighs heavily on the "evil" side.
So what's an "evil" use of monkey-patching? Well, monkey-patching in general leaves you wide open to major, potentially undiagnosable clashes. I have a class A. I have some kind of monkey-patching module MB that patches A to include method1, method2 and method3. I have another monkey-patching module MC that also patches A to include a method2, method3 and method4. Now I'm in a bind. I call instance_of_A.method2: whose method gets called? The answer to that can depend on a lot of factors:
In which order did I bring in the patching modules?
Are the patches applied right off or in some kind of conditional circumstance?
AAAAAAARGH! THE SPIDERS ARE EATING MY EYEBALLS OUT FROM THE INSIDE!
OK, so #3 is perhaps a tad over-melodramatic....
Anyway, that's the problem with monkey-patching: horrible clashing problems. Given the highly-dynamic nature of the languages that typically support it you're already faced with a lot of potential "spooky action at a distance" problems; monkey-patching just adds to these.
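To make the clash concrete in JavaScript terms (a hypothetical sketch; both scripts and the sum method are made up), imagine two unrelated scripts that each patch the same built-in:
// script-a.js: adds a sum() that adds numbers
Array.prototype.sum = function () {
  return this.reduce(function (a, b) { return a + b; }, 0);
};
// script-b.js, loaded later: adds a sum() that concatenates
Array.prototype.sum = function () {
  return this.join('');
};
[1, 2, 3].sum(); // 6 or "123"? It silently depends on which script loaded last.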
Having monkey-patching available is nice if you're a responsible developer. Unfortunately, IME, what tends to happen is that someone sees monkey-patching and says, "Sweet! I'll just monkey-patch this in instead of checking to see if other mechanisms might not be more appropriate." This is a situation roughly analogous to Lisp code bases created by people who reach for macros before they think of just doing it as a function.

Wikipedia has a short summary of the pitfalls of monkey-patching:
http://en.wikipedia.org/wiki/Monkey_patch#Pitfalls
There's a time and place for everything, also for monkey-patching. Experienced developers have many techniques up their sleeves and learn when to use them. It's seldom a technique per se that's "evil", just inconsiderate use of it.

With regards to Javascript:
is there a good reason to avoid it otherwise, assuming that you are adding to the interface and not changing existing behavior?
Yes. Worst-case, even if you don't alter existing behavior, you could damage the future evolution of the language's built-in APIs.
This is exactly what happened with Array.prototype.flatten and Array.prototype.contains. In short, the specification was written up for those methods, their proposals got to stage 3, and then browsers started shipping them. But, in both cases, it was found that there were ancient libraries which patched the built-in Array object with their own methods with the same name as the new methods, and had different behavior; as a result, websites broke, the browsers had to back out of their implementations of the new methods, and the specification had to be edited. (The methods were renamed.)
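The general shape of that conflict looks roughly like this (a hedged sketch, not any specific library's actual code):
// An old library unconditionally installs its own deep flatten:
Array.prototype.flatten = function () {
  return this.reduce(function (acc, x) {
    return acc.concat(Array.isArray(x) ? x.flatten() : x);
  }, []);
};
[1, [2, [3]]].flatten(); // [1, 2, 3] with the library's deep flatten
// Years later the standard wants a same-named method that flattens only one
// level by default; pages depending on the old behavior would break, so the
// built-in shipped under a different name (Array.prototype.flat).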
If you mutate a built-in object like Array on your own browser, on your own computer, that's fine. (This is a very useful technique for userscripts.) If you mutate a built-in object on your public-facing site, that's less fine - it may eventually result in problems like the above. If you happen to control a big site (like stackoverflow.com) and you mutate a built-in object, you can almost guarantee that browsers will refuse to implement new features/methods which break your site (because then users of that browser will not be able to use your site, and they will be more likely to migrate to a different browser). (see here for an explanation of these sorts of interactions between the specification writers and browser makers)
All that said, with regards to the specific example in your question:
For example, it might be nice to add an Array.map implementation to web browsers which don't implement ECMAScript 5th edition
This is a very common and trustworthy technique, called a polyfill.
A polyfill is code that implements a feature on web browsers that do not support the feature. Most often, it refers to a JavaScript library that implements an HTML5 web standard, either an established standard (supported by some browsers) on older browsers, or a proposed standard (not supported by any browsers) on existing browsers
For example, if you wrote a polyfill for Array.prototype.map (or, to take a newer example, for Array.prototype.flatMap) which was perfectly in line with the official Stage 4 specification, and then ran code that defined Array.prototype.flatMap on browsers which didn't have it already:
if (!Array.prototype.flatMap) {
  Array.prototype.flatMap = function (callback, thisArg) {
    // ... spec-compliant implementation ...
  };
}
If your implementation is correct, this is perfectly fine, and is very commonly done all over the web so that obsolete browsers can understand newer methods. polyfill.io is a common service for this sort of thing.

As long as the changes are isolated to your systems (e.g. not part of a software package you release for distribution) is there a good reason not to take advantage of this language feature?
As a lone developer on an isolated problem there are no issues with extending or altering native objects. On larger projects this should be a team decision.
Personally I dislike having native objects in JavaScript altered, but it's a common practice and it's a valid choice to make. If you're going to write a library or code that is meant to be used by others, I would strongly avoid it.
It is, however, a valid design choice to allow the user to set a config flag which says "please overwrite native objects with your convenience methods" because they're so convenient.
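A sketch of what that opt-in might look like (the library and method names here are hypothetical, not any real API):
var myLib = {
  sum: function (arr) {
    return arr.reduce(function (a, b) { return a + b; }, 0);
  },
  // Nothing is patched unless the consumer explicitly asks for it.
  extendNatives: function () {
    if (!Array.prototype.sum) {
      Array.prototype.sum = function () { return myLib.sum(this); };
    }
  }
};
myLib.sum([1, 2, 3]);  // always available
myLib.extendNatives(); // opt-in
[1, 2, 3].sum();       // 6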
To illustrate a JavaScript-specific pitfall:
Array.prototype.map = function map() { ... };
var a = [2];
for (var k in a) {
  console.log(a[k]);
}
// 2, function map() { ... }
This issue can be avoided by using ES5's Object.defineProperty, which allows you to inject non-enumerable properties into an object.
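For example, a minimal sketch (assuming an ES5 browser) that keeps the patched method out of for...in loops:
Object.defineProperty(Array.prototype, 'map', {
  value: function (callback, thisArg) {
    // ... implementation ...
  },
  writable: true,
  configurable: true
  // enumerable defaults to false, so for...in won't see it
});
var a = [2];
for (var k in a) {
  console.log(a[k]); // 2 only; the patched map no longer shows up
}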
This is mainly a high level design choice and everyone needs to be aware / agreeing on this.

It's perfectly reasonable to use "monkey patching" to correct a specific, known problem where the alternative would be to wait for a patch to fix it. That means temporarily taking on responsibility for fixing something until there's a "proper", formally released fix that you can deploy.
A considered opinion by Gilad Bracha on Monkey Patching: http://gbracha.blogspot.com/2008/03/monkey-patching.html

The conditions you describe -- adding to (not changing) existing behavior, and not releasing your code to the outside world -- seem relatively safe. Problems could come up, however, if the next version of Ruby or JavaScript or Rails changes its API. For example, what if some future version of jQuery checks to see if Array.map is already defined, and assumes it's the ECMAScript 5 version of map when in actuality it's your monkey-patch?
Similarly, what if you define "sum" in Ruby, and one day you decide you want to use that Ruby code in Rails, or add the Active Support gem to your project? Active Support also defines a sum method (on Enumerable), so there's a clash.


I was thinking about this today and I realized I don't have a clear picture here.
Here are some statements I think to be true (please correct me if I'm wrong):
the DOM is a collection of interfaces specified by W3C.
when parsing HTML source code, the browser creates a DOM tree which has nodes that implement DOM interfaces.
the ECMAScript spec has no reference of browser host objects (DOM, BOM, HTML5 APIs etc.).
how the DOM is actually implemented depends on browser internals and is probably different among most of them.
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
I am curious about what happens behind the scenes when I call document.getElementById('foo'). Does the call get delegated to browser native code by the interpreter or does the browser have JS implementations of all host objects? Do you know about any optimizations they do in regard to this?
I read this overview of browser internals but it didn't mention anything about this. I will look through the Chrome and FF source when I have time, but I thought about asking here first. :)
All of your bullet points are correct, except:
modern JS interpreters use JIT to improve the code performance and translate it to bytecode
should be "...and translate it to native code". SpiderMonkey (the JS engine in Firefox) worked as a bytecode interpreter for a long time before the current JS speed arms race.
On Mozilla's JS-to-DOM bridge:
The host objects are typically implemented in C++, though there is an experiment underway to implement DOM in JS. So when a web page calls document.getElementById('foo'), the actual work of retrieving the element by its ID is done in a C++ method, as hsivonen noted.
The specific way the underlying C++ implementation gets called depends on the API and has also changed over time (note that I'm not involved in the development, so I might be wrong about some details; here's a blog post by jst, who was actually involved in creating much of this code):
At the lowest level every JS engine provides APIs to define host objects. For example, the browser can call JS_DefineFunctions (as demonstrated in the SpiderMonkey User Guide) to let the engine know that whenever script calls a function with the specified name, a provided C callback should be called. Same for other aspects of the host objects (e.g. enumeration, property getters/setters, etc.)
For the core ECMAScript functionality and in some tricky DOM cases the JS engine/the browser uses these APIs directly to define host objects and their behaviors, but it requires a lot of common boilerplate code for e.g. checking parameter types, converting them to the appropriate C++ types, error handling etc.
For reasons I won't go into, let's say historically, Mozilla made heavy use of XPCOM for many of its objects, including much of the DOM. One feature of XPCOM is its binding to JS called XPConnect. Among other things, XPConnect can take an interface definition in IDL (such as nsIDOMDocument; or more precisely its compiled representation), expose an object with the specified properties to the script, and later, when a script calls getElementById, perform the necessary parameter checks/conversions and route the call directly to a C++ method (nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn))
The way XPConnect worked was quite inefficient: it registered generic functions as callbacks to be executed when a script accesses a host object, and these generic functions figured out what they needed to do in every particular case dynamically. This post about quickstubs walks you through one example.
"Quick stubs" mentioned in the previous link is a way to optimize JS->C++ calls time by trading some code size for it: instead of always using generic C++ functions that know how to make any kind of call, the specialized code is automatically generated at the Firefox build time for a pre-defined list of "hot" calls.
Later on the JIT (tracemonkey at that time) was taught to generate the code calling C++ methods as part of the native code generated for "hot" paths in JS. I'm not sure how the newer JITs (jaegermonkey) work in this regard.
With "paris bindings" the objects are exposed to webpage JS without any reliance on XPConnect, instead generating all the necessary glue JSClass code based on WebIDL (instead of XPCOM-era IDL). See also posts by developers who worked on this: jst and khuey. Also see How is the web-exposed DOM implemented?
I'm fuzzy on details of the three last points in particular, so take it with a grain of salt.
The most recent improvements are listed as dependencies of bug 622298, but I don't follow them closely.
JS calls to DOM methods like getElementById cause the JS engine to call into the C++ code that implements the DOM. For example, in Firefox, the call ends up in nsDocument::GetElementById(const nsAString& aId, nsIDOMElement** aReturn).
As you can see, Firefox maintains a hashtable that maps ids to elements in C++ as an optimization in this case, so it doesn't walk the whole DOM tree looking for the id.
The DOM is implemented as a language-independent library pretty much in all major browser implementations, which means it's in a different library from the Javascript engine. For example in IE, the JS engine is implemented in jscript.dll while the DOM is implemented in mshtml.dll. Safari has Nitro(JS) and WebCore(DOM). Chrome has V8(JS) and WebCore(DOM), and Firefox has SpiderMonkey/TraceMonkey(JS) and Gecko(DOM).
What this means is that anytime your JS has to access the DOM, it has to reach over to the DOM library - which is inherently slow because of all the marshaling that has to take place. An analogy that has been used is 2 pieces of land connected by a toll bridge, any time you touch the DOM, you must cross over the bridge and cross back - paying a performance toll.
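A practical consequence of that toll (and the usual advice in the references below) is to cross the bridge as rarely as possible, e.g. by caching DOM references and batching writes; a rough sketch:
// Pays the toll repeatedly: a DOM lookup and a DOM write on every iteration
for (var i = 0; i < 1000; i++) {
  document.getElementById('foo').innerHTML += i + ' ';
}
// Crosses the bridge far less often: one lookup, one write
var el = document.getElementById('foo');
var buffer = '';
for (var j = 0; j < 1000; j++) {
  buffer += j + ' ';
}
el.innerHTML = buffer;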
References
Video: Building High Performance Web Applications and Sites
Book: High Performance Javascript (Chapter 3 on the DOM)

Extending a class or Creating a new function, Which is better? [closed]

If somebody wants to implement a function (e.g. Array.prototype.filter) for an old browser, which method is better? Why? What are the pros and cons of each one?
if (!Array.prototype.filter) {
  Array.prototype.filter = function () {
    // implementation
  };
}
or
function myFilter(arr, callback, thisArg) {
  if (!Array.prototype.filter) {
    // implementation
  } else {
    return Array.prototype.filter.call(arr, callback, thisArg);
  }
}
Personally, I use the first approach (adding Array.prototype.filter) most of the time, but there are times when I don't. I don't use the first approach when writing large-scale frameworks for old browsers, which need to play nice with any other scripts on the internet, including bad ones.
Here are the things to consider. Let's say you shim in Array.prototype.filter using the method you described, and then you add a 3rd party script to your website that does something like this:
var someArray = [ 'a', 'b', 'c' ];
console.log('The first three letters of the alphabet are: ');
for (var i in someArray)
  console.log(someArray[i]);
The result?
The first three letters of the alphabet are:
a
b
c
function () { ... }
Your filter function gets printed out! Now, on a technical level, no one should iterate arrays with a for...in loop, but you can't control what 3rd party scripts might do. This is the main drawback, and when you're writing a large-scale library to be used all over the internet, it's worth considering.
This prevents you from adding anything to Object.prototype, period (with a simple Object.prototype.method = function() { ... } assignment), because for...in is good to use on other types of objects, and this kind of shimming breaks for...in.
But, if you're just writing code to be used on your own website, and you know the scripts you'll be including are high quality and don't make blunders like this, it's a fine technique.
In addition, if your code is only intended for ECMAScript 5 compliant browsers, you can define your shims as non-enumerable, which will prevent them from showing up in for...in loops. I am currently working on a project to shim in ECMAScript 6 methods to ES5 browsers, and since it's only intended to run in ES5 browsers, I can get away with:
Object.defineProperty(Array.prototype, 'contains', {
  value: function (value) {
    // Implementation
  },
  enumerable: false,
  writable: true,
  configurable: true
});
Defining the shim in this way prevents it from showing up in a for...in loop, so it's even safe to use on Object.prototype, provided your implementation accurately follows the spec/draft.
Which brings up a secondary point. If you are not confident you can accurately implement the method as per the spec, you may also be better off not to shim it in. This could break libraries which check for the built-in and use it if available, but fall back to something else if it's not. Better to leave them without anything than with a broken implementation.
It's for this reason that I never try to partially implement a shim. For instance, Object.create cannot be completely implemented in ES3 browsers, so I don't shim this one in ever. I just use my own myLibrary.create function, which may function similarly, but at least any code which may check for the existence of Object.create isn't being misled to believe it has a full, working version.
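For illustration, here's roughly why a full shim isn't possible, and the kind of standalone helper meant here (myLibrary.create is a made-up name, as in the text; it handles only the prototype argument):
var myLibrary = {};
myLibrary.create = function (proto) {
  // Covers only the first argument of Object.create; the second
  // (property-descriptor) argument cannot be implemented in ES3.
  function F() {}
  F.prototype = proto;
  return new F();
};
var base = { greet: function () { return 'hi'; } };
var obj = myLibrary.create(base);
obj.greet(); // 'hi'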
Moral of the story: If you're going to shim, check out some of the really awesome, spec-compliant shims on MDN, and learn to write code like it: Array.prototype.filter
A final note: The reason many libraries don't shim in methods on Array.prototype may be in part to some of these considerations, but there's also a historical reason in many of these cases. Libraries like jQuery were being developed before filter was a standard built-in method. Prototype.js took the approach of adding non-standard methods, which has more drawbacks than those discussed above. jQuery decided not to add non-standard methods to the built-in prototypes, and that's why they have $.filter.
The second approach is more common; it's used by such popular shim JS libraries as jQuery and Underscore.js.
That said, extending native objects is far less dangerous than extending host objects (with the latter it's a typical catch-22: you extend them to provide additional functionality precisely in the browsers that don't handle extending host objects well enough). This article sums it up:
Don’t forget that writing proper, compliant shims is hard. When in doubt, use standalone object. When the method you’re shimming is part of the unfinished spec, use standalone object. Only when you’re certain about method compliance and method is part of the finished, future-proof specification, is it safe to shim native object directly.
For new standard JS functions and classes, the best practice is shimming: extend the class only when the function is not found in old browsers.
But if the function has not become standard, the best practice is to create a new function and not change the standard class, because if you use a function name that may be used in the future with different behavior, it will break your app.

Revisiting extending native prototypes after ECMAScript 5

Recently, given the changes to defining properties in ECMAScript 5, I have revisited the question of whether we can safely extend the native JavaScript prototypes. In truth, all along I have extended prototypes like Array and Function, but I avoided doing so with Object, for the obvious reasons. In unit testing with Jasmine, by adding Object.prototype specs to specs for my own personal framework, extending Object.prototype with non-enumerable functions has appeared to be safe. Data properties like a "type" property, however, with getters/setters that do any unusual processing have had unintended consequences. There is still the possibility of conflicts with other libraries--though in my work, that hardly ever comes up. Nevertheless, as long as the functions are not enumerable, it looks like extending Object.prototype can be safe.
What do you think? Is it safe to extend Object.prototype now? Please discuss.
Extending objects native to JavaScript might have become a little safer, though many collision concerns still stand. Generally, unless you're extending an object to support standardized behavior from a more recent standard, it is still much safer to introduce a wrapper - it is much easier to do things the right way when you're the only one in control.
Speaking of objects native to the environment (DOM elements and nodes, AJAX stuff), the new JS standards still don't give, and arguably can't give, you any guarantee about interaction with those beyond what is defined in their interface standards. Never forget that they're potentially accessible through many different scripting engines and thus need not be tailored to the quirks of one specific language - JS. So the recommendation not to extend those still stands as well.
The definitive, absolute answer is ...
"It depends." :)
Extending any built in JavaScript object can be perfectly safe or it can be a complete disaster. It depends on what you are doing and how you are doing it.
Use smart practices and common sense and test the hell-out-of-it.

Why is it frowned upon to modify JavaScript object's prototypes?

I've come across a few comments here and there about how it's frowned upon to modify a JavaScript object's prototype? I personally don't see how it could be a problem. For instance extending the Array object to have map and include methods or to create more robust Date methods?
The problem is that a prototype can be modified in several places. For example, one library will add a map method to Array's prototype and your own code will add the same but with another purpose. So one implementation will be broken.
Mostly because of namespace collisions. I know the Prototype framework has had many problems with keeping their names different from the ones included natively.
There are two major methods of providing utilities to people..
Prototyping
Adding a function to an Object's prototype. MooTools and Prototype do this.
Advantages:
Super easy access.
Disadvantages:
Can use a lot of system memory. While modern browsers just fetch an instance of the property from the constructor, some older browsers store a separate instance of each property for each instance of the constructor.
Not necessarily always available.
What I mean by "not available" is this:
Imagine you have a NodeList from document.getElementsByTagName and you want to iterate through them. You can't do..
document.getElementsByTagName('p').map(function () { ... });
..because it's a NodeList, not an Array. The above will give you an error something like: Uncaught TypeError: [object NodeList] doesn't have method 'map'.
I should note that there are very simple ways to convert NodeLists and other Array-like objects into real arrays.
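For instance (a quick sketch):
var nodes = document.getElementsByTagName('p');
// Copy the array-like NodeList into a real array
var paragraphs = Array.prototype.slice.call(nodes);
// Or, in modern browsers: var paragraphs = Array.from(nodes);
paragraphs.map(function (p) { return p.textContent; });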
Collecting
Creating a brand new global variable and stockpiling utilities on it. jQuery and Dojo do this.
Advantages:
Always there.
Low memory usage.
Disadvantages:
Not placed quite as nicely.
Can feel awkward to use at times.
With this method you still couldn't do..
document.getElementsByTagName('p').map(function () { ... });
..but you could do..
jQuery.map(document.getElementsByTagName('p'), function () { ... });
..but as pointed out by Matt, in usual use, you would do the above with..
jQuery('p').map(function () { ... });
Which is better?
Ultimately, it's up to you. If you're OK with the risk of being overwritten/overwriting, then I would highly recommend prototyping. It's the style I prefer and I feel that the risks are worth the results. If you're not as sure about it as me, then collecting is a fine style too. They both have advantages and disadvantages but all and all, they usually produce the same end result.
As bjornd pointed out, monkey-patching is a problem only when there are multiple libraries involved. Therefore it's not a good practice if you are writing reusable libraries. However, it still remains the best technique out there to iron out cross-browser compatibility issues when using host objects in JavaScript.
See this blog post from 2009 (or the Wayback Machine original) for a real incident when prototype.js and json2.js are used together.
There is an excellent article from Nicholas C. Zakas explaining why this practice is not something any programmer should rely on in a team or customer project (maybe you can do some tweaks for educational purposes, but not for general project use).
Maintainable JavaScript: Don’t modify objects you don’t own:
https://www.nczonline.net/blog/2010/03/02/maintainable-javascript-dont-modify-objects-you-down-own/
In addition to the other answers, an even more permanent problem that can arise from modifying built-in objects is that if the non-standard change gets used on enough sites, future versions of ECMAScript will be unable to define prototype methods using the same name. See here:
This is exactly what happened with Array.prototype.flatten and Array.prototype.contains. In short, the specification was written up for those methods, their proposals got to stage 3, and then browsers started shipping them. But, in both cases, it was found that there were ancient libraries which patched the built-in Array object with their own methods with the same name as the new methods, and had different behavior; as a result, websites broke, the browsers had to back out of their implementations of the new methods, and the specification had to be edited. (The methods were renamed.)
For example, there is currently a proposal for String.prototype.replaceAll. If you ship a library which gets widely used, and that library monkeypatches a custom non-standard method onto String.prototype.replaceAll, the replaceAll name will no longer be usable by the specification-writers; it will have to be changed before browsers can implement it.
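A hypothetical sketch of how such a divergence could arise (this is not any real library's code):
if (!String.prototype.replaceAll) {
  // Non-standard: treats the first argument as a regular-expression source
  // string, whereas the standard method treats a string argument literally.
  String.prototype.replaceAll = function (pattern, replacement) {
    return this.replace(new RegExp(pattern, 'g'), replacement);
  };
}
'1+1=2'.replaceAll('+', ' plus ');
// standard behavior: '1 plus 1=2'
// patched behavior:  throws, because '+' is not a valid regular expression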

Speed comparison of Cappuccino's objj_msgSend() vs. a normal JavaScript call available?

As you know Cappuccino implements the dispatch mechanism of Objective-C / Smalltalk to send messages to objects (~call their methods) in a special method called objj_msgSend.
[someObject someMethodToInvocate: aParameter];
Obviously this introduces some overhead and therefore a speed loss. I'd like to know if somebody can provide a speed comparison between this message sending and the normal way to execute a method in JavaScript…
someObject.someMethodToInvocate(aParameter);
In your comments you say you're wondering 'in general' in the context of Cappuccino applications. In that case the test is easy: run any Cappuccino application, such as GitHub Issues, and judge for yourself if it's slow or not. Try scrolling in the main table, select a few entries and so on. That'll tell you if Cappuccino is fast or slow 'in general', as objj_msgSend is used extensively in any use case you can think of in an application like this.
If you're actually thinking of something more specific after all, note that nothing about Cappuccino forces you to use message passing. Just like in Objective-C you can always 'drop down to the metal' - pure JavaScript in this case - when you need to do something more performance intensive. If you have a tight loop, and you don't require the additional functionality provided by objj_msgSend, simply call functions directly. Objective-J won't mind.
In my simple tests of pure method calling, objj_msgSend is about 2–2.5 times slower than a direct call.
That is actually quite good, given the advanced features it makes possible.
This is coming two years too late, but this is a slightly invalid question (in no way saying that makes it a bad question). There is really no point questioning the speed of objj_msgSend, not when you are assuming that it is a Smalltalk/Obj-C/Obj-J specific feature.
Javascript has ALWAYS had this ability.
Lookup: the call() AND apply() methods... (a quick google search will bring up articles like this -> http://vikasrao.wordpress.com/2011/06/09/javascripts-call-and-apply-methods/ )
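For reference, a minimal sketch of what call() and apply() do (plain JavaScript, nothing Cappuccino-specific; the names are made up):
function describe(prefix) {
  return prefix + ': ' + this.name;
}
var obj = { name: 'example' };
// Invoke describe with obj bound as `this`
describe.call(obj, 'label');    // "label: example"
describe.apply(obj, ['label']); // same, but arguments are passed as an array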
It is the same issue with jQuery/Prototype/etc.: they are all fine and dandy and useful, but they hurt the development community because everyone relies on these frameworks instead of learning the core language features that make any language useful.
Do yourself and the development community a favor and LEARN YOUR LANGUAGES, NOT FRAMEWORKS. If you know the languages you use, the frameworks you use are irrelevant, use them or just build them yourself, because at that point you should be able to.
Hope that came off as helpful and not condescending; that's not my intention. :)
