Access ReactJS's internal event mapping - javascript

Since ReactJS uses event delegation, is there a way to access the internal node/events mapping?
It doesn't have to be pretty, wrapping one of the responsible functions (like the equivalent of jQuery's on()) and doing custom housekeeping would work fine and would actually be preferable to accessing that data through an object of some sort.
If I could get the selector, event and handler (optional) that'd be great, as I'm already doing this for jQuery.
The reason I'm asking is because I'm trying to optimize Arachni for ReactJS applications and having prior knowledge of which elements actually have event handlers (and what they are) would result in a much more efficient scan.
In addition, that data would be of great value when scanning web applications for non security reason's as well, you could gather some very interesting stats that way.

Related

*When or why* would you use JavaScript's EventTarget?

I originally was reading this:
https://developer.mozilla.org/en-US/docs/Web/API/EventTarget
But, then I came upon this question here:
How to use JavaScript EventTarget?
However, when or why would a developer need to use EventTarget?
I'm still learning. I know the SO community doesn't like duplicate questions, but I feel this isn't since it's asking a different question than the one I cited above (it only discusses how to use it).
For the same reason you might use EventEmitter in node.js: you have some custom class/object which you would like to emit events.
Events are useful to allow your object to notify other parts of your code that something interesting has happened, without the object actually knowing anything about the code that uses it.
For example, a link on a page (implemented by the browser as an HTMLAnchorElement object) does not need to know what your code does in response to clicks; you simply register a click event handler (by calling EventTarget#addEventListener('click', …)) and when clicks happen, the browser internally calls EventTarget#dispatchEvent(new Event('click')) and your code handles the event.
Your own custom objects can use this pattern for a wide range of things. Perhaps you'd like to notify things that your object's data has changed -- either as a result of the user doing something or a fetch call returning.
This allows you to build code that is easily composable and testable: the emitter doesn't care who is listening or what that code does, and the consumers don't care about the implementation details of the event getting fired.

jQuery .remove() vs Node.removeChild() and external maps to DOM nodes

The jQuery API documentation for jQuery .remove() mentions:
In addition to the elements themselves, all bound events and jQuery
data associated with the elements are removed.
I assume "bound events" here means "event handlers"; documentation for the similar .empty() says:
To avoid memory leaks, jQuery removes other constructs such as data
and event handlers from the child elements before removing the
elements themselves.
It does sound like leaks would ensue if one were to not use these functions and use Node.removeChild() (or ChildNode.remove()) instead.
Is this true for modern browsers?
If so, why exactly can't properties and event handlers be collected once the node is removed?
If not, do we still need to use .data()? Is it only good to retrieve HTML5 data- attributes?
Documentation for jQuery.data() (lower-level function) says:
The jQuery.data() method allows us to attach data of any type to DOM
elements in a way that is safe from circular references and therefore
free from memory leaks. jQuery ensures that the data is removed when
DOM elements are removed via jQuery methods, and when the user leaves
the page.
This sounds an awful lot like a solution to the old IE DOM/JS circular leak pattern which, AFAIK, is solved in all browsers today.
However, a comment in the jQuery src/data.js code (snapshot) says:
Provide a clear path for implementation upgrade to WeakMap in 2014
Which suggests that the idea of storing data strictly associated to a DOM node outside of the DOM using a separate data store with a map is still considered in the future.
Is this just for backward-compatibility, or is there more to it?
Answers provided to other questions like this one also seem to imply that the sole reason for an external map is to avoid cyclic refs between DOM objects and JS objects, which I consider irrelevant in the context of this question (unless I'm mistaken).
Furthermore, I've seen plugins that now set properties on relevant DOM nodes directly (e.g. selectize.js) and it doesn't seem to bother anyone. Is this an OK practice? It certainly looks that way, as it makes removing entire DOM trees very easy. No need to walk it down, no need to clean up any external data store, just detach it from the parent node, lose the reference, and let the garbage collector do its thing.
Further notes, context and rationale to the question:
This kind of capability is especially interesting for frameworks that manage views (e.g. Durandal), which often times have to replace entire trees that represent said views in their architecture. While most of them certainly support jQuery explicitly, this solution does not scale at all. Every component that uses a similar data store must also be cleaned up. In the case of Durandal, it seems they (at least in one occurrence, the dialog plugin - snapshot) rely on Knockout's .removeNode() (snapshot) utility function, which in turn uses jQuery's internal cleanData(). That's, IMHO, a prime example of horrible special-casing (I'm not sure it even works as it is now if jQuery is used in noConflict mode, which it is in most AMD setups).
This is why I'd love to know if I can safely ignore all of this or if we'll have to wait for Web Components in order to regain our long-lost sanity.
"It does sound like leaks would ensue if one were to not use these functions and use Node.removeChild() (or ChildNode.remove()) instead.
Is this true for modern browsers?
If so, why exactly can't properties and event handlers be collected once the node is removed?"
Absolutely. The data (including event handlers) associated with an element is held in a global object held at jQuery.cache, and is removed via a serial number jQuery puts on the element.
When it comes time for jQuery to remove an element, it grabs the serial number, looks up the entry in jQuery.cache, manually deletes the data, and then removes the element.
Destroy the element without jQuery, you destroy the serial number and the only association to the element's entry in the cache. The garbage collector has no knowledge of what the jQuery.cache object is for, and so it can't garbage collect entries for nodes that were removed. It just sees it as a strong reference to data that may be used in the future.
While this was a useful approach for old browsers like IE6 and IE7 that had serious problems with memory leaks, modern implements have excellent garbage collectors that reliably find things like circular references between JavaScript and the DOM. You can have some pretty nasty circular references via object properties and closures, and the GC will find them, so it's really not such a worry with those browsers.
However, since jQuery holds element data in the manner it does, we now have to be very careful when using jQuery to avoid jQuery-based leaks. This means never use native methods to remove elements. Always use jQuery methods so that jQuery can perform its mandatory data cleanup.
"Furthermore, I've seen plugins that now set properties on relevant DOM nodes directly (e.g. selectize.js) and it doesn't seem to bother anyone. Is this an OK practice?"
I think it is for the most part. If the data is just primitive data types, then there's no opportunity for any sort of circular references that could happen with functions and objects. And again, even if there are circular references, modern browsers handle this nicely. Old browsers (especially IE), not so much.
"This is why I'd love to know if I can safely ignore all of this or if we'll have to wait for Web Components in order to regain our long-lost sanity."
We can't ignore the need to use jQuery specific methods when destroying nodes. Your point about external frameworks is a good one. If they're not built specifically with jQuery in mind, there can be problems.
You mention jQuery's $.noConflict, which is another good point. This easily allows other frameworks/libraries to "safely" be loaded, which may overwrite the global $. This opens the door to leaks IMO.
AFAIK, $.noConflict also enables one to load multiple versions of jQuery. I don't know if there are separate caches, but I would assume so. If that's the case, I would imagine we'd have the same issues.
If jQuery is indeed going to use WeakMaps in the future as the comment you quoted suggests, that will be a good thing and a sensible move. It'll only help in browsers that support WeakMaps, but it's better than nothing.
"If not, do we still need to use .data()? Is it only good to retrieve HTML5 data- attributes?"
Just wanted to address the second question. Some people think .data() should always be used for HTML5 data- attributes. I don't because using .data() for that will import the data into jQuery.cache, so there's more memory to potentially leak.
I can see it perhaps in some narrow cases, but not for most data. Even with no leaks, there's no need to have most data- stored in two places. It increases memory usage with no benefit. Just use .attr() for most simple data stored as data- attributes.
In order to provide some of its features, jQuery has its own storage for some things. For example, if you do
$(elem).data("greeting", "hello");
Then, jQuery, will store the key "greeting" and the data "hello" on its own object (not on the DOM object). If you then use .removeChild(elem) to remove that element from the DOM and there are no other references to it, then that DOM element will be freed by the GC, but the data that you stored with .data() will not. This is a memory leak as the data is now orphaned forever (while you're on that web page).
If you use:
$(elem).remove();
or:
$(some parent selector).empty()
Then, jQuery will not only remove the DOM elements, but also clean up its extra shadow data that it keeps on items.
In addition to .data(), jQuery also keeps some info on event handlers that are installed which allows it to perform operations that the DOM by itself can't do such as $(elem).off(). That data also will leak if you don't dispose of an object using jQuery methods.
In a touch of irony, the reason jQuery doesn't store data as properties on the DOM elements themselves (and uses this parallel storage) is because there are circumstances where storing certain types of data on the DOM elements can itself lead to memory leaks.
As for the consequences of all this, most of the time it is a negligible issue because it's a few bytes of data that is recovered by the browser as soon as the user navigates to a new page.
The kinds of things that could make it become material are:
If you have a very dynamic web page that is constantly creating and removing DOM elements thousands of times and using jQuery features on those objects that store side data (jQuery event handlers, .data() on those elements, then any memory leak per operation could add up over time and become material.
If you have a very long running web page (e.g. a single page app) that stays on screen for very long periods of time and thus over time the memory leaks could accumulate.

how to access jquery internal data?

As you may or may not be aware as of jQuery 1.7 the whole event system was rewritten from the ground up. The codebase is much faster and with the new .on() method there is a lot of uniformity to wiring up event handlers.
One used to be able to access the internal events data and investiate what events are registered on any given element, but recently this internal information has been hidden based on the following scenario...
It seems that the "private" data is ALWAYS stored on the .data(jQuery.expando) - For "objects" where the deletion of the object should also delete its caches this makes some sense.
In the realm of nodes however, I think we should store these "private" members in a separate (private) cache so that they don't pollute the object returned by $.fn.data()"
Although I agree with the above change to hide the internal data, I have found having access to this information can be helpful for debugging and unit testing.
What was the new way of getting the internal jquery event object in jQuery 1.7?
In jQuery 1.7, events are stored in an alternate location accessible through the internal $._data() method (but note that this method is documented as for internal use only in the source code, so use it at your own risks and be prepared for it to change or disappear in future versions of the library).
To obtain the events registered on an element, you can call $._data() on that element and examine the events property of the returned object. For example:
$("#yourElement").click(function() {
// ...
});
console.log($._data($("#yourElement")[0]).events);

Is it okay to use data-attributes to store Javascript 'state'

I often use data-attributes to store configuration that I can't semantically markup so that the JS will behave in a certain way for those elements. Now this is fine for pages where the server renders them (dutifully filling out the data-attributes).
However, I've seen examples where the javascript writes data-attributes to save bits of data it may need later. For example, posting some data to the server. If it fails to send then storing the data in a data-attribute and providing a retry button. When the retry button is clicked it finds the appropriate data-attribute and tries again.
To me this feels dirty and expensive as I have to delve into the DOM to then dig this bit of data out, but it's also very easy for me to do.
I can see 2 alternative approaches:
One would be to either take advantage of the scoping of an anonymous Javascript function to keep a handle on the original bit of data, although this may not be possible and could perhaps lead to too much "magic".
Two, keep an object lying around that keeps a track of these things. Instead of asking the DOM for the contents of a certain data-attribute I just query my object.
I guess my assumptions are that the DOM should not be used to store arbitrary bits of state, and instead we should use simpler objects that have a single purpose. On top of that I assume that accessing the DOM is more expensive than a simpler, but specific object to keep track of things.
What do other people think with regards to, performance, clarity and ease of execution?
Your assumptions are very good! Although it's allowed and perfectly valid, it's not a good practice to store data in the DOM. Sure, it's fine if you only have one input field, but, but as the application grows, you end up with a jumbled mess of data everywhere...and as you mentioned, the DOM is SLOW.
The bigger the app, the more essential it is to separate your interests:
DOM Events -> trigger JS functions -> access Data (JS object, JS API, or AJAX API) -> process results (API call or DOM Change)
I'm a big fan of creating an API to access JS data, so you can also trigger new events upon add, delete, get, change.

What is the difference between different ways of Custom event handling in JavaScript?

I came across the below posts about custom event handling in JavaScript. From these articles, there are two ways (at least) of handling/firing custom events:
Using DOM methods (createEvent, dispatchEvent)
How do I create a custom event class in Javascript?
http://the.unwashedmeme.com/blog/2004/10/04/custom-javascript-events/
Custom code
http://www.nczonline.net/blog/2010/03/09/custom-events-in-javascript/
http://www.geekdaily.net/2008/04/02/javascript-defining-and-using-custom-events/
But what is the recommended way of handling (firing & subscribing) custom events?
[Edit] The context for this question is not using any libraries like jQuery, YUI,... but just plain JavaScript
[Edit 2] There seems to be a subtle differences, at least with the error handling.
Dean Edwards ( http://dean.edwards.name/weblog/2009/03/callbacks-vs-events/ ) is recommending the former way for custom event handling. Can we say this a difference?
What your describing is the difference between
A custom event based message passing system
manually firing DOM events.
The former is a way to use events for message passing. An example would be creating an EventEmitter
The latter is simply a way to use the browsers build in DOM event system manually. This is basically using the DOM 3 Event API which natively exists in (competent / modern) browsers.
So the question is simply what do you want to do? Fire DOM events or use events for message passing?
Benchmark showing DOM 3 custom events is 98% slower
The DOM appears to have a huge overhead. It does so because it supports event propagation and bubbling. It does because it supports binding events to a DOMElement.
If you do not need any of the features of DOM3 events then use a pub/sub library of choice.
[Edit 2]
That's all about error handling, how you do error handling is upto you. If you know your event emitter is synchronous then that's intended behaviour. Either do your own error handling or use setTimeout to make it asynchronous.
Be wary that if you've made it asynchronous you "lose" the guarantee that the event handlers have done their logic after the emit/trigger/dispatch call returns. This requires a completely different high level design then the assumption that event emitters are synchronous. This is not a choice to make lightly
There are pros and cons to each. Use what suits you.
The most obvious "con" to using the DOM methods is that they're tied to browser-based deployments and involve the DOM in things even if it doesn't really make sense (e.g., the messaging isn't related to DOM objects), whereas a standard Observer-style implementation can be used in any environment, not just web browsers, and with any generic object you like.
The most obvious "con" to doing your own Observer implementation is that you've done your own Observer implementation, rather than reusing something already present (tested, debugged, optimised, etc.) in the environment.
Many libraries already provide a way to deal with events, and I would recommend on just using an existing library. If you use the DOM, you are assuming that the browser provides the necessary support, whereas if you use a custom mechanism, you might be sacrificing speed for the custom implementation. If you rely on an existing library, the library can choose the appropriate tradeoffs.
For example, Closure provides EventTarget which has an interface simliar to that of the DOM mechanism, but which does not depend on the browser to provide native support for it.

Categories

Resources