JavaScript: Registering *every* DOM element on creation/appending to the DOM

In order to maintain a correct and highly responsive GUI overlay on each website, I need to register and analyze every relevant DOM element as soon as possible. Currently I am using a MutationObserver, which does this work for me; simplified, it looks like this:
var observer = new MutationObserver(function (mutations) {
    mutations.forEach(function (mutation) {
        if (mutation.type == 'childList') {
            var nodes = mutation.addedNodes;
            var n = nodes.length;
            for (var i = 0; i < n; i++) {
                if (nodes[i].nodeType == 1) { // ELEMENT_NODE
                    // Save it in specific Array or something like this
                    AnalyzeNode(nodes[i]);
                }
            }
        }
    });
});

var config = {subtree: true, childList: true};
observer.observe(document, config);
But I've come to realize that the MutationObserver isn't calling AnalyzeNode for every node contained in the DOM. When an already complete (sub)tree is created outside of the DOM (e.g. by executing an external JS script on page load) and its root is appended to the DOM, mutation.addedNodes will only contain the subtree's root; all of its children go unnoticed (because no further mutations take place there), so they end up part of the DOM without ever having been analyzed.
I had the idea of checking whether an appended node already has childNodes, to identify it as the root of an appended subtree. Unfortunately, it seems that every added node may already have children by the time the MutationObserver's callback is called, so no distinction is possible this way.
I really don't want to re-check every child of an added node at the moment the parent is processed by the MutationObserver. Most of the time the child will be processed by the MutationObserver anyway, when it appears in addedNodes of another mutation, so the overhead seems unnecessarily high.
Furthermore, I thought about a Set of nodes whose children have to be analyzed outside of a MutationObserver call. If an added node has children at the moment it is appended to the DOM, the node is added to the Set. When another mutation takes place and one of its children is part of addedNodes, the child removes its parent from the Set via mutation.target, which is the parent node (mutation.type has to be childList). The problem with this approach is the timing of when to check the children of the nodes in the Set (and the fact that I could just query document.getElementsByTagName for every relevant element type instead of maintaining a Set, but the timing problem remains). Keep in mind that it should happen as soon as possible to keep the overlay responsive and fitting the website. A combination of document's onreadystatechange and the appending of new script nodes to the DOM (as an indicator of when external JS code is executed) might work even for websites that recreate parts of their content (I am looking at you, DuckDuckGo search result page). But it feels like a workaround that won't cover 100% of the cases.
So, is there another, more efficient way? Or might any of these approaches be sufficient if slightly changed? Thanks a lot!
(Please try to avoid jQuery in example code where possible, thank you. And by the way, I am using CEF, so the best case would be a solution that works with WebKit/Blink.)
EDIT1: Website rendering is done internally by CEF, and GUI rendering is done in C++/OpenGL using information obtained by the JavaScript code mentioned above.

It seems your actual goal is to detect layout changes in the rendered output, not (potentially invisible) DOM changes.
On Gecko-based browsers you could use MozAfterPaint to get notified of the bounding boxes of changed areas, which is fairly precise but has a few gaps, such as video playback (which changes the displayed content but not the layout) or asynchronous scrolling.
Layout can also be changed via the CSSOM, e.g. by manipulating a <style> element's sheet.cssRules. CSS animations, already mentioned in the comments, are another thing that can affect layout without mutations. And possibly SMIL animations.
So using mutation observers alone may be insufficient anyway.
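For instance, a style-rule change made through the CSSOM shifts layout without producing any DOM mutation (a minimal illustration; the selector and rule are made up):

// Assumes the page already contains at least one <style> element.
var sheet = document.querySelector('style').sheet;

// This moves every matching element, yet no MutationObserver
// (childList / attributes / characterData) will ever fire for it.
sheet.insertRule('#overlay-target { margin-left: 50px; }', sheet.cssRules.length);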
If your overlay has some exploitable geometric properties, then another possibility might be sampling the parts of the viewport that are important to you via document.elementFromPoint and calculating bounding boxes of the found elements and their children until you have whatever you need. Scheduling it via requestAnimationFrame() means you should be able to sample the state of the current layout on every frame, unless it's changed by other rAF callbacks running after yours.
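A rough sketch of that sampling idea, assuming you only care about a handful of probe points (the points and the bookkeeping are placeholders):

// Hypothetical probe points (viewport coordinates) that matter for the overlay.
var probePoints = [
    { x: 100, y: 100 },
    { x: 300, y: 200 }
];
var lastSample = [];

function samplePoints() {
    var sample = probePoints.map(function (p) {
        var el = document.elementFromPoint(p.x, p.y);
        return el ? el.getBoundingClientRect() : null;
    });

    // Compare with lastSample and update the overlay where the boxes differ
    // (comparison and overlay code omitted).
    lastSample = sample;

    requestAnimationFrame(samplePoints); // sample again on the next frame
}

requestAnimationFrame(samplePoints);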
In the end most available methods seem to have some gaps or need to be carefully tweaked to not hog too much CPU time.
Or might any of these approaches be sufficient if slightly changed?
Combining tree-walking of observed mutations with a WeakSet of already-processed nodes may work with some more careful filtering (see the sketch after this list):
- having already visited a node does not automatically mean you can skip its children
- but having visited a child without it being a mutation target itself should mean you can skip it
- removal events mean you must remove the entire subtree, node by node, from the set (or just clear the set), since the nodes might be moved to another point in the tree
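A minimal sketch of that combination, reusing AnalyzeNode from the question; the removal rule above is only noted in a comment here:

var seen = new WeakSet(); // nodes already passed to AnalyzeNode

function processTree(root) {
    // Walk the whole appended subtree, since addedNodes only contains its root.
    var walker = document.createTreeWalker(root, NodeFilter.SHOW_ELEMENT);
    var node = root;
    while (node) {
        if (!seen.has(node)) {
            seen.add(node);
            AnalyzeNode(node);
        }
        node = walker.nextNode();
    }
}

var observer = new MutationObserver(function (mutations) {
    mutations.forEach(function (mutation) {
        if (mutation.type !== 'childList') return;
        for (var i = 0; i < mutation.addedNodes.length; i++) {
            var added = mutation.addedNodes[i];
            if (added.nodeType === 1) processTree(added);
        }
        // Removed subtrees are not un-registered here; walk mutation.removedNodes
        // (or clear the WeakSet) if nodes can be re-inserted elsewhere.
    });
});

observer.observe(document, { subtree: true, childList: true });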

MutationRecords seem to be listed in the order in which the changes happened (can be easily verified).
Before you run your AnalyzeNode(nodes[i]) algorithm, you can run an AnalyzeChanges(mutations) step that determines the overall change that happened.
For example, if you see the same node 10 times in addedNodes, but only 9 times in removedNodes, then you know that the net result is that the node was ultimately added to the DOM.
Of course it may be more complicated than that; you will have to detect added subtrees, and nodes that may then have been removed from or added back to those subtrees, etc.
Then finally, once you know what the net change was, you can run AnalyzeNode(nodes[i]).
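A minimal sketch of such an AnalyzeChanges step, simply counting how often each node appears in addedNodes versus removedNodes (detecting changes inside added subtrees, as described above, is not handled here):

function AnalyzeChanges(mutations) {
    var balance = new Map(); // node -> (#times added - #times removed)

    function bump(nodes, delta) {
        for (var i = 0; i < nodes.length; i++) {
            var node = nodes[i];
            balance.set(node, (balance.get(node) || 0) + delta);
        }
    }

    mutations.forEach(function (mutation) {
        if (mutation.type !== 'childList') return;
        bump(mutation.addedNodes, +1);
        bump(mutation.removedNodes, -1);
    });

    // Nodes with a positive balance were ultimately added; the rest either
    // ended up removed or cancelled out (removed and re-added).
    var netAdded = [];
    balance.forEach(function (count, node) {
        if (count > 0 && node.nodeType === 1) netAdded.push(node);
    });
    return netAdded;
}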
I'm thinking about doing this to observe an entire <svg> tree and to render it (and re-render it when changes happen) in WebGL.
It may be tricky, because imagine the following happens synchronously by some user (you don't know what he/she will do) who is manipulating the DOM:
- a subtree is added (queues a record with the root node in addedNodes)
- a subtree of that subtree is removed (queues a record)
- then it is appended somewhere else, outside of the first subtree (queues another record, oh boy)
- an element is removed from that other subtree (queues a record)
- and added back to the original subtree (queues a record)
- etc., etc.
Finally, you receive a list of MutationRecords that details all those steps.
We could loop through all the records and basically recreate a play-by-play of what happened, then figure out the final net changes.
After we have those net changes, it'll be like having a list of records, but a simpler one (for example, removing and then re-adding a node cancels out, so we don't really care about it, because the end result for that node is basically nothing).
People have tried to tackle the problem of detecting changes between trees.
Many of those solutions are associated with the terms "virtual dom" and "dom diffing" as far as web goes, which yield results on Google.
So, instead of doing all that mutation analysis (which sounds like a nightmare, though if you do it I would say please please please publish it as open source so people can benefit), we can possibly use a diffing tool to find the difference between the DOM before the mutation and the DOM at the time the MutationObserver callback is fired.
For example, virtual-dom has a way to get a diff. I haven't tried it yet. I'm also not sure how to create VTrees from DOM trees to pass them into the diff() function.
Aha! It may be easier to get a diff with diffDOM.
Getting a diff might present the simplest changeset needed to transition from one tree to another, which might be much easier than analyzing a mutation record list. It's worth trying out. I might post back what I find when I do it with the <svg>s...
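For reference, a rough sketch of how a diff with diffDOM could look, assuming the diffDOM browser build exposes a DiffDOM constructor with diff()/apply() as in its documentation; the snapshot bookkeeping around it is my own idea:

var dd = new diffDOM.DiffDOM(); // assumption: global from the diffDOM browser bundle
var target = document.querySelector('svg');
var before = target.cloneNode(true); // "before" snapshot of the observed tree

var observer = new MutationObserver(function () {
    // Diff the old snapshot against the current state of the tree.
    var changes = dd.diff(before, target);
    console.log(changes); // list of elementary changes between the two trees

    before = target.cloneNode(true); // new baseline for the next batch
});

observer.observe(target, {
    subtree: true,
    childList: true,
    attributes: true,
    characterData: true
});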

Related

Vaadin: re-adding a component creates a new element in the DOM

In Vaadin, re-adding a component that was previously removed creates a new element in the DOM.
Let's look at it in detail:
Button button = new Button("test");
button.getElement().executeJs("""
    this.addEventListener("click", event => {
        alert("hello");
    });
    """);
add(button);
Now, after some event on the server, we decide to remove the component from the view, so the corresponding element in the DOM gets removed.
Then, after another event, we add the button component again, so Vaadin creates a new element on the client and adds it to the DOM. (The new element is missing the event listener.)
What I would expect to happen is that Vaadin reuses the same element that existed before, but it does not. Normally this would not really matter, but in our case we added an event listener with JS. (Yes, we could add event listeners on the Java side, but let's suppose that we really need to do it in JS because we want to execute some code on the client.)
Why is Vaadin doing this, and is there an option so that Vaadin always reuses the same element?
In pure JS I could easily just keep a lookup table with the elements that I removed, and later use the elements from the lookup table to add them to the DOM again. Doing this would keep all the event listeners of the element.
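For illustration, a small sketch of that pure-JS idea (the ids and helper names are made up): re-appending the very same node object keeps its listeners.

const removedElements = new Map(); // id -> detached element

function detach(id) {
    const el = document.getElementById(id);
    removedElements.set(id, el);
    el.remove(); // the element (and its listeners) survives, referenced by the map
}

function reattach(id, parent) {
    // Same node object as before, so all listeners added via addEventListener remain.
    parent.appendChild(removedElements.get(id));
    removedElements.delete(id);
}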
What really perplexes me is that even though the element in the DOM is different every time, the Element I get with component.getElement() is always the same. Isn't this element supposed to represent the element on the client side?
Of course we could just run the same JS on the element every time we add it to the view, but that is quite cumbersome.
Is Vaadin doing this for performance reasons? What are your explanations for this behaviour?
This is indeed a mechanism to avoid leaking memory. A mechanism based on server-side reference tracking would be significantly more complex, work with a delay (because the reference is cleared only when GC runs), and make it more difficult for the developer to control what happens. The current design makes it easy for the developer to choose what should happen: hide to preserve it in the browser, detach to let it be garbage collected.
I could also clarify that the same DOM element is reused in cases when the component is detached and then attached back again during the same server visit.

JavaScript MutationObserver. Observing a child element after observing a parent element triggers no events

Given this sample code:
function someMethod(elements) {
    var observer = new MutationObserver(function(events) {
        SomeLib.each(events, function(event, k, i) {
            if (event.removedNodes) {
                SomeLib.each(event.removedNodes, function(removedElement, k, i) {
                    console.log(222, removedElement);
                });
            }
        });
    });
    SomeLib.each(elements, function(element, k, i) {
        console.log(111, element);
        observer.observe(element, {
            childList : true,
            subtree : false
        });
    });
}
I've noticed that if I call someMethod(parentElement) and then later call someMethod(parentElement.querySelector('someChildElement')), only the first call triggers events; it appears as if the second call does not trigger any events at all.
This is unfortunate, as I am mostly interested in an event when the actual node is removed. Nothing else. Child nodes are really not of interest either, but at least one of the childList/attributes/characterData options has to be true, so I am forced to observe them, I guess.
I cannot organize my code around keeping track of whose parent is already tracked or not, and therefore I would have found it much easier to simply listen for a remove event on any particular node, whichever way it eventually gets deleted.
Considering this dilemma, I am thinking of registering a MutationObserver on the document element and instead relying on detecting the element I wish to observe myself, through my own handler.
But is this really my best option?
Performance is obviously a concern, since everything will fire this document-level observer, but perhaps having just one MutationObserver is potentially efficient, since I will only trigger my own function when I detect an element of interest.
It requires iteration, however, over removedNodes (and potentially addedNodes), so it has a real effect on everything, rather than affecting just the node I observe.
This begs the question: is there not already a global mutation observer registered somewhere?
Do I really have to manually observe the document myself?
What if other libraries also start to observe things similarly either on body or child elements?
Won't I destroy their implementation? (Not that I have such a dependency.) But it is worrying how horrible this implementation really seems to be; not surprising, considering how everything has been horrible with the web since the dawn of day. Nothing is ever correctly implemented. Thank you, W3C.
Is MutationObserver really the way to go here? Perhaps there is some node.addEventListener('someDeleteEvent') I can listen to instead?
Why are we being steered away from DOMNodeRemoved-like events, and can we really make the replacement? Especially since the performance penalty of using MutationObserver seems real, I wonder why "smart people" everywhere are recommending against DOMNodeRemoved.
They are not the same. What is the idea behind deprecating those anyway, since the replacement seems kind of useless and potentially problematic to use?
For now, I have already implemented this global document listener, which lets me detect only the nodes I am interested in and fire the functions I want when they are found. However, performance might take a hit; I am not sure.
I am considering scrapping the implementation and instead rely on "deprecated" DOMNodeRemoved regardless unless someone can chip in with some thoughts.
My implementation simply registers on document and then basically checks each element to see whether it has the custom event key on it, and fires it if it does. Quite efficient, but it requires iterating over the removed/added nodes of each mutation observed across the entire document, roughly like the sketch below.
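A rough sketch of such a global removal watcher, under the assumptions described above (onRemovedCallback is a made-up name for the "custom event key"):

// Mark elements of interest beforehand, e.g. el.onRemovedCallback = fn;
var removalObserver = new MutationObserver(function (mutations) {
    mutations.forEach(function (mutation) {
        for (var i = 0; i < mutation.removedNodes.length; i++) {
            checkRemoved(mutation.removedNodes[i]);
        }
    });
});

function checkRemoved(node) {
    if (node.nodeType !== 1) return;
    // Fire the custom callback if this element was being watched.
    if (typeof node.onRemovedCallback === 'function') {
        node.onRemovedCallback(node);
    }
    // A removed subtree only shows its root in removedNodes, so walk descendants too.
    for (var i = 0; i < node.children.length; i++) {
        checkRemoved(node.children[i]);
    }
}

removalObserver.observe(document, { subtree: true, childList: true });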

Does node.removeChild(node.firstChild) create a memory leak?

MDN says this is one way to remove all children from a node. But since only the first child node is referenced in code, do the others become memory orphans? Is anything known about whether this is the case in any or all browsers? Is there something in the DOM standard that calls for garbage collection when doing this?
I guess you are referring to this example
// This is one way to remove all children from a node
// box is an object reference to an element with children
while (box.firstChild) {
    // The list is LIVE so it will re-index each call
    box.removeChild(box.firstChild);
}
No, it does not cause a memory leak.
What happens is that after the 1st child is removed, the 2nd one takes its place as the 1st child, and so on until there are no more children left.
Also, garbage collection cannot usually be requested on demand; the virtual machine will do it when it thinks it can, and that differs between browsers.

Can the DOM be differentially updated?

First and foremost, I've done extensive research about this, under different names that I think could apply such as "Javascript differential templating", "Javascript update DOM without reparsing", "Javascript render UI using deltas" and other variations. Pardon me if I missed an existing thread that covers my question.
Essentially, I would first like to know if most DOM parsers in browsers do the following already, even though I'm fairly sure the answer is no: do they update the DOM differentially (i.e. only the nodes that have changed in the same tree since the last update) when a node is modified? Like I said, I figure the answer is no and they actually reparse and rerender the updated node and everything in its tree.
Which brings me to my question: is there any Javascript library that allows to manage differential updates to a data model and to the DOM?
I realize I might not be really clear about this, so I will provide some code to explain what I mean: http://jsfiddle.net/btZ3e/6/
In that example, I have an "event queue" (which is really a timeline) with events in it. UserEvents all have a unique ID. The way it works now is that UserEvents can execute() and undo(): in the former they modify data in memory (myAppManager.dataModel) and append a <p> to the DOM, while in the latter they undo these changes. (Each UserEvent's undo() is defined within the execute() of the same UserEvent so as to allow more flexibility; one could consider moving events around independently.)
Then, there is myAppManager.render() :
var myAppManager = new function () {
    this.dataModel = {
        someValue: 0,
        disableButton: false
    };
    this.render = function () {
        $('#displaysomevalue').text(this.dataModel.someValue);
        $('#go').prop('disabled', this.dataModel.disableButton);
    };
};
How would it be possible (if it is at all) for myAppManager.render() to only update what has changed since the last update? I reckon this would mean I would need some sort of differentiation system in my data model too. Ultimately I'm wondering about this because I'm going to be receiving multiple new UserEvents per second (let's say 20-30 per second at worst?) via WebSockets, and I was wondering if I would need to re-render my whole UI for every new piece of data I get. I looked into JavaScript templates to see how they do it, and it seems they all just go this route:
document.getElementById('someTemplateContainer').innerHTML = someTemplateEngine.getHtmlOutput();
I doubt, however, that they need to refresh as often as I do in some instances. Is there prior work on this? Did I miss anything? Thank you very much!
The way Backbone.js, as an example, does this is that models (name:value pairs basically) are backed by a view/template, and that models have events associated with them like change. Let's say you have a <ul> where each <li> is one Backbone view, backed by a model.
You could bind every model's change event to re-render its own view (and ONLY its own view). So when the 5th <li> gets its name changed, it will re-render just the contents of that <li>, and the rest of the <ul> is undisturbed.
That lets only new or updated models have their DOM nodes touched and updated.
The difference is that you don't need to know "which parts of the whole <ul> have changed and just render those", because you've decomposed the problem into a series of smaller ones, each of which is responsible for its own rendering and updating logic. (I'm sure other frameworks have similar patterns, and you can do them in vanilla JS too, no doubt.)
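A minimal sketch of that pattern in Backbone (the model/view names are made up; templates and error handling omitted):

// One view per <li>, each backed by its own model.
var ItemView = Backbone.View.extend({
    tagName: 'li',

    initialize: function () {
        // Re-render ONLY this view when its model changes.
        this.listenTo(this.model, 'change', this.render);
    },

    render: function () {
        this.$el.text(this.model.get('name'));
        return this;
    }
});

var item = new Backbone.Model({ name: 'first' });
var view = new ItemView({ model: item });
document.querySelector('ul').appendChild(view.render().el);

// Later: only this one <li> gets touched; the rest of the <ul> is undisturbed.
item.set('name', 'renamed');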

DOM MutationObservers: How to support this one important use of DOM3 Mutation Events?

I get wordy sometimes: tl;dr: read the bold text.
The motivation behind deprecating Mutation Events is well understood; their efficacy in achieving many types of tasks is questionable.
However, today, I have discovered a use for them that is highly dependent on those very same undesired properties.
I will first present the question, and then present the reasons that lead me to the question, because the question will be absurd without it.
Is it possible to use the new Mutation Observers in a way that we can have the VM stop at the instant of the change (like the DOM3 Mutation Events do), rather than report it to me after the fact?
Basically, the very thing that makes the Mutation Observer performant and "reasonable" is its asynchronicity, which means (necessarily, it seems) throwing away the stack, pushing a record mutation to a list, and delivering the list to qualified Observers at the next tick or several ticks later.
What I am after is precisely that stack trace of the DOM3 Mutation Event. I really, really hope this will work: basically, the Mutation Event callback (which I am allowed to write) will have a stack trace that leads me back to the actual code that created the element I'm listening for. So in theory I'd write a Mutation Event handler like this:
// NOT in an onload cb
$("div#haystack").on('DOMNodeInserted', function(evt) {
    if (is_needle(evt.target)) {
        report(new Error().stack); // please, Chrome, tell me what code created the needle
    }
});
This gives me the golden answer.
It seems that Mutation Observers will make it impossible to extract this information. What, then, am I to do once Mutation Events are completely taken out? They have been deprecated for a while now.
Now, to explain a little better the real actual circumstances, and why this matters.
I have been trying to kill a bug which I describe here: I have built a full-DOM serializer which nicely spits back out every element that exists on the webpage, and when comparing them, the broken page and the working page are identical. I have tested this and it is pretty thorough; it captures every little thing that's different. Whatever hovery-thing my mouse happens to be over, the CSS class that gets consequently set will be reflected in the HTML dump. Any text of any form on the page will show up if you search for it (provided it doesn't span across elements). All inline JS (and more importantly, all differences between inline JS) is present.
I have then gone on to verify that the broken page is missing several event handlers. So none of the clickable items respond to hover or clicks, and therefore no useful work can be done on the interactive form. This is not known to be the only problem, but it does fully explain the behavior. Given that the DOM has no differences in inline JS that explains the difference in behavior, then it must be the case that either the content of the linked resources or the invisible properties of elements (event handlers being in this category) are causing the difference in behavior.
Now I know which elements are supposed to have handlers, but I know not where in the comically large code base (ballpark: 200K lines of JS all loaded as one resource, assembled by several M lines of Perl serverside code) lies the code that assigns the events.
I have tried JS methods to watch modifications of object properties, such as this one (there are many, but they all work on the same principle of defining setters and getters), which works the first time and then subsequently breaks the app. Apparently assigning setters and getters causes the system to stop functioning. It's not clear to me how I can take that approach of watching property assignments to the point where I can get a list of code points that hit a specific element. It might be feasible, but surely not if I can only fire it once and it breaks everything thereafter.
So watching variables with JS is out.
I might be able to manually instrument jQuery itself, so that when my is_needle() succeeds on the element processed by jQuery, I log all event-related functions performed by jQuery on that element. This is dreadful, and I will resort to this if my Mutation Observer approach fails.
There are yet more ways to skin the cat of course. I could use the handy getEventListeners() on my target element when it is working to get the list of event listener functions that are on it, and then look at the code there, and search the code base to find those functions, and then analyze the code to find out all the places there those functions are inserted into event handlers. That is actually pretty straightforward.
Now I know which elements are supposed to have handlers, but I know not where in the comically large code base (ballpark: 200K lines of JS all loaded as one resource, assembled by several M lines of Perl serverside code) lies the code that assigns the events.
Have you considered simply instrumenting .addEventListener function calls one way or another, e.g. via debugger breakpoints or by modifying the DOM element prototype to replace it with a wrapper method? This would be browser-specific but should be sufficient for your debugging needs.
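A rough sketch of the prototype-wrapping idea, reusing is_needle from the question and assuming the browser exposes addEventListener on EventTarget.prototype (Chrome/Blink does); this is debugging-only code, not something to ship:

// Wrap addEventListener so every registration on a "needle" element is logged.
var originalAddEventListener = EventTarget.prototype.addEventListener;

EventTarget.prototype.addEventListener = function (type, listener, options) {
    if (this instanceof Element && is_needle(this)) {
        // The stack trace points back to the code that registered the handler.
        console.log('addEventListener("' + type + '") on needle:', new Error().stack);
    }
    return originalAddEventListener.call(this, type, listener, options);
};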
You might also want to try Firefox's tracer, available in nightlies I think. It basically records function execution without the need to set breakpoints or instrument code.
