I am trying to understand how the DOM is rendered and how resources are requested and loaded from the network. However, the resources I have found online use the terms DOM parsing, loading, rendering, and ready, and I can't seem to grasp the order of these 'events'.
When a script, CSS, or img file is requested from the network, does it stop only the rendering of the DOM, or does it stop parsing as well? Is DOM loading the same as DOM rendering? And is the DOMContentLoaded event equivalent to jQuery.ready()?
Can someone please explain if some of these terms are synonymous and in what order they happen?
When you open a browser window, that window needs to have a document loaded into it for the user to see and interact with. But, a user can navigate away from that document (while still keeping the same window open) and load up another document. Of course, the user can close the browser window as well. As such, you can say that the window and the document have a life-cycle.
The window and the document are accessible to you via object APIs and you can get involved in the life-cycles of these objects by hooking up functions that should be called during key events in the life-cycle of these objects.
The window object is at the top of the browser's object model - it is always present (you can't have a document if there's no window to load it into) and this means that it is the browser's Global Object. You can talk to it anytime in any JavaScript code.
When you make a request for a document (that would be an HTTP or HTTPS request) and the resource is returned to the client, it comes back in an HTTP or HTTPS response - this is where the data payload (the text, html, css, JavaScript, JSON, XML, etc.) lives.
Let's say that you've requested an .html page. As the browser begins to receive that payload it begins to read the HTML and construct an "in-memory" representation of the document object formed from the code. This representation is called The Document Object Model or the DOM.
The act of reading/processing the HTML is called "parsing", and when the browser is done doing this, the DOM structure is complete. This key moment in the life-cycle triggers the document object's DOMContentLoaded event, which signifies that the document is fully formed and can be interacted with from script. This is effectively the moment that jQuery's $(document).ready() waits for.
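As a rough illustration of hooking into that moment without jQuery, here is a minimal sketch. The helper name onDomReady is made up for this example, and unlike jQuery it calls the callback synchronously when parsing has already finished.

```javascript
// Runs `callback` once the DOM has been fully parsed.
// `doc` is any object shaped like `document` (readyState + addEventListener).
// onDomReady is a hypothetical helper name, not a browser or jQuery API.
function onDomReady(doc, callback) {
  if (doc.readyState === "loading") {
    // Parsing is still in progress: wait for the event, and only once.
    doc.addEventListener("DOMContentLoaded", () => callback(), { once: true });
  } else {
    // "interactive" or "complete": the parser has already finished.
    callback();
  }
}

// Only wire it up when a real document exists (i.e. in a browser).
if (typeof document !== "undefined") {
  onDomReady(document, () => console.log("DOM parsed"));
}
```

Checking readyState first matters because a script that runs late (for example one injected after the page has loaded) would otherwise wait for an event that has already fired.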
But, before going on, we need to back up a moment... As the browser is parsing the HTML, it also "renders" that content to the screen, meaning that space in the document is allocated for the element and its content, and that content is displayed. This doesn't happen AFTER all parsing is complete; the rendering engine works at the same time as the parsing engine, just one step behind it. If the parsing engine parses a table row, for example, the rendering engine will then render it. However, when it comes to things like images, although the image element may have been parsed, the actual image file may not yet have finished downloading to the client. This is why you may sometimes initially see a page with no images and then, as the images begin to appear, the rest of the content on the page has to shift to make room for them: the browser knew there was going to be an image, but it didn't necessarily know how much space the image was going to need until it arrived.
CSS files, JS files, images and other resources required by the document download in the background, but most browsers/operating systems cap how many HTTP requests can be in flight simultaneously. I know that on Windows, the registry has a setting for IE that caps that at 10 requests at a time, so if a page has 11 images in it, the first 10 will download at the same time, but the 11th will have to wait. This is one of the reasons it is suggested that it's better to combine multiple CSS files into one file and to use image sprites rather than separate images: to reduce the overall number of HTTP requests a page has to make.
When all of the external resources required by the document have completed downloading (CSS files, JavaScript files, image files, etc.), the window will receive its "load" event, which signifies that not only has the DOM structure been built, but all resources are available for use. This is the event to tap into when your code needs to interact with the content of an external resource, because it must wait for that content to arrive before consuming it.
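To see the two events in order, something like the following sketch can be dropped into a page. The function name traceLifecycle is an assumption made for this example; the event names are the ones discussed above.

```javascript
// Records the order in which the two lifecycle events are observed.
// traceLifecycle is a hypothetical helper; `doc` and `win` are objects
// shaped like `document` and `window`.
function traceLifecycle(doc, win, log) {
  doc.addEventListener("DOMContentLoaded", () => log.push("DOMContentLoaded"));
  win.addEventListener("load", () => log.push("load"));
  return log;
}

// In a browser, print the observed order once everything has loaded.
if (typeof window !== "undefined" && typeof document !== "undefined") {
  const log = traceLifecycle(document, window, []);
  window.addEventListener("load", () => console.log(log.join(" -> ")));
}
```

The script has to run while the document is still being parsed (for example from a script tag in the head), otherwise DOMContentLoaded may already have fired and will not be recorded.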
Now that the document is fully loaded in the window, anything can happen. The user may click things, press keys to provide input, scroll, etc. All these actions cause events to trigger and any or all of them can be tapped into to launch custom code at just the right time.
When the browser window is asked to load a different document, there are events that are triggered that signify the end of the document's life, such as the window's beforeunload event and ultimately its unload event.
This is all still a simplification of the total process, but I think it should give you a good overview of how documents are loaded, parsed and rendered within their life-cycle.
I'm implementing a front-end application using Stencil.js. We intend to nest our Stencil application on every page of our web application, which all use different technologies. We have some pages that are Angular apps, others are made with React, etc.
Within the Stencil application, I want to modify the "body" element on the background web pages so that I can disable the scroll bar at various times. I can easily make a call to "document.body", but I don't know if this is something that will always be available depending on the type of web page. Does every web page contain a DOM, as well as a "body" element, regardless of what technology was used?
In case I need to clarify this, I'm talking about visual web pages loaded in web browsers.
Generally speaking yes, but there are exceptions.
document.body is a getter that looks up the body element every time you refer to it in code. If your code executes before the body element has been created, or if the document has no body element at all, you will get null. The common cases where document.body might not be ready yet are a synchronous script tag in the head element, or a browser extension that is set up to execute before the page is loaded.
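A defensive way to handle this, sketched under the assumption that toggling the body's overflow style is how you disable scrolling (the helper names withBody and setPageScrollLocked are made up for this example):

```javascript
// Calls `callback(body)` as soon as a body element is available.
// withBody is a hypothetical helper; `doc` is shaped like `document`.
function withBody(doc, callback) {
  if (doc.body) {
    callback(doc.body);
    return;
  }
  // No body yet (e.g. a synchronous script in <head>): retry after parsing.
  doc.addEventListener("DOMContentLoaded", () => {
    // Still null for documents that have no body element at all.
    if (doc.body) callback(doc.body);
  }, { once: true });
}

// Assumption: scrolling is disabled by setting overflow on the body.
function setPageScrollLocked(doc, locked) {
  withBody(doc, (body) => {
    body.style.overflow = locked ? "hidden" : "";
  });
}

if (typeof document !== "undefined") {
  setPageScrollLocked(document, true);
}
```

If the document never gets a body, the callback is simply never called, so the calling code should not depend on it having run.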
I'm trying to identify roughly when the DOM is finished updating after a page is loaded via AJAX on any arbitrary website.
My current method first listens for the chrome.webNavigation.onHistoryStateUpdated event in a background script, then executes a content script in which a MutationObserver detects changes to the website's body. From there, unfortunately, it seems like it's a bit more finicky. If I just wait for the first mutation where nodes are added to the DOM, I wind up in many cases (YouTube, to give one example) where the page is still blank. Other more hacky approaches I've considered include things like just using setTimeout or waiting for the page to reach a certain length, but those seem clearly wide open to exception cases.
Is there a more fool-proof way to detect that the DOM has roughly finished updating? It doesn't necessarily have to be perfectly precise, and erring on the side of triggering late in my use case is better than triggering early. Also it isn't important at all that resources like video and images be fully loaded, just that the text contents of the page are basically in place.
Thanks for your help!
I am developing a Chrome extension with a manifest that, for now, enables access to all hosts. The background script injects content scripts into all frames. After the DOM is loaded, the content script in the top page/frame begins to walk the DOM tree. When the walker encounters an iframe, it needs to message the specific content script associated with that iframe's window (possibly cross-origin) to begin its work and includes some serialized data with this message. The parent window suspends execution and waits for the child to complete its walk and send a message back that it is done along with serialized data. The parent then continues its work. I have tried two approaches to this problem:
1. frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation(). For example, on the yahoo home page (https://www.yahoo.com), when posting a message to my content script associated with iframe source https://s.yimg.com/rq/darla/2-9-9/html/r-sf.html, the message is never received. This is an ad-related iframe. Maybe the blocking of messages is intentional. There is no error when the message is posted and I use a targetOrigin of "*".
2. chrome.runtime.sendMessage: I can send a message to the background page but cannot figure out how to tell the background page to which frame to relay the message. The parent window content script does not know the chrome extension frameId associated with the child frame element it encountered in the DOM walk. So it cannot tell the background page how to direct the message.
For point 2, I have tried two techniques that I found here on stackoverflow:
1. Using the concept described in this question: in the parent window, determine the iframe's position in the window.frames array and post a message to the background page with this index. The background page posts a message to all frames with the desired index in the message data. Only the iframe whose window object's position in the window.parent.frames array matches the index received in the message proceeds with its walk. This works OK but is vulnerable to changes in the window.frames array during the asynchronous messaging process (if an iframe is deleted after the message is sent, the index value may no longer match the desired frame).
2. Instead of the index value from point 1, use frameElement.name in the parent window. With the same messaging technique, send the name to the child iframe for comparison with its window.name value. I believe window.name gets its value from frameElement.name at the time the iframe element is created. However, since I don't control the frame element creation, the name attribute is often an empty string and can't be relied on to uniquely match iframe elements to their windows.
Is there a way for me to reliably send a message to a content script associated with an iframe element found in walking a DOM tree?
When you call chrome.runtime.sendMessage from a content script, the second parameter of the chrome.runtime.onMessage listener ("sender") includes the properties url and frameId.
You can send a message (from an extension page, e.g. the background page) to a specific frame using chrome.tabs.sendMessage with the given frameId.
If you want to know the list of all frames (and their frame IDs) at any time, use chrome.webNavigation.getAllFrames. With that, you can construct a tree of the frames in a tab, and then send this information to all frames for further processing.
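A minimal sketch of both ideas for a background page follows. The message shape (type "relay-to-frame", targetFrameId, payload) and the helper name buildFrameTree are assumptions made for this example; sender.frameId, chrome.tabs.sendMessage with a frameId option, and the frameId/parentFrameId fields returned by chrome.webNavigation.getAllFrames are the extension APIs mentioned above.

```javascript
// Groups the entries returned by chrome.webNavigation.getAllFrames by parent.
// Returns a Map: frameId -> array of child frameIds.
// buildFrameTree is a hypothetical helper name.
function buildFrameTree(frames) {
  const childrenOf = new Map();
  for (const { frameId, parentFrameId } of frames) {
    if (!childrenOf.has(frameId)) childrenOf.set(frameId, []);
    // The top-level frame reports a parentFrameId of -1.
    if (parentFrameId >= 0) {
      if (!childrenOf.has(parentFrameId)) childrenOf.set(parentFrameId, []);
      childrenOf.get(parentFrameId).push(frameId);
    }
  }
  return childrenOf;
}

// Background-page wiring; skipped when not running inside an extension.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender) => {
    // "relay-to-frame", targetFrameId and payload are made-up field names.
    if (message.type !== "relay-to-frame" || !sender.tab) return;
    // sender.frameId identifies the frame the message came from.
    chrome.tabs.sendMessage(
      sender.tab.id,
      { from: sender.frameId, payload: message.payload },
      { frameId: message.targetFrameId }
    );
  });
}
```

The tree could be built with chrome.webNavigation.getAllFrames({ tabId }, (frames) => buildFrameTree(frames)) whenever a fresh view of the tab is needed.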
Reliable postMessage / onMessage
frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation()
This can be countered by running your script at "run_at":"document_start" and immediately registering the message event listener. Then your handler will always be called first and the page cannot cancel it via event.stopImmediatePropagation(). However, do not blindly trust the information from other frames and always verify the message (e.g. by communicating with the other frames via the background page).
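For illustration, a content script registered with "run_at": "document_start" could start like the sketch below. The channel tag and the helper name are made up for this example, and registering in the capture phase is an extra precaution of mine rather than something the approach above requires.

```javascript
// Hypothetical tag used to recognise this extension's own messages.
const CHANNEL = "my-extension-channel";

// Registers the listener right away, before any page script has run.
// installEarlyMessageListener is a hypothetical helper name.
function installEarlyMessageListener(win, handle) {
  win.addEventListener("message", (event) => {
    const data = event.data;
    // Ignore everything that is not tagged as ours.
    if (!data || data.channel !== CHANNEL) return;
    // The tag is NOT proof of origin: any page can forge it, so the
    // handler should still verify the content via the background page.
    handle(data, event);
  }, true);
}

if (typeof window !== "undefined") {
  installEarlyMessageListener(window, (data) => {
    console.log("message for the extension:", data);
  });
}
```

Because the listener is attached first, a later page listener that calls event.stopImmediatePropagation() can no longer hide the message from it.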
Combining both approaches
The first method offers a secure way to exchange data between frames, but does not offer a general way to link the frame to a specific DOM element.
The second method allows you to target a specific (i)frame element, but any web page can do that and therefore the method on its own is not reliable.
By combining both, you get a secure communication channel that is linked to a DOM element.
This is a basic example that applies the above methods to communicate between frames A and B:
Content script in A:
Send a message to the background page (e.g. a message including the index of frame B).
Background page:
Receive the message from A.
Generate a random nonce, say R (crypto.getRandomValues).
Store a mapping from R to frameId (and optionally other information that was included in the message from A).
Call the response callback with this random value.
Content script in A:
Receive R from the background page.
Call postMessage on frame B and pass R.
Content script in B:
Receive R from A.
Send a message to the background page to retrieve the frameId (and optionally other information from A).
Note: For a rock-solid application, you need to account for the fact that the frame may be removed during any of those steps. If you neglect the asynchronous nature of this process, you may leave your application in an inconsistent state.
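The background-page part of those steps could look roughly like this. It is only a sketch: the message types ("issue-nonce", "redeem-nonce"), the helper name createNonceRegistry, and the decision to make each nonce single-use are my own assumptions, not part of the steps above.

```javascript
// Keeps the mapping from a random nonce R to the frame that requested it.
// `fillRandom` is injectable for testing; by default it uses
// crypto.getRandomValues as suggested in the steps above.
function createNonceRegistry(
  fillRandom = (bytes) => globalThis.crypto.getRandomValues(bytes)
) {
  const pending = new Map(); // nonce -> { frameId, info }
  return {
    // Called when frame A asks for a nonce.
    issue(frameId, info) {
      const bytes = fillRandom(new Uint8Array(16));
      const nonce = Array.from(bytes, (b) => b.toString(16).padStart(2, "0")).join("");
      pending.set(nonce, { frameId, info });
      return nonce;
    },
    // Called when frame B presents the nonce. Single-use (an assumption).
    redeem(nonce) {
      const entry = pending.get(nonce);
      pending.delete(nonce);
      return entry || null;
    },
  };
}

// Background-page wiring; skipped when not running inside an extension.
if (typeof chrome !== "undefined" && chrome.runtime && chrome.runtime.onMessage) {
  const registry = createNonceRegistry();
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    if (message.type === "issue-nonce") {
      sendResponse(registry.issue(sender.frameId, message.info));
    } else if (message.type === "redeem-nonce") {
      // `from` is null if the nonce is unknown or was already used.
      sendResponse({ from: registry.redeem(message.nonce), to: sender.frameId });
    }
  });
}
```

After B redeems R, the background page knows both frame IDs (A's from the stored mapping, B's from sender.frameId), which is the link between the DOM element and the extension's frame ID.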
tl;dr
My answer will describe a CORS-proof solution in the specific case when the child frame has user focus. In many use cases, the frame we want to interact with usually has focus.
In 2022, Firefox and Safari already have a CORS-proof standard API for this, so if you're targeting them, consider using the standard API instead.
My solution does not use window.postMessage or cryptographic random values.
Prerequisites
The frame to which you are sending a message needs to be a 'valid frame'. A 'valid frame' is:
a frame which has user focus, OR
parent of another valid frame
For simplicity of discussion, I'll assume:
we have <all_urls> host permission
we are working inside only one tab
Glossary
A frame tree is such that each frame has a unique parent but can have multiple children. The depth of a frame is:
Zero if it is the root document, OR
One plus depth of its parent frame
Procedure
Step 1. Track depth of each frame
tl;dr: When a frame loads, we record its depth in the frame tree.
I will assume your content script (CS) already injects itself into each iframe on the page. As soon as it is injected, the CS needs to report its own frame depth to the background page (BG). Using this information, BG will maintain the list of frame IDs at each depth level.
CS can get its own depth by using a recursive algorithm similar to the one described here.
BG can access sender.frameId in the onMessage listener to correctly get the frame Id.
BG now has a list reportedFrameDepths (for example) where reportedFrameDepths[depth] is a list/set of all frameIds at that depth.
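Step 1 can be sketched as follows. I use a loop instead of recursion for the depth, and the names frameDepth and recordDepth as well as the "frame-depth" message type are made up for this example.

```javascript
// Content-script side: number of ancestors between `win` and the root.
// The root window is its own parent, which ends the walk.
function frameDepth(win) {
  let depth = 0;
  while (win.parent !== win) {
    win = win.parent;
    depth += 1;
  }
  return depth;
}

// Background side: remember which frame IDs were reported at each depth.
// reportedFrameDepths[depth] is a Set of frame IDs.
function recordDepth(reportedFrameDepths, depth, frameId) {
  if (!reportedFrameDepths[depth]) {
    reportedFrameDepths[depth] = new Set();
  }
  reportedFrameDepths[depth].add(frameId);
}

// Wiring, shown as comments because the two halves run in different places:
//   CS: chrome.runtime.sendMessage({ type: "frame-depth", depth: frameDepth(window) });
//   BG: chrome.runtime.onMessage.addListener((msg, sender) => {
//         if (msg.type === "frame-depth") {
//           recordDepth(reportedFrameDepths, msg.depth, sender.frameId);
//         }
//       });
```

With more than one tab, a separate list per tab ID would be needed; the sketch follows the single-tab assumption made above.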
Step 2. Check which child frame is focused
tl;dr: Given a frame, we can find which one of its child frames is focused.
We can enumerate all candidate children of this frame by checking reportedFrameDepths[depth + 1], where depth is the frame depth of this frame. Only one of the frames in this list should have user focus.
The focused child will have a non-null document.activeElement value, and document.hasFocus() will be true. We need to check the latter because in certain cases (for example, mail.google.com), document.activeElement is set to a non-focused element (<body>) for many frames.
So we can send a message to all the candidate frames (specify { frameId } in options field of tabs.sendMessage) and get a boolean response from them if they have focus. The one frameId that responds true should be the intended focused child frame.
Step 3. Repeat step 2 recursively.
If you can find the focused child A of a given frame, you can also find the focused child B of that focused child A.
Repeating step 2 starting from root document will lead you to the deepest focused child. The recursion stops when there is no further focused child.
This is the end, you now have the frame ID of the deepest focused child. You can now send a message directly to this frame.
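Steps 2 and 3 together can be sketched as a loop over the depth levels (the description above is recursive; the loop is equivalent here). The function name findDeepestFocusedFrame and the askHasFocus callback are assumptions for this example, and the sketch assumes the root document has frame ID 0, as it does in Chrome.

```javascript
// Walks down the depth levels and returns the frame ID of the deepest
// frame that reports focus. `askHasFocus(frameId)` must return a
// Promise<boolean>; how it talks to the frame is up to the caller.
async function findDeepestFocusedFrame(reportedFrameDepths, askHasFocus) {
  let focused = 0; // assumption: the root document has frame ID 0
  for (let depth = 1; depth < reportedFrameDepths.length; depth += 1) {
    const candidates = [...(reportedFrameDepths[depth] || [])];
    const answers = await Promise.all(candidates.map((id) => askHasFocus(id)));
    const index = answers.indexOf(true);
    if (index === -1) break; // no focused frame at this depth: stop here
    focused = candidates[index];
  }
  return focused;
}

// In the background page, askHasFocus could wrap chrome.tabs.sendMessage:
//   (frameId) => new Promise((resolve) =>
//     chrome.tabs.sendMessage(tabId, { type: "has-focus" }, { frameId },
//       (answer) => resolve(!chrome.runtime.lastError && answer === true)));
// and each content script would answer with document.hasFocus().
```

Resolving to false when a frame does not answer keeps one destroyed frame from aborting the whole search.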
Gotchas
This is not a trivial solution to implement, due to:
The amount of asynchronicity. There is a long messaging chain across multiple CS and the BG. Ensure your code can handle the messaging chain being interrupted midway because some other part of the code crashed.
Tabs and frames can reload, navigate away or destroy themselves. Make sure your implementation handles these cases. Especially be wary of caching or data stores as they can become obsolete.
That said, I have implemented the solution for a similar use case and it works reliably and fast enough (the extra overhead I observed is less than 5ms). The exact implementation will vary depending on your product's needs, and the above explanation should serve as a good reference.
I'm trying to port one of my Firefox extensions to support Electrolysis (e10s). My extension grabs some page data and puts it on the clipboard via a context-menu item that the user can click. Based on the message manager documentation, there are 3 types of message managers available:
Global
Window
Browser
Since my add-on is context specific, the last one seems like the one I want to use. The problem is that I don't fully know when to load the frame script. A simplified version of my context menu item's action handling code looks like this:
onContext: function() {
    let browserMM = gBrowser.selectedBrowser.messageManager;
    browserMM.loadFrameScript("chrome://myaddon/content/frame-script.js", true);
    browserMM.sendAsyncMessage("myaddon#myaddon.com:get-page-info", json);
}
Loading the frame script here seemed like the best idea to me since (a) the frame script isn't guaranteed to get used on every page and (b) I figured that frame scripts are loaded once and only once per <browser>. The second theory isn't correct it seems; each time I call loadFrameScript, a new copy gets loaded. Even load-protection logic (i.e. only creating the frame script functions if they don't already exist) doesn't seem to fix the problem.
So, my problem is that each time the context menu item is accessed, a new copy of the frame script gets loaded. And since my frame script adds a message listener, I get duplicate messages on subsequent calls of the context menu item.
When should I load browser frame scripts? Loading it once on add-on initialization doesn't seem to work well, since it only loads on the first <browser> (I want this code to execute when asked for by any subsequent <browser>). But loading it on demand appears to duplicate things.
Are there other strategies I'm missing here?
Even load-protection logic (i.e. only creating the frame script functions if they don't already exist)
Frame scripts are a bit tricky: the scripts for each tab share a global object, but each one has a separate scope, akin to being evaluated inside its own function block. So if you add the same script multiple times to a tab, each copy gets evaluated in a separate scope.
Instead you might want to track the browser objects that already have your frame script attached with a WeakMap. Although I think there also is some property to enumerate the loaded frame scripts.
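The WeakMap idea could look something like this sketch. The helper name ensureFrameScript is made up, and it assumes a browser keeps the same frame script for as long as the browser object itself lives.

```javascript
// browser object -> Set of frame script URLs already loaded into it.
// A WeakMap lets the entry disappear together with the browser object.
const loadedFrameScripts = new WeakMap();

// Loads `url` into the browser's message manager at most once.
// Returns true if the script was loaded by this call.
// ensureFrameScript is a hypothetical helper name.
function ensureFrameScript(browser, url) {
  let urls = loadedFrameScripts.get(browser);
  if (!urls) {
    urls = new Set();
    loadedFrameScripts.set(browser, urls);
  }
  if (urls.has(url)) return false;
  // No delayed load needed when targeting one specific browser.
  browser.messageManager.loadFrameScript(url, false);
  urls.add(url);
  return true;
}
```

In the context menu handler, ensureFrameScript(gBrowser.selectedBrowser, "chrome://myaddon/content/frame-script.js") would replace the unconditional loadFrameScript call, so the message listener in the frame script is only added once per browser.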
Loading it once on add-on initialization doesn't seem to work well
If you want that, then use the global message manager and attach a delayed frame script, that'll get attached to all current and future tabs. Of course that will consume more memory than just attaching it to tabs that really need it.
browserMM.loadFrameScript("chrome://myaddon/content/frame-script.js", true);
You don't really need to set the delayed flag to true if you run it on a specific browser; that only makes sense for a broadcasting message manager, which may get additional children in the future.
I'm making a game using JavaScript. Currently I'm using window.location = "somepage.html" to perform navigation, but I'm not sure if that is the correct way to do it. As I said in the title, I've chosen the Blank App Template, so I do not have any navigator.js or anything like it.
Can you guys tell me the best way to do it?
Although you can use window.location to perform navigation, I'm sure you've already noticed a few of the downsides:
The transition between pages goes through a black screen, which is an artifact of how the underlying HTML rendering engine works.
You lose your script context between pages, e.g. you don't have any shared variables or namespaces, unless you use HTML5 session storage (or WinRT app data).
It's hard to wire up back buttons, e.g. you have to make sure each destination page knows what page navigated to it, and then maintain a back stack in session storage.
It's for these reasons that WinJS + navigator.js created a way to do "pages" via DOM replacement, which is the same strategy used by "single page web apps." That is, you have a div in default.html within which you load and unload DOM fragments to give the appearance of page navigation, while you don't actually ever leave the original script context of default.html. As a result, all of your in-memory variables persist across all page navigations.
The mechanics work like this: WinJS.Navigation provides an API to manage navigation and a backstack. By itself, however, all it really does is manage a backstack array and fire navigation-related events. To do the DOM replacement, something has to be listening to those events.
Those listeners are what navigator.js implements, so that's a piece of code that you can pull into any project for this purpose. Specifically, navigator.js defines a custom control called PageControlNavigator (usually Application.PageControlNavigator), and that control is what implements the listeners.
That leaves the mechanics of how you define your "pages." This is what the WinJS.UI.Pages API is for, and navigator.js assumes that you've defined your pages in this way. (Technically speaking, you can define your own page mechanisms for this, perhaps using the low-level WinJS.UI.Fragments API or even implementing your own from scratch. But WinJS.UI.Pages came about because everyone who approached this problem basically came up with the same solution, so the WinJS team provided one implementation that everyone can use.)
Put together then:
You define each page as an instance of WinJS.UI.Pages.PageControl, where each page is identified by its HTML file (which can load its own JS and CSS files). The JS file contains implementations of a page's methods like ready, in which you can do initialization work. You can then build out any other object structure you want.
In default.html, define a single div for the "host container" for the page rendering. This is an instance of the PageControlNavigator class that's defined in navigator.js. In its data-win-options you specify "{home: }" for the initial page that's loaded.
Whenever you want to switch to another page, call WinJS.Navigation.navigate with the identifier for the target page (namely the path to its .html file). In response, it will fire some navigating events.
In response, the PageControlNavigator's handlers for those events will load the target page's HTML into the DOM, within its div in default.html. It will then unload the previous page's DOM. When all of this gets rendered, you see a page transition--and a smooth one because we can animate the content in and out rather than go through a black screen.
In this process, the previous page control's unload method is called, and the init/load/processed/ready methods of the new page control are called.
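To make the division of labor concrete, here is a tiny stand-in for the idea, not WinJS itself: one object that only manages a back stack and fires events, and a listener that plays the role of the navigator control. All names in it (createNavigation, onNavigated, the page paths) are made up for this sketch.

```javascript
// A minimal back-stack manager in the spirit described above: it keeps
// a back stack array and notifies listeners, and does nothing to the DOM.
function createNavigation() {
  const backStack = [];
  const listeners = [];
  let current = null;

  function show(location, state) {
    const previous = current;
    current = { location, state };
    listeners.forEach((listener) => listener({ location, state, previous }));
  }

  return {
    get location() { return current ? current.location : null; },
    get canGoBack() { return backStack.length > 0; },
    onNavigated(listener) { listeners.push(listener); },
    navigate(location, state) {
      if (current) backStack.push(current);
      show(location, state);
    },
    back() {
      if (backStack.length === 0) return false;
      const target = backStack.pop();
      show(target.location, target.state);
      return true;
    },
  };
}

// The "navigator" role: a listener that would swap the DOM inside the
// host div. Here it only logs, since page loading is app-specific.
const nav = createNavigation();
nav.onNavigated(({ location, previous }) => {
  console.log("unload", previous ? previous.location : "(none)", "-> load", location);
});
nav.navigate("/pages/home/home.html");
nav.navigate("/pages/game/game.html", { level: 2 });
nav.back();
```

Because everything lives in one script context, the state object passed to navigate and any other variables survive the "page" change, which is the main benefit over window.location.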
It's not too hard to convert a blank app template into a nav template project--move your default.html/.css/.js content into a page control structure, add navigator.js to default.html (and your project), and put a PageControlNavigator into default.html. I suggest that you create a project from the nav app template for reference. Given the explanation above, you should be able to understand the structure.
For more details, refer to Chapter 3 of my free ebook, Programming Windows Store Apps with HTML, CSS, and JavaScript, Second Edition, where I talk about app anatomy and page navigation with plenty of code examples.