I'm trying to port one of my Firefox extensions to support Electrolysis (e10s). My extension grabs some page data and puts it on the clipboard via a context-menu item that the user can click. Based on the message manager documentation, there are 3 types of message managers available:
Global
Window
Browser
Since my add-on is context specific, the last one seems like the one I want to use. The problem is that I don't fully know when to load the frame script. A simplified version of my context menu item's action handling code looks like this:
onContext: function() {
    let browserMM = gBrowser.selectedBrowser.messageManager;
    browserMM.loadFrameScript("chrome://myaddon/content/frame-script.js", true);
    browserMM.sendAsyncMessage("myaddon#myaddon.com:get-page-info", json);
}
Loading the frame script here seemed like the best idea to me since (a) the frame script isn't guaranteed to get used on every page and (b) I figured that frame scripts are loaded once and only once per <browser>. The second assumption appears to be wrong: each time I call loadFrameScript, a new copy gets loaded. Even load-protection logic (i.e. only creating the frame script functions if they don't already exist) doesn't seem to fix the problem.
So, my problem is that each time the context menu item is accessed, a new copy of the frame script gets loaded. And since my frame script adds a message listener, I get duplicate messages on subsequent calls of the context menu item.
When should I load browser frame scripts? Loading it once on add-on initialization doesn't seem to work well, since it only loads on the first <browser> (I want this code to execute when asked for by any subsequent <browser>). But loading it on demand appears to duplicate things.
Are there other strategies I'm missing here?
Even load-protection logic (i.e. only creating the frame script functions if they don't already exist)
Frame scripts are a bit tricky: the scripts for each tab share a global object, but each one has a separate scope, akin to being evaluated inside its own function block. So if you load the script into a tab multiple times, each copy is evaluated in a separate scope.
Instead, you might want to use a WeakMap to track the browser objects that already have your frame script attached. I think there is also some property for enumerating the loaded frame scripts, though.
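A minimal sketch of that tracking idea, using the frame script call from the question. The names ensureFrameScript and loadedBrowsers are hypothetical, and a WeakSet stands in for the WeakMap because only membership matters here:

```javascript
// Browsers that already received the frame script. The references are weak,
// so a closed tab's <browser> can still be garbage-collected.
const loadedBrowsers = new WeakSet();

// Hypothetical helper: load the frame script into a <browser> at most once.
// Returns true if the script was loaded by this call.
function ensureFrameScript(browser, url) {
  if (loadedBrowsers.has(browser)) {
    return false; // already loaded, nothing to do
  }
  // allowDelayedLoad is false: a single existing browser is targeted.
  browser.messageManager.loadFrameScript(url, false);
  loadedBrowsers.add(browser);
  return true;
}
```

In the context-menu handler, ensureFrameScript(gBrowser.selectedBrowser, "chrome://myaddon/content/frame-script.js") would then replace the direct loadFrameScript call, and the sendAsyncMessage call stays as it is.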
Loading it once on add-on initialization doesn't seem to work well
If you want that, then use the global message manager and load a delayed frame script; it will be loaded into all current and future tabs. Of course, that will consume more memory than loading it only into the tabs that really need it.
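A rough sketch of the global approach, assuming the global message manager is passed in (in legacy add-on code it is usually reachable as Services.mm). installEverywhere is a hypothetical helper name:

```javascript
// Hypothetical helper: load a frame script into every current and future tab.
// `globalMM` is assumed to be the global message manager.
function installEverywhere(globalMM, url) {
  // true = allowDelayedLoad, so browsers created later also get the script.
  globalMM.loadFrameScript(url, true);

  // Undo function for add-on shutdown: it stops the script from being loaded
  // into new browsers. Copies that are already running are not unloaded.
  return function uninstall() {
    globalMM.removeDelayedFrameScript(url);
  };
}
```

Since every tab now runs the script, the chrome side can keep sending the request only to gBrowser.selectedBrowser.messageManager, so a single frame script answers.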
browserMM.loadFrameScript("chrome://myaddon/content/frame-script.js", true);
You don't really need to set the delayed flag to true if you load the script into a specific browser. That only makes sense for a broadcasting message manager, which may get additional children in the future.
I'm trying to identify roughly when the DOM is finished updating after a page is loaded via AJAX on any arbitrary website.
My current method first listens for the chrome.webNavigation.onHistoryStateUpdated event in a background script, then executes a content script in which a MutationObserver detects changes to the website's body. From there, unfortunately, it seems like it's a bit more finicky. If I just wait for the first mutation where nodes are added to the DOM, I wind up in many cases (YouTube, to give one example) where the page is still blank. Other more hacky approaches I've considered include things like just using setTimeout or waiting for the page to reach a certain length, but those seem clearly wide open to exception cases.
Is there a more fool-proof way to detect that the DOM has roughly finished updating? It doesn't necessarily have to be perfectly precise, and erring on the side of triggering late in my use case is better than triggering early. Also it isn't important at all that resources like video and images be fully loaded, just that the text contents of the page are basically in place.
Thanks for your help!
I have 2 questions.
First,
I have a script tag (not jQuery, my own JS file) in my page. I load the page through Apache in the browser and then delete that tag, but the page still works. Why? I also cleared the cache and did not reload the page.
[delete in browser Elements window]
Second,
What happens when I put two script tags with the same name in the page (one on my localhost and another in the file system)? Which one will run?
After the browser loads the code from a <script> tag, it is loaded into the VM and kept there. If it saves some data or functions into global variables, they are independent from the DOM, just like e.g. the window object.
All event listeners set by the code will also survive such removal, effectively meaning the JS is undisturbed by your actions. After a script has been run, it's almost impossible to "turn it off" and remove it from the webpage in a generic way.
If that's your code and you simply want to stop its execution, provide cleanup methods using e.g. removeEventListener to stop the browser from calling your code.
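A small sketch of such a cleanup method. installHandlers and the "click" handler are made up for illustration; the point is only that the script keeps references to its listeners so they can be removed again:

```javascript
// Hypothetical pattern: the script records every listener it adds, so that a
// single cleanup() call can detach all of them again.
function installHandlers(target, log) {
  const onClick = () => log.push("clicked");
  target.addEventListener("click", onClick);

  return function cleanup() {
    // removeEventListener needs the same function reference that was added.
    target.removeEventListener("click", onClick);
  };
}
```

A page would call something like `const cleanup = installHandlers(document, []);` once, and cleanup() when the behaviour should stop. Deleting the script tag does neither.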
I am developing a Chrome extension with a manifest that, for now, enables access to all hosts. The background script injects content scripts into all frames. After the DOM is loaded, the content script in the top page/frame begins to walk the DOM tree. When the walker encounters an iframe, it needs to message the specific content script associated with that iframe's window (possibly cross-origin) to begin its work, and it includes some serialized data with this message. The parent window suspends execution and waits for the child to complete its walk and send back a message that it is done, along with serialized data. The parent then continues its work. I have tried two approaches to this problem:
1. frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation(). For example, on the yahoo home page (https://www.yahoo.com), when posting a message to my content script associated with iframe source https://s.yimg.com/rq/darla/2-9-9/html/r-sf.html, the message is never received. This is an ad-related iframe. Maybe the blocking of messages is intentional. There is no error when the message is posted and I use a targetOrigin of "*".
2. chrome.runtime.sendMessage: I can send a message to the background page but cannot figure out how to tell the background page to which frame to relay the message. The parent window content script does not know the chrome extension frameId associated with the child frame element it encountered in the DOM walk. So it cannot tell the background page how to direct the message.
For point 2, I have tried two techniques that I found here on stackoverflow:
1. Using the concept described in this question: In the parent window, determine the iframe's position in the window.frames array and post a message to the background page with this index. The background page posts a message to all frames with the desired index in the message data. Only the iframe that finds that its window object's position in the window.parent.frames array matches the index received in the message proceeds with its walk. This works OK but is vulnerable to changes in the window.frames array during the asynchronous messaging process (if an iframe is deleted after the message is sent, the index value may no longer match the desired frame).
2. Instead of the index value from point 1, use frameElement.name in the parent window. With the same messaging technique, send the name to the child iframe for comparison with its window.name value. I believe window.name gets its value from frameElement.name at the time of the iframe element's creation. However, since I don't control the frame element creation, the name attribute is often an empty string and can't be relied on to uniquely match iframe elements to their windows.
Is there a way for me to reliably send a message to a content script associated with an iframe element found in walking a DOM tree?
When you call chrome.runtime.sendMessage from a content script, the second parameter of the chrome.runtime.onMessage listener ("sender") includes the properties url and frameId.
You can send a message (from an extension page, e.g. the background page) to a specific frame using chrome.tabs.sendMessage with the given frameId.
If you want to know the list of all frames (and their frame IDs) at any time, use the chrome.webNavigation.getAllFrames. If you do that, then you can construct a tree of the frames in a tab, and then send this information to all frames for further processing.
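A sketch of the relay part in the background page. The message shape ({ targetFrameId, payload }) and the name makeRelay are assumptions for illustration, and it only covers the case where the sender already knows the target frame ID:

```javascript
// Background page sketch. `tabsApi` is expected to be chrome.tabs; it is
// passed in so that the relay does not depend on a global.
function makeRelay(tabsApi) {
  return function onMessage(message, sender, sendResponse) {
    if (!sender.tab || typeof message.targetFrameId !== "number") {
      return false; // not a relay request
    }
    const envelope = {
      fromFrameId: sender.frameId, // supplied by the browser, not by the page
      fromUrl: sender.url,
      payload: message.payload,
    };
    // Deliver to one specific frame of the tab the request came from.
    tabsApi.sendMessage(sender.tab.id, envelope,
                        { frameId: message.targetFrameId }, sendResponse);
    return true; // keep sendResponse usable for the asynchronous reply
  };
}
// In the background page:
//   chrome.runtime.onMessage.addListener(makeRelay(chrome.tabs));
```

The receiving content script gets the envelope through its own chrome.runtime.onMessage listener and can reply with sendResponse, which travels back to the original sender.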
Reliable postMessage / onMessage
frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation()
This can be countered by running your script at "run_at":"document_start" and immediately registering the message event listener. Then your handler will always be called first and the page cannot cancel it via event.stopImmediatePropagation(). However, do not blindly trust the information from other frames, and always verify the message (e.g. by communicating with the other frames via the background page).
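A sketch of such an early listener, assuming the content script is declared with "run_at": "document_start" and "all_frames": true. MESSAGE_TAG and installEarlyListener are hypothetical names, and the tag only filters out unrelated traffic; it proves nothing about the sender:

```javascript
// Hypothetical marker used to recognise messages from our own content scripts.
const MESSAGE_TAG = "myext-frame-walk";

// Register the listener as the first statement of the content script, so it
// is in place before any page script can add its own "message" listener.
function installEarlyListener(win, onCandidate) {
  win.addEventListener("message", function (event) {
    const data = event.data;
    if (!data || data.tag !== MESSAGE_TAG) {
      return; // unrelated page traffic
    }
    // Still untrusted: any page can post an object with this tag. The caller
    // is expected to verify it, e.g. through the background page.
    onCandidate(data, event.source);
  });
}
// In the content script: installEarlyListener(window, handleCandidate);
```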
Combining both approaches
The first method offers a secure way to exchange data between frames, but does not offer a general way to link the frame to a specific DOM element.
The second method allows you to target a specific (i)frame element, but any web page can do that and therefore the method on its own is not reliable.
By combining both, you get a secure communication channel that is linked to a DOM element.
This is a basic example that applies the above methods to communicate between frames A and B:
Content script in A:
Send a message to the background page (e.g. a message including the index of frame B).
Background page:
Receive the message from A.
Generate a random nonce, say R (crypto.getRandomValues).
Store a mapping from R to frameId (and optionally other information that was included in the message from A).
Call the response callback with this random value.
Content script in A:
Receive R from the background page.
Call postMessage on frame B and pass R.
Content script in B:
Receive R from A.
Send a message to the background page to retrieve the frameId (and optionally other information from A).
Note: For a rock-solid application, you need to account for the fact that the frame may be removed during any of those steps. If you neglect the asynchronous nature of this process, you may leave your application in an inconsistent state.
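The background-page part of these steps could be sketched roughly as follows. createNonceRegistry, issue and redeem are hypothetical names, and making each nonce single-use is my own assumption rather than something the steps require:

```javascript
// Background page sketch. `randomBytes` defaults to the Web Crypto API that
// is available in extension pages; it is a parameter so it can be replaced.
function createNonceRegistry(
  randomBytes = (n) => crypto.getRandomValues(new Uint8Array(n))
) {
  const pending = new Map(); // nonce R -> { frameId, info }

  return {
    // Frame A asks for a nonce. The returned R is what A passes to frame B
    // with postMessage.
    issue(frameId, info) {
      const nonce = Array.from(randomBytes(16),
        (b) => b.toString(16).padStart(2, "0")).join("");
      pending.set(nonce, { frameId, info });
      return nonce;
    },
    // Frame B presents R. Unknown or already used nonces yield null.
    redeem(nonce) {
      const entry = pending.get(nonce) || null;
      pending.delete(nonce);
      return entry;
    },
  };
}
```

issue would be called from the onMessage handler for A with sender.frameId, and redeem from the handler for B. Entries whose frame disappears before B redeems them stay in the map, so a real implementation would also need to expire them.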
tl;dr
My answer will describe a CORS-proof solution for the specific case when the child frame has user focus. In many use cases, the frame we want to interact with is the one that has focus.
In 2022, Firefox and Safari already have a CORS-proof standard API for this, so if you're targeting them, consider using the standard API instead.
My solution does not use window.postMessage or cryptographic random values.
Prerequisites
The frame to which you are sending a message needs to be a 'valid frame'. A 'valid frame' is:
a frame which has user focus, OR
parent of another valid frame
For simplicity of discussion, I'll assume:
we have <all_urls> host permission
we are working inside only one tab
Glossary
A frame tree is such that each frame has a unique parent but can have multiple children. The depth of a frame is:
Zero if it is the root document, OR
One plus depth of its parent frame
Procedure
Step 1. Track depth of each frame
tl;dr: When a frame loads, we record its depth in the frame tree.
I will assume your content script (CS) already injects itself into each iframe on the page. As soon as it is injected, the CS needs to report its own frame depth to the background page (BG). Using this information, BG will maintain the list of frame IDs at each depth level.
CS can get its own depth by using a recursive algorithm similar to the one described here
BG can access sender.frameId in the onMessage listener to correctly get the frame ID.
BG now has a list reportedFrameDepths (for example) where reportedFrameDepths[depth] is a list/set of all frameIds at that depth.
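A sketch of both halves of step 1. It uses a loop rather than recursion to compute the depth, and the message shape { type: "frame-depth", depth } is an assumption made for illustration:

```javascript
// Content script side: count the hops from this window up to the top window.
function getFrameDepth(win) {
  let depth = 0;
  while (win !== win.parent) { // the top window is its own parent
    win = win.parent;
    depth += 1;
  }
  return depth;
}

// Background side: reportedFrameDepths[depth] is a Set of frame IDs.
const reportedFrameDepths = [];

// Hypothetical body of the background page's chrome.runtime.onMessage listener.
function recordDepth(message, sender) {
  if (message.type !== "frame-depth") return;
  if (!reportedFrameDepths[message.depth]) {
    reportedFrameDepths[message.depth] = new Set();
  }
  reportedFrameDepths[message.depth].add(sender.frameId);
}
```

The content script would send `{ type: "frame-depth", depth: getFrameDepth(window) }` with chrome.runtime.sendMessage as soon as it is injected. With more than one tab, the list would additionally have to be keyed by sender.tab.id.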
Step 2. Check which child frame is focused
tl;dr: Given a frame, we can find which one of its child frames is focused.
We can enumerate all candidate children of this frame by checking reportedFrameDepths[depth + 1], where depth is the frame depth of this frame. Only one of the frames in this list should have user focus.
The focused child will have a non-null document.activeElement value, and document.hasFocus() will be true. We need to check the latter because in certain cases (for example, mail.google.com), document.activeElement is set to a non-focused element (<body>) for many frames.
So we can send a message to all the candidate frames (specify { frameId } in options field of tabs.sendMessage) and get a boolean response from them if they have focus. The one frameId that responds true should be the intended focused child frame.
Step 3. Repeat step 2 recursively.
If you can find the focused child A of a given frame, you can also find the focused child B of that focused child A.
Repeating step 2 starting from root document will lead you to the deepest focused child. The recursion stops when there is no further focused child.
This is the end, you now have the frame ID of the deepest focused child. You can now send a message directly to this frame.
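Steps 2 and 3 together could look roughly like the sketch below. It walks the depth levels in a loop instead of recursing, and askHasFocus is an assumed helper that returns a promise for the content script's document.hasFocus() answer (for example by wrapping chrome.tabs.sendMessage with { frameId }):

```javascript
// Hypothetical background-side search. `frameDepths[d]` holds the frame IDs
// reported at depth d, as collected in step 1.
async function findDeepestFocusedFrame(frameDepths, askHasFocus) {
  let focused = 0; // frame ID 0 is the top-level document
  for (let depth = 1; depth < frameDepths.length; depth += 1) {
    const candidates = Array.from(frameDepths[depth] || []);
    const answers = await Promise.all(
      // A frame may have been removed meanwhile; treat errors as "no focus".
      candidates.map((id) => askHasFocus(id).catch(() => false))
    );
    const index = answers.indexOf(true);
    if (index === -1) break; // no focused child at this depth, stop here
    focused = candidates[index];
  }
  return focused;
}
```

The returned frame ID can then be used as the frameId option of chrome.tabs.sendMessage. Treating failed queries as "not focused" is a design choice that keeps the search going when a frame disappears midway.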
Gotchas
This is not a trivial solution to implement, due to:
The amount of asynchronicity. There is a long messaging chain across multiple CS and the BG. Ensure your code can handle the messaging chain being interrupted midway because some other part of the code crashed.
Tabs and frames can reload, navigate away or destroy themselves. Make sure your implementation handles these cases. Especially be wary of caching or data stores as they can become obsolete.
That said, I have implemented the solution for a similar usecase and it works reliably and fast enough (extra overhead I observed is less than 5ms). The exact implementation will vary depending on your product's needs, and the above explanation should serve as a good reference.
I am trying to understand how the DOM is rendered, and how resources are requested/loaded from the network. However, when reading the resources found on the internet, the terms DOM parsing/loading/rendering/ready are used, and I can't seem to grasp the order of these 'events'.
When a script, CSS or img file is requested from the network, does it stop only the rendering of the DOM, or does it stop parsing as well? Is DOM loading the same as DOM rendering? And is the DOMContentLoaded event equivalent to jQuery.ready()?
Can someone please explain if some of these terms are synonymous and in what order they happen?
When you open a browser window, that window needs to have a document loaded into it for the user to see and interact with. But, a user can navigate away from that document (while still keeping the same window open) and load up another document. Of course, the user can close the browser window as well. As such, you can say that the window and the document have a life-cycle.
The window and the document are accessible to you via object APIs and you can get involved in the life-cycles of these objects by hooking up functions that should be called during key events in the life-cycle of these objects.
The window object is at the top of the browser's object model - it is always present (you can't have a document if there's no window to load it into) and this means that it is the browser's Global Object. You can talk to it anytime in any JavaScript code.
When you make a request for a document (that would be an HTTP or HTTPS request) and the resource is returned to the client, it comes back in an HTTP or HTTPS response - this is where the data payload (the text, html, css, JavaScript, JSON, XML, etc.) lives.
Let's say that you've requested an .html page. As the browser begins to receive that payload it begins to read the HTML and construct an "in-memory" representation of the document object formed from the code. This representation is called The Document Object Model or the DOM.
The act of reading/processing the HTML is called "parsing" and when the browser is done doing this, the DOM structure is complete. This key moment in the life-cycle triggers the document object's DOMContentLoaded event, which signifies that there is enough information for a fully formed document to be interactive. This event is roughly equivalent to jQuery's $(document).ready() callback.
But, before going on, we need to back up a moment... As the browser is parsing the HTML, it also "renders" that content to the screen, meaning that space in the document is allocated for the element and its content and that content is displayed. This doesn't happen AFTER all parsing is complete; the rendering engine works at the same time the parsing engine is working, just one step behind it. If the parsing engine parses a table-row, for example, the rendering engine will then render it. However, when it comes to things like images, although the image element may have been parsed, the actual image file may not yet have finished downloading to the client. This is why you may sometimes initially see a page with no images and then, as the images begin to appear, the rest of the content on the page has to shift to make room for them: the browser knew there was going to be an image, but it didn't necessarily know how much space it was going to need for that image until it arrived.
CSS files, JS files, images and other resources required by the document download in the background, but most browsers/operating systems cap how many HTTP requests can be working simultaneously. I know for Windows, the Windows registry has a setting for IE that caps that at 10 requests at a time, so if a page has 11 images in it, the first 10 will download at the same time, but the 11th will have to wait. This is one of the reasons it is suggested that it's better to combine multiple CSS files into one file and to use image sprites rather than separate images: to reduce the overall number of HTTP requests a page has to make.
When all of the external resources required by the document have completed downloading (CSS files, JavaScript files, image files, etc.), the window will receive its "load" event, which signifies that, not only has the DOM structure been built, but all resources are available for use. This is the event to tap into when your code needs to interact with the content of an external resource, since it must wait for the content to arrive before consuming it.
Now that the document is fully loaded in the window, anything can happen. The user may click things, press keys to provide input, scroll, etc. All these actions cause events to trigger and any or all of them can be tapped into to launch custom code at just the right time.
When the browser window is asked to load a different document, there are events that are triggered that signify the end of the document's life, such as the window's beforeunload event and ultimately its unload event.
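To make the order of these events visible, something like the following could be dropped into a page. traceLifecycle and the log messages are made up for illustration:

```javascript
// Hypothetical helper that records the life-cycle events as they fire.
// `doc` and `win` are the page's document and window objects.
function traceLifecycle(doc, win, log) {
  // Fires on the document once the HTML has been parsed into the DOM.
  doc.addEventListener("DOMContentLoaded", () => log.push("DOM parsed"));
  // Fires on the window once images, stylesheets, etc. have finished loading.
  win.addEventListener("load", () => log.push("all resources loaded"));
  // Fire on the window when the document is about to be replaced.
  win.addEventListener("beforeunload", () => log.push("about to leave"));
  win.addEventListener("unload", () => log.push("leaving"));
}
// In a page: const log = []; traceLifecycle(document, window, log);
```

The call has to run while the document is still being parsed (for example from a script in the head), otherwise DOMContentLoaded has already fired and is never logged.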
This is all still a simplification of the total process, but I think it should give you a good overview of how documents are loaded, parsed and rendered within their life-cycle.
I am building a firefox extension that creates several hidden browser elements.
I would like to addProgressListener() to handle onLocationChange for the page that I load. However, my handler does not always get called.
More specifically, here's what I'm doing:
1. Create a browser element, without setting its src property
2. Attach it to another element
3. Add a progress listener listening for onLocationChange to the browser element
4. Call loadURIWithFlags() with the desired url and post data
I expect the handler to be called every time after step 4, but sometimes it is not (it seems to get stuck on the same pages, though).
Interestingly, if I wrap steps 3 and 4 inside a setTimeout(..., 5000); it works every time.
I've also tried shuffling some of the steps around, but it did not have any effect.
The bigger picture: I would like to be reliably notified when browser's contentDocument is that of the newly loaded page (after redirects). Is there a better way to do this?
Update: I've since opened a bug on Mozilla's bug tracker with a minimal xulrunner app displaying this behavior, in case anybody wants to take a closer look: https://bugzilla.mozilla.org/show_bug.cgi?id=941414
In my experience developing with Firefox, I've found in some cases the initialization code for various elements acts as if it were asynchronous. In other words, when you're done executing
var newBrowser = window.document.createElement('browser');
newBrowser.setAttribute('flex', '1');
newBrowser.setAttribute('type', 'content');
cacheFrame.insertBefore(newBrowser, null);
, your browser may not actually be ready yet. When you add the delay, things have time to initialize, so they work fine. Additionally, when you do things like dynamically creating browser elements, you're likely doing something that very few have tried before. In other words, this sounds like a bug in Firefox, and probably one that will not get much attention.
You say you're using onLocationChange so that you can know when to add a load listener. I'm going to guess that you're adding the load listener to the contentDocument since you mentioned it. What you can do instead is add the load listener to the browser itself, much like you would with an iframe. If I replace
newBrowser.addProgressListener(listener);
with
newBrowser.addEventListener("load", function(e) {
    console.log('got here! ' + e.target.contentDocument.location.href);
}, false);
then I receive notifications for each browser.