It is possible to move deeper into the DOM tree using the .frame JSON Wire Protocol call but I haven't been able to figure out a way to move upward through the DOM tree.
module.exports = {
  "Enter and exit iframes in tree": function (browser) {
    browser
      // Focus starts at the top level.
      .frame('iframeOne')
      // Focus now moves into the inner frame iframeTwo.
      .frame('iframeTwo')
      // Attempt to move to the parent frame directly.
      .frame('iframeOne')
      // Selenium error: 'no such frame'
      // Attempt to move focus up to iframeOne by searching from the root.
      .element('id', 'iframeOne', function (e) {
        browser.frame(e.value);
      });
      // Selenium error: 'no such element'
  }
};
There is a JSON Wire Protocol call frame/parent that is able to move up to the parent element but it is not currently supported by NightWatchJS. Any suggestions would be appreciated.
In current binary releases, you need to use the switchToDefaultContent command. In Java this manifests itself as driver.switchTo().defaultContent(). This will take you to the top of the frame hierarchy, and you can navigate back down the tree to the frame you need.
The "navigate to parent frame" wire protocol endpoint is brand new: on the order of days old at the time of this writing. No released server implementation understands that endpoint yet. The Firefox driver has had it implemented only in the last day or so, and work hasn't started yet on the IE driver. It's not yet implemented in any language binding except Java, and that's only in the source tree; it hasn't been released in binary form. If you're patient, it will be available to you in the future (there is no timeline, so don't ask), just not yet.
I was able to get around this problem by resetting the focus back up to the top level element and crawling down again.
.frame(null)
.frame('iframeOne')
It's a workable solution.
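The same reset-and-descend idea can be wrapped in a small helper. This is only a sketch: it assumes a Nightwatch-style browser object whose .frame() call is chainable, and the helper name focusFramePath is made up for illustration (it is not part of Nightwatch).

```javascript
// Hypothetical helper: reset focus to the top-level document, then walk
// down the given list of frame ids/names one level at a time.
function focusFramePath(browser, framePath) {
  // .frame(null) switches back to the default (top-level) content.
  let chain = browser.frame(null);
  for (const frameId of framePath) {
    chain = chain.frame(frameId);
  }
  return chain;
}

// Assumed usage inside a test, to get from iframeTwo back to iframeOne:
// focusFramePath(browser, ['iframeOne']);
```

Passing the full path each time costs a few extra protocol calls, but it avoids depending on a parent-frame endpoint that may not be available.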
I'm trying to debug the integration between my app and Stripe's Elements component library. Everything works fine in sandbox mode, but we ran into a problem on production in the 3D Secure authentication process. This involves loading an iframe, into our app, that contains a form from the credit card's issuer (usually via a technology partner, like Arcot).
The form loads correctly and its buttons are working as expected, but the input element (for an SMS one-time code) is not behaving. Every time I click on the input, something immediately pushes the focus back to the element of the iframe. This makes it impossible to type anything in, since by the time I touch a key, the input is no longer in focus. For reference, it is possible to change the input's value using document.getElementById('enterPIN').value = '123456';
I'm not sure if my app is triggering focus() calls (I don't think so) or if it is some part of the iframe code or even Stripe's. Is there a good way to monitor DOM events and do a stack trace for the trigger of each one?
I tried two tactics. Neither gave an obvious answer, but they did point my search in the right direction.
I opened the Event Listeners panel (in the Elements tab of my browser's developer tools) and removed everything I could find, but this doesn't seem to change the behavior of the page: focus kept being stolen away. Luckily, I also noticed some listeners that were defined by the Material UI library.
I used monitorEvents() to get a few more details, but the src & target values were not much help and event.relatedTarget was always null.
In the end, I found this discussion and realized that my MUI Dialog component was stealing focus whenever I clicked on the iframe triggered by its content. This was easily fixed by adding the disableEnforceFocus attribute.
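For the original question (getting a stack trace for whatever triggers focus), one option is to wrap the method and log a trace on every call. This is a minimal sketch, not something from the thread above: the helper name traceCalls is hypothetical, and it only sees calls made in the same frame it is installed in, so it will not catch focus changes made from inside a cross-origin iframe.

```javascript
// Wrap a method so every call logs a stack trace before running the
// original. Returns a function that restores the original method.
function traceCalls(target, methodName, log = console.trace) {
  const original = target[methodName];
  target[methodName] = function (...args) {
    log(`${methodName}() called`, this);
    return original.apply(this, args);
  };
  return () => { target[methodName] = original; };
}

// Assumed usage in the browser console:
// const restore = traceCalls(HTMLElement.prototype, 'focus');
// ...reproduce the problem, read the traces, then call restore().
```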
Background:
I often find myself debugging a piece of JavaScript on a web page in an unfamiliar codebase, often one that has seen many developers and coding approaches. Sometimes I do not even know what technologies might be in use, e.g. Angular.
The first time I need to address the JavaScript is when a specific behaviour is unexpected (i.e. it has gone wrong).
Question:
What tool provides the fastest route to identifying the entry point of the code that is causing the problem?
Example:
I have an HTML element on a page, let's say a button. When that button is clicked I expect to see an HTTP request at the server. There are many ways the element can be associated with its JavaScript listener, e.g. jQuery, third-party plugins such as Knockout, in-house scripts, and so on.
Using developer tools I can start debugging this in the browser but only if I already know the entry point to put a breakpoint on.
Is there a faster method to find the entry point than running regular expression searches on the page's code, based on intuition and guesswork, to find what might be attached to that particular element?
For me, the best starting point is in Chrome developer tools. You can:
Choose an element in the elements tab
On the right-hand side of the elements tree, click the "Event Listeners" tab.
Find the event you want to debug (like click)
Click the hyperlink to bring up the code for event listeners, and set breakpoints. Sometimes you have to click the "format code" button (looks like { }) to get the code on multiple lines so that the breakpoint is manageable.
Do the click, and you'll hit your breakpoint, allowing you to step through the code, add watch variables, etc.
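When the Event Listeners tab only points at library wrapper code, a complementary trick is to record where each listener was registered. This is a rough sketch under assumptions: the helper name recordListenerRegistrations is hypothetical, and it has to run before the page's own scripts attach their listeners (for example from a snippet injected early).

```javascript
// Record where each listener is registered by wrapping addEventListener.
// Must run before the page's own scripts attach their listeners.
function recordListenerRegistrations(proto, registry = []) {
  const original = proto.addEventListener;
  proto.addEventListener = function (type, listener, options) {
    // The stack captured here points at the code doing the registration.
    registry.push({ target: this, type, listener, stack: new Error().stack });
    return original.call(this, type, listener, options);
  };
  return registry;
}

// Assumed browser usage:
// const registry = recordListenerRegistrations(EventTarget.prototype);
// registry.filter((r) => r.type === 'click' && r.target === button);
```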
In JavaScript, once I've received a 'message' event, is there a way to find out which frame in the DOM has initiated it? This would be helpful when debugging a large web application where a particular message could have come from any one of 15-20 frames. The message event has a source property, but if the frame is cross-domain, its properties are not accessible.
Since I know these things vary from browser to browser, I'm asking specifically about IE11.
I found a way that actually works even when cross-domain - I add a DOM element by evaluating it in the Add Watch window. Then I search the DOM tree for that element, and figure out the frame in this way.
For example, this code works:
var foo_btn = document.createElement("BUTTON"); var foo_t = document.createTextNode("FOOBAR FOOBAR"); foo_btn.appendChild(foo_t); document.body.appendChild(foo_btn);
You just click Add Watch and paste it, and then after it executes, you can search for FOOBAR FOOBAR in the DOM tree.
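Another approach that may be worth trying, and which is not from the answer above: the properties of a cross-origin event.source cannot be read, but the reference itself can normally be compared by identity against each frame element's contentWindow. The sketch below is written in ES5 style with IE11 in mind, though I have not verified it there; findSourceFrame is a hypothetical name and it only looks at direct child frames of the given document.

```javascript
// Return the <iframe>/<frame> element in `doc` whose window sent the
// message, or null if the sender is not a direct child frame of `doc`.
function findSourceFrame(event, doc) {
  var frames = doc.querySelectorAll('iframe, frame');
  for (var i = 0; i < frames.length; i++) {
    // Identity comparison does not read any cross-origin property.
    if (frames[i].contentWindow === event.source) {
      return frames[i];
    }
  }
  return null;
}

// Assumed usage:
// window.addEventListener('message', function (e) {
//   console.log(findSourceFrame(e, document));
// });
```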
Tracking mouse movement/scroll/click events is easy but how do they save the screen and keep it in sync so well?
The pages are rendered quite well (at least for static HTML pages; I haven't tested on Angular or any SPA), and the sync is almost perfect.
To generate and upload a 23fps recording of my screen (1920x1080) it would take about 2Mbps of bandwidth. Maybe when recording only when there are some mouse events it would still take some 300-500Kbps on average? That seems way too much...
HTML content and DOM changes get pumped through a websocket and stored by Hotjar (minus sensitive information such as form inputs from the user, unless you've whitelisted them). The CSS isn't stored; it gets loaded by you when you watch the recording.
Because they're only recording user activity and DOM changes, there's a lot less data to record than if they were capturing a full video. The downside is that some Javascript driven widgets won't function correctly in the replay.
Relevant information from Hotjar docs:
When it comes to recordings, changes to the page are captured using the MutationObserver API which is built-in into every modern browser.
This makes it efficient since the change itself is already happening
on the page and the browser MutationObserver API allows us to record
this change which we then parse and also send through the websocket.
At regular short intervals, every 100ms or 10 times per second, the cursor position and scroll position are recorded. Clicks are recorded
when they happen, capturing the position of the cursor relative to the
element being clicked. These are functions which in no way hinder a
user's experience as they only capture the location of the pointer
when a click happens or every 100ms. The events are sent to the Hotjar
servers through frames within the websocket, which is more efficient
than sending XHR requests at regular intervals.
Source: https://help.hotjar.com/hc/en-us/articles/115009335727-Will-Hotjar-Slow-Down-My-Site-
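To make the idea concrete, here is a rough sketch of recording DOM changes with MutationObserver and handing them to a sender. It is an illustration only and not Hotjar's implementation: the function names are hypothetical, and a real recorder would serialize the changed nodes rather than just count them.

```javascript
// Reduce a MutationRecord-like object to a small JSON-friendly event.
function summarizeMutation(record, now = Date.now()) {
  return {
    t: now,
    type: record.type, // 'childList', 'attributes' or 'characterData'
    attribute: record.attributeName || null,
    added: record.addedNodes ? record.addedNodes.length : 0,
    removed: record.removedNodes ? record.removedNodes.length : 0,
  };
}

// Observe `root` and hand each batch of summaries to `send`
// (for example, a function that writes to a WebSocket).
function startRecording(root, send) {
  const observer = new MutationObserver((records) => {
    send(JSON.stringify(records.map((r) => summarizeMutation(r))));
  });
  observer.observe(root, {
    childList: true, attributes: true, characterData: true, subtree: true,
  });
  return observer;
}

// Assumed browser usage:
// const ws = new WebSocket('wss://recorder.example/'); // hypothetical URL
// startRecording(document.documentElement, (data) => ws.send(data));
```

Because only small change descriptions are sent, the payload stays far below what a video stream of the same session would need.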
I am developing a Chrome extension with a manifest that, for now, enables access to all hosts. The background script injects content scripts into all frames. After the DOM is loaded, the content script in the top page/frame begins to walk the DOM tree. When the walker encounters an iframe, it needs to message the specific content script associated with that iframe's window (possibly cross-origin) to begin its work, and it includes some serialized data with this message. The parent window suspends execution and waits for the child to complete its walk and send back a message that it is done, along with serialized data. The parent then continues its work. I have tried two approaches to this problem:
frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation(). For example, on the yahoo home page (https://www.yahoo.com), when posting a message to my content script associated with iframe source https://s.yimg.com/rq/darla/2-9-9/html/r-sf.html, the message is never received. This is an ad-related iframe. Maybe the blocking of messages is intentional. There is no error when the message is posted and I use a targetOrigin of "*".
chrome.runtime.sendMessage: I can send a message to the background page but cannot figure out how to tell the background page to which frame to relay the message. The parent window content script does not know the chrome extension frameId associated with the child frame element it encountered in the DOM walk. So it cannot tell the background page how to direct the message.
For point 2, I have tried two techniques that I found here on Stack Overflow:
Using the concept described in this question: in the parent window, determine the iframe's position in the window.frames array and post a message to the background page with this index. The background page posts a message to all frames with the desired index in the message data. Only the iframe whose window object position in the window.parent.frames array matches the index received from the message proceeds with its walk. This works OK but is vulnerable to changes in the window.frames array during the asynchronous messaging process (if an iframe is deleted after the message is sent, the index value may no longer match the desired frame).
Instead of the index value from point 1, use frameElement.name in the parent window. With the same messaging technique, send the name to the child iframe for comparison to its window.name value. I believe window.name gets its value from frameElement.name at the time of the iframe element's creation. However, since I don't control the frame element creation, the name attribute is often an empty string and can't be relied on to uniquely match iframe elements to their windows.
Is there a way for me to reliably send a message to a content script associated with an iframe element found in walking a DOM tree?
When you call chrome.runtime.sendMessage from a content script, the second parameter of the chrome.runtime.onMessage listener ("sender") includes the properties url and frameId.
You can send a message (from an extension page, e.g. the background page) to a specific frame using chrome.tabs.sendMessage with the given frameId.
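A minimal sketch of such a relay in the background page follows. The message shape { targetFrameId, payload } and the function name installFrameRelay are assumptions made for this example; the sender still has to learn the target frame ID some other way (for example from a sender.frameId that the child reported earlier).

```javascript
// Background page: relay a message from one frame to another frame in
// the same tab. `api` is the extension `chrome` object; the message
// shape { targetFrameId, payload } is an assumption of this sketch.
function installFrameRelay(api) {
  api.runtime.onMessage.addListener(function (message, sender, sendResponse) {
    if (!message || !sender.tab || typeof message.targetFrameId !== 'number') {
      return false; // not a relay request
    }
    api.tabs.sendMessage(
      sender.tab.id,
      { fromFrameId: sender.frameId, payload: message.payload },
      { frameId: message.targetFrameId },
      sendResponse
    );
    return true; // keep sendResponse valid for the asynchronous reply
  });
}

// Assumed usage in the background page:
// installFrameRelay(chrome);
```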
If you want to know the list of all frames (and their frame IDs) at any time, use the chrome.webNavigation.getAllFrames. If you do that, then you can construct a tree of the frames in a tab, and then send this information to all frames for further processing.
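Building the tree from the getAllFrames result could look like the sketch below. The helper name buildFrameTree is hypothetical; it relies on each entry carrying frameId, parentFrameId (-1 for the top-level frame) and url.

```javascript
// Build a parent -> children tree from the array passed to the
// chrome.webNavigation.getAllFrames callback.
function buildFrameTree(frames) {
  const list = frames || []; // the callback may receive null for a bad tab
  const nodes = new Map();
  for (const f of list) {
    nodes.set(f.frameId, { frameId: f.frameId, url: f.url, children: [] });
  }
  let root = null;
  for (const f of list) {
    const parent = nodes.get(f.parentFrameId);
    if (parent) {
      parent.children.push(nodes.get(f.frameId));
    } else if (f.parentFrameId === -1) {
      root = nodes.get(f.frameId); // the top-level frame has no parent
    }
  }
  return root;
}

// Assumed usage in the background page:
// chrome.webNavigation.getAllFrames({ tabId }, (frames) => {
//   console.log(buildFrameTree(frames));
// });
```

The snapshot can go stale as soon as a frame navigates or is removed, so treat the tree as a hint rather than a guarantee.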
Reliable postMessage / onMessage
frameElement.contentWindow.postMessage: this works most of the time, but not always. Sometimes the message is never received by the content script message event listener associated with the iframe window. I have not been able to confirm the cause but I think it is listeners attached before my listener calling event.stopImmediatePropagation()
This can be countered by running your script at "run_at":"document_start" and immediately registering the message event listener. Then your handler will always be called first and the page cannot cancel it via event.stopImmediatePropagation(). However, do not blindly trust the information from other frames, and always verify the message (e.g. by communicating with the other frames via the background page).
Combining both approaches
The first method offers a secure way to exchange data between frames, but does not offer a general way to link the frame to a specific DOM element.
The second method allows you to target a specific (i)frame element, but any web page can do that and therefore the method on its own is not reliable.
By combining both, you get a secure communication channel that is linked to a DOM element.
This is a basic example that applies the above methods to communicate between frames A and B:
Content script in A:
Send a message to the background page (e.g. a message including the index of frame B).
Background page:
Receive the message from A.
Generate a random nonce, say R (crypto.getRandomValues).
Store a mapping from R to frameId (and optionally other information that was included in the message from A).
Call the response callback with this random value.
Content script in A:
Receive R from the background page.
Call postMessage on frame B and pass R.
Content script in B:
Receive R from A.
Send a message to the background page to retrieve the frameId (and optionally other information from A).
Note: For a rock-solid application, you need to account for the fact that the frame may be removed during any of those steps. If you neglect the asynchronous nature of this process, you may leave your application in an inconsistent state.
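The background-page part of these steps can be sketched as a small nonce registry. This is an illustration under assumptions, not the answer's exact code: createNonceRegistry is a hypothetical name, and I am reading the steps as "store A's frameId under R, then let B redeem R once".

```javascript
// Background-page side of the handshake: issue a random nonce R for a
// frameId and let the holder of R redeem it exactly once.
// `cryptoApi` is anything with getRandomValues (e.g. the global `crypto`).
function createNonceRegistry(cryptoApi) {
  const pending = new Map(); // R -> { frameId, extra }
  return {
    issue(frameId, extra) {
      const bytes = cryptoApi.getRandomValues(new Uint8Array(16));
      const nonce = Array.from(bytes, (b) => b.toString(16).padStart(2, '0')).join('');
      pending.set(nonce, { frameId, extra });
      return nonce;
    },
    redeem(nonce) {
      const entry = pending.get(nonce) || null;
      pending.delete(nonce); // single use, so a replayed R is rejected
      return entry;
    },
  };
}

// Assumed usage in the background page:
// const registry = createNonceRegistry(crypto);
// On A's message: sendResponse(registry.issue(sender.frameId, message));
// On B's message: registry.redeem(message.nonce); sender.frameId is then B's frame ID.
```

Deleting the entry on redemption also keeps the map from growing when a frame disappears mid-handshake, although stale entries from frames that never redeem would still need a timeout.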
tl;dr
My answer describes a CORS-proof solution for the specific case where the child frame has user focus. In many use cases, the frame we want to interact with is the one that has focus.
As of 2022, Firefox and Safari already have a CORS-proof standard API for this, so if you're targeting them, consider using the standard API instead.
My solution does not use window.postMessage or cryptographic random values.
Prerequisites
The frame to which you are sending a message needs to be a 'valid frame'. A 'valid frame' is:
a frame which has user focus, OR
parent of another valid frame
For simplicity of discussion, I'll assume:
we have <all_urls> host permission
we are working inside only one tab
Glossary
A frame tree is such that each frame has a unique parent but can have multiple children. The depth of a frame is:
Zero if it is the root document, OR
One plus depth of its parent frame
Procedure
Step 1. Track depth of each frame
tl;dr: When a frame loads, we record its depth in the frame tree.
I will assume your content script (CS) already injects itself into each iframe on the page. As soon as it is injected, the CS needs to report its own frame depth to the background page (BG). Using this information, BG will maintain the list of frame IDs at each depth level.
CS can get its own depth by using recursive algorithm similar to the one described here
BG can access sender.frameId in the onMessage listener to correctly get the frame Id.
BG now has a list reportedFrameDepths (for example) where reportedFrameDepths[depth] is a list/set of all frameIds at that depth.
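Step 1 could be sketched as follows. The linked algorithm is not reproduced here, so this uses a simple iterative walk up window.parent instead; the function names and the 'frame-depth' message type are hypothetical.

```javascript
// Content script: depth is the number of parent hops to the top window.
// window.parent is readable across origins; the top window is its own parent.
function getFrameDepth(win) {
  let depth = 0;
  while (win.parent && win.parent !== win) {
    win = win.parent;
    depth++;
  }
  return depth;
}

// Background page: remember which frame IDs were reported at each depth.
function recordFrameDepth(reportedFrameDepths, depth, frameId) {
  if (!reportedFrameDepths[depth]) reportedFrameDepths[depth] = new Set();
  reportedFrameDepths[depth].add(frameId);
  return reportedFrameDepths;
}

// Assumed wiring:
// CS: chrome.runtime.sendMessage({ type: 'frame-depth', depth: getFrameDepth(window) });
// BG: chrome.runtime.onMessage.addListener((msg, sender) => {
//   if (msg.type === 'frame-depth') recordFrameDepth(depths, msg.depth, sender.frameId);
// });
```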
Step 2. Check which child frame is focused
tl;dr: Given a frame, we can find which one of its child frames is focused.
We can enumerate all candidate children of this frame by checking reportedFrameDepths[depth + 1], where depth is the frame depth of this frame. Only one of the frames in this list should have user focus.
The focused child will have non-null document.activeElement value, and document.hasFocus() will be true. We need to check the latter as in certain cases (for example, mail.google.com), document.activeElement is set to a non-focused element (<body>) for many frames.
So we can send a message to all the candidate frames (specify { frameId } in options field of tabs.sendMessage) and get a boolean response from them if they have focus. The one frameId that responds true should be the intended focused child frame.
Step 3. Repeat step 2 recursively.
If you can find the focused child A of a given frame, you can also find the focused child B of that focused child A.
Repeating step 2 starting from root document will lead you to the deepest focused child. The recursion stops when there is no further focused child.
This is the end, you now have the frame ID of the deepest focused child. You can now send a message directly to this frame.
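Steps 2 and 3 together could look roughly like this. It is a sketch under assumptions: askHasFocus is a hypothetical async function that asks one frame whether it has focus (for example by wrapping chrome.tabs.sendMessage with { frameId }), and it should resolve to false rather than reject if the frame has gone away.

```javascript
// Content script side: does this frame currently hold user focus?
function frameHasFocus(doc) {
  return doc.activeElement !== null && doc.hasFocus();
}

// Background side: walk down one depth level at a time and follow the
// single candidate that reports focus.
async function findDeepestFocusedFrame(reportedFrameDepths, askHasFocus) {
  let current = 0; // frameId 0 is the top-level frame in Chrome
  for (let depth = 1; depth < reportedFrameDepths.length; depth++) {
    const candidates = [...(reportedFrameDepths[depth] || [])];
    const answers = await Promise.all(candidates.map((id) => askHasFocus(id)));
    const index = answers.indexOf(true);
    if (index === -1) break; // no focused frame at this level: stop
    current = candidates[index];
  }
  return current;
}
```

Asking all candidates at one depth in parallel keeps the total latency close to one round trip per level.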
Gotchas
This is not a trivial solution to implement, due to:
The amount of asynchronicity. There is a long messaging chain across multiple CS and the BG. Ensure your code can cope if the messaging chain is interrupted midway because some other part of the code crashed.
Tabs and frames can reload, navigate away or destroy themselves. Make sure your implementation handles these cases. Especially be wary of caching or data stores as they can become obsolete.
That said, I have implemented the solution for a similar use case and it works reliably and fast enough (the extra overhead I observed is less than 5ms). The exact implementation will vary depending on your product's needs, and the above explanation should serve as a good reference.