I'm using Python + Selenium + Splinter + Firefox to create an interactive web crawler.
The python script offers the options, then Selenium opens Firefox and sends some orders.
Right now, I need to let the python script know the web element that the user wants to interact with.
The method I currently use is:
Right-click the item on the website (in Firefox), click 'Inspect Element', then, in the Firefox inspector, click 'Copy HTML' and feed the result manually to the script, which can then carry on.
But for obvious reasons this process is far from perfect.
I know nothing of javascript, but after reading other questions I get the feeling that javascript could actually be the solution.
Splinter can run JavaScript and pass the return values back to the Python script, so, theoretically:
Would it be possible to run JavaScript code that returns the HTML of the next element the user clicks, so that the whole procedure is reduced to right-clicking the desired element?
Clarification for Amey's comment:
The Python script opens a Firefox window and retains control of it.
With Splinter, JavaScript code can be executed, and the script can wait for it to complete or return information.
This means that the Python script can ask the user to click or right-click in the Firefox window that it owns, so the aim would be to launch JavaScript that "catches" whichever element the user clicks on.
Would that be enough for javascript to catch the desired element?
This was an interesting question. My strategy was to use Javascript to add listeners to the elements you're targeting. Since you didn't specify what types of elements, I used links. This could easily be adapted though.
When an element is clicked, the listener creates a new page element with an ID you specify and sets the value attribute to the relevant information.
Then, assuming you've set driver.implicitly_wait, you can just wait for the element to appear.
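driver.execute_script("""
    // On every link, record a click by appending a marker element to the page.
    for (var i = 0; i < document.links.length; i++) {
        document.links[i].onclick = function clicked() {
            var e = document.createElement('a');
            e.setAttribute('id', 'myUniqueID');
            e.setAttribute('value', this);  // a link stringifies to its href
            document.getElementsByTagName('body')[0].appendChild(e);
        };
    }
""")
# With an implicit wait set, this blocks until the marker element exists.
clicked = driver.find_element_by_id('myUniqueID').get_attribute('value')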
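If what you want is the HTML of whatever element the user clicks (not only links), the same idea can be sketched with a single document-level listener. This is only a sketch and has not been tested against Splinter. It assumes `driver` is the underlying Selenium WebDriver (with Splinter, `browser.driver`), or any object with an `execute_script` method. The names `window.__pickedHtml`, `INSTALL_PICKER_JS`, `READ_PICK_JS` and `wait_for_pick` are made up for illustration.

```python
import time

# JavaScript run in the page: remember the outerHTML of the next clicked element.
# window.__pickedHtml is a made-up name used only by this sketch.
INSTALL_PICKER_JS = """
window.__pickedHtml = null;
document.addEventListener('click', function handler(event) {
    event.preventDefault();            // do not follow links while picking
    event.stopPropagation();
    window.__pickedHtml = event.target.outerHTML;
    document.removeEventListener('click', handler, true);
}, true);                              // capture phase: runs before page handlers
"""

READ_PICK_JS = "return window.__pickedHtml;"


def wait_for_pick(driver, timeout=30.0, poll=0.25):
    """Install the click listener, then poll until the user has clicked something.

    `driver` only needs an execute_script(script) method, as Selenium drivers have.
    Returns the clicked element's outerHTML, or None if the timeout expires.
    """
    driver.execute_script(INSTALL_PICKER_JS)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        html = driver.execute_script(READ_PICK_JS)
        if html is not None:
            return html
        time.sleep(poll)
    return None
```

The script would call `wait_for_pick(driver)` right after asking the user to click the element. Polling from Python keeps the JavaScript side trivial, at the cost of a small delay between the click and the result.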
First off, I want to say that I have very little knowledge of coding, so please bear with me. I'm trying to paste into a site that doesn't allow it. This is the link to the JavaScript that they use to block it: https://mychatdashboard.com/js/messages.js?v=1.3
A friend of mine is helping me with it, and he suggested that I put this in the JavaScript console in Google Chrome's DevTools:
handler = function(e){ e.stopImmediatePropagation(); return true; }
document.querySelector('#conversation-content .conversation-message-text').addEventListener('keyup', handler, true)
document.querySelector('#conversation-content .conversation-message-text').addEventListener('input', handler, true)
This does solve the problem, but it creates another issue. It seems to interfere with this section of the JavaScript that I linked to:
* Function to update the messagebox. (Enable/disable send button,
* change the color class, update the counter)
* #return void
When a message is typed in the textbox, a character counter at the top shows how many characters have been written. Once 80 characters (I think it's 80) are typed, the send button is enabled so that I can send the message. However, the JavaScript code that my friend suggested stops the counter from working altogether, so the send button never gets enabled.
Is there any way around this? Please let me know if further clarifications are needed since it's the first time I'm asking a question of this nature.
The JavaScript you're entering into the DevTools console defines a function named handler and then adds it as an event handler for keyup and input events on a field of the page you're viewing (presumably the chat window textbox).
Because the handler calls stopImmediatePropagation, it stops those events from reaching the handlers that would run after it (such as the one that enables the send button once you've typed enough characters).
For this site (and I haven't been able to test it), instead of the code you've used, you could try running this in the DevTools console once the page has loaded:
restrictCopyPasteByKeyboard = function () { return true; };
This should redefine the function that's preventing you from using paste (I can't test it out because I can't access that site).
There are numerous ways to copy content from right-click-protected sites:
By disabling JavaScript in the browser
Using proxy sites
By using the source code of the site
Disabling JavaScript in Browsers [Google Chrome]
In the Chrome browser, you can quickly disable JavaScript by going to Settings. See the screenshot for a better explanation:
screenshot
Through Viewing Source Code
If you have to copy specific text content and can deal with the HTML tags, you can use the browser's view-source option. All the major browsers offer a way to view the source of the page, which you can access directly using the format below or by right-clicking. Since right-clicking is out of the question here, simply open Chrome and type view-source: before the post URL, like:
view-source:Enable copy and paste for a site that doesn't allow it
Or press Ctrl+U.
Then find the paragraph or text you want to copy and paste it into any text editor.
I'm sure there are many ways of restricting a user's ability to copy/paste. In my experience, it's always been a JS function that you can overwrite.
Slight variations of the below have always worked for me:
document.getElementById("ElementWithDisabledPaste").onpaste = null
First I would like to describe the motivation for my question.
I have a complex web page to test with Selenium + HtmlUnit, which runs various JavaScript scripts. The problem I describe should be quite common.
On the page there is a button to which jQuery binds a click callback (click event handler) after the page is loaded. In the test code there is an explicit wait (a Selenium term) for the button to become clickable, so as soon as the button becomes clickable, Selenium clicks it. Often, however, this happens before jQuery has attached the click event handler to the button, and in that case the Selenium test fails.
My idea is to preprocess the web page accessed by HtmlUnit before JavaScript starts executing on it, injecting some <script>myownscript()</script> at the beginning of the page (so that it executes before any other script on that page). Then, by checking certain conditions in the Selenium test code, I would be able to know exactly when the click event handler has been attached (how exactly I do this depends on the details of the application). If I make Selenium click the button only then, the presence of the click event handler is guaranteed, and the test proceeds as planned, with no errors due to a missing click event handler.
Let us leave aside the question of whether the idea is a good or a bad one (a much simpler one, of course, would be to introduce a large enough delay in the Selenium test code before trying to click the problematic button, but then the overall duration of the tests might become a problem, because the issue I described is present on many pages of the application being tested).
Are there hooks in Selenium/HtmlUnit that allow preprocessing the page fetched from the server, injecting a script as I described, before JavaScript starts executing on the page?
In this case, you can use JavascriptExecutor. You can put a function that does anything you want in the String script.
WebElement button = driver.findElement(By.id("my-button"));
JavascriptExecutor jsExe = ((JavascriptExecutor) driver);
String script = "console.log(arguments[0].id); return arguments[0].id";
Object oj = jsExe.executeScript(script, button);
String txt = oj.toString();
System.out.println(txt);
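For the original problem (clicking before jQuery has bound its handler), the same script-execution hook can be used to poll for the handler instead of injecting code ahead of the page's scripts. The sketch below uses Selenium's Python bindings for brevity; `executeScript` in Java accepts the same script and arguments. It assumes the page's jQuery exposes the internal `jQuery._data(element, 'events')` lookup. That is not a public API, it may change between jQuery versions, and it has not been checked under HtmlUnit. `HAS_CLICK_HANDLER_JS` and `wait_for_click_handler` are hypothetical names.

```python
import time

# Returns true once jQuery has at least one click handler bound to arguments[0].
# jQuery._data is an internal jQuery function; relying on it is an assumption.
HAS_CLICK_HANDLER_JS = """
var events = window.jQuery && jQuery._data(arguments[0], 'events');
return !!(events && events.click && events.click.length > 0);
"""


def wait_for_click_handler(driver, element, timeout=10.0, poll=0.1):
    """Poll until a jQuery click handler is attached directly to `element`.

    `driver` needs execute_script(script, *args), as Selenium drivers have.
    Returns True when a handler is seen, False if `timeout` seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script(HAS_CLICK_HANDLER_JS, element):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

The test would call `wait_for_click_handler(driver, button)` and click only when it returns True. Note that this only sees handlers bound directly to the element; a delegated handler bound on an ancestor would need the check to run against that ancestor instead.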
Be careful if you want to use asynchronous code such as setTimeout(): executeScript will return immediately without waiting for it. See an example of an async method in my answer at: method execute_script don't wait end of script to return value with selenium in python
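To illustrate that caveat: with a synchronous script, a value produced inside `setTimeout` is never returned, whereas `execute_async_script` (`executeAsyncScript` in Java) waits until the script calls the callback that WebDriver passes as the last argument. A minimal sketch in Python, assuming `driver` is a Selenium WebDriver; the helper name `run_delayed`, the constant `DELAYED_JS` and the delay values are arbitrary choices for this example.

```python
# The callback that ends an async script is always the last entry in `arguments`.
DELAYED_JS = """
var done = arguments[arguments.length - 1];
var delayMs = arguments[0];
setTimeout(function () {
    done('finished after ' + delayMs + ' ms');
}, delayMs);
"""


def run_delayed(driver, delay_ms=500, script_timeout=5):
    """Run a script that finishes inside setTimeout and return its result.

    The script timeout (in seconds) must exceed the delay, otherwise
    WebDriver gives up before the callback is called.
    """
    driver.set_script_timeout(script_timeout)
    return driver.execute_async_script(DELAYED_JS, delay_ms)
```

Whatever is passed to `done(...)` becomes the return value on the Python side.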
Is there a way to find out which JS script created a dynamic element in Chrome's Developer Tools? If I do 'view page source' on the page, the element isn't there. I can see the element though in Chrome's Developer Tools. Is there a way to find out specifically which JavaScript file and what line in my JavaScript file created the element?
To help clarify: I know which element is created...what I don't know is which .js file created it and specifically what line in that .js file
Updated answer:
Below you've said:
I know which element it is...what I don't know is which .js file created it and specifically what line in that .js file
That's not how the question originally read. :-)
If you know which element it is, two options for you:
You can use Dev Tools to trigger a breakpoint when its parent element is modified:
Load the page
Open Dev Tools
Go to the Elements panel
Navigate to the parent element that the target element will eventually be added to
Right-click the parent element and click Break on... > Subtree Modifications
Now, Chrome will trigger a breakpoint when the parent element's subtree is modified, and so you can see what JavaScript code is adding the element.
Unfortunately, it won't fire that breakpoint if the element is added during the main loading of the page (e.g., during the parsing of the HTML, by script that runs immediately rather than waiting).
If there's any text in the element that seems specific to it (content, id, class, some attribute, whatever), once the page is loaded you can use Chrome's powerful search feature to try to find that text:
Load the page
Open Dev Tools
Go to the Sources tab
Press Ctrl+Shift+F, which is "find in files": it looks in all of the files associated with the page, not just the "current" file
Type the text that you think might help you identify the code adding the element
Press Enter; all matches will be shown below
You can even use regular expressions.
Original answer:
No, there's no simple way to differentiate an element created via JavaScript after page load from ones created by the initial HTML parsing.
Or at least, there isn't without adding JavaScript to the page that runs before any other JavaScript on the page runs, which I'm guessing is a requirement.
But if you can add JavaScript to the page before any other JavaScript runs, it's actually really easy to do:
Array.prototype.forEach.call(document.querySelectorAll("*"), function(element) {
element.setAttribute("data-original", "");
});
That marks every single element on the page with an attribute that tells you it was there when that code ran. You can see those attributes in the Elements panel of the Dev Tools. And so, if you see an element that doesn't have that attribute, you know it was added later.
document.querySelectorAll("*") is a big hammer you probably wouldn't want to use in production code, but for temporary use when debugging/developing, it's fine.
And if you want to know about the elements that have been created by other code later, you can do this in the console:
Array.prototype.forEach.call(document.querySelectorAll("*"), function(element) {
if (element.getAttribute("data-original") === null) {
console.log(element);
}
});
That'll output every element that wasn't on the page when you ran the earlier code, and Chrome's console is really cool — you can right-click the element display in the console and choose "Reveal in Elements panel" to see exactly where that element is.
You can use an experimental feature of the Chrome DevTools Protocol.
See https://chromedevtools.github.io/devtools-protocol/tot/DOM/#method-getNodeStackTraces
First, send 'DOM.setNodeStackTracesEnabled' to the Chrome DevTools Protocol.
Second, send the 'DOM.getNodeStackTraces' message.
This gives you the call stack that was recorded when the dynamically created element was created.
I wrote my own program using these functions.
Image: https://imgur.com/a/TtL5PtQ
Here is my project: https://github.com/rollrat/custom-crawler
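The two protocol messages above could be sent from any DevTools Protocol client. The sketch below assumes one particular client, Selenium's Chrome driver in Python, whose `execute_cdp_cmd(name, params)` method sends a raw protocol command; this is not how the linked project does it. It also assumes stack capture is enabled before the page's scripts create the node (for example, enable it and then load or reload the page). The function names are hypothetical, and the feature is marked experimental in the protocol, so the response shape may change.

```python
def enable_stack_capture(driver):
    """Ask Chrome to record a stack trace for every DOM node created from now on."""
    driver.execute_cdp_cmd("DOM.enable", {})
    driver.execute_cdp_cmd("DOM.setNodeStackTracesEnabled", {"enable": True})


def creation_stack(driver, css_selector):
    """Return the call frames recorded when the first matching node was created.

    Returns an empty list if nothing matches or no creation trace was recorded
    (for instance because capture was enabled after the node already existed).
    """
    root = driver.execute_cdp_cmd("DOM.getDocument", {})["root"]
    node = driver.execute_cdp_cmd(
        "DOM.querySelector",
        {"nodeId": root["nodeId"], "selector": css_selector},
    )
    if not node.get("nodeId"):  # a falsy nodeId is treated as "no match"
        return []
    traces = driver.execute_cdp_cmd(
        "DOM.getNodeStackTraces", {"nodeId": node["nodeId"]}
    )
    frames = traces.get("creation", {}).get("callFrames", [])
    # Protocol line numbers are 0-based; add 1 to match what editors show.
    return [
        "{} ({}:{})".format(
            frame.get("functionName") or "<anonymous>",
            frame["url"],
            frame["lineNumber"] + 1,
        )
        for frame in frames
    ]
```

Each returned entry names the function, script URL and line that took part in creating the element, which is the ".js file and line" information the question asks for.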
I have a Selenium WebDriver test where I enter some text into a text input box
var input_Note = Driver.Instance.FindElement(By.Id("note"));
input_Note.SendKeys("test");
I then attempt to click on the Save button, but it does not work. I was previously using Coded UI where there is a SetFocus element that points the focus towards whichever element you are targeting. Is there something similar in Selenium?
var button_Save = Driver.Instance.FindElement(By.Id("save"));
button_Save.Submit();
Sometimes, depending on how the page loads, the element will exist, then not exist, then exist again. I have found that waiting on the element is a good idea, and sometimes putting two waits back to back for the same element solves the issue (a proper programmatic fix for the double wait would be preferable; the screen is flashing too much at that point). This really depends on the load pattern of your application, though.
WebDriverWait wait = new WebDriverWait(_driver, TimeSpan.FromSeconds(10));
wait.Until(d => d.FindElement(By.Id("save")));
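The "double wait" could be folded into one helper that only returns once the element has been found on several consecutive polls. What follows is a language-neutral sketch written in Python (the code in this thread is C#), not something taken from Selenium itself. `wait_until_stable` and its parameters are hypothetical, and `find` stands for any callable that returns the element or raises when it is absent, such as a small wrapper around FindElement.

```python
import time


def wait_until_stable(find, required_hits=2, timeout=10.0, poll=0.2):
    """Call `find()` repeatedly until it succeeds `required_hits` times in a row.

    `find` returns the element, or raises any exception when it is missing.
    A failure resets the streak, which covers an element that appears,
    disappears, and appears again while the page loads.
    Raises TimeoutError if the streak is not reached within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    hits = 0
    while True:
        try:
            element = find()
            hits += 1
            if hits >= required_hits:
                return element
        except Exception:
            hits = 0  # the element vanished again, so start counting afresh
        if time.monotonic() >= deadline:
            raise TimeoutError("element was not stable within %.1f s" % timeout)
        time.sleep(poll)
```

Requiring consecutive successes is a heuristic: it reduces the chance of acting on an element that is about to be re-rendered, but it cannot rule it out.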
I would also use Click() rather than Submit() if it is a button. In general it should be a submit action, but it doesn't always have to be. There may also be situations where you need to give the element focus first (the page shouldn't be programmed that way, but in case it is), in which case you can use the Actions class to move the mouse to the element and then click it.
// C# example (the Actions constructor needs the driver):
var actions = new OpenQA.Selenium.Interactions.Actions(_driver);
actions.MoveToElement([Instance of Web Element goes here]).Perform();
actions.Click([Instance of Web Element goes here]).Perform();
In general you could just use actions.Click, but I figured I would give both.
One of the above should work just fine. If it does not work please provide a specific error message you get with Selenium and the specific html structure of the page being utilized.
On Stack Overflow, here and here, I found ways to add a breakpoint to every method of a class, but I can't find a way to add a breakpoint to every method of a jQuery/JavaScript file.
This is exactly what I am trying to achieve. When I click on a checkbox in a custom control gridview (ASP.NET), the entire row gets highlighted. In the generated HTML, the row is nested under many other elements with their own ids and classes. Some jQuery code, possibly within this 500 KB jQuery file, subscribes to an event of one of those tags, based on either id or class. If I can find a way to add a breakpoint to every method, I can pinpoint which method is responsible for highlighting the row.
(What I have gathered from the generated HTML is that a jQuery function assigns a CSS class to the selected row.)
Here is a link for how to debug javascript within Visual Studio:
http://weblogs.asp.net/scottgu/archive/2007/07/19/vs-2008-javascript-debugging.aspx
However, setting a break-point on every single method and waiting for one of them to hit is not the correct way to do debugging. You should focus on the events which are fired after the row is selected. You can do this by looking at the javascript which was written to interact with the gridview.
One place to start would be to open the solution in IE and open the developer tools by pressing F12. Using those tools will get you where you want to be.
P.S. Developer tools in IE also allow you to do javascript debugging right there in the browser.