I am trying to click an element inside a shadow DOM using Selenium WebDriver for Node.js. I can currently access the shadow root, and I can also find the element inside it. I thought the last step, clicking it, would be straightforward, but I can't work out how to do it. There is not much documentation on shadow DOM support for Node.js.
async function getshadowDOM(driver) {
// This gets the 1st Shadow Root
const shadowHost = await driver.findElement(By.css("#container > div.sf_common_comp-Page__header > div > xweb-shellbar"));
const shadowRoot = await driver.executeScript("return arguments[0].shadowRoot", shadowHost);
//This gets the 2nd Shadow Root
const shadowHost2nd = await shadowRoot.findElement(By.css("#shellbarContainer"));
const shadowRoot2 = await driver.executeScript("return arguments[0].shadowRoot",shadowHost2nd);
//Clicks the element in 2nd Shadow DOM
const elem = await shadowRoot2.findElement(By.css("div > div.ui5-shellbar-overflow-container.ui5-shellbar-overflow-container-left > button"));
await elem.click();
}
module.exports = getshadowDOM;
I think I found a solution, using Node.js and Selenium.
It did not work well with return, so I handled it like this:
const shadowHost = await driver.findElement(By.css("body > div.project > sn-component-va-web-client"));
const shadowRoot = await driver.executeScript("return arguments[0].shadowRoot", shadowHost);
const elem = await shadowRoot.findElement(By.css("div.menu-item.contact-support-clicker"));
await elem.click();
In my case it works.
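The same idea can be generalised to any nesting depth. What follows is only a sketch and not something from the posts above: clickInNestedShadow is a hypothetical helper name, and it assumes every host exposes an open shadow root. It resolves the whole chain inside the browser with one executeScript call and hands only the final element back to Node.

```javascript
// Runs in the browser: walks each shadow host in turn, then looks up the target.
// arguments[0] is the list of host selectors, arguments[1] the target selector.
const FIND_IN_SHADOW = `
  let root = document;
  for (const hostSelector of arguments[0]) {
    const host = root.querySelector(hostSelector);
    if (!host || !host.shadowRoot) return null;
    root = host.shadowRoot;
  }
  return root.querySelector(arguments[1]);
`;

// Hypothetical helper: `driver` is a selenium-webdriver WebDriver instance.
async function clickInNestedShadow(driver, hostSelectors, targetSelector) {
  const elem = await driver.executeScript(FIND_IN_SHADOW, hostSelectors, targetSelector);
  if (!elem) {
    throw new Error(`No element matched "${targetSelector}" inside the shadow hosts`);
  }
  await elem.click();
  return elem;
}
```

For the question above, the call would look roughly like clickInNestedShadow(driver, ["xweb-shellbar", "#shellbarContainer"], "button"), with the selectors adjusted to the real page.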
Is there a way to traverse through html elements in playwright like cy.get("abc").find("div") in cypress?
In other words, any find() equivalent method in playwright?
page.locator("abc").find() is not a valid method in playwright though :(
If you assign the parent locator to a variable, any lookup you call on it (locator() or one of the getBy* methods) searches only inside that parent element. In the example below the parent is a div, the child is a button, and .locator() is used to find both.
test('foo', async ({ page }) => {
await page.goto('/foo');
const parent = await page.locator('div');
const child = await parent.locator('button');
});
You can also just combine the selectors. This resolves to any div below abc:
page.locator("abc div")
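If the chain of selectors is only known at runtime, the nesting can be built step by step. This is a small sketch under that assumption; findNested is a made-up helper name, not a Playwright API.

```javascript
// Hypothetical helper: narrows the search scope one selector at a time.
// `page` is a Playwright Page; each step calls .locator() on the previous result.
function findNested(page, ...selectors) {
  return selectors.reduce((scope, selector) => scope.locator(selector), page);
}
```

Calling findNested(page, "abc", "div") should be equivalent to page.locator("abc").locator("div").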
Let's consider the website https://www.example.com with the following HTML
<body style="">
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
<p>
<a href="https://www.iana.org/domains/example">More information...</a>
</p>
</div>
</body>
As mentioned by @agoff, you can use a nested locator, page.locator('p').locator('a'), or you can specify the nested selector directly in a single locator, page.locator('p >> a').
// @ts-check
const playwright = require('playwright');
(async () => {
const browser = await playwright.webkit.launch();
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://www.example.com/');
// Here res1 and res2 produce the same output
const res1 = await page.locator('p').locator('a'); // nested locator
const res2 = await page.locator('p >> a'); // directly specifying the selector to check the nested elements
const out1 = await res1.innerText();
const out2 = await res2.innerText();
console.log(out1 == out2); // output: true
console.log(out1); // output: More information...
console.log(out2); // output: More information...
// Traversal
const result = await page.locator('p');
const elementList = await result.elementHandles(); // elementList will contain list of <p> tags
for (const element of elementList)
{
const out = await element.innerText()
console.log(out);
}
// The above will output both the <p> tags' innerText i.e
// This domain is for use in illustrative examples in...
// More information...
await browser.close();
})();
Since you mentioned that you need to traverse through the HTML elements, elementHandles can be used to traverse through the elements specified by the locator as mentioned in the above code.
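As an alternative to element handles, the traversal can also stay with locators. The sketch below assumes the installed Playwright version provides locator.count() and locator.nth(); collectTexts is a hypothetical helper name.

```javascript
// Hypothetical helper: reads the text of every element a Playwright locator matches.
async function collectTexts(locator) {
  const texts = [];
  const total = await locator.count();
  for (let i = 0; i < total; i += 1) {
    // nth(i) narrows the locator to the i-th match (zero-based).
    texts.push(await locator.nth(i).innerText());
  }
  return texts;
}
```

With the page above, collectTexts(page.locator('p')) would be expected to resolve to the text of both paragraphs.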
I need to check that a button is disabled (checking for a last page of a table). There are two with the same id (top and bottom of the table).
const nextPageButtons = await this.page.$$('button#_btnNext'); // nextPageButtons.length is 2, checked via console.log
const nextPageButtonState = await nextPageButtons[0].isDisabled();
But when I do the above I get: elementHandle.isDisabled: Unable to adopt element handle from a different document.
Why doesn't this work?
So, this works:
const nextPageButtons = await this.page.$$('button#_btnNext');
const nextPageButton1 = await nextPageButtons[0];
const nextPageButton1State = await nextPageButton1.isDisabled();
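A locator-based variant of the same check is sketched below. It is an assumption that this behaves better in the setup above; it has not been tested against the page in question, and isOnLastPage is a hypothetical name.

```javascript
// Hypothetical helper: true when the first "next page" button is disabled.
// `page` is a Playwright Page; the default selector comes from the question above.
async function isOnLastPage(page, selector = 'button#_btnNext') {
  // first() picks the top button when the selector matches more than one element.
  return page.locator(selector).first().isDisabled();
}
```

Because a locator is resolved again each time it is used, this avoids holding on to an element handle between steps.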
I'm running some Node.js code to scrape a website and return some text from this part of the html:
And here's the code I'm using to get it
const fs = require('mz/fs');
const xpath = require('xpath');
const parse5 = require('parse5');
const xmlser = require('xmlserializer');
const dom = require('xmldom').DOMParser;
const axios = require('axios');
(async () => {
const response = await axios.get('https://www.aritzia.com/en/product/sculpt-knit-tank-%28arjun-knit-top%29/66139.html?dwvar_66139_color=17388');
const html = response.data;
const document = parse5.parse(html.toString());
const xhtml = xmlser.serializeToString(document);
const doc = new dom().parseFromString(xhtml);
const select = xpath.useNamespaces({"x": "http://www.w3.org/1999/xhtml"});
const nodes = select("//x:div[contains(@class, 'pdp-product-brand')]/*/text()", doc);
console.log(nodes.length ? nodes[0].nodeValue : nodes.length)
})();
The code above works as expected -- it prints Babaton.
But when I swap out the XPath above for one that uses a instead of * (i.e. //x:div[contains(@class, 'pdp-product-brand')]/a/text()), it instead tells me that nodes.length === 0.
I would expect it to give the same result because the div that it's pointing to does in fact have a child anchor tag (see screenshot above). I'm just confused why it doesn't work with a and was wondering if anybody else knew the answer. Thanks!
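One likely explanation, offered as an assumption because no answer is recorded here: in XPath 1.0 an unprefixed name test such as a only matches elements that are in no namespace, while * matches an element in any namespace. Since the document is serialized as XHTML and queried through the x prefix, the anchor step probably needs the prefix as well, i.e. x:a. The sketch below builds such an expression; buildXPath is a hypothetical helper, not part of the xpath package.

```javascript
// Hypothetical helper: prefixes every element step, leaving node tests
// such as text() and the * wildcard untouched.
function buildXPath(steps, prefix) {
  const qualified = steps.map((step) =>
    step === '*' || step.endsWith('()') ? step : `${prefix}:${step}`
  );
  return `//${qualified.join('/')}`;
}

const brandText = buildXPath(
  ["div[contains(@class, 'pdp-product-brand')]", 'a', 'text()'],
  'x'
);
console.log(brandText);
// //x:div[contains(@class, 'pdp-product-brand')]/x:a/text()
```

The resulting string would be passed to select in place of the original expression.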
My application gets a list of IDs from the db. I iterate over these with a cursor and, for every ID, plug it into a URL with Selenium to get specific items on a page. Each page load is a keyword search, and I take the most relevant item for that search. There are around 1000 results from the db. At random iterations, one of the driver actions throws a StaleElementReferenceError with the full message:
stale element reference: element is not attached to the page document\n (Session info: chrome=77.0.3865.75)
Looking at the official docs I can see that the 2 common causes for this are:
The element has been deleted entirely.
The element is no longer attached to the DOM.
With the former being the most frequent cause.
index.js
const { MongoClient, ObjectID } = require('mongodb')
const fs = require('fs')
const path = require('path')
const { Builder, Capabilities, until, By } = require('selenium-webdriver')
const chrome = require('selenium-webdriver/chrome')
require('dotenv').config()
async function init() {
try {
const chromeOpts = new chrome.Options()
const ids = fs.readFileSync(path.resolve(__dirname, '..', 'data', 'primary_ids.json'), 'utf8')
const client = await MongoClient.connect(process.env.DB_URL || 'mongodb://localhost:27017/test', {
useNewUrlParser: true
})
const db = client.db(process.env.DB_NAME || 'test')
const productCursor = db.collection('product').find(
{
accountId: ObjectID(process.env.ACCOUNT_ID),
primaryId: {
$in: JSON.parse(ids)
}
},
{
_id: 1,
primaryId: 1
}
)
const resultsSelector = 'body #wrapper div.src-routes-search-style__container--2g429 div.src-routes-search-style__products--3rsz9'
const mostRelevantSelector = `${resultsSelector}
> div:nth-child(2)
> div.src-routes-search-product-item-raw-style__product--3vH_O:nth-child(1)`
const titleContainerSelector = `${mostRelevantSelector}
> div.src-routes-search-product-item-raw-style__mainPart--1HEWx
> div.src-routes-search-product-item-raw-style__containerText--3NefD
> div.src-routes-search-product-item-raw-style__description--3swql
> div.src-routes-search-product-item-raw-style__titleContainer--tazkH`
const productImageSelector = `${mostRelevantSelector}
> div.src-routes-search-product-item-raw-style__mainPart--1HEWx
> div.src-routes-search-product-item-raw-style__containerImages--1PfdF
> a.src-routes-search-product-item-raw-style__productImage--1Y42Y
> img`
const linkSelector = `${titleContainerSelector} > a`
const primaryIdSelector = `${titleContainerSelector} > p`
chromeOpts.setChromeBinaryPath('/usr/local/bin')
const driver = await new Builder()
.withCapabilities(Capabilities.chrome())
.forBrowser('chrome')
.build()
let newProds = {}
let product
let i = 0
while (await productCursor.hasNext()) {
i += 1
product = await productCursor.next()
let searchablePrimaryId = product.primaryId
let link
let primaryId
let pId
let href
let img
let imgSrc
if (product.primaryId.includes('#')) {
searchablePrimaryId = product.primaryId.substr(0, product.primaryId.indexOf('#'))
}
if (searchablePrimaryId.includes('-')) {
searchablePrimaryId = searchablePrimaryId.substr(0, searchablePrimaryId.indexOf('-'))
}
await driver.get(`https://icecat.biz/en/search?keyword=${encodeURIComponent(searchablePrimaryId.toLowerCase())}`)
link = await driver.wait(until.elementLocated(By.css(linkSelector)), 10000) // wait 10 seconds
img = await driver.wait(until.elementLocated(By.css(productImageSelector)), 10000)
imgSrc = await img.getAttribute('src')
primaryId = await driver.wait(until.elementLocated(By.css(primaryIdSelector)), 10000)
pId = await primaryId.getText()
href = await link.getAttribute('href')
const iceCatId = href.substr(href.lastIndexOf('-') + 1, href.length)
const _iceCatId = iceCatId.substr(0, iceCatId.indexOf('.html'))
const idFound = (searchablePrimaryId.toUpperCase() === pId.toUpperCase()) && !imgSrc.includes('logo-fullicecat')
newProds[product._id.toString()] = {
primaryId: product.primaryId,
iceCatId: idFound ? _iceCatId : 'N/A'
}
}
const foundProducts = Object.values(newProds).filter(prod => prod.iceCatId !== 'N/A')
console.log(`\nFound ${foundProducts.length}/${JSON.parse(ids).length}`)
fs.writeFileSync(path.resolve(__dirname, '..', 'data', 'new_products.json'), JSON.stringify(newProds, null, 4), 'utf8')
driver.quit()
} catch(err) {
throw err
}
}
init()
.then(res => {
console.log(res)
})
.catch(err => {
console.error(err)
})
To debug, I put a try...catch around each of the driver actions to see which specific action was failing, but that didn't help because it was never the same action that failed. For example, sometimes it would be one of the elementLocated lines, and other times it would just be the getAttribute action.
If it is the latter, I don't understand why this error is thrown: surely Selenium has found the element on the page (i.e. link) but is unable to do getAttribute('src') on the element? I imagine I must be doing something wrong in how I set up Selenium to handle the iterations. The iterations never get higher than 110.
In your case the cause is the second one: the element is no longer attached to the DOM. If a WebElement is located and the DOM is refreshed afterwards, that element becomes stale even if the DOM content has not changed; running the same locator again returns a new WebElement.
Normally, driver.get() will block until the page is fully loaded; however, this site runs JavaScript to load the search results. You can test this by running document.readyState in the developer tools console: it reports "complete" while the search results are still loading.
The page shows a spinner before the results are rendered. Hopefully it will be enough to wait for the spinner to appear and then become stale before scraping the page:
await driver.get(`https://icecat.biz/en/search?keyword=${encodeURIComponent(searchablePrimaryId.toLowerCase())}`)
let spinner = await driver.wait(until.elementLocated(By.className('src-routes-search-style__loader---acti')), 10000)
await driver.wait(until.stalenessOf(spinner), 10000)
link = await driver.wait(until.elementLocated(By.css(linkSelector)), 10000)
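If the DOM can still refresh between locating an element and reading from it, another option is to retry the whole locate-and-read step. This is a sketch rather than a tested fix for this site; retryOnStale is a hypothetical helper, and it assumes the thrown error carries the name StaleElementReferenceError, as in the message quoted in the question.

```javascript
// Hypothetical helper: re-runs `action` when Selenium reports a stale element.
// `action` should locate the element and read from it in one go, so that
// each retry works with a freshly located WebElement.
async function retryOnStale(action, attempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await action();
    } catch (err) {
      // Anything other than a stale reference is a real failure: rethrow it.
      if (err.name !== 'StaleElementReferenceError') throw err;
      lastError = err;
    }
  }
  throw lastError;
}

// Possible usage inside the loop from the question:
// href = await retryOnStale(async () => {
//   const link = await driver.wait(until.elementLocated(By.css(linkSelector)), 10000)
//   return link.getAttribute('href')
// })
```

Locating and reading inside the same callback matters here, because retrying only the getAttribute call would reuse the stale reference.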
You are not waiting for the AJAX requests to finish. The website fetches data and refreshes the DOM once you reach the end, and it also keeps calling index every few seconds, so the DOM probably keeps updating. You could probably pause the AJAX requests, get and process your results, and then enable AJAX again.
Could you try removing "await" from imgSrc = await img.getAttribute('src')? The wait for img is already handled on the previous line.
I am trying to have Puppeteer log into a variety of sites.
On one site, the following code works fine:
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.goto('https://www.site1.com');
await page.waitFor('input[name=UserID]');
await page.$eval('input[name=UserID]', el => el.value = 'myusername1');
This does not work on another site. There, the username field has id="username". I ran it with headless: false so that it opens the login page, and from there I got the full JS path to the "username" element, which is:
document.querySelector("body > banno-web > bannoweb-login").shadowRoot.querySelector("div > jha-card > article > bannoweb-login-form").shadowRoot.querySelector("#username")
I am just looking to input the username, but I do not know the syntax of getting to the username field through the DOM. This does not work:
await page.$eval('input[id=username]', el => el.value = 'myusername2');
Using page.evaluate
You can use page.evaluate to run JavaScript on the page itself.
Code Sample
const result = await page.evaluate(() => {
document.querySelector("body > banno-web > bannoweb-login")
.shadowRoot.querySelector("div > jha-card > article > bannoweb-login-form")
.shadowRoot.querySelector("#username")
.value = 'myusername2';
});
Using elementHandle.type
However, complex applications with special event handlers for focus, input, etc. like Angular pages do not work well when the value of an input field is changed like that. To behave more human-like, we should use functions like elementHandle.type instead.
Code Sample
const jsHandle = await page.evaluateHandle(
() => document.querySelector("body > banno-web > bannoweb-login")
.shadowRoot.querySelector("div > jha-card > article > bannoweb-login-form")
.shadowRoot.querySelector("#username")
);
const elementHandle = await jsHandle.asElement();
await elementHandle.type('myusername2');
This code uses page.evaluateHandle and jsHandle.asElement to get the ElementHandle from the JavaScript selector. After that, elementHandle.type is used to fill in the text.
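The same pattern can be wrapped in a reusable function when several fields sit behind nested shadow roots. The sketch below assumes open shadow roots; shadowElement is a hypothetical helper name, and the selectors are passed in as arguments instead of being hard-coded.

```javascript
// Hypothetical helper: `page` is a Puppeteer Page. `hostSelectors` lists the
// shadow hosts from outermost to innermost; `targetSelector` is the final element.
async function shadowElement(page, hostSelectors, targetSelector) {
  const handle = await page.evaluateHandle(
    (hosts, target) => {
      // This callback runs in the browser, so `document` is the page document.
      let root = document;
      for (const hostSelector of hosts) {
        const host = root.querySelector(hostSelector);
        if (!host || !host.shadowRoot) return null;
        root = host.shadowRoot;
      }
      return root.querySelector(target);
    },
    hostSelectors,
    targetSelector
  );
  // asElement() yields null when the handle does not point at a DOM element.
  const element = handle.asElement();
  if (!element) {
    throw new Error(`No element matched "${targetSelector}" inside the shadow hosts`);
  }
  return element;
}
```

For the login form above, this could be called with ["body > banno-web > bannoweb-login", "div > jha-card > article > bannoweb-login-form"] as the hosts and "#username" as the target, followed by type('myusername2') on the returned handle.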