puppeteer: how to wait until an element is visible? - javascript

I would like to know if I can tell puppeteer to wait until an element is displayed.
const inputValidate = await page.$('input[value=validate]');
await inputValidate.click()
// I want to do something like that
waitElemenentVisble('.btnNext ')
const btnNext = await page.$('.btnNext');
await btnNext.click();
Is there any way I can accomplish this?

I think you can use page.waitForSelector(selector[, options]) function for that purpose.
const puppeteer = require('puppeteer');
(async () => {
  // executablePath points at a locally installed Chrome; adjust or omit it for your setup.
  const browser = await puppeteer.launch({executablePath: "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe", headless: false});
  const page = await browser.newPage();
  // Optional: set a custom user agent string here with page.setUserAgent().
  await page.goto("https://www.url.net", {timeout: 60000, waitUntil: 'domcontentloaded'});
  // Resolves once an element matching #myId is present in the DOM.
  await page.waitForSelector('#myId');
  console.log('got it');
  await browser.close();
})();
To check the available options, please see the GitHub link.

If you want to ensure the element is actually visible, you have to use
await page.waitForSelector('#myId', {visible: true})
Otherwise you are just looking for the element in the DOM and not checking for visibility.
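As a minimal sketch of how the visible option fits the original question, the helper below waits for visibility and then clicks. The name clickWhenVisible and the timeout default are assumptions made for illustration; page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: wait until the element is visible, then click it.
// `page` is assumed to be a Puppeteer Page object.
async function clickWhenVisible(page, selector, timeout = 30000) {
  // waitForSelector resolves with the element handle once the
  // element is present in the DOM and passes the visibility check.
  const handle = await page.waitForSelector(selector, { visible: true, timeout });
  await handle.click();
  return handle;
}
```

With this in place, the two lines from the question would become a single call such as await clickWhenVisible(page, '.btnNext').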

Note: all the answers submitted so far are incorrect,
because they check whether an element exists or is located, NOT whether it is visible or displayed.
The right approach is to check the element's size or visibility using page.waitFor() or page.waitForFunction(); see the explanation below.
// wait until present on the DOM
// await page.waitForSelector( css_selector );
// wait until "display"-ed
await page.waitForFunction("document.querySelector('.btnNext') && document.querySelector('.btnNext').clientHeight != 0");
// or wait until "visibility" not hidden
await page.waitForFunction("document.querySelector('.btnNext') && document.querySelector('.btnNext').style.visibility != 'hidden'");
const btnNext = await page.$('.btnNext');
await btnNext.click();
Explanation
An element that exists in the DOM is not always visible: it may have the CSS property display:none or visibility:hidden. That is why page.waitForSelector(selector) alone is not a good idea. The snippet below shows the difference.
function isExist(selector) {
// querySelector returns null when nothing matches.
let el = document.querySelector(selector);
let exist = el !== null ? 'Exist!' : 'Not Exist!';
console.log(selector + ' is ' + exist)
}
function isVisible(selector) {
let el = document.querySelector(selector).clientHeight;
let visible = el != 0 ? 'Visible, ' + el : 'Not Visible, ' + el;
console.log(selector + ' is ' + visible + 'px')
}
isExist('#idA');
isVisible('#idA');
console.log('=============================')
isExist('#idB')
isVisible('#idB')
.bd {border: solid 2px blue;}
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="bd">
<div id="idA" style="display:none">#idA, hidden element</div>
</div>
<br>
<div class="bd">
<div id="idB">#idB, visible element</div>
</div>
In the snippet above, isExist() simulates
page.waitForSelector('#myId');
and running isExist() reports that both #idA and #idB exist.
But running isVisible() reports that #idA is not visible or displayed.
These other properties can also be used to check whether an element is displayed (that is, not hidden by the CSS display property):
scrollWidth
scrollHeight
offsetTop
offsetWidth
offsetHeight
offsetLeft
clientWidth
clientHeight
For the visibility style, check that its value is not hidden.
note: I'm not good in Javascript or English, feel free to improve this answer.
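The size and visibility checks described above can be combined into one predicate. This is only a sketch: the name isRendered is hypothetical, and it reads the computed style rather than the inline style property, which is a choice made here and not part of the answer above.

```javascript
// Hypothetical predicate combining the checks discussed above.
// `el` is a DOM element (or null) and `win` is the window that owns it.
function isRendered(el, win) {
  if (!el) return false; // not in the DOM at all
  const style = win.getComputedStyle(el);
  // display:none removes the box; visibility:hidden keeps the box but hides it.
  if (style.display === 'none' || style.visibility === 'hidden') return false;
  const rect = el.getBoundingClientRect();
  return rect.width > 0 && rect.height > 0; // has a non-empty box
}
```

Because functions passed to Puppeteer run inside the page, one way to use this is to inline its source into a string expression, for example page.waitForFunction(`(${isRendered})(document.querySelector('.btnNext'), window)`).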

You can use page.waitFor(), page.waitForSelector(), or page.waitForXPath() to wait for an element on a page:
// Selectors
const css_selector = '.btnNext';
const xpath_selector = '//*[contains(concat(" ", normalize-space(@class), " "), " btnNext ")]';
// Wait for CSS Selector
await page.waitFor(css_selector);
await page.waitForSelector(css_selector);
// Wait for XPath Selector
await page.waitFor(xpath_selector);
await page.waitForXPath(xpath_selector);
Note: In reference to a frame, you can also use frame.waitFor(), frame.waitForSelector(), or frame.waitForXPath().

Updated answer with some optimizations:
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch({headless: true});
const page = await browser.newPage();
await page.goto('https://www.somedomain.com', {waitUntil: 'networkidle2'});
await page.click('input[value=validate]');
await page.waitForSelector('#myId');
await page.click('.btnNext');
console.log('got it');
browser.close();
})();

While I agree with @ewwink's answer, Puppeteer's API already includes a not-hidden check, so when you do:
await page.waitForSelector('#id', {visible: true})
you get an element that is not hidden and is visible according to CSS.
To ensure rendering, you can use waitForFunction as @ewwink does. However, to completely answer your question, here's a snippet using Puppeteer's API:
// Assumes `page` is in scope.
async function waitElemenentVisble(selector) {
  // Runs in the page: returns the first match with a non-empty bounding box.
  function waitVisible(selector) {
    function hasVisibleBoundingBox(element) {
      const rect = element.getBoundingClientRect()
      return !!(rect.top || rect.bottom || rect.width || rect.height)
    }
    // Spread the NodeList into an array so that filter() is available.
    const elements = [...document.querySelectorAll(selector)].filter(hasVisibleBoundingBox)
    return elements[0]
  }
  // Polls until waitVisible returns a truthy value (an element).
  await page.waitForFunction(waitVisible, {}, selector)
  const jsHandle = await page.evaluateHandle(waitVisible, selector)
  return jsHandle.asElement()
}
After writing some methods like this myself, I found expect-puppeteer, which does this and more, and does it better (see toMatchElement).

async function waitForVisible (selector){
//const selector = '.foo';
return await page.waitForFunction(
(selector) => document.querySelector(selector) && document.querySelector(selector).clientHeight != 0,
{},
selector
);
}
Above function makes it generic, so that you can use it anywhere.
But, if you are using pptr there is another faster and easier solution:
https://pptr.dev/#?product=Puppeteer&version=v10.0.0&show=api-pagewaitforfunctionpagefunction-options-args
page.waitForSelector('#myId', {visible: true})

Just tested this by scraping a fitness website. @ewwink, @0fnt, and @caram have provided the most complete answer.
Just because a DOM element is visible doesn't mean that its content has been fully populated.
Today, I ran:
await page.waitForSelector("table#some-table", {visible:true})
const data = await page.$eval("table#some-table",(el)=>el.outerHTML)
console.log(data)
And incorrectly received the following, because the table DOM hadn't been populated fully by runtime. You can see that the outerHTML is empty.
user@env:$ <table id="some-table"></table>
Adding a pause of 1 second fixed this, as might be expected:
function sleep(ms) {
return new Promise(resolve => setTimeout(resolve, ms));
}
await page.waitForSelector("table#some-table", {visible:true})
await sleep(1000)
const data = await page.$eval("table#some-table",(el)=>el.outerHTML)
console.log(data)
user@env:$ <table id="some-table"><tr><td>Data</td></tr></table>
But so did @ewwink's answer, more elegantly (no artificial timeouts):
await page.waitForSelector("table#some-table", {visible:true})
await page.waitForFunction("document.querySelector('table#some-table').clientHeight != 0")
const data = await page.$eval("table#some-table",(el)=>el.outerHTML)
console.log(data)
user@env:$ <table id="some-table"><tr><td>Data</td></tr></table>
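For the table case above, another option is to wait on the content itself rather than on the element's height. The sketch below waits until the table has a minimum number of rows; the helper name waitForTableRows and the minRows parameter are assumptions for illustration, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: wait until a table contains at least `minRows` rows.
// `page` is assumed to be a Puppeteer Page object.
async function waitForTableRows(page, tableSelector, minRows = 1) {
  return page.waitForFunction(
    // Runs inside the page; the selector and row count arrive as arguments.
    (sel, min) => {
      const table = document.querySelector(sel);
      return table !== null && table.querySelectorAll('tr').length >= min;
    },
    {},            // default polling options
    tableSelector, // passed to the page function as `sel`
    minRows        // passed to the page function as `min`
  );
}
```

Calling await waitForTableRows(page, 'table#some-table') before reading outerHTML would replace both the fixed sleep and the height check.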

Related

Puppeteer- How to .click() a single button out of a grid of buttons with same classname?

I'm developing a Nike SNKRS BOT to buy shoes with Puppeteer and Node.js.
I'm having trouble distinguishing and clicking the size button (the original post included a screenshot of the HTML in DevTools and of the front-end buttons).
This is my code. I'm not experienced, so I have tried everything:
const xpathButton = '//*[@id="root"]/div/div/div[1]/div/div[1]/div[2]/div/section[1]/div[2]/aside/div/div[2]/div/div[2]/ul/li[1]/button'
const puppeteer = require('puppeteer')
const productUrl = 'https://www.nike.com/it/launch/t/air-max-97-coconut-milk-black'
const idAcceptCookies = "button[class='ncss-btn-primary-dark btn-lg']"
async function givePage(){
const browser = await puppeteer.launch({headless: false})
const page = await browser.newPage();
return page;
}
async function addToCart(page){
await page.goto(productUrl);
await page.waitForSelector(idAcceptCookies);
await page.click(idAcceptCookies,elem => elem.click());
//this is where the issues begin
//attempt 1
await page.evaluate(() => document.getElementsByClassName('size-grid-dropdown size-grid-button"')[1].click());
//attempt 2
const sizeButton = "button[class='size-grid-dropdown size-grid-button'] button[name='42']";
await page.waitForSelector(sizeButton);
await page.click(sizeButton,elem => elem.click());
}
//attempt 3
await page.click(xpathButton)
//attempt 4
document.evaluate("//button[contains ( ., '36')]", document, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue
async function checkout(){
var page = await givePage();
await addToCart(page)
}
checkout()
Attempt number 2 looks like the best approach, except your selector is wrong. The button does not have a name attribute, according to your screenshot, so you will need another approach, closer to attempt 3.
You can use Puppeteer to select an element with XPath, and XPath allows you to select by an element's text content.
Try this:
await page.waitForXPath('//button[contains(text(), "EU 36")]')
const [button] = await page.$x('//button[contains(text(), "EU 36")]')
await button.click()
Because the xpath selector is returning an array of element handles, I destructure the first element in the array (which should be the only match), and assign it a value of button. That element handle can now be clicked.
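If the XPath matches nothing, the destructured value is undefined and the click fails with an unhelpful error. A small sketch that makes this case explicit is shown below; the helper name clickFirstByXPath is hypothetical, and it assumes a Puppeteer version that still provides page.waitForXPath and page.$x, as used in the answer above.

```javascript
// Hypothetical helper: click the first element matching an XPath expression.
// Assumes `page` is a Puppeteer Page that provides waitForXPath() and $x().
async function clickFirstByXPath(page, xpath) {
  await page.waitForXPath(xpath);
  // $x resolves to an array of handles; take the first one.
  const [first] = await page.$x(xpath);
  if (!first) {
    // Destructuring an empty array leaves `first` undefined.
    throw new Error(`No element matches XPath: ${xpath}`);
  }
  await first.click();
  return first;
}
```

For the size button this would be called as await clickFirstByXPath(page, '//button[contains(text(), "EU 36")]').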

Can't scrape from a page I navigate to by using Puppeteer

I'm fairly new to Puppeteer and I'm trying to practice by keeping track of a selected item from Amazon. However, I'm facing a problem when I try to retrieve some results from the page.
The way I intended this automation to work is by following these steps:
New tab.
Go to the home page of Amazon.
Enter the given product name in the search element.
Press the enter key.
Return the product title and price.
Check this example below:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({
headless: false,
});
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', (req) => { // don't load any fonts or images on my requests. To Boost the performance
if (req.resourceType() == 'font' /* || req.resourceType() == 'image' || req.resourceType() == 'stylesheet'*/) {
req.abort();
}
else {
req.continue();
}
});
const baseDomain = 'https://www.amazon.com';
await page.goto(`${baseDomain}/`, { waitUntil: "networkidle0" });
await page.click("#twotabsearchtextbox" ,{delay: 50})
await page.type("#twotabsearchtextbox", "Bose QuietComfort 35 II",{delay: 50});
await page.keyboard.press("Enter");
await page.waitForNavigation({
waitUntil: 'networkidle2',
});
let productTitle = await page.$$(".a-size-medium, .a-color-base, .a-text-normal")[43]; //varible that holds the title of the product
console.log(productTitle );
debugger;
})();
When I execute this code, console.log prints undefined for the variable productTitle. I have had a lot of trouble scraping information from a page I navigate to. I used to use page.evaluate(), and it only worked when I was scraping the page that I had told the browser to go to directly.
The first problem is on this line:
let productTitle = await page.$$(".a-size-medium, .a-color-base, .a-text-normal")[43];
// is equivalent to:
let productTitle = await (somePromise[43]);
// As you guessed it, a Promise does not have a property `43`,
// so I think you meant to do this instead:
let productTitle = (await page.$$(".a-size-medium, .a-color-base, .a-text-normal"))[43];
Once this is fixed, you don't get the title text, but a handle to the DOM element. So you can do:
let titleElem = (await page.$$(".a-size-medium, .a-color-base, .a-text-normal"))[43];
let productTitle = await titleElem.evaluate(node => node.innerText);
console.log(productTitle); // "Microphone"
However, I'm not sure that simply selecting the 43rd element will always get you the one you want, but if it isn't, that would be a topic for another question.
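The precedence issue can be reproduced without a browser. In this sketch, fakeQueryAll is a hypothetical stand-in for page.$$() that simply resolves to an array.

```javascript
// Hypothetical stand-in for page.$$(): resolves to an array.
function fakeQueryAll() {
  return Promise.resolve(['first', 'second', 'third']);
}

async function demo() {
  // Indexes the Promise object itself, then awaits that undefined value.
  const wrong = await fakeQueryAll()[1];
  // Awaits the Promise first, then indexes the resolved array.
  const right = (await fakeQueryAll())[1];
  return { wrong, right };
}

demo().then(({ wrong, right }) => console.log(wrong, right)); // prints: undefined second
```

Member access binds more tightly than await, which is why the parentheses are needed.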

Puppeteer returning undefined (JS) using xPath

I'm trying to scrape this element: on this website.
My JS code:
const puppeteer = require("puppeteer");
const url = 'https://magicseaweed.com/Bore-Surf-Report/1886/'
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(url);
const title = await page.$x('/html/body/div[1]/div[2]/div[2]/div/div[2]/div[2]/div[2]/div/div/div[1]/div/header/h3/div[1]/span[1]')
let text = await page.evaluate(res => res.textContext, title[0])
console.log(text) // UNDEFINED
text is undefined. What is the problem here? Thanks.
I think you need to fix two issues in your code:
textContent vs textContext: the DOM property is textContent, so res.textContext evaluates to undefined.
The XPath.
For the content you want the xpath should be:
const title = await page.$x('/html/body/div[1]/div[2]/div[2]/div/div[2]/div[2]/div[2]/div/div/div[1]/div/div[1]/div[1]/div/div[2]/ul[1]/li[1]/text()')
And to get the content of this:
const text = await page.evaluate(el => {
return el.textContent.trim()
}, title[0])
Notice that you need to pass title[0] as an argument to the page function.
OR
if you don't need to use xpath, it seems you could get directly using class name to find the element:
const rating = await page.evaluate(() => {
return $('.rating.rating-large.clearfix > li.rating-text')[0].textContent.trim()
})

Sequentially pressing each element of a certain class within an SVG

I've been looking for a good example of click events on every element of a certain class, but I can't seem to find one. In my app, I generate multiple bars in an svg with class .bar.
Is there a nice way to iterate through each bar in the selection and click it?
Here is my code so far (with the link to dev area removed):
const puppeteer = require('puppeteer');
(async () => {
//open browser; headless: false shows the browser window
const browser = await puppeteer.launch({headless: false, slowMo: 80});
const page = await browser.newPage();
//goto link
await page.goto('/link_to_test/');
//scraping automation goes here
await page.waitFor(5000);
let bars = await page.$$(".bar");
for(const idx in bars){
await bars[idx].click({delay:250});
}
// close browser
await browser.close();
})();
I've been looking for a way to select each bar from the $$(".bar") selection and click it but I cannot seem to find any documentation around it.
Update
I increased the page.waitFor to 5000 and removed the ElementHandle from the for loop. Code no longer throws any errors but it doesn't want to click anything.
Looks like this doesn't work for SVG elements yet https://github.com/GoogleChrome/puppeteer/issues/1769
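One possible workaround, sketched here under assumptions and not taken from the linked issue, is to dispatch the click events from inside the page instead of using ElementHandle.click(). The helper name clickAllInPage is hypothetical, and page is assumed to be a Puppeteer Page.

```javascript
// Hypothetical helper: dispatch a synthetic click on every matching element.
// `page` is assumed to be a Puppeteer Page object.
async function clickAllInPage(page, selector) {
  return page.$$eval(selector, (nodes) => {
    // Runs inside the page, in document order.
    for (const node of nodes) {
      node.dispatchEvent(new MouseEvent('click', { bubbles: true, cancelable: true }));
    }
    return nodes.length; // number of elements that received a click event
  });
}
```

These events are synthetic (their isTrusted flag is false) and fire back to back with no delay, so this only helps when the page's handlers accept untrusted clicks.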
Without seeing more code I am not sure whether this is the answer you need. This code selects all a.bars for a given ul and returns an array of all the hrefs. We then loop through the links and open each one in turn.
I think the missing bit of the jigsaw is that I am mapping the links to an array (see below ... links => links.map((a) => { return a.href }));
const puppeteer = require('puppeteer');
(async () => {
const html = `
<html>
<body>
<ul>
<li><a class="bar" href="https://www.google.com">Google</a></li>
<li><a class="bar" href="https://www.bing.com">Bing</a></li>
<li><a class="bar" href="https://duckduckgo.com">DuckDuckGo</a></li>
</ul>
</body>
</html>`;
const browser = await puppeteer.launch({ headless:false});
const page = await browser.newPage();
await page.goto(`data:text/html,${html}`);
const data = await page.$$eval('ul li a.bar', links =>
links.map((a) => { return a.href }));
//You will now have an array of hrefs
for (const i in data) {
console.log("Opening", data[i]);
await page.goto(data[i]);
}
await browser.close();
})();

Unable to choose by selectors using Puppeteer

I have a problem with getting elements by their selectors.
A page on which I struggle is: http://html5.haxball.com/.
I have managed to log in, but that was kind of a hack, because I relied on the fact that the field I need to fill in is already selected.
After typing in nick and going into lobby I want to click the button 'Create room'. Its selector:
body > div > div > div > div > div.buttons > button:nth-child(3)
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({
args: ['--no-sandbox'], headless: false, slowMo: 10
});
const page = await browser.newPage();
await page.goto('http://html5.haxball.com/index.html');
await page.keyboard.type(name);
await page.keyboard.press('Enter');
//at this point I am logged in.
let buttonSelector = 'body > div > div > div > div > div.buttons > button:nth-child(3)';
await page.waitForSelector('body > div > div');
await page.evaluate(() => {
document.querySelector(buttonSelector).click();
});
browser.close();
})();
After running this code I get the error:
UnhandledPromiseRejectionWarning: Error: Evaluation failed: TypeError: Cannot read property 'click' of null
My initial approach was with:
await page.click(buttonSelector);
instead of page.evaluate but it also fails.
What frustrates me the most is the fact that when I run this in the Chromium console:
document.querySelector(buttonSelector).click();
it works fine.
A few things to note:
The selector you are using to retrieve the button is more complex than it needs to be. Try something simpler like: 'button[data-hook="create"]'.
The game is within an iframe, so you're better off calling document.querySelector using the iframe's document object as opposed to the containing window's document.
The function passed to evaluate is executed in a different context than where you are running your node script. For this reason, you have to explicitly pass variables from your node script to the window script; otherwise buttonSelector will be undefined.
Making the changes above, your code will input your name and successfully click on "Create Room":
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({
args: ['--no-sandbox'], headless: false, slowMo: 10
});
const page = await browser.newPage();
await page.goto('http://html5.haxball.com/index.html');
await page.keyboard.type('Chris');
await page.keyboard.press('Enter');
//at this point I am logged in.
let buttonSelector = 'button[data-hook="create"]';
await page.waitForSelector('body > div > div');
await page.evaluate((buttonSelector) => {
var frame = document.querySelector('iframe');
var frameDocument = frame.contentDocument;
frameDocument.querySelector(buttonSelector).click();
}, buttonSelector);
browser.close();
})();
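Why the variable has to be passed explicitly can be illustrated without a browser. The function given to evaluate is turned into source text and run elsewhere, so it loses access to the surrounding scope. In this sketch, fakeEvaluate is a rough, hypothetical stand-in built on Node's vm module; it is not how Puppeteer is implemented.

```javascript
const vm = require('vm');

// Rough stand-in for evaluate(): the function is converted to source text
// and run in a fresh context, with arguments passed along as JSON.
function fakeEvaluate(fn, ...args) {
  const source = `(${fn.toString()})(...${JSON.stringify(args)})`;
  return vm.runInNewContext(source);
}

const buttonSelector = 'button[data-hook="create"]';

// Relying on the closure: the name does not exist in the other context.
let closureResult;
try {
  closureResult = fakeEvaluate(() => buttonSelector);
} catch (err) {
  closureResult = err.name;
}

// Passing the value as an argument: it travels with the call.
const argResult = fakeEvaluate((sel) => sel, buttonSelector);

console.log(closureResult); // prints: ReferenceError
console.log(argResult);     // prints: button[data-hook="create"]
```

Only serializable values make the trip this way, which is also why the corrected code passes the selector string and not a DOM element.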
