How to click a specific div with a specific class? - javascript

I'm new to coding with puppeteer, and I'd like to know how to make it click this: (image)
This is the code I have right now:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('page link is here');
await page.screenshot({ path: 'game.png' });
const [button] = await page.$x("//button[contains(., 'Accept')]");
if (button) {
await button.click();
}
// I want to click it here.
await page.screenshot({ path: 'test.png' });
await browser.close();
})();
Sorry for my bad English 😔👌

If the element highlighted in the screenshot is the one to be clicked, you can simply:
await page.click('.shipyard-item');
I'd also suggest consulting the excellent puppeteer documentation, which covers most use cases.
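If the click runs before the element has rendered, page.click throws because nothing matches the selector yet. A minimal sketch of a more defensive version, assuming the target really is the .shipyard-item element from the screenshot (clickWhenReady is a hypothetical helper name, and the 10-second timeout is an arbitrary choice):

```javascript
// Hypothetical helper: wait for an element to be visible, then click it.
// `page` is a Puppeteer Page; `selector` is any CSS selector.
async function clickWhenReady(page, selector, timeout = 10000) {
  // waitForSelector resolves with an ElementHandle once the element
  // exists and is visible, or rejects when the timeout elapses.
  const handle = await page.waitForSelector(selector, { visible: true, timeout });
  await handle.click();
  return handle;
}

// Usage inside the question's async function:
//   await clickWhenReady(page, '.shipyard-item');
```

Taking page as a parameter keeps the helper independent of how the browser was launched.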

Related

Popup form visible, but html code missing in Puppeteer

I'm currently trying to get some information from a website (https://www.bauhaus.info/) and I'm failing at the cookie popup form.
This is my code so far:
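const fs = require('fs');
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://www.bauhaus.info');
await sleep(5000);
const html = await page.content();
fs.writeFileSync("./page.html", html, "UTF-8");
// Wait for the PDF to be written before closing the browser.
await page.pdf({
path: './bauhaus.pdf',
format: 'a4'
});
await browser.close();
})();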
function sleep(ms) {
return new Promise((resolve) => {
setTimeout(resolve, ms);
});
}
Up to this point everything works fine. However, I can't accept the cookie banner, because its HTML doesn't appear in the content that puppeteer returns, even though I can see the form in the PDF.
My browser
Puppeteer
Why can't I see this popup in the HTML code?
Bonus question: is there any way to replace the sleep method with some page.await call without knowing which JS method triggers the cookie form to appear?
This element is in a shadow root. See my answer to "Puppeteer not giving accurate HTML code for page with shadow roots" for more information about the shadow DOM.
This code dips into the shadow root, waits for the button to appear, then clicks it:
const puppeteer = require("puppeteer"); // ^13.5.1
let browser;
(async () => {
browser = await puppeteer.launch({headless: false});
const [page] = await browser.pages();
const url = "https://www.bauhaus.info/";
await page.goto(url, {waitUntil: "domcontentloaded"});
const el = await page.waitForSelector("#usercentrics-root");
await page.waitForFunction(el =>
el.shadowRoot.querySelector(".sc-gsDKAQ.dejeIh"), {}, el
);
await el.evaluate(el =>
el.shadowRoot.querySelector(".sc-gsDKAQ.dejeIh").click()
);
await page.waitForTimeout(100000); // pause to show that it worked
})()
.catch(err => console.error(err))
.finally(() => browser?.close())
;
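The same steps can be wrapped in a reusable helper, which also covers the bonus question because it waits for the button instead of sleeping. This is only a sketch: clickInShadowRoot is a hypothetical name, and the selectors are passed in as arguments because generated class names such as .sc-gsDKAQ.dejeIh are likely to change when the site is rebuilt.

```javascript
// Hypothetical helper: click an element that lives inside an open shadow root.
// `hostSelector` matches the shadow host in the regular DOM;
// `innerSelector` is resolved inside that host's shadowRoot.
async function clickInShadowRoot(page, hostSelector, innerSelector) {
  const host = await page.waitForSelector(hostSelector);
  // Poll in the page until the inner element has been rendered.
  await page.waitForFunction(
    (el, sel) => Boolean(el.shadowRoot && el.shadowRoot.querySelector(sel)),
    {},
    host,
    innerSelector
  );
  // Click it from inside the page, as the code above does.
  await host.evaluate(
    (el, sel) => el.shadowRoot.querySelector(sel).click(),
    innerSelector
  );
}

// Usage with the selectors from the code above:
//   await clickInShadowRoot(page, '#usercentrics-root', '.sc-gsDKAQ.dejeIh');
```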

puppeteer can't get page source after using evaluate()

I'm using puppeteer to interact with a website, using the evaluate() function to manipulate the page front end (i.e. to click on certain items, etc.). Clicking through works fine, but I can't return the page source after clicking with evaluate.
I have recreated the error in the simplified script below. It loads google.com, clicks on 'I feel lucky', and should then return the page source of the loaded page:
const puppeteer = require('puppeteer');
async function main() {
const browser = await puppeteer.launch({
headless: false,
args: ['--no-sandbox']
});
const page = await browser.newPage();
await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
response = await page.evaluate(() => {
document.getElementsByClassName('RNmpXc')[1].click()
});
await page.waitForNavigation({waitUntil: 'load'});
console.log(response.text());
}
main();
I get the following error:
TypeError: Cannot read property 'text' of undefined
UPDATE New code following suggestion to use page.content()
const puppeteer = require('puppeteer');
async function main() {
const browser = await puppeteer.launch({
headless: false,
args: ['--no-sandbox']
});
const page = await browser.newPage();
await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
await page.evaluate(() => {
document.getElementsByClassName('RNmpXc')[1].click()
});
const source = await page.content()
console.log(source);
}
main();
I am now getting the following error:
Error: Execution context was destroyed, most likely because of a navigation.
My question is: How can I return page source using the .text() method after manipulating the webpage using the evaluate() method?
All suggestions / insight / proposals would be very much appreciated thanks.
Since you're asking for the page source after JavaScript has modified it, I assume you want the current DOM and not the original HTML content. Your evaluate callback doesn't return anything, so response is undefined. You can use
const source = await page.evaluate(() => new XMLSerializer().serializeToString(document.doctype) + document.documentElement.outerHTML);
or
const source = await page.content();
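The "Execution context was destroyed" error in the updated code appears because the click starts a navigation, and page.content() then runs while the old document is being torn down. One way to avoid that is to start waiting for the navigation before triggering it. The sketch below assumes the click always causes a full page load; getSourceAfterNavigation is a hypothetical helper name.

```javascript
// Hypothetical helper: run `trigger` (anything that starts a navigation),
// wait for the new page to load, then return its serialized HTML.
async function getSourceAfterNavigation(page, trigger) {
  await Promise.all([
    // Register the wait first so a fast navigation is not missed.
    page.waitForNavigation({ waitUntil: 'load' }),
    trigger(),
  ]);
  return page.content();
}

// Usage with the click from the question:
//   const source = await getSourceAfterNavigation(page, () =>
//     page.evaluate(() => {
//       document.getElementsByClassName('RNmpXc')[1].click();
//     })
//   );
//   console.log(source);
```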

Puppeteer does not activate button click, despite selecting button

I'm trying to automate a sign in to a simple website that a scammer sent my friend. I can use puppeteer to fill in the text inputs but when I try to use it to click the button, all it does is activate the button color change (that happens when the mouse hovers over the button). I also tried clicking enter while focusing on the input fields, but that doesn't seem to work. When I use document.buttonNode.click() in the console, it worked, but I can't seem to emulate that with puppeteer
I also tried to use the waitFor function but it kept telling me 'cannot read property waitFor'
const puppeteer = require('puppeteer');
const chromeOptions = {
headless:false,
defaultViewport: null,
slowMo:10};
(async function main() {
const browser = await puppeteer.launch(chromeOptions);
const page = await browser.newPage();
await page.goto('https://cornelluniversityemailverifica.godaddysites.com/?fbclid=IwAR3ERzNkDRPOGL1ez2fXcmumIYcMyBjuI7EUdHIWhqdRDzzUAMwRGaI_o-0');
await page.type('#input1', 'hello@cornell.edu');
await page.type('#input2', 'password');
// await this.page.waitFor(2000);
// await page.type(String.fromCharCode(13));
await page.click('button[type=submit]');
})()
This site blocks unsecured events, so you need to wait before the click.
Just add await page.waitFor(1000); before the click. I would also suggest passing the waitUntil: "networkidle2" option to goto.
Here is the working script:
const puppeteer = require('puppeteer');
const chromeOptions = {
headless: false,
defaultViewport: null,
slowMo:10
};
(async function main() {
const browser = await puppeteer.launch(chromeOptions);
const page = await browser.newPage();
await page.goto('https://cornelluniversityemailverifica.godaddysites.com/?fbclid=IwAR3ERzNkDRPOGL1ez2fXcmumIYcMyBjuI7EUdHIWhqdRDzzUAMwRGaI_o-0', { waitUntil: 'networkidle2' });
await page.type('#input1', 'hello@cornell.edu');
await page.type('#input2', 'password');
await page.waitFor(1000);
await page.click('button[type=submit]');
})()
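Two notes on the wait. The "cannot read property waitFor" error in the question comes from calling this.page.waitFor: this.page is undefined in that scope, so the method has to be called on page itself. Also, page.waitFor has since been deprecated and removed in newer Puppeteer releases, where a plain promise-based delay can stand in for it. This is a sketch (delay is a hypothetical helper name), and a fixed pause remains a workaround rather than a guarantee that the page is ready.

```javascript
// Hypothetical helper: resolve after `ms` milliseconds.
function delay(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Usage inside the async function, in place of page.waitFor(1000):
//   await delay(1000);
//   await page.click('button[type=submit]');
```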

Puppeteer Login to Instagram

I'm trying to login into Instagram with Puppeteer, but somehow I'm unable to do it.
Can you help me?
Here is the link I'm using:
https://www.instagram.com/accounts/login/
I tried different stuff. The last code I tried was this:
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://www.instagram.com/accounts/login/');
await page.evaluate();
await afterJS.type('#f29d14ae75303cc', 'username');
await afterJS.type('#f13459e80cdd114', 'password');
await page.pdf({path: 'page.pdf', format: 'A4'});
await browser.close();
})();
Thanks in advance!
OK you're on the right track but just need to change a few things.
Firstly, I have no idea where your afterJS variable comes from? Either way you won't need it.
You're asking for data to be typed into the username and password input fields but aren't asking puppeteer to actually click on the log in button to complete the log in process.
page.evaluate() is used to execute JavaScript code inside of the page context (ie. on the web page loaded in the remote browser). So you don't need to use it here.
I would refactor your code to look like the following:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://www.instagram.com/accounts/login/');
await page.waitForSelector('input[name="username"]');
await page.type('input[name="username"]', 'username');
await page.type('input[name="password"]', 'password');
await page.click('button[type="submit"]');
// Add a wait for some selector on the home page to load to ensure the next step works correctly
await page.pdf({path: 'page.pdf', format: 'A4'});
await browser.close();
})();
Hopefully this sets you down the right path to getting past the login page!
Update 1:
You've enquired about parsing the text of an element on Instagram... unfortunately I don't have an account on there myself so can't really give you an exact solution but hopefully this still proves of some value.
So you're trying to read an element's text, right? You can do this as follows:
const text = await page.$eval(cssSelector, (element) => {
return element.textContent;
});
All you have to do is replace cssSelector with the selector of the element you wish to retrieve the text from.
Update 2:
OK lastly, you've enquired about scrolling down to an element within a parent element. I'm not going to steal the credit from someone else so here's the answer to that:
How to scroll to an element inside a div?
What you'll have to do is follow the instructions there and adapt them to puppeteer, along these lines:
await page.evaluate(() => {
const lastLink = document.querySelectorAll('h3 > a')[2];
const topPos = lastLink.offsetTop;
const parentDiv = document.querySelector('div[class*="eo2As"]');
parentDiv.scrollTop = topPos;
});
Bear in mind that I haven't tested that code - I've just directly followed the answer in the URL I've provided. It should work!
You can log in to Instagram using the following example code:
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Wait until page has loaded
await page.goto('https://www.instagram.com/accounts/login/', {
waitUntil: 'networkidle0',
});
// Wait for log in form
await Promise.all([
page.waitForSelector('[name="username"]'),
page.waitForSelector('[name="password"]'),
page.waitForSelector('[type="submit"]'),
]);
// Enter username and password
await page.type('[name="username"]', 'username');
await page.type('[name="password"]', 'password');
// Submit log in credentials and wait for navigation
await Promise.all([
page.click('[type="submit"]'),
page.waitForNavigation({
waitUntil: 'networkidle0',
}),
]);
// Download PDF
await page.pdf({
path: 'page.pdf',
format: 'A4',
});
await browser.close();
})();

Open multiple links in new tab and switch focus with a loop with puppeteer?

I have multiple links on a single page that I would like to access either sequentially or all at once. I want to open each link in its own new tab and save every page as a PDF. How do I achieve this with puppeteer?
I can get all the links from the DOM via the href property, but I don't know how to open them in new tabs, access them, and then close them.
You can open a new page in a loop:
const puppeteer = require('puppeteer');
(async () => {
try {
const browser = await puppeteer.launch();
const urls = [
'https://www.google.com',
'https://www.duckduckgo.com',
'https://www.bing.com',
];
const pdfs = urls.map(async (url, i) => {
const page = await browser.newPage();
console.log(`loading page: ${url}`);
await page.goto(url, {
waitUntil: 'networkidle0',
timeout: 120000,
});
console.log(`saving as pdf: ${url}`);
await page.pdf({
path: `${i}.pdf`,
format: 'Letter',
printBackground: true,
});
console.log(`closing page: ${url}`);
await page.close();
});
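// Wait for every tab to finish before closing the browser, so that
// failures are caught by the surrounding try/catch.
await Promise.all(pdfs);
await browser.close();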
} catch (error) {
console.log(error);
}
})();
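The example above uses a hard-coded urls array. To build that array from the links on a page, as the question describes, the href values can be read with page.$$eval. A sketch, assuming the links are plain <a href> elements (collectLinks is a hypothetical helper name):

```javascript
// Hypothetical helper: return the unique URLs of all links
// matching `selector` on the page.
async function collectLinks(page, selector = 'a[href]') {
  // The callback runs in the browser, where anchor.href is already
  // resolved to an absolute URL.
  const hrefs = await page.$$eval(selector, (anchors) =>
    anchors.map((anchor) => anchor.href)
  );
  // Drop duplicates while keeping the original order.
  return [...new Set(hrefs)];
}

// Usage, after navigating `page` to the page that holds the links:
//   const urls = await collectLinks(page);
```

Removing duplicates avoids opening the same tab twice when a page links to one URL from several places.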
To switch to a tab (activate it), you just need to call page.bringToFront():
const page1 = await browser.newPage();
await page1.goto('https://www.google.com');
const page2 = await browser.newPage();
await page2.goto('https://www.bing.com');
const pageList = await browser.pages();
console.log("NUMBER TABS:", pageList.length);
//switch tabs here
await page1.bringToFront();
//Do something... save as pdf
await page2.bringToFront();
//Do something... save as pdf
I suspect you have an array of pages so you might need to tweak the above code to cater for that.
As for generating a single pdf from multiple tabs I am pretty certain this is not possible. I suspect there will be a node library that can take multiple pdf files and merge into one.
pdf-merge might be what you are looking for.
You can also use a for loop.
const puppeteer = require('puppeteer');
(async () => {
// Launch the browser once and reuse it for every URL.
const browser = await puppeteer.launch();
const movieURLs = ["https://www.imdb.com/title/tt0234215", "https://www.imdb.com/title/tt0411008"];
for (const url of movieURLs) {
const page = await browser.newPage();
await page.goto(url, {waitUntil: "networkidle2"});
const movieData = await page.evaluate(() => {
const movieTitle = document.querySelector('div[class="TitleBlock"] > h1').innerText;
return {movieTitle};
});
console.log(movieData);
await page.close();
}
await browser.close();
})();
