I searched for this but couldn't find an answer anywhere. I want to use Node.js to reload a page until an element matching my selector appears. I will be using Puppeteer for other parts of the program, if that helps.
Ok, I used functions from both answers and came up with this, probably unoptimized code:
const puppeteer = require("puppeteer");
(async () => {
try {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto("http://127.0.0.1:5500/main.html");
await page.waitForSelector("#buy-button");
console.log("worked");
} catch (err) {
console.log(`ERROR: ${err}`);
}
})();
What I don't know is how to reload the page, and keep reloading it, until the id I want is there. For example, keep reloading YouTube until the video you want is there (an impractical example, but I think it gets the point across).
Here's how I solved waiting for an element in Puppeteer and reloading the page if it wasn't found:
async waitForSelectorWithReload(selector: string) {
const MAX_TRIES = 5;
let tries = 0;
while (tries <= MAX_TRIES) {
try {
const element = await this.page.waitForSelector(selector, {
timeout: 5000,
});
return element;
} catch (error) {
if (tries === MAX_TRIES) throw error;
tries += 1;
// reload() resolves once the navigation has finished, so no separate waitForNavigation() is needed
await this.page.reload({ waitUntil: 'networkidle0' });
}
}
}
It can be used like this:
await waitForSelectorWithReload("input#name")
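For anyone not using TypeScript or a class, the same idea can be sketched as a standalone function. This is only a rough version under a few assumptions: the name reloadUntilFound and the maxTries / timeoutMs options are made up for illustration, and page is expected to be a Puppeteer Page.

```javascript
// Sketch: retry page.waitForSelector(), reloading the page between attempts.
// reloadUntilFound, maxTries and timeoutMs are illustrative names, not Puppeteer API.
async function reloadUntilFound(page, selector, { maxTries = 5, timeoutMs = 5000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxTries; attempt += 1) {
    try {
      // Resolves with the element handle as soon as the selector matches.
      return await page.waitForSelector(selector, { timeout: timeoutMs });
    } catch (error) {
      lastError = error;
      // Reload only if another attempt remains.
      if (attempt < maxTries) {
        await page.reload({ waitUntil: 'networkidle0' });
      }
    }
  }
  // Every attempt timed out: surface the last error to the caller.
  throw lastError;
}
```

With the script from the question, this would be called as await reloadUntilFound(page, "#buy-button") right after page.goto(). Note that it makes at most maxTries attempts in total, which is slightly different from the loop above.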
You can use waitUntil: "networkidle2" to make sure the page is done loading. Obviously, change the URL, unless you are actually using evil.com.
const puppeteer = require("puppeteer"); // include library
(async () =>{
const browser = await puppeteer.launch(); // run browser
const page = await browser.newPage(); // create new tab
await page.goto(
`http://www.evil.com`,
{
waitUntil: "networkidle2",
}
);
// do your stuff here
await browser.close();
})();
const puppeteer = require('puppeteer');
puppeteer.launch().then(async browser => {
const page = await browser.newPage();
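// Navigate first (page.goto(...)), then await the selector so the browser
// is not closed while the wait is still pending.
await page.waitForSelector('#myId');
console.log('got it');
await browser.close();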
});
I am trying to automate my application, which runs on the Azure portal, using Puppeteer. After entering the password it does not click the submit button, and I get the following error:
(node:55768) UnhandledPromiseRejectionWarning: ReferenceError: browser is not defined
Here is my sample code:
(async () => {
try {
const launchOptions = { headless: false, args: ['--start-maximized'] };
const browser = await puppeteer.launch(launchOptions);
const page = await browser.newPage();
await page.emulate(iPhonex);
await page.goto('https://apps.testpowerapps.com/play/72ff5b93-2327-404d-9423-92eedb44a287?tenantId=n082027');
//Enter User Name
const [userName] = await page.$x('//*[@id="i0116"]');
await userName.type("jyoti.m@azure.com");
const [loginButton] = await page.$x('//*[@id="idSIButton9"]');
await loginButton.press('Enter');
//Enter Password
const [passWord] = await page.$x('//*[@id="i0118"]');
await passWord.type("Pass123");
const [submitButton] = await page.$x('//*[@id="idSIButton9"]');
await submitButton.press('Enter');
//await page.keyboard.press('Enter');
}
catch(error){
console.error(error);
}
finally {
await browser.close();
}
})();
I tried both ways below, but neither works. The only catch is that the XPath of the button is the same on both pages.
const [submitButton] = await page.$x('//*[@id="idSIButton9"]');
await submitButton.press('Enter');
//await page.keyboard.press('Enter');
Any clue how to resolve this?
You define browser inside the try, but you also use it in the finally. consts are block-scoped, so they are tied to the block they are declared in, and a different block (the finally) cannot see them.
Here is the problem:
try {
const browser = ...;
}
finally {
// different block!
await browser.close();
}
To solve this, move the browser out of the try-catch:
const browser = ...
try {
}
finally {
await browser.close();
}
This way it's available in the finally block.
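A small runnable sketch of that pattern, for illustration only. openFakeBrowser is a made-up stand-in for puppeteer.launch() so the snippet runs without Puppeteer, and runWithCleanup is just a hypothetical wrapper name.

```javascript
// Stand-in for puppeteer.launch(); it only records whether close() was called.
async function openFakeBrowser() {
  return {
    closed: false,
    async close() { this.closed = true; },
  };
}

async function runWithCleanup(task) {
  // Declared outside the try block, so the finally block can see it.
  const browser = await openFakeBrowser();
  try {
    await task(browser);
  } catch (error) {
    console.error(error.message);
  } finally {
    // Runs whether the task succeeded or threw.
    await browser.close();
  }
  return browser;
}
```

Even if the task throws, the finally block can still reach browser and close it, because both blocks sit inside the scope where browser was declared.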
I am trying to wait for a popup to load completely before proceeding, but I am not sure how to accomplish this. Currently I am using await page.waitFor(3000);. Is there a more elegant way to wait for the popup to fully load and then proceed?
below is my relevant part of the code.
await page.evaluate(async () => {
await $('#myDataExport').click();
await $('.export-btn a').click();
},);
await page.waitFor(3000);
const browserPages = await browser.pages();
const exportPopup = browserPages[browserPages.length - 1];
I have also tried to use the below
await Promise.all([
await page.click('.export-btn a'),
await page.waitForNavigation({ waitUntil: 'networkidle2' }),
]);
But I get an error Error: Node is either not visible or not an HTMLElement
Any help in this would be really great, Thanks.
I tried to make a working example. You can just ignore the request interception code.
const puppeteer = require('puppeteer')
;(async () => {
const browser = await puppeteer.launch({headless: false})
const [page] = await browser.pages()
// This network interception due to massive ads on the page
// You can remove this if you like, as this is just an example
// page.setRequestInterception(true)
// page.on('request', request => {
// if (request.url().startsWith('https://www.w3schools.com/')) {
// request.continue()
// } else {
// request.abort()
// }
// })
await page.goto('https://www.w3schools.com/tags/att_a_target.asp', {waitUntil: 'domcontentloaded'})
const [popup] = await Promise.all([
new Promise(resolve => page.on('popup', resolve)),
// THE COMMENTED LINES BELOW ARE JUST THE W3SCHOOLS EXAMPLE
// page.waitForSelector('a[target="_blank"].w3-btn.w3-margin-bottom'),
// page.click('a[target="_blank"].w3-btn.w3-margin-bottom'),
// YOUR CODE SHOULD LOOK LIKE THIS
page.waitForSelector('.export-btn a'),
page.click('.export-btn a'),
])
await popup.waitForSelector('#iframeResult')
await popup.screenshot({path: 'targetpopup.png'})
await popup.close()
await browser.close()
})()
Have you tried browser.once with the targetcreated event?
Calling target.page() connects Puppeteer to the tab and generates a Page object.
New tabs aren't opened immediately on click. One way to await an event is to wrap it in a new promise.
Example:
const newPagePromise = new Promise(resolve => browser.once('targetcreated', target => resolve(target.page())));
await page.click('.export-btn a');
const newPage = await newPagePromise;
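The "wrap the event in a promise" trick can be made reusable and given a timeout, so the script does not hang forever if the popup never opens. This is just a sketch: waitForEvent is a hypothetical helper, not part of Puppeteer, and it assumes the emitter exposes once() and off(). A plain Node EventEmitter stands in for the page or browser here.

```javascript
const { EventEmitter } = require('events');

// Resolve with the first payload of `eventName`, or reject after `timeoutMs`.
function waitForEvent(emitter, eventName, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const onEvent = (payload) => {
      clearTimeout(timer);
      resolve(payload);
    };
    const timer = setTimeout(() => {
      // Remove the listener so it does not linger after the timeout.
      emitter.off(eventName, onEvent);
      reject(new Error(`Timed out waiting for "${eventName}"`));
    }, timeoutMs);
    emitter.once(eventName, onEvent);
  });
}

// Demo with a plain EventEmitter standing in for a Puppeteer page or browser.
const demo = new EventEmitter();
const popupPromise = waitForEvent(demo, 'popup', 1000); // start listening first
demo.emit('popup', { url: 'about:blank' });             // then trigger the event
popupPromise.then((popup) => console.log(popup.url));   // prints "about:blank"
```

The important part is the order: create the promise before clicking, then await it after the click, exactly as in the example above.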
I have this script, and everything works until the start of block 2.
I don't understand why block 2 does nothing. I should see output from the page.on('request') handler, but instead the script exits right away. Any idea what the problem is?
Node returns no error.
Thanks.
async function main() {
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.setViewport({width: 1200, height: 720})
await page.goto('https://site.local', { waitUntil: 'networkidle0' }); // wait until page load
await page.type("input[name='UserName']", "myusername");
await page.type("input[name='Password']", "mypassworduser");
// click and wait for navigation
await Promise.all([
page.click("body > main > button"),
page.waitForNavigation({ waitUntil: 'networkidle0' }),
]);
await page.goto(urlformation);
await page.setRequestInterception(true);
await page.on('request', (request) => {
if (request.resourceType() === 'media') {
var CurrentRequest = request.url();
console.log(CurrentRequest);
fs.appendFileSync(fichiernurlaudio, request.url()+"\r\n");
}
request.continue();
});
//START BLOC 1 ------------------IT WORK
const Titresaudios = await page.evaluate(() => {
let names = document.querySelectorAll(
"td.cursor.audio"
);
let arr = Array.prototype.slice.call(names);
let text_arr = [];
for (let i = 0; i < arr.length; i += 1) {
text_arr.push($vartraited+"\r\n");
}
return text_arr;
})
fs.appendFileSync(fichiernomaudio, Titresaudios);
//END BLOCK 1------------------IT WORK- i got data in my file
//START BLOCK 2-------seems to ignore-----------NOT WORKING
await page.evaluate(()=>{
let elements = document.querySelectorAll("td.cursor.audio");
elements.forEach((element, index) => {
setTimeout(() => {
element.click();
}, index * 1000);
})
})
//END BLOCK 2---------seems to ignore---------NO WORKING
//i should see some console.log in page.on('request' (request) => { but instant close after works of bloc 1
await page.close();
await browser.close();
}
main();
I have no clue what exactly you are trying to achieve there, but that block could be rewritten like this:
// ...
const els = await page.$$( 'td.cursor.audio' );
for( const el of els ) {
// basically your timeout, but from outside the browser
await page.waitFor( 1000 );
// perform the action
await el.click();
}
// ...
In your script the only thing you did in the evaluate() call was to schedule a few timeout-actions. As soon as those were scheduled (but not executed!) the callback of evaluate() exits and your script proceeds with closing the browser. So likely your clicks were never executed.
In my experience it is usually advisable to do as much as you can in NodeJS and not within the browser. Usually also makes for easier debugging.
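If the clicks really have to stay inside the browser, another option is to have the callback return a promise that settles only after the last click, since page.evaluate() waits for a returned promise. Below is a rough sketch of such a function body; clickAllWithDelay is a made-up name, and the elements only need a click() method, so it can be tried in plain Node with fake objects.

```javascript
// Click each element in turn, pausing `delayMs` after each click.
// The returned promise settles only after the last pause, so a caller that
// awaits it cannot move on (or close the browser) too early.
async function clickAllWithDelay(elements, delayMs) {
  const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));
  let clicked = 0;
  for (const element of elements) {
    element.click();
    clicked += 1;
    await sleep(delayMs);
  }
  return clicked;
}
```

Inside page.evaluate() the same loop would run over document.querySelectorAll("td.cursor.audio"), with the helper defined inside the callback, because functions from the Node side are not visible in the page.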
I finally figured out how to use Node.js and installed all the libraries. Puppeteer is working, but, as previously with XMLHttpRequest, it only gets the template/body of the page, without the information I need. The scripts on the page only kick in a few seconds after it has been opened in a browser (a web app?). I need to get the information inside certain tags after the whole page has loaded. I would also ask whether it is possible to do this in pure JavaScript, because I do not use jQuery-like code, which doubles the difficulty for me.
Here what I have so far.
const puppeteer = require('puppeteer');
const $ = require('cheerio');
let browser;
let page;
const url = "really long link with latitude and attitude";
(async () => puppeteer
.launch()
.then(await function(browser) {
return browser.newPage();
})
.then(await function(page) {
return page.goto(url).then(function() {
return page.content();
});
})
.then(await function(html) {
$('strong', html).each(function() {
console.log($(this).text());
});
})
.catch(function(err) {
//handle error
}))();
I get only template default body elements inside strong tag. But it should contain a lot more data than just 10 items.
If you want the full HTML, the same as you see in the inspector, here it is:
const puppeteer = require('puppeteer');
(async function main() {
try {
const browser = await puppeteer.launch();
const [page] = await browser.pages();
await page.goto('https://example.org/', { waitUntil: 'networkidle0' });
const data = await page.evaluate(() => document.querySelector('*').outerHTML);
console.log(data);
await browser.close();
} catch (err) {
console.error(err);
}
})();
let bodyHTML = await page.evaluate(() => document.documentElement.outerHTML);
Some notes:
You don't need cheerio with Puppeteer, and you don't need to reparse page.content(): you already have the full DOM with all scripts run, and you can evaluate any code in the window context, as in a browser, using page.evaluate(), transferring serializable data between the web API context and the Node.js API context.
Try to use async/await only, this will simplify your code and flow.
If you need to wait till all the scripts and other dependencies are loaded, use waitUntil: 'networkidle0' in page.goto().
If you suspect that document scripts need some time till the needed state, use various test functions like page.waitForSelector() or fall back to page.waitFor(milliseconds).
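As a side note on the last point: page.waitFor() has been deprecated in later Puppeteer releases, so a fixed pause or a retry loop may have to be written by hand. A minimal sketch of both, where sleep and pollUntil are hypothetical helper names and the interval and timeout defaults are arbitrary:

```javascript
// Plain-Node pause, independent of any Puppeteer version.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `check` repeatedly until it returns a truthy value or time runs out.
async function pollUntil(check, { intervalMs = 250, timeoutMs = 5000 } = {}) {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const result = await check();
    if (result) return result;
    if (Date.now() >= deadline) {
      throw new Error('pollUntil: condition not met in time');
    }
    await sleep(intervalMs);
  }
}
```

For the question above this could look like await pollUntil(async () => (await page.$$('strong')).length > 10), although page.waitForSelector() or page.waitForFunction() is usually the better first choice.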
Here is a simple script that outputs all tag names in a page.
'use strict';
const puppeteer = require('puppeteer');
(async function main() {
try {
const browser = await puppeteer.launch();
const [page] = await browser.pages();
await page.goto('https://example.org/', { waitUntil: 'networkidle0' });
const data = await page.evaluate(
() => Array.from(document.querySelectorAll('*'))
.map(elem => elem.tagName)
);
console.log(data);
await browser.close();
} catch (err) {
console.error(err);
}
})();
You can specify your task in more details and we can try to write something more appropriate.
Script for www.bezrealitky.cz (task from a comment below):
'use strict';
const fs = require('fs');
const puppeteer = require('puppeteer');
(async function main() {
try {
const browser = await puppeteer.launch();
const [page] = await browser.pages();
page.setDefaultTimeout(0);
await page.goto('https://www.bezrealitky.cz/vyhledat?offerType=pronajem&estateType=byt&disposition=&ownership=&construction=&equipped=&balcony=&order=timeOrder_desc&boundary=%5B%5B%7B%22lat%22%3A50.171436864513%2C%22lng%22%3A14.506905276796942%7D%2C%7B%22lat%22%3A50.154133576294%2C%22lng%22%3A14.599004629591036%7D%2C%7B%22lat%22%3A50.14524430128%2C%22lng%22%3A14.58773054712799%7D%2C%7B%22lat%22%3A50.129307131988%2C%22lng%22%3A14.60087568578706%7D%2C%7B%22lat%22%3A50.122604734575%2C%22lng%22%3A14.659116306376973%7D%2C%7B%22lat%22%3A50.106512499343%2C%22lng%22%3A14.657434650206028%7D%2C%7B%22lat%22%3A50.090685542974%2C%22lng%22%3A14.705099547441932%7D%2C%7B%22lat%22%3A50.072175921973%2C%22lng%22%3A14.700004206235008%7D%2C%7B%22lat%22%3A50.056898491904%2C%22lng%22%3A14.640206899053055%7D%2C%7B%22lat%22%3A50.038528576841%2C%22lng%22%3A14.666852728301023%7D%2C%7B%22lat%22%3A50.030955909657%2C%22lng%22%3A14.656128752460972%7D%2C%7B%22lat%22%3A50.013435368522%2C%22lng%22%3A14.66854956530301%7D%2C%7B%22lat%22%3A49.99444182116%2C%22lng%22%3A14.640153080292066%7D%2C%7B%22lat%22%3A50.010839032542%2C%22lng%22%3A14.527474219359988%7D%2C%7B%22lat%22%3A49.970771602447%2C%22lng%22%3A14.46224174052395%7D%2C%7B%22lat%22%3A49.970669964027%2C%22lng%22%3A14.400648545303966%7D%2C%7B%22lat%22%3A49.941901176098%2C%22lng%22%3A14.395563234671044%7D%2C%7B%22lat%22%3A49.948384148423%2C%22lng%22%3A14.337635637038034%7D%2C%7B%22lat%22%3A49.958376114735%2C%22lng%22%3A14.324977842107955%7D%2C%7B%22lat%22%3A49.9676286223%2C%22lng%22%3A14.34491711110104%7D%2C%7B%22lat%22%3A49.971859099005%2C%22lng%22%3A14.326815050839059%7D%2C%7B%22lat%22%3A49.990608728081%2C%22lng%22%3A14.342731259186962%7D%2C%7B%22lat%22%3A50.002211140429%2C%22lng%22%3A14.29483886971002%7D%2C%7B%22lat%22%3A50.023596577558%2C%22lng%22%3A14.315872285282012%7D%2C%7B%22lat%22%3A50.058309376419%2C%22lng%22%3A14.248086830069042%7D%2C%7B%22lat%22%3A50.073179111%2C%22lng%22%3A14.290193274400963%7D%2C%7B%22lat%22%3A50.102973823639%2C%22lng%22%3A14.224439442359994%7D%2C%7B%22lat%22%3A50.130060800171%2C%22lng%22%3A14.302396419107936%7D%2C%7B%22lat%22%3A50.116019827009%2C%22lng%22%3A14.360785349547996%7D%2C%7B%22lat%22%3A50.148005694843%2C%22lng%22%3A14.365662825877052%7D%2C%7B%22lat%22%3A50.14142969454%2C%22lng%22%3A14.394903042943952%7D%2C%7B%22lat%22%3A50.171436864513%2C%22lng%22%3A14.506905276796942%7D%2C%7B%22lat%22%3A50.171436864513%2C%22lng%22%3A14.506905276796942%7D%5D%5D&hasDrawnBoundary=1&mapBounds=%5B%5B%7B%22lat%22%3A50.289447077141126%2C%22lng%22%3A14.68724263943227%7D%2C%7B%22lat%22%3A50.289447077141126%2C%22lng%22%3A14.087801111111958%7D%2C%7B%22lat%22%3A50.039169221047985%2C%22lng%22%3A14.087801111111958%7D%2C%7B%22lat%22%3A50.039169221047985%2C%22lng%22%3A14.68724263943227%7D%2C%7B%22lat%22%3A50.289447077141126%2C%22lng%22%3A14.68724263943227%7D%5D%5D&center=%7B%22lat%22%3A50.16447196305031%2C%22lng%22%3A14.387521875272125%7D&zoom=11&locationInput=praha&limit=15');
await page.waitForSelector('#search-content button.btn-icon');
while (await page.$('#search-content button.btn-icon') !== null) {
const articlesForNow = (await page.$$('#search-content article')).length;
console.log(`Articles for now: ${articlesForNow}. Getting more...`);
await Promise.all([
page.evaluate(
() => { document.querySelector('#search-content button.btn-icon').click(); }
),
page.waitForFunction(
old => document.querySelectorAll('#search-content article').length > old,
{},
articlesForNow
),
]);
}
const articlesAll = (await page.$$('#search-content article')).length;
console.log(`All articles: ${articlesAll}.`);
fs.writeFileSync('full.html', await page.content());
fs.writeFileSync('articles.html', await page.evaluate(
() => document.querySelector('#search-content div.b-filter__inner').outerHTML
));
fs.writeFileSync('articles.txt', await page.evaluate(
() => [...document.querySelectorAll('#search-content article')]
.map(({ innerText }) => innerText)
.join(`\n${'-'.repeat(50)}\n`)
));
console.log('Saved.');
await browser.close();
} catch (err) {
console.error(err);
}
})();
Just one line:
const html = await page.content();
Details:
import puppeteer from 'puppeteer'
const test = async (url) => {
const browser = await puppeteer.launch({ headless: false })
const page = await browser.newPage()
await page.goto(url, { waitUntil: 'networkidle0' })
const html = await page.content()
console.log(html)
await browser.close() // close the browser so the Node process can exit
}
await test('https://stackoverflow.com/')
I'm trying to check, from within a function, whether elements are available on a page. If the element is on the page, good, continue with the code; if not, log the error.
Using the try puppeteer page, here is what I tried:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const check = element => {
try {
await page.waitFor(element, {timeout: 1000});
} catch(e) {
console.log("error : ", e)
await browser.close();
}
}
await page.goto('https://www.example.com/');
check("#something");
console.log("done")
await browser.close();
I get Error running your code. SyntaxError: Unexpected identifier. I debugged a bit and it seems that page within the check function is the unexpected identifier. So I tried to pass it in with force like this:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const check = (element, page) => {
try {
await page.waitFor(element, {timeout: 1000});
} catch(e) {
console.log("error : ", e)
await browser.close();
}
}
await page.goto('https://www.example.com/');
check("#something", page);
console.log("done")
await browser.close();
but I get the same Error running your code. SyntaxError: Unexpected identifier error...
What am I doing wrong?
You can use this variant to check if the element is in the page or not.
if (await page.$(selector) !== null) console.log('found');
else console.log('not found');
Now back to your code: it throws the error because the function is not async.
const check = async element => { // <-- make it async
try {
await page.waitFor(element, {timeout: 1000});
} catch(e) {
console.log("error : ", e)
await browser.close();
}
}
Any time you use await, it must be inside an async function; you cannot use await just anywhere. The check function is itself async, so it should be called like this:
await check("#something", page);
So altogether we can rewrite the code snippet this way, you can go ahead and try this one.
const browser = await puppeteer.launch();
const page = await browser.newPage();
const check = async (element, page) => (await page.$(element)) !== null; // async; resolves to true if the element exists in the DOM
await page.goto('https://www.example.com/');
// now lets check for the h1 element on example.com
const foundH1 = await check("h1", page);
console.log(`Element Found? : ${foundH1}`);
// now lets check for the h2 element on example.com
const foundH2 = await check("h2", page);
console.log(`Element Found? : ${foundH2}`);
await browser.close();
Also, async functions always return promises, so you have to handle that promise with .then()/.catch() or await it. Read more about async/await here:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/async_function
https://ponyfoo.com/articles/understanding-javascript-async-await
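To see why the await in front of check() matters, here is a tiny sketch that runs without Puppeteer. fakePage is a made-up stand-in whose $() only "finds" h1, and demo is just an illustrative wrapper.

```javascript
// Stand-in for a Puppeteer page whose $() finds only <h1>.
const fakePage = {
  async $(selector) {
    return selector === 'h1' ? { tag: 'h1' } : null;
  },
};

const check = async (selector, page) => (await page.$(selector)) !== null;

async function demo() {
  const pending = check('h1', fakePage);       // no await: a Promise, which is always truthy
  const found = await check('h1', fakePage);   // true
  const missing = await check('h2', fakePage); // false
  return { isPromise: pending instanceof Promise, found, missing };
}
```

Without await, an if (check(...)) test would always pass, because a pending promise is a truthy object regardless of what it later resolves to.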