Puppeteer custom error messages when failure - javascript

I am trying to create custom error messages for when Puppeteer fails to do a task; in my case, it cannot find the field that it has to click.
describe('simple test for Linkedin Login functionality', () => {
let page;
before(async () => { /* before hook for mocha testing */
page = await browser.newPage();
await page.goto("https://www.linkedin.com/login");
await page.setViewport({ width: 1920, height: 1040 });
});
after(async function () { /* after hook for mocha testing */
await page.close();
});
it('should login to home page', async () => { /* simple test case */
const emailInput = "#username";
const passwordInput = "#assword";
const submitSelector = ".login__form_action_container ";
linkEmail = await page.$(emailInput);
linkPassword = await page.$(passwordInput)
linkSubmit = await page.$(submitSelector);
await linkEmail.click({ clickCount: 3 });
await linkEmail.type('testemail#example.com'); // add the email address for linkedin //
await linkPassword.click({ clickCount: 3 }).catch(error => {
console.log('The following error occurred: ' + error);
});;
await linkPassword.type('testpassword'); // add password for linkedin account
await linkSubmit.click();
await page.waitFor(3000);
});
});
I have deliberately put a wrong passwordInput name in order to force puppeteer to fail. However, the console.log message is never printed.
This is my error output which is the default mocha error:
simple test for Linkedin Login functionality
1) should login to home page
0 passing (4s)
1 failing
1) simple test for Linkedin Login functionality
should login to home page:
TypeError: Cannot read property 'click' of null
at Context.<anonymous> (test/sample.spec.js:29:28)
Line 29 is the await linkPassword.click({ clickCount: 3 }) call.
Does anyone have an idea how I can make it print a custom error message when an error like this occurs?

The problem is that the exception is thrown not while linkPassword.click() is executing, but while attempting to call it. .catch() only handles a rejection of the promise that click() returns. page.$() returns null when the selector isn't found, so in your case you are effectively executing null.click({ clickCount: 3 }).catch(), which throws a TypeError before any promise exists.
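The difference can be reproduced without a browser. The sketch below is only an illustration: handle stands in for the null that page.$() returns, and describeFailure is a hypothetical helper name, not part of Puppeteer or Mocha.

```javascript
// Stand-in for the result of page.$() when the selector matches nothing.
const handle = null;

// Hypothetical helper: runs an action and reports how it failed, if it did.
async function describeFailure(action) {
  try {
    // If action() throws synchronously, control jumps straight to the catch
    // block below; a .catch() chained inside action() is never reached.
    await action();
    return 'no error';
  } catch (error) {
    return 'caught by try/catch: ' + error.constructor.name;
  }
}

// Same shape as the failing line in the question.
const attempt = () =>
  handle.click({ clickCount: 3 }).catch(() => console.log('never printed'));

describeFailure(attempt).then((result) => console.log(result));
// prints "caught by try/catch: TypeError"
```

The TypeError is raised while evaluating handle.click, so only a surrounding try/catch (or a null check) can intercept it.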
To quickly solve your problem, check that linkPassword isn't null before using it. However, I think using page.$() to get an element to interact with is a mistake: it does not wait for the element and silently returns null when nothing matches, whereas Puppeteer's selector-based methods such as page.click() throw an error that names the selector.
Instead, you should make sure that the element exists and then use Puppeteer's API to interact with it, like this:
const emailInput = "#username";
await page.waitForSelector(emailInput);
await page.click(emailInput, { clickCount: 3 });
await page.type(emailInput, 'testemail#example.com')
This way your script waits for the element to appear, then scrolls to it, clicks it, and types the text.
You can then handle the case where the element isn't found like this:
page.waitForSelector(emailInput).catch(() => {})
or just by using try/catch.
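For the custom message the question asks about, a try/catch variant might look like the sketch below. clickOrFail and its message parameter are hypothetical names introduced for illustration; only page.waitForSelector() and page.click() are Puppeteer calls, and the 5000 ms default timeout is an arbitrary assumption.

```javascript
// Hypothetical helper: waits for a selector, clicks it, and replaces
// Puppeteer's default error with a custom message if either step fails.
async function clickOrFail(page, selector, message, timeout = 5000) {
  try {
    await page.waitForSelector(selector, { visible: true, timeout });
    await page.click(selector, { clickCount: 3 });
  } catch (error) {
    // Keep the original error text so the root cause is not lost.
    throw new Error(`${message} (selector: ${selector}): ${error.message}`);
  }
}

// Usage inside the test, assuming `page` is a Puppeteer Page:
// await clickOrFail(page, '#password', 'Password field not found');
```

Because the helper rethrows, Mocha still marks the test as failed, but the report shows the custom message instead of "Cannot read property 'click' of null".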

Related

How to run window.stop after certain time in puppeteer

I want to scrape a site, but the problem is that its content loads in about 1 s while the loader in the navbar keeps spinning for 30 s to 1 min, so my Puppeteer scraper keeps waiting for the loader to stop.
Is there any way to run window.stop() after a certain timeout?
const checkBook = async () => {
await page.goto(`https://wattpad.com/story/${bookid}`, {
waitUntil: 'domcontentloaded',
});
const is404 = await page.$('#story-404-wrapper');
if (is404) {
socket.emit('error', {
message: 'Story not found',
});
await browser.close();
return {
error: true,
};
}
storyLastUpdated = await page
.$eval(
'.table-of-contents__last-updated strong',
(ele: any) => ele.textContent,
)
.then((date: string) => getDate(date));
};
Similar approach to Marcel's answer. The following will do the job:
page.goto(url)
await page.waitForTimeout(1000)
await page.evaluate(() => window.stop())
// your scraper script goes here
await browser.close()
Notes:
page.goto() is NOT awaited, so you save time compared to waiting for the DOMContentLoaded or load events.
Because goto is not awaited, you need to make sure the DOM is ready before your script starts working with it. You can use either page.waitForTimeout() or page.waitForSelector().
You have to execute window.stop() within page.evaluate() to avoid this kind of error: Error: Navigation failed because browser has disconnected!
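The same steps can be wrapped in a small helper. This is a sketch under assumptions: gotoAndStop and delay are hypothetical names, and the .catch() on the un-awaited goto is my addition so that a navigation cut short by window.stop() does not surface as an unhandled rejection.

```javascript
// Promise-based delay, so the sketch does not depend on page.waitForTimeout().
const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: starts a navigation, gives the page `ms` milliseconds
// to load, then stops whatever is still loading.
async function gotoAndStop(page, url, ms) {
  // Not awaited on purpose; errors from the interrupted navigation are ignored.
  page.goto(url).catch(() => null);
  await delay(ms);
  // window.stop() has to run inside the page, hence page.evaluate().
  await page.evaluate(() => window.stop());
}

// Usage, assuming `page` is a Puppeteer Page:
// await gotoAndStop(page, 'https://example.com', 1000);
```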
You could strip the
waitUntil: 'domcontentloaded',
in favor of a timeout as documented here https://github.com/puppeteer/puppeteer/blob/v14.1.0/docs/api.md#pagegotourl-options
or set the timeout to zero and instead use one of the page.waitFor... like this
await page.waitForTimeout(30000);

Playwright error (Target closed) after navigation

I'm trying something really simple:
Navigate to google.com
Fill the search box with "cheese"
Press enter on the search box
Print the text for the title of the first result
So simple, but I can't get it to work. This is the code:
const playwright = require('playwright');
(async () => {
for (const browserType of ['chromium', 'firefox', 'webkit']) {
const browser = await playwright[browserType].launch();
try {
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
await page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
page.waitForSelector('div#rso h3')
.then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`))
.catch(error => console.error(`Waiting for result: ${error}`));
} catch(error) {
console.error(`Trying to run test on ${browserType}: ${error}`);
} finally {
await browser.close();
}
}
})();
At first I tried to get the first result with page.$(), but it didn't work. After investigating a little, I discovered page.waitForNavigation(), which I thought would be the solution, but it isn't.
I'm using the latest playwright version: 1.0.2.
It seems to me that the only problem was your initial promise composition. I refactored the promise chain to async/await and used page.$eval to retrieve the textContent; it works perfectly, and there are no more Target closed errors.
try {
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
await page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
// page.waitForSelector('div#rso h3').then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`)).catch(error => console.error(`Waiting for result: ${error}`));
await page.waitForSelector('div#rso h3');
const firstResult = await page.$eval('div#rso h3', firstRes => firstRes.textContent);
console.log(`${browserType}: ${firstResult}`)
} catch(error) {
console.error(`Trying to run test on ${browserType}: ${error}`);
} finally {
await browser.close();
}
Output:
chrome: Cheese – Wikipedia
firefox: Cheese – Wikipedia
webkit: Cheese – Wikipedia
Note: Chromium and WebKit work, but Firefox fails on waitForNavigation for me. When I replaced it with await page.waitForTimeout(5000); Firefox worked as well. It might be an issue with Playwright's Firefox support for the navigation promise.
If you await page.press('input[name=q]', 'Enter'); it might be too late for waitForNavigation to work.
You could remove the await on the press call. You need to wait for the navigation, not the press action.
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
var firstResult = await page.waitForSelector('div#rso h3');
console.log(`${browserType}: ${await firstResult.textContent()}`);
Also notice that you need to await textContent().
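A variant that keeps both promises in one chain is to start the navigation wait and the key press together. This is a sketch, not taken from the answers above; pressAndWaitForNavigation is a hypothetical helper name, while page.waitForNavigation() and page.press() are the Playwright calls already used in the question.

```javascript
// Hypothetical helper: presses a key and waits for the navigation it triggers.
async function pressAndWaitForNavigation(page, selector, key) {
  // The array is built left to right, so the navigation wait is registered
  // before the key press is sent, and both results are awaited together.
  await Promise.all([
    page.waitForNavigation(),
    page.press(selector, key),
  ]);
}

// Usage, assuming `page` is a Playwright Page:
// await pressAndWaitForNavigation(page, 'input[name=q]', 'Enter');
```

Unlike a bare un-awaited page.press(), a rejection from either call ends up in the surrounding try/catch.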
In my case, the Playwright error Target closed appeared at the first attempt to retrieve text from the page.
The error message is misleading: the actual reason was that Basic Auth was enabled on the target site.
Playwright could not open the page and just failed with "Target closed". Passing the credentials when creating the context fixed it:
const options = {
httpCredentials: { username: 'user', password: 'password' }
};
const context = await browser.newContext(options);
One more issue was that local tests ran without a problem, including in Docker containers, while GitHub CI was failing with Playwright without any details except the above error.
The reason was a special symbol in a GitHub Secret. For example, the dollar sign $ will simply be removed from the secret in GitHub Actions. To correct it, either use an env: section
env:
XXX: ${{ secrets.SUPER_SECRET }}
or wrap the secret in single quotes:
run: |
export XXX='${{ secrets.YYY}}'
A similar escaping quirk exists in Kubernetes, Docker and GitLab: $$ becomes $ and z$abc becomes z.
Use the mcr.microsoft.com/playwright Docker image from Microsoft, which has node, npm and Playwright pre-installed. Alternatively, during the Playwright installation, do not forget to install the system package dependencies by running npx playwright install-deps.
The VM should have enough resources to handle browser tests; lacking them is a common problem in CI/CD workflows.

Reloading page in puppeteer if response code is not valid

So I know I can check the response code after a page.goto() call using the response.status() function but my program is built to scrape and do a bunch of actions on a website. Now some websites under load or randomly return a 500 error or 503 error instead of serving up the webpage.
So what I want to do is, for every navigation request, reload the page if the response code is 500 or 503. I have been looking at setRequestInterception, but that fires before a request is made. setResponseInterception doesn't exist yet (though I see it as a potential feature on GitHub). It would be a piece of cake with setResponseInterception:
Check response code
If 500 or 503, reload page
I am wondering if I can do anything like this right now using setRequestInterception. Or I may have to individually monitor each navigation call and check if it returns a valid code before proceeding.
You didn't provide a code sample, so I don't know how your code is structured, but here is one way to do this:
const puppeteer = require('puppeteer');

async function init_puppeteer(link) {
  const browser = await puppeteer.launch({ headless: false, args: ['--no-sandbox', '--disable-setuid-sandbox'] });
  let success = false;
  // keep retrying until the page loads with an acceptable status code
  while (!success) {
    success = await open_page(browser, link);
  }
  await browser.close();
}

async function open_page(browser, link) {
  const page = await browser.newPage();
  try {
    const response = await page.goto(link);
    // treat 500 and 503 as failures so the caller retries
    if (response && [500, 503].includes(response.status())) {
      throw new Error('Bad status: ' + response.status());
    }
    return true;
  } catch (e) {
    await page.close(); // discard the failed tab before retrying
    return false;
  }
}
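A bounded variant that reloads the same page when the status is 500 or 503 could be sketched as follows. gotoWithRetry, maxAttempts and retryStatuses are hypothetical names, not part of Puppeteer; the sketch relies only on page.goto() and page.reload() resolving to the main response (or null).

```javascript
// Hypothetical helper: navigates to `url` and reloads while the server keeps
// answering with one of `retryStatuses`, up to `maxAttempts` tries in total.
async function gotoWithRetry(page, url, { maxAttempts = 3, retryStatuses = [500, 503] } = {}) {
  const isBad = (response) => Boolean(response) && retryStatuses.includes(response.status());

  let response = await page.goto(url);
  for (let attempt = 1; attempt < maxAttempts && isBad(response); attempt++) {
    response = await page.reload();
  }
  if (isBad(response)) {
    throw new Error(`${url} still returned ${response.status()} after ${maxAttempts} attempts`);
  }
  return response;
}

// Usage, assuming `page` is a Puppeteer Page:
// const response = await gotoWithRetry(page, 'https://example.com');
```

Capping the attempts avoids looping forever when the site is down; a pause between reloads may be worth adding for sites under load.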

login into gmail fails for unknown reason

I am trying to log in to my Gmail account with Puppeteer to lower the risk of reCAPTCHA.
Here is my code:
await page.goto('https://accounts.google.com/AccountChooser?service=mail&continue=https://mail.google.com/mail/', {timeout: 60000})
.catch(function (error) {
throw new Error('TimeoutBrows');
});
await page.waitForSelector('#identifierId' , { visible: true });
await page.type('#identifierId' , 'myemail');
await Promise.all([
page.click('#identifierNext') ,
page.waitForSelector('.whsOnd' , { visible: true })
])
await page.type('#password .whsOnd' , "mypassword");
await page.click('#passwordNext');
await page.waitFor(5000);
But I always end up with this message:
I even tried to just open the login window with puppeteer and fill the login form manually myself, but even that failed.
Am I missing something ?
When I look into console there is a failed ajax call just after login.
Request URL: https://accounts.google.com/_/signin/challenge?hl=en&TL=APDPHBCG5lPol53JDSKUY2mO1RzSwOE3ZgC39xH0VCaq_WHrJXHS6LHyTJklSkxd&_reqid=464883&rt=j
Request Method: POST
Status Code: 401
Remote Address: 216.58.213.13:443
Referrer Policy: no-referrer-when-downgrade
)]}'
[[["er",null,null,null,null,401,null,null,null,16]
,["e",2,null,null,81]
]]
I've inspected your code and it seems to be correct apart from some selectors. Also, I had to add a couple of timeouts to make it work. However, I failed to reproduce your issue, so I'll just post the code that worked for me.
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
await page.goto('https://accounts.google.com/AccountChooser?service=mail&continue=https://mail.google.com/mail/', {timeout: 60000})
.catch(function (error) {
throw new Error('TimeoutBrows');
});
await page.screenshot({path: './1.png'});
...
})();
Please note that I run the browser in normal, not headless, mode. If you take a look at the screenshot taken at this point, you will see the correct Google login form.
The rest of the code is responsible for entering the password:
const puppeteer = require('puppeteer');
(async () => {
...
await page.waitForSelector('#identifierId', {visible: true});
await page.type('#identifierId', 'my#email');
await Promise.all([
page.click('#identifierNext'),
page.waitForSelector('.whsOnd', {visible: true})
]);
await page.waitForSelector('input[name=password]', {visible: true});
await page.type('input[name=password]', "my.password");
await page.waitForSelector('#passwordNext', {visible: true});
await page.waitFor(1000);
await page.click('#passwordNext');
await page.waitFor(5000);
})();
Please also note a few differences from your code: the selector for the password field is different, and I had to add await page.waitForSelector('#passwordNext', {visible: true}); and a small timeout after it so the button could be clicked successfully.
I've tested all the code above and it worked. Please let me know if you still need help or have trouble with my example.
The purpose of the question is to log in to Gmail. I will share another method that does not involve filling in the email and password fields in the Puppeteer script and works in headless: true mode.
Method
Log in to your Gmail account using a normal browser (preferably Google Chrome)
Export all cookies for the gmail tab
Use page.setCookie to import the cookies to your puppeteer instance
Login to gmail
This should be a no-brainer.
Export all cookies
I will use an extension called Edit This Cookie, however you can use other extensions or manual methods to extract the cookies.
Click the browser icon and then click the Export button.
Import cookies to puppeteer instance
We will save the cookies in a cookies.json file and then import using page.setCookie function before navigation. That way when gmail page loads, it will have login information right away.
The code might look like this.
const puppeteer = require("puppeteer");
const cookies = require('./cookies.json');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
// Set cookies here, right after creating the instance
await page.setCookie(...cookies);
// do the navigation,
await page.goto("https://mail.google.com/mail/u/0/#search/stackoverflow+survey", {
waitUntil: "networkidle2",
timeout: 60000
});
await page.screenshot({ path: "example.png" });
await browser.close();
})();
Notes:
It was not asked, but I should mention the following for future readers.
Cookie Expiration: Cookies might be short-lived and expire soon after export, or behave differently on a different device. Logging out on your original device will log you out of the Puppeteer session as well, since it shares the cookies.
Two Factor: I am not yet sure about 2FA. It did not ask me for 2FA, probably because I logged in from the same device.
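If you would rather not depend on a browser extension, the cookies can also be written out by Puppeteer itself after a manual login in a headful session. The sketch below assumes that approach, which is a variation on the method above; saveCookies and loadCookies are hypothetical helper names, and the expiry filter assumes the expires field (seconds since the epoch, -1 for session cookies) that page.cookies() returns.

```javascript
const fs = require('fs');

// Hypothetical helper: dumps the cookies of the current page to a JSON file.
async function saveCookies(page, file) {
  const cookies = await page.cookies();
  fs.writeFileSync(file, JSON.stringify(cookies, null, 2));
  return cookies.length;
}

// Hypothetical helper: restores cookies from the file, skipping expired ones.
async function loadCookies(page, file) {
  const cookies = JSON.parse(fs.readFileSync(file, 'utf8'));
  const now = Date.now() / 1000;
  // Session cookies (no positive `expires`) are kept; dated ones must be in the future.
  const valid = cookies.filter((c) => !(c.expires > 0) || c.expires > now);
  if (valid.length > 0) await page.setCookie(...valid);
  return valid.length;
}

// Usage, assuming `page` is a Puppeteer Page:
// await saveCookies(page, './cookies.json');  // after logging in manually
// await loadCookies(page, './cookies.json');  // in a later run, before page.goto()
```

Dropping expired entries up front makes it easier to notice when the saved session is no longer usable.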

How do I reference the current page object in puppeteer once user moves from login to homepage?

So I am trying to use puppeteer to automate some data entry functions in Oracle Cloud applications.
As of now I am able to launch the cloud app login page, enter the username and password, and click the login button. Once login is successful, Oracle opens a homepage for the user. Once this happens, if I take a screenshot or call page.content(), the screenshot and the HTML are from the login page, not the homepage.
How do I always have a reference to the current page that the user is on?
Here is the basic code so far.
const puppeteer = require('puppeteer');
const fs = require('fs');
(async () => {
const browser = await puppeteer.launch({headless: false});
let page = await browser.newPage();
await page.goto('oraclecloudloginurl', {waitUntil: 'networkidle2'});
await page.type('#userid', 'USERNAME', {delay : 10});
await page.type('#password', 'PASSWORD', {delay : 10});
await page.waitForSelector('#btnActive', {enabled : true});
page.click('#btnActive', {delay : 1000}).then(() => console.log('Login Button Clicked'));
await page.waitForNavigation();
await page.screenshot({path: 'home.png'});
const html = await page.content();
await fs.writeFileSync('home.html', html);
await page.waitFor(10000);
await browser.close();
})();
With this the user logs in fine and the home page is displayed. But I get an error after that when I try to screenshot the homepage and render the HTML content. It seems the page has changed and I am still referring to the old page. How can I refer to the context of the current page?
Below is the error:
(node:14393) UnhandledPromiseRejectionWarning: Error: Protocol error (Runtime.callFunctionOn): Cannot find context with specified id undefined
This code looks problematic for two reasons:
page.click('#btnActive', {delay : 1000}).then(() => console.log('Login Button Clicked'));
await page.waitForNavigation();
The first problem is that the page.click().then() spins off a totally separate promise chain:
page.click() --> .then(...)
|
v
page.waitForNavigation()
|
v
page.screenshot(...)
|
v
...
This means the click that triggers the navigation and the navigation are running in parallel and can never be rejoined into the same promise chain. The usual solution here is to tie them into the same promise chain:
// Note: this code is still broken; keep reading!
await page.click('#btnActive', {delay : 1000});
console.log('Login Button Clicked');
await page.waitForNavigation();
This adheres to the principle of not mixing then and await unless you have good reason to.
But the above code is still broken because Puppeteer requires the waitForNavigation() promise to be set before the event that triggers navigation is fired. The fix is:
await Promise.all([
page.waitForNavigation(),
page.click('#btnActive', {delay : 1000}),
]);
or
const navPromise = page.waitForNavigation(); // no await
await page.click('#btnActive', {delay : 1000});
await navPromise;
Following this pattern, Puppeteer should no longer be confused about its context.
Minor notes:
'networkidle2' is slow and probably unnecessary, especially for a page you're soon going to be navigating away from. I'd default to 'domcontentloaded'.
await page.waitFor(10000); is deprecated along with page.waitForTimeout(), although I realize this is an older post.
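If a fixed pause is still needed, a plain promise-based timer does the job without either deprecated method. This is a generic JavaScript sketch rather than a Puppeteer API; sleep is a hypothetical name.

```javascript
// Promise-based pause that does not depend on the Puppeteer version.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Usage inside an async function:
// await sleep(10000);
```

Waiting for a specific condition, for example with page.waitForSelector(), is usually more reliable than a fixed delay.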
