Puppeteer get window URL through page redirects - javascript

In Puppeteer I'm trying to get the URL of the page I'm currently on. However, when the page changes, my setInterval doesn't pick up the URL of the new page. For example, the URL journey looks like:
https://example.com/
https://google.com/ <- redirects after X seconds
I expect to see https://example.com/ listed in the console.log, which I do, but after navigating to a new URL I only ever see the original URL.
What am I missing from my code here?
(async () => {
const browser = await puppeteer.launch({
headless: false
});
const page = await browser.newPage();
await page.goto(argv.url); // <-- the page I go to has some auto-redirects
const currentUrl = () => {
return window.location.href;
}
let data = await page.evaluate(currentUrl);
setInterval(() => {
console.log(`current URL is: ${data}`);
}, 250)
// await browser.close();
})();

You need to call the function on every tick. Currently data holds the result of a single earlier call, so you print the same static value each time:
setInterval(async () => {
console.log(`current URL is: ${await page.evaluate(currentUrl)}`);
}, 250)
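As an alternative to polling, here is a minimal sketch that assumes only main-frame navigations matter. Puppeteer emits a framenavigated event whenever a frame commits a new URL, so the redirect journey can be recorded as it happens. The helper name trackMainFrameUrls is made up for this example.

```javascript
// Hypothetical helper: records every URL the main frame navigates to.
// `page` is a Puppeteer Page (anything with on() and mainFrame() works).
function trackMainFrameUrls(page, onUrl = () => {}) {
  const seen = [];
  page.on('framenavigated', (frame) => {
    // Ignore iframes; only the top-level frame matches the address bar.
    if (frame !== page.mainFrame()) return;
    seen.push(frame.url());
    onUrl(frame.url());
  });
  return seen;
}

// Usage (register the listener before navigating):
//   trackMainFrameUrls(page, (url) => console.log(`current URL is: ${url}`));
//   await page.goto(argv.url);
```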

Related

How to set Cookies enabled in puppeteer

I'm currently having a problem in a Puppeteer project: I think cookies are not enabled, and I don't know how to enable them if they aren't enabled by default.
Since almost every website has a CAPTCHA nowadays, we can skip the automatic login part.
I'm new to this too; I got the idea from here.
First, check whether a saved cookies.json file exists. If it doesn't, log in manually: click the submit button yourself and solve the CAPTCHA (in non-headless mode); the page should then redirect to the destination page.
Once the destination page has loaded, save the cookies to a JSON file for next time.
Example:
const puppeteer = require('puppeteer');
const fs = require('fs');
(async () => {
const browser = await puppeteer.launch({
headless: false, //launch in non-headless mode so you can see graphics
defaultViewport: null
});
let [page] = await browser.pages();
await page.setRequestInterception(true);
const getCookies = async (page) => {
// Get page cookies
const cookies = await page.cookies()
// Save cookies to file
fs.writeFile('./cookies.json', JSON.stringify(cookies, null, 4), (err) => {
if (err) console.log(err);
return
});
}
const setCookies = async (page) => {
// Get cookies from file as a string
let cookiesString = fs.readFileSync('./cookies.json', 'utf8');
// Parse string
let cookies = JSON.parse(cookiesString)
// Set page cookies
await page.setCookie.apply(page, cookies);
return
}
page.on('request', async (req) => {
// If URL is already loaded in to system
if (req.url() === 'https://example.com/LoggedUserCP') {
console.log('logged in, get the cookies');
await getCookies(page);
// if it's being in login page, try to set existed cookie
} else if (req.url() === 'https://example.com/Login?next=LoggedUserCP') {
await setCookies(page);
console.log('set the saved cookies');
}
// otherwise go to login page and login yourself
req.continue();
});
await page.goto('https://example.com/Login?next=LoggedUserCP');
})();
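The "check for a saved cookies.json first" step can be sketched as below. This is only an illustration: loadSavedCookies and restoreCookies are hypothetical names, and the file is assumed to contain the JSON array written from page.cookies().

```javascript
const fs = require('fs');

// Hypothetical helper: returns the saved cookies, or [] when no file exists
// yet (first run, before the manual login has produced cookies.json).
function loadSavedCookies(file = './cookies.json') {
  if (!fs.existsSync(file)) return [];
  return JSON.parse(fs.readFileSync(file, 'utf8'));
}

// Applies the saved cookies to a Puppeteer page and resolves to how many
// cookies were set (0 means a manual login is still needed).
async function restoreCookies(page, file = './cookies.json') {
  const cookies = loadSavedCookies(file);
  if (cookies.length > 0) await page.setCookie(...cookies);
  return cookies.length;
}
```

Calling restoreCookies(page) before page.goto(...) avoids reading a file that does not exist on the first run.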

How to change url path with puppeteer after login

I'm trying to change the URL path because I use a path variable. I don't want to do it with page.click because I reach a dead end at some point.
My code is:
const generarPDF = async (id, fecha) => {
const usuarios = await Usuarios.find();
usuarios.forEach(async dato => {
const urlBase = 'http://localhost:3000';
const urlDestino = '/monitor/604c058e90de8c58c8c5ddb3';
const navegador = await puppeteer.launch();
const pagina = await navegador.newPage();
await pagina.setViewport({ width: 1920, height: 1080 });
await pagina.goto(urlBase);
await pagina.type('#usuario', dato.usuario);
await pagina.type('#passwd', '1234');
await pagina.click('#ingresar');
await pagina.goto(urlBase+urlDestino)
await pagina.pdf({ path: 'archivos/incidencia1.pdf', format: 'A4' });
})
}
generarPDF();
These three lines are the ones I use to log in
await pagina.type('#usuario', dato.usuario);
await pagina.type('#passwd', '1234');
await pagina.click('#ingresar');
I know I log in correctly. The problem is that the second page.goto logs me out. Is there any way to prevent that? If I enter the URL manually it works, and page.url() returns the correct URL, so the problem is that it logs me out.
Thanks for any help :D
When you use .goto(...), Puppeteer waits for the page to load.
When you use .click(...), it does not (https://pptr.dev/#?product=Puppeteer&version=v8.0.0&show=api-pageclickselector-options).
You are probably navigating away before the page that follows the login has finished loading. Try replacing:
await pagina.click('#ingresar')
with
const [response] = await Promise.all([
  pagina.waitForNavigation({ waitUntil: 'networkidle0' }),
  pagina.click('#ingresar')
]);
P.S. I usually prefer waitForSelector to waitForNavigation. For example, if the login confirmation page contains a div like <div class="login-conf">, you can write:
await pagina.click('#ingresar');
await pagina.waitForSelector('div.login-conf', { timeout: 3000 });
// throws if the element has not appeared after 3 seconds
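Putting the advice together with the selectors from the question, the whole login flow might look like the sketch below. loginThenOpen is a hypothetical helper, and it assumes that clicking #ingresar triggers a full page navigation.

```javascript
// Hypothetical helper built from the selectors in the question.
async function loginThenOpen(pagina, { urlBase, urlDestino, usuario, passwd }) {
  await pagina.goto(urlBase);
  await pagina.type('#usuario', usuario);
  await pagina.type('#passwd', passwd);
  // Start waiting before the click so the login navigation is not missed.
  await Promise.all([
    pagina.waitForNavigation({ waitUntil: 'networkidle0' }),
    pagina.click('#ingresar'),
  ]);
  // The login navigation has finished, so leaving the page no longer
  // interrupts it.
  await pagina.goto(urlBase + urlDestino);
  return pagina.url();
}
```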

How To Get The URL After Redirecting from Current Page To Another Using Puppeteer?

I'm Aadarshvelu! I recently started testing my web app code using Jest with Puppeteer. I have a page where all the credentials are filled in by Puppeteer, but when the submit button ('signBtn') is clicked, a POST request starts.
Is there any test that processes the POST request?
Or
How do I know the test has completely finished?
Or
How do I get the redirect page URL while the test is running?
This is my code:
const puppeteer = require('puppeteer');
const timeOut = 100 * 1000;
test("Full E2E Test!" , async () => {
const browser = await puppeteer.launch({
headless: false,
slowMo:30,
args: ['--window-size=1920,1080']
});
const page = await browser.newPage();
await page.goto('https://mypass-webapp.herokuapp.com/signUp');
await page.click('input#email');
await page.type('input#email', 'Automation#puppeteer.com');
await page.click('input#username');
await page.type('input#username' , "puppeteer");
await page.click('input#password');
await page.type('input#password' , "puppeteer");
await page.click('#signBtn').then(await console.log(page.url()));
// Here I Need a Test That Checks The Current Page!
await browser.close();
} , timeOut);
Is there any test that processes the POST request?
const [response] = await Promise.all([
  page.waitForNavigation(), // resolves once the navigation triggered by the click has finished
  page.click('input[type="submit"]') // submitting the form sends the POST and navigates to the target page
]);
// Both promises have resolved; response is the navigation's main response
How do I know the test has completely finished?
const [response] = await Promise.all([
  page.waitForNavigation({ waitUntil: 'networkidle2' }), // resolves once there have been no more than 2 network connections for at least 500 ms
  page.click('a.my-link') // clicking the link indirectly causes a navigation
]);
// Both promises have resolved
How to get the redirect page URL while the test is running?
For example, if the website http://example.com has a single redirect to https://example.com, then the chain will contain one request:
const response = await page.goto('http://example.com');
const chain = response.request().redirectChain();
console.log(chain.length); // Return 1
console.log(chain[0].url()); // Return string 'http://example.com'
If the website https://google.com has no redirects, then the chain will be empty:
const response = await page.goto('https://google.com');
const chain = response.request().redirectChain();
console.log(chain.length); // Return 0
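To turn the redirect chain into a readable list of URLs, a small helper along these lines may be enough. redirectJourney is a hypothetical name; it relies only on response.request(), redirectChain(), and url().

```javascript
// Hypothetical helper: lists every URL requested on the way to `response`,
// ending with the final URL. `response` is a Puppeteer HTTPResponse.
function redirectJourney(response) {
  const hops = response.request().redirectChain().map((req) => req.url());
  return [...hops, response.url()];
}

// Usage:
//   const response = await page.goto('http://example.com');
//   console.log(redirectJourney(response));
```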
await page.click('#signBtn')
After this, take the second page from the browser's list of open pages:
const [, page2] = await browser.pages();
And here is your redirect page URL 👇
const redirectPageUrl = page2.url();
console.log(redirectPageUrl);

Set localstorage items before page loads in puppeteer?

We have some routing logic that kicks you to the homepage if you don't have a JWT_TOKEN set. I want to set this before the page loads, i.e. before the JS is invoked.
How do I do this?
You have to set the localStorage item like this:
await page.evaluate(() => {
localStorage.setItem('token', 'example-token');
});
You should do it after page.goto: the browser must have a URL to register the localStorage item against. After this, load the same page once again; this time the token will be there before the page loads.
Here is a fully working example:
const puppeteer = require('puppeteer');
const http = require('http');
const html = `
<html>
<body>
<div id="element"></div>
<script>
document.getElementById('element').innerHTML =
localStorage.getItem('token') ? 'signed' : 'not signed';
</script>
</body>
</html>`;
http
.createServer((req, res) => {
res.writeHead(200, { 'Content-Type': 'text/html' });
res.write(html);
res.end();
})
.listen(8080);
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('http://localhost:8080/');
await page.evaluate(() => {
localStorage.setItem('token', 'example-token');
});
await page.goto('http://localhost:8080/');
const text = await page.evaluate(
() => document.querySelector('#element').textContent
);
console.log(text);
await browser.close();
process.exit(0);
})();
There's some discussion about this in Puppeteer's GitHub issues.
You can load a page on the domain, set your localStorage, then go to the actual page you want to load with localStorage ready. You can also intercept the first URL load and respond instantly instead of actually loading the page, potentially saving a lot of time.
const doSomePuppeteerThings = async () => {
const url = 'http://example.com/';
const browser = await puppeteer.launch();
const localStorage = { storageKey: 'storageValue' };
await setDomainLocalStorage(browser, url, localStorage);
const page = await browser.newPage();
// do your actual puppeteer things now
};
const setDomainLocalStorage = async (browser, url, values) => {
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', r => {
r.respond({
status: 200,
contentType: 'text/plain',
body: 'tweak me.',
});
});
await page.goto(url);
await page.evaluate(values => {
for (const key in values) {
localStorage.setItem(key, values[key]);
}
}, values);
await page.close();
};
In 2021 it works with the following code:
// store in localstorage the token
await page.evaluateOnNewDocument (
token => {
localStorage.clear();
localStorage.setItem('token', token);
}, 'eyJh...9_8cw');
// open the url
await page.goto('http://localhost:3000/Admin', { waitUntil: 'load' });
Unfortunately, the following snippet from the first answer does not work:
await page.evaluate(() => {
localStorage.setItem('token', 'example-token'); // not work, produce errors :(
});
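The evaluateOnNewDocument approach extends naturally to several keys. The sketch below assumes string values; seedLocalStorage is a hypothetical helper name. The registered script also runs in child frames, so the keys are written for each origin the page loads.

```javascript
// Hypothetical helper: registers a script that runs in every new document of
// `page`, before the page's own scripts, and writes `values` to localStorage.
async function seedLocalStorage(page, values) {
  await page.evaluateOnNewDocument((entries) => {
    // This callback executes inside the browser, not in Node.
    for (const [key, value] of Object.entries(entries)) {
      localStorage.setItem(key, value);
    }
  }, values);
}

// Usage (call before the first page.goto):
//   await seedLocalStorage(page, { token: 'example-token' });
//   await page.goto('http://localhost:3000/Admin', { waitUntil: 'load' });
```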
This works without needing a second goto:
const browser = await puppeteer.launch();
browser.on('targetchanged', async (target) => {
  const targetPage = await target.page();
  if (!targetPage) return; // not a page target (e.g. a worker)
  const client = await targetPage.target().createCDPSession();
  await client.send('Runtime.evaluate', {
    expression: `localStorage.setItem('hello', 'world')`,
  });
});
// newPage, goto, etc...
Adapted from the Lighthouse doc for Puppeteer, which does something similar: https://github.com/GoogleChrome/lighthouse/blob/master/docs/puppeteer.md
Try an additional script tag. Example:
Say you have a main.js script that houses your routing logic.
Then a setJWT.js script that houses your token logic.
Then within your html that is loading these scripts order them in this way:
<script src='setJWT.js'></script>
<script src='main.js'></script>
This only helps on the initial load of the page.
Most routing libraries, however, usually have an event hook system that you can hook into before a route renders. I would store the setJWT logic somewhere in that callback.

How to recreate a page with all of the cookies?

I am trying to:
Visit a page that initialises a session
Store the session in a JSON object
Visit the same page, which now should recognise the existing session
The implementation I have attempted is as follows:
import puppeteer from 'puppeteer';
const createSession = async (browser, startUrl) => {
const page = await browser.newPage();
await page.goto(startUrl);
await page.waitForSelector('#submit');
const cookies = await page.cookies();
const url = await page.url();
return {
cookies,
url
};
};
const useSession = async (browser, session) => {
const page = await browser.newPage();
for (const cookie of session.cookies) {
await page.setCookie(cookie);
}
await page.goto(session.url);
};
const run = async () => {
const browser = await puppeteer.launch({
headless: false
});
const session = await createSession(browser, 'http://foo.com/');
// The session has been established
await useSession(browser, session);
await useSession(browser, session);
};
run();
createSession is used to capture the cookies of the loaded page.
useSession is expected to load the page using the existing cookies.
However, this does not work – the session.url page does not recognise the session. It appears that not all cookies are being captured this way.
It appears that page#cookies returns some cookies with the session=true,expires=0 configuration. setCookie ignores these values.
I worked around this by constructing a new cookies array overriding the expires and session properties.
const cookies = await page.cookies();
const sessionFreeCookies = cookies.map((cookie) => {
return {
...cookie,
expires: Date.now() / 1000 + 10 * 60,
session: false
};
});
At the time of writing this answer, the session property is not documented. Refer to the following issue: https://github.com/GoogleChrome/puppeteer/issues/980.
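The workaround can be wrapped in a small pure function. The variant sketched here rewrites only cookies flagged as session cookies and leaves the rest untouched, which differs slightly from the snippet above; toPersistentCookies, ttlSeconds, and nowMs are illustrative names.

```javascript
// Hypothetical helper: gives session cookies an explicit expiry so that,
// per the answer above, page.setCookie() no longer ignores them.
function toPersistentCookies(cookies, ttlSeconds = 600, nowMs = Date.now()) {
  const expires = Math.floor(nowMs / 1000) + ttlSeconds; // Unix time in seconds
  return cookies.map((cookie) =>
    cookie.session ? { ...cookie, expires, session: false } : cookie
  );
}

// Usage:
//   const cookies = toPersistentCookies(await page.cookies());
//   await otherPage.setCookie(...cookies);
```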
Puppeteer's page.cookies() method only fetches cookies for the current page's URL. However, the page may also have set cookies from other domains.
You can call the Chrome DevTools Protocol method Network.getAllCookies to fetch cookies from all domains (note that page._client is a private property, not part of Puppeteer's public API):
(async() => {
const browser = await puppeteer.launch({});
const page = await browser.newPage();
await page.goto('https://stackoverflow.com', {waitUntil : 'networkidle2' });
// Here we can get all of the cookies
console.log(await page._client.send('Network.getAllCookies'));
})();
More on this thread here - Puppeteer get 3rd party cookies
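Since page._client is private, the same call can be made through a CDP session opened with the public API. This is a sketch, with getAllCookies as a hypothetical helper name.

```javascript
// Hypothetical helper: asks the browser for cookies from every domain through
// a CDP session created via the public page.target().createCDPSession().
async function getAllCookies(page) {
  const client = await page.target().createCDPSession();
  try {
    const { cookies } = await client.send('Network.getAllCookies');
    return cookies;
  } finally {
    // Release the session whether or not the call succeeded.
    await client.detach();
  }
}

// Usage:
//   await page.goto('https://stackoverflow.com', { waitUntil: 'networkidle2' });
//   console.log(await getAllCookies(page));
```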
