We have an AngularJS application that pops up a modal form (component) when a button is pressed.
This component loads an iframe, which I cannot seem to access with Puppeteer.
I have tried mainFrame:
await page.waitFor(15000);
const frame = page.mainFrame().childFrames().find((iframe) => {
  console.log('FRAME', iframe.name(), iframe.url());
  return iframe.name() === 'iFrameName';
});
The above only has one frame (the main frame/window).
I have also tried frames:
await page.waitFor(15000);
const frame = page.frames().find((iframe) => {
  console.log('FRAME', iframe.name(), iframe.url());
  return iframe.name() === 'iFrameName';
});
Have tried with contentFrame
await page.waitForSelector('iframe', { visible: true, timeout: 2000 });
const elementHandle = await page.$('iframe');
await page.waitFor(1000);
const frame = await elementHandle.contentFrame();
With the above, elementHandle has a value but frame is null
We have this working with Protractor and were hoping to move to Puppeteer, but if there is no solution we will have to stick with Protractor (which has its own issues).
Currently, there is no support for out-of-process iframes (OOPIFs). To be able to work with them, you need to launch Chromium with --disable-features=site-per-process:
const browser = await puppeteer.launch({
  args: ['--disable-features=site-per-process']
});
You can track puppeteer's issue/support here.
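Separately from the launch flag, the fixed 15-second wait in the question could be replaced by polling page.frames() until the frame shows up. The following is only a minimal sketch: waitForFrame is a hypothetical helper, not a Puppeteer API, and the default timeout and interval are assumptions. If the iframe is out-of-process and the flag above is not set, the frame may never appear and the helper will simply time out.

```javascript
// Hypothetical helper (not part of Puppeteer): polls page.frames() until a
// frame matching `predicate` is attached, or gives up after `timeout` ms.
async function waitForFrame(page, predicate, { timeout = 15000, interval = 250 } = {}) {
  const deadline = Date.now() + timeout;
  for (;;) {
    // page.frames() lists every frame currently attached to the page.
    const frame = page.frames().find(predicate);
    if (frame) return frame;
    if (Date.now() >= deadline) {
      throw new Error(`No matching frame after ${timeout} ms`);
    }
    // Sleep briefly before looking again.
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}
```

It would be called as const frame = await waitForFrame(page, (f) => f.name() === 'iFrameName'); which returns as soon as the frame is attached instead of always waiting the full 15 seconds.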
I have a similar problem: an iframe that is loaded dynamically, so that its src is (unknown), triggered by a JavaScript href:
href="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(VARİABLES,,true,,false,))
Is it possible to clone the iframe, or create one, by invoking the JavaScript that loads it from Puppeteer? If so, you could try that.
At the following site, after entering a search phrase such as "baby" (try it!), the Puppeteer call page.mouse.down() doesn't have the same effect as clicking and holding the physical mouse: https://www.dextools.io/app/bsc
After entering a search phrase, a fake dropdown select menu appears, which is really a UL, and I am trying to click the first search result. So I use code like this:
await page.mouse.move(200, 350); // let's assume this is inside the element I want
await page.mouse.down();
await new Promise((resolve) => setTimeout(resolve, 2000)); // wait 2 secs
await page.mouse.up();
The expected effect of this code is that, for the 2 seconds that Puppeteer is "holding" the mouse button down, the fake dropdown stays visible, and when Puppeteer "releases" the mouse button, the site redirects to the search result for the item selected.
This is exactly what happens when I use the physical mouse.
However, what happens with Puppeteer is, the dropdown just disappears, as if I had hit the Escape key, and the page.mouse.up() command later has no effect any more.
I am aware that Puppeteer has some quirks with respect to the mouse, keyboard, holding and releasing buttons, and modifier keys, especially when doing all of the above at once. For example, drag and drop doesn't work as expected, but none of the workarounds proposed here work for me: https://github.com/puppeteer/puppeteer/issues/1265
I cannot reproduce the issue with this test script. The link is clicked and the navigation follows:
import puppeteer from 'puppeteer';

const browser = await puppeteer.launch({ headless: false, defaultViewport: null });

try {
  const [page] = await browser.pages();
  await page.goto('https://www.dextools.io/app/bsc', { timeout: 0 });
  const input = await page.waitForSelector('.input-container input');
  await input.type('baby');
  const link = await page.waitForSelector('.suggestions-container.is-visible a:not(.text-sponsor)');
  await link.click();
} catch (err) { console.error(err); }
Instead of two separate mouse-down and up operations, you could try this according to puppeteer docs:
// selector would uniquely identify the button on your page that you would like to click
const selector = '#dropdown-btn';
// delay is the time in milliseconds between mousedown and mouseup
await page.click(selector, { delay: 2000 });
Once you have the list element that you want to click, look for the first <a> tag inside it and perform the click on that <a>.
Puppeteer's documentation says that if the click triggers a navigation, you should use:
const [response] = await Promise.all([
  page.waitForNavigation(waitOptions),
  page.click(selector, clickOptions),
]);
where selector will be a reference to the mentioned <a> tag.
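The two steps above (find the first <a> inside the list item, then click it while waiting for the navigation) could be combined roughly as follows. This is a sketch under assumptions: clickFirstLinkAndWait is a hypothetical helper name, and the item selector passed in has to be worked out from the actual page.

```javascript
// Hypothetical helper: clicks the first <a> inside `itemSelector` and resolves
// once the navigation triggered by that click has finished.
async function clickFirstLinkAndWait(page, itemSelector, waitOptions = { waitUntil: 'load' }) {
  // Wait until the link exists; the suggestion list is rendered asynchronously.
  const link = await page.waitForSelector(`${itemSelector} a`);
  // Start waiting for the navigation before clicking, so that a fast
  // navigation cannot finish before anything is listening for it.
  const [response] = await Promise.all([
    page.waitForNavigation(waitOptions),
    link.click(),
  ]);
  return response;
}
```

The point of Promise.all here is ordering: waitForNavigation is already pending when the click is dispatched, which avoids a race between the two calls.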
Hi, I am trying to take a screenshot of a website using Puppeteer, but the site loads quite slowly, so I am never able to grab any data or take a usable screenshot. I would like to delay the screenshot until the site has finished loading. I have tried a bunch of methods and can't figure it out. Thanks in advance for any help.
This is my code:
const puppeteer = require("puppeteer-extra");
// add stealth plugin and use defaults (all evasion techniques)
const StealthPlugin = require("puppeteer-extra-plugin-stealth");
puppeteer.use(StealthPlugin());
async function scrapeProduct(url) {
//launching puppeteer
const browser = await puppeteer.launch({ headless: true });
const page = await browser.newPage();
await page.goto(url, { waitUntil: "load" });
await page.waitFor("*");
function time() {
var d = new Date();
var n = d.getSeconds();
return console.log(n);
}
time();
await page.screenshot({ path: "testresult.png" });
time();
await browser.close();
}
scrapeProduct("https://www.realcanadiansuperstore.ca/search?search-bar=milk");
waitFor has been deprecated recently, so you are better off trying the other wait methods.
I can't inspect the webpage you are taking a screenshot of, so I cannot tell what might be happening after the load event.
However, have you tried the other wait methods Puppeteer offers?
For example, waitForNavigation and waitForSelector, mentioned in https://stackoverflow.com/a/52501934/484337
If you have control of the page you are taking a screenshot of, then you can have it signal when it is ready (for example by dispatching a DOM event or setting a flag), and your Puppeteer code can wait for that signal, for instance with page.waitForFunction.
If all else fails and time is not important, then you can put in a sleep(n) that is long enough for the page to have loaded in practice.
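As a rough illustration of combining these waits, the sketch below waits for the network to go quiet and then for a specific element before taking the screenshot. screenshotWhenReady is a hypothetical helper, and the selector passed to it is an assumption: it has to be an element that only appears once the real content has rendered on the target site.

```javascript
// Hypothetical helper: loads `url`, waits until the network is nearly idle and
// `readySelector` is visible, then takes a full-page screenshot.
async function screenshotWhenReady(page, url, readySelector, path, timeout = 60000) {
  // 'networkidle2' resolves once there have been no more than 2 network
  // connections for at least 500 ms, which is later than the 'load' event.
  await page.goto(url, { waitUntil: 'networkidle2', timeout });
  // Wait for an element that only exists once the real content has rendered.
  await page.waitForSelector(readySelector, { visible: true, timeout });
  await page.screenshot({ path, fullPage: true });
}
```

In the question's scrapeProduct function this would replace the goto, waitFor and screenshot calls, for example await screenshotWhenReady(page, url, '.product-tile', 'testresult.png'), where '.product-tile' is a made-up selector.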
Since ESPN does not provide an API, I am trying to use Puppeteer to scrape data about my fantasy football league. However, I am having a hard time logging in with Puppeteer because the login form is nested within an iframe element.
I have gone to http://www.espn.com/login and selected the iframe. I can't seem to select any of the elements within the iframe except for the main section by doing:
frame.$('.main')
This is the code that seems to get the iframe with the login form.
const browser = await puppeteer.launch({headless:false});
const page = await browser.newPage();
await page.goto('http://www.espn.com/login')
await page.waitForSelector("iframe");
const elementHandle = await page.$('div#disneyid-wrapper iframe');
const frame = await elementHandle.contentFrame();
await browser.close()
I want to be able to access the username field, password field, and the login button within the iframe element. Whenever I try to access these fields, I get a return of null.
You can get the iframe using contentFrame as you are doing now, and then call $.
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto('http://www.espn.com/login')
const elementHandle = await page.waitForSelector('div#disneyid-wrapper iframe');
const frame = await elementHandle.contentFrame();
await frame.waitForSelector('[ng-model="vm.username"]');
const username = await frame.$('[ng-model="vm.username"]');
await username.type('foo');
await browser.close()
I had an issue with finding Stripe elements.
The reason for that is the following:
You can't access an <iframe> with a different origin using JavaScript; it would be a huge security flaw if you could. Because of the same-origin policy, browsers block scripts that try to access a frame with a different origin. See the more detailed answer here.
Therefore, when I tried Puppeteer's methods Page.frames(), Page.mainFrame(), and ElementHandle.contentFrame(), none of them returned the iframe. The problem is that this happened silently, and I couldn't figure out why nothing was found.
Adding these arguments to launch options solved the issue:
'--disable-web-security',
'--disable-features=IsolateOrigins,site-per-process'
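Because the failure is silent, it may help to log which frames Puppeteer can see and whether each one is cross-origin relative to the page. The sketch below is only a diagnostic aid; originOf and describeFrames are hypothetical helper names, and they rely only on page.url(), page.frames(), frame.name() and frame.url().

```javascript
// Hypothetical helper: returns the origin of a URL, or null if it cannot be parsed.
function originOf(url) {
  try {
    return new URL(url).origin;
  } catch {
    return null; // empty or malformed URL, e.g. a frame that has not loaded yet
  }
}

// Hypothetical diagnostic helper: reports, for each frame Puppeteer can see,
// whether its origin differs from the origin of the top-level page.
function describeFrames(page) {
  const pageOrigin = originOf(page.url());
  return page.frames().map((frame) => {
    const origin = originOf(frame.url());
    return { name: frame.name(), url: frame.url(), crossOrigin: origin !== pageOrigin };
  });
}
```

Calling console.table(describeFrames(page)) before and after adding the launch arguments should show whether the expected iframe is listed at all.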
I'd like to load the page with headless off so that I can log in.
After logging in, I want to hide the browser by turning headless on and let it do what it has to do.
How can I turn on/off the headless after launch?
You cannot toggle headless on the fly. But you can share the login between the two browsers using cookies and setCookie if you want.
We will create a simple class to keep the code clean (or at least that is what I believe for this type of work, since it usually gets big later). You can do this without all this complexity, though. Also, make sure the cookies are serialized. Do not pass an array to the setCookie function; spread it into separate arguments instead.
There will be three main functions.
1. init()
To create a page object. This is mostly to make sure the headless and headful versions browse in a similar way, with the same user agent and so on. Note that I did not include the code to set user agents; it's just there to show the concept.
async init(headless) {
const browser = await puppeteer.launch({
headless
});
const page = await browser.newPage();
// do more page stuff before loading, ie: user agent and so on
return {
page,
browser
};
}
2. getLoginCookies()
An example showing how you can get cookies from the browser.
// will take care of our login using headful
async getLoginCookies() {
const {
page,
browser
} = await this.init(false)
// assume we load the page and log in here using some method
// and the website sets some cookie
await page.goto('http://httpbin.org/cookies/set/authenticated/true')
// store the cookie somewhere
this.cookies = await page.cookies() // the cookies are collected as array
// close the page and browser, we are done with this
await page.close();
await browser.close();
return true;
}
You won't need such a function if you can provide the cookies manually. You can use EditThisCookie or any cookie editing tool, which gives you an array of all cookies for that site. Here is how to use them:
3. useHeadless()
An example showing how you can set cookies in a browser.
// continue with our normal headless stuff
async useHeadless() {
const {
page,
browser
} = await this.init(true)
// we set all cookies we got previously
await page.setCookie(...this.cookies) // the three dots are spread syntax; the cookies are contained in an array
// verify the cookies are working properly
await page.goto('http://httpbin.org/cookies');
const content = await page.$eval('body', e => e.innerText)
console.log(content)
// do other stuff
// close the page and browser, we are done with this
// deduplicate this however you like
await page.close();
await browser.close();
return true;
}
4. Creating our own awesome puppeteer instance
// let's use this
(async () => {
const loginTester = new myAwesomePuppeteer()
await loginTester.getLoginCookies()
await loginTester.useHeadless()
})()
Full Code
Walk through the code to understand it better. It's all commented.
const puppeteer = require('puppeteer');
class myAwesomePuppeteer {
constructor() {
// keeps the cookies on the class scope
this.cookies = [];
}
// creates a browser instance and applies all kind of setup
async init(headless) {
const browser = await puppeteer.launch({
headless
});
const page = await browser.newPage();
// do more page stuff before loading, ie: user agent and so on
return {
page,
browser
};
}
// will take care of our login using headful
async getLoginCookies() {
const {
page,
browser
} = await this.init(false)
// assume we load the page and log in here using some method
// and the website sets some cookie
await page.goto('http://httpbin.org/cookies/set/authenticated/true')
// store the cookie somewhere
this.cookies = await page.cookies()
// close the page and browser, we are done with this
await page.close();
await browser.close();
return true;
}
// continue with our normal headless stuff
async useHeadless() {
const {
page,
browser
} = await this.init(true)
// we set all cookies we got previously
await page.setCookie(...this.cookies)
// verify the cookies are working properly
await page.goto('http://httpbin.org/cookies');
const content = await page.$eval('body', e => e.innerText)
console.log(content)
// do other stuff
// close the page and browser, we are done with this
// deduplicate this however you like
await page.close();
await browser.close();
return true;
}
}
// let's use this
(async () => {
const loginTester = new myAwesomePuppeteer()
await loginTester.getLoginCookies()
await loginTester.useHeadless()
})()
Here is the result,
➜ node app.js
{
"cookies": {
"authenticated": "true"
}
}
So in short,
You can use the cookies function to get cookies.
You can use extensions like Edit This Cookie to get cookies from your normal browser.
You can use setCookie to set any kind of cookie you get from browser.
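If the headful login and the headless run happen in separate processes, the cookies have to be stored somewhere in between. One possible way, sketched here under assumptions, is to serialize the array from page.cookies() to JSON and spread it back into setCookie later. serializeCookies and restoreCookies are hypothetical helper names, not Puppeteer functions.

```javascript
// Hypothetical helper: turns the array returned by page.cookies() into a JSON
// string that can be written to a file or any other store.
function serializeCookies(cookies) {
  return JSON.stringify(cookies, null, 2);
}

// Hypothetical helper: applies a previously serialized cookie array to a page
// and returns how many cookies were set.
async function restoreCookies(page, json) {
  const cookies = JSON.parse(json);
  if (!Array.isArray(cookies)) {
    throw new TypeError('Expected a JSON array of cookie objects');
  }
  // setCookie takes each cookie as a separate argument, hence the spread.
  await page.setCookie(...cookies);
  return cookies.length;
}
```

With Node's fs module, the login step could write serializeCookies(await page.cookies()) to a file such as cookies.json (the file name is an assumption), and the headless step could read that file and pass its contents to restoreCookies before calling page.goto.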
I am trying to use jQuery on the pages I load with Puppeteer, and I would like to know how to do that. My code structure is like:
const puppeteer = require('puppeteer');
let browser = null;
async function getSelectors() {
try{
browser = await puppeteer.launch({args: ['--no-sandbox', '--disable-setuid-sandbox']});
const page = await browser.newPage();
await page.setViewport({width: 1024, height: 1080});
await page.goto('https://www.google.com/');
await page.addScriptTag({url: 'https://code.jquery.com/jquery-3.2.1.min.js'});
var button = $('h1').text();
console.log(button);
} catch (e) {
console.log(e);
}
}
getSelectors();
Also I will be navigating to many pages within puppeteer so is there a way I can just add jQuery once and then use it throughout? A local jquery file implementation would be helpful as well.
I tried implementing the answers from inject jquery into puppeteer page but couldn't get my code to work. I will be doing much more complex stuff than the one illustrated above so I need jQuery and not vanilla JS solutions.
I finally got a tip from How to scrape that web page with Node.js and puppeteer
which helped me understand that the Puppeteer page.evaluate function gives you direct access to the DOM of the page you've just launched in Puppeteer. To get the following code to work, you should know I'm running this test in Jest. Also, you need a suitable URL to a page that has a table element with an ID. Obviously, you can change the details of both the page and the jQuery function you want to try out. I was in the middle of a jQuery Datatables project so I needed to make sure I had a table element and that jQuery could find it. The nice thing about this environment is that the browser is quite simply a real browser, so if I add a script tag to the actual HTML page instead of adding it via Puppeteer, it works just the same.
test('Check jQuery datatables', async () => {
  const puppeteer = require('puppeteer');
  let browser = await puppeteer.launch();
  let page = await browser.newPage();
  await page.goto('http://localhost/jest/table.html');
  // inject jQuery into the loaded page
  await page.addScriptTag({url: 'https://code.jquery.com/jquery-3.3.1.slim.min.js'});
  // everything inside this callback runs in the page, where $ is defined
  const result = await page.evaluate(() => {
    try {
      var table = $("table").attr("id");
      return table;
    } catch (e) {
      return e.message;
    }
  });
  console.log("result", result);
  await browser.close();
});
The key discovery for me: within the page.evaluate function, your JavaScript code runs in the familiar context of the page you've just opened in the browser. I've moved on to create tests for complex objects created using jQuery plugins and within page.evaluate they behave as expected. Trying to use JSDOM was driving me crazy because it behaved a bit like a browser, but was different with regard to the key points I was using to test my application.
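For the part of the question about many pages and a local jQuery file, one possible approach is a small helper that is called after each navigation and only injects jQuery when the page does not already have it. This is a sketch under assumptions: ensureJQuery is a hypothetical name and the local file path is whatever copy of jQuery you keep on disk. It relies on page.evaluate and on the path option of page.addScriptTag.

```javascript
// Hypothetical helper: injects jQuery from a local file unless the page
// already has it. Call it after every page.goto(), because a navigation
// discards scripts that were injected into the previous document.
async function ensureJQuery(page, localPath) {
  // This callback runs inside the browser page, not in Node.
  const present = await page.evaluate(() => typeof window.jQuery === 'function');
  if (!present) {
    // `path` makes Puppeteer read the file from disk and inject its contents.
    await page.addScriptTag({ path: localPath });
  }
  return !present; // true when jQuery had to be injected
}
```

After await ensureJQuery(page, './jquery.min.js'), jQuery calls still have to go inside page.evaluate, for example const text = await page.evaluate(() => $('h1').text()); since $ only exists in the page context and not in the Node script.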