Puppeteer - Cannot read property 'executablePath' of undefined - javascript

Regardless of what I tried, I'm always getting executablePath is undefined. Unfortunately there's not much info on this on Google. It would be great if anyone could let me know where to dig deeper to solve this error. revisionInfo is returning undefined.
Error
BrowserFetcher {
  _product: 'chrome',
  _downloadsFolder: '/var/www/node_modules/puppeteer/.local-chromium',
  _downloadHost: 'https://storage.googleapis.com',
  _platform: 'linux'
}
TypeError: Cannot read property 'executablePath' of undefined
    at demo1 (/var/www/filename.js:10:36)
Source Code
const puppeteer = require('puppeteer');

const demo1 = async () => {
  try {
    const browserFetcher = puppeteer.createBrowserFetcher();
    console.log(browserFetcher);
    const revisionInfo = await browserFetcher.download('970485');
    const browser = await puppeteer.launch({
      headless: false,
      executablePath: revisionInfo.executablePath,
      args: ['--window-size=1920,1080', '--disable-notifications'],
    });
    const page = await browser.newPage();
    await page.setViewport({
      width: 1080,
      height: 1080,
    });
    await page.goto('https://example.com', {
      waitUntil: 'networkidle0',
    });
    await page.close();
    await browser.close();
  } catch (e) {
    console.error(e);
  }
};

demo1();

Based on your error message, the problem is with this line
executablePath: revisionInfo.executablePath,
where revisionInfo is undefined, meaning this does not give you the data you want:
const revisionInfo = await browserFetcher.download('970485');
If you really need a specific executablePath, you need to make sure that revisionInfo actually receives a value.
Otherwise, you can simply remove the line executablePath: revisionInfo.executablePath, and let Puppeteer use its bundled Chromium browser.
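If you do go the explicit-download route, you can guard the call so the failure surfaces immediately instead of later as an undefined revisionInfo. A minimal sketch, assuming the BrowserFetcher API of the Puppeteer version in the question (canDownload and download); the error messages are mine:

```javascript
// Download a Chromium revision, but fail loudly when the platform has
// no downloadable build, instead of silently yielding undefined.
async function fetchRevision(browserFetcher, revision) {
  const available = await browserFetcher.canDownload(revision);
  if (!available) {
    throw new Error(`Revision ${revision} is not downloadable for this platform`);
  }
  const revisionInfo = await browserFetcher.download(revision);
  if (!revisionInfo || !revisionInfo.executablePath) {
    throw new Error(`Download of revision ${revision} produced no executable`);
  }
  return revisionInfo;
}
```

With this in place, the original script would throw a descriptive error on an unsupported platform rather than crashing on revisionInfo.executablePath.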

Look into two things:
If you did an apt install chromium-browser, remove that.
Try running and installing on an x86 server instead of an ARM-based server (e.g. an AWS t4g instance); Puppeteer's downloadable Chromium builds do not cover ARM Linux, which is why the download can come back empty there.
One of those two steps solved my issue. The code was still the same.

Related

Puppeteer "Failed to launch the browser process!" when launching multiple browsers [duplicate]

So what I am trying to do is open a Puppeteer window with my Google profile, but I want to do it multiple times, meaning 2-4 windows with the same profile. Is that possible? I am getting this error when I do it:
(node:17460) UnhandledPromiseRejectionWarning: Error: Failed to launch the browser process!
[45844:13176:0410/181437.893:ERROR:cache_util_win.cc(20)] Unable to move the cache: Access is denied. (0x5)
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    '--user-data-dir=C:\\Users\\USER\\AppData\\Local\\Google\\Chrome\\User Data',
  );
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });
  await browser.close();
})();
Note: It is already pointed in the comments but there is a syntax error in the example. The launch should look like this:
const browser = await puppeteer.launch({
  headless: false,
  args: ['--user-data-dir=C:\\Users\\USER\\AppData\\Local\\Google\\Chrome\\User Data']
});
The error comes from the fact that you are launching multiple browser instances at the very same time, so the profile directory is locked and cannot be reused by Puppeteer.
You should avoid starting Chromium instances with the very same user data dir at the same time.
Possible solutions
Make the opened windows sequential; this can be useful if you have only a few. E.g.:
const firstFn = async () => await puppeteer.launch() ...
const secondFn = async () => await puppeteer.launch() ...

(async () => {
  await firstFn()
  await secondFn()
})();
Creating copies of the user-data-dir as User Data1, User Data2, User Data3 etc. to avoid conflicts while Puppeteer uses them. This could be done on the fly with Node's fs module, or even manually (if you don't need a lot of instances).
Consider reusing Chromium instances (if your use case allows it), with browser.wsEndpoint and puppeteer.connect, this can be a solution if you would need to open thousands of pages with the same user data dir.
Note: this one is the best for performance, as only one browser is launched; you can then open as many pages as you want in a for..of or regular for loop (using forEach by itself can cause side effects). E.g.:
const puppeteer = require('puppeteer')

const urlArray = ['https://example.com', 'https://google.com']

async function fn() {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--user-data-dir=C:\\Users\\USER\\AppData\\Local\\Google\\Chrome\\User Data']
  })
  const browserWSEndpoint = await browser.wsEndpoint()
  for (const url of urlArray) {
    try {
      const browser2 = await puppeteer.connect({ browserWSEndpoint })
      const page = await browser2.newPage()
      await page.goto(url) // it can be wrapped in a retry function to handle flakiness
      // doing cool things with the DOM
      await page.screenshot({ path: `${url.replace('https://', '')}.png` })
      await page.goto('about:blank') // because of you: https://github.com/puppeteer/puppeteer/issues/1490
      await page.close()
      await browser2.disconnect()
    } catch (e) {
      console.error(e)
    }
  }
  await browser.close()
}
fn()
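The retry function mentioned in the page.goto comment above could be sketched like this; the helper name, attempt count, and delay are arbitrary choices of mine:

```javascript
// Retry an async action a few times before giving up,
// e.g. await withRetry(() => page.goto(url), 3)
async function withRetry(action, attempts = 3, delayMs = 1000) {
  let lastError;
  for (let i = 0; i < attempts; i++) {
    try {
      return await action();
    } catch (e) {
      lastError = e;
      // brief pause before the next attempt
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw lastError;
}
```

This keeps the flaky navigation isolated: a page that loads on the second or third try no longer kills the whole loop iteration.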

How to fix "Node is either not visible or not an HTMLElement" error thrown by Puppeteer?

I am creating a bot for posting something every day on Instagram, and I want to use creator studio from Facebook. The script below works fine:
const puppeteer = require('puppeteer');

(async () => {
  var username = require('./config')
  var password = require('./config')
  const browser = await puppeteer.launch();
  const ig = await browser.newPage();
  await ig.setViewport({
    width: 1920,
    height: 1080
  })
  await ig.goto('https://business.facebook.com/creatorstudio/');
  await ig.click('.rwb8dzxj');
  await ig.waitForSelector('#email');
  await ig.type('#email', username.username);
  await ig.type('#pass', username.password);
  await ig.click('#loginbutton');
  await ig.waitForSelector('#media_manager_chrome_bar_instagram_icon');
  await ig.click('#media_manager_chrome_bar_instagram_icon');
  await ig.waitForSelector('[role=presentation]');
  await ig.click('[role=presentation]');
  await ig.screenshot({path: 'example.png'});
  await browser.close();
})().catch(err => {
  console.log(err.message);
})
but when I continue and add:
await ig.waitForSelector('[role=menuitem]');
await ig.click('[role=menuitem]');
I get this error:
"Node is either not visible or not an HTMLElement"
The error caused by the click and hover methods called on an ElementHandle can be mitigated by ensuring the element is "in view" separately from the method (despite the docs stating that the method "scrolls it into view if needed", sometimes the element does not actually become visible).
Assuming the element is indeed attached to the DOM (you can check that via the isConnected property), try something like this (the sample is in TypeScript); scrollIntoView should make sure the element is centered in the viewport:
const safeHover = async <E extends ElementHandle<Element>>(elem: E) => {
  await elem.evaluate((e) =>
    e.scrollIntoView({ block: "center", inline: "center" })
  );
  await elem.click();
};
Note that the sample uses elementHandle's click method, not Page's. Since you call waitForSelector before clicking, use the returned elementHandle:
//...
const pres = await ig.waitForSelector('[role=presentation]');
pres && await safeHover(pres);
//...
Also, these Q&As may prove useful:
Puppeteer in NodeJS reports 'Error: Node is either not visible or not an HTMLElement'
Puppeteer throws error with "Node not visible..."
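For a plain JavaScript project, the TypeScript helper above could be sketched without the type annotations (the helper name is mine):

```javascript
// Scroll the element handle to the center of the viewport, then click it,
// mirroring the TypeScript safeHover sample above.
async function safeClick(elem) {
  await elem.evaluate((e) =>
    e.scrollIntoView({ block: 'center', inline: 'center' })
  );
  await elem.click();
}
```

As in the TypeScript version, this calls the ElementHandle's click, not the Page's, so it operates on the exact node returned by waitForSelector.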

Clicking a selector with Puppeteer

So I am having trouble clicking a login button on the Nike website.
I am not sure why it keeps crashing (presumably because it can't find the selector), but I am not sure what I am doing wrong.
I would also like to mention that I am having some sort of memory leak before Puppeteer crashes, and sometimes it will even crash my MacBook completely if I don't cancel the process in time in the console.
EDIT:
This code also causes a memory leak whenever it crashes, forcing me to hard-reset my Mac if I don't cancel the application fast enough.
Node Version: 14.4.0
Puppeteer Version: 5.2.1
Current code:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
    args: ['--start-maximized']
  })
  const page = await browser.newPage()
  await page.goto('https://www.nike.com/')
  const winner = await Promise.race([
    page.waitForSelector('[data-path="join or login"]'),
    page.waitForSelector('[data-path="sign in"]')
  ])
  await page.click(winner._remoteObject.description)
})()
I have also tried:
await page.click('button[data-var]="loginBtn"');
Try this instead (the attribute value belongs inside the brackets):
await page.click('button[data-var="loginBtn"]');
They are A/B testing their website, so you may land on a page with very different selectors than the ones you retrieved while visiting the site from your own Chrome browser.
In such cases you can try to grab the elements by their text content (unfortunately in this specific case that also changes across the designs) using XPath and its contains method. E.g. $x('//span[contains(text(), "Sign In")]')[0]
So I suggest detecting both button versions and getting their most stable selectors; these can be based on data attributes as well:
A
$('[data-path="sign in"]')
B
$('[data-path="join or login"]')
Then with a Promise.race you can detect which button is present and then extract its selector from the JSHandle#node like this: ._remoteObject.description:
{
  type: 'object',
  subtype: 'node',
  className: 'HTMLButtonElement',
  description: 'button.nav-btn.p0-sm.body-3.u-bold.ml2-sm.mr2-sm',
  objectId: '{"injectedScriptId":3,"id":1}'
}
=>
button.nav-btn.p0-sm.prl3-sm.pt2-sm.pb2-sm.fs12-nav-sm.d-sm-b.nav-color-grey.hover-color-black
Example:
const browser = await puppeteer.launch({
  headless: false,
  defaultViewport: null,
  args: ['--start-maximized']
})
const page = await browser.newPage()
await page.goto('https://www.nike.com/')
const winner = await Promise.race([
  page.waitForSelector('[data-path="join or login"]'),
  page.waitForSelector('[data-path="sign in"]')
])
await page.click(winner._remoteObject.description)
FYI: Maximize the browser window as well to make sure the element has the same selector name.
defaultViewport: null, args: ['--start-maximized']
Chromium starts in a bit smaller window with puppeteer by default.
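If you prefer the text-content (XPath) approach mentioned earlier, a tiny helper can build the expression; the tag and label below are only examples:

```javascript
// Build an XPath expression matching elements whose text contains `text`,
// e.g. xpathContains('span', 'Sign In') -> '//span[contains(text(), "Sign In")]'
function xpathContains(tag, text) {
  return `//${tag}[contains(text(), "${text}")]`;
}

// With Puppeteer it might be used like:
// const [button] = await page.$x(xpathContains('span', 'Sign In'));
// if (button) await button.click();
```

Centralizing the expression in one helper makes it easy to swap the label when the A/B design changes it.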
You need to use { waitUntil: 'networkidle0' } with page.goto.
This tells Puppeteer to wait for the network to be idle (no requests for 500 ms).
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    defaultViewport: null,
    args: ['--start-maximized']
  })
  const page = await browser.newPage()
  // load the nike.com page and wait for it to fully load (inc A/B scripts)
  await page.goto('https://www.nike.com/', { waitUntil: 'networkidle0' })
  // select whichever element appears first
  const el = await page.waitForSelector('[data-path="join or login"], [data-path="sign in"]', { timeout: 1000 })
  // execute click
  await page.click(el._remoteObject.description)
})()

puppeteer can't get page source after using evaluate()

I'm using Puppeteer to interact with a website, using the evaluate() function to manipulate the page front end (i.e. to click on certain items etc.); the click-through works fine, but I can't return the page source after clicking using evaluate.
I have recreated the error in the simplified script below: it loads google.com, clicks on 'I feel lucky' and should then return the page source of the loaded page:
const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--no-sandbox']
  });
  const page = await browser.newPage();
  await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
  response = await page.evaluate(() => {
    document.getElementsByClassName('RNmpXc')[1].click()
  });
  await page.waitForNavigation({waitUntil: 'load'});
  console.log(response.text());
}
main();
I get the following error:
TypeError: Cannot read property 'text' of undefined
UPDATE: New code following the suggestion to use page.content()
const puppeteer = require('puppeteer');

async function main() {
  const browser = await puppeteer.launch({
    headless: false,
    args: ['--no-sandbox']
  });
  const page = await browser.newPage();
  await page.goto('https://www.google.com/', {waitUntil: 'networkidle2'});
  await page.evaluate(() => {
    document.getElementsByClassName('RNmpXc')[1].click()
  });
  const source = await page.content()
  console.log(source);
}
main();
I am now getting the following error:
Error: Execution context was destroyed, most likely because of a navigation.
My question is: how can I return the page source using the .text() method after manipulating the webpage using the evaluate() method?
All suggestions / insights / proposals would be very much appreciated, thanks.
Since you're asking for the page source after JavaScript modification, I'd assume you want the DOM and not the original HTML content. Your evaluate function doesn't return anything, which results in an undefined response. You can use
const source = await page.evaluate(() => new XMLSerializer().serializeToString(document.doctype) + document.documentElement.outerHTML);
or
const source = await page.content();
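As for the 'Execution context was destroyed' error from the update: the in-page click starts a navigation that races your next call. One common pattern is to pair the action with waitForNavigation in a Promise.all; a minimal sketch, with a helper name of my own:

```javascript
// Run a navigation-triggering action and wait for the resulting
// navigation in parallel, so neither side races the other.
async function clickAndWaitForNavigation(page, action) {
  const [response] = await Promise.all([
    page.waitForNavigation({ waitUntil: 'load' }),
    action(),
  ]);
  return response;
}

// Usage against the updated example:
// await clickAndWaitForNavigation(page, () =>
//   page.evaluate(() => document.getElementsByClassName('RNmpXc')[1].click())
// );
// const source = await page.content(); // safe now, navigation has finished
```

Starting waitForNavigation before the click matters: if you await the click first, the navigation may already be over (or in flight) by the time the listener is installed.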

Puppeteer Crawler - Error: net::ERR_TUNNEL_CONNECTION_FAILED

Currently I have my Puppeteer running with a proxy on Heroku. Locally the proxy relay works totally fine. However, I get the error Error: net::ERR_TUNNEL_CONNECTION_FAILED. I've set all .env info in the Heroku config vars, so they are all available.
Any idea how I can fix this error and resolve the issue?
I currently have
const browser = await puppeteer.launch({
  args: [
    "--proxy-server=https=myproxy:myproxyport",
    "--no-sandbox",
    '--disable-gpu',
    "--disable-setuid-sandbox",
  ],
  timeout: 0,
  headless: true,
});
page.authentication
The correct format for the proxy-server argument is
--proxy-server=HOSTNAME:PORT
If it's an HTTPS proxy, you can pass the username and password using page.authenticate before even doing a navigation:
page.authenticate({ username: 'user', password: 'password' });
The complete code would look like this:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: false,
    ignoreHTTPSErrors: true,
    args: ['--no-sandbox', '--proxy-server=HOSTNAME:PORT']
  });
  const page = await browser.newPage();
  // Authenticate Here
  await page.authenticate({ username: user, password: password });
  await page.goto('https://www.example.com/');
})();
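If your proxy is handed to you as a single URL, the WHATWG URL class can split it into the --proxy-server argument and the credentials for page.authenticate; a sketch with a placeholder URL:

```javascript
// Split a proxy URL like http://user:pass@host:8000 into the
// --proxy-server launch argument and the page.authenticate credentials.
function splitProxyUrl(proxyUrl) {
  const { hostname, port, username, password } = new URL(proxyUrl);
  return {
    server: `--proxy-server=${hostname}:${port}`,
    credentials: {
      username: decodeURIComponent(username),
      password: decodeURIComponent(password),
    },
  };
}

// Usage sketch:
// const { server, credentials } = splitProxyUrl(process.env.PROXY_URL);
// const browser = await puppeteer.launch({ args: ['--no-sandbox', server] });
// const page = await browser.newPage();
// await page.authenticate(credentials);
```

This keeps the credentials out of the Chromium command line, which only ever sees host and port.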
Proxy Chain
If somehow the authentication does not work using the above method, you might want to handle the authentication somewhere else.
There are multiple packages to do that; one is proxy-chain, with which you can take one proxy and use it as a new proxy server.
proxyChain.anonymizeProxy(proxyUrl) will take one proxy with username and password and create a new proxy which you can use in your script.
const puppeteer = require('puppeteer');
const proxyChain = require('proxy-chain');

(async () => {
  const oldProxyUrl = 'http://username:password@hostname:8000';
  const newProxyUrl = await proxyChain.anonymizeProxy(oldProxyUrl);

  // Prints something like "http://127.0.0.1:12345"
  console.log(newProxyUrl);

  const browser = await puppeteer.launch({
    args: [`--proxy-server=${newProxyUrl}`],
  });

  // Do your magic here...
  const page = await browser.newPage();
  await page.goto('https://www.example.com');
})();
