I have written a Node script that uses Puppeteer on macOS. The script just launches Puppeteer and intercepts requests.
Here is the part of the code that uses puppeteer:
const getAllUrls = async (rootUrl) => {
const puppeteer = require('puppeteer');
const urls = [];
await puppeteer.launch().then(async browser => {
const page = await browser.newPage();
await page.setRequestInterception(true);
page.on('request', interceptedRequest => {
if (isRelevantUrl(interceptedRequest.url())) {
urls.push(interceptedRequest.url());
interceptedRequest.abort();
} else {
interceptedRequest.continue();
}
});
await page.goto(rootUrl);
await browser.close()
})
.catch(err => console.log(err));
return urls;
}
On macOS the script works great, but when I run it on Windows at my office I get the following error message:
Error: Failed to launch chrome!
TROUBLESHOOTING:
https://github.com/GoogleChrome/puppeteer/blob/master/docs/troubleshooting.md
at onClose (C:\Users........\node_modules\puppeteer\lib\Launcher.js:339:14)
at ChildProcess.helper.addEventListener (C:\Users........\node_modules\puppeteer\lib\Launcher.js:329:60)
I have tried the following config recommended by the Puppeteer troubleshooting guide:
const browser = await puppeteer.launch({
ignoreDefaultArgs: ['--disable-extensions'],
});
But it didn't help.
I copied the script (without node_modules, of course) into the project at my office and then ran npm i.
The other packages used by the script work fine on Windows as well.
Please help.
I'm on a Windows 10 machine. I've downloaded the Tor Browser, and using it normally works fine, but I'd like Puppeteer to use Tor when launching in headless mode. I see a lot of material about the SOCKS5 proxy, but I can't figure out how to set it up or why it isn't working. Presumably, calling the launch method starts Tor in the background?
Here's my JS code in node so far...
// puppeteer-extra is a drop-in replacement for puppeteer,
// it augments the installed puppeteer with plugin functionality
const puppeteer = require('puppeteer-extra')
// add stealth plugin and use defaults (all evasion techniques)
const StealthPlugin = require('puppeteer-extra-plugin-stealth')
puppeteer.use(StealthPlugin())
// artificial sleep function
const sleep = async (ms) => {
return new Promise((res, rej) => {
setTimeout(() => {
res()
}, ms)
})
}
// login function
const emulate = async () => {
// initiate a Puppeteer instance with options and launch
const browser = await puppeteer.launch({
headless: false,
args: [
'--proxy-server=socks5://127.0.0.1:1337'
]
});
// launch Facebook and wait until idle
const page = await browser.newPage()
// go to Tor
await page.goto('https://check.torproject.org/');
const isUsingTor = await page.$eval('body', el =>
el.innerHTML.includes('Congratulations. This browser is configured to use Tor')
);
if (!isUsingTor) {
console.log('Not using Tor. Closing...')
return await browser.close()
}
// do something...
}
// kick it off
emulate()
This gives me an ERR_PROXY_CONNECTION_FAILED error in Chromium. Why isn't it launching using Tor?
There are a few more steps you need to take.
You need to install Tor on your system. With Homebrew, for example:
brew install tor
Start Tor with:
brew services start tor
Tor uses port 9050 by default, so your proxy should look like this:
--proxy-server=socks5://127.0.0.1:9050
If you must use another port, set it in the torrc file.
Also, you might need to do your //go to tor// step before your //launch facebook// step.
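A minimal sketch of the proxy setup described above, assuming a standalone tor daemon on its default SOCKS port 9050. The helper names torProxyArg and isPortOpen are made up for this illustration, and only Node's built-in net module is used. Checking the port first gives a clearer message than ERR_PROXY_CONNECTION_FAILED when nothing is listening. (The tor bundled with Tor Browser typically listens on 9150 instead, and only while the browser is open.)

```javascript
const net = require('net');

// Build the Chromium flag for a SOCKS5 proxy. 9050 is the default port of a
// standalone tor daemon; pass another port if torrc sets a different SocksPort.
function torProxyArg(host = '127.0.0.1', port = 9050) {
  return `--proxy-server=socks5://${host}:${port}`;
}

// Resolve true if something accepts TCP connections on host:port, else false.
function isPortOpen(host, port, timeoutMs = 1000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (result) => {
      socket.destroy();
      resolve(result);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish(true));
    socket.once('timeout', () => finish(false));
    socket.once('error', () => finish(false));
  });
}

// Possible usage inside an async function (puppeteer itself is not loaded here):
//   if (!(await isPortOpen('127.0.0.1', 9050))) throw new Error('Tor is not listening');
//   const browser = await puppeteer.launch({ args: [torProxyArg()] });
```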
I'm using Puppeteer in my Telegram bot to fetch content from the web.
Locally it all works just fine, but here is the catch:
When I fetch the URL's HTML on the server, the response is not the rendered HTML but the bare web app with all its JS files. On my local machine I get the HTML with all the needed links.
That's my code:
return await puppeteer.launch({
args: ['--no-sandbox', '--disable-setuid-sandbox'],
headless: true,
}).then(async browser => {
const page = await browser.newPage()
await page.setJavaScriptEnabled(true)
await page.goto(targetUrl)
await page.waitForTimeout(2000)
const data = await page.$$eval(SELECTORS, (res) => {
return res.map(r => {
return r.getAttribute('ATRR')
})
}) as never as string[]
.......
})
PS: I have already added this buildpack to Heroku:
https://github.com/jontewks/puppeteer-heroku-buildpack
Apparently, when you use puppeteer-extra, you need to install the puppeteer library too.
puppeteer-extra depends on puppeteer but does not install it automatically, so Chromium was never downloaded on Heroku. That also explains why it worked on my local machine, which already had puppeteer installed.
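A quick way to catch this at startup is to check that both packages resolve before launching anything. This is only a sketch: missingModules is a hypothetical helper name, and it relies on nothing but Node's require.resolve.

```javascript
// Return the names from `names` that cannot be resolved from this file.
function missingModules(names) {
  return names.filter((name) => {
    try {
      require.resolve(name);
      return false; // resolvable, so not missing
    } catch (err) {
      return true; // require.resolve throws when the module is not installed
    }
  });
}

// Possible usage at the top of the bot's entry point:
//   const missing = missingModules(['puppeteer', 'puppeteer-extra']);
//   if (missing.length > 0) throw new Error(`Run npm install ${missing.join(' ')}`);
```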
I'm trying something really simple:
Navigate to google.com
Fill the search box with "cheese"
Press enter on the search box
Print the text for the title of the first result
So simple, but I can't get it to work. This is the code:
const playwright = require('playwright');
(async () => {
for (const browserType of ['chromium', 'firefox', 'webkit']) {
const browser = await playwright[browserType].launch();
try {
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
await page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
page.waitForSelector('div#rso h3')
.then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`))
.catch(error => console.error(`Waiting for result: ${error}`));
} catch(error) {
console.error(`Trying to run test on ${browserType}: ${error}`);
} finally {
await browser.close();
}
}
})();
At first I tried to get the first result with page.$(), but it didn't work. After investigating a little, I found page.waitForNavigation(), which I thought would be the solution, but it isn't.
I'm using the latest playwright version: 1.0.2.
It seems to me that the only problem was your initial promise composition. I refactored the promise chain to async/await and used page.$eval to retrieve the textContent. It now works, and there are no more "Target closed" errors.
try {
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
await page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
// page.waitForSelector('div#rso h3').then(firstResult => console.log(`${browserType}: ${firstResult.textContent()}`)).catch(error => console.error(`Waiting for result: ${error}`));
await page.waitForSelector('div#rso h3');
const firstResult = await page.$eval('div#rso h3', firstRes => firstRes.textContent);
console.log(`${browserType}: ${firstResult}`)
} catch(error) {
console.error(`Trying to run test on ${browserType}: ${error}`);
} finally {
await browser.close();
}
}
Output:
chrome: Cheese – Wikipedia
firefox: Cheese – Wikipedia
webkit: Cheese – Wikipedia
Note: chromium and webkit work; firefox fails on waitForNavigation for me. When I replaced it with await page.waitForTimeout(5000); firefox worked as well. It might be an issue with Playwright's Firefox support for the navigation promise.
If you await page.press('input[name=q]', 'Enter'); first, it might be too late for waitForNavigation to catch the navigation.
You could remove the await on the press call. You need to wait for the navigation, not the press action.
const context = await browser.newContext();
const page = await context.newPage();
await page.goto('https://google.com');
await page.fill('input[name=q]', 'cheese');
page.press('input[name=q]', 'Enter');
await page.waitForNavigation();
var firstResult = await page.waitForSelector('div#rso h3');
console.log(`${browserType}: ${await firstResult.textContent()}`);
Also notice that you need to await textContent().
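A common alternative to dropping the await is to start both promises together with Promise.all, so the navigation listener is registered before the key press and a failure of the press is not left unhandled. The sketch below assumes a Playwright page object; pressAndWaitForNavigation is a name chosen for this example, not part of Playwright.

```javascript
// Start waiting for the navigation before triggering it, then await both.
// `page` is expected to expose waitForNavigation() and press(), as a
// Playwright Page does.
async function pressAndWaitForNavigation(page, selector, key) {
  const [response] = await Promise.all([
    page.waitForNavigation(), // registered first, so the navigation is not missed
    page.press(selector, key), // the action that triggers the navigation
  ]);
  return response;
}

// Possible usage, replacing the press and waitForNavigation lines above:
//   await pressAndWaitForNavigation(page, 'input[name=q]', 'Enter');
//   const firstResult = await page.waitForSelector('div#rso h3');
//   console.log(await firstResult.textContent());
```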
In my case, the Playwright error "Target closed" appeared on the first attempt to retrieve text from the page.
The error is misleading: the actual cause was that Basic Auth was enabled on the target site.
Playwright could not open the page and just failed with "Target closed". Passing the credentials to the context fixed it:
const options = {
httpCredentials: { username: 'user', password: 'password' }
};
const context = await browser.newContext(options);
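To keep the credentials out of the source, the options object can be built from environment variables. This is a sketch under assumptions: the variable names BASIC_AUTH_USER and BASIC_AUTH_PASSWORD and the helper contextOptionsFromEnv are hypothetical; only the httpCredentials option itself comes from the snippet above.

```javascript
// Build newContext() options from an env-like object. Returns {} when either
// value is missing, so the context is created without Basic Auth.
function contextOptionsFromEnv(env) {
  const username = env.BASIC_AUTH_USER; // hypothetical variable name
  const password = env.BASIC_AUTH_PASSWORD; // hypothetical variable name
  if (!username || !password) {
    return {};
  }
  return { httpCredentials: { username, password } };
}

// Possible usage:
//   const context = await browser.newContext(contextOptionsFromEnv(process.env));
```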
One more issue was that local tests ran without a problem, including in Docker containers, while GitHub CI failed with nothing but the above Playwright error.
The cause was a special character in a GitHub secret. For example, the dollar sign $ is simply removed from the secret in GitHub Actions. To fix it, either pass the secret through an env: section
env:
  XXX: ${{ secrets.SUPER_SECRET }}
or wrap the secret in single quotes:
run: |
  export XXX='${{ secrets.YYY}}'
Similar escaping rules exist in Kubernetes, Docker and GitLab: $$ becomes $, and z$abc becomes z.
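The effect can be reproduced locally. The sketch below assumes a POSIX sh on the PATH and uses Node's child_process module; runSh is a helper name invented for this example. It shows how a double-quoted value loses everything from the $ onward when the variable is unset, while single quotes keep the text intact.

```javascript
const { execFileSync } = require('child_process');

// Run `script` with POSIX sh and return its stdout without the trailing newline.
// The environment is reduced to PATH so that $abc is guaranteed to be unset.
function runSh(script) {
  return execFileSync('sh', ['-c', script], {
    env: { PATH: process.env.PATH },
    encoding: 'utf8',
  }).replace(/\n$/, '');
}

// Inside double quotes, sh expands $abc (unset here) to an empty string.
const doubleQuoted = runSh('echo "z$abc"');
// Inside single quotes, the text is passed through literally.
const singleQuoted = runSh("echo 'z$abc'");

console.log(doubleQuoted); // z
console.log(singleQuoted); // z$abc
```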
Use the mcr.microsoft.com/playwright Docker image from Microsoft, which comes with Node, npm and Playwright pre-installed. Alternatively, when installing Playwright yourself, do not forget to install the system package dependencies by running npx playwright install-deps.
The VM should also have enough resources to handle browser tests; running short is a common problem in CI/CD workflows.
I am trying to run Puppeteer in a Bamboo build, but it does not seem to execute properly. The detailed error is below.
Is there something I have to install to get it running in Bamboo, or do I have to use an alternative? I could not find any articles online about this issue.
For a bit more background, I am trying to add jest-image-snapshot to my test process, and I generate the snapshot like this:
const puppeteer = require('puppeteer');
let browser;
beforeAll(async () => {
browser = await puppeteer.launch();
});
it('show correct page: variant', async () => {
const page = await browser.newPage();
await page.goto(
'http://localhost:8080/app/register?experimentName=2018_12_STREAMLINED_ACCOUNT&experimentVariation=STREAMLINED#/'
);
const image = await page.screenshot();
expect(image).toMatchImageSnapshot();
});
afterAll(async () => {
await browser.close();
});
The log shows TypeError: Cannot read property 'newPage' of undefined, which comes from the line const page = await browser.newPage();
The important part is in your screenshot:
Failed to launch chrome! ... No usable sandbox!
Try to launch puppeteer without a sandbox like this:
await puppeteer.launch({
args: ['--no-sandbox']
});
Depending on the platform, you might also want to try the following arguments (in addition to --no-sandbox):
--disable-setuid-sandbox
--disable-dev-shm-usage
If none of the three works, the Troubleshooting guide might have additional information.
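One way to keep these flags out of local runs is to add them only on the build server. The sketch below assumes the build plan exports a variable named CI (a convention you would have to set up yourself; it is not something the source confirms for Bamboo), and launchOptionsFor is a hypothetical helper name.

```javascript
// Chromium flags mentioned above for build agents without a usable sandbox.
const CI_ARGS = ['--no-sandbox', '--disable-setuid-sandbox', '--disable-dev-shm-usage'];

// Return Puppeteer launch options: the extra flags when env.CI is set,
// and the defaults (an empty options object) everywhere else.
function launchOptionsFor(env) {
  return env.CI ? { args: [...CI_ARGS] } : {};
}

// Possible usage in the Jest setup:
//   beforeAll(async () => {
//     browser = await puppeteer.launch(launchOptionsFor(process.env));
//   });
```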
I'm trying to use Puppeteer to automate the login process for our agents in Amazon Connect, but I can't get Puppeteer to finish loading the CCP login page. See the code below:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const url = 'https://ccalderon-reinvent.awsapps.com/connect/ccp#/';
await page.goto(url, {waitUntil: 'domcontentloaded'});
console.log(await page.content());
// console.log('waiting for username input');
// await page.waitForSelector('#wdc_username');
await browser.close();
I can never see the content of the page; it times out. Am I doing something wrong? If I launch the browser with { headless: false }, I can see that the page never finishes loading.
Please note the same code works fine with https://www.github.com/login so it must be something specific to the source code of Connect's CCP.
In case you are from the future and are having problems with Puppeteer for no apparent reason, try downgrading the Puppeteer version first and see if the issue persists.
This seems to be a bug in Chromium development version 73.0.3679.0. The error log said it could not load a specific script, although the script could still be loaded manually.
The Solution:
Using Puppeteer version 1.11.0 solved this issue. If you want to stay on Puppeteer 1.12.2 but use a different Chromium revision, you can use the executablePath option.
Here are the Chromium versions used by each Puppeteer release (at the time of writing):
Chromium 73.0.3679.0 - Puppeteer v1.12.2
Chromium 72.0.3582.0 - Puppeteer v1.11.0
Chromium 71.0.3563.0 - Puppeteer v1.9.0
Chromium 70.0.3508.0 - Puppeteer v1.7.0
Chromium 69.0.3494.0 - Puppeteer v1.6.2
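The pairs listed above can be kept as a small lookup table, for example to log which Chromium build a given Puppeteer version is expected to bring along. This is only a sketch: chromiumFor is a hypothetical helper, and the table holds just the five pairs from this answer.

```javascript
// Puppeteer version -> bundled Chromium version, copied from the list above.
const CHROMIUM_BY_PUPPETEER = new Map([
  ['1.12.2', '73.0.3679.0'],
  ['1.11.0', '72.0.3582.0'],
  ['1.9.0', '71.0.3563.0'],
  ['1.7.0', '70.0.3508.0'],
  ['1.6.2', '69.0.3494.0'],
]);

// Return the Chromium version for a Puppeteer version, or null if it is not
// one of the versions listed above.
function chromiumFor(puppeteerVersion) {
  return CHROMIUM_BY_PUPPETEER.has(puppeteerVersion)
    ? CHROMIUM_BY_PUPPETEER.get(puppeteerVersion)
    : null;
}

console.log(chromiumFor('1.11.0')); // 72.0.3582.0
```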
I checked my locally installed Chrome, which loaded the page correctly:
$(which google-chrome) --version
Google Chrome 72.0.3626.119
Note: The Puppeteer team recommends in their docs using the Chromium bundled with the package (most likely the latest developer version) rather than a different revision.
I also edited the code a little so that it finishes loading once all network requests are done and the username input is visible.
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({
headless: false,
executablePath: "/usr/bin/google-chrome"
});
const page = await browser.newPage();
const url = "https://ccalderon-reinvent.awsapps.com/connect/ccp#/";
await page.goto(url, { waitUntil: "networkidle0" });
console.log("waiting for username input");
await page.waitForSelector("#wdc_username", { visible: true });
await page.screenshot({ path: "example.png" });
await browser.close();
})();
The specific revision number can be obtained in many ways; one is to check the package.json of the puppeteer package. The URL for 1.11.0 is:
https://github.com/GoogleChrome/puppeteer/blob/v1.11.0/package.json
If you would like to automate downloading the Chromium revision, you can use browserFetcher to fetch a specific revision.
const browserFetcher = puppeteer.createBrowserFetcher();
const revisionInfo = await browserFetcher.download('609904'); // chrome 72 is 609904
const browser = await puppeteer.launch({executablePath: revisionInfo.executablePath})