How to use puppeteer to automante Amazon Connect CCP login? - javascript

I'm trying use puppeteer to automate the login process for our agents in Amazon Connect however I can't get puppeteer to finish loading the CCP login page. See code below:
const browser = await puppeteer.launch();
const page = await browser.newPage();
const url = 'https://ccalderon-reinvent.awsapps.com/connect/ccp#/';
await page.goto(url, {waitUntil: 'domcontentloaded'});
console.log(await page.content());
// console.log('waiting for username input');
// await page.waitForSelector('#wdc_username');
await browser.close();
I can never see the content of the page, it times out. Am I doing something wrong? If I launch the browser with { headless: false } I can see the page never finishes loading.
Please note the same code works fine with https://www.github.com/login so it must be something specific to the source code of Connect's CCP.

In case you are from future and having problem with puppeteer for no reason, try to downgrade the puppeteer version first and see if the issue persists.
This seems like a bug with Chromium Development Version 73.0.3679.0, The error log said it could not load specific script somehow, but we could still load the script manually.
The Solution:
Using Puppeteer version 1.11.0 solved this issue. But if you want to use puppeteer version 1.12.2 but with a different chromium revision, you can use the executablePath argument.
Here are the respective versions used on puppeteer (at this point of answer),
Chromium 73.0.3679.0 - Puppeteer v1.12.2
Chromium 72.0.3582.0 - Puppeteer v1.11.0
Chromium 71.0.3563.0 - Puppeteer v1.9.0
Chromium 70.0.3508.0 - Puppeteer v1.7.0
Chromium 69.0.3494.0 - Puppeteer v1.6.2
I checked my locally installed chrome,which was loading the page correctly,
$(which google-chrome) --version
Google Chrome 72.0.3626.119
Note: The puppeteer team suggested on their doc to specifically use the chrome provided with the code (most likely the latest developer version) instead of using different revisions.
Also I edited the code a little bit to finish loading when all network requests is done and the username input is visible.
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({
headless: false,
executablePath: "/usr/bin/google-chrome"
});
const page = await browser.newPage();
const url = "https://ccalderon-reinvent.awsapps.com/connect/ccp#/";
await page.goto(url, { waitUntil: "networkidle0" });
console.log("waiting for username input");
await page.waitForSelector("#wdc_username", { visible: true });
await page.screenshot({ path: "example.png" });
await browser.close();
})();
The specific revision number can be obtained in many ways, one is to check the package.json of puppeteer package. The url for 1.11.0 is,
https://github.com/GoogleChrome/puppeteer/blob/v1.11.0/package.json
If you like to automate the chrome revision downloading, you can use browserFetcher to fetch specific revision.
const browserFetcher = puppeteer.createBrowserFetcher();
const revisionInfo = await browserFetcher.download('609904'); // chrome 72 is 609904
const browser = await puppeteer.launch({executablePath: revisionInfo.executablePath})
Result:

Related

In Puppeteer, is it possible to launch a headed browser instance, get the user authenticated, and then continue the session in a headless state? [duplicate]

I want to start a chromium browser instant headless, do some automated operations, and then turn it visible before doing the rest of the stuff.
Is this possible to do using Puppeteer, and if it is, can you tell me how? And if it is not, is there any other framework or library for browser automation that can do this?
So far I've tried the following but it didn't work.
const browser = await puppeteer.launch({'headless': false});
browser.headless = true;
const page = await browser.newPage();
await page.goto('https://news.ycombinator.com', {waitUntil: 'networkidle2'});
await page.pdf({path: 'hn.pdf', format: 'A4'});
Short answer: It's not possible
Chrome only allows to either start the browser in headless or non-headless mode. You have to specify it when you launch the browser and it is not possible to switch during runtime.
What is possible, is to launch a second browser and reuse cookies (and any other data) from the first browser.
Long answer
You would assume that you could just reuse the data directory when calling puppeteer.launch, but this is currently not possible due to multiple bugs (#1268, #1270 in the puppeteer repo).
So the best approach is to save any cookies or local storage data that you need to share between the browser instances and restore the data when you launch the browser. You then visit the website a second time. Be aware that any state the website has in terms of JavaScript variable, will be lost when you recrawl the page.
Process
Summing up, the whole process should look like this (or vice versa for headless to headfull):
Crawl in non-headless mode until you want to switch mode
Serialize cookies
Launch or reuse second browser (in headless mode)
Restore cookies
Revisit page
Continue crawling
As mentioned, this isn't currently possible since the headless switch occurs via Chromium launch flags.
I usually do this with userDataDir, which the Chromium docs describe as follows:
The user data directory contains profile data such as history, bookmarks, and cookies, as well as other per-installation local state.
Here's a simple example. This launches a browser headlessly, sets a local storage value on an arbitrary page, closes the browser, re-opens it headfully, retrieves the local storage value and prints it.
const puppeteer = require("puppeteer"); // ^18.0.4
const url = "https://www.example.com";
const opts = {userDataDir: "./data"};
let browser;
(async () => {
{
browser = await puppeteer.launch({...opts, headless: true});
const [page] = await browser.pages();
await page.goto(url, {waitUntil: "domcontentloaded"});
await page.evaluate(() => localStorage.setItem("hello", "world"));
await browser.close();
}
{
browser = await puppeteer.launch({...opts, headless: false});
const [page] = await browser.pages();
await page.goto(url, {waitUntil: "domcontentloaded"});
const result = await page.evaluate(() => localStorage.getItem("hello"));
console.log(result); // => world
}
})()
.catch(err => console.error(err))
.finally(() => browser?.close())
;
Change const opts = {userDataDir: "./data"}; to const opts = {}; and you'll see null print instead of world; the user data doesn't persist.
The answer from a few years ago mentions issues with userDataDir and suggests a cookies solution. That's fine, but I haven't had any issues with userDataDir so either they've been resolved on the Puppeteer end or my use cases haven't triggered the issues.
There's a useful-looking answer from a reputable source in How to turn headless on after launch? but I haven't had a chance to try it yet.

Can't use Brave Browser with Puppeteer

About a month ago I wrote a question asking if it were possible using Brave Browser with puppeteer; the answer was yes, I tested it, and everything worked perfectly;
today I tried to run the same code but i got the error ERROR: process "xxxxx" not found
Any ideas about this issue?
const puppeteer = require('puppeteer');
(async()=>{
const browser = await puppeteer.launch({
executablePath:"C:/Program Files (x86)/BraveSoftware/Brave-Browser/Application/brave.exe",
headless:false,
devtools:false,
})
const page = await browser.newPage()
})()
You need to do at least two things to get puppeteer working with Brave.
First, you need to enable remote debugging on brave. You need to go to chrome://settings/privacy and then enable Remote debugging.
Second, Brave doesn't like many default command-line arguments that puppeteer sends. So you might want to ignore default arguments.
(async()=>{
const browser = await puppeteer.launch({
executablePath:"/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
headless:false,
ignoreDefaultArgs: true
})
const page = await browser.newPage()
page.goto("https://www.google.com")
})()

Failed to launch chrome!, failed to launch chrome puppeteer in bamboo for jest image snapshot test

I am trying to run puppeteer in bamboo build run. But seems there is problem to execute it properly. The detail error below
I wonder if there is stuff I have to install to get it able to run in bamboo? or I have to do other alternative. There is no articles available online regarding this issue.
And a bit more background, I am trying to implement jest-image-snapshot into my test process. and making a call to generate snapshot like this
const puppeteer = require('puppeteer');
let browser;
beforeAll(async () => {
browser = await puppeteer.launch();
});
it('show correct page: variant', async () => {
const page = await browser.newPage();
await page.goto(
'http://localhost:8080/app/register?experimentName=2018_12_STREAMLINED_ACCOUNT&experimentVariation=STREAMLINED#/'
);
const image = await page.screenshot();
expect(image).toMatchImageSnapshot();
});
afterAll(async () => {
await browser.close();
});
the reason log of TypeError: Cannot read property 'newPage' of undefined is because const page = await browser.newPage();
The important part is in your screenshot:
Failed to launch chrome! ... No usable sandbox!
Try to launch puppeteer without a sandbox like this:
await puppeteer.launch({
args: ['--no-sandbox']
});
Depending on the platform, you might also want to try the following arguments (also in addition):
--disable-setuid-sandbox
--disable-dev-shm-usage
If all three do not work, the Troubleshooting guide might have additional information.

Use different ip addresses in puppeteer requests

I have multiple ip interfaces in my server and I can't find how to force puppeteer to use them in its requests
I am using node v10.15.0 and puppeteer 1.11.0
You can use the flag --netifs-to-ignore when launching the browser to specify which interfaces should be ignored by Chrome. Quote from the List of Chromium Command Line Switches:
--netifs-to-ignore: List of network interfaces to ignore. Ignored interfaces will not be used for network connectivity
You can use the argument like this when launching the browser:
const browser = await puppeteer.launch({
args: ['--netifs-to-ignore=INTERFACE_TO_IGNORE']
});
Maybe this will help. You can see the full code here
'use strict';
const puppeteer = require('puppeteer');
(async() => {
const browser = await puppeteer.launch({
// Launch chromium using a proxy server on port 9876.
// More on proxying:
// https://www.chromium.org/developers/design-documents/network-settings
args: [ '--proxy-server=127.0.0.1:9876' ]
});
const page = await browser.newPage();
await page.goto('https://google.com');
await browser.close();
})();

How to use addScriptTag() with local file path in Puppeteer

I am trying to use puppeteer with a local script file.
I get the script file to load when I host the file and use addScriptTag() with the localhost address. This is not ideal. I need to use the local file directly from the path. The current working directory is /maps in this case. I set the relative path as path in the options of the addScriptTag() function but the code just goes dark on me at this stage. There is no error and no stepping into anything.
console.log(`Current directory: ${process.cwd()}`);
// C:\Users\dbauszus\Documents\GitHub\maps
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setContent(jsr.templates('./views/report.html').render(), {waitUntil: 'load'});
// works with an url to the same file.
// await page.addScriptTag('http://localhost:3000/maps/js/build/report_bundle.js');
// path for js file on windows C:\Users\dbauszus\Documents\GitHub\maps\public\js\build\report_bundle.js
await page.addScriptTag({path: 'public\\js\\build\\report_bundle.js'});
await page.screenshot({path: 'example.png'});
await browser.close();
})();
Any help will be welcome as I find the puppetteer documentation increasingly frustrating and there aren't (m)any working examples as of now.
It's a version problem. This method was only introduced in the latest update 0.12 from yesterday. I installed puppeteer the day before. Meh!
https://github.com/GoogleChrome/puppeteer/issues/949

Categories

Resources