connect puppeteer to already open browser page

I'm creating a Chrome extension that helps people log in to a website, and I need Puppeteer to connect to the URL the user is currently on. Can I connect to an already open page to manipulate it?
I've tried:
const browserURL = "http://127.0.0.1:21222";
const browser = await puppeteer.connect({ browserURL });
and I started Chrome with:
chrome.exe --remote-debugging-port=21222
I need to connect to one specific URL, for example facebook.com. I tried:
const browserURL = "http://facebook.com:21222";
and also without the ":21222"...
I'm using Windows 10
node v16.16.0
Thanks for helping!
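
One way to approach this, sketched under some assumptions: `browserURL` has to point at Chrome's debugging endpoint (127.0.0.1:21222 here), not at the website itself. Once connected, the already open tab can be looked up among `browser.pages()` by its URL. The helper names `findPageByUrl` and `attachToOpenTab` are hypothetical, and the sketch assumes Chrome was started with --remote-debugging-port=21222 and that the puppeteer package is installed.

```javascript
// Hypothetical helper: return the first open page whose URL contains `fragment`.
// Works on anything exposing a url() method, such as Puppeteer Page objects.
function findPageByUrl(pages, fragment) {
  return pages.find((page) => page.url().includes(fragment)) || null;
}

// Sketch: attach to a tab that is already open in a running Chrome.
async function attachToOpenTab(fragment, browserURL = 'http://127.0.0.1:21222') {
  const puppeteer = require('puppeteer'); // loaded lazily; assumed to be installed
  const browser = await puppeteer.connect({ browserURL, defaultViewport: null });
  const pages = await browser.pages();
  const page = findPageByUrl(pages, fragment);
  if (!page) {
    await browser.disconnect(); // detach without closing the user's browser
    throw new Error(`No open tab matches "${fragment}"`);
  }
  return { browser, page };
}
```

Usage would look like `const { page } = await attachToOpenTab('facebook.com');`, after which `page` can be driven like any other Puppeteer page.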

Related

Node - How to scrape dynamic websites?

I am trying to scrape the content of a website that must be dynamic, because I do not get the content with a regular fetch. The suggested libraries I could find were phantom, which is not maintained anymore, and playwright. With playwright I do get the content, but playwright opens the browser:
let browser;
browser = await playwright.chromium.launch({headless: false});
const page = await browser.newPage();
await page.goto("https://www.example.com");
The problem is that I would run this from a cron job or something similar to fetch content periodically, where the browser could not be opened, so no content would be fetched.
Is there any other way I could do it? Thanks!
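
A possible direction, sketched here as an assumption rather than a confirmed fix: Playwright can run Chromium headless, meaning no window is opened, which is what a cron job needs. `headless: true` is the relevant launch option. The function name `fetchRenderedHtml` is hypothetical, and the playwright package is assumed to be installed.

```javascript
// Launch options: headless means no browser window is opened.
const LAUNCH_OPTIONS = { headless: true };

// Hypothetical helper: load a page headlessly and return its rendered HTML.
async function fetchRenderedHtml(url) {
  const playwright = require('playwright'); // loaded lazily; assumed to be installed
  const browser = await playwright.chromium.launch(LAUNCH_OPTIONS);
  try {
    const page = await browser.newPage();
    // Wait until network activity settles so dynamic content has a chance to load.
    await page.goto(url, { waitUntil: 'networkidle' });
    return await page.content();
  } finally {
    await browser.close(); // always release the browser process
  }
}
```

A scheduled script could then call `fetchRenderedHtml("https://www.example.com")` and parse the returned HTML.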

how to connect puppeteer with an open chrome browser using authenticated proxies

I am connecting Puppeteer to an already opened Chrome browser using this piece of code:
const browserURL = 'http://127.0.0.1:9222';
const browser = await puppeteer.connect({browserURL , defaultViewport : null });
const page = await browser.newPage();
But most of the time I want to connect to the browser through an authenticated proxy, maybe by launching Chrome with specific flags?
I tried proxy-login-Automator, which would launch Chrome with an authenticated proxy that I could connect to, but this is too complicated if I want to keep changing proxies and if I am running many instances of the same code.
I am launching Chrome with this:
chrome --remote-debugging-port=9222 --user-data-dir="C:\Users\USER\AppData\Local\Google\Chrome\User Data"
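
One approach that may work here, sketched with assumptions: Chrome accepts a --proxy-server flag at launch, and Puppeteer's `page.authenticate()` can supply the credentials when the proxy asks for them. The helpers `buildChromeArgs` and `openProxiedPage`, and any proxy host or credentials passed to them, are hypothetical placeholders.

```javascript
// Hypothetical helper: build Chrome's command-line flags for a debuggable,
// proxied instance. Each instance needs its own port and profile directory.
function buildChromeArgs({ port, proxy, userDataDir }) {
  const args = [`--remote-debugging-port=${port}`];
  if (proxy) args.push(`--proxy-server=${proxy}`);
  if (userDataDir) args.push(`--user-data-dir=${userDataDir}`);
  return args;
}

// Sketch: connect to the instance on `port` and answer the proxy's auth prompt.
async function openProxiedPage(port, credentials) {
  const puppeteer = require('puppeteer'); // loaded lazily; assumed to be installed
  const browser = await puppeteer.connect({
    browserURL: `http://127.0.0.1:${port}`,
    defaultViewport: null,
  });
  const page = await browser.newPage();
  await page.authenticate(credentials); // { username, password }
  return page;
}
```

Because --proxy-server is set once per Chrome process, switching proxies with this approach likely means starting another instance on a different port with its own --user-data-dir.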

puppeteer chromium url redirects to malware

I have a Node.js (v12.17.0), npm (6.14.4) project that I run from the Windows 10 command prompt. It uses puppeteer ("puppeteer": "^5.3.1") to go to a URL and get info.
My puppeteer code runs this to go to the page:
let targetURL = "https://www.walmart.com/ip/Small-Large-Dogs-Muzzle-Anti-Stop-Bite-Barking-Chewing-Mesh-Mask-Training-S-XXL/449423974";
await page.goto(targetURL, { waitUntil: 'networkidle2' });
Puppeteer runs a Chromium exe I just downloaded; I specify its location when I start puppeteer:
var options = {
executablePath: "C:\\Users\\marti\\Downloads\\chrome-win-new\\chrome-win\\chrome.exe",
So when puppeteer starts and goes to that URL, it loads the Walmart page at first and seems fine, then a couple of seconds later the page looks like this:
I ran Windows Defender and Malwarebytes to double-check whether I had a virus. If I open Chromium without puppeteer it doesn't redirect; it only seems to happen when I run my Node.js program.
I tried a different URL (https://www.petcarerx.com/chicken-soup-for-the-soul-bacon-and-cheese-crunchy-bites-dog-treats/32928?sku=50857#50857) and it did not redirect. Does anyone know how I can fix this for my Walmart URL?

Connecting Browsers in Puppeteer

Is it possible to connect a browser to puppeteer without instantiating it in puppeteer? For example, running an instance of chromium like a regular user and then connecting that to an instance of puppeteer in code?

The answer is yes and no.
You can connect to an existing browser using the connect function:
const browserURL = 'http://127.0.0.1:21222';
const browser = await puppeteer.connect({browserURL});
But to use those two lines, you need to launch Chrome with the --remote-debugging-port=21222 argument.

I believe you need to connect to an address ending with an id:
ws://127.0.0.1:9222/devtools/browser/{id}
When you launch Chrome with --remote-debugging-port, you'll see something like
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222
...
DevTools listening on ws://127.0.0.1:9222/devtools/browser/44b3c476-5524-497e-9918-d73fa39e40cf
The address on the last line is what you need, i.e.
const browser = await puppeteer.connect({
browserWSEndpoint: "ws://127.0.0.1:9222/devtools/browser/44b3c476-5524-497e-9918-d73fa39e40cf"
});
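
If the startup output is not at hand, the same address can usually be read from Chrome's debugging HTTP endpoint: a GET to /json/version on the debugging port returns JSON that includes a `webSocketDebuggerUrl` field. The sketch below assumes port 9222 as in the example above; `parseWsEndpoint` and `getWsEndpoint` are hypothetical helper names.

```javascript
const http = require('http');

// Extract the browser-level WebSocket URL from the /json/version response body.
function parseWsEndpoint(body) {
  const info = JSON.parse(body);
  if (!info.webSocketDebuggerUrl) {
    throw new Error('webSocketDebuggerUrl missing from /json/version response');
  }
  return info.webSocketDebuggerUrl;
}

// Ask a running Chrome (started with --remote-debugging-port) for its endpoint.
function getWsEndpoint(port = 9222, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    http
      .get({ host, port, path: '/json/version' }, (res) => {
        let body = '';
        res.setEncoding('utf8');
        res.on('data', (chunk) => { body += chunk; });
        res.on('end', () => {
          try {
            resolve(parseWsEndpoint(body));
          } catch (err) {
            reject(err);
          }
        });
      })
      .on('error', reject);
  });
}
```

The result can be passed on as `puppeteer.connect({ browserWSEndpoint: await getWsEndpoint() })`. As far as I know, the `browserURL` option performs an equivalent lookup itself, so this is mainly useful when the WebSocket address is needed explicitly.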

Download a file using Watir Webdriver and phantomjs

I am using Watir Webdriver and a headless (phantomjs) browser to go to a website, log in, and download a file by clicking a JavaScript submit button. When I click submit, I am redirected with a 302 to a different address, which I can see under Network. This is the URL of the file to download. I am debugging using screenshots, so I can see that phantomjs is working fine, but after it clicks the submit button nothing happens. The whole procedure works fine in Firefox too.
Using Watir Webdriver, how can I get that link, save it in a database, and redirect phantomjs to download the file using that link?
I tried reading GitHub pull requests, official documentation, and blog posts, but I am unable to reach any solution. Please provide suggestions or solutions. Even a one-word suggestion is appreciated, as it might help me approach the problem.
I have tried getting the HTTP request headers but didn't succeed. I have browser.cookie.to_a, and browser.headers gives me only an object like Watir::HTMLElementCollection:0x000000024b88c0. Thank you

I was not able to find a solution to my question using phantomjs, but I have solved the problem using watir-webdriver (0.9.1), headless, and Firefox (44.0).
These are the settings I have used.
profile = Selenium::WebDriver::Firefox::Profile.new
profile['download.prompt_for_download'] = false
profile['browser.download.folderList'] = 2 # custom location
profile['browser.download.dir'] = download_directory
profile['browser.helperApps.neverAsk.saveToDisk'] = "application/pdf"
profile['pdfjs.disabled'] = true
profile['pdfjs.firstRun'] = false
headless = Headless.new
headless.start
browser = Watir::Browser.new(:firefox, :profile => profile)
browser.goto 'www.google.com'
browser.window.resize_to(1280, 720)
puts browser.title
puts browser.url
