puppeteer chromium url redirects to malware

I have a Node.js (v12.17.0, npm 6.14.4) project that I run from the Windows 10 command prompt. It uses Puppeteer ("puppeteer": "^5.3.1") to go to a URL and get info.
My Puppeteer code runs this to go to the page:
let targetURL = "https://www.walmart.com/ip/Small-Large-Dogs-Muzzle-Anti-Stop-Bite-Barking-Chewing-Mesh-Mask-Training-S-XXL/449423974";
await page.goto(targetURL, { waitUntil: 'networkidle2' });
Puppeteer is running a Chromium executable I just downloaded; I specify its location when I start Puppeteer:
var options = {
executablePath: "C:\\Users\\marti\\Downloads\\chrome-win-new\\chrome-win\\chrome.exe",
So when Puppeteer starts and goes to that URL, it loads the Walmart page at first and seems fine, then a couple of seconds later the page looks like this:
I ran Windows Defender and Malwarebytes to double-check whether I had a virus. If I open Chromium without Puppeteer it doesn't redirect; it only seems to happen when I run my Node.js program.
I tried a different URL (https://www.petcarerx.com/chicken-soup-for-the-soul-bacon-and-cheese-crunchy-bites-dog-treats/32928?sku=50857#50857) and it did not redirect. Does anyone know how I can fix this for my Walmart URL?

Related

connect puppeteer to already open browser page

I'm creating a Chrome extension that helps people log in to a website, and I need Puppeteer to connect to the page the user is currently on. Can I connect to an already-open page to manipulate it?
I've tried:
const browserURL = "http://127.0.0.1:21222";
const browser = await puppeteer.connect({ browserURL });
and I tried starting Chrome with:
chrome.exe --remote-debugging-port=21222
I need to connect to one specific URL, for example facebook.com. I tried, for example:
const browserURL = "http://facebook.com:21222";
and also without the ":21222".
I'm using Windows 10 and Node v16.16.0.
Thanks for helping!

how to open multiple puppeteer instances without affecting their speed

When I open one Puppeteer instance it runs fast, but the more instances I open, the more time each one needs to load the URL and fill in information. Is that normal?
const puppeteer = require('puppeteer');
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto('https://example.com');
await page.screenshot({ path: 'example.png' });
await browser.close();
})();
Answer
The performance of multiple Puppeteer instances running on the same machine and testing a single application is highly dependent on the performance of that machine (e.g. 4 cores, 8 threads, Core i7-7700HQ).
On my local setup I could not run more than 10 parallel instances, and the performance drop became more noticeable with each instance I launched.
My Story
I faced a similar challenge when I was trying to simulate multiple users using the same application in parallel.
I know that Puppeteer and similar UI test automation tools are not good tools for stress-testing an application, and that there are better tools for that.
Nevertheless, my case was:
Run "user-like" behavior
From the other end of the world
Collect HAR files - that represent network timings of the browser interacting with 10-20 different systems
Analyze the behavior
My approach was as follows; maybe it helps you:
Create a puppeteer test
Enable headless running
Make it triggerable via curl
Dockerize it
Run the Docker image on 10 different machines (5-10 dockerized Puppeteer tests per machine)
Trigger them all at once via curl
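On a single machine, one way to soften the slowdown described above is to cap how many runs are in flight at once instead of starting them all together. The following is only a sketch: `runWithLimit` is a hypothetical helper name that does not appear in the answer, and each job stands in for one Puppeteer run (launch, goto, close).

```javascript
// Run async jobs with at most `limit` of them in flight at once.
// `jobs` is an array of functions that each return a promise
// (for example, one Puppeteer launch/goto/close cycle per job).
async function runWithLimit(jobs, limit) {
  const results = new Array(jobs.length);
  let next = 0; // index of the next job to start

  // Each worker pulls the next unstarted job until none are left.
  async function worker() {
    while (next < jobs.length) {
      const index = next++;
      results[index] = await jobs[index]();
    }
  }

  const workerCount = Math.max(1, Math.min(limit, jobs.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results; // same order as `jobs`
}
```

A reasonable starting value for `limit` is probably close to the number of CPU cores; the right number depends on the pages being loaded and has to be measured.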

Launch Puppeteer on a manual Page

Is it possible to navigate to a page manually, launch a Puppeteer script there, then navigate to a different page, launch a script there, and so on?
I already did a bit of research but couldn't find anything.
I need to autofill a calendar, but it's a little difficult to automate the whole process, so it would be nice if I could navigate manually and launch the script when needed.
Does anyone know if this is possible?
You could code an interactive console app, like the one explained here.
In that app, you would launch a browser with headless set to false, navigate where you want to go, and then type a command such as fillform in the console app to execute the Puppeteer code you want to run.
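A minimal sketch of such a console app, using Node's built-in readline module. The command name fillform comes from the answer above; `buildCommands`, `handleLine`, `startConsole`, the `#date` selector and the value typed into it are placeholders invented for illustration. `page` is assumed to be a Puppeteer Page from a browser launched with headless: false.

```javascript
const readline = require('readline');

// Map of command names to async actions. `page` is expected to be a
// Puppeteer Page; the selector and value below are placeholders.
function buildCommands(page) {
  return {
    fillform: async () => {
      await page.type('#date', '2024-01-31'); // hypothetical field and value
      return 'form filled';
    },
    url: async () => page.url(),
  };
}

// Look up the typed command and run it; unknown input gets a hint back.
async function handleLine(commands, line) {
  const name = line.trim();
  if (!Object.prototype.hasOwnProperty.call(commands, name)) {
    return `unknown command: ${name}`;
  }
  return commands[name]();
}

// Start a prompt on stdin; every entered line is treated as a command.
function startConsole(page) {
  const commands = buildCommands(page);
  const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
  rl.setPrompt('> ');
  rl.prompt();
  rl.on('line', async (line) => {
    try {
      console.log(await handleLine(commands, line));
    } catch (err) {
      console.error(err.message);
    }
    rl.prompt();
  });
  return rl;
}
```

After launching the browser and opening a page, calling startConsole(page) would leave the browser free for manual navigation while the terminal waits for commands.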
Not sure why someone down voted?
Yes, it's possible, but it's not recommended. It's better to work your way through the errors and understand how page automation really works; that is the point of Puppeteer. You can also already run JavaScript on a page in Chrome using the console in DevTools.
But if you want to navigate to a page manually and then run 'macros' on it from Node.js based on a condition, you'd want to do something like this:
Launch with headless: false (so you can see the browser).
Have your script/fill-in function wait for an event on the page, such as a request that indicates the page was refreshed. You might be able to use a page.on() event handler to trigger the code once the request has finished.
await page.setRequestInterception(true);
page.on('request', request => {
// Override headers
const headers = Object.assign({}, request.headers(), {
foo: 'bar', // set "foo" header
origin: undefined, // remove "origin" header
});
request.continue({headers});
});
(example from the Puppeteer documentation)
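The snippet above only rewrites request headers; it does not by itself detect that you have arrived on a new page. One possible way to realize the "wait for an event" idea is to listen for the Page's framenavigated event and run the fill-in code when the main frame reaches a matching URL. This is a sketch under assumptions: `runOnNavigation`, `matches` and `action` are hypothetical names, and the URL test is whatever condition suits your site.

```javascript
// Run `action` whenever the main frame lands on a URL that `matches` accepts.
// `page` is expected to be a Puppeteer Page (or anything with the same
// on()/mainFrame() shape); `matches` and `action` are supplied by the caller.
function runOnNavigation(page, matches, action) {
  page.on('framenavigated', (frame) => {
    // Ignore iframes; only react to the top-level document.
    if (frame !== page.mainFrame()) return;
    if (matches(frame.url())) {
      // Run asynchronously and log errors instead of leaving them unhandled.
      Promise.resolve()
        .then(() => action(page))
        .catch((err) => console.error(err));
    }
  });
}
```

For the calendar case, `matches` could be something like (url) => url.includes('/calendar') and `action` the function that fills in the form.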

Connecting Browsers in Puppeteer

Is it possible to connect a browser to puppeteer without instantiating it in puppeteer? For example, running an instance of chromium like a regular user and then connecting that to an instance of puppeteer in code?
The answer is yes and no.
You can connect to an existing browser using the connect function:
const browserURL = 'http://127.0.0.1:21222';
const browser = await puppeteer.connect({browserURL});
But for those two lines to work, you need to launch Chrome with the --remote-debugging-port=21222 argument.
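Note that browserURL is always the address of the debugging port, never the address of the website you want to control. Once connected, the tabs that are already open can be listed with browser.pages() and the one you need picked by its URL. A minimal sketch, assuming Chrome was started with the flag above; `findPageByUrl` and `attachToOpenTab` are hypothetical helper names, and the puppeteer module is passed in by the caller.

```javascript
// Return the first open page whose URL contains `fragment`, or null.
// `pages` is an array of objects exposing url(), like Puppeteer Page objects.
function findPageByUrl(pages, fragment) {
  return pages.find((p) => p.url().includes(fragment)) || null;
}

// Attach to a Chrome started with --remote-debugging-port and pick a tab.
// `puppeteer` is supplied by the caller, e.g. require('puppeteer').
async function attachToOpenTab(puppeteer, browserURL, fragment) {
  const browser = await puppeteer.connect({ browserURL });
  const page = findPageByUrl(await browser.pages(), fragment);
  return { browser, page };
}
```

A call could look like attachToOpenTab(require('puppeteer'), 'http://127.0.0.1:21222', 'facebook.com'). When finished, browser.disconnect() detaches Puppeteer while leaving the user's Chrome running.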
I believe you need to connect to an address ended with an id:
ws://127.0.0.1:9222/devtools/browser/{id}
When you launch Chrome with --remote-debugging-port, you'll see something like
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222 [17:57:55]
...
DevTools listening on ws://127.0.0.1:9222/devtools/browser/44b3c476-5524-497e-9918-d73fa39e40cf
The address on the last line is what you need, i.e.
const browser = await puppeteer.connect({
browserWSEndpoint: "ws://127.0.0.1:9222/devtools/browser/44b3c476-5524-497e-9918-d73fa39e40cf"
});
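Because the id at the end of that address changes on every launch, hard-coding it only works for one session. If you start Chrome yourself from Node (for example with child_process) and capture its output, the endpoint can be pulled out of the "DevTools listening on" line. A small sketch; `parseDevToolsEndpoint` is a hypothetical helper name, not part of Puppeteer.

```javascript
// Extract the WebSocket endpoint from Chrome's startup output.
// Returns the ws:// URL, or null if the line is not present.
function parseDevToolsEndpoint(output) {
  const match = /DevTools listening on (ws:\/\/\S+)/.exec(output);
  return match ? match[1] : null;
}
```

The returned string can then be passed as browserWSEndpoint to puppeteer.connect, exactly as in the snippet above.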

Download a file using Watir Webdriver and phantomjs

I am using Watir WebDriver with a headless browser (PhantomJS) to go to a website, log in, and download a file by clicking a JavaScript submit button. When I click submit, I am redirected with a 302 to a different address, which I can see under Network; this is the URL of the file to download. I am debugging with screenshots, so I can see that PhantomJS works fine up to that point, but after it clicks the submit button nothing happens. The whole procedure works fine in Firefox.
Using Watir WebDriver, how can I get that link, save it in a database, and redirect PhantomJS to download the file using that link? I have read GitHub pull requests, the official documentation, and blog posts, but I have not found a solution. I also tried getting the HTTP request headers without success: I have browser.cookie.to_a, and browser.headers only gives me an object like Watir::HTMLElementCollection:0x000000024b88c0.
Please share any suggestions or solutions; even a one-word suggestion is appreciated, as it might help me approach the problem. Thank you.
I was not able to find a solution using PhantomJS, but I solved the problem using watir-webdriver (0.9.1) with headless and Firefox (44.0).
These are the settings I used:
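require 'watir-webdriver'
require 'headless'

# Assumption: download_directory was defined elsewhere in the original; any writable directory works.
download_directory = File.join(Dir.pwd, 'downloads')
profile = Selenium::WebDriver::Firefox::Profile.new
profile['download.prompt_for_download'] = false
profile['browser.download.folderList'] = 2 # custom location
profile['browser.download.dir'] = download_directory
profile['browser.helperApps.neverAsk.saveToDisk'] = "application/pdf" # save PDFs without asking
profile['pdfjs.disabled'] = true # do not open PDFs in the built-in viewer
profile['pdfjs.firstRun'] = false
# Start a virtual display so Firefox can run without a screen.
headless = Headless.new
headless.start
browser = Watir::Browser.new(:firefox, :profile => profile)
browser.goto 'www.google.com'
browser.window.resize_to(1280, 720)
puts browser.title
puts browser.url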
