I am trying to scrape the following webpage: https://steamdb.info/app/730/graphs/
(I have gained permission from the website)
The problem is that the "Monthly Breakdown" table seems to be loaded by Javascript, and BeautifulSoup does not work. When using Selenium to open the webpage, it says that to see the table "You must have Javascript enabled.", which should be enabled when using Selenium. Here is my code:
from selenium import webdriver
options = webdriver.ChromeOptions()
options.add_argument("--enable-javascript")
browser = webdriver.Chrome(options=options)
browser.maximize_window()
url = "https://steamdb.info/app/730/graphs/"
browser.get(url)
Any ways to solve this problem?
How the page should look:
How it looks on Selenium:
Try this and see if you no longer get that error message:
options.add_argument("javascript.enabled", True)
You may also need to look into "waits" here to make sure the async operation on the webpage has time to load.
Update:
To enable or disable javascript :
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
chrome_options = Options()
#value 1 enables it , if you set to 2 it disables it
chrome_options.add_experimental_option( "prefs",{'profile.managed_default_content_settings.javascript': 1})
driver =webdriver.Chrome(r".\chromedriver.exe",options=chrome_options)
driver.get("https://www.google.com")
THis wont solve your issue
you can check if the javascript is enabled by typing below in your address bar:
chrome://settings/content/javascript?search=javascript
you can see that even if its enabled the website won't load properly.
it seems they have enabled security to avoid using selenium in thier website
Previous answer:
There is no command line argument called --enable-javascript chromium project try the above javascript-harmony flag instead.
below are the full list of supported chrome flags:
https://peter.sh/experiments/chromium-command-line-switches/#login-profile
please add screenshots and other information if more help is needed
Related
I am trying to create a script using python and selenium to automate the checkout process at bestbuy.ca.
I get all the way to the final stage where you click to review the final order, but get the following 403 forbidden message (as seen in the network response) when I try to click through to the final step.
Is there something server side that has detected that I am using selenium and preventing me to proceed?
How can I hide the fact that it is selenium being used?
These are the options I am using for selenium:
options = Options()
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument("start-maximized")
options.add_argument("--disable-extensions")
driver = webdriver.Chrome(options=options)
I currently have 10 second delays after each action(ie open page, wait, click add to cart, wait, click checkout, wait)
I have implemented a random useragent to be used on each run:
import fake_useragent
ua = UserAgent()
userAgent = ua.random
options.add_argument(f'user-agent={userAgent}')
I have also modified my chromedriver binary as per the comments in THIS THREAD
Error seen when proceeding to order review page:
After much testing the last few days, here are the options that have allowed me to bypass the restrictions I was facing.
Modified cdc_ string in my chromedriver
Chromedriver options:
options.add_argument('--disable-blink-features=AutomationControlled')
options.add_argument("--disable-extensions")
options.add_experimental_option('useAutomationExtension', False)
options.add_experimental_option("excludeSwitches", ["enable-automation"])
chrome_driver = webdriver.Chrome(options=options)
Change the property value of the navigator for webdriver to undefined:
chrome_driver.execute_script("Object.defineProperty(navigator, 'webdriver', {get: () => undefined})")
After all three of these were implemented I no longer faced any 403 error when navigating the site and the cart/checkout process.
In my case, either using code to control the browser, or simply starting Chrome through python and manually using the browser always leads to the 403 error, even just adding a product to the cart.
As you said, I think that this site someway knows that the user is using Selenium or some sort of automation tool and the server is blocking API requests.
Searching in stackoverflow I found this https://stackoverflow.com/a/52108199/3228768 but editing the chromedriver results anyway in a failure.
The only way I completed the flow is settings this options:
u = 'https://www.bestbuy.ca/en-ca/category/appliances/26517'
# relevant part start here
options = webdriver.ChromeOptions()
options.add_argument("--disable-blink-features")
options.add_argument("--disable-blink-features=AutomationControlled")
# relevant part ends here
driver = webdriver.Chrome(executable_path=r"chromedriver.exe", options=options)
driver.maximize_window()
driver.get(u)
In this way I managed to add a product to the cart. I think you could use it to proceed the flow until checkout.
Let me know.
Try this one: https://github.com/ultrafunkamsterdam/undetected-chromedriver
It avoids the selenium detection quite well, I've been having good result with it so far. Headless is not guaranteed though.
import undetected_chromedriver as uc
driver = uc.Chrome()
driver.get('https://www.bestbuy.ca/')
Problem Description:
I am currently trying to set up a Selenium webdriver in Java.
However every time I try to load this specific webpage: Expected Website I end up with this Unintentional Website. No matter which driver I use (Firefox,Chrome,Edge), I somehow always get redirected and I did not find any solution to overcome this. Please note that the page loads some JS during the page loading process. This might be causing this redirection.
However if I use a standard browser I get the Expected Website as wished.
Goal:
Load this website with a Selenium webdriver: Expected Website
Additional Information:
The code I am using so far:
System.setProperty("webdriver.gecko.driver", "E:/Downloads/geckodriver.exe");
File pathToBinary = new File(
"C:/Program Files (x86)/Mozilla Firefox/firefox.exe");
FirefoxBinary ffBinary = new FirefoxBinary(pathToBinary);
FirefoxProfile firefoxProfile = new FirefoxProfile();
driver = new FirefoxDriver(ffBinary,firefoxProfile);
driver.get("https://www.liketoknow.it/featured");
try {
Thread.sleep(10000);
}catch (InterruptedException e) {}
driver.quit();
The reason why this happens is the following:
If you want to open this page you have to have granted access. To have so you have to first login on the main webpage.
For other people having a similar issues of getting redirected:
Use a different user agent when you set up your Webdriver and switch to a either mobile or PC/MAC, depending on your needs.
Cheers
Have you seen that /featured is redirecting to the root site literally?
I would go for that first. Probably is something related with it and you will end with the same result if you connect to https://www.liketoknow.it/ in the first place.
I am using Watir Webdriver and a headless(phantomjs) browser to goto a website, login into it and click and download a file using javascript submit button.When I click on submit, I am redirected with 302 to a different address that i can see under my Network.This is url of the file to download.I am degugging using screenshots so i can see the phantomjs is working fine but after it hits on submit button, nothing happens.This whole procedure is working fine on firefox too.Using watir webdriver, how can i get that link and save it in database and redirect my phantomjs to download the file using that link?I tried reading github pull requests, official documentation and blog posts but i am unable to reach to any solution.Please provide me with suggestions or solutions. Even one word suggestion is also appreciated as it might help me to approach the problem.I have tried getting 'http request headers' but didn't succeed.I have browser.cookie.to_a and browser.headers is giving me only object like this Watir::HTMLElementCollection:0x000000024b88c0.Thank you
I was not to find solution to my question using Phantomjs but I have solved the problem using watirwebdriver(0.9.1) headless and firefox(44.0).
These are the settings i have used.
profile = Selenium::WebDriver::Firefox::Profile.new
profile['download.prompt_for_download'] = false
profile['browser.download.folderList'] = 2 # custom location
profile['browser.download.dir'] = download_directory
profile['browser.helperApps.neverAsk.saveToDisk'] = "application/pdf"
profile['pdfjs.disabled'] = true
profile['pdfjs.firstRun'] = false
headless = Headless.new
headless.start
browser = Watir::Browser.new(:firefox, :profile => profile)
browser.goto 'www.google.com'
browser.window.resize_to(1280, 720)
puts browser.title
puts browser.url
I'm trying to turn off javascript via the profile when opening using Selenium. This has work previously but now I've updated Selenium/Firfox I can't get it to work.
profile = webdriver.FirefoxProfile()
profile.set_preference('javascript.enabled', False)
driver = webdriver.Firefox(profile)
driver.implicitly_wait(30)
driver.get("http://www.enable-javascript.com/")
All other settings seem to change while using profile.set_preference() on other option and javascript.enabled exists and is set to True when I look at the Firefox settings about:config. Is it possible Javascript is being set to True after loading the profile or something?
FF version 43.0.3
Selenium version 2.48.0
Any suggestions on why this could be happening?
UPDATE
Adding profile.add_extension("path/to/noscript_security_suite-2.9.xpi"); to the above code with the downloaded extension as #alecxe suggested fixed the issue.
This issue affects selenium starting with 2.46.0, javascript.enabled is being ignored:
Firefox driver 2.46.0 regression - unable to set to non-js
As a workaround, load the noscript addon, see:
How to disable Javascript when using Selenium by JAVA?
I am using selenium for some browser automation. I need to install an extension in the browser for my work. I am doing it as follows:
import selenium
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
executable_path = "/usr/bin/chromedriver"
options = Options()
options.add_extension('/home/TheRookie/Downloads/extensionSamples/abhcfceiempjmchhhdhbnkbimnfpckgl.crx')
browser = webdriver.Chrome(executable_path=executable_path, chrome_options=options)
The browser is starting fine but I am prompted with a pop-up to confirm that I want to add the extension as follows:
and after I get this pop-up, Python soon returns with the following exception:
selenium.common.exceptions.WebDriverException: Message: u'unknown
error: failed to wait for extension background page to load:
chrome-extension://abhcfceiempjmchhhdhbnkbimnfpckgl/toolbar.html\nfrom
unknown error: page could not be found:
chrome-extension://abhcfceiempjmchhhdhbnkbimnfpckgl/toolbar.html\n
(Driver info: chromedriver=2.12.301324
(de8ab311bc9374d0ade71f7c167bad61848c7c48),platform=Linux
3.13.0-39-generic x86_64)'
I tried handling the popup as a regular JavaScript alert using the following code:
alert = browser.switch_to_alert()
alert.accept()
However, this doesn't help. Could anyone please tell me how do I install this extension without the popup or a way to accept the popup? Any help would be greatly appreciated. Thanks!
Usually, you cannot test inline installation of a Chrome extension with just Selenium, because of that installation dialog. There are a few examples in the wild that show how to use external tools outside Selenium to solve this problem, but these are not very portable (i.e. platform-specific) and rely on a state of Chrome's UI, which is not guaranteed to be consistent.
But that does not mean that you cannot test inline installation. If you replace chrome.webstore.install with a substitute that behaves like the chrome.webstore.install API (but without the dialog), then the end-result is the same for all intents and purposes.
"Behaves like chrome.webstore.install" consists of two things:
Same behavior in error reporting and callback invocation.
An extension is installed.
I have just set up such an example on Github, which includes the source code of the helper extension/app and a few examples using Selenium (Python, Java). I suggest to read the README and the source code to get a better understanding of what happens: https://github.com/Rob--W/testing-chrome.webstore.install.
The sample does not require the tested extension to be available in the Chrome Web store. It does not even connect to the Chrome Web store. In particular, it does not check whether the site where the test runs is listed as a verified website, which is required for inline installation to work.
I had some really big code which I would have to re-write if I had to use Java. Luckily, python has a library for automating GUI events called ldtp. I used that to automate the clicking on the "Add" button. I did something on the following lines:
from ldtp import *
from threading import Thread
import selenium
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
def thread_function():
for i in range(5):
if activatewindow('Confirm New Extension'):
generatekeyevent('<left><space>')
break
time.sleep(1)
def main():
executable_path = "/usr/bin/chromedriver"
options = Options()
options.add_extension('/home/TheRookie/Downloads/extensionSamples/abhcfceiempjmchhhdhbnkbimnfpckgl.crx')
thread.start()
browser = webdriver.Chrome(executable_path=executable_path, chrome_options=options)
Hope it helps somebody.