Click JavaScript tab using Mechanize and Ruby

I am using Mechanize with Ruby on Rails to scrape this website.
I want to click the "Trabajo En Sala" tab so that I can scrape the information in that tab.
I know that Mechanize doesn't support JavaScript, but I read here how someone used Mechanize to handle a JavaScript postback. I noticed that I have more or less the same problem and could probably use the same solution, for two reasons:
1) The tab's href uses the same __doPostBack() function:
<a id="ctl00_mainPlaceHolder_btnSala" href="javascript:__doPostBack('ctl00$mainPlaceHolder$btnSala','')">Trabajo en sala</a>
2) When I look at the source code, I can clearly see the form that is related to the JavaScript __doPostBack function:
So I read that post and tried to adapt the solution to my case. This is what I have so far:
require 'mechanize'

task :scraper_test => [:environment] do
  agent = Mechanize.new
  page = agent.get("https://www.camara.cl/camara/diputado_detalle.aspx?prmid=968")
  form = page.form("aspnetForm.add_field!('__EVENTTARGET','')")
  form.add_field!('ctl00$mainPlaceHolder$btnSala','')
  tab = agent.submit(form)
end
P.S. I'm running this with rake inside a Rails app.
But when I run it, I get this error:
NoMethodError: undefined method `add_field!' for nil:NilClass
So, can you help me figure out the right way to do this? Thanks!

I just ran this in my console. You're getting this error
NoMethodError: undefined method `add_field!' for nil:NilClass
because this line passes the whole string, add_field! call included, as the form name, so page.form finds no matching form and returns nil:
form = page.form("aspnetForm.add_field!('__EVENTTARGET','')")
Change it to this to fix that error:
form = page.form("aspnetForm")
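Once the form lookup works, the remaining step is to mimic what __doPostBack(eventTarget, eventArgument) does in the browser: it writes its two arguments into the hidden fields __EVENTTARGET and __EVENTARGUMENT and submits the form together with the other hidden fields. The following is a minimal Python sketch of that idea using only the standard library, not a tested solution for this site. The SAMPLE_HTML markup, the "abc123" value, and the helper names are hypothetical stand-ins. In the Ruby code above, the equivalent would be setting __EVENTTARGET to 'ctl00$mainPlaceHolder$btnSala' rather than adding a field with that name.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputCollector(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def build_postback_payload(html, event_target, event_argument=""):
    """Return the form fields that __doPostBack(event_target, event_argument) would submit."""
    collector = HiddenInputCollector()
    collector.feed(html)
    payload = dict(collector.fields)  # keeps __VIEWSTATE and other hidden state
    payload["__EVENTTARGET"] = event_target
    payload["__EVENTARGUMENT"] = event_argument
    return payload


# Hypothetical, heavily shortened markup standing in for the real page.
SAMPLE_HTML = """
<form name="aspnetForm" method="post" action="diputado_detalle.aspx?prmid=968">
  <input type="hidden" name="__EVENTTARGET" value="" />
  <input type="hidden" name="__EVENTARGUMENT" value="" />
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
</form>
"""

payload = build_postback_payload(SAMPLE_HTML, "ctl00$mainPlaceHolder$btnSala")
body = urlencode(payload)  # form-encoded body to POST back to the form's action URL
```

Whether the server accepts such a request also depends on cookies and on any validation fields the real page carries, so treat this as a starting point.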

Related

Execute JavaScript function using Python requests_html

I am trying to execute a JavaScript function (run with a button click) within a session using Python's requests_html.
I understand that the regular requests library does not support JavaScript, so I am trying to use requests_html instead.
Here's what I have (using requests):
import requests

s = requests.Session()
r = s.post(url)
print(r.text)
r2 = s.post(url2)
print(r2.text)
url is the link to the page containing the button, and url2 is the POST request link that the button's JavaScript function calls. (I found url2 in the network tab of my browser inspector by clicking the button as a test.)
However, this does not work and I get this from r2.text:
<h2>Error(500): An error occurred.</h2>
<p>We are sorry but an unexpected error has occurred on our side while handling your request. In the meantime, please retry your request or try the following:</p>
To my understanding, an error 500 means that the issue is server-side, not client-side. However, clicking the button manually on the webpage works fine.
This brings me to attempting to execute the JavaScript function directly. I couldn't find anything in the requests_html documentation. I've also looked at Selenium, but that doesn't seem to be up to date.
It is also worth mentioning that the button inspector looks like this: <button onclick="registerInterest(72833,959320000, '')" type="button" class="btn btn-primary"><i class="far fa-clipboard"></i> Register Interest</button>
So essentially, I would like to execute registerInterest(72833,959320000, '') after my first POST request.
Any help would be greatly appreciated,
I will gladly provide any additional needed information.
You need to use Selenium to manipulate HTML elements. You can use code like this:
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By

# set chromedriver.exe path
driver = webdriver.Chrome(service=Service("C:\\chromedriver.exe"))
# implicit wait
driver.implicitly_wait(0.5)
# maximize browser
driver.maximize_window()
# launch URL
driver.get("https://www.tutorialspoint.com/index.htm")
# identify element (Selenium 4 style locator)
button = driver.find_element(By.XPATH, "//button[text()='Check it Now']")
# perform click
button.click()
print("Page title is: ")
print(driver.title)
# close browser
driver.quit()
Just check the Selenium docs and find the method that fits you best.
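As an alternative to clicking the button, Selenium's driver.execute_script() can run the page's own function directly once the page is loaded. The sketch below only builds the JavaScript statement; build_js_call is a hypothetical helper, and it assumes registerInterest is a global function on the page, as the onclick attribute in the question suggests.

```python
import json


def build_js_call(function_name, *args):
    """Return a JavaScript statement that calls function_name with JSON-encoded args."""
    # json.dumps turns Python numbers and strings into valid JavaScript literals.
    encoded_args = ", ".join(json.dumps(arg) for arg in args)
    return f"{function_name}({encoded_args});"


script = build_js_call("registerInterest", 72833, 959320000, "")
# With a Selenium driver that has already loaded the page, the call would be:
# driver.execute_script(script)
```

Encoding the arguments with json.dumps avoids quoting mistakes when an argument contains quotes or other special characters.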

How can I read the console log (e.g. messages like ok, connected..) of a particular website using Python?

I would like to know how to read the console log output of a particular website using Python, for automation.
Currently I am trying to read console data using Selenium, but I can't find a function for reading the logs: I can read the messages, but I cannot read the data in real time.
Is there any other library that I can use?
One way to do it is to launch Chrome with logging enabled: chrome.exe --enable-logging.
Then you can write a function that reads chrome_debug.log, for example something like this:
# Path taken from this answer; adjust it for your own Chrome profile.
log_path = r'C:\Users\spolm\AppData\Local\Google\Chrome\User Data\chrome_debug.log'
export = []
x = 0  # number of lines read
# errors="replace" avoids a UnicodeDecodeError on malformed bytes in the log
with open(log_path, encoding="utf-8", errors="replace") as source:
    for line in source:
        export.append(line.strip())
        x += 1
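To follow the log while Chrome is still writing to it, the file can be polled for text appended since the last read. This is a minimal sketch under that assumption; read_new_lines is a hypothetical helper, and it has not been tried against a real chrome_debug.log.

```python
def read_new_lines(path, offset=0):
    """Return (lines, new_offset) for complete lines appended to path since offset."""
    with open(path, "rb") as log:
        log.seek(offset)
        chunk = log.read()
    # Leave an unfinished final line in place so the next call picks it up whole.
    last_newline = chunk.rfind(b"\n")
    if last_newline == -1:
        return [], offset
    complete = chunk[: last_newline + 1]
    text = complete.decode("utf-8", errors="replace")
    return text.splitlines(), offset + len(complete)
```

A caller would keep the returned offset and call the function again in a loop, for example once per second with time.sleep(1), handling each batch of new lines as it arrives.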
So it turns out it's not that trivial. I didn't have time to try it, but I may have found some options:
Option A
Get console content with JS
Run JS with webdriver: driver.execute_script(script,*args)
Get value back to python from selenium
Option B
Chrome devtools API with Selenium:
Selenium side
DevTools side

Sending keys via Selenium to Google Auth fails (using Python and Firefox)

I am following a Django Tutorial by Marina Mele, which is pretty good but a bit outdated since it was last updated in 2016, I believe. I am now trying the Selenium testing and ran into the problem that I can send my e-mail address via Selenium but not the password. My code is
self.get_element_by_id("identifierId").send_keys(credentials["Email"])
self.get_button_by_id("identifierNext").click()
self.get_element_by_tag('input').send_keys(credentials["Passwd"])
self.get_button_by_id("passwordNext").click()
with these functions being defined as:
def get_element_by_id(self, element_id):
    return self.browser.wait.until(EC.presence_of_element_located(
        (By.ID, element_id)))

def get_element_by_tag(self, element_tag):
    return self.browser.wait.until(EC.presence_of_element_located(
        (By.TAG_NAME, element_tag)))

def get_button_by_id(self, element_id):
    return self.browser.wait.until(EC.element_to_be_clickable(
        (By.ID, element_id)))
Most of the advice I read on this issue centered on waiting until the element appears. However, these functions already cover that. I am using by_tag because the current version of Google Authentication uses an input for the password field that has no ID of its own but is a div/div/div child of the div with the "passwordIdentifier" id. I have also tried using XPath, but it does not seem to make a difference.
Also, Selenium seems to be capable of finding the elements, at least when I check with print commands, so locating the element does not seem to be the problem. However, from what I can see in the Firefox browser while the test runs, Selenium fails to send the keys. What could be the issue? Why is Selenium struggling to send the password keys to the authentication form?
Thanks to everyone in advance!
When you search for an input element on the Google registration page, you will find 8 WebElements. I think that is the origin of your problem.
I would use another locator, such as the XPath //input[@name='password'], or a By on the name instead of the tag name, as implemented below:
def get_element_by_name(self, element_tag):
    return self.browser.wait.until(EC.presence_of_element_located(
        (By.NAME, element_tag)))
and:
self.get_element_by_id("identifierId").send_keys(credentials["Email"])
self.get_button_by_id("identifierNext").click()
self.get_element_by_name('password').send_keys(credentials["Passwd"])
self.get_button_by_id("passwordNext").click()
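The difference between the two locators can be illustrated without a browser. The sketch below uses the standard library's ElementTree on hypothetical, heavily simplified markup (not Google's real page) to show that a tag-name lookup matches several inputs and returns a non-password one first, while the name-based XPath matches only the password field.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified markup: several <input> elements, one password field.
FORM = ET.fromstring("""
<form>
  <input type="hidden" name="continue" />
  <input type="email" name="identifier" />
  <input type="password" name="password" />
</form>
""")

by_tag = FORM.findall(".//input")                      # every input on the page
by_name = FORM.findall(".//input[@name='password']")   # only the password field

first_by_tag = by_tag[0].get("name")   # what a first-match tag lookup would pick
only_by_name = by_name[0].get("type")
```

Selenium's presence_of_element_located returns the first match in document order, so with a tag locator the keys may be sent to an input that is hidden or unrelated.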

jQuery.tokenInput.js script in JavaScript not working

I am making an application that is written purely in JavaScript (frontend and backend). I am using jQuery.tokenInput.js and I am having trouble getting the plugin to work with a server-side script as its data source.
First of all, it's not logging any error messages, so I don't even know whether the issue is on my end.
I've created a route in the application, /autocomplete/tags, which accepts a q parameter.
So when I request something like /autocomplete/tags?q=r I get the following result on the page:
[{"tag_name":"Android","_id":"ooJaBpZ6MShmzbshY"},{"tag_name":"RPG","_id":"KpvAqCRqKKP5rbGLD"}]
So now when I initialize the plugin like this
$('#tag_input').tokenInput("/autocomplete/tags", {
  theme: "facebook",
  propertyToSearch: "tag_name",
  tokenLimit: 5
});
It changes the input and everything. I've even tried with constant data and it seems to work but not with a script for some reason.
Is there a way I can debug/troubleshoot? Can I somehow turn on logging for this plugin? I don't actually see any issue with the way that I am doing it. I've looked at the demos and they return JSON in exactly the same way.
If you've got any ideas, it would be great!
The JSON returned from an external service must be sent with a Content-Type header of application/json. We found that this service was returning text/html instead.
Information about how to specify the content type with Meteor can be found in this question.
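For illustration, here is what setting that header explicitly looks like on the server side. This is a Python WSGI stand-in, not Meteor code: tags_app is a hypothetical handler, and TAGS copies the sample output from the question.

```python
import json

# Sample data copied from the question's output.
TAGS = [
    {"tag_name": "Android", "_id": "ooJaBpZ6MShmzbshY"},
    {"tag_name": "RPG", "_id": "KpvAqCRqKKP5rbGLD"},
]


def tags_app(environ, start_response):
    """WSGI app that serves TAGS as JSON with an explicit Content-Type."""
    body = json.dumps(TAGS).encode("utf-8")
    start_response("200 OK", [
        # The header the plugin's AJAX request relies on to parse the body as JSON.
        ("Content-Type", "application/json; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

To try it locally, the app can be served with wsgiref.simple_server.make_server("", 8000, tags_app).serve_forever() and the response headers checked in the browser's network tab.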

Python Selenium get javascript document

I have a webpage that contains some information that I am interested in. However, that information is generated by JavaScript.
If you do something like the following:
from selenium import webdriver

browser = webdriver.Chrome()
browser.set_window_size(1000, 1000)
browser.get('https://www.xxx.com') # cannot make the web public, sorry
print(browser.page_source)
It only prints a few JavaScript functions and some headers, which don't contain the information that I want (description of suppliers, etc.). So when I try to collect that information using Selenium, browser.find_element_by_class_name does not find the element I want either.
I tried the code below, assuming it would have the same effect as typing document in the JavaScript console, but obviously it does not.
result = browser.execute_script("document")
print(result)
and it returns None...
However, if I open the page in Chrome, right-click the element, and choose Inspect Element, I can see the populated source code. See the attached picture.
Also, I was inspired by this comment, which helps a lot.
I can open the JavaScript console in Chrome, and if I type in
document
I can see the complete HTML sitting there, which is exactly what I want. Is there a way to store the JS-populated source code using Selenium?
I've read some posts saying that it requires some security work to store the populated document on the client's side.
I hope I have made myself clear, and I appreciate any suggestion or correction.
(Note: I have zero experience with JS, so a detailed explanation would be greatly appreciated!)
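One likely reason the execute_script("document") call returns nothing is that Selenium runs the script as the body of a function, so a value only comes back when the script uses return. A minimal sketch of fetching the populated markup that way follows; rendered_html is a hypothetical helper, and it assumes the page's JavaScript has already finished populating the DOM when it is called.

```python
def rendered_html(driver):
    """Return the live DOM as HTML; driver is any object with Selenium's execute_script()."""
    # Without `return`, execute_script() runs the expression and hands back None.
    return driver.execute_script("return document.documentElement.outerHTML;")
```

With the browser object from the question this would be called as html = rendered_html(browser). If the content loads late, an explicit wait for one of the target elements before the call may be needed.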
