Trick elapsed visitation time when visiting a website using Python & selenium - javascript

OK, so I am trying to create a program that lets the background of my GNOME desktop tap into the stream of wallpapers used by Google's Chromecast devices.
Currently I use a loop function in Python that uses selenium and a Chrome webdriver to get the images that are dynamically displayed here:
https://clients3.google.com/cast/chromecast/home/
My function works exactly like I want it to. The problem is that whenever I visit this website in a browser, it starts displaying wallpapers in a random order. There seems to be a big selection of possible wallpapers, but any single page load only ever cycles through about 5 of them. Since my script reloads the page on each loop, I only get about 5 different wallpapers out of it, while many more should be available via the website.
That leads me to my question: can I use selenium for Python to somehow trick the website into thinking I've been around longer than just a few seconds, so that it might show me a different wallpaper?
NB: I know I could also get the wallpapers from non-dynamic websites such as this one; I already got that to work, but the goal now is to actually tap into the live Chromecast stream. I searched for an API for it, but couldn't find one, so I decided to go with my current approach.
My current code:
import io
import os
from PIL import Image
from pyvirtualdisplay import Display
from random import shuffle
import requests
import sched
from selenium import webdriver
import subprocess
import time

s = sched.scheduler(time.time, time.sleep)

def change_desktop():
    # Render the page in a headless display and grab the background image URL.
    display = Display(visible=0, size=(800, 600))
    display.start()
    browser = webdriver.Chrome()
    urllist = ["https://clients3.google.com/cast/chromecast/home/v/c9541b08",
               "https://clients3.google.com/cast/chromecast/home"]
    shuffle(urllist)
    browser.get(urllist[0])
    element = browser.find_element_by_id("picture-background")
    image_source = element.get_attribute("src")
    browser.quit()
    display.stop()

    # Download the image and save it next to the script.
    request = requests.get(image_source)
    image = Image.open(io.BytesIO(request.content))
    image_format = image.format
    current_dir = os.path.dirname(os.path.realpath(__file__))
    temp_local_image_location = current_dir + "/interactive_wallpaper." + image_format
    image.save(temp_local_image_location)

    # Point the GNOME desktop background at the saved file.
    subprocess.Popen(["/usr/bin/gsettings", "set", "org.gnome.desktop.background",
                      "picture-uri", "'" + temp_local_image_location + "'"],
                     stdout=subprocess.PIPE)

    # Reschedule: pass the function itself, not its result.
    # Writing change_desktop() here would call it immediately and recurse
    # instead of waiting 30 seconds.
    s.enter(30, 1, change_desktop)

s.enter(30, 1, change_desktop)
s.run()
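One way to test the "being around longer" idea is to skip the reload entirely: keep a single browser session open and poll the background element for src changes. Below is a minimal sketch under the assumption (unverified) that the page rotates wallpapers client-side while it stays open; the element id comes from the code above, and the polling intervals are arbitrary:

# Sketch: keep one session open and poll for wallpaper changes instead of
# reloading. Assumes (unverified) the page rotates its background while open.
import time
from selenium import webdriver

browser = webdriver.Chrome()
browser.get("https://clients3.google.com/cast/chromecast/home")

seen = set()
deadline = time.time() + 10 * 60  # observe for ten minutes (arbitrary)
while time.time() < deadline:
    element = browser.find_element_by_id("picture-background")
    src = element.get_attribute("src")
    if src and src not in seen:
        seen.add(src)
        print("new wallpaper: " + src)
    time.sleep(5)  # poll interval (arbitrary)

browser.quit()

If more than ~5 distinct URLs accumulate in one session, the rotation depends on the page staying open rather than on reloads.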

Related

Is it possible to send a key code to an application that is not in front?

I'm writing a simple Automator script in JavaScript.
I want to send a key code (or keystroke) to an OS X application that is not in front.
Basically, I want to run this code and get on with my own things while the script opens a certain application, writes text, and hits Enter - all of this without disturbing my other work.
I want something like this:
Application("System Events").processes['someApp'].windows[0].textFields[0].keyCode(76);
In the Script Dictionary, there is a keyCode method under the Processes Suite.
The above code, however, throws the following error:
execution error: Error on line 16: Error: Named parameters must be passed as an object. (-2700)
I understand that the following code works fine, but it requires the application to be running in front:
// KeyCode 76 => "Enter"
Application("System Events").keyCode(76);
UPDATE: I'm trying to search for something in iTunes (Apple Music). Is this possible without bringing the iTunes app to the front?
It's possible to write text in an application that is not in front with the help of GUI Scripting (accessibility), but:
- You need to know what UI elements are in the window of your specific application, and the attributes and properties of the specific element.
- You need to add your script in System Preferences --> Security & Privacy --> Accessibility.
Here's a sample script (tested on macOS Sierra) to write some text at the position of the cursor in the front document of the "TextEdit" application.
Application("System Events").processes['TextEdit'].windows[0].scrollAreas[0].textAreas[0].attributes["AXSelectedText"].value = "some text" + "\r" // r is the return KEY
Update
To send some key code to a background application, you can use the CGEventPostToPid() method of the Carbon framework.
Here's the script to search some text in iTunes (Works on my computer, macOS Sierra and iTunes Version 10.6.2).
ObjC.import('Carbon')
iPid = Application("System Events").processes['iTunes'].unixId()
searchField = Application("System Events").processes['iTunes'].windows[0].textFields[0]
searchField.buttons[0].actions['AXPress'].perform()
delay(0.1) // increase it, if no search
searchField.focused = true
delay(0.3) // increase it, if no search
searchField.value = "world" // the searching text
searchField.actions["AXConfirm"].perform()
delay(0.1) // increase it, if no search
// *** Carbon methods to send the Enter key to a background application ***
enterDown = $.CGEventCreateKeyboardEvent($(), 76, true);
enterUp = $.CGEventCreateKeyboardEvent($(), 76, false);
$.CGEventPostToPid(iPid, enterDown);
delay(0.1)
$.CGEventPostToPid(iPid, enterUp);

Scrapy Splash not respecting Rendering "wait" time

I'm using Scrapy and Splash to scrape this page: https://www.athleteshop.nl/shimano-voor-as-108mm-37184
Here's the image I get in Scrapy Shell with view(response):
[screenshot: the page rendered via view(response) in Scrapy Shell]
I need the barcode highlighted in red, but it's generated by JavaScript, as can be seen in the source code in Chrome with F12.
However, although the page displays correctly in both Scrapy Shell and the Splash localhost, and although the Splash localhost gives me the right HTML, the barcode I want to select always comes back as None with response.xpath("//table[@class='data-table']//tr[@class='even']/td[@class='data last']/text()").extract_first().
The selector isn't the problem since it works in Chrome's source code.
I've been looking for the answer on the web and SO for two days and no one seems to have the same problem. Is it just that Splash doesn't support it?
The settings are the classic ones, as follows:
SPLASH_URL = 'http://192.168.99.100:8050/'

DOWNLOADER_MIDDLEWARES = {
    'scrapy_splash.SplashCookiesMiddleware': 723,
    'scrapy_splash.SplashMiddleware': 725,
    'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware': 810,
}

SPIDER_MIDDLEWARES = {
    'scrapy_splash.SplashDeduplicateArgsMiddleware': 100,
}

DUPEFILTER_CLASS = 'scrapy_splash.SplashAwareDupeFilter'
HTTPCACHE_STORAGE = 'scrapy_splash.SplashAwareFSCacheStorage'
My code is as follows (the parse part aims at clicking on the link provided by a search engine inside the website; that part works fine):
def parse(self, response):
    try:
        link = response.xpath("//li[@class='item last']/a/@href").extract_first()
        yield SplashRequest(link, self.parse_item, endpoint='render.html', args={'wait': 20})
    except Exception as e:
        print (str(e))

def parse_item(self, response):
    product = {}
    product['name'] = response.xpath("//div[@class='product-name']/h1/text()").extract_first()
    product['ean'] = response.xpath("//table[@class='data-table']//tr[@class='even']/td[@class='data last']/text()").extract_first()
    product['price'] = response.xpath("//div[@class='product-shop']//p[@class='special-price']/span[@class='price']/text()").extract_first()
    product['image'] = response.xpath("//div[@class='item image-photo']//img[@class='owl-product-image']/@src").extract_first()
    print (product['name'])
    print (product['ean'])
    print (product['image'])
The prints for the name and the image URL work perfectly fine, since those fields are not generated by JavaScript.
The code is alright, the settings are fine, and the Splash localhost shows me something good, but my selector doesn't work during the execution of the script (which shows no errors), nor in Scrapy Shell.
The problem might be that Scrapy Splash renders instantly without respecting the wait time (20 secs!) passed as an argument. What did I do wrong?
Thanks in advance.
It doesn't seem to me that the content of the barcode field is generated dynamically; I can see it in the page source and can extract it from scrapy shell with response.css('.data-table tbody tr:nth-child(2) td:nth-child(2)::text').extract_first().
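If the field really is in the static HTML, this can be verified without Splash at all. A minimal sketch, assuming requests and parsel are installed (parsel is the selector library Scrapy itself builds on), using the URL from the question and the selector from this answer:

# Check the raw page without any JavaScript rendering.
import requests
from parsel import Selector

response = requests.get('https://www.athleteshop.nl/shimano-voor-as-108mm-37184')
sel = Selector(text=response.text)

# Same CSS selector as above; prints the barcode if the field is static HTML.
ean = sel.css('.data-table tbody tr:nth-child(2) td:nth-child(2)::text').extract_first()
print(ean)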

javascript + Selenium WebDriver cannot load list of followers in Instagram

I am learning JavaScript, Node.js, and Selenium WebDriver.
As part of my education I am developing a simple bot for Instagram.
To emulate a browser I use the Chrome web driver.
I faced a problem when trying to get the list of followers, and the follower count, for an account.
This code opens the Instagram page, enters credentials, goes to some account, and opens that account's followers.
Data like the username and password come from settings.json.
var webdriver = require('selenium-webdriver'),
    by = webdriver.By,
    Promise = require('promise'),
    settings = require('./settings.json');

var browser = new webdriver
    .Builder()
    .withCapabilities(webdriver.Capabilities.chrome())
    .build();
browser.manage().window().setSize(1024, 700);
browser.get('https://www.instagram.com/accounts/login/');
browser.sleep(settings.sleep_delay);
browser.findElement(by.name('username')).sendKeys(settings.instagram_account_username);
browser.findElement(by.name('password')).sendKeys(settings.instagram_account_password);
browser.findElement(by.xpath('//button')).click();
browser.sleep(settings.sleep_delay);
browser.get('https://www.instagram.com/SomeAccountHere/');
browser.sleep(settings.sleep_delay);
browser.findElement(by.partialLinkText('followers')).click();
This part should open all the followers, but it is not working:
var FollowersAll = browser.findElement(by.className('_4zhc5 notranslate _j7lfh'));
I also tried by XPath:
var FollowersAll = browser.findElement(by.xpath('/html/body/div[2]/div/div[2]/div/div[2]/ul/li[3]/div/div[1]/div/div[1]/a'));
When I run in the browser's console:
var i = document.getElementsByClassName('_4zhc5 notranslate _j7lfh');
it works fine.
I run the code in debug mode (using WebStorm) and it shows in each case that the variable "FollowersAll" is undefined.
The same happens when I try to check the follower count for the account.
Thanks in advance.
[screenshot: example of the selected element]
In the DOM, a class name may be used multiple times; in that case, findElement by className won't work.
The XPath should be relative, not absolute.
Try an XPath with a unique HTML attribute, for example:
//div[@id='value']
In the Chrome browser, open Developer Tools (press F12). Once you have framed an XPath, press Ctrl+F and paste it in. If it reports 1 of 1, you can safely use that XPath.
If it reports 1 of many, you need to dig deeper to get a more specific XPath. An illustration of this advice follows below.
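For illustration, here is the same idea in Python selenium (the locator logic is identical from JavaScript): wait explicitly, then locate via a relative XPath on a unique attribute rather than an absolute path or a compound class name. The role='dialog' attribute is only an assumption about Instagram's markup and may need adjusting, and the sketch assumes you are already logged in, as in the question's flow:

# Sketch (Python for illustration). Explicit waits plus a relative XPath
# anchored on a unique attribute, instead of absolute paths or compound
# class names like '_4zhc5 notranslate _j7lfh'.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://www.instagram.com/SomeAccountHere/')  # account from the question

wait = WebDriverWait(driver, 10)

# Click the "followers" link, as the question's script does.
wait.until(EC.element_to_be_clickable((By.PARTIAL_LINK_TEXT, 'followers'))).click()

# Relative XPath on a (hopefully) unique attribute; role='dialog' is an assumption.
dialog = wait.until(EC.presence_of_element_located((By.XPATH, "//div[@role='dialog']")))
print(dialog.is_displayed())
driver.quit()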

JMeter - WebDriver Sampler - waitForPopUp

I am trying to work out a comparable command to use in the JMeter WebDriver Sampler (JavaScript) for the waitForPopUp command. There must be a way. I have something that works for waiting for an element, but I can't work it out for a popup.
Update
I am using this code for waiting for an element:
var wait = new support_ui.WebDriverWait(WDS.browser, 5000)
WaitForLogo = function() {
    var logo = WDS.browser.findElement(org.openqa.selenium.By.xpath("//img[@src='/images/power/ndpowered.gif']"))
}
wait.until(new com.google.common.base.Function(WaitForLogo))
And this works, but I can't work out how to reuse it to wait for a popup that has no name. In Java I have used:
selenium.waitForPopUp("_blank", "30000");
selenium.selectWindow("_blank");
And that works, but I can't work out comparable JavaScript that will work in JMeter for performance testing, as I can't get Java working in JMeter.
I was able to get this working using:
var sui = JavaImporter(org.openqa.selenium.support.ui)
and:
wait.until(sui.ExpectedConditions.numberOfWindowsToBe(2))
In WebDriver Sampler you have the following methods:
WDS.browser.switchTo.frame('frame name or handle') - for switching to a frame
WDS.browser.switchTo.window('window name or handle') - for switching to a window
WDS.browser.switchTo.alert() - for switching to a modal dialog
WDS.browser.getWindowHandles() - for getting all open browser window handles
See the JavaDoc for the WebDriver.switchTo method and the guide The WebDriver Sampler: Your Top 10 Questions Answered for more details.
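For comparison, the generic WebDriver pattern behind waitForPopUp/selectWindow is: wait until a second window handle appears, then switch to the handle that isn't the original. A sketch in Python selenium (WDS.browser in the sampler exposes this same WebDriver API from JavaScript); the URL, trigger step, and timeout are placeholders:

# Generic wait-for-popup pattern, shown in Python selenium for illustration.
from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get('https://example.com/')  # placeholder: a page that opens a popup
original = driver.current_window_handle

# ... trigger whatever opens the popup window here ...

# Equivalent of selenium.waitForPopUp: block until a second handle exists.
WebDriverWait(driver, 30).until(EC.number_of_windows_to_be(2))

# Equivalent of selenium.selectWindow: switch to the non-original handle.
for handle in driver.window_handles:
    if handle != original:
        driver.switch_to.window(handle)
        break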

Recursively iterate over multiple web pages and scrape using selenium

This is a follow-up question to the query I had about scraping web pages.
My earlier question: Pin down exact content location in html for web scraping urllib2 Beautiful Soup
This question is about doing the same, but recursively over multiple pages/views.
Here is my code:
from selenium.webdriver.firefox import webdriver

driver = webdriver.WebDriver()
driver.get('http://www.walmart.com/ip/29701960?page=seeAllReviews')

for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):
    title = review.find_element_by_class_name('BVRRReviewTitle').text
    rating = review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
    print title, rating
From the URL you'll see that nothing changes when we navigate to the second page, otherwise it wouldn't have been an issue. In this case the next-page button calls JavaScript from the server. Is there a way we can still scrape this using selenium in Python with just a slight modification of my presented code? Please let me know if there is.
Thanks.
Just click Next after reading each page:
from selenium.webdriver.firefox import webdriver

driver = webdriver.WebDriver()
driver.get('http://www.walmart.com/ip/29701960?page=seeAllReviews')

while True:
    for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):
        title = review.find_element_by_class_name('BVRRReviewTitle').text
        rating = review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
        print title, rating
    try:
        driver.find_element_by_link_text('Next').click()
    except:
        break

driver.quit()
Or if you want to limit the number of pages that you are reading:
from selenium.webdriver.firefox import webdriver

driver = webdriver.WebDriver()
driver.get('http://www.walmart.com/ip/29701960?page=seeAllReviews')

maxNumOfPages = 10  # for example
for pageId in range(2, maxNumOfPages + 2):
    for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):
        title = review.find_element_by_class_name('BVRRReviewTitle').text
        rating = review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
        print title, rating
    try:
        driver.find_element_by_link_text(str(pageId)).click()
    except:
        break

driver.quit()
I think this would work. Although the Python might be a little off, this should give you a starting point:
keep_going = True  # 'continue' is a reserved word in Python, so it cannot be a variable name
while keep_going:
    try:
        for review in driver.find_elements_by_class_name('BVRRReviewDisplayStyle3Main'):
            title = review.find_element_by_class_name('BVRRReviewTitle').text
            rating = review.find_element_by_xpath('.//div[@class="BVRRRatingNormalImage"]//img').get_attribute('title')
            print title, rating
        driver.find_element_by_name('BV_TrackingTag_Review_Display_NextPage').click()
    except:
        print "Done!"
        keep_going = False
