How to visit a page using requests with a cookie? - javascript

I want to visit zoomeye.org using the requests module. The cookie, copied from Firebug, is as follows:
__jsluid=470133a1338c0be13b6fdccf396772c3; csrftoken=WG6eSMS9XaLZfLjICiin8esg1qO3UOFl; Hm_lvt_e58da53564b1ec3fb2539178e6db042e=1448411456; Hm_lpvt_e58da53564b1ec3fb2539178e6db042e=1448505898; __jsl_clearance=1448505830.313|0|EwXSRp%2BrIEF5DR0E5WALlzLMV2Q%3D
The script that reads the web page content:
import requests
headers = {
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
"Accept-Encoding":"gzip, deflate",
"Accept-Language": "en-GB,en;q=0.5",
"Connection": "keep-alive",
"Host": "www.zoomeye.org",
"Referer": "https://www.zoomeye.org/",
"User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:41.0) Gecko/20100101 Firefox/41.0"
}
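# Read the raw cookie string saved from the browser
with open("cookie.txt", "r") as f:
    data = f.read()
cookieDict = {}
for item in data.split(";"):
    # Strip the space after each ";" and split on the first "=" only
    key, _, value = item.strip().partition("=")
    cookieDict[key] = value
url = "https://www.zoomeye.org/search?q=apache"
r = requests.get(url, cookies=cookieDict, headers=headers)
print(r.content)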
But I fail to read the web page content; the output is as follows:
<script>var dc="";var t_d={hello:"world",t_c:function(x){if(x==="")return;if(x.s
lice(-1)===";"){x=x+" ";};if(x.slice(-2)!=="; "){x=x+"; ";};dc=dc+x;}};(function
(a){eval(function(p,a,c,k,e,d){e=function(c){return(c<a?"":e(parseInt(c/a)))+((c
=c%a)>35?String.fromCharCode(c+29):c.toString(36))};if(!''.replace(/^/,String)){
while(c--)d[e(c)]=k[c]||e(c);k=[function(e){return d[e]}];e=function(){return'\\
w+'};c=1;};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);
return p;}('b d=[5,4,0,1,2,3];b o=[];b p=0;g(b i=d.c;i--;){o[d[i]]=a[i]}o=o.m(\'
\');g(b i=0;i<o.c;i++){l(o.q(i)===\';\'){s(o,p,i);p=i+1}}s(o,p,o.c);j s(t,r,n){k
.h(t.y(r,n))};w("f.e=f.e.v(/[\\?|&]u-x/, \'\')",z);',36,36,'|||||||||||var|lengt
h||href|location|for|t_c||function|t_d|if|join||||charAt||||captcha|replace|setT
imeout|challenge|substring|1500'.split('|'),0,{}));})(['45 GMT;Path=/;', ' 26-No
v-15 03:52:', '__jsl_clearance=1448506365.', '687|0|rtcCTV', 'xuWxRiE8%2BC0', 'W
WncvYkCpQ%3D;Expires=Thu,']);document.cookie=dc;</script>
Where is the problem? If you know a better solution for this question, please tell me. Thanks.

For some reason the website does not like your user agent. Remove the user agent header and it will work.
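A minimal sketch of what that change could look like, assuming the cookie string is in the `name=value; name=value` form shown in the question. The helper names `parse_cookie_string` and `build_request_kwargs` are made up for illustration and are not part of requests.

```python
def parse_cookie_string(raw):
    """Turn a browser cookie string such as 'a=1; b=2' into a dict."""
    cookies = {}
    for item in raw.split(";"):
        item = item.strip()  # drop the space that follows each ';'
        if not item:
            continue
        # partition splits on the first '=' only, so '=' inside a value survives
        key, _, value = item.partition("=")
        cookies[key] = value
    return cookies


def build_request_kwargs(raw_cookie, headers):
    """Return keyword arguments for requests.get() with User-Agent removed."""
    cleaned = {k: v for k, v in headers.items() if k.lower() != "user-agent"}
    return {"cookies": parse_cookie_string(raw_cookie), "headers": cleaned}
```

With the question's variables this would be called as `requests.get(url, **build_request_kwargs(data, headers))`. Note that requests then sends its own default `python-requests/...` User-Agent rather than no User-Agent at all.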

Related

How to get html from scraped script data? [duplicate]

This question already has answers here:
Wait page to load before getting data with requests.get in python 3
I have scraped a website using the following code:
import requests
url = "https://jobs.51job.com/beijing/we03/p1000/"
payload={}
headers = {
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
'Accept-Language': 'en-US,en;q=0.9,zh-HK;q=0.8,zh;q=0.7,zh-CN;q=0.6,an;q=0.5,zh-TW;q=0.4',
'Cache-Control': 'no-cache',
'Connection': 'keep-alive',
'Cookie': '_uab_collina=165942422997031185893743; guid=eb82eddafe61ebf8230a4198f833ae79; nsearch=jobarea%3D%26%7C%26ord_field%3D%26%7C%26recentSearch0%3D%26%7C%26recentSearch1%3D%26%7C%26recentSearch2%3D%26%7C%26recentSearch3%3D%26%7C%26recentSearch4%3D%26%7C%26collapse_expansion%3D; sensorsdata2015jssdkcross=%7B%22distinct_id%22%3A%22eb82eddafe61ebf8230a4198f833ae79%22%2C%22first_id%22%3A%2218257038d65902-00814c9fd9cb0b5-26021a51-2359296-18257038d66448%22%2C%22props%22%3A%7B%22%24latest_traffic_source_type%22%3A%22%E7%9B%B4%E6%8E%A5%E6%B5%81%E9%87%8F%22%2C%22%24latest_search_keyword%22%3A%22%E6%9C%AA%E5%8F%96%E5%88%B0%E5%80%BC_%E7%9B%B4%E6%8E%A5%E6%89%93%E5%BC%80%22%2C%22%24latest_referrer%22%3A%22%22%7D%2C%22identities%22%3A%22eyIkaWRlbnRpdHlfY29va2llX2lkIjoiMTgyNTcwMzhkNjU5MDItMDA4MTRjOWZkOWNiMGI1LTI2MDIxYTUxLTIzNTkyOTYtMTgyNTcwMzhkNjY0NDgiLCIkaWRlbnRpdHlfbG9naW5faWQiOiJlYjgyZWRkYWZlNjFlYmY4MjMwYTQxOThmODMzYWU3OSJ9%22%2C%22history_login_id%22%3A%7B%22name%22%3A%22%24identity_login_id%22%2C%22value%22%3A%22eb82eddafe61ebf8230a4198f833ae79%22%7D%2C%22%24device_id%22%3A%2218257038d65902-00814c9fd9cb0b5-26021a51-2359296-18257038d66448%22%7D; privacy=1659421937; 
search=jobarea%7E%60000000%7C%21ord_field%7E%601%7C%21recentSearch0%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA02%A1%FB%A1%FA%A1%FB%A1%FA05%A1%FB%A1%FA05%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch1%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA05%A1%FB%A1%FA05%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch2%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA06%A1%FB%A1%FA05%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch3%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA07%A1%FB%A1%FA05%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch4%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA07%A1%FB%A1%FA03%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21collapse_expansion%7E%601%7C%21; acw_tc=ac11000116594238560812054e00dff69255f53eeca27179c31d9ea4800a4f; acw_sc__v2=62e8cc702340c2244fcf4b283cf913fd254d0022; ssxmod_itna=euGQDILDC3ql4iuB=D9DfxmTjG5dDyGoFaeax0yD9qiODUxn4iaDTZY56ClGB1iW3Q5DQTP=YFPl03irbEE7AWzbokGIjbnlD0aDbqF24IzDYYDtxBYDQxAYDGDDpcDzutGuD0IdDje5DFxi1Mx08DeKQ2=DDUWiw5iv=DC57D7KDnOqDAhYDmg4DRP5Deg4D9xKDwgRmq4G7DAyA9xi3n1DDBpi4DQnxp/I6947p7UEvhbSv+4Kfw8k3yEHs2xib0j8wnr=sU7LNi1RqZ00q7jbrG0hrFG+x5erYderrFirr8GxrBVtPWA5xeDi=Gqe03YD; 
ssxmod_itna2=euGQDILDC3ql4iuB=D9DfxmTjG5dDyGoFaexnK30QqrDlPfExj4re=Fhckx6sAH7/rk2LantiTTNkn4+u5TdniKmSG+418Qey7e=U0oux3x/8CFwILUelS63=+adRjx5tVPUZWZvnkwEWi6EeTcjH+NZ+hXrD=Dn4b+oOCceozEPYTzrY7O/Kah8mzQSgkZEpfaKH1nZcL0OvTH2osczWGK1yk/j5t2Zdwqun0PUkTMaCwxzruvxTFst6FaNDR3ZlclW=kxUbxXeg0Q7WmBeSD7Q/Ab=tfvNKg8PgA8KoIX5gl2pOFTfzi+zvuDbR3wuPb8YQoTzobqoYmkYgM=TboalCLG+6r+kWbcAhLBhz2wWnd8eGQCCPDoxW+PBhBAht7h4CWYQTEkYPeGygDon5imrORplQhfWG7mbYZG4npBBCWe=O8EFCGW4PD7QG4GcDG75iDD=; guid=dfed6851f35e3386f20279eac6d274df; nsearch=jobarea%3D%26%7C%26ord_field%3D%26%7C%26recentSearch0%3D%26%7C%26recentSearch1%3D%26%7C%26recentSearch2%3D%26%7C%26recentSearch3%3D%26%7C%26recentSearch4%3D%26%7C%26collapse_expansion%3D; search=jobarea%7E%60000000%7C%21ord_field%7E%600%7C%21recentSearch0%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0202%A1%FB%A1%FA37%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch1%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0202%A1%FB%A1%FA01%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA99%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch2%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA01%A1%FB%A1%FA%A1%FB%A1%FA01%A1%FB%A1%FA04%A1%FB%A1%FA08%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA01%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch3%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA01%A1%FB%A1%FA%A1%FB%A1%FA01%A1%FB%A1%FA04%A1%FB%A1%FA04%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA01%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%FB%A1%FA1%7C%21recentSearch4%7E%60000000%A1%FB%A1%FA000000%A1%FB%A1%FA0000%A1%FB%A1%FA00%A1%FB%A1%FA01%A1%FB%A1%FA%A1%FB%A1%FA06%A1%FB%A1%FA01%A1%FB%A1%FA02%A1%FB%A1%FA99%A1%FB%A1%FA9%A1%FB%A1%FA02%A1%FB%A1%FA%A1%FB%A1%FA0%A1%FB%A1%FA%A1%FB%A1%FA1%A1%
FB%A1%FA1%7C%21',
'DNT': '1',
'Pragma': 'no-cache',
'Sec-Fetch-Dest': 'document',
'Sec-Fetch-Mode': 'navigate',
'Sec-Fetch-Site': 'none',
'Sec-Fetch-User': '?1',
'Upgrade-Insecure-Requests': '1',
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36',
'sec-ch-ua': '".Not/A)Brand";v="99", "Google Chrome";v="103", "Chromium";v="103"',
'sec-ch-ua-mobile': '?0',
'sec-ch-ua-platform': '"Windows"'
}
response = requests.request("GET", url, headers=headers, data=payload)
print(response.text)
I expect to get the html code of https://jobs.51job.com/beijing/we03/p1000/
However, I got the following:
<html>
<script>
var arg1='759F9F39A9435F537AE8EBC86B15E2F2C821674D';
var _0x4818=['\x63\x73\x4b\x48\x77\x71\x4d\x49','\x5a\x73\x4b\x4a\x77\x72\x38\x56\x65\x41\x73\x79','\x55\x63\x4b\x69\x4e\x38\x4f\x2f\x77\x70\x6c\x77\x4d\x41\x3d\x3d','\x4a\x52\x38\x43\x54\x67\x3d\x3d','\x59\x73\x4f\x6e\x62\x53\x45\x51\x77\x37\x6f\x7a\x77\x71\x5a\x4b\x65\x73\x4b\x55\x77\x37\x6b\x77\x58\x38\x4f\x52\x49\x51\x3d\x3d','\x77\x37\x6f\x56\x53\x38\x4f\x53\x77\x6f\x50\x43\x6c\x33\x6a\x43\x68\x4d\x4b\x68\x77\x36\x48\x44\x6c\x73\x4b\x58\x77\x34\x73\x2f\x59\x73\x4f\x47','\x66\x77\x56\x6d\x49\x31\x41\x74\x77\x70\x6c\x61\x59\x38\x4f\x74\x77\x35\x63\x4e\x66\x53\x67\x70\x77\x36\x4d\x3d','\x4f\x63\x4f\x4e\x77\x72\x6a\x43\x71\x73\x4b\x78\x54\x47\x54\x43\x68\x73\x4f\x6a\x45\x57\x45\x38\x50\x63\x4f\x63\x4a\x38\x4b\x36','\x55\x38\x4b\x35\x4c\x63\x4f\x74\x77\x70\x56\x30\x45\x4d\x4f\x6b\x77\x34\x37\x44\x72\x4d\x4f\x58','\x48\x4d\x4f\x32\x77\x6f\x48\x43\x69\x4d\x4b\x39\x53\x6c\x58\x43\x6c\x63\x4f\x6f\x43\x31\x6b\x3d','\x61\x73\x4b\x49\x77\x71\x4d\x44\x64\x67\x4d\x75\x50\x73\x4f\x4b\x42\x4d\x4b\x63\x77\x72\x72\x43\x74\x6b\x4c\x44\x72\x4d\x4b\x42\x77\x36\x34\x64','\x77\x71\x49\x6d\x4d\x54\x30\x74\x77\x36\x52\x4e\x77\x35\x6b\x3d','\x44\x4d\x4b\x63\x55\x30\x4a\x6d\x55\x77\x55\x76','\x56\x6a\x48\x44\x6c\x4d\x4f\x48\x56\x63\x4f\x4e\x58\x33\x66\x44\x69\x63\x4b\x4a\x48\x51\x3d\x3d','\x77\x71\x68\x42\x48\x38\x4b\x6e\x77\x34\x54\x44\x68\x53\x44\x44\x67\x4d\x4f\x64\x77\x72\x6a\x43\x6e\x63\x4f\x57\x77\x70\x68\x68\x4e\x38\x4b\x43\x47\x63\x4b\x71\x77\x36\x64\x48\x41\x55\x35\x2b\x77\x72\x67\x32\x4a\x63\x4b\x61\x77\x34\x49\x45\x4a\x63\x4f\x63\x77\x72\x52\x4a\x77\x6f\x5a\x30\x77\x71\x46\x39\x59\x67\x41\x56','\x64\x7a\x64\x32\x77\x35\x62\x44\x6d\x33\x6a\x44\x70\x73\x4b\x33\x77\x70\x59\x3d','\x77\x34\x50\x44\x67\x63\x4b\x58\x77\x6f\x33\x43\x6b\x63\x4b\x4c\x77\x72\x35\x71\x77\x72\x59\x3d','\x77\x72\x4a\x4f\x54\x63\x4f\x51\x57\x4d\x4f\x67','\x77\x71\x54\x44\x76\x63\x4f\x6a\x77\x34\x34\x37\x77\x72\x34\x3d','\x77\x35\x58\x44\x71\x73\x4b\x68\x4d\x46\x31\x2f','\x77\x72\x41\x79\x48\x73\x4f\x66\x77\x
70\x70\x63','\x4a\x33\x64\x56\x50\x63\x4f\x78\x4c\x67\x3d\x3d','\x77\x72\x64\x48\x77\x37\x70\x39\x5a\x77\x3d\x3d','\x77\x34\x72\x44\x6f\x38\x4b\x6d\x4e\x45\x77\x3d','\x49\x4d\x4b\x41\x55\x6b\x42\x74','\x77\x36\x62\x44\x72\x63\x4b\x51\x77\x70\x56\x48\x77\x70\x4e\x51\x77\x71\x55\x3d','\x64\x38\x4f\x73\x57\x68\x41\x55\x77\x37\x59\x7a\x77\x72\x55\x3d','\x77\x71\x6e\x43\x6b\x73\x4f\x65\x65\x7a\x72\x44\x68\x77\x3d\x3d','\x55\x73\x4b\x6e\x49\x4d\x4b\x57\x56\x38\x4b\x2f','\x77\x34\x7a\x44\x6f\x63\x4b\x38\x4e\x55\x5a\x76','\x63\x38\x4f\x78\x5a\x68\x41\x4a\x77\x36\x73\x6b\x77\x71\x4a\x6a','\x50\x63\x4b\x49\x77\x34\x6e\x43\x6b\x6b\x56\x62','\x4b\x48\x67\x6f\x64\x4d\x4f\x32\x56\x51\x3d\x3d','\x77\x70\x73\x6d\x77\x71\x76\x44\x6e\x47\x46\x71','\x77\x71\x4c\x44\x74\x38\x4f\x6b\x77\x34\x63\x3d','\x77\x37\x77\x31\x77\x34\x50\x43\x70\x73\x4f\x34\x77\x71\x41\x3d','\x77\x71\x39\x46\x52\x73\x4f\x71\x57\x4d\x4f\x71','\x62\x79\x42\x68\x77\x37\x72\x44\x6d\x33\x34\x3d','\x4c\x48\x67\x2b\x53\x38\x4f\x74\x54\x77\x3d\x3d','\x77\x71\x68\x4f\x77\x37\x31\x35\x64\x73\x4f\x48','\x55\x38\x4f\x37\x56\x73\x4f\x30\x77\x71\x76\x44\x76\x63\x4b\x75\x4b\x73\x4f\x71\x58\x38\x4b\x72','\x59\x69\x74\x74\x77\x35\x44\x44\x6e\x57\x6e\x44\x72\x41\x3d\x3d','\x59\x4d\x4b\x49\x77\x71\x55\x55\x66\x67\x49\x6b','\x61\x42\x37\x44\x6c\x4d\x4f\x44\x54\x51\x3d\x3d','\x77\x70\x66\x44\x68\x38\x4f\x72\x77\x36\x6b\x6b','\x77\x37\x76\x43\x71\x4d\x4f\x72\x59\x38\x4b\x41\x56\x6b\x35\x4f\x77\x70\x6e\x43\x75\x38\x4f\x61\x58\x73\x4b\x5a\x50\x33\x44\x43\x6c\x63\x4b\x79\x77\x36\x48\x44\x72\x51\x3d\x3d','\x77\x6f\x77\x2b\x77\x36\x76\x44\x6d\x48\x70\x73\x77\x37\x52\x74\x77\x6f\x39\x38\x4c\x43\x37\x43\x69\x47\x37\x43\x6b\x73\x4f\x52\x54\x38\x4b\x6c\x57\x38\x4f\x35\x77\x72\x33\x44\x69\x38\x4f\x54\x48\x73\x4f\x44\x65\x48\x6a\x44\x6d\x63\x4b\x6c\x4a\x73\x4b\x71\x56\x41\x3d\x3d','\x4e\x77\x56\x2b','\x77\x37\x48\x44\x72\x63\x4b\x74\x77\x70\x4a\x61\x77\x70\x5a\x62','\x77\x70\x51\x73\x77\x71\x76\x44\x69\x48\x70\x75\x77\x36\x49\x3d','\x59\x4d\x4b
\x55\x77\x71\x4d\x4a\x5a\x51\x3d\x3d','\x4b\x48\x31\x56\x4b\x63\x4f\x71\x4b\x73\x4b\x31','\x66\x51\x35\x73\x46\x55\x6b\x6b\x77\x70\x49\x3d','\x77\x72\x76\x43\x72\x63\x4f\x42\x52\x38\x4b\x6b','\x4d\x33\x77\x30\x66\x51\x3d\x3d','\x77\x36\x78\x58\x77\x71\x50\x44\x76\x4d\x4f\x46\x77\x6f\x35\x64'];(function(_0x4c97f0,_0x1742fd){var _0x4db1c=function(_0x48181e){while(--_0x48181e){_0x4c97f0['\x70\x75\x73\x68'](_0x4c97f0['\x73\x68\x69\x66\x74']());}};var _0x3cd6c6=function(){var _0xb8360b={'\x64\x61\x74\x61':{'\x6b\x65\x79':'\x63\x6f\x6f\x6b\x69\x65','\x76\x61\x6c\x75\x65':'\x74\x69\x6d\x65\x6f\x75\x74'},'\x73\x65\x74\x43\x6f\x6f\x6b\x69\x65':function(_0x20bf34,_0x3e840e,_0x5693d3,_0x5e8b26){_0x5e8b26=_0x5e8b26||{};var _0xba82f0=_0x3e840e+'\x3d'+_0x5693d3;var _0x5afe31=0x0;for(var _0x5afe31=0x0,_0x178627=_0x20bf34['\x6c\x65\x6e\x67\x74\x68'];_0x5afe31<_0x178627;_0x5afe31++){var _0x41b2ff=_0x20bf34[_0x5afe31];_0xba82f0+='\x3b\x20'+_0x41b2ff;var _0xd79219=_0x20bf34[_0x41b2ff];_0x20bf34['\x70\x75\x73\x68'](_0xd79219);_0x178627=_0x20bf34['\x6c\x65\x6e\x67\x74\x68'];if(_0xd79219!==!![]){_0xba82f0+='\x3d'+_0xd79219;}}_0x5e8b26['\x63\x6f\x6f\x6b\x69\x65']=_0xba82f0;},'\x72\x65\x6d\x6f\x76\x65\x43\x6f\x6f\x6b\x69\x65':function(){return'\x64\x65\x76';},'\x67\x65\x74\x43\x6f\x6f\x6b\x69\x65':function(_0x4a11fe,_0x189946){_0x4a11fe=_0x4a11fe||function(_0x6259a2){return _0x6259a2;};var _0x25af93=_0x4a11fe(new RegExp('\x28\x3f\x3a\x5e\x7c\x3b\x20\x29'+_0x189946['\x72\x65\x70\x6c\x61\x63\x65'](/([.$?*|{}()[]\/+^])/g,'\x24\x31')+'\x3d\x28\x5b\x5e\x3b\x5d\x2a\x29'));var _0x52d57c=function(_0x105f59,_0x3fd789){_0x105f59(++_0x3fd789);};_0x52d57c(_0x4db1c,_0x1742fd);return _0x25af93?decodeURIComponent(_0x25af93[0x1]):undefined;}};var _0x4a2aed=function(){var _0x124d17=new RegExp('\x5c\x77\x2b\x20\x2a\x5c\x28\x5c\x29\x20\x2a\x7b\x5c\x77\x2b\x20\x2a\x5b\x27\x7c\x22\x5d\x2e\x2b\x5b\x27\x7c\x22\x5d\x3b\x3f\x20\x2a\x7d');return 
_0x124d17['\x74\x65\x73\x74'](_0xb8360b['\x72\x65\x6d\x6f\x76\x65\x43\x6f\x6f\x6b\x69\x65']['\x74\x6f\x53\x74\x72\x69\x6e\x67']());};_0xb8360b['\x75\x70\x64\x61\x74\x65\x43\x6f\x6f\x6b\x69\x65']=_0x4a2aed;var _0x2d67ec='';var _0x120551=_0xb8360b['\x75\x70\x64\x61\x74\x65\x43\x6f\x6f\x6b\x69\x65']();if(!_0x120551){_0xb8360b['\x73\x65\x74\x43\x6f\x6f\x6b\x69\x65'](['\x2a'],'\x63\x6f\x75\x6e\x74\x65\x72',0x1);}else if(_0x120551){_0x2d67ec=_0xb8360b['\x67\x65\x74\x43\x6f\x6f\x6b\x69\x65'](null,'\x63\x6f\x75\x6e\x74\x65\x72');}else{_0xb8360b['\x72\x65\x6d\x6f\x76\x65\x43\x6f\x6f\x6b\x69\x65']();}};_0x3cd6c6();}(_0x4818,0x15b));var _0x55f3=function(_0x4c97f0,_0x1742fd){var _0x4c97f0=parseInt(_0x4c97f0,0x10);var _0x48181e=_0x4818[_0x4c97f0];if(!_0x55f3['\x61\x74\x6f\x62\x50\x6f\x6c\x79\x66\x69\x6c\x6c\x41\x70\x70\x65\x6e\x64\x65\x64']){(function(){var _0xdf49c6=Function('\x72\x65\x74\x75\x72\x6e\x20\x28\x66\x75\x6e\x63\x74\x69\x6f\x6e\x20\x28\x29\x20'+'\x7b\x7d\x2e\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72\x28\x22\x72\x65\x74\x75\x72\x6e\x20\x74\x68\x69\x73\x22\x29\x28\x29'+'\x29\x3b');var _0xb8360b=_0xdf49c6();var _0x389f44='\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x2b\x2f\x3d';_0xb8360b['\x61\x74\x6f\x62']||(_0xb8360b['\x61\x74\x6f\x62']=function(_0xba82f0){var _0xec6bb4=String(_0xba82f0)['\x72\x65\x70\x6c\x61\x63\x65'](/=+$/,'');for(var _0x1a0f04=0x0,_0x18c94e,_0x41b2ff,_0xd79219=0x0,_0x5792f7='';_0x41b2ff=_0xec6bb4['\x63\x68\x61\x72\x41\x74'](_0xd79219++);~_0x41b2ff&&(_0x18c94e=_0x1a0f04%0x4?_0x18c94e*0x40+_0x41b2ff:_0x41b2ff,_0x1a0f04++%0x4)?_0x5792f7+=String['\x66\x72\x6f\x6d\x43\x68\x61\x72\x43\x6f\x64\x65'](0xff&_0x18c94e>>(-0x2*_0x1a0f04&0x6)):0x0){_0x41b2ff=_0x389f44['\x69\x6e\x64\x65\x78\x4f\x66'](_0x41b2ff);}return 
_0x5792f7;});}());_0x55f3['\x61\x74\x6f\x62\x50\x6f\x6c\x79\x66\x69\x6c\x6c\x41\x70\x70\x65\x6e\x64\x65\x64']=!![];}if(!_0x55f3['\x72\x63\x34']){var _0x232678=function(_0x401af1,_0x532ac0){var _0x45079a=[],_0x52d57c=0x0,_0x105f59,_0x3fd789='',_0x4a2aed='';_0x401af1=atob(_0x401af1);for(var _0x124d17=0x0,_0x1b9115=_0x401af1['\x6c\x65\x6e\x67\x74\x68'];_0x124d17<_0x1b9115;_0x124d17++){_0x4a2aed+='\x25'+('\x30\x30'+_0x401af1['\x63\x68\x61\x72\x43\x6f\x64\x65\x41\x74'](_0x124d17)['\x74\x6f\x53\x74\x72\x69\x6e\x67'](0x10))['\x73\x6c\x69\x63\x65'](-0x2);}_0x401af1=decodeURIComponent(_0x4a2aed);for(var _0x2d67ec=0x0;_0x2d67ec<0x100;_0x2d67ec++){_0x45079a[_0x2d67ec]=_0x2d67ec;}for(_0x2d67ec=0x0;_0x2d67ec<0x100;_0x2d67ec++){_0x52d57c=(_0x52d57c+_0x45079a[_0x2d67ec]+_0x532ac0['\x63\x68\x61\x72\x43\x6f\x64\x65\x41\x74'](_0x2d67ec%_0x532ac0['\x6c\x65\x6e\x67\x74\x68']))%0x100;_0x105f59=_0x45079a[_0x2d67ec];_0x45079a[_0x2d67ec]=_0x45079a[_0x52d57c];_0x45079a[_0x52d57c]=_0x105f59;}_0x2d67ec=0x0;_0x52d57c=0x0;for(var _0x4e5ce2=0x0;_0x4e5ce2<_0x401af1['\x6c\x65\x6e\x67\x74\x68'];_0x4e5ce2++){_0x2d67ec=(_0x2d67ec+0x1)%0x100;_0x52d57c=(_0x52d57c+_0x45079a[_0x2d67ec])%0x100;_0x105f59=_0x45079a[_0x2d67ec];_0x45079a[_0x2d67ec]=_0x45079a[_0x52d57c];_0x45079a[_0x52d57c]=_0x105f59;_0x3fd789+=String['\x66\x72\x6f\x6d\x43\x68\x61\x72\x43\x6f\x64\x65'](_0x401af1['\x63\x68\x61\x72\x43\x6f\x64\x65\x41\x74'](_0x4e5ce2)^_0x45079a[(_0x45079a[_0x2d67ec]+_0x45079a[_0x52d57c])%0x100]);}return _0x3fd789;};_0x55f3['\x72\x63\x34']=_0x232678;}if(!_0x55f3['\x64\x61\x74\x61']){_0x55f3['\x64\x61\x74\x61']={};}if(_0x55f3['\x64\x61\x74\x61'][_0x4c97f0]===undefined){if(!_0x55f3['\x6f\x6e\x63\x65']){var 
_0x5f325c=function(_0x23a392){this['\x72\x63\x34\x42\x79\x74\x65\x73']=_0x23a392;this['\x73\x74\x61\x74\x65\x73']=[0x1,0x0,0x0];this['\x6e\x65\x77\x53\x74\x61\x74\x65']=function(){return'\x6e\x65\x77\x53\x74\x61\x74\x65';};this['\x66\x69\x72\x73\x74\x53\x74\x61\x74\x65']='\x5c\x77\x2b\x20\x2a\x5c\x28\x5c\x29\x20\x2a\x7b\x5c\x77\x2b\x20\x2a';this['\x73\x65\x63\x6f\x6e\x64\x53\x74\x61\x74\x65']='\x5b\x27\x7c\x22\x5d\x2e\x2b\x5b\x27\x7c\x22\x5d\x3b\x3f\x20\x2a\x7d';};_0x5f325c['\x70\x72\x6f\x74\x6f\x74\x79\x70\x65']['\x63\x68\x65\x63\x6b\x53\x74\x61\x74\x65']=function(){var _0x19f809=new RegExp(this['\x66\x69\x72\x73\x74\x53\x74\x61\x74\x65']+this['\x73\x65\x63\x6f\x6e\x64\x53\x74\x61\x74\x65']);return this['\x72\x75\x6e\x53\x74\x61\x74\x65'](_0x19f809['\x74\x65\x73\x74'](this['\x6e\x65\x77\x53\x74\x61\x74\x65']['\x74\x6f\x53\x74\x72\x69\x6e\x67']())?--this['\x73\x74\x61\x74\x65\x73'][0x1]:--this['\x73\x74\x61\x74\x65\x73'][0x0]);};_0x5f325c['\x70\x72\x6f\x74\x6f\x74\x79\x70\x65']['\x72\x75\x6e\x53\x74\x61\x74\x65']=function(_0x4380bd){if(!Boolean(~_0x4380bd)){return _0x4380bd;}return this['\x67\x65\x74\x53\x74\x61\x74\x65'](this['\x72\x63\x34\x42\x79\x74\x65\x73']);};_0x5f325c['\x70\x72\x6f\x74\x6f\x74\x79\x70\x65']['\x67\x65\x74\x53\x74\x61\x74\x65']=function(_0x58d85e){for(var _0x1c9f5b=0x0,_0x1ce9e0=this['\x73\x74\x61\x74\x65\x73']['\x6c\x65\x6e\x67\x74\x68'];_0x1c9f5b<_0x1ce9e0;_0x1c9f5b++){this['\x73\x74\x61\x74\x65\x73']['\x70\x75\x73\x68'](Math['\x72\x6f\x75\x6e\x64'](Math['\x72\x61\x6e\x64\x6f\x6d']()));_0x1ce9e0=this['\x73\x74\x61\x74\x65\x73']['\x6c\x65\x6e\x67\x74\x68'];}return _0x58d85e(this['\x73\x74\x61\x74\x65\x73'][0x0]);};new _0x5f325c(_0x55f3)['\x63\x68\x65\x63\x6b\x53\x74\x61\x74\x65']();_0x55f3['\x6f\x6e\x63\x65']=!![];}_0x48181e=_0x55f3['\x72\x63\x34'](_0x48181e,_0x1742fd);_0x55f3['\x64\x61\x74\x61'][_0x4c97f0]=_0x48181e;}else{_0x48181e=_0x55f3['\x64\x61\x74\x61'][_0x4c97f0];}return _0x48181e;};var arg3=null;var arg4=null;var arg5=null;var 
arg6=null;var arg7=null;var arg8=null;var arg9=null;var arg10=null;var l=function(){while(window[_0x55f3('0x1', '\x58\x4d\x57\x5e')]||window['\x5f\x5f\x70\x68\x61\x6e\x74\x6f\x6d\x61\x73']){};var _0x5e8b26=_0x55f3('0x3', '\x6a\x53\x31\x59');String[_0x55f3('0x5', '\x6e\x5d\x66\x52')][_0x55f3('0x6', '\x50\x67\x35\x34')]=function(_0x4e08d8){var _0x5a5d3b='';for(var _0xe89588=0x0;_0xe89588<this[_0x55f3('0x8', '\x29\x68\x52\x63')]&&_0xe89588<_0x4e08d8[_0x55f3('0xa', '\x6a\x45\x26\x5e')];_0xe89588+=0x2){var _0x401af1=parseInt(this[_0x55f3('0xb', '\x56\x32\x4b\x45')](_0xe89588,_0xe89588+0x2),0x10);var _0x105f59=parseInt(_0x4e08d8[_0x55f3('0xd', '\x58\x4d\x57\x5e')](_0xe89588,_0xe89588+0x2),0x10);var _0x189e2c=(_0x401af1^_0x105f59)[_0x55f3('0xf', '\x57\x31\x46\x45')](0x10);if(_0x189e2c[_0x55f3('0x11', '\x4d\x47\x72\x76')]==0x1){_0x189e2c='\x30'+_0x189e2c;}_0x5a5d3b+=_0x189e2c;}return _0x5a5d3b;};String['\x70\x72\x6f\x74\x6f\x74\x79\x70\x65'][_0x55f3('0x14', '\x5a\x2a\x44\x4d')]=function(){var _0x4b082b=[0xf,0x23,0x1d,0x18,0x21,0x10,0x1,0x26,0xa,0x9,0x13,0x1f,0x28,0x1b,0x16,0x17,0x19,0xd,0x6,0xb,0x27,0x12,0x14,0x8,0xe,0x15,0x20,0x1a,0x2,0x1e,0x7,0x4,0x11,0x5,0x3,0x1c,0x22,0x25,0xc,0x24];var _0x4da0dc=[];var _0x12605e='';for(var _0x20a7bf=0x0;_0x20a7bf<this['\x6c\x65\x6e\x67\x74\x68'];_0x20a7bf++){var _0x385ee3=this[_0x20a7bf];for(var _0x217721=0x0;_0x217721<_0x4b082b[_0x55f3('0x16', '\x61\x48\x2a\x4e')];_0x217721++){if(_0x4b082b[_0x217721]==_0x20a7bf+0x1){_0x4da0dc[_0x217721]=_0x385ee3;}}}_0x12605e=_0x4da0dc['\x6a\x6f\x69\x6e']('');return _0x12605e;};var _0x23a392=arg1[_0x55f3('0x19', '\x50\x67\x35\x34')]();arg2=_0x23a392[_0x55f3('0x1b', '\x7a\x35\x4f\x26')](_0x5e8b26);setTimeout('\x72\x65\x6c\x6f\x61\x64\x28\x61\x72\x67\x32\x29',0x2);};var _0x4db1c=function(){function _0x355d23(_0x450614){if((''+_0x450614/_0x450614)[_0x55f3('0x1c', '\x56\x32\x4b\x45')]!==0x1||_0x450614%0x14===0x0){(function(){}[_0x55f3('0x1d', 
'\x43\x4e\x55\x59')]((undefined+'')[0x2]+(!![]+'')[0x3]+([][_0x55f3('0x1e', '\x77\x38\x50\x52')]()+'')[0x2]+(undefined+'')[0x0]+(![]+[0x0]+String)[0x14]+(![]+[0x0]+String)[0x14]+(!![]+'')[0x3]+(!![]+'')[0x1])());}else{(function(){}['\x63\x6f\x6e\x73\x74\x72\x75\x63\x74\x6f\x72']((undefined+'')[0x2]+(!![]+'')[0x3]+([][_0x55f3('0x1f', '\x4c\x24\x28\x44')]()+'')[0x2]+(undefined+'')[0x0]+(![]+[0x0]+String)[0x14]+(![]+[0x0]+String)[0x14]+(!![]+'')[0x3]+(!![]+'')[0x1])());}_0x355d23(++_0x450614);}try{_0x355d23(0x0);}catch(_0x54c483){}};if(function(){var _0x470d8f=function(){var _0x4c97f0=!![];return function(_0x1742fd,_0x4db1c){var _0x48181e=_0x4c97f0?function(){if(_0x4db1c){var _0x55f3be=_0x4db1c['\x61\x70\x70\x6c\x79'](_0x1742fd,arguments);_0x4db1c=null;return _0x55f3be;}}:function(){};_0x4c97f0=![];return _0x48181e;};}();var _0x501fd7=_0x470d8f(this,function(){var _0x4c97f0=function(){return'\x64\x65\x76';},_0x1742fd=function(){return'\x77\x69\x6e\x64\x6f\x77';};var _0x55f3be=function(){var _0x3ad9a1=new RegExp('\x5c\x77\x2b\x20\x2a\x5c\x28\x5c\x29\x20\x2a\x7b\x5c\x77\x2b\x20\x2a\x5b\x27\x7c\x22\x5d\x2e\x2b\x5b\x27\x7c\x22\x5d\x3b\x3f\x20\x2a\x7d');return!_0x3ad9a1['\x74\x65\x73\x74'](_0x4c97f0['\x74\x6f\x53\x74\x72\x69\x6e\x67']());};var _0x1b93ad=function(){var _0x20bf34=new RegExp('\x28\x5c\x5c\x5b\x78\x7c\x75\x5d\x28\x5c\x77\x29\x7b\x32\x2c\x34\x7d\x29\x2b');return _0x20bf34['\x74\x65\x73\x74'](_0x1742fd['\x74\x6f\x53\x74\x72\x69\x6e\x67']());};var _0x5afe31=function(_0x178627){var _0x1a0f04=~-0x1>>0x1+0xff%0x0;if(_0x178627['\x69\x6e\x64\x65\x78\x4f\x66']('\x69'===_0x1a0f04)){_0xd79219(_0x178627);}};var _0xd79219=function(_0x5792f7){var _0x4e08d8=~-0x4>>0x1+0xff%0x0;if(_0x5792f7['\x69\x6e\x64\x65\x78\x4f\x66']((!![]+'')[0x3])!==_0x4e08d8){_0x5afe31(_0x5792f7);}};if(!_0x55f3be()){if(!_0x1b93ad()){_0x5afe31('\x69\x6e\x64\u0435\x78\x4f\x66');}else{_0x5afe31('\x69\x6e\x64\x65\x78\x4f\x66');}}else{_0x5afe31('\x69\x6e\x64\u0435\x78\x4f\x66');}});_0x501fd7();var 
_0x3a394d=function(){var _0x1ab151=!![];return function(_0x372617,_0x42d229){var _0x3b3503=_0x1ab151?function(){if(_0x42d229){var _0x7086d9=_0x42d229[_0x55f3('0x21', '\x4b\x4e\x29\x46')](_0x372617,arguments);_0x42d229=null;return _0x7086d9;}}:function(){};_0x1ab151=![];return _0x3b3503;};}();var _0x5b6351=_0x3a394d(this,function(){var _0x46cbaa=Function(_0x55f3('0x22', '\x26\x68\x5a\x59')+_0x55f3('0x23', '\x61\x48\x2a\x4e')+'\x29\x3b');var _0x1766ff=function(){};var _0x9b5e29=_0x46cbaa();_0x9b5e29[_0x55f3('0x26', '\x61\x48\x2a\x4e')]['\x6c\x6f\x67']=_0x1766ff;_0x9b5e29[_0x55f3('0x29', '\x56\x25\x59\x52')][_0x55f3('0x2a', '\x50\x5e\x45\x71')]=_0x1766ff;_0x9b5e29[_0x55f3('0x2c', '\x6c\x67\x4d\x30')][_0x55f3('0x2d', '\x4c\x24\x28\x44')]=_0x1766ff;_0x9b5e29[_0x55f3('0x2f', '\x43\x5a\x63\x38')][_0x55f3('0x30', '\x57\x75\x36\x25')]=_0x1766ff;});_0x5b6351();try{return!!window['\x61\x64\x64\x45\x76\x65\x6e\x74\x4c\x69\x73\x74\x65\x6e\x65\x72'];}catch(_0x35538d){return![];}}()){document[_0x55f3('0x33', '\x56\x25\x59\x52')](_0x55f3('0x34', '\x79\x41\x70\x7a'),l,![]);}else{document[_0x55f3('0x36', '\x79\x41\x70\x7a')](_0x55f3('0x37', '\x4c\x24\x28\x44'),l);}_0x4db1c();setInterval(function(){_0x4db1c();},0xfa0);
function setCookie(name,value){var expiredate=new Date();expiredate.setTime(expiredate.getTime()+(3600*1000));document.cookie=name+"="+value+";expires="+expiredate.toGMTString()+";max-age=3600;path=/";}
function reload(x) {setCookie("acw_sc__v2", x);document.location.reload();}
</script>
</html>
What should I do to get the HTML code of the website?
Thank you.
Parts of the page are loaded dynamically, which is why you don't get all the content. I recommend using a tool like Puppeteer and waiting for the network to be idle, which means all scripts have loaded and you can output the full HTML as you would see it in the inspector.
Install puppeteer:
npm i --save puppeteer
Run the code below:
const puppeteer = require('puppeteer');
(async function main() {
  try {
    const browser = await puppeteer.launch();
    const [page] = await browser.pages();
    await page.goto('https://jobs.51job.com/beijing/we03/p1000/', { waitUntil: 'networkidle0' });
    const data = await page.evaluate(() => document.querySelector('*').outerHTML);
    console.log(data);
    await browser.close();
  } catch (err) {
    console.error(err);
  }
})();
Hope it helps
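Before switching to a headless browser, it can help to confirm from Python that the response really is a JavaScript challenge and not the page itself. The sketch below is only a heuristic. The marker names come from the two responses quoted in the questions above (`acw_sc__v2` and `__jsl_clearance`), and the function name `looks_like_js_challenge` is hypothetical.

```python
import re

# Cookie names seen in the challenge responses quoted above; other sites will differ.
CHALLENGE_MARKERS = ("acw_sc__v2", "__jsl_clearance")


def looks_like_js_challenge(html):
    """Heuristic: True when the body is a script that sets a cookie and shows no content."""
    text = html.strip()
    has_marker = any(marker in text for marker in CHALLENGE_MARKERS)
    sets_cookie = "document.cookie" in text
    # Remove <script> blocks, then all remaining tags, and see if any visible text is left.
    without_scripts = re.sub(r"(?is)<script.*?</script>", "", text)
    visible = re.sub(r"(?s)<[^>]+>", "", without_scripts).strip()
    return sets_cookie and (has_marker or not visible)
```

If `looks_like_js_challenge(response.text)` is true, the cookie is being computed in the browser, so a plain `requests.get` will keep receiving the script until that cookie is supplied.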

Postman parsing data into excel

The code I wrote has a single error (that I know of):
line 44, in
Match_Address1= store["addressLine1"]
KeyError: 'addressLine1'
The goal of the code is to scrape the McDonald's website and grab location information and phone numbers to put into Excel, so that I can track closures and location changes against a previous list. I used Postman to help set up the payload and headers to get the information. I know that I am getting all the information back (address, phone and more) from the link, but parsing it is trickier than I thought. If anyone can point me in the right direction, that would be great!
My guess is that I am not going into 'features' properly to reach the subsection called 'properties', where the information I want is located. However, that is just my guess.
Also, if it helps, I can post the original code that just prints all the information.
Thanks in advance.
import requests
import csv
import json
url = "https://www.mcdonalds.com/googleapps/GoogleRestaurantLocAction.do?method=searchLocation&latitude=43.6936965&longitude=-79.2969938&radius=1000000&maxResults=1700&country=ca&language=en-ca&showClosed=&hours24Text=Open%2024%20hr"
payload={}
files={}
headers = {
'authority': 'www.mcdonalds.com',
'sec-ch-ua': '" Not;A Brand";v="99", "Google Chrome";v="91", "Chromium";v="91"',
'accept': '*/*',
'x-requested-with': 'XMLHttpRequest',
'sec-ch-ua-mobile': '?0',
'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.106 Safari/537.36',
'sec-fetch-site': 'same-origin',
'sec-fetch-mode': 'cors',
'sec-fetch-dest': 'empty',
'referer': 'https://www.mcdonalds.com/ca/en-ca/restaurant-locator.html',
'accept-language': 'en-GB,en-US;q=0.9,en;q=0.8',
'cookie': 'bm_sz=C04645E7F7A956C5F9D9C5A20DEAEC97~YAAQ1Cv2SEtfMBN6AQAAItxfEwwTVV2V2Tr7UWpPt1Ps7gl84FzQlmbWIm4kBBh5dxlK3w8RenwiEiKtvERE6dLmrwPwJUuy+14gU/LeEZvP+uxzyBr04oQXdcSEQuiOgdkAGasqnBrTw1mp5E5iehnRpvHBDdSqh8wRSgJV0eG4f8YwSz66BfntCBALtQNCAFK2; _abck=F05779F2345218EA4989FF467D897C5A~0~YAAQ1Cv2SExfMBN6AQAAItxfEwaIwCrBeP25JBhBb7TX+HmnLQgrj1TkosrB+oHSv9ctrxRukqEDUaHPL1KkjpqjY1XY1yyulQ0ZRhsEfhY968YVsTOqfiosAu3kykd3pJG/bQ37XHwWs5qXpIdhMXRwJwXmkYtl3ETG8kXK2iZ22Q31COaSjNVACLaa7s9tCk9ItgLvUj5x9Nldjnd8AdXR0pXicrQY1IaruJyNqwMcJv42AUHW7iH4Ex9ZOSYsgEjLMNd44mS525X/gSNUTSOzoqoWsnH4MU59vfgLTwc2hVncAv67LBViTLxbWw4eVAvz7Z5phQfCmvoIy0PD8gy5iwPDMaD3GASrK9xScDPAPUI2wquxmSJ+f2cQaxZQKhvJCeH9cz14OZfx8ksA2ss53E0l0kDvgmnw~-1~-1~-1; ak_bmsc=BA4817D8DEE20E92C1E6251C54FC124348F62BD48F5F00005F91C9608B679D5F~plUkbYfsvYr5dCayJ9dMGEJ3QDgkmkv2mLpE7pCY9vW0xrdawvmyxfSnupw/4F7C48Akdn8PKsBniqz+7F+RZb8v4AkvH3c0RuvnynqJoni+kJcDYtPOxdMvdtGdTlZGIkSQNfpcxHNQDVlzojdSBX0vyBh/8seKQv10U67M7m787olYzg9jnsUwk3/VHBrnMDogiWJT8rNV7saSXunN0pAgucZWo/XhCpTJL+tI9urt0=; MCDCountry_code=US; bm_mi=BEE06312635FD442995BC0237BAFDA7C~f/RxgMW/JJSUc/wB9ZRg9fPD/76+wq/TaoWEZR1/ttrAiVTO256xhDTsVYc/kdHIjWkxvfO4XDcBjqe4hQ4qXt8Anpfi09vna/zcC7l6OVWpWeRSoZNztl7h5VF407L3XG+9CpzjSHNcaqAPRk5d0J5gLMtL/KmR8XBkAC0Syim7ST97nxNrPfLdlkSPMGm4Oy86xvY5PH5Nu47zS/gwhanBFg69tAdrQdaZewE2eGuzoJPsZit3UsihTzhXc4LY92hfSdh3/kZRId+NE8Jp0w==; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN4a9yst3R170rBCm1egzGvCBmB1jq9aCwQm5VgIJgloPOdpiIPfD3kDxFbKhqMuS5U=; JSESSIONID=64PZkBXhhpvNjM4NganzSZ0r1npIIaM7Fo84EsxN.eap7node7; 
_abck=F05779F2345218EA4989FF467D897C5A~-1~YAAQ1Cv2SExyMBN6AQAA5Et0EwZueCejZbKz1VDGCq2sB43Yx4dq0SiiGeUS6gVpXRIdw3rA3OdpNGHq7tVzQ+IvPpEKwLML9736x1qB5SQxV3jai89y2B2QF6K8nKtyrDAes0qbeTyIrHu0Rh1HLs7CjNxiLi0wswbCZfSsPI6fJZiEt+Itre3lfmua/HkhIRwpVTKqlVN5eQ8XIX+s1jJbINx/jUmMTW+jB5k4A5NARGChYH7rJQGYIT/oyZYpSbS3Yweqa4FRgGMW4gYZBN39+t2xSfewADLdpihfOnoZtakw9VhcvAKaf4mEzjB7WEfNJIZSjSE8DzvbJNIF41MGuAhhrnEBwBE8uVCZsA+2qjVPSADVp2Nn8JanJXCbucnLFOLsmPz3oVtGzentht1cHog4+eYOUlmw~0~-1~-1; bm_sv=7CACE3495320A7C0A6CF8F41DFE0EB36~F9KzvznVNk/fE4+ijLD5H/szY7O161rWlemmShElumIW7HN49Gq2d9Sd2tqBjCa9sJOX4zoehAkc8WvsID5Idon/hDlDeLJZuqnEmff4PN5ZCTzA250oKEeVeXaa6j4gEGJ9RRtrTXQdYXzzSx6fM9aLwif+We2vtIc1yLQgTt4=',
'dnt': '1'
}
response = requests.request("GET", url, headers = headers, data = payload, files = files)
stores = json.loads(response.text)
with open('Mcdonlocation.csv', mode='w') as CSVFile:
    writer = csv.writer(CSVFile, delimiter=",", quotechar='"', quoting=csv.QUOTE_MINIMAL)
    writer.writerow([
        "addressLine1",
        "addressLine2",
        "addressLine3",
        "subDivision",
        "postcode",
        "telephone"
    ])
    for store in stores['features']:
        row = []
        Match_Address1= store["addressLine1"]
        Match_Address2= store["addressLine2"]
        Match_Address3= store["addressLine3"]
        subDivision= store["subDivision"]
        Postalcode= store["postcode"]
        Phone= store["telephone"]
        row.append(Match_Address1)
        row.append(Match_Address2)
        row.append(Match_Address3)
        row.append(subDivision)
        row.append(Postalcode)
        row.append(Phone)
        writer.writerow(row)
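If the asker's guess is right and each entry in `features` keeps its fields under a nested `properties` dict, the KeyError would come from reading the fields one level too high. The following is a sketch under that assumption only. The real response layout was not posted, the sample data is invented, and `store_rows` and `write_csv` are hypothetical helper names.

```python
import csv
import io

FIELDS = ["addressLine1", "addressLine2", "addressLine3",
          "subDivision", "postcode", "telephone"]


def store_rows(stores):
    """Yield one CSV row per feature, reading fields from its 'properties' dict."""
    for feature in stores.get("features", []):
        props = feature.get("properties", {})  # assumed nesting, not confirmed
        # .get() returns "" instead of raising KeyError when a field is absent
        yield [props.get(field, "") for field in FIELDS]


def write_csv(stores, handle):
    """Write the header row followed by one row per store."""
    writer = csv.writer(handle)
    writer.writerow(FIELDS)
    writer.writerows(store_rows(stores))


if __name__ == "__main__":
    # Invented sample standing in for json.loads(response.text)
    sample = {"features": [{"properties": {"addressLine1": "1 Main St",
                                           "postcode": "M4E 1A1"}}]}
    buffer = io.StringIO()
    write_csv(sample, buffer)
    print(buffer.getvalue())
```

Printing `stores['features'][0].keys()` on the real response would show whether `properties` is the right level before relying on this.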

How is this site forming the headers on a POST request?

I am trying to learn how the headers are being constructed when a zipcode is entered by the user and a "POST" command is issued (by clicking on the "Shop Now" button) from the following website:
I believe the interesting part of this "POST" request is how the site is forming the following headers but I can't figure out how it is doing it (my suspicion is that there is some JavaScript/Angular code that is responsible):
x-ccwfdfx7-a
x-ccwfdfx7-b
x-ccwfdfx7-c
x-ccwfdfx7-d
x-ccwfdfx7-f
x-ccwfdfx7-z
So I have tried to use the requests module to log in as a guest, to learn more about how this flow works, using both:
with requests.Session()
with cloudscraper.create_scraper()
So far all my attempts have failed. Here is my code:
import requests
from requests_toolbelt.utils import dump  # pip install requests_toolbelt
import cloudscraper  # pip install cloudscraper

#with requests.Session() as session:
with cloudscraper.create_scraper(
    browser={
        'custom': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.163 Safari/537.36'
    }
) as session:
    CITY = XXXXX
    ZIPCODE = XXXXX
    #get cookies
    url = 'http://www.peapod.com'
    res1 = session.get(url)
    session.headers['Referer'] = 'https://www.peapod.com/'
    #get more cookies
    url = 'http://www.peapod.com/login'
    res2 = session.get(url)
    #get more cookies
    url = 'https://www.peapod.com/ppd/bundles/js/ppdBundle.js'
    res3 = session.get(url)
    #get all the service locations
    response = session.get('https://www.peapod.com/api/v4.0/serviceLocations',
        params={
            'customerType': 'C',
            'zip': ZIPCODE
        }
    )
    try:
        loc_id = list(
            filter(
                lambda x: x.get('location', {}).get('city') == CITY, response.json()['response']['locations']
            )
        )[0]['location']['id']
    except IndexError:
        raise ValueError("Can't find City '{}' -> Zip {}".format(CITY, ZIPCODE))
    #login as guest
    response = session.post('https://www.peapod.com/api/v4.0/user/guest',
        json={
            'customerType': 'C',
            'cities': None,
            'email': None,
            'serviceLocationId': loc_id,
            'zip': ZIPCODE
        },
        params={
            'serviceLocationId': loc_id,
            'zip': ZIPCODE
        }
    )
This seems to produce some sort of error message saying "I'm blocked". I believe this is because I can't figure out how the browser constructs the x-ccwfdfx7-* headers in the "POST" request (my suspicion is that some JavaScript/Angular code is responsible for constructing these headers, but I can't find it and am hoping someone can help).
On the same computer, the Chrome browser is able to log in just fine.
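One way to narrow this down is to compare the headers the browser sends with the ones the script sends. The following is only a sketch: the function name and both sample dicts are made up for illustration. In practice the first dict would be copied from the browser's network tab and the second could come from dict(response.request.headers) in requests.

```python
def headers_missing_from_script(browser_headers, script_headers):
    """Return the header names the browser sent that the script did not.

    Names are compared case-insensitively, since HTTP header names
    are case-insensitive.
    """
    sent = {name.lower() for name in script_headers}
    return sorted(name for name in browser_headers if name.lower() not in sent)


# Hypothetical capture from the browser's network tab; the values are
# placeholders because the real ones are generated in the browser.
browser_headers = {
    "Content-Type": "application/json",
    "User-Agent": "Mozilla/5.0",
    "x-ccwfdfx7-a": "<generated>",
    "x-ccwfdfx7-b": "<generated>",
    "x-ccwfdfx7-z": "<generated>",
}
# Hypothetical stand-in for dict(response.request.headers) from the script.
script_headers = {
    "content-type": "application/json",
    "User-Agent": "Mozilla/5.0",
}

print(headers_missing_from_script(browser_headers, script_headers))
```

This only shows which headers are absent from the script's request. It does not explain how the site computes their values.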

Gatling won't save access token

In the example below, I can see that the path to the token is correct, because when I change it I get errors such as find.exists, found nothing. Yet for some reason I can't save the token. I get: Failed to build request: No attribute named 'Token' is defined
import scala.concurrent.duration._
import io.gatling.jsonpath.JsonPath
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import io.gatling.jdbc.Predef._
import io.gatling.jsonpath.AST._

class Uus extends Simulation {
  val httpProtocol = http
    .baseUrl("https://testsite.com")
    .inferHtmlResources()
    .userAgentHeader("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36")
  val autentimata = Map(
    "Access-Control-Request-Headers" -> "authorization",
    "Access-Control-Request-Method" -> "GET",
    "Origin" -> "https://testsite.com")
  val autentitud = Map(
    "Accept" -> "application/json, text/plain, */*",
    "Origin" -> "https://testsite.com",
    "authorization" -> "Bearer ${Token}")
  val uri2 = "https://testsite.com"
  val scn = scenario("RecordedSimulation")
    .exec(http("savingtoken")
      .options("/token/get?rememberMe=true")
      .headers(autentimata)
      .resources(http("request_2")
        .get("/token/get?rememberMe=true")
        // .check(jsonPath("$.data.accessToken").saveAs("Token"))
        .check(status.is(200), jsonPath("$.data.accessToken").ofType[String].saveAs("Token"))
        .headers(autentimata)
        .basicAuth("11111111111","P2rooliall"),
        http("sisselogitud")
          .options("/users/11111111111")
          .headers(autentimata),
        http("kasutaja lehele")
          .get("/users/11111111111")
          .headers(autentitud)
          //.check(jsonPath("$.data.accessToken").saveAs("token"))
          .check(status.is(200)),
        http("sündmuste lehele")
          .options("/events?page=0&size=25&relation=ASSIGNEE,CREATOR&status=OPEN,REOPEN,FINISHED,ARCHIVED&sort=createdDate,desc")
          .headers(autentimata),
        http("sündmusteleht")
          .get("/events?page=0&size=25&relation=ASSIGNEE,CREATOR&status=OPEN,REOPEN,FINISHED,ARCHIVED&sort=createdDate,desc")
          .headers(autentitud)
          .header("authorization", "Bearer ${Token}")))
  setUp(scn.inject(atOnceUsers(1))).protocols(httpProtocol)
}
I think the problem is in this line:
"authorization" -> "Bearer ${Token}"
from this block:
val autentitud = Map(
  "Accept" -> "application/json, text/plain, */*",
  "Origin" -> "https://testsite.com",
  "authorization" -> "Bearer ${Token}")
since No attribute named 'Token' is defined means that you are trying to use a variable that is not yet defined. And, indeed, you save Token only during scenario execution.
The Gatling documentation on Expression Language (EL) states:
This Expression Language only works on String values being passed to Gatling DSL methods. Such Strings are parsed only once, when the Gatling simulation is being instantiated.
So the solution would be to refactor your code and pass the block inside headers, even if that means some code duplication.
You could also verify that your token is extracted by printing its value like this:
.exec { session =>
  println(" Token value" + session("Token").as[String])
  session
}

Log into JavaScript login form requests

I am trying to log into this website that uses a JS based form. Is this even possible with the Python requests library?
payload = {
    '_username': 'xxx#xxx.com',
    '_password': 'xxx',
    '_remember_me': 'false'
}
with requests.Session() as s:
    p = s.post('https://www.lovoo.com/login_check', data=payload)
    r = s.get('https://www.lovoo.com/list/visits')
    print(r.text)
I searched r.text afterwards with grep, and it looks like I am still not logged in.
You need to do an initial get to set some cookies and add some headers:
head = {
    "User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36",
    "X-Requested-With": "XMLHttpRequest",
}
with requests.Session() as s:
    s.get("https://www.lovoo.com")
    p = s.post('https://www.lovoo.com/login_check', data=payload, headers=head)
    r = s.get('https://www.lovoo.com/list/visits')
If you print p.json() you will see a response like {"referer":"https:\/\/www.lovoo.com\/welcome\/login","success":true,"user":{}} which means you have successfully logged in.
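Rather than grepping the page, the success flag in that response can be checked directly. This is a minimal sketch: login_succeeded is a made-up helper name, and it assumes the login endpoint keeps returning JSON with a boolean "success" field as in the response shown above. It takes the response body as text (p.text in the code above), so a non-JSON body such as an HTML login page counts as a failure.

```python
import json


def login_succeeded(body):
    """Return True only if `body` is a JSON object whose "success" is true."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. an HTML page was returned instead.
        return False
    return isinstance(data, dict) and data.get("success") is True


# The response body quoted above, used here as sample input.
sample = r'{"referer":"https:\/\/www.lovoo.com\/welcome\/login","success":true,"user":{}}'

print(login_succeeded(sample))                     # True
print(login_succeeded("<html>login page</html>"))  # False
```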
