Scrape a javascript:__doPostBack() link - javascript

I am trying to automate a website with a Scrapy spider and I am stuck on a JavaScript postback. My first request is sent with '__EVENTTARGET' = 'ctl00$content$ctl01$btn'. A new link then appears on the same page, and I want to request that JavaScript link, whose '__EVENTTARGET' is 'ctl00$content$ctl01$'. I don't know how to scrape a javascript:__doPostBack() link with a Scrapy spider. I have searched for this issue and cannot find anything.
<a id="ctl00_content_" href="javascript:__doPostBack('ctl00$content$ctl01$','')">SYED ALI </a>
My spider code:
import scrapy
from bs4 import BeautifulSoup
from scrapy.http import FormRequest

URL = 'xyz'


class ExitRealtySpider(scrapy.Spider):
    name = "campSpider"  # name = "exit_realty"
    allowed_domains = ["xyz"]
    start_urls = [URL]

    def parse(self, response):
        # Submit the search form (first page).
        # The hidden ASP.NET fields are read from the page Scrapy already fetched.
        soup = BeautifulSoup(response.text, 'html.parser')
        viewstate = soup.find('input', {'id': '__VIEWSTATE'})['value']
        generator = soup.find('input', {'id': '__VIEWSTATEGENERATOR'})['value']
        validation = soup.find('input', {'id': '__EVENTVALIDATION'})['value']
        # No trailing commas: each value must be a string, not a 1-tuple.
        self.data = {}
        self.data['__VIEWSTATE'] = viewstate
        self.data['__VIEWSTATEGENERATOR'] = generator
        self.data['__VIEWSTATEENCRYPTED'] = ''
        self.data['__EVENTVALIDATION'] = validation
        self.data['typAirmenInquiry'] = '7'
        self.data['ctl00$content$ctl01$txtbxLastName'] = 'a'
        self.data['ctl00$content$ctl01$txtbxCertNo'] = '123'
        self.data['ctl00$content$ctl01$btnSearch'] = 'Search'
        self.data['__EVENTTARGET'] = 'ctl00$content$ctl01$'
        return FormRequest.from_response(
            response,
            method='POST',
            callback=self.parse_page,
            formdata=self.data,
            # encoding='utf-8',
            # meta={'page': 1},
            dont_filter=True,
            # headers=HEADERS
        )

    def parse_page(self, response):
        print("\n\n\n\n\n", response.body, "\n\n\n\n\n")
        # Take the hidden fields from this response (the search results),
        # not from a fresh download of URL, so they match the current page state.
        soup = BeautifulSoup(response.text, 'html.parser')
        viewstate = soup.find('input', {'id': '__VIEWSTATE'})['value']
        generator = soup.find('input', {'id': '__VIEWSTATEGENERATOR'})['value']
        validation = soup.find('input', {'id': '__EVENTVALIDATION'})['value']
        self.data = {}
        self.data['__EVENTARGUMENT'] = ''
        self.data['__LASTFOCUS'] = ''
        self.data['__VIEWSTATE'] = viewstate
        self.data['__VIEWSTATEGENERATOR'] = generator
        self.data['__VIEWSTATEENCRYPTED'] = ''
        self.data['__EVENTVALIDATION'] = validation
        self.data['typAirmenInquiry'] = '7'
        self.data['__EVENTTARGET'] = 'ctl00$content$ctl01$'
        return FormRequest.from_response(
            response,
            method='POST',
            callback=self.parse_page2,
            formdata=self.data,
            # encoding='utf-8',
            # meta={'page': 1},
            dont_filter=True,
            # headers=HEADERS
        )

    def parse_page2(self, response):
        print("\n\n\n\n\n", response.body, "\n\n\n\n\n")
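A javascript:__doPostBack(target, argument) link does not point to a URL. In ASP.NET WebForms the function fills the hidden __EVENTTARGET and __EVENTARGUMENT fields and submits the page's form, so a spider can read the two arguments out of the href and post them itself. The sketch below shows only that parsing step, using the standard library. The helper name parse_postback and the control name in the usage line are hypothetical, not taken from the site in the question.

```python
import re

# Matches __doPostBack('<target>','<argument>') with either quote style.
_POSTBACK_RE = re.compile(
    r"""__doPostBack\(\s*(['"])(?P<target>.*?)\1\s*,\s*(['"])(?P<argument>.*?)\3\s*\)"""
)


def parse_postback(href):
    """Return the form fields a __doPostBack link would submit, or None."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None
    return {
        "__EVENTTARGET": match.group("target"),
        "__EVENTARGUMENT": match.group("argument"),
    }


# Hypothetical control name, for illustration only.
href = "javascript:__doPostBack('ctl00$content$example$lnkName','')"
fields = parse_postback(href)
print(fields)
```

In the spider, the returned dictionary could be passed as formdata to FormRequest.from_response(response, formdata=fields, dont_click=True, callback=...). from_response copies the hidden inputs such as __VIEWSTATE from the form in response, and dont_click=True keeps Scrapy from also submitting a button value.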

Related

How to display the values of the attributes of the data queried and retrieved by Ajax call in django

I am trying to query the database based on what the user has clicked on the page and display the retrieved data without refreshing the page. I am using Ajax for this. Let me show you the code.
html
<label for="landacq" class="civil-label">Land Acquisation Cases</label>
<input class="civil-category" type="radio" name="civil-cat" id="landacq" value="land acquisation" hidden>
<label for="sc" class="civil-label">Supreme Court</label>
<input class="civil-court" type="radio" name="civil-court" id="sc" value="supreme court" hidden>
<label for="limitation" class="civil-label">Limitation</label>
<input class="civil-law-type" type="radio" name="civil-law-type" id="limitation" value="limitation" hidden>
js
for (i = 0; i < lawTypeInput.length; i++) {
  lawTypeInput[i].addEventListener("click", (e) => {
    e.preventDefault();
    cat = civilCatval;
    court = civilCourtval;
    lawT = civillawTypeval;
    console.log("this is from ajax : ", cat, court, lawT);
    $.ajax({
      type: "POST",
      headers: { "X-CSRFToken": csrftoken },
      mode: "same-origin", // Do not send CSRF token to another domain.
      url: "civil",
      data: {
        "cat[]": civilCatval,
        "court[]": civilCourtval,
        "lawT[]": civillawTypeval,
      },
      success: function (query) {
        showCivilQ(query);
        // console.log(data);
      },
      error: function (error) {
        console.log(error);
      },
    });
  });
}
function showCivilQ(query) {
  q.textContent = query;
  console.log(query);
}
So here, for example, if the user clicks a radio button in the HTML, the values are grabbed in the JS file and then sent to the URL mentioned above as a POST request. There, these values are used to filter the database and return the objects like this:
views.py
def civil_home(request):
    if request.is_ajax():
        get_cat = request.POST.get('cat[]')
        get_court = request.POST.get('court[]')
        get_lawT = request.POST.get('lawT[]')
        query = Citation.objects.filter(law_type__contains='civil', sub_law_type__contains=get_cat, court_name__contains=get_court, law_category__contains=get_lawT)
        return HttpResponse(query)
    else:
        subuser = request.user
        subscription = UserSubscription.objects.filter(user=subuser, is_active=True)
        context = {
            'usersub': subscription,
        }
        return render(request, 'civil/civil_home.html', context)
This is the result I am getting which is correct.
My question is: these objects contain attributes with values, e.g. title, headnote, etc. How can I display these attributes (the title of the citation, the headnote of the citation, and so on) in the HTML, rather than the object names returned, as shown in the image?
A solution could be to return a JSON object instead of the query result set, because Ajax works well with JSON.
You need a function that translates a Citation object into a dictionary (change it to match your real attributes). Every value must be JSON-serializable, so values such as dates have to be converted to strings (see the date example).
def citation_as_dict(item):
    return {
        "attribute1": item.attribute1,
        "attribute2": item.attribute2,
        "date1": item.date.strftime('%d/%m/%Y')
    }
The list of dictionaries must then be serialized to JSON with the json package:
import json

def civil_home(request):
    if request.is_ajax():
        get_cat = request.POST.get('cat[]')
        get_court = request.POST.get('court[]')
        get_lawT = request.POST.get('lawT[]')
        query = Citation.objects.filter(law_type__contains='civil', sub_law_type__contains=get_cat, court_name__contains=get_court, law_category__contains=get_lawT)
        # Convert each model instance to a plain dict, then dump the list as JSON.
        response_dict = [citation_as_dict(obj) for obj in query]
        response_json = json.dumps({"data": response_dict})
        return HttpResponse(response_json, content_type='application/json')
    else:
        subuser = request.user
        subscription = UserSubscription.objects.filter(user=subuser, is_active=True)
        context = {
            'usersub': subscription,
        }
        return render(request, 'civil/civil_home.html', context)
In your HTML page you should be able to parse the response as a normal JSON object
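The conversion this answer describes can be tried without Django. The sketch below uses a small stand-in class instead of the real Citation model (its fields title, headnote, and date are assumptions for illustration) and shows why the date has to be turned into a string before calling json.dumps.

```python
import datetime
import json
from dataclasses import dataclass


@dataclass
class Citation:
    """Stand-in for the Django model; the field names are assumed."""
    title: str
    headnote: str
    date: datetime.date


def citation_as_dict(item):
    # Only JSON-friendly types (str, numbers, bool, None, list, dict) may remain.
    return {
        "title": item.title,
        "headnote": item.headnote,
        "date": item.date.strftime("%d/%m/%Y"),
    }


rows = [Citation("A v. B", "Limitation period", datetime.date(2020, 1, 31))]

# A raw date object is rejected by the json module.
try:
    json.dumps({"date": rows[0].date})
    raw_date_ok = True
except TypeError:
    raw_date_ok = False

payload = json.dumps({"data": [citation_as_dict(row) for row in rows]})
print(raw_date_ok)
print(payload)
```

On the client, the success callback then receives an object whose data property is an array, so each entry's title and headnote can be read by name.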
I figured out another way to do it, which is giving me the required results too.
Here I am filtering the values of the query, and then converting it to a list and passing it as a JsonResponse
views.py
def civil_home(request):
    if request.method == "POST" and request.is_ajax():
        get_cat = request.POST.get('cat[]')
        get_court = request.POST.get('court[]')
        get_lawT = request.POST.get('lawT[]')
        query = Citation.objects.values().filter(law_type__contains='civil', sub_law_type__contains=get_cat, court_name__contains=get_court, law_category__contains=get_lawT)
        result = list(query)
        return JsonResponse({"status": "success", "result": result})
    else:
        subuser = request.user
        subscription = UserSubscription.objects.filter(user=subuser, is_active=True)
        context = {
            'usersub': subscription,
        }
        return render(request, 'civil/civil_home.html', context)
Then I receive the response here and iterate over it to print the attributes in the HTML:
js
for (i = 0; i < lawTypeInput.length; i++) {
  lawTypeInput[i].addEventListener("click", (e) => {
    e.preventDefault();
    cat = civilCatval;
    court = civilCourtval;
    lawT = civillawTypeval;
    console.log("this is from ajax : ", cat, court, lawT);
    $.ajax({
      type: "POST",
      headers: { "X-CSRFToken": csrftoken },
      mode: "same-origin", // Do not send CSRF token to another domain.
      url: "civil",
      data: {
        "cat[]": civilCatval,
        "court[]": civilCourtval,
        "lawT[]": civillawTypeval,
      },
      success: function (response) {
        console.log(response.result);
        civilData = response.result;
        // Compare with ===; a single = would assign and always be truthy.
        // The length check lets the "No Citations Found" branch run for empty results.
        if (response.status === "success" && civilData.length > 0) {
          $("#queryResult").empty();
          for (i = 0; i < civilData.length; i++) {
            $("#queryResult").append(
              `
              ${civilData[i].title}
              <p>${civilData[i].headnote}</p>
              `
            );
          }
        } else {
          $("#queryResult").empty();
          $("#queryResult").append(
            `
            <p>No Citations Found</p>
            `
          );
        }
      },
      error: function (error) {
        console.log(error);
      },
    });
  });
}
The csrf_token can be declared at the top of the HTML page and then passed in the request header to avoid any conflict.

Flask + Heroku: H12 Request Timeout Error

I have a Heroku app that runs a Scrapy spider. After starting it, I get error H12. How can I fix it?
jQuery:
$.post('/wellness', {'specialty': specialty, 'state': state, 'city': city}, (res) => {
  $(location).attr('href', 'http://127.0.0.1:5000/wellness')
});
Flask:
if request.method == 'POST':
    specialty = request.form.getlist('specialty[]')
    state = request.form.getlist('state[]')
    city = request.form.getlist('city[]')
    # Load the spider settings, update the search filters, and write them back.
    with open(file_json, 'r') as f:
        settings = json.load(f)
    settings['specialty'] = specialty
    settings['state'] = state
    settings['city'] = city
    with open(file_json, 'w') as f:
        f.write(json.dumps(settings, indent=4))
    # Run the spider and block until it has finished.
    process = subprocess.Popen('python spiders/' + file_py, shell=True)
    process.wait()
return render_template(file_html)
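Heroku reports H12 when a web request takes longer than 30 seconds to respond, and the view above blocks in process.wait() until the spider has finished. One possible way around that, sketched below with the standard library only, is to start the spider without waiting and answer the request straight away. The helper name start_spider and the stand-in command are assumptions for illustration; on Heroku, a separate worker process with a job queue is the more robust variant of the same idea.

```python
import subprocess
import sys


def start_spider(command):
    """Launch the command and return immediately, without waiting for it."""
    # Passing a list avoids shell=True and the quoting problems that come with it.
    return subprocess.Popen(command)


# Stand-in for "python spiders/<file_py>": a child process that runs for a while.
process = start_spider([sys.executable, "-c", "import time; time.sleep(1)"])

# The caller gets control back while the child is still running,
# so a view could return its response here instead of blocking.
still_running = process.poll() is None
print(still_running)

# Only for this demo: wait so the example exits cleanly.
exit_code = process.wait()
print(exit_code)
```

With this approach the page needs another way to learn that the results are ready, for example by polling a status URL until the spider has written its output.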

How to reload Flask-based webpage after receiving data

I have a simple web project that collects data with a spider, processes it, and needs to show the result on the page. After the spider finishes, the data is written to SQLite but is not displayed on the page. How can I refresh the page after the data has been written to SQLite?
button
<button id="scrape" class="btn btn-success mr-2">Scrape</button>
js
$.post('/wellness', {'specialty': specialty, 'state': state, 'city': city}, (res) => {
});
flask
@app.route('/<page_id>', methods=['GET', 'POST'])
def page(page_id):
    file_html = f"{page_id}.html"
    file_py = f"{page_id}.py"
    file_db = f"{page_id}.db"
    specialty = request.form.getlist('specialty[]')
    state = request.form.getlist('state[]')
    city = request.form.getlist('city[]')
    # Load the spider settings, update the search filters, and write them back.
    with open('settings.json', 'r') as f:
        settings = json.load(f)
    settings['specialty'] = specialty
    settings['state'] = state
    settings['city'] = city
    with open('settings.json', 'w') as f:
        f.write(json.dumps(settings, indent=4))
    # Run the spider and block until it has finished.
    process = subprocess.Popen('python e:/Python/sqlite/spiders/' + file_py, shell=True)
    process.wait()
    try:
        db = sqlite3.connect(file_db)
        cursor = db.cursor()
        cursor.execute('SELECT * FROM wellness ORDER BY id DESC')
        cards = cursor.fetchall()
        db.close()
        return render_template(file_html, cards=cards)
    except sqlite3.Error:
        # Fall back to the page without cards if the database cannot be read.
        return render_template(file_html)
In the callback, write the code that appends or fills the HTML of the div where you want to show the data:
(res) => {
  // Write append or fill here
});
Example:
$.ajax({
  url: "/update/",
  type: "POST",
  data: {"data": data},
  success: function(resp){
    $('div.stats').html(resp.data); // Fill the data into the appropriate div
  }
});
You would have to do something similar for your use case.
Solution:
$.post('/wellness', {'specialty': specialty, 'state': state, 'city': city}, (res) => {
  $(location).attr('href', 'http://127.0.0.1:5000/wellness')
});

Django Fullcalendar ajax no showing events

Hi, I can't get events from an Ajax call to show up in the calendar. Please help.
template:
$('#calendar').fullCalendar({
    .....
    events: {
        url: '{% url 'rezerwacje' %}',
        type: 'GET',
    },
url.py
url(r'^rezerwacje/json2/', views.RezerwacjeLista, name='rezerwacje'),
views.py
def RezerwacjeLista(request):
    if request.is_ajax():
        try:
            start = request.GET.get('start', False)
            print start
            end = request.GET.get('end', False)
        except ValueError:
            start = datetime.now()
            end = start + timedelta(days=7)
        rezerwacje = Rezerwacje.objects.filter(start__gte=start).filter(end__lte=end)
        json_list = []
        lista = []
        for rezerwacja in rezerwacje:
            id = rezerwacja.id
            start = rezerwacja.start.strftime("%Y-%m-%d %H:%M:%S")
            stanowisko = rezerwacja.stanowisko.nazwa
            end = rezerwacja.end.strftime("%Y-%m-%d %H:%M:%S")
            kolor = rezerwacja.kolor
            status = rezerwacja.status
            #allDay = False
            json_entry = {"id": '1', 'start': start, 'end': end, 'resourceId': stanowisko, 'kolor': kolor, 'status': status}
            json_list.append(json_entry)
        lista = ast.literal_eval(json.dumps(json_list))
        return HttpResponse(lista, content_type='application/json')
That view returns json:
{'kolor': '1', 'status': 1, 'end': '2017-05-03 15:00:00', 'resourceId': 'K1', 'start': '2017-05-03 14:00:00', 'id': '1'}
That looks good, but when I select a date in the calendar, no events are shown. In the response I get this URL and the JSON above:
http://127.0.0.1:8000/rezerwacje/json2/?start=2017-05-03&end=2017-05-04&_=1493937740198
How do I pass the result to the Ajax call?
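One thing worth checking, judging from the output quoted above: the single quotes mean the body is the Python repr of a dictionary, not JSON, and a JSON parser on the client rejects it. The sketch below reproduces the difference with the standard library only. The entry is copied from the question; the suggestion that this is why the calendar stays empty is an assumption, not something the question confirms.

```python
import ast
import json

json_list = [{
    "id": "1",
    "start": "2017-05-03 14:00:00",
    "end": "2017-05-03 15:00:00",
    "resourceId": "K1",
    "kolor": "1",
    "status": 1,
}]

# What the view does: the JSON text is turned back into Python objects,
# so anything that writes them out as text produces Python repr.
lista = ast.literal_eval(json.dumps(json_list))
as_repr = "".join(str(entry) for entry in lista)

# What a JSON client expects: the text produced by json.dumps itself.
as_json = json.dumps(json_list)


def is_valid_json(text):
    """Return True if the text can be parsed as JSON."""
    try:
        json.loads(text)
        return True
    except ValueError:
        return False


print(is_valid_json(as_repr))
print(is_valid_json(as_json))
```

In the view, this would mean returning HttpResponse(json.dumps(json_list), content_type='application/json') and dropping the ast.literal_eval step.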

Configuring passport-ldap

Trying to use this package: https://www.npmjs.com/package/passport-ldapauth
I managed to find the LDAP settings for my company on a MediaWiki server (PHP):
$wgLDAPDomainNames = array("COMPANY");
$wgLDAPGroupBaseDNs = array("COMPANY"=>"dc=company,dc=se");
$wgLDAPAutoAuthDomain = "COMPANY";
$wgLDAPGroupUseFullDN = array("COMPANY"=>true );
$wgLDAPServerNames = array("COMPANY"=>"dcst.company.se");
$wgLDAPSearchStrings = array("COMPANY" => "COMPANY\\USER-NAME" );
$wgLDAPSearchAttributes = array("COMPANY"=>"sAMAccountName");
$wgLDAPBaseDNs = array("COMPANY"=>"dc=company,dc=se");
$wgLDAPEncryptionType = array("COMPANY" => "ssl" );
$wgMinimalPasswordLength = 1;
I need to map this to the node-package. I tried this:
var opts = {
  server: {
    url: 'ldaps://dcst.company.se',
    bindDn: 'dc=company,dc=se',
    //bindCredentials: 'secret',
    searchBase: 'dc=company,dc=se',
    searchFilter: '(&(objectcategory=person)(objectclass=user)(|(samaccountname={{username}})(mail={{username}})))',
    searchAttributes: ['displayName', 'mail'],
  }
};
I get "Bad request". This is from the docs:
badRequestMessage flash message for missing username/password (default: 'Missing credentials')
What have I done wrong?
You need to add the admin credentials. Here's how my configuration works:
var Strategy = require('passport-ldapauth').Strategy
  , passport = require('passport')
  , config = require('config')
  , userLookup = require('./userLookup');
var ldapConfig = {
  server: {
    url: config.get('ldap.url'),
    adminDn: config.get('ldap.adminDn'),
    adminPassword: config.get('ldap.adminPassword'),
    searchBase: config.get('ldap.searchBase'),
    searchFilter: config.get('ldap.searchFilter')
  }
};
passport.use('ldap', new Strategy(ldapConfig, userLookup));
