Node.js Login to blog site and Create new post - javascript

I want to log in to blogfa.com (a Persian blog service) and create a new post using Node.js.
To do that, I use the request module to POST to the login page, then go to the URL "/Desktop/Post.aspx?action=newpost" and post the new content.
Here is the code I have so far:
var request = require('request');
var cheerio = require('cheerio');
var j = request.jar();
request = request.defaults({ jar: j }); // use this cookie jar (session) for every request
//...
var headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.89 Safari/537.36 OPR/28.0.1750.48',
    'Content-Type': 'application/x-www-form-urlencoded'
};
request({
    url: "http://blogfa.com/Desktop/login.aspx",
    method: "POST",
    form: { uid: "demoblog1", pwd: "test" },
    headers: headers,
    followRedirect: true
},
function (error, response, body) {
    console.log(body);
});
The problem is that I can't log in to the site.
I don't know why it isn't working. I used a jar for cookies, posted the user name and password, and also set the headers.
Here is runnable code you can test:
http://code.runnable.com/Vc7EnmyVlgRa1Hx-/blog-login-for-node-js
Update
Request Cookies
.ASPXBLOG BC045A0CD184FEA10D91561EB67A302F1E036D88E50CE4264E4ABD003
__utma 36873331.1996897518.1435135939.1437159460.1439563539.7
__utmz 36873331.1437159460.6.6.utmcsr=chat.delgarm.com|utmccn=(referral)
pubset ar=1&z=12600&ds=0&cmt=0&cats=0&tag=0&nu=1&bt=dGVzdCB
ten 67145377

By default request does not follow redirects for non-GET requests, which is what is happening in this case. So set followAllRedirects: true in your request() options and it should work fine.
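As a minimal sketch, the login request from the question with this option set might look as follows. buildLoginOptions is a hypothetical helper name; the URL and form fields are taken from the question, and whether the login then succeeds still depends on the site.

```javascript
// Builds the options object for the login POST.
// `buildLoginOptions` is a hypothetical helper; URL and form fields come from the question.
function buildLoginOptions(uid, pwd, jar) {
  return {
    url: 'http://blogfa.com/Desktop/login.aspx',
    method: 'POST',
    form: { uid: uid, pwd: pwd },
    jar: jar,                 // cookie jar shared with later requests
    followAllRedirects: true  // also follow the 3xx response to the POST
  };
}

// Usage with the request module (not executed here):
//   var request = require('request');
//   var jar = request.jar();
//   request(buildLoginOptions('demoblog1', 'test', jar), function (error, response, body) {
//     console.log(response.statusCode, body);
//   });
```

Passing the same jar to the follow-up request for "/Desktop/Post.aspx?action=newpost" keeps the session cookie set during login.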

Related

How to find all the JavaScript requests made from my browser when I'm accessing a site

I want to scrape the contents of LinkedIn using requests and bs4, but I'm facing a problem with the JavaScript that loads the page after I sign in (I don't get the home page directly). I don't want to use Selenium.
Here is my code:
import requests
from bs4 import BeautifulSoup

class Linkedin():
    def __init__(self, url):
        self.url = url
        self.header = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) "
                       "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Safari/537.36"}

    def saveRsulteToHtmlFile(self, nameOfFile=None):
        if nameOfFile is None:
            nameOfFile = "Linkedin_page"
        with open(nameOfFile + ".html", "wb") as file:
            file.write(self.response.content)

    def getSingInPage(self):
        self.sess = requests.Session()
        self.response = self.sess.get(self.url, headers=self.header)
        soup = BeautifulSoup(self.response.content, "html.parser")
        self.csrf = soup.find(attrs={"name": "loginCsrfParam"})["value"]

    def connecteToMyLinkdin(self):
        self.form_data = {"session_key": "myemail#mail.com",
                          "loginCsrfParam": self.csrf,
                          "session_password": "mypassword"}
        self.url = "https://www.linkedin.com/uas/login-submit"
        self.response = self.sess.post(self.url, headers=self.header, data=self.form_data)

    def getAnyPage(self, url):
        self.response = self.sess.get(url, headers=self.header)

url = "https://www.linkedin.com/"
likedin_page = Linkedin(url)
likedin_page.getSingInPage()
likedin_page.connecteToMyLinkdin()  # I'm connected, but the JavaScript is still loading
likedin_page.getAnyPage("https://www.linkedin.com/jobs/")
likedin_page.saveRsulteToHtmlFile()
I would like help getting past the JavaScript loading without using Selenium.
Although it's technically possible to simulate all the calls from Python, for a dynamic page like LinkedIn I think it will be quite tedious and brittle.
Anyway, open the developer tools in your browser before you open LinkedIn and see what the traffic looks like. You can filter for the requests made from JavaScript (in Firefox, the filter is called XHR).
You would then simulate the necessary or interesting requests in your code. The benefit is that servers usually return structured data to JavaScript, such as JSON, so you won't need to do as much HTML parsing.
If you find you are not making much progress this way (it really depends on the particular site), then you will probably have to use Selenium or some alternative such as:
https://robotframework.org/
https://miyakogi.github.io/pyppeteer/ (port of Puppeteer to Python)
You should send all the XHR and JS requests manually [in the same session which you created during login]. Also, pass all the fields in request headers (copy from the network tools).
self.header_static = {
    'authority': 'static-exp2.licdn.com',
    'method': 'GET',
    'path': '/sc/h/c356usw7zystbud7v7l42pz0s',
    'scheme': 'https',
    'accept': '*/*',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'en-GB,en;q=0.9,en-US;q=0.8,hi;q=0.7,la;q=0.6',
    'cache-control': 'no-cache',
    'dnt': '1',
    'pragma': 'no-cache',
    'referer': 'https://www.linkedin.com/jobs/',
    'sec-fetch-mode': 'no-cors',
    'sec-fetch-site': 'cross-site',
    'user-agent': 'Mozilla/5.0 (Linux; Android 6.0; Nexus 5 Build/MRA58N) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.97 Mobile Safari/537.36'
}

def postConnectionRequests(self):
    urls = [
        "https://static-exp2.licdn.com/sc/h/62mb7ab7wm02esbh500ajmfuz",
        "https://static-exp2.licdn.com/sc/h/mpxhij2j03tw91bpplja3u9b",
        "https://static-exp2.licdn.com/sc/h/3nq91cp2wacq39jch2hz5p64y",
        "https://static-exp2.licdn.com/sc/h/emyc3b18e3q2ntnbncaha2qtp",
        "https://static-exp2.licdn.com/sc/h/9b0v30pbbvyf3rt7sbtiasuto",
        "https://static-exp2.licdn.com/sc/h/4ntg5zu4sqpdyaz1he02c441c",
        "https://static-exp2.licdn.com/sc/h/94cc69wyd1gxdiytujk4d5zm6",
        "https://static-exp2.licdn.com/sc/h/ck48xrmh3ctwna0w2y1hos0ln",
        "https://static-exp2.licdn.com/sc/h/c356usw7zystbud7v7l42pz0s",
    ]
    for url in urls:
        self.sess.get(url, headers=self.header_static)
        print("REQUEST SENT TO " + url)
I called the postConnectionRequests() function after logging in and before saving the HTML content, and received the complete page.
Hope this helps.
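One note on copying headers from the network tools: the fields authority, method, path and scheme look like HTTP/2 pseudo-headers, which browser developer tools display as :authority, :method, :path and :scheme. They describe the request line rather than ordinary header fields, so it is probably unnecessary to send them as headers. A minimal JavaScript sketch, assuming the copied headers are held in a plain object (stripPseudoHeaders is a hypothetical helper name):

```javascript
// HTTP/2 pseudo-header names, with or without the leading colon that
// browser developer tools display.
const PSEUDO_HEADERS = new Set(['authority', 'method', 'path', 'scheme']);

// Returns a copy of `copied` without pseudo-headers, so that only ordinary
// header fields are passed on to the HTTP client.
// `stripPseudoHeaders` is a hypothetical helper name.
function stripPseudoHeaders(copied) {
  const result = {};
  for (const [name, value] of Object.entries(copied)) {
    const bare = name.replace(/^:/, '').toLowerCase();
    if (!PSEUDO_HEADERS.has(bare)) {
      result[name] = value;
    }
  }
  return result;
}

// A shortened version of the header set from the answer above.
const cleaned = stripPseudoHeaders({
  'authority': 'static-exp2.licdn.com',
  'method': 'GET',
  'path': '/sc/h/c356usw7zystbud7v7l42pz0s',
  'scheme': 'https',
  'accept': '*/*',
  'referer': 'https://www.linkedin.com/jobs/'
});
console.log(cleaned);
```

The same filtering could be done with a dict comprehension in the Python class above.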
XHR requests are sent by JavaScript, and Python will not run JavaScript code when it fetches a page using requests and BeautifulSoup. Tools like Selenium load the page and run its JavaScript. You can also use headless browsers.

Receiving JSON between local React App and local Springboot service issues

I am running a local Springboot server, that when I access it locally in the browser, gives me a valid JSON object properly formatted (I verified this via JSON formatter).
I am also locally running a React application using Node. I am attempting to use fetch() to get back that JSON object and am running into issues. I finally got around the CORS header issues, but now cannot figure out why the JSON object isn't coming back. Here's my code:
var headers = new Headers();
headers.append("Content-type", "application/json;charset=UTF-8");
var myInit = {
    method: 'GET',
    headers: headers,
    mode: 'no-cors',
    cache: 'default',
};
fetch(`http://localhost:3010/getJSON`, myInit)
    .then(function (response) {
        console.log(response.data);
        console.log(response);
        console.log(JSON.parse(JSON.stringify(response)));
    }, function (error) {
        console.log(error);
    });
So when I run this in Chrome with the debugger, the responses to the 3 log statements are:
1st logger
undefined
2nd logger
Response {type: "opaque", url: "", redirected: false, status: 0, ok: false, …}
    body: (...)
    bodyUsed: false
    headers: Headers {}
    ok: false
    redirected: false
    status: 0
    statusText: ""
    type: "opaque"
    url: ""
    __proto__: Response
3rd logger
{}
I have tried many different JSON parsing, stringify, etc, to no avail.
The next confusing part, is if within the Chrome debugger I go to the "Network" tab, click on the /getJSON, it shows me the entire JSON object just fine in both the "Preview" and "Response" tabs. So clearly Chrome is connecting to it correctly. Here's Chrome's "Headers" tab within "Network":
Request URL:http://localhost:3010/getJSON
Request Method:GET
Status Code:200
Remote Address:[::1]:3010
Referrer Policy:no-referrer-when-downgrade
Response Headers
view source
Content-Type:application/json;charset=UTF-8
Date:Thu, 12 Oct 2017 16:05:05 GMT
Transfer-Encoding:chunked
Request Headers
view source
Accept:*/*
Accept-Encoding:gzip, deflate, br
Accept-Language:en-US,en;q=0.8
Connection:keep-alive
Host:localhost:3010
Referer:http://localhost:3000/
User-Agent:Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36
I have tried to mimic these headers in my request, but I'm not sure how mine differ. Any help would be greatly appreciated, as I am currently banging my head against the wall with this!
You're getting an opaque response, which tells me that you may not have completely resolved the CORS headers situation: with mode: 'no-cors', fetch returns an opaque response whose status and body cannot be read from JavaScript. If you're fetching from the client, I would suggest proxying the request through your Node.js server, so that instead of calling your Spring Boot service directly, you call Node, thus getting rid of the CORS issues.
EDIT
You could create something like this:
import express from 'express';
import request from 'request';

const router = express.Router();

router.get('/proxyname', (req, res) => {
    // Placeholder: replace with the URL of your service
    const requestUrl = [your service's endpoint];
    request(requestUrl, (err, apiResponse, body) => {
        if (err) {
            // The upstream service could not be reached
            res.status(502).send(err.message);
            return;
        }
        res.status(apiResponse.statusCode);
        try {
            res.json(JSON.parse(body));
        } catch (e) {
            // Not valid JSON: pass the body through unchanged
            res.send(body);
        }
    });
});

export default router;
and then on your nodejs server file, add it, like this:
import proxy from '[path to proxy file above]';
app.use('/path-to-endpoint', proxy);
and then call that from the client instead of your SpringBoot service.
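On the client side, note that a fetch Response has no data property; the body is read with response.json(), and only when the response is not opaque. A minimal sketch of a guard that makes this failure explicit, assuming the request is made without mode: 'no-cors' once the proxy (or proper CORS headers) is in place. assertReadable is a hypothetical helper name.

```javascript
// Throws a descriptive error when a fetch Response cannot be read,
// otherwise returns the response unchanged so it can be chained.
// `assertReadable` is a hypothetical helper name.
function assertReadable(response) {
  if (response.type === 'opaque') {
    throw new Error("Opaque response: the request used mode 'no-cors', so the body is not readable");
  }
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  return response;
}

// Usage in the browser (not executed here); the URL is the one from the question:
//   fetch('http://localhost:3010/getJSON')
//     .then(assertReadable)
//     .then(function (response) { return response.json(); })
//     .then(function (data) { console.log(data); })
//     .catch(function (error) { console.log(error); });
```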

AWS API Gateway UnrecognizedClientException with Generated Javascript SDK

I'm encountering a 403 status code with an UnrecognizedClientException in the x-amzn-errortype header of the response to my API Gateway GET request using the generated JavaScript SDK. The resource being called uses IAM auth, which differentiates the user's role based on their user group.
Here is my API client initialize function:
function initializeAPIClient(accessKey, secretKey, sessionToken) {
    var config = {
        region: region,
        accessKey: accessKey,
        secretKey: secretKey,
        sessionToken: sessionToken
    };
    apigClient = apigClientFactory.newClient(config);
}
Here is my GET request Function
function testCall() {
    var params = '';
    var body = '';
    var additionalParams = '';
    apigClient.testCallGet(params, body, additionalParams)
        .then(function (result) {
            alert("Permissions are available to this user.");
        })
        .catch(function (result) {
            alert("Permissions are NOT available to this user.");
        });
}
Here are my request headers:
:authority:[API_ENDPOINT]
:method:GET
:path:/[STAGE]/[RESOURCE]
:scheme:https
accept:application/json
accept-encoding:gzip, deflate, sdch, br
accept-language:en-US,en;q=0.8
authorization:AWS4-HMAC-SHA256 Credential=[ACCESS_KEY_ID]/20170406/[REGION]/execute-api/aws4_request, SignedHeaders=accept;host;x-amz-date, Signature=[SIGNATURE]
origin:http://localhost:8000
referer:http://localhost:8000/php/[PAGE].php/?username=[USERNAME]&sessionToken=[SESSION_TOKEN]
user-agent:Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36
x-amz-date:20170406T180808Z
x-amz-security-token:[SESSION_TOKEN]
I'm not sure what could be causing this. The solutions recommended when I search UnrecognizedClientException seem to suggest doing what I'm already doing.
I've solved my own issue, so here's the answer for anybody who runs into a similar logic error. Do NOT use the ID token as your session token, which is what I was doing. The ID token is what you use to obtain the session token, together with the access key and secret key. Do not confuse the two.
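A mix-up like this could be caught early with a heuristic check. The sketch below assumes the ID token is a JSON Web Token (three base64url segments joined by dots, as OpenID Connect ID tokens are) and that the session token issued with the temporary credentials does not have that shape. Both function names are hypothetical, and this is only a guard, not part of the generated SDK.

```javascript
// Heuristic: a JWT is three non-empty base64url segments joined by dots.
function looksLikeJwt(token) {
  return /^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$/.test(token);
}

// Hypothetical wrapper that builds the config object from the question
// and refuses a session token that looks like an ID token.
function buildApiConfig(region, accessKey, secretKey, sessionToken) {
  if (looksLikeJwt(sessionToken)) {
    throw new Error('sessionToken looks like a JWT (ID token), not a session token');
  }
  return {
    region: region,
    accessKey: accessKey,
    secretKey: secretKey,
    sessionToken: sessionToken
  };
}
```

The returned object can then be passed to apigClientFactory.newClient(config) as in the question.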

How to programmatically replicate a request found in Chrome Developer Tools?

I'm looking at my balance on Venmo.com but they only show you 3 months at a time and I'd like to get my entire transaction history.
Looking at the Chrome Developer Tools, under the network tab, I can see the request to https://api.venmo.com/v1/transaction-history?start_date=2017-01-01&end_date=2017-01-31 which returns JSON.
I'd like to programmatically iterate through time, make several requests, and aggregate all of the transactions. However, I keep getting 401 Unauthorized.
My initial approach was just using Node.js. I looked at the cookie in the request and copied it into a secret.txt file and then sent the request:
import fetch from 'node-fetch'
import fs from 'fs-promise'

async function main() {
    try {
        const cookie = await fs.readFile('secret.txt')
        const options = {
            headers: {
                'Cookie': cookie,
            },
        }
        try {
            const response = await fetch('https://api.venmo.com/v1/transaction-history?start_date=2016-11-08&end_date=2017-02-08', options)
            console.log(response)
        } catch(e) {
            console.error(e)
        }
    } catch(e) {
        console.error('please put your cookie in a file called `secret.txt`')
        return
    }
}
That didn't work, so I tried copying all of the headers over:
const cookie = await fs.readFile('secret.txt')
const options = {
    headers: {
        'Accept-Encoding': 'gzip, deflate, sdch, br',
        'Accept-Language': 'en-US,en;q=0.8',
        'Cache-Control': 'no-cache',
        'Connection': 'keep-alive',
        'Cookie': cookie,
        'Host': 'api.venmo.com',
        'Origin': 'https://venmo.com',
        'Pragma': 'no-cache',
        'Referer': 'https://venmo.com/account/settings/balance/statement?end=02-08-2017&start=11-08-2016',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36',
    },
}
try {
    const response = await fetch('https://api.venmo.com/v1/transaction-history?start_date=2016-11-08&end_date=2017-02-08', options)
    console.log(response)
} catch(e) {
    console.error(e)
}
This also did not work.
I even tried making the request from the console of the website and got a 401:
fetch('https://api.venmo.com/v1/transaction-history?start_date=2016-11-08&end_date=2017-02-08', {credentials: 'same-origin'}).then(console.log)
So my question here is this: I see a network request in Chrome Developer Tools. How can I make that same request programmatically? Preferably in Node.js or Python so I can write an automated script.
In the Network tab of the Chrome Developer Tools, right click the request and click "Copy" > "Copy as cURL (bash)". You can then either write a script using the curl command directly, or use https://curlconverter.com/ to convert the cURL command to Python, JavaScript, PHP, R, Go, Rust, Elixir, Java, MATLAB, Dart or JSON.
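Once a single request can be replayed, the iteration through time described in the question could be sketched as below. The endpoint and the start_date/end_date parameters are taken from the question; splitting by calendar month is an assumption based on the January example URL, and monthlyUrls is a hypothetical helper name. Authentication is not handled here, so each URL still needs the headers from the copied request.

```javascript
// Formats a Date as YYYY-MM-DD using its UTC fields.
function toIsoDate(date) {
  return date.toISOString().slice(0, 10);
}

// Splits the inclusive range [start, end] ('YYYY-MM-DD' strings) into
// consecutive calendar-month windows and returns one URL per window.
function monthlyUrls(start, end) {
  const base = 'https://api.venmo.com/v1/transaction-history';
  const last = new Date(end + 'T00:00:00Z');
  let cursor = new Date(start + 'T00:00:00Z');
  const urls = [];
  while (cursor <= last) {
    // Day 0 of the next month is the last day of the current month.
    const monthEnd = new Date(Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 0));
    const windowEnd = monthEnd < last ? monthEnd : last;
    urls.push(base + '?start_date=' + toIsoDate(cursor) + '&end_date=' + toIsoDate(windowEnd));
    // Continue on the first day of the following month.
    cursor = new Date(Date.UTC(cursor.getUTCFullYear(), cursor.getUTCMonth() + 1, 1));
  }
  return urls;
}

// Prints four URLs, covering November 2016 through February 2017.
console.log(monthlyUrls('2016-11-08', '2017-02-08'));
```

The JSON returned for each window can then be concatenated into a single list of transactions.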

What is wrong with this HTTP-Get Basic authentication code?

I am using node.js restify as backend to run a REST API server and angularjs as front-end to call the HTTP GET. The REST server uses HTTP Basic Authentication. The username is foo and password is bar.
I have tested that the backend code works by using a restify client. Here is the working client code;
var client = restify.createJsonClient({
url: 'http://127.0.0.1'
});
client.basicAuth('foo', 'bar');
client.get('/alert?list=alertList', function(err, req, res, obj) {
console.log(obj);
});
I have trouble getting my angularjs http-get code to work. Here is the relevant code;
.controller('ViewCtrl', ['$scope', '$http', '$base64',
    function ($scope, $http, $cookies, $base64) {
        var url = '127.0.0.1/alert?list=alertList';
        var auth = $base64.encode('foo:bar');
        $http.defaults.headers.common['Authorization'] = 'Basic ' + auth;
        $http.get(url).then(function (response) {
            tableData = response.data;
            //handle data
        });
    }
I cannot figure out what is wrong with the angularjs code. I am using restify authorizationParser. Are there any extra header requirements to get HTTP basic authentication working with restify authorizationParser?
The error message on the browser looks like this;
{
code: "NotAuthorized",
message: ""
}
In the chrome debugger, this is what I see;
Request Method:GET
Status Code:403 Forbidden
Remote Address:127.0.0.1:80
Request Headers
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding:gzip, deflate, sdch
Accept-Language:en-US,en;q=0.8,zh-CN;q=0.6,zh;q=0.4,zh-TW;q=0.2,ja;q=0.2
Cache-Control:max-age=0
Connection:keep-alive
Host:127.0.0.1
If-Modified-Since:Wed, 23 Dec 2015 02:22:04 GMT
Upgrade-Insecure-Requests:1
User-Agent:Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36
I am using this base64 angular module.
https://github.com/ninjatronic/angular-base64
EDIT: I discovered there is nothing wrong with the Angular code. The problem lies with the restify server. The restify server also acts as a static web server, and when HTTP basic authentication was enabled, this static web server stopped working.
Inside the controller, you can pass the authentication header like this:
var url = 'http://127.0.0.1/alert?list=alertList'; // include the scheme, otherwise the URL is treated as relative
var auth = $base64.encode('foo:bar');
var headers = {"Authorization": "Basic " + auth};
$http.get(url, {headers: headers}).then(function (response) {
    tableData = response.data;
    //handle data
});
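For reference, the header value that this code produces can be reproduced without the angular-base64 module. A minimal plain JavaScript sketch, assuming the user name and password contain only Latin-1 characters (toBase64 and basicAuthHeader are hypothetical helper names):

```javascript
// Base64-encodes a Latin-1 string: btoa where available, Buffer otherwise.
function toBase64(text) {
  if (typeof btoa === 'function') {
    return btoa(text);
  }
  return Buffer.from(text, 'latin1').toString('base64');
}

// Builds the value of the Authorization header for HTTP Basic authentication:
// the word "Basic", a space, then base64 of "username:password".
function basicAuthHeader(username, password) {
  return 'Basic ' + toBase64(username + ':' + password);
}

console.log(basicAuthHeader('foo', 'bar')); // Basic Zm9vOmJhcg==
```

Comparing this value with the Authorization header shown in the browser's network tab is a quick way to confirm that the header is actually being sent.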
