Node.js: Remotely Submitting Forms - javascript

I'm currently working on a kind of web proxy for Node.js, but I'm having trouble submitting forms. On most sites the form submits successfully, but on some it does not, and I can't pinpoint what I'm doing wrong.
Is there a better way of doing this?
Also, how would I handle multipart forms using the Express.js bodyParser?
At the moment, this is what I have for form processing:
function proxy(req, res,request)
{
var sess = req.session;
var onUrl_Parse = function(url){
var Uri= new URI.URI(url);//Parses incoming url
var options = {
uri: url,
method: req.method
}
options.headers={"User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0", "Cookie":req.session.cook};
if(req.body) //If x-www-form-urlencoded is posted.
{
var options = {
uri: url,
method: req.method,
body: req.rawBody
}
options.headers={"User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0", "Cookie":req.session.cook, "Content-Type":"application/x-www-form-urlencoded"};
}
onRequestOptions(options, url);
}
,onRequestOptions = function(options, url)
{
request(options, function(error, response, body)
{
if(!error){
if(response.headers['set-cookie'])
req.session.cook=response.headers['set-cookie'];
Proxy_Parser.Parser(body, url, async, onParse);// Parses returned html return displayable content
}
});
}
,onParse = function(HTML_BODY)
{
if(HTML_BODY=="")
return res.end(); // nothing to send; return so res.write() is not called after end()
res.write(HTML_BODY);
res.end();
console.log("DONEEEEE");
}
Url_Parser.Url(req, URI, onUrl_Parse);
}

I am not sure exactly what you are trying to accomplish, but https://github.com/felixge/node-formidable is recommended in any case for parsing form data, including multipart uploads.

I would start with something like node-http-proxy. All the hard work is done for you and you can just define the routes you want to proxy and put in some handlers for the custom response info.
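One possible cause of the forms that fail, sketched under assumptions: response.headers['set-cookie'] is an array of full Set-Cookie lines (with attributes such as Path and HttpOnly), while an outgoing Cookie header should carry only the name=value pairs. In addition, hardcoding Content-Type drops the boundary that multipart forms need. The helper names toCookieHeader and buildForwardHeaders below are hypothetical and not part of Express or request:

```javascript
// Hypothetical helpers (not part of Express or request).

// Turn stored Set-Cookie lines into a value usable in a Cookie request header.
// Attributes such as Path, Expires and HttpOnly are dropped; only name=value is kept.
function toCookieHeader(setCookie) {
  const lines = Array.isArray(setCookie) ? setCookie : [setCookie];
  return lines
    .filter((line) => typeof line === 'string' && line.length > 0)
    .map((line) => line.split(';')[0].trim())
    .join('; ');
}

// Build outgoing headers, forwarding the browser's Content-Type verbatim so that
// a multipart boundary (if any) survives.
function buildForwardHeaders(incomingHeaders, storedSetCookie) {
  const headers = {
    'User-Agent': incomingHeaders['user-agent'] || 'Mozilla/5.0',
  };
  if (incomingHeaders['content-type']) {
    headers['Content-Type'] = incomingHeaders['content-type'];
  }
  const cookie = toCookieHeader(storedSetCookie);
  if (cookie) {
    headers.Cookie = cookie;
  }
  return headers;
}
```

In the proxy from the question this could be used as options.headers = buildForwardHeaders(req.headers, req.session.cook), together with forwarding the raw request body unchanged.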

Related

react native fetch not getting the same content as post man

I'm having a small problem with a request for the HTML of https://readnovelfull.com/beauty-and-the-beast-wolf-hubby-xoxo/chapter-1-i-would-not-be-responsible.html, as an example.
I can get all the HTML from the other URLs, e.g. novel details, latest updates, etc.,
but not when I request the details for the chapters.
I tested those URLs in Postman and also on https://codebeautify.org/source-code-viewer, and there is no problem getting the chapter content, which exists under the div #chr-content.
So I am a bit lost now. What am I doing wrong?
Here is my fetch call, which works on other novel sites.
static async getHtml(
url: string
): Promise<HTMLDivElement> {
console.log(`Sending html request to ${url}`);
var container = parse('<div>test</div>') as any;
try {
let headers = new Headers({
Accept: '*/*',
'User-Agent':
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'
});
var data = await fetch(url, {
method: 'GET',
headers: headers,
});
if (!data.ok) {
const message = `An error has occurred: ${data.status}`;
console.log(message);
} else {
var html = await data.text();
console.log('Data is ok. proceed to parse it');
container = parse('<div>' + html + '</div>') as any;
}
} catch (e) {
console.log(e);
}
return container as HTMLDivElement;
}
I should mention that I am not getting any error whatsoever; it's just that the HTML I get is not the same as what Postman and the other site get.
Update
OK, so I did some research on the site and this is what I came up with.
The site needs an X-CSRF-TOKEN, and I was able to extract these values:
const csrf = 'x09Q6KGqJOJJx2iHwNQUa_mYfG4neV9EOOMsUBKTItKfNjSc0thQzwf2HvCR7SQCqfIpC2ogPj18jG4dQPgVtQ==';
const id = 774791;
I need to send these in a request to https://readnovelfull.com/ajax/increase-chapter-views, which responds with true/false.
I then tried including the CSRF token in my fetch call, but the result is the same: no data.
Any idea whether I am still doing something wrong?
It looks like you have an issue with CORS. To make sure, try sending the request through a CORS proxy. A quick way to do that is to prefix the URL:
https://cors-anywhere.herokuapp.com/https://readnovelfull.com/beauty-and-the-beast-wolf-hubby-xoxo/chapter-1-i-would-not-be-responsible.html
NOTE: Using this CORS proxy in production is not recommended, because it is not secure.
If you receive the data after that, you are facing a CORS issue and need to figure out how to solve it in your specific case.
Reproducible example:
const parse = (str) => str;
const getHtml = async (url) => {
console.log(`Sending html request to ${url}`);
var container = parse('<div>No content =(</div>')
try {
let headers = new Headers({
Accept: '*/*',
'User-Agent':
'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36'
});
var data = await fetch(url, {
method: 'GET',
headers: headers,
});
if (!data.ok) {
const message = `An error has occurred: ${data.status}`;
console.log(message);
} else {
var html = await data.text();
console.log('Data is ok. proceed to parse it');
container = parse('<div>' + html + '</div>');
}
} catch (e) {
console.log(e);
}
return container;
}
getHtml('https://cors-anywhere.herokuapp.com/https://readnovelfull.com/beauty-and-the-beast-wolf-hubby-xoxo/chapter-1-i-would-not-be-responsible.html').then(htmlContent => document.querySelector('div').innerHTML = htmlContent);
<div>loading...</div>
If it doesn't help, please provide a reproducible RN example, but I believe there is no difference between RN and web environments in that case.

Using rejectUnauthorized with node-fetch in node.js

I currently use request to make HTTP requests in Node.js. At some point I encountered errors indicating UNABLE_TO_GET_ISSUER_CERT_LOCALLY. To get around that, I set rejectUnauthorized to false. My working code with request looks like this:
var url = 'someurl';
var options = {
url: url,
port: 443,
// proxy: process.env.HTTPS_PROXY, -- no need to do this as request honors env vars
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
'Accept-Language': 'en-us',
'Content-Language': 'en-us'
},
timeout: 0,
encoding: null,
rejectUnauthorized: false // added this to prevent the UNABLE_TO_GET_ISSUER_CERT_LOCALLY error
};
request(options, function (err, resp, body) {
if (err) reject(err);
else resolve(body.toString());
});
I thought I would try switching to the fetch api using async/await and am now trying to use node-fetch to do the same thing. However, when I do the same thing I am back to the UNABLE_TO_GET_ISSUER_CERT_LOCALLY errors. I read that I needed to use a proxy agent and tried using the proxy-agent module but I am still not having any luck.
Based off of the post https://github.com/TooTallNate/node-https-proxy-agent/issues/11 I thought the following would work:
var options = {
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; Trident/7.0; rv:11.0) like Gecko',
'Accept-Language': 'en-us',
'Content-Language': 'en-us'
},
timeout: 0,
encoding: null
};
var proxyOptions = nodeurl.parse(process.env.HTTPS_PROXY);
proxyOptions.rejectUnauthorized = false;
options.agent = new ProxyAgent(proxyOptions);
const resp = await fetch('someurl', options);
return await resp.text();
but I still get the same error. So far the only way I've been able to get around this using node-fetch is to set NODE_TLS_REJECT_UNAUTHORIZED=0 in my environment, which I don't really want to do. Can someone show me how to get rejectUnauthorized to work with node-fetch (presumably using an agent, but I honestly don't care how, as long as it's specified as part of the request)?
This is how I got this to work using rejectUnauthorized and the Fetch API in a Node.js app.
Keep in mind that setting rejectUnauthorized to false is dangerous: it disables certificate verification instead of fixing the problematic certificate, which exposes you to security risks such as man-in-the-middle attacks.
const fetch = require("node-fetch");
const https = require('https');
const httpsAgent = new https.Agent({
rejectUnauthorized: false,
});
async function getData() {
const resp = await fetch(
"https://myexampleapi.com/endpoint",
{
agent: httpsAgent,
},
)
const data = await resp.json()
return data
}
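If the error comes from a CA that Node does not know about (for example a corporate root certificate, which is an assumption about your setup), a less risky variant of the agent approach is to keep verification on and add that CA instead. A minimal sketch; makeTrustedAgent is a hypothetical helper name and the .pem path in the usage comment is a placeholder:

```javascript
const https = require('https');
const tls = require('tls');

// Hypothetical helper: returns an agent that still verifies certificates,
// but also trusts one extra CA certificate (PEM string or Buffer).
// Passing `ca` replaces Node's default list, so the bundled roots are re-added.
function makeTrustedAgent(extraCaPem) {
  return new https.Agent({
    ca: [...tls.rootCertificates, extraCaPem],
    rejectUnauthorized: true, // the default, stated explicitly
  });
}

// Usage with node-fetch (the path is a placeholder):
// const agent = makeTrustedAgent(require('fs').readFileSync('/path/to/corp-root.pem'));
// const resp = await fetch('https://myexampleapi.com/endpoint', { agent });
```

Node also reads additional CA certificates from the NODE_EXTRA_CA_CERTS environment variable at startup, which avoids code changes altogether.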
Use proxy
Be aware that, at the time of writing, the latest version of node-https-proxy-agent has a problem and does not work with fetch. Older versions (3.x and below) do work. Alternatively, you can use the node-tunnel module (https://www.npmjs.com/package/tunnel), or the wrapper module proxy-http-agent (https://www.npmjs.com/package/proxy-http-agent), which is based on node-tunnel. The wrapper detects the proxy's protocol automatically, so one method covers every case, and it offers more options. Both support http and https.
You can see the usage, and a good example of building and setting up a proxy, in this module and repo (check the tests):
https://www.npmjs.com/package/net-proxy
https://github.com/Glitnirian/node-net-proxy#readme
Example:
import { ProxyServer } from 'net-proxy';
import { getProxyHttpAgent } from 'proxy-http-agent';
// ...
// __________ setting the proxy
const proxy = new ProxyServer({
port: proxyPort
});
proxy.server.on('data', (data: any) => { // accessing the server instance
console.log(data);
});
await proxy.awaitStartedListening(); // await server to start
// After server started
// ______________ making the call through the proxy to a server through http:
let proxyUrl = `http://localhost:${proxyPort}`; // Protocol from the proxy is automatically detected
let agent = getProxyHttpAgent({
proxy: proxyUrl,
endServerProtocol: 'http:' // the end server protocol (http://localhost:${localApiServerPort} for example)
});
const response = await fetch(`http://localhost:${localApiServerPort}`, {
method: 'GET',
agent
});
// ___________________ making a call through the proxy to a server through https:
agent = getProxyHttpAgent({
proxy: proxyUrl, // proxy as url string! We can use an object (as tunnel module require too)
rejectUnauthorized: false // <==== here it go
});
const response2 = await fetch(`https://localhost:${localApiHttpsServerPort}`, {
method: 'GET',
agent
});
You can see more examples and details in the docs here:
https://www.npmjs.com/package/proxy-http-agent
You can also use node-tunnel directly; proxy-http-agent is just a thin wrapper that makes it simpler.
Add rejectUnauthorized
For those who are not familiar with it:
As per this thread
https://github.com/node-fetch/node-fetch/issues/15
we use https.Agent to pass the rejectUnauthorized parameter:
const agent = new https.Agent({
key: fs.readFileSync(`${CERT_PATH}.key`),
cert: fs.readFileSync(`${CERT_PATH}.crt`),
rejectUnauthorized: false
})
A complete example
import https from "https";
const agent = new https.Agent({
rejectUnauthorized: false
});
fetch(myUrl, { agent });
For fetch you can also use an environment variable, as follows:
process.env.NODE_TLS_REJECT_UNAUTHORIZED = "0";
This sets it globally rather than per call, which may be more appropriate if every call goes through the same proxy, for example when sitting behind a company proxy.
Why
By default, node-fetch and most other HTTP clients verify that the server presents a valid SSL certificate when using https.
To disable this behavior, we need to turn that check off.
How to do so depends on the library.
For fetch, that is how it is done.
With https.request (the underlying API):
const https = require('https');
const options = {
hostname: 'encrypted.google.com',
port: 443,
path: '/',
method: 'GET',
rejectUnauthorized: false /// <<<== here
};
const req = https.request(options, (res) => {
console.log('statusCode:', res.statusCode);
console.log('headers:', res.headers);
res.on('data', (d) => {
process.stdout.write(d);
});
});
req.on('error', (e) => {
console.error(e);
});
req.end();
check this:
https://nodejs.org/api/https.html#https_https_request_url_options_callback
Also it's part of tls.connect Options
Which you can check here
https://nodejs.org/api/tls.html#tls_tls_connect_options_callback

How to send GET request without downloading response content using node-requests?

I'm currently learning Node and I'm looking for an HTTP library that would allow me to send a GET request without downloading the server response content (body).
I need to send a very large number of HTTP requests every minute. However, I do not need to read their content, and I want to save bandwidth. I can't use HEAD for this purpose.
Is there any way to avoid downloading the response body using node-request, or perhaps any other library that could be used?
My sample code using node-request:
const options = {
url: "https://google.com",
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'
}
}
//How to avoid downloading a whole response?
function callback(err, response, body) {
console.log(response.request.uri.host + ' - ' + response.statusCode);
}
request(options, callback);
By the standard, an HTTP GET fetches the file content. You cannot stop the server from sending the response, but you can ignore it, which is basically what you are already doing.
request(options, (err, response, body)=>{
//just return from here don't need to process anything
});
EDIT1:
To just use some bytes of the response, you can use http.get and get the data using the data event. From the doc:
http.get('http://nodejs.org/dist/index.json', (res) => {
res.setEncoding('utf8');
let rawData = '';
res.on('data', (chunk) => { rawData += chunk; });
res.on('end', () => {
//this is when the response will end
});
}).on('error', (e) => {
console.error(`Got error: ${e.message}`);
});
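If the aim is to stop as early as possible rather than just ignore the body, one option (a sketch, not a guarantee of zero body bytes) is to read the status in the response callback and then destroy the response, which closes the socket. Data already in flight may still arrive, and the closed connection cannot be reused for keep-alive. statusOnly is a hypothetical helper name:

```javascript
const http = require('http');
const https = require('https');

// Hypothetical helper: resolves with the status code as soon as the response
// headers arrive, then tears down the connection instead of reading the body.
function statusOnly(url) {
  return new Promise((resolve, reject) => {
    const client = url.startsWith('https:') ? https : http;
    const req = client.get(url, (res) => {
      const status = res.statusCode;
      res.on('error', () => {}); // ignore errors caused by our own teardown
      res.destroy(); // close the socket; the rest of the body is not read
      resolve(status);
    });
    req.on('error', reject);
  });
}

// Usage:
// statusOnly('https://google.com').then((code) => console.log(code));
```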

Node JS and making external web calls successfully?

Hi, I am starting to learn Node.js and am in the middle of creating an application. The current goal is to call a website through Node, get an authentication token, then call that website again with a POST payload that includes my login info and the auth token.
I have written the same program in Python and I get a 200 response, whereas in Node.js I get a 302.
I believe that has a quick solution; the main problem, I guess, is my lack of understanding of Node.js, namely:
1. Whether I am supposed to nest these request calls inside one another because they are supposed to be part of the same 'session', and
2. If so, how do I reach the last URL, example.com/poll, and store/modify that information (which is just JSON)? If I go to the example.com/poll URL in a browser, the browser automatically downloads a file containing the JSON instead of displaying it. I need to save that data in a string or similar, not download it.
In Python I do this (create a session, then make the two calls):
url = "https://example.com/"
session = requests.session()
first_req = session.get(url)
auth_token_str = re.search(XXX, first_req.text)
login_url = 'https://example.com/sessions'
payload = { 'session[username_or_email]' : 'username', 'session[password]' : 'password', 'redirect_after_login':'/', 'authenticity_token': authenticity_token }
login_req = session.post(login_url, data=payload, headers=user_agent)
print "login_req response: ", login_req.status_code  # gets me 200
Then in Node.js:
var initLoad = {
method: 'GET',
url: 'https://example.com/',
headers: {
'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36'
}
};
request(initLoad, function(error, response, body) {
if (error) throw new Error(error);
var $ = cheerio.load(body, {xmlMode: false});
var authenticityToken = $("input[name=authenticity_token]").val();
console.log(authenticityToken);
var options = {
method: 'POST',
url: 'https://example.com/sessions',
headers: response.headers,
form: {
'session[username_or_email]': 'someUsername',
'session[password]': 'somePassword',
redirect_after_login: '/',
authenticity_token: authenticityToken
}
};
request(options, function(error, response2, body2) {
if (error) throw new Error(error);
console.log(response2.statusCode); //gets me 302 not 200
var analytics_url = 'https://example.com/poll';
var tripleload = {
method: 'GET',
url: analytics_url,
headers: response2.headers
};
request(tripleload, function(error, response3, body3) {
if (error) throw new Error(error);
res.end(body3);
});
});
});
302 is a temporary redirect. You are probably getting it because you are being redirected to an error page (served to your server, in this case). Something in this call is wrong; maybe the URL is wrong if it is generated like this.
Your code is messy because you are new to Node and because you use request, which is bare-bones and offers little comfort for writing this kind of thing.
Use something like Axios (https://github.com/mzabriskie/axios) to make requests like this easier to write.
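As a side note: Python's requests session stores cookies and follows the redirect after the login POST, whereas request does neither for a POST unless asked, which would be consistent with seeing 302 instead of 200. A hedged sketch of the login options using request's jar and followAllRedirects options; buildLoginOptions is a hypothetical helper, and the credentials are the placeholders from the question:

```javascript
// Hypothetical helper: builds options for the login POST made with `request`.
// `jar` should be the same cookie jar used for the first GET (e.g. request.jar()).
function buildLoginOptions(authenticityToken, jar) {
  return {
    method: 'POST',
    url: 'https://example.com/sessions',
    jar, // send and store cookies across calls
    followAllRedirects: true, // also follow 3xx responses to non-GET requests
    headers: { 'User-Agent': 'Mozilla/5.0' },
    form: {
      'session[username_or_email]': 'someUsername',
      'session[password]': 'somePassword',
      redirect_after_login: '/',
      authenticity_token: authenticityToken,
    },
  };
}

// Usage sketch:
// const jar = request.jar();
// request({ url: 'https://example.com/', jar }, (err, res, body) => {
//   const token = cheerio.load(body)('input[name=authenticity_token]').val();
//   request(buildLoginOptions(token, jar), (err2, res2) => console.log(res2.statusCode));
// });
```

The calls still need to be nested (or chained with promises), since the token is only known after the first response. Passing response.headers from one response as the headers of the next request, as in the question, is probably not what the server expects.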

How to get host name in http response in node.js?

I am using the http module of Node.js to make requests. I have a bunch of URLs in a database; I fetch them and make the requests in a loop. When a response comes back, I want to get the host name for that response, because I want to update something in the database based on it. But I cannot tell which site a response belongs to, so I am unable to update the record for that site.
The code is something like this:
for (site = 0; site < no_of_sites; site++) {
options = {
hostname: sites[site].name,
port: 80,
method: 'GET',
headers: {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0'
}
};
var req = http.request(options, function (res) {
console.log('HEADERS: ' + JSON.stringify(res.headers));
if (res.statusCode == 200) {
//Update record;
}
});
}
Inside the callback, this refers to the request object, so the host can be read from its raw header string:
console.log(this._header.match(/Host\:(.*)/g));
Option one: use res.req
var req = http.request(options, function (res) {
console.log(res.req._headers.host)
});
Option two: use a closure
for (site = 0; site < no_of_sites; site++) {
(function(){
var options = {
// ...
};
var req = http.request(options, function (res) {
// options available here
console.log(options);
});
}());
}
Option three:
It seems this is the same as res.req in the http.request() callback, but I'm not completely sure.
The answer is
console.log(res.socket._httpMessage._headers.host);
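A variation on option two that avoids the private _headers field, which is deprecated in newer Node.js versions: do each request inside its own function call, so the callback closes over that call's site. A minimal sketch; checkSite and the optional site.port field are hypothetical, and req.getHeader is the public accessor on the request object:

```javascript
const http = require('http');

// Hypothetical helper: one call per site, so `site` is captured per request
// and cannot be overwritten by later loop iterations.
function checkSite(site, onResult) {
  const options = {
    hostname: site.name,
    port: site.port || 80, // site.port is an assumed, optional field
    method: 'GET',
  };
  const req = http.request(options, (res) => {
    res.resume(); // discard the body so the socket is released
    // site.name is the captured value; req.getHeader('host') is what was sent
    console.log(site.name, req.getHeader('host'), res.statusCode);
    onResult(site.name, res.statusCode);
  });
  req.on('error', (err) => onResult(site.name, undefined, err));
  req.end();
}

// Usage:
// sites.forEach((site) => checkSite(site, (name, status) => { /* update record for name */ }));
```

Note that the original loop never calls req.end(), so its requests would not actually be sent.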
