Node.js can't make request to zomato.com - javascript

I would like to make a request to https://zomato.com/, but there is no response: I can connect anywhere else, yet with zomato I get a timeout error every time. I tried setting a user-agent, but it didn't help. I'm using Node 6.6.0 and request 2.79.0. Any ideas?
var request = require('request');
var cheerio = require('cheerio');
var fs = require('fs');
var http = require('http');

request.get({
    url: 'http://zomato.com/',
    headers: {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
    }
}, function(error, response, body) {
    if (error) {
        console.log("Error: " + error);
        return;
    } else {
        console.log("Status code: " + response.statusCode);
    }
});
Update:
I've noticed that this:
curl -X GET "https://zomato.com/"
returns a 301 redirect.
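For what it's worth, curl can follow that redirect chain itself with the -L flag, which shows where the site ultimately ends up:
curl -L "https://zomato.com/"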

I had some problems trying to do something similar with some websites. Try NightmareJS instead of request.
I haven't tested it against zomato, but here is the code I used for another website:
var Nightmare = require('nightmare');
var cheerio = require('cheerio');

var website = new Nightmare()
    .useragent("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.111 Safari/537.36")
    .goto('http://zomatoorwhateverwebsite.com/')
    .evaluate(function () {
        // Runs inside the page context; returns the rendered HTML
        return document.documentElement.innerHTML;
    })
    .end()
    .then(function (html) {
        var $ = cheerio.load(html);
        // Do what you need here
    });
I hope this helps. Sometimes you need to add a wait() call; check the documentation for the extra functions.
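For example, wait() accepts either a delay in milliseconds or a CSS selector to block on before continuing; the selector below is hypothetical:

var Nightmare = require('nightmare');

new Nightmare()
    .goto('http://zomatoorwhateverwebsite.com/')
    .wait(2000)        // wait a fixed two seconds
    .wait('#content')  // or wait until a (hypothetical) element appears
    .evaluate(function () {
        return document.documentElement.innerHTML;
    })
    .end()
    .then(function (html) {
        console.log(html.length);
    });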

If you look at the output of curl zomato.com -v, you can see that we are being redirected:
HTTP/1.1 301 Moved Permanently
So we need to add:
followAllRedirects: true,
Here:
request.get({
    url: 'http://zomato.com/',
    followAllRedirects: true,
    headers: {
        'user-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36'
    }
}, function(error, response, body) {
    console.log(error || "Status code: " + response.statusCode);
});

Related

Empty list when running request

I have a simple HTTP request in JavaScript that uses the await feature; however, when I run the function I get the following result:
{}
undefined
I have tested the same headers and URL with Python, where the request works successfully. Am I missing something here?
import pkg from 'superagent';
const { get } = pkg;

const header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.4 Safari/605.1.15'
};
const url = 'https://books.toscrape.com';

class Agent {
    constructor(url, headers) {
        this.url = url;
        this.headers = headers;
    }

    getAgent = async () => {
        const res = await get(this.url)
            .set(this.headers);
        return res.body;
    }
}

const request = new Agent(url, header);
console.log(await request.getAgent());
For comparison, this works successfully in Python:
import requests

header = {
    'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.4 Safari/605.1.15'
}
url = 'https://books.toscrape.com'
print(requests.get(url, headers=header).content)
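A likely cause (an assumption based on superagent's documented behaviour, not something confirmed in this thread) is that superagent only populates res.body for content types it can parse, such as JSON or form data; an HTML page leaves res.body as {} and puts the markup in res.text. Rewriting the getAgent method above along these lines would return the page:

getAgent = async () => {
    const res = await get(this.url)
        .set(this.headers);
    // res.body only holds parsed content (e.g. JSON); HTML lands in res.text
    return res.text;
}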

How to fetch and get response header with no cors?

I'm trying to fetch a file-hosting page from the browser. I disabled CORS web security with the extension https://chrome.google.com/webstore/detail/allow-cors-access-control/lhobafahddgcelffkeicbaginigeejlf?hl=en and tested it: fetch returns the HTML, but the problem is that I don't get the response headers, like the cookie; it only gives me four keys. I need that response cookie header to do the next fetch.
const initHeader = new Headers()
initHeader.append('accept', '*/*')
initHeader.append('connection', 'keep-alive')
initHeader.append('user-agent', 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.127 Safari/537.36')

fetch('https://www56.zippyshare.com/v/d0M2gx7X/file.html', {
    method: 'GET',
    headers: initHeader
}).then(function(res) {
    res.headers.forEach((val, key) => {
        console.log(key + ': ' + val)
    })
})
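Worth knowing: browsers never expose Set-Cookie to JavaScript. It is a forbidden response header in the Fetch spec, so no extension or CORS setting will surface it. If you need the cookie for a follow-up request, one option is to make the request outside the browser, e.g. in Node, where all headers are visible (a sketch, reusing the URL from the question):

const https = require('https')

https.get('https://www56.zippyshare.com/v/d0M2gx7X/file.html', (res) => {
    // Node exposes every response header, including Set-Cookie (as an array)
    console.log(res.headers['set-cookie'])
    res.resume() // drain the body so the socket is freed
})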

set user agent in a node request

I'm trying to set a user agent on an npm request. Here is the documentation, but my code gives the following error:
Error: Invalid URI "/"
const request = require('async-request')

const run = async (url) => {
    const {statusCode} = await request(url)
    console.log(statusCode) // 200, works

    const options = {
        url,
        headers: {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'
        }
    }
    await request(options) // Error: Error: Invalid URI "/"
}

run('https://github.com/')
I also tried request.get, as mentioned here, but it gives a "request.get is not a function" error.
The problem is that you are looking at the documentation of request but using async-request, which doesn't support being called with an object argument the way you do.
const request = require('async-request')

const run = async (url) => {
    const {statusCode} = await request(url)
    console.log(statusCode) // 200, works

    const options = {
        headers: {
            'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.122 Safari/537.36'
        }
    }
    const res = await request(url, options)
    console.log(res.statusCode) // 200, works
}

run('https://github.com/')

Automating Sign Up

I'm trying to log in to Amazon via the Node.js request module and seem to be having difficulties.
My aim is to log in to the site via their form. Here is my code:
const request = require("request");
const rp = require("request-promise");
var querystring = require("querystring");
var cookieJar = request.jar();
var mainUrl = "https://www.amazon.com/";
var loginUrl = "https://www.amazon.co.uk/ap/signin";
let req = request.defaults({
headers: {
"User-Agent":
"Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.61 Safari/537.36"
},
jar: cookieJar,
gzip: true,
followAllRedirects: true
});
var loginData =
"email=email#me.com&create=0&password=password123";
req.post(loginUrl, { data: loginData }, function(err, res, body) {
console.log(body);
});
I ran a debugger in the background and found this seemed to be the URL being called. I'm wondering if anyone knows what I may have done incorrectly.
Thank you.
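One detail that stands out (an observation about the request API, not a verified fix for Amazon's sign-in flow): request has no data option, so the loginData string above is never sent as the POST body. Form-encoded bodies go in form, which also sets the Content-Type header for you:

req.post(loginUrl, {
    // 'form' is request's option for application/x-www-form-urlencoded bodies;
    // the field names mirror the question and are not verified against Amazon's form
    form: { email: "email#me.com", create: "0", password: "password123" }
}, function(err, res, body) {
    console.log(body);
});

Bear in mind that Amazon's sign-in page also contains hidden form fields that typically need to be scraped from the page and sent back along with the credentials.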

Web scraping in Node.js

Lately I've been trying to scrape information from a website using Node.js, the request module, and cheerio. The code works well (statusCode = 200) on my localhost (127.0.0.1), but when I push it to the Heroku server, statusCode = 403.
Is it because of a cookie? If so, why does it work on my localhost, which doesn't add any cookie to the request?
request({
    method: 'GET',
    headers: {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
    },
    url: 'https://www.example.com/login',
    json: true
}, (err, response, body) => {
    if (err) {
        return console.log('Failed to request: ', err);
    }
    console.log(response.statusCode);
});
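A 403 that only shows up on the hosted box usually means the target server is rejecting the datacenter IP rather than anything in the request itself. One way to narrow it down (a debugging sketch, not a fix) is to inspect the headers of the 403 response, since they often reveal a CDN or WAF doing the blocking:

request({
    method: 'GET',
    headers: {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36'
    },
    url: 'https://www.example.com/login'
}, (err, response, body) => {
    if (err) {
        return console.log('Failed to request: ', err);
    }
    if (response.statusCode === 403) {
        // e.g. a 'server: cloudflare' or 'cf-ray' header points at a WAF/CDN block
        console.log(response.headers);
    }
});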
