I am trying to get the links from a website. When I try to connect through the terminal I get this message: "You have to turn on javascript and cookies support in browser to visit this site". I have tried many different code samples from Stack Overflow and elsewhere on Google. None of them gets any data from this particular website. Other sites work.
<?php
function get_url( $url, $javascript_loop = 0, $timeout = 5 )
{
$url = str_replace( "&amp;", "&", urldecode(trim($url)) );
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_ENCODING, "" );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false ); # disables certificate verification (insecure); https urls do not require this
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
$content = curl_exec( $ch );
$response = curl_getinfo( $ch );
if($content === false) // check the result above instead of running the request a second time
{
echo 'Curl error: ' . curl_error($ch);
}
curl_close ( $ch );
if ($response['http_code'] == 301 || $response['http_code'] == 302)
{
ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
if ( $headers = get_headers($response['url']) )
{
foreach( $headers as $value )
{
if ( substr( strtolower($value), 0, 9 ) == "location:" )
return get_url( trim( substr( $value, 9, strlen($value) ) ) );
}
}
}
if ( ( preg_match("/>[[:space:]]+window\.location\.replace\('(.*)'\)/i", $content, $value) || preg_match("/>[[:space:]]+window\.location\=\"(.*)\"/i", $content, $value) ) &&
$javascript_loop < 5
)
{
return get_url( $value[1], $javascript_loop+1 );
}
else
{
return array( $content, $response );
}
}
$test = get_url('http://livefootball.ws');
print_r($test);
?>
If I switch out the URL to some other website, I get the result, but with this website it is not working. Any help would be appreciated.
Try also setting CURLOPT_COOKIEFILE, pointing to the same $cookie file, and make sure the server has permission to write to that file. That may take care of the cookie problem. As for the JavaScript problem, I think you're out of luck: cURL only fetches the page and does not execute JavaScript.
How to simulate that JavaScript is enabled with PHP Curl?
Related
I have some problem with PHP Curl and cookies authentication.
I have a file Connector.php which authenticates users on another server and returns the cookie of the current user.
The Problem is that I want to authenticate thousands of users with curl but it authenticates and saves COOKIES only for one user at a time.
The code for connector.php is this:
<?php
if(!count($_REQUEST)) {
die("No Access!");
}
//Core Url For Services
define ('ServiceCore', 'http://example.com/core/');
//Which Internal Service Should Be Called
$path = $_GET['service'];
//Service To Be Queried
$url = ServiceCore.$path;
//Open the Curl session
$session = curl_init($url);
// If it's a GET, put the GET data in the body
if ($_GET['service']) {
//Iterate Over GET Vars
$postvars = '';
foreach($_GET as $key=>$val) {
if($key!='service') {
$postvars.="$key=$val&";
}
}
curl_setopt ($session, CURLOPT_POST, true);
curl_setopt ($session, CURLOPT_POSTFIELDS, $postvars);
}
//Create And Save Cookies
$tmpfname = dirname(__FILE__).'/cookie.txt';
curl_setopt($session, CURLOPT_COOKIEJAR, $tmpfname);
curl_setopt($session, CURLOPT_COOKIEFILE, $tmpfname);
curl_setopt($session, CURLOPT_HEADER, false);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
curl_setopt($session, CURLOPT_FOLLOWLOCATION, true);
// EXECUTE
$json = curl_exec($session);
echo $json;
curl_close($session);
?>
Here is the process of authentication:
User enters username and password: Connector.php?service=logon&user_name=user32&user_pass=123
Connector.php?service=logosessionInfo returns info about the user based on the cookies saved earlier with logon service.
The problem is that this code saves the cookie in one file for each user and can't handle multiple user authentications.
You can specify the cookie file with a curl opt. You could use a unique file for each user.
// $uniqueFilename: a cookie file path that is different for each user
curl_setopt( $curl_handle, CURLOPT_COOKIESESSION, true );
curl_setopt( $curl_handle, CURLOPT_COOKIEJAR, $uniqueFilename );
curl_setopt( $curl_handle, CURLOPT_COOKIEFILE, $uniqueFilename );
The best way to handle it would be to stick your request logic into a curl function and just pass the unique file name in as a parameter.
function fetch( $url, $z=null ) {
$ch = curl_init();
$useragent = isset($z['useragent']) ? $z['useragent'] : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2';
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_POST, isset($z['post']) );
if( isset($z['post']) ) curl_setopt( $ch, CURLOPT_POSTFIELDS, $z['post'] );
if( isset($z['refer']) ) curl_setopt( $ch, CURLOPT_REFERER, $z['refer'] );
curl_setopt( $ch, CURLOPT_USERAGENT, $useragent );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, ( isset($z['timeout']) ? $z['timeout'] : 5 ) );
if( isset($z['cookiefile']) ) { // only touch cookies when a cookie file was passed in
curl_setopt( $ch, CURLOPT_COOKIEJAR, $z['cookiefile'] );
curl_setopt( $ch, CURLOPT_COOKIEFILE, $z['cookiefile'] );
}
$result = curl_exec( $ch );
curl_close( $ch );
return $result;
}
I use this for quick grabs. It takes the url and an array of options.
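As a rough sketch of the "unique file per user" idea, something like the following could generate the file name. The helper name cookie_file_for_user() and the "cookie_" prefix are made up for illustration, and hashing the user name is just one way to keep odd characters out of the path:

```php
<?php
// Hypothetical helper (not part of any library): maps a user name to its own cookie file path.
function cookie_file_for_user(string $userName, string $dir): string
{
    // Hash the name so characters such as "/" or ".." never end up in the file path.
    return rtrim($dir, '/') . '/cookie_' . sha1($userName) . '.txt';
}

// Each user gets a different file, so their sessions do not overwrite each other.
$jar = cookie_file_for_user('user32', sys_get_temp_dir());
```

The resulting path could then be passed as $z['cookiefile'] to fetch(), or set for both CURLOPT_COOKIEJAR and CURLOPT_COOKIEFILE.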
In working with a similar problem I created the following function after combining a lot of resources I ran into on the web, and adding my own cookie handling. Hopefully this is useful to someone else.
function get_web_page( $url, $cookiesIn = '' ){
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => true, //return headers in addition to content
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLINFO_HEADER_OUT => true,
CURLOPT_SSL_VERIFYPEER => true, // Validate SSL Certificates
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_COOKIE => $cookiesIn
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
$rough_content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header_content = substr($rough_content, 0, $header['header_size']);
$body_content = trim(substr($rough_content, $header['header_size'])); // cut by offset so body text is never altered
$pattern = "#Set-Cookie:\\s+(?<cookie>[^=]+=[^;]+)#mi"; // header names are case-insensitive
preg_match_all($pattern, $header_content, $matches);
$cookiesOut = implode("; ", $matches['cookie']);
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['headers'] = $header_content;
$header['content'] = $body_content;
$header['cookies'] = $cookiesOut;
return $header;
}
First, create a temporary cookie file using the tempnam() function:
$ckfile = tempnam ("/tmp", "CURLCOOKIE");
Then initialize cURL so that it saves the cookies to that temporary file:
$ch = curl_init ("http://uri.com/");
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec ($ch);
Or visit a page using the cookie stored in the temporary file:
$ch = curl_init ("http://somedomain.com/cookiepage.php");
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec ($ch);
This line makes cURL send the cookies stored in that file with the request:
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
Here you can find some useful info about cURL & cookies http://docstore.mik.ua/orelly/webprog/pcook/ch11_04.htm .
You can also use this well done method https://github.com/alixaxel/phunction/blob/master/phunction/Net.php#L89 like a function:
function CURL($url, $data = null, $method = 'GET', $cookie = null, $options = null, $retries = 3)
{
$result = false;
if ((extension_loaded('curl') === true) && (($curl = curl_init()) !== false)) // curl_init() returns an object, not a resource, since PHP 8.0
{
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FAILONERROR, true);
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
if (preg_match('~^(?:DELETE|GET|HEAD|OPTIONS|POST|PUT)$~i', $method) > 0)
{
if (preg_match('~^(?:HEAD|OPTIONS)$~i', $method) > 0)
{
curl_setopt_array($curl, array(CURLOPT_HEADER => true, CURLOPT_NOBODY => true));
}
else if (preg_match('~^(?:POST|PUT)$~i', $method) > 0)
{
if (is_array($data) === true)
{
foreach (preg_grep('~^@~', $data) as $key => $value)
{
$data[$key] = sprintf('@%s', rtrim(str_replace('\\', '/', realpath(ltrim($value, '@'))), '/') . (is_dir(ltrim($value, '@')) ? '/' : ''));
}
if (count($data) != count($data, COUNT_RECURSIVE))
{
$data = http_build_query($data, '', '&');
}
}
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
}
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, strtoupper($method));
if (isset($cookie) === true)
{
curl_setopt_array($curl, array_fill_keys(array(CURLOPT_COOKIEJAR, CURLOPT_COOKIEFILE), strval($cookie)));
}
if ((intval(ini_get('safe_mode')) == 0) && (ini_set('open_basedir', null) !== false))
{
curl_setopt_array($curl, array(CURLOPT_MAXREDIRS => 5, CURLOPT_FOLLOWLOCATION => true));
}
if (is_array($options) === true)
{
curl_setopt_array($curl, $options);
}
for ($i = 1; $i <= $retries; ++$i)
{
$result = curl_exec($curl);
if (($i == $retries) || ($result !== false))
{
break;
}
usleep(pow(2, $i - 2) * 1000000);
}
}
curl_close($curl);
}
return $result;
}
And pass this as $cookie parameter:
$cookie_jar = tempnam('/tmp','cookie');
You can define different cookies for every user with CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR. Create a different file for every user, so that each one has its own cookie-based session on the remote server.
The solutions described above, even with unique cookie file names, can cause a lot of problems at scale.
We had to serve a lot of authentications with this approach, and our server went down because of the high volume of file reads and writes.
Our solution was to use an Apache reverse proxy and avoid cURL requests altogether.
Details on how to set up a proxy with Apache can be found here:
https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html
I know what you are thinking: "there are many similar questions". But no, this one is different.
It is true that many questions ask for PHP code that obtains the final URL, and I based my code on those questions and answers, but it does not work in all cases.
I need a function that always returns the final URL (even if there are 1000 redirects), whether the redirection was done with JavaScript, PHP, Apache or some other technology.
In other words, my code returns the final URL only in some cases. I have read a lot about this without finding a solution. I have been stuck on this problem for a month and need your help.
My code is the following:
function get_real_url($comparator, $url){
$out = "";
$final_url = get_final_url($url);
if(strpos($final_url, 'url=') !== FALSE){ // If it returns a query string
parse_str($final_url, $out);
if(!empty($out["url"]))
return $out["url"];
else
return false;
}else
return $final_url; // If it returns a full URL
}
function get_final_url($url, $timeout = 5)
{
//$url = str_replace( "&amp;", "&", urldecode(trim($url)) );
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_ENCODING, "" );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
$content = curl_exec( $ch );
$response = curl_getinfo( $ch );
curl_close ( $ch );
if ($response['http_code'] == 301 || $response['http_code'] == 302)
{
ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
$headers = @get_headers($response['url']);
if(!$headers) return $url;
$location = "";
foreach($headers as $value)
{
if (substr(strtolower($value), 0, 9) == "location:")
return get_final_url( trim( substr( $value, 9, strlen($value) ) ) );
}
}
if(preg_match("/window\.location\.replace\('(.*)'\)/i", $content, $value) || preg_match("/window\.location\=\"(.*)\"/i", $content, $value))
{
return get_final_url($value[1]);
}
else
{
return $response['url'];
}
}
What technology should I use so that my code executes (if necessary) javascript redirection code and get the last url?
If you also want to check for browser-based redirects, you should not use PHP. This will get really tricky if you have to parse and evaluate not only JS code that is directly embedded in the markup, but also module-based code that is only loaded after running some JS.
Why not use something like Selenium or a headless browser for this?
I looked around a lot before posting this question, so my apologies if it is covered in another post. This is only my second question on here, so apologies too if I don't format it correctly.
I have a really simple web service that needs to take POST values and return a JSON-encoded array. That all worked fine until I was told I would need to post the form data with a content type of application/json. Since then I cannot return any values from the web service, and it is definitely something to do with how I am reading the POST values.
Basically in my local setup I have created a test page that does the following -
$curl = curl_init();
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Content-Length: ' . strlen($data))
);
curl_setopt($curl, CURLOPT_URL, 'http://webservice.local/'); // Set the url path we want to call
$result = curl_exec($curl);
//see the results
$json=json_decode($result,true);
curl_close($curl);
print_r($json);
On the webservice I have this (I have stripped out some of the functions) -
<?php
header('Content-type: application/json');
/* connect to the db */
$link = mysql_connect('localhost','root','root') or die('Cannot connect to the DB');
mysql_select_db('webservice',$link) or die('Cannot select the DB');
if(isset($_POST['action']) && $_POST['action'] == 'login') {
$statusCode = array('statusCode'=>1, 'statusDescription'=>'Login Process - Fail');
$posts[] = array('status'=>$statusCode);
header('Content-type: application/json');
echo json_encode($posts);
/* disconnect from the db */
}
@mysql_close($link);
?>
Basically I know that it is due to the $_POST values not being set, but I can't find what I need to use instead of $_POST. I tried json_decode($_POST), file_get_contents("php://input") and a number of other ways, but I was shooting in the dark a bit.
Any help would be greatly appreciated.
Thanks, Steve
Thanks Michael for the help, that was a definite step forward. I now at least get a response when I echo the post, even if it is null.
updated CURL -
$curl = curl_init();
curl_setopt($curl, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($curl, CURLOPT_URL, 'http://webservice.local/');
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_POSTFIELDS, json_encode($data));
updated php on the page that the data is posted to -
$inputJSON = file_get_contents('php://input');
$input= json_decode( $inputJSON, TRUE ); //convert JSON into array
print_r(json_encode($input));
As I say, at least I see a response now, whereas previously it was returning a blank page.
Your $_POST is empty. If your web server expects data in JSON format, you need to read the raw input and then parse it with json_decode().
You need something like that:
$json = file_get_contents('php://input');
$obj = json_decode($json);
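A slightly more defensive version of the same idea might look like the sketch below. The helper name decode_json_body() is made up for illustration, and the sample JSON string stands in for the real request body so the snippet can run on its own:

```php
<?php
// Hypothetical helper: decodes a raw JSON request body into an array.
// Returns null when the body is empty, is not valid JSON, or is not a JSON object/array.
function decode_json_body(string $raw): ?array
{
    if ($raw === '') {
        return null;
    }
    $data = json_decode($raw, true);
    if (json_last_error() !== JSON_ERROR_NONE || !is_array($data)) {
        return null;
    }
    return $data;
}

// In the web service the input would come from the request body:
// $input = decode_json_body(file_get_contents('php://input'));
$input = decode_json_body('{"action":"login"}');
if ($input !== null && ($input['action'] ?? '') === 'login') {
    echo "login requested\n";
}
```

Checking for null first means a malformed or empty body can be answered with an error instead of silently producing a blank page.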
Your code for testing the JSON communication is also wrong.
If you pass an array to CURLOPT_POSTFIELDS, curl encodes it as multipart/form-data; a string is sent as-is, by default with a Content-Type of application/x-www-form-urlencoded. You need to pass a JSON string here.
UPDATE
Your php code for test page should be like that:
$data_string = json_encode($data);
$ch = curl_init('http://webservice.local/');
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_POSTFIELDS, $data_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Content-Type: application/json',
'Content-Length: ' . strlen($data_string))
);
$result = curl_exec($ch);
$result = json_decode($result);
var_dump($result);
Also on your web-service page you should remove one of the lines header('Content-type: application/json');. It must be called only once.
Hello this is a snippet from an old project of mine that uses curl to get ip information from some free ip databases services which reply in json format. I think it might help you.
$ip_srv = array("http://freegeoip.net/json/$this->ip","http://smart-ip.net/geoip-json/$this->ip");
getUserLocation($ip_srv);
Function:
function getUserLocation($services) {
$ctx = stream_context_create(array('http' => array('timeout' => 15))); // 15 seconds timeout
for ($i = 0; $i < count($services); $i++) {
// Configuring curl options
$options = array (
CURLOPT_RETURNTRANSFER => true, // return web page
//CURLOPT_HEADER => false, // don't return headers
CURLOPT_HTTPHEADER => array('Content-type: application/json'),
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle compressed
CURLOPT_USERAGENT => "test", // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 5, // timeout on connect
CURLOPT_TIMEOUT => 5, // timeout on response
CURLOPT_MAXREDIRS => 10 // stop after 10 redirects
);
// Initializing curl
$ch = curl_init($services[$i]);
curl_setopt_array ( $ch, $options );
$content = curl_exec ( $ch );
$err = curl_errno ( $ch );
$errmsg = curl_error ( $ch );
$header = curl_getinfo ( $ch );
$httpCode = curl_getinfo ( $ch, CURLINFO_HTTP_CODE );
curl_close ( $ch );
//echo 'service: ' . $services[$i] . '</br>';
//echo 'err: '.$err.'</br>';
//echo 'errmsg: '.$errmsg.'</br>';
//echo 'httpCode: '.$httpCode.'</br>';
//print_r($header);
//print_r(json_decode($content, true));
if ($err == 0 && $httpCode == 200 && $header['download_content_length'] > 0) {
return json_decode($content, true);
}
}
}
You can put your JSON in a form parameter and send that, instead of sending the raw JSON as the request body:
$post_string = 'json_param=' . urlencode(json_encode($data));
//open connection
$ch = curl_init();
//set the url and the POST data
curl_setopt($ch, CURLOPT_URL, 'http://webservice.local/'); // Set the url path we want to call
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the response instead of printing it
//execute post
$result = curl_exec($ch);
//see the results
$json=json_decode($result,true);
curl_close($ch);
print_r($json);
on the service side you can get your json string as a parameter:
$json_string = $_POST['json_param'];
$obj = json_decode($json_string);
then you can use your converted data as object.
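To see why the JSON has to be URL-encoded when it travels as a form parameter, here is a small round-trip sketch without any network call. The sample $data array is made up, and parse_str() is used only as a stand-in for what PHP does when it fills $_POST:

```php
<?php
// Sample payload (hypothetical); note the "&" inside a value.
$data = array('action' => 'login', 'user' => 'a&b');

// Client side: http_build_query() URL-encodes the JSON, so "&" and "=" inside it survive.
$post_string = http_build_query(array('json_param' => json_encode($data)));

// Server side: parse_str() mimics how PHP populates $_POST from a form-encoded body.
parse_str($post_string, $received);
$decoded = json_decode($received['json_param'], true);
```

Without the encoding step, the "&" in the value would be read as a parameter separator and the JSON would arrive truncated.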
I'm working on a PHP/HTML script.
I want to get the current song of a Shoutcast stream with autorefresh when song changes.
Shoutcast doesn't permit HTTP request, so I'm using this script to get the page in another file:
<?php
$lurl=get_fcontent("http://5.135.39.189:8000/");
echo"cid:".$lurl[0]."<BR>";
function get_fcontent( $url, $javascript_loop = 0, $timeout = 5 ) {
$url = str_replace( "&amp;", "&", urldecode(trim($url)) );
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_ENCODING, "" );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false ); # disables certificate verification (insecure); https urls do not require this
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
$content = curl_exec( $ch );
$response = curl_getinfo( $ch );
curl_close ( $ch );
if ($response['http_code'] == 301 || $response['http_code'] == 302) {
ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
if ( $headers = get_headers($response['url']) ) {
foreach( $headers as $value ) {
if ( substr( strtolower($value), 0, 9 ) == "location:" )
return get_fcontent( trim( substr( $value, 9, strlen($value) ) ) );
}
}
}
if ( ( preg_match("/>[[:space:]]+window\.location\.replace\('(.*)'\)/i", $content, $value) || preg_match("/>[[:space:]]+window\.location\=\"(.*)\"/i", $content, $value) ) && $javascript_loop < 5) {
return get_fcontent( $value[1], $javascript_loop+1 );
} else {
return array( $content, $response );
}
}
?>
Then I use another file to read this one with an HTML DOM parser, but I can't find the correct string to read. What is a better way to do this?
It is much easier to request the current track title from a special Shoutcast page called 7.html:
http://5.135.39.189:8000/7.html
Split that data by commas and you will get the number of listeners, the bitrate and the current track title.
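A minimal sketch of that splitting step is below. It assumes the body of 7.html holds seven comma-separated fields in the order current listeners, stream status, peak listeners, max listeners, unique listeners, bitrate, song title; check that against what your server actually returns. The helper name parse_shoutcast_7html() and the sample line are made up for illustration:

```php
<?php
// Hypothetical helper; the field order is an assumption about the 7.html format.
function parse_shoutcast_7html(string $html): ?array
{
    // Limit the split to 7 parts so commas inside the song title stay in the title.
    $fields = explode(',', trim(strip_tags($html)), 7);
    if (count($fields) < 7) {
        return null; // not the expected format
    }
    return array(
        'listeners' => (int) $fields[0],
        'bitrate'   => (int) $fields[5],
        'title'     => html_entity_decode($fields[6]),
    );
}

// Made-up sample response; in practice $html would be the page fetched from /7.html.
$status = parse_shoutcast_7html('<html><body>12,1,50,100,11,128,Artist - Song, Pt. 2</body></html>');
echo $status['title'], "\n";
```

For the auto-refresh, the page could poll this every few seconds and update the display only when the title changes.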
I have tried file_get_contents and cURL, but neither seems to work on this website. I have used both on previous projects.
Website: https://colruyt.collectandgo.be/cogo/nl/zoeken?z=5030
Does anyone have a working solution? I have been searching and testing for hours now :).
With cURL I only get this response:
HTTP/1.1 200 OK Content-Length: 5395 Pragma: no-cache Cache-Control: no-cache Content-Type: text/html
Redirects to my own main domain name
I used this code:
<?php
function geturl($url){
(function_exists('curl_init')) ? '' : die('cURL Must be installed for geturl function to work. Ask your host to enable it or uncomment extension=php_curl.dll in php.ini');
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; CrawlBot/1.0.0)');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT , 5);
curl_setopt($ch, CURLOPT_TIMEOUT, 5);
//curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_ENCODING, "");
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false); # disables certificate verification (insecure); https urls do not require this
curl_setopt($ch, CURLOPT_MAXREDIRS, 15);
$html = curl_exec($ch);
$status = curl_getinfo($ch);
curl_close($ch);
if($status['http_code']!=200){
if($status['http_code'] == 301 || $status['http_code'] == 302) {
list($header) = explode("\r\n\r\n", $html, 2);
$matches = array();
preg_match("/(Location:|URI:)[^(\n)]*/", $header, $matches);
$url = trim(str_replace($matches[1],"",$matches[0]));
$url_parsed = parse_url($url);
return (isset($url_parsed))? geturl($url):'';
}
}
return $html;
}
echo geturl("https://colruyt.collectandgo.be/cogo/nl/zoeken?z=5030");
?>
Take a look at the request module for Node.js:
var request = require('request');
request('https://colruyt.collectandgo.be/cogo/nl/zoeken?z=5030', function (error, response, body) {
console.log(body)
})
This prints the body of the response.