Where does document.referrer come from?

Where does document.referrer come from? - javascript

I have the following script at http://localhost/test.html:
<script>
alert(document.referrer);
</script>
If I access it directly the result is an empty alert, which isn't surprising.
If I link from a different document at http://example.com/different.html, the alert will be that URL, again, not surprising.
What is suprising to me is that, if I intercept the HTTP request and change the Referer Header:
GET /test.html HTTP/1.1
Host: localhost
Referer: test
Then the alert will still alert the original URL, not test.
So where does document.referrer come from if not from the referer HTTP Header? Is it not influenced by the HTTP request at all? Is there a standard for this, or do different browsers handle it differently? And is there a way to influence it, without creating a new file linking to the code myself?

Referrer header your have intercepted is the request done by the client to the server. The client already know which is the referring page, you cannot fool it.

Per MDN documentation:
document.referrer:
Returns the URI of the page that linked to this page.
Further notes on why it displays empty to you:
The value is an empty string if the user navigated to the page
directly (not through a link, but, for example, via a bookmark). Since
this property returns only a string, it does not give you DOM access
to the referring page.
More info can be found at: MDN
Now looking at the developer tools from both Chrome, Firefox and IE I can see the header is being set to: Referer:https://www.google.com/ whenever I hit a search result from google and this value is being set automatically by the browser. How it's set depends on browser implementor but this is the corresponding document describing the header value RFC 7231
The "Referer" [sic] header field allows the user agent to specify a
URI reference for the resource from which the target URI was obtained
(i.e., the "referrer", though the field name is misspelled).

The value is set by the browser, I mean the browser is setting the value "test" when you are doing the http request.

Related

How to modify request bodies by chrome extension

I've tried use chrome.webRequest API and finally found out that it looks like google don't allow us to modify requestBodies of POST requests? I could only cancel it or modify its headers.
So is there any other way to modify the raw (not form) body of a post request? I know that a proxy server could do that but I want to deal with it using extension.

This works for some cases: first, save the body of the request in a variable in the onBeforeRequest listener. Then, in onBeforeSendHeaders you can either cancel or redirect the original request (sorry, Chrome gives you only two options to deal with the original). Also in onBeforeSendHeaders, you issue a new request (say, jquery ajax) to which you attach the old body from the variable, and the old headers - both can be modified/rewritten as needed.
(Minor catch: it won't let you set all of the headers for "security reasons", so you may add one more onBeforeSendHeaders listener to add the sensitive headers to the new request as well).
Works for cases when the request issuer is happy with a redirect or cancel as a response. If the request issuer expects the full actual response, intact, with no redirects, then it gets harder.

Chrome extension Netify allows request modification, including POST body

Can I use javascript to get a request header in http get request? [duplicate]

I want to capture the HTTP request header fields, primarily the Referer and User-Agent, within my client-side JavaScript. How may I access them?
Google Analytics manages to get the data via JavaScript that they have you embed in you pages, so it is definitely possible.
Related:
Accessing the web page's HTTP Headers in JavaScript

If you want to access referrer and user-agent, those are available to client-side Javascript, but not by accessing the headers directly.
To retrieve the referrer, use document.referrer.
To access the user-agent, use navigator.userAgent.
As others have indicated, the HTTP headers are not available, but you specifically asked about the referer and user-agent, which are available via Javascript.

Almost by definition, the client-side JavaScript is not at the receiving end of a http request, so it has no headers to read. Most commonly, your JavaScript is the result of an http response. If you are trying to get the values of the http request that generated your response, you'll have to write server side code to embed those values in the JavaScript you produce.
It gets a little tricky to have server-side code generate client side code, so be sure that is what you need. For instance, if you want the User-agent information, you might find it sufficient to get the various values that JavaScript provides for browser detection. Start with navigator.appName and navigator.appVersion.

This can be accessed through Javascript because it's a property of the loaded document, not of its parent.
Here's a quick example:
<script type="text/javascript">
document.write(document.referrer);
</script>
The same thing in PHP would be:
<?php echo $_SERVER["HTTP_REFERER"]; ?>

Referer and user-agent are request header, not response header.
That means they are sent by browser, or your ajax call (which you can modify the value), and they are decided before you get HTTP response.
So basically you are not asking for a HTTP header, but a browser setting.
The value you get from document.referer and navigator.userAgent may not be the actual header, but a setting of browser.

One way to obtain the headers from JavaScript is using the WebRequest API, which allows us to access the different events that originate from http or websockets, the life cycle that follows is this:
WebRequest Lifecycle
So in order to access the headers of a page it would be like this:
browser.webRequest.onHeadersReceived.addListener(
(headersDetails)=> {
console.log("Request: " + headersDetails);
},
{urls: ["*://hostName/*"]}
);`
The issue is that in order to use this API, it must be executed from the browser, that is, the browser object refers to the browser itself (tabs, icons, configuration), and the browser does have access to all the Request and Reponse of any page , so you will have to ask the user for permissions to be able to do this (The permissions will have to be declared in the manifest for the browser to execute them)
And also being part of the browser you lose control over the pages, that is, you can no longer manipulate the DOM, (not directly) so to control the DOM again it would be done as follows:
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
code: 'console.log("Headers success")',
});
});
or if you want to run a lot of code
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
file: './headersReveiced.js',
});
});
Also by having control over the browser we can inject CSS and images
Documentation: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived

I would imagine Google grabs some data server-side - remember, when a page loads into your browser that has Google Analytics code within it, your browser makes a request to Google's servers; Google can obtain data in that way as well as through the JavaScript embedded in the page.

var ref = Request.ServerVariables("HTTP_REFERER");
Type within the quotes any other server variable name you want.

When Is redirectStart populated in NavigationTiming object?

I'm using the NavigationTiming object to monitor performance of my website.
According to the W3C document on the redirectStart property of the NavigationTiming object:
If there are HTTP redirects or equivalent when navigating and if all the redirects or equivalent are from the same origin, this attribute must return the starting time of the fetch that initiates the redirect.
My site currently has a login page, which submits POST and receives a 302 redirect to a welcome page. I expected the NavigationTiming object on the welcome page to include the redirectStart and redirectEnd properties to be populated, but they are 0.
When should they be populated, if not then?

Are the login page and welcome pages different origins, for example, different schemes (HTTP/HTTPS) or servers (foo.com vs www.foo.com)?
NavigationTiming will zero-out the redirectStart and redirectEnd attributes if the origin differs between the two addresses in any way.
(Note the POST method is not part of the origin calculation and thus should not cause the property to zero-out just because it's a POST 302 to a GET)
If you care to share your site I could investigate further.
You can also verify your browser properly supports the redirect case by running the W3C test cases here.

How can I access HTTP request headers from Javascript? [duplicate]

I want to capture the HTTP request header fields, primarily the Referer and User-Agent, within my client-side JavaScript. How may I access them?
Google Analytics manages to get the data via JavaScript that they have you embed in you pages, so it is definitely possible.
Related:
Accessing the web page's HTTP Headers in JavaScript

If you want to access referrer and user-agent, those are available to client-side Javascript, but not by accessing the headers directly.
To retrieve the referrer, use document.referrer.
To access the user-agent, use navigator.userAgent.
As others have indicated, the HTTP headers are not available, but you specifically asked about the referer and user-agent, which are available via Javascript.

Almost by definition, the client-side JavaScript is not at the receiving end of a http request, so it has no headers to read. Most commonly, your JavaScript is the result of an http response. If you are trying to get the values of the http request that generated your response, you'll have to write server side code to embed those values in the JavaScript you produce.
It gets a little tricky to have server-side code generate client side code, so be sure that is what you need. For instance, if you want the User-agent information, you might find it sufficient to get the various values that JavaScript provides for browser detection. Start with navigator.appName and navigator.appVersion.

This can be accessed through Javascript because it's a property of the loaded document, not of its parent.
Here's a quick example:
<script type="text/javascript">
document.write(document.referrer);
</script>
The same thing in PHP would be:
<?php echo $_SERVER["HTTP_REFERER"]; ?>

Referer and user-agent are request header, not response header.
That means they are sent by browser, or your ajax call (which you can modify the value), and they are decided before you get HTTP response.
So basically you are not asking for a HTTP header, but a browser setting.
The value you get from document.referer and navigator.userAgent may not be the actual header, but a setting of browser.

One way to obtain the headers from JavaScript is using the WebRequest API, which allows us to access the different events that originate from http or websockets, the life cycle that follows is this:
WebRequest Lifecycle
So in order to access the headers of a page it would be like this:
browser.webRequest.onHeadersReceived.addListener(
(headersDetails)=> {
console.log("Request: " + headersDetails);
},
{urls: ["*://hostName/*"]}
);`
The issue is that in order to use this API, it must be executed from the browser, that is, the browser object refers to the browser itself (tabs, icons, configuration), and the browser does have access to all the Request and Reponse of any page , so you will have to ask the user for permissions to be able to do this (The permissions will have to be declared in the manifest for the browser to execute them)
And also being part of the browser you lose control over the pages, that is, you can no longer manipulate the DOM, (not directly) so to control the DOM again it would be done as follows:
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
code: 'console.log("Headers success")',
});
});
or if you want to run a lot of code
browser.webRequest.onHeadersReceived.addListener(
browser.tabs.executeScript({
file: './headersReveiced.js',
});
});
Also by having control over the browser we can inject CSS and images
Documentation: https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/webRequest/onHeadersReceived

I would imagine Google grabs some data server-side - remember, when a page loads into your browser that has Google Analytics code within it, your browser makes a request to Google's servers; Google can obtain data in that way as well as through the JavaScript embedded in the page.

var ref = Request.ServerVariables("HTTP_REFERER");
Type within the quotes any other server variable name you want.

Hard refresh and XMLHttpRequest caching in Internet Explorer/Firefox

I make an Ajax request in which I set the response cacheability and last modified headers:
if (!String.IsNullOrEmpty(HttpContext.Current.Request.Headers["If-Modified-Since"]))
{
HttpContext.Current.Response.StatusCode = 304;
HttpContext.Current.Response.StatusDescription = "Not Modified";
return null;
}
HttpContext.Current.Response.Cache.SetCacheability(HttpCacheability.Public);
HttpContext.Current.Response.Cache.SetLastModified(DateTime.UtcNow);
This works as expected. The first time I make the Ajax request, I get 200 OK. The second time I get 304 Not Modified.
When I hard refresh in Chrome (Ctrl+F5), I get 200 OK - fantastic!
When I hard refresh in Internet Explorer/Firefox, I get 304 Not Modified. However, every other resource (JS/CSS/HTML/PNG) returns 200 OK.
The reason is because the "If-Not-Modified" header is sent for XMLHttpRequest's regardless of hard refresh in those browsers. I believe Steve Souders documents it here.
I have tried setting an ETag and conditioning on "If-None-Match" to no avail (it was mentioned in the comments on Steve Souders page).
Has anyone got any gems of wisdom here?
Thanks,
Ben
Update
I could check the "If-Modified-Since" against a stored last modified date. However, hopefully this question will help other SO users who find the header to be set incorrectly.
Update 2
Whilst the request is sent with the "If-Modified-Since" header each time. Internet Explorer won't even make the request if an expiry isn't set or is set to a future date. Useless!
Update 3
This might as well be a live blog now. Internet Explorer doesn't bother making the second request when localhost. Using a real IP or the loopback will work.

Prior to IE10, IE does not apply the Refresh Flags (see http://blogs.msdn.com/b/ieinternals/archive/2010/07/08/technical-information-about-conditional-http-requests-and-the-refresh-button.aspx) to requests that are not made as a part of loading of the document.
If you want, you can adjust the target URL to contain a nonce to prevent the cached copy from satisfying a future request. Alternatively, you can send max-age=0 to force IE to conditionally revalidate the resource before each reuse.
As for why the browser reuses a cached resource that didn't specify a lifetime, please see http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx

The solution i came upon for consistent control was managing the cache headers for all request types.
So, I forced standard requests the same as XMLHttpRequests, which was telling IE to use the following cache policy: Cache-Control: private, max-age=0.
For some reason, IE was not honoring headers for various requests types. For example, my cache policy for standard requests defaulted to the browser and for XMLHttpRequests, it was set to the aforementioned control policy. However, making a request to something like /url as a standard get request, render the result properly. Unfortunately, making the same request to /url as an XMLHttpRequest, would not even hit the server because the get request was cached and the XMLHttpRequest was hitting the same url.
So, either force your cache policy on all fronts or make sure you're using different access points (uri's) for your request types. My solution was the former.

Develop Reference

JavaScript is the programming language of the Web.