URL change between onBeforeNavigate and onCommitted of chrome.webNavigation - javascript

When using the chrome.webNavigation API, the following code (used in the background page of an extension):
chrome.webNavigation.onCommitted.addListener(function(data) {
    console.log('onCommitted', data.tabId, data.url);
});
chrome.webNavigation.onBeforeNavigate.addListener(function(data) {
    console.log('onBeforeNavigate', data.tabId, data.url);
});
produces this output when navigating to, say, 'http://drive.google.com'
newTest.js:18 onBeforeNavigate 606 http://drive.google.com/
newTest.js:18 onCommitted 606 https://drive.google.com/
Somewhere, even before the request was sent to the server, Chrome changed the URL from http to https.
This behaviour also shows up in other cases, for instance with 'http://getpocket.com', where Chrome additionally adds a new path:
newTest.js:18 onBeforeNavigate 626 http://getpocket.com/
newTest.js:18 onCommitted 626 https://getpocket.com/beta/
The server-side redirects all come after onCommitted, but this is a case where Chrome modifies URLs even before it sends a request to the server.
Is this behaviour documented somewhere, so I can predictably handle it?

For Google Drive, it's HTTP Strict Transport Security kicking in.
After it's set up, the browser will automatically redirect everything to HTTPS.
You can look under the hood at net-internals, e.g. chrome://net-internals/#hsts
static_sts_domain: drive.google.com
static_upgrade_mode: STRICT
In the case of Pocket, this seems to be a 301 Moved Permanently redirect.
By design, browsers cache this response permanently (at least Chrome does) and rewrite links automatically without hitting the server until said cache is cleared.
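If the goal is to handle these local upgrades predictably in the extension, one option is to correlate the two events yourself. Below is a minimal sketch (the map key and variable names are made up for illustration, not taken from any Chrome API) that remembers the onBeforeNavigate URL per tab and frame and flags an onCommitted URL that differs only by scheme as a local http-to-https upgrade rather than a server redirect:
var pendingNavigations = {};
chrome.webNavigation.onBeforeNavigate.addListener(function(data) {
    pendingNavigations[data.tabId + ':' + data.frameId] = data.url;
});
chrome.webNavigation.onCommitted.addListener(function(data) {
    var key = data.tabId + ':' + data.frameId;
    var before = pendingNavigations[key];
    delete pendingNavigations[key];
    if (before && before.replace(/^http:/, 'https:') === data.url) {
        // Same URL apart from the scheme: HSTS or a cached 301 upgraded it
        // locally, before any request went out.
        console.log('local https upgrade', before, '->', data.url);
    }
});
Note that this comparison only catches pure scheme upgrades; as the Pocket example shows, a cached 301 can rewrite the path as well.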


What makes a browser load content from an http URL even when frontend source has an https URL?

My Vue component is loading external content in an iframe:
<iframe src="https://external-site" />
It works fine locally, but once I deploy to my HTTPS site I get:
Mixed Content: The page at 'https://my-site' was
loaded over HTTPS, but requested an insecure frame
'http://external-site'. This request has been blocked; the
content must be served over HTTPS.
The network tab shows two requests; both have status (cancelled), and both have an HTTPS request URL.
For general cases like redirecting URLs with no trailing slash to the corresponding URLs with a trailing slash added, some servers have broken configurations with http: hardcoded in the redirect, even if the server has other configuration that subsequently redirects all http URLs to https.
For example, the case in the question had a URL https://tithe.ly/give?c=1401851 (notice the missing trailing slash) that was redirecting to http://tithe.ly/give/?c=1401851 (notice the http, not https). That is where the browser stopped and reported a mixed-content error.
That http://tithe.ly/give/?c=1401851 in turn redirected to https://tithe.ly/give/?c=1401851. So the fix for the problem in the question is to change the request URL in the source to https://tithe.ly/give/?c=1401851 (with the trailing slash included).
If you were to open https://tithe.ly/give?c=1401851 (no trailing slash) directly in a browser, the chain of redirects described in this answer just happens transparently and so it looks superficially like the original URL is OK. That can leave you baffled about why it doesn’t work.
Also, the Network pane in the browser devtools won't readily show you the redirect chain, because, as noted above, browsers follow redirects transparently; the exception is when the chain hits a non-https URL, at which point the browser stops and the chain is broken.
So the general troubleshooting/debugging tip for this kind of problem is: check the request URL using a command-line HTTP client like curl, and step through each of the redirects it reports, looking carefully at the Location response-header values, like this:
$ curl -i https://tithe.ly/give?c=1401851
…
location: http://tithe.ly/give/?c=1401851
…
$ curl -i http://tithe.ly/give/?c=1401851
…
Location: https://tithe.ly/give/?c=1401851

Service Worker failure - Redirected response while RedirectMode is not "follow"

Browser: Firefox 58.0.2 (64-bit)
I'm trying to write a very simple service worker to cache content for offline mode, following the guidance here and here.
When I load the page the first time, the service worker is installed properly. I can confirm it's running by looking in about:debugging#workers.
However, at this point if I attempt to refresh the page (whether online or offline), or navigate to any other page in the site, I get the following error:
The site at https://[my url] has experienced a network protocol
violation that cannot be repaired.
The page you are trying to view cannot be shown because an error in
the data transmission was detected.
Please contact the website owners to inform them of this problem.
The console shows this error:
Failed to load ‘https://[my url]’. A ServiceWorker passed a redirected Response to FetchEvent.respondWith() while RedirectMode is not ‘follow’.
In Chrome, I get this:
Uncaught (in promise) TypeError: Failed to execute 'fetch' on 'ServiceWorkerGlobalScope': Cannot construct a Request with a Request whose mode is 'navigate' and a non-empty RequestInit.
Per this thread, I added the { redirect: "follow" } parameter to the fetch() function, but to no avail.
(Yes I did manually uninstall the Service Worker from the about:debugging page after making the change.)
From what I understand, however, it's the response, not the fetch, that's causing the problem, right? And this is due to my server issuing a redirect when serving the requested content?
So how do I deal with redirects in the service worker? There are obviously going to be some, and I still want to cache the data.
Partly spun off from https://superuser.com/a/1388762/84988
I sometimes get the problem with Gmail with Waterfox 56.2.6 on FreeBSD-CURRENT. (Waterfox 56 was based on Firefox 56.0.2.) Sometimes when simply reloading the page; sometimes when loading the page in a restored session; and so on.
FetchEvent.respondWith() | MDN begins with an alert:
This is an experimental technology …
At a glance, the two bugs found by https://bugzilla.mozilla.org/buglist.cgi?quicksearch=FetchEvent.respondWith%28%29 are unrelated.
Across the Internet there are numerous reports, from users of Gmail with Firefox, of Corrupted Content Error, network protocol violation, etc. Found:
Mozilla bug 1495275 - Corrupted Content Error for gmail
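For what it's worth, a workaround that is often suggested for this class of error (it is not taken from the question's code, so treat it as a sketch with an illustrative caching strategy) is to re-issue navigation requests by URL string rather than by passing the Request object, and to copy any redirected response into a fresh Response before handing it to respondWith(). Requesting by URL avoids Chrome's complaint about a 'navigate'-mode Request combined with a non-empty RequestInit, and the copy drops the redirected flag that Firefox rejects for navigations:
self.addEventListener('fetch', function(event) {
    event.respondWith(
        caches.match(event.request).then(function(cached) {
            if (cached) {
                return cached;
            }
            // Request by URL string so the re-issued request is not 'navigate' mode.
            return fetch(event.request.url, { redirect: 'follow' }).then(function(response) {
                if (!response.redirected) {
                    return response;
                }
                // Copy the body into a fresh Response; the copy has redirected === false,
                // so respondWith() accepts it for a navigation.
                return response.blob().then(function(body) {
                    return new Response(body, {
                        status: response.status,
                        statusText: response.statusText,
                        headers: response.headers
                    });
                });
            });
        })
    );
});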

Where does document.referrer come from?

I have the following script at http://localhost/test.html:
<script>
alert(document.referrer);
</script>
If I access it directly the result is an empty alert, which isn't surprising.
If I link from a different document at http://example.com/different.html, the alert will be that URL, again, not surprising.
What is surprising to me is that, if I intercept the HTTP request and change the Referer header:
GET /test.html HTTP/1.1
Host: localhost
Referer: test
Then the alert will still show the original URL, not test.
So where does document.referrer come from if not from the referer HTTP Header? Is it not influenced by the HTTP request at all? Is there a standard for this, or do different browsers handle it differently? And is there a way to influence it, without creating a new file linking to the code myself?
The Referer header you have intercepted is part of the request the client sends to the server. The client already knows which page is the referring page; you cannot fool it this way.
Per MDN documentation:
document.referrer:
Returns the URI of the page that linked to this page.
Further notes on why it displays empty to you:
The value is an empty string if the user navigated to the page
directly (not through a link, but, for example, via a bookmark). Since
this property returns only a string, it does not give you DOM access
to the referring page.
More info can be found at: MDN
Now, looking at the developer tools in Chrome, Firefox and IE, I can see the header is set to Referer: https://www.google.com/ whenever I hit a search result from Google, and this value is set automatically by the browser. How it is set depends on the browser implementer, but RFC 7231 is the corresponding document describing the header value:
The "Referer" [sic] header field allows the user agent to specify a
URI reference for the resource from which the target URI was obtained
(i.e., the "referrer", though the field name is misspelled).
In short, the value of document.referrer is set by the browser itself from the page it navigated away from; the "test" value you injected only changes the outgoing HTTP request, and the browser never reads it back.
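To answer the last part of the question (whether the value can be influenced without creating a new linking page): it is controlled from the referring side, for example via a referrer policy or a noreferrer hint, not by editing the request in transit. A quick, hypothetical way to see this from a browser console on some other page:
// Run in the devtools console of the page you want to navigate from, e.g. http://example.com/
// (illustrative; the exact value also depends on that page's referrer policy).
// In the opened tab, document.referrer reflects the opener page, not any header
// injected later by a proxy.
window.open('http://localhost/test.html');

// In browsers that support the 'noreferrer' window feature, the opened document
// sees an empty document.referrer instead.
window.open('http://localhost/test.html', '_blank', 'noreferrer');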

Hard refresh and XMLHttpRequest caching in Internet Explorer/Firefox

I make an Ajax request in which I set the response cacheability and last modified headers:
if (!String.IsNullOrEmpty(HttpContext.Current.Request.Headers["If-Modified-Since"]))
{
    HttpContext.Current.Response.StatusCode = 304;
    HttpContext.Current.Response.StatusDescription = "Not Modified";
    return null;
}
HttpContext.Current.Response.Cache.SetCacheability(HttpCacheability.Public);
HttpContext.Current.Response.Cache.SetLastModified(DateTime.UtcNow);
This works as expected. The first time I make the Ajax request, I get 200 OK. The second time I get 304 Not Modified.
When I hard refresh in Chrome (Ctrl+F5), I get 200 OK - fantastic!
When I hard refresh in Internet Explorer/Firefox, I get 304 Not Modified. However, every other resource (JS/CSS/HTML/PNG) returns 200 OK.
The reason is that the "If-Modified-Since" header is sent for XMLHttpRequests regardless of a hard refresh in those browsers. I believe Steve Souders documents it here.
I have tried setting an ETag and conditioning on "If-None-Match", to no avail (it was mentioned in the comments on Steve Souders' page).
Has anyone got any gems of wisdom here?
Thanks,
Ben
Update
I could check the "If-Modified-Since" against a stored last modified date. However, hopefully this question will help other SO users who find the header to be set incorrectly.
Update 2
While the request is sent with the "If-Modified-Since" header each time, Internet Explorer won't even make the request if an expiry isn't set or is set to a future date. Useless!
Update 3
This might as well be a live blog now. Internet Explorer doesn't bother making the second request when the host is localhost. Using a real IP address or the loopback address will work.
Prior to IE10, IE does not apply the Refresh Flags (see http://blogs.msdn.com/b/ieinternals/archive/2010/07/08/technical-information-about-conditional-http-requests-and-the-refresh-button.aspx) to requests that are not made as part of loading the document.
If you want, you can adjust the target URL to contain a nonce to prevent the cached copy from satisfying a future request. Alternatively, you can send max-age=0 to force IE to conditionally revalidate the resource before each reuse.
As for why the browser reuses a cached resource that didn't specify a lifetime, please see http://blogs.msdn.com/b/ie/archive/2010/07/14/caching-improvements-in-internet-explorer-9.aspx
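If you go the nonce route, a client-side sketch looks like this (the parameter name and endpoint are arbitrary; if you happen to be using jQuery, its cache: false option does essentially the same thing by appending a _=<timestamp> parameter to GET requests):
// Append a unique query-string value so IE cannot satisfy the call from its cache.
var url = '/my-endpoint?nonce=' + new Date().getTime();
var xhr = new XMLHttpRequest();
xhr.open('GET', url, true);
xhr.onreadystatechange = function() {
    if (xhr.readyState === 4) {
        console.log(xhr.status, xhr.responseText);
    }
};
xhr.send();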
The solution I came upon for consistent control was managing the cache headers for all request types.
So I forced standard requests to behave the same as XMLHttpRequests, which meant telling IE to use the following cache policy: Cache-Control: private, max-age=0.
For some reason, IE was not honoring headers for various request types. For example, my cache policy for standard requests defaulted to the browser's, and for XMLHttpRequests it was set to the aforementioned policy. However, making a request to something like /url as a standard GET request rendered the result properly. Unfortunately, making the same request to /url as an XMLHttpRequest would not even hit the server, because the GET request was cached and the XMLHttpRequest was hitting the same URL.
So, either force your cache policy on all fronts or make sure you're using different access points (URIs) for your request types. My solution was the former.

jQuery response is empty in browser, though curl works

I'm using jQuery's .ajax() to call a server (actually local Django runserver) and get a response.
On the server console, I can see that the JSON request comes in, the proper JSON response is made, and everything looks OK.
But in my browser (tested on Firefox 3.6 and Safari 4.0.4, and I'm using jQuery 1.4.2), it seems the response body is empty (the response code is 200, and the headers otherwise look OK).
Testing the response from the command line, I get the answer I expect.
$ curl http://127.0.0.1:8000/api/answers/answer/1 --data "text=test&user_name=testy&user_location=testyville&site=http%3A%2F%2Flocalhost%3A8888%2Fcs%2Fjavascript_answer_form.html&email_address="
{"answer_permalink": "http://127.0.0.1:8000/questions/1", "answer_id": 16, "question_text": "What were the skies like when you were young?", "answer_text": "test", "question_id": "1"}
I am making the request from an HTML file on my local machine that is not being served by a web server. It's just addressed using file://. The Django server is also local, at 127.0.0.1:8000, the default location.
Thanks for any suggestions!
-Jim
Unless you specifically allow your browser alternate settings for local files, everything remains bound by the cross-domain security policy. Files that are not on a domain cannot request files from a domain like localhost.
I'm not sure how cross-domain policy works with ports; you may be able to put this file in your port-80-accessible localhost folder (if you have one) and get the job done. Otherwise, you're stuck, unless you can change browser settings to make exceptions (and even then I'm not sure this is doable in any standard browsers).
Add an "error: function(data){alert(data);}" to see if your $.ajax is failing.
Change 'complete' to 'success' in your .ajax() call. 'complete' is used to signal when the ajax operation is done but does not provide the response data. 'success' is called with a successful request and receives the response. 'error' is the counterpart to 'success', used for error handling.
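Putting the last two suggestions together, here is a sketch of the call with both handlers wired up (the URL and fields are taken from the curl command in the question; adjust as needed):
$.ajax({
    url: 'http://127.0.0.1:8000/api/answers/answer/1',
    type: 'POST',
    data: {
        text: 'test',
        user_name: 'testy',
        user_location: 'testyville',
        site: 'http://localhost:8888/cs/javascript_answer_form.html',
        email_address: ''
    },
    success: function(data) {
        console.log('response body:', data);
    },
    error: function(xhr, textStatus, errorThrown) {
        console.log('request failed:', textStatus, errorThrown);
    }
});
If the error handler fires with a status of 0 and an empty responseText, that is typically the signature of a request blocked by the browser's same-origin restrictions described above.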
I think browsers (at least some, like Safari, for me) treat files served off the file system as trusted sources in terms of the same-origin policy. So that turned out to be a red herring here.
