Check if a request is a subresource integrity in a Chrome extension - javascript

Is it possible to check if a script/stylesheet is integrity protected via subresource-integrity (SRI) from a Chrome extension?
I want to know this before the request is initiated, so this should be done with chrome.webRequest.onBeforeRequest. But it gives no hints about the request as SRI is browser side. Everything happens after the request has finished.
From my point of view the only way to get this information is to access the DOM directly. This would mean I have to stall all requests until the HTML is completely parsed, which doesn't seem the way to go.
Maybe SRI is just too new to be accessible to extensions, as I didn't find it anywhere in the Chrome extension docs.

Yes. You can determine whether a resource is protected by subresource integrity before the request for it is made, by checking for the integrity attribute on the element that specifies the resource as that element is added to the DOM. Use a content script that runs at document_start, either declared in manifest.json (run_at) or injected with tabs.executeScript() (runAt). That script can set up a MutationObserver to watch elements being placed in the DOM, and check each relevant element type (i.e. <script> and <link>) for an integrity attribute. This check occurs before the webRequest.onBeforeRequest event.
Doing this does not stall all requests until the HTML is fully parsed. It performs the check as each element specifying a resource is entered into the DOM. On the other hand, obviously, any additional processing you introduce through the use of the MutationObserver does add some additional time to parsing the HTML, creating the DOM and loading all resources.
Getting the timing correct to have a script executed at document_start using tabs.executeScript() is non-trivial. How to do so would be a separate question.
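A minimal sketch of such a content script follows. It assumes the script is declared with run_at set to document_start; the names usesSri and watchForSri are hypothetical helpers, not part of any Chrome API. The observer is only started when a DOM is available.

```javascript
// Returns true when a <script> or <link> element carries a non-empty
// integrity attribute, i.e. the page asked for SRI checking on it.
function usesSri(el) {
  const tag = String(el.tagName || '').toLowerCase();
  if (tag !== 'script' && tag !== 'link') return false;
  const integrity = el.getAttribute ? el.getAttribute('integrity') : null;
  return typeof integrity === 'string' && integrity.trim() !== '';
}

// Watches `root` and reports every <script>/<link> element as it is added.
// onFound(element, isProtected) is a caller-supplied callback (hypothetical).
function watchForSri(root, onFound) {
  const observer = new MutationObserver((mutations) => {
    for (const mutation of mutations) {
      for (const node of mutation.addedNodes) {
        if (node.nodeType !== 1) continue; // elements only
        const tag = node.tagName.toLowerCase();
        if (tag === 'script' || tag === 'link') {
          onFound(node, usesSri(node));
        }
      }
    }
  });
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}

// Only start observing when running inside a page (e.g. as a content script).
if (typeof MutationObserver !== 'undefined' && typeof document !== 'undefined') {
  watchForSri(document, (el, isProtected) => {
    console.log(el.src || el.href, isProtected ? 'uses SRI' : 'no SRI');
  });
}
```

This sketch only inspects the added nodes themselves; elements nested inside a dynamically inserted subtree would need an extra querySelectorAll pass.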

Related

Webextension inline install chrome.runtime.connect issues

I'm having a really weird issue. I've developed a WebExtension that uses messaging between a content script and a background script (using chrome.runtime.connect) as well as native messaging.
The issue I'm facing is this: when I install the extension manually from the store beforehand and then connect to my website, everything works as expected; chrome.runtime.connect works and returns a valid port to the background script.
But when I do an inline install of the extension from my website, I manually inject the content script into my page (to avoid having to navigate again before the content script is present), using
function injectContentScript() {
    var s = document.createElement("script");
    s.setAttribute("src", "chrome-extension://<extensionid>/content.js");
    document.head.appendChild(s);
}
and the exact same content script but manually injected doesn't behave the same. chrome.runtime.connect returns a null object and chrome.runtime.lastError gives me
Could not establish connection. Receiving end does not exist.
I'm calling on the sender side (content.js - manually injected content script) chrome.runtime.connect(extensionID) where extension id is the id of the extension generated by the chrome webstore. And on the receiving side (background.js - extension background script) chrome.runtime.onConnect.addListener(onPortConnected);
I'm not really sure how to debug this issue, maybe it's a timing issue?
The background script does run even with the inline install (I've added logs and debugged it through background.html in the Chrome extension manager).
Any help would be greatly appreciated!
You have two scenarios.
Your content script content.js is executed as normal upon navigation, as a content script defined in the manifest.
In this case, it executes in a special JS context attached to the page and reserved for your content scripts. See Execution Environment docs section for explanation. It is isolated from the webpage and is considered part of the extension (albeit with lower privileges).
When you connect from a content script, chrome.runtime.connect() is treated as internal communication between parts of the extension. So while you can provide the extension ID, it is not needed.
More importantly, the event raised in this case is chrome.runtime.onConnect.
Your supposed "inject content script immediately" code called from the webpage does something completely different.
Instead of creating a new execution context, the code is added directly to the page; it is not considered part of the extension and has no access to the extension APIs.
Normally, a call to chrome.runtime.connect() would simply fail, as this is not a function exposed to webpages; however, you also declared externally_connectable, so it is exposed specifically to your webpage.
In this case, passing the extension ID is mandatory for the connect. You were doing this already, so the call was succeeding.
However, and that's what made it fail: the corresponding event is no longer onConnect, but onConnectExternal!
What you should be doing is:
Not mixing code that is run in very different contexts.
If you need communication from the webpage to background, always do it from the webpage, not sometimes-from-content-sometimes-from-page.
That way you only have to listen to onConnectExternal and it cuts out the need for a content script (if it was its only function).
See the docs as well: Sending messages from web pages.
You don't have to source the code from chrome-extension://<extensionid>/; you can directly add this to your website's code and potentially avoid web_accessible_resources.
And if you actually want to inject content scripts on first run, see for example this answer.
Related reading: How to properly handle chrome extension updates from content scripts
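As a rough sketch of the "always connect from the webpage" approach, assuming the manifest declares externally_connectable for the site: the helper names listenForPageConnections and connectFromPage are hypothetical, and the runtime object is passed in so that each half can live in its own file (background.js and the page's own script).

```javascript
// background.js side: connections opened by a web page arrive on
// onConnectExternal, not onConnect.
function listenForPageConnections(runtime, onPortConnected) {
  runtime.onConnectExternal.addListener(onPortConnected);
}

// Web page side: the extension ID is mandatory when connecting from a page.
function connectFromPage(runtime, extensionId) {
  return runtime.connect(extensionId);
}

// Register the listener only where the extension API is actually available
// (i.e. in the background script, not in Node or in an ordinary page).
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onConnectExternal) {
  listenForPageConnections(chrome.runtime, (port) => {
    port.onMessage.addListener((message) => {
      console.log('message from page:', message);
    });
  });
}
```

On the page, the call would then look like connectFromPage(chrome.runtime, "<extensionid>"), with the real extension ID substituted.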

If the async attribute is used to load non blocking <script>s in the <head> are they guaranteed to be loaded before body.readyState == 'loaded'

Without the async or defer attribute the loading of JavaScript blocks the browser and any scripts loaded in the <head> are always loaded before the dom is loaded and body.readyState == 'loaded'.
My question is specific to the use of the async attribute to allow a non blocking <script> in the <head>. Some browsers can then start rendering the DOM while the javascript is still being retrieved. I have found situations where at least Chrome definitely does render prior to the async JavaScript load in the <head> completing.
Are these async loaded scripts and as a result the <head> guaranteed to be loaded before body.readyState == 'loaded' and traditional dom ready javascript is executed?
I can confirm in waterfalls using Chrome, Firefox and IE11 that in practice in my test cases the onload processing always occurred after all the JavaScript had loaded, and immediately afterwards in some cases, giving the impression that in current browsers the async does not break the assumption that JavaScript has loaded before the body state changes.
This however is anecdotal evidence and what I am looking for is a standard reference or reference/reasoning regarding the browser architecture that gives comfort that for a large javascript loaded in the <head> with async and a small <body> I will not find situations where the <body> completes loading and has a state of loaded before the <head> due to the use of non blocking async script loads.
readyState is actually implemented by document and as a result the <body> only has a readyState of 'loaded' when the complete document (Both the <head> and the <body>) have loaded. Using the async attribute to load JavaScript in the <head> is safe in that all JavaScript will have loaded before the <body>, or actually anything, appears to be 'loaded'.
When the <body> is asked for its readyState the request is being responded to by its parentNode which is the document. Document implements readyState.
The code initiating the on-load processing was:
var body = document.getElementsByTagName('BODY')[0];
if (body && body.readyState == 'loaded')
{ ... }
So yes, it is guaranteed that even async JavaScript files will be fully loaded before any on-load processing that tests a <body>'s readyState receives 'loaded'.
It would, however, be far clearer simply to ask the document for its readyState.
Many thanks to @Bergi for pointing out, in a comment on the question, that it is actually document that implements readyState, and for locating the documentation:
https://developer.mozilla.org/en-US/docs/Web/API/Document/readyState
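Asking the document directly could look like the sketch below. Note that the standard document.readyState values are 'loading', 'interactive' and 'complete', so this version tests for 'complete' and otherwise waits for the window's load event. The helper name whenLoaded is hypothetical.

```javascript
// Runs `callback` once the document and its subresources have loaded.
// `doc` and `win` are normally the global document and window.
function whenLoaded(doc, win, callback) {
  if (doc.readyState === 'complete') {
    // Everything, including async scripts, has already finished loading.
    callback();
    return;
  }
  // Otherwise wait for the load event, which fires after all subresources.
  win.addEventListener('load', function onLoad() {
    callback();
  }, { once: true });
}

// Only wire this up when running in a browser.
if (typeof document !== 'undefined' && typeof window !== 'undefined') {
  whenLoaded(document, window, () => {
    console.log('document fully loaded');
  });
}
```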
There are many things to consider when you want to know which part of your HTML will be rendered first.
Size of the script being loaded
Size of the rest of the document
What techniques the browser uses to optimize the loading of content
Some browsers use speculative parsing to continue loading even when they encounter a blocking script. If the speculation succeeds, the browser can use the content it has already retrieved rather than fetching it only after the blocking script has finished.
You can read more about speculative parsing in the Mozilla docs
If, for example, you have a huge script, say a complete SPA application, loaded with the async attribute, it is very likely that your document will render before the script finishes loading, but this does not mean a loaded event will be triggered.
The readyState of the body will change to 'loaded' only after all assets (scripts, styles, etc.) are loaded. Read "loaded" as downloaded or failed (which also includes timeouts). Your page is also considered one of these assets, so this event only happens once all these conditions are true.
The head of your document is also part of your page and needs to be downloaded before the document is declared ready and the onload event is triggered.
JavaScript execution, on the other hand, is a different matter. The async attribute only guarantees that the browser will continue to parse the document and will execute the script once it has finished downloading, which could be at any moment. Your document might or might not be fully downloaded by then, depending on the amount of content remaining. There is also a chance that other scripts are already executing, which will delay the execution of your script.
The defer attribute cannot be relied on as a guarantee in older browsers either.
From the docs
This Boolean attribute is set to indicate to a browser that the script is meant to be executed after the document has been parsed. Since this feature hasn't yet been implemented by all other major browsers, authors should not assume that the script’s execution will actually be deferred. The defer attribute shouldn't be used on scripts that don't have the src attribute. Since Gecko 1.9.2, the defer attribute is ignored on scripts that don't have the src attribute. However, in Gecko 1.9.1 even inline scripts are deferred if the defer attribute is set.
Finally you should note that the async attribute is not supported in all browsers and will have no effect if your page is being requested from an older browser.

Identify requests coming from PageWorker

Is it possible, from within the "http-on-modify-request" event, to identify which requests are coming from a PageWorker object, as opposed to those coming from visible tabs/windows?
Note: Because of redirects and subresources, the URL here is NOT the same URL as the pageWorkers contentURL property.
require("sdk/system/events").on("http-on-modify-request", function(e) {
    var httpChannel = e.subject.QueryInterface(Ci.nsIHttpChannel),
        url = httpChannel.URI.spec,
        origUrl = httpChannel.originalURI.spec;
    ...
});
I don't know of any way to actually distinguish page-worker requests from "regular" ones.
Currently, page workers are implemented like this:
The SDK essentially creates an <iframe> in the hiddenWindow (technically, in sdk/addon/window, which creates a hidden window in the hiddenWindow). The hiddenWindow in mozilla applications is more or less an always-present top-level XUL or HTML window that is simply hidden.
The worker page is loaded into that iframe.
The page-worker will then operate on the DOM on that iframe.
It is possible to identify requests originating from the hidden window and the document within the hidden window.
But identifying if the request or associated document belongs to a page-worker, let alone which page-worker instance, doesn't seem possible, judging from the code. The SDK itself could map the document associated with a request back to a page-worker, as it keeps some WeakMaps around to do so, but that is internal stuff you cannot access.
You only can say that a request is not coming from a page-worker when it is not coming from the hiddenWindow.
Also, keep in mind that there are tons of requests originating neither from a tab nor page-worker: Other (XUL) windows, add-ons, js modules and components, etc...
If it is a page-worker created by your add-on that you're interested in: the contentURL property should reflect the final URI once the page is loaded.

Safe to create cookie before document ready?

I'm currently saving a cookie in jQuery's document ready event handler, like:
$(function() {
document.cookie = <cookie with info not dependent on DOM>
});
Is it possible and safe to save a cookie even earlier, e.g. as a JavaScript statement outside any event handler that executes as the JavaScript file is being interpreted? Are there any browsers in which this is not reliable?
It is 100% ok to read and write to cookies before the DOM has completed loading if you are not dependent on values from the DOM. If you use the Ghostery extension for Chrome and go to any website you can have a look at the tracking tags that load before the DOM is ready, most of which will be using normal cookies and that will give you an idea of how common it is to do this.
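For illustration, a cookie can be written from a top-level statement like this. The helper buildCookie and the cookie name and lifetime are made up for the example; nothing here touches the DOM tree, only document.cookie.

```javascript
// Builds a cookie string; name and value are URL-encoded so that
// spaces, semicolons and similar characters are safe.
function buildCookie(name, value, maxAgeSeconds) {
  return encodeURIComponent(name) + '=' + encodeURIComponent(value) +
    '; path=/; max-age=' + maxAgeSeconds;
}

// Top-level statement: runs as soon as the script file is evaluated,
// without waiting for DOM ready. Skipped outside a browser.
if (typeof document !== 'undefined') {
  document.cookie = buildCookie('visited', 'yes', 3600);
}
```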

defer script execution for some time

Is it possible, in JavaScript, to defer the execution of a self-executing function that resides at the top-level closure? (For reference, jQuery uses this pattern: it has a self-executing function at global scope.)
My objective is to make the whole external script execute at a later time without deferring its download, so that it downloads as soon as possible and executes after an event (I decide when that event happens).
I found this also:
https://developer.mozilla.org/en-US/docs/DOM/element.onbeforescriptexecute
Sounds like a "part 1" of what I want. If I cancel, the script is not executed. but then... How do I effectively execute it later?
Note1: I have absolutely no control over that external script, so I cannot change it.
Note2: Neither I nor the person who requested this cares if there's slowness in IE due to things like this (yeah, true! (s)he's anti-IE like me)
Edit (additional information):
@SLaks The server sends the default headers that Apache sends (so it seems), plus a Cache-Control header instructing clients to cache for 12 hours.
From the POV of the rules about domains, it is cross-domain.
I'm not allowed to change the settings on the other server. I can try asking for it, though.
Unfortunately it's not possible to do with the <script> tag because you can't get the text of an external script included that way to wrap it in code to delay it. If you could, that would open up a whole new world of possibilities with Cross Site Request Forgery (CSRF).
However, you can fetch the script with an AJAX request as long as it's hosted at the same domain name. Then you can delay its execution.
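$.ajax({
    url: '/path/to/myscript.js',
    // Fetch as plain text so jQuery does not evaluate the script on arrival
    dataType: 'text',
    success: function(data) {
        // Replace this with whatever you're waiting for to delay it
        $(document).ready(function() {
            eval(data);
        });
    }
});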
EDIT
I've never used $.getScript() myself since I use RequireJS for including scripts, so I don't know if it will help with what you need, but have a look at this: http://api.jquery.com/jQuery.getScript/
FYI
If you want to know what I'm talking about with the CSRF attacks, Check this out.
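The same "download now, execute later" idea can be sketched without jQuery. This is only an illustration under the same restriction as above (the script text must be readable, i.e. same-origin or served with suitable CORS headers); createDeferredScript and the event name 'my-app-ready' are hypothetical.

```javascript
// Wraps already-downloaded script text so it can be executed on demand,
// at most once.
function createDeferredScript(sourceText) {
  let executed = false;
  return {
    get executed() {
      return executed;
    },
    run() {
      if (executed) return false;
      executed = true;
      // Indirect eval evaluates the text in the global scope,
      // as a normal <script> would.
      (0, eval)(sourceText);
      return true;
    },
  };
}

// Browser-only usage: start the download immediately, run on a later event.
if (typeof document !== 'undefined' && typeof fetch === 'function') {
  fetch('/path/to/myscript.js')
    .then((response) => response.text())
    .then((text) => {
      const deferred = createDeferredScript(text);
      // 'my-app-ready' is a made-up event; dispatch it whenever you decide.
      document.addEventListener('my-app-ready', () => deferred.run(), { once: true });
    });
}
```

If the event may fire before the download completes, the caller would need to remember that and call run() as soon as the text arrives.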
