I know the same-origin rule. I wonder if there is an exception such that a parent document's script can read or write a child (iframe) document's content while the latter cannot do the same to the former, or vice versa (an iframe document's script can read or write the parent document's content but the parent cannot do the same).
What happens when one URL is of the data: type? The Wikipedia article is not very clear on this.
Two documents (i.e. frames) from the same origin can equally change one another.
If there are such exceptions, it is probably a browser specific behavior or a bug.
The Wikipedia article describes some behaviors that are well known (e.g. loading a script from a different domain) and may be by design. You can also change window.name (read-write) or location (write-only) from one frame to another, even when the origins are different.
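A minimal sketch of those two cross-origin escape hatches, assuming a page that embeds a cross-origin iframe with id "child" (the id and URLs are hypothetical; note that modern browsers have since restricted cross-origin reads of window.name):

    // Grab the window object of a cross-origin iframe.
    var child = document.getElementById('child').contentWindow;

    child.name = 'shared-state';                  // window.name: read-write
    child.location = 'https://example.com/next';  // location: write-only

    // Reading the document itself is blocked by the same-origin policy:
    // child.document.body.innerHTML   -> throws a security error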
I do not know the quirks with regard to data: URIs.
To recap: even if you find some odd exception, don't expect it to work across multiple browsers and browser versions. Better to work on a more solid solution to your problem.
Suppose I have a regular A tag with a proper URL in its href attribute (foo.com). Now suppose I intercept the click (or tap) event of the A with a JavaScript function so that, instead of performing the default action (navigating to foo.com), I can do something else first (fade out the content, for example) and then navigate to a different URL (otherfoo.com).
Would this affect SEO? (meaning, will the bots get confused?)
Would the bot follow the URL in the href attribute (foo.com), or would it follow through the JavaScript function (and navigate to otherfoo.com)?
By "bots" and "SEO" I am mostly referring to Google, but broader answers would be great as well.
tl;dr
No, a typical crawler doesn't evaluate JavaScript (to that extent), as in effect it would serve no real purpose - only more confusion.
Disclaimer
Crawlers can probably scan JavaScript or CSS to obtain other information, which could affect ranking - but this is irrelevant in the scope of this question.
Furthermore, the Google crawler does parse JavaScript, but this seems to affect only the generated DOM tree, not callbacks, which can be infinitely complex and would leave the output in an unknown state - Google wouldn't know whether it got the links right or not.
In many JavaScript interpreters I have noticed a "run JavaScript until onload" scheme used to obtain a proper DOM tree; the better HTML-to-PDF converters are an example. I would guess Google does something similar.
Thoughts on that
Ultimately, this is up to the implementation. But while it's technically possible to evaluate the JavaScript on your site and index JavaScript redirects, doing so is highly error-prone and would serve no practical purpose.
If nothing else, you could bypass such a check by serving fake JavaScript files to the crawler, which usually identifies itself by its User-Agent header.
Definitely do not rely on this. I think you should add a link, hidden from the user, to any content you want indexed but that is only reachable through JavaScript - for example, an anchor reading "Javascript linked page", as sketched below. Hide it with a CSS class though, in case some search engine thinks it's smart to ignore nodes with an inline display:none style.
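A minimal sketch of such a hidden fallback link (the class name and URL are hypothetical):

    <style>.seo-fallback { display: none; }</style>
    <a class="seo-fallback" href="/javascript-linked-page">Javascript linked page</a>

Since the link is plain HTML, a crawler that doesn't run JavaScript can still follow it.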
Or just generally don't use JavaScript on links, as it makes no sense in sane scenarios.
I want to redirect the user to an external page and simultaneously break out of a frameset.
The outermost frameset is most likely within the same domain as the page doing the redirection, but there is the possibility that it will span domains. In development, the outermost frameset might not even exist at all. Ideally, I want to cover all those situations.
The innermost page (the one that has the breakout code) is going to be served over HTTPS. The target URL may be HTTP or HTTPS. It is acceptable for the redirection to fail (there is a fallback link "click here to continue" to cover for that scenario) but the redirection should work in the majority of cases. Particularly for the purposes of this question, I'd hate for it to be more browser-dependent than necessary.
The web application itself is ASP.NET.
Because of the framesets, I can't simply use an HTTP redirect.
So far I have this Javascript code, which is registered as a startup script from within a class subclassing Page, where ... represents the redirection target URL:
// Use the topmost window if it exists; otherwise fall back to the current window.
((window.top == null) ? (window) : (window.top)).location = '...';
What bothers me is what MDN has to say about window.top and window.parent, respectively. Particularly, the documentation for window.parent spells out explicitly that
If a window does not have a parent, its parent property is a reference to itself.
which means that I can assert window.parent != null. But there is nothing similar about the value of window.top.
For the purposes of my question, you can assume that neither of these have been reassigned.
All that to lead up to the actual question: Does window.top make guarantees similar to that of window.parent? (I'm a bit concerned about the conditional expression. In my so far limited testing it works, but that doesn't prove it correct.)
As far as I can tell, at least MDN doesn't say for certain either way.
The HTML5 spec defines window.top like so:
The top IDL attribute on the Window object of a Document in a browsing context b must return the WindowProxy object of its top-level browsing context (which would be its own WindowProxy object if it was a top-level browsing context itself), if it has one, or its own WindowProxy object otherwise (e.g. if it was a detached nested browsing context).
So the top attribute must always refer to a window. The spec also defines top as readonly, so it is not possible to change it to point to something else (if the spec is implemented properly).
Something is very bad if window.top == null!
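Given that guarantee, the null check in the snippet above is redundant in a conforming browser; a minimal sketch (the target URL is hypothetical):

    // window.top always refers to a window, so this is enough by itself.
    // Assigning location is one of the few writes allowed across origins,
    // which is what lets this break out of a cross-domain frameset.
    window.top.location = 'https://example.com/target';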
(This question was closed as a duplicate of: Can a website know if I am running a userscript?)
If you have a website, can you somehow find out if visitors are modifying your site with javascript userscripts?
In short: EEEEEEK! Don't do it! Rather, decide what needs to be guarded, and guard that. Avoid polling (periodic checking) at all costs. Especially, avoid periodic heavy checks of anything.
Not every change is possible to track. Most changes are just extremely hard to track, since there are so many things that could change.
Changes to the DOM (new nodes, removed nodes, changed attributes) can be detected. The other answer suggests checking innerHTML periodically, but it's better to use mutation observers (supported by Firefox and Chrome) or the older mutation events (DOMSubtreeModified et al.; support varies by event) instead.
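A minimal mutation-observer sketch, watching the whole tree:

    // Reports every node addition/removal and attribute change under <html>.
    var observer = new MutationObserver(function (mutations) {
      mutations.forEach(function (m) {
        console.log('DOM changed:', m.type, m.target);
      });
    });
    observer.observe(document.documentElement, {
      childList: true,    // nodes added or removed
      attributes: true,   // attribute changes
      subtree: true       // watch the entire tree
    });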
Changes to standard methods cannot be reliably detected, except by comparing every single method and property manually (eeeek). This includes the need to reference tons of objects, including, say, Array.prototype.splice (and Array and Array.prototype as well, of course), and to run a heavy script periodically. However, this is not what a userscript typically does.
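A sketch of that reference-comparison idea, for a single method (heavy once you multiply it by every method you care about):

    // Capture a pristine reference early, ideally before any userscript runs.
    var originalSplice = Array.prototype.splice;

    // Periodically verify it wasn't replaced (this catches reassignment only):
    setInterval(function () {
      if (Array.prototype.splice !== originalSplice) {
        console.warn('Array.prototype.splice was replaced');
      }
    }, 1000);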
The state of an input is a property, not an attribute. This means that the document HTML won't change. If the state is changed by a script, the change event won't fire either. Again, the only solution is to poll every single input manually (eeek).
There is no reliable way to detect whether an event handler has been attached. For starters, you would need to guard the onX attributes (see paragraph #2), detect any call to addEventListener (ek) (without tripping the paragraph #2 check), and detect any calls to the respective methods of your library (jQuery.bind and several others).
One thing that would seem to play in your favor: user scripts run at page load (never sooner), so you would have plenty of time to prepare your defenses. Except that not even that plays in your favor (thanks to Brock Adams for noting this and for the link).
You can detect that a standard method has been called by replacing it with your own (ek). There are many methods that you would need to instrument this way (eek), some provided by the browser, some by your framework. The fact that IE (and even Firefox, if instructed to, thanks #Brock) won't let you touch the prototypes of the DOM classes adds another "e" or two to the "eek". The fact that some methods can only be obtained via a method call (return value, callback arguments) adds another "e" or two, for a total of "eeeek". The idea of crawling across the entirety of window will be foiled by security exceptions and uncatchable security exceptions - that is, unless you don't use iframes and you are not within an iframe.
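A sketch of that instrumentation, in browsers that allow patching the DOM prototypes:

    // Wrap addEventListener so every call is observable (ek).
    var realAddEventListener = EventTarget.prototype.addEventListener;
    EventTarget.prototype.addEventListener = function (type, listener, options) {
      console.log('addEventListener:', type, 'on', this);
      return realAddEventListener.call(this, type, listener, options);
    };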
Even if you detect every method call, the DOM can still be changed by writing to innerHTML. Firefox and Chrome support mutation observers, so you can use these.
Even if you detect every method call to a pre-existing method and listen to mutations, most properties are reflected by neither, so you need to watch all properties of every object as well. Pray that no one adds a non-enumerable property with a key you would never guess. Incidentally, this would catch DOM mutations as well. In ES6 it will be possible to observe an object's property sets. I'm not sure whether you can attach a setter to an existing object property in ES5 (while adhering to ES3 syntax). Polling every property is eeeek.
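For what it's worth, ES5's Object.defineProperty can turn an existing configurable data property into an accessor, which lets you watch writes to that one property (a sketch; the object and key are hypothetical):

    // Replaces obj[key] with a getter/setter pair that reports every write.
    function watchProperty(obj, key, onChange) {
      var value = obj[key];
      Object.defineProperty(obj, key, {
        get: function () { return value; },
        set: function (v) { onChange(key, value, v); value = v; },
        configurable: true
      });
    }

    var settings = { theme: 'light' };
    watchProperty(settings, 'theme', function (key, oldValue, newValue) {
      console.log(key, 'changed from', oldValue, 'to', newValue);
    });
    settings.theme = 'dark';   // logged by the watcher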
Of course, you should allow your own scripts to make some changes. The workflow would be: set a flag (not accessible from the global scope!) saying "I'm legit", do your job, and clear the flag - remembering to flank all your callbacks as well. The method observers then check that the flag is set. The property watchdogs will have a harder time deciding whether a change is valid, but they could be notified by the script of every legit change (manually; again, make sure the userscripts cannot see that notification stream). Eeek.
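A sketch of that flag workflow, with everything kept inside a closure so userscripts cannot reach it from the global scope:

    (function () {
      var legit = false;   // the "I'm legit" flag, invisible to outsiders

      // Example method observer: instrument appendChild and consult the flag.
      var realAppendChild = Node.prototype.appendChild;
      Node.prototype.appendChild = function (child) {
        if (!legit) { console.warn('unexpected appendChild of', child); }
        return realAppendChild.call(this, child);
      };

      // All of your own DOM work (callbacks included) runs through this:
      function doLegitWork(fn) {
        legit = true;
        try { fn(); } finally { legit = false; }
      }

      doLegitWork(function () {
        document.body.appendChild(document.createElement('div'));
      });
    })();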
There's an entirely different problem that I didn't realise at first: userscripts run at page load, but they can create an iframe as well. It's not entirely inconceivable (though still unlikely) that a userscript would: 1) detect your script blocker, 2) nuke the page from orbit (you can't prevent document.body.innerHTML =, at least not without heavily tampering with document.body), 3) insert a single iframe with the original URL (prevent double loads server-side?) and 4) have plenty of time to act on that empty iframe before your protection is even loaded.
Also, see the duplicate found by Brock Adams, which shows several other checks that I didn't think of that should be done.
If you don't have a script of your own that changes things, you could compare document.body.innerHTML and document.head.innerHTML with what they were.
When you do change the DOM in your script, you can update the values to compare against. Use setInterval to compare periodically.
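A minimal sketch of that polling approach:

    // Baseline snapshots; update these whenever your own script changes the DOM.
    var lastBody = document.body.innerHTML;
    var lastHead = document.head.innerHTML;

    setInterval(function () {
      if (document.body.innerHTML !== lastBody ||
          document.head.innerHTML !== lastHead) {
        console.warn('Page was modified by someone else!');
        lastBody = document.body.innerHTML;   // re-baseline after detecting
        lastHead = document.head.innerHTML;
      }
    }, 1000);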
Is there any way to randomly access the URLs in Javascript's History object in Safari? I'm writing an extension where I need to, on a specifically-formatted page request, capture the URL of the previous page. From what I've been able to find, the History object definition is non-standard across browsers. Safari only seems to expose its length property and the standard methods that actually move within the history. Where other implementations expose current, previous and next properties, I can't see anything that tells me Safari does the same.
I've also tried document.referrer, but that doesn't appear to get populated in this case.
I just need to display the previously accessed URL on a given page. Is there any other way to access that URL?
Thanks.
You can't really do this, at least in any white-hat way. By design. You can step the user backward and forward, but you can't see the URLs.
Less scrupulous script-writers have of course taken this as a challenge. I believe the closest they've come is to dynamically write a bunch of known comparison links to the page and then inspect them to see if they're showing the "visited" color state. Perhaps if you're working in a closed and predictable environment (an intranet app?), with a known set of URLs, this might be a valid approach for you. Then again, in such an environment you could handle this on the server side with session management.
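A sketch of that probing technique (the URL list and the assumed :visited color are hypothetical; note that modern browsers deliberately report the unvisited style to scripts, so this is largely mitigated today):

    ['https://intranet.example/a', 'https://intranet.example/b'].forEach(function (url) {
      var link = document.createElement('a');
      link.href = url;
      document.body.appendChild(link);
      // Compare against the known :visited color from the stylesheet.
      if (getComputedStyle(link).color === 'rgb(128, 0, 128)') {
        console.log('probably visited:', url);
      }
      link.remove();
    });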
What does Mozilla Firefox's XPCSafeJSObject wrapper actually do?
MDC's documentation is as follows:
This wrapper was created to address some problems with XPCNativeWrapper. In particular, some extensions want to be able to safely access non-natively-implemented content defined objects (and to access the underlying JavaScript object under an XPCNativeWrapper without its strong behavior guarantees). XPCSJOW act as a buffer between the chrome code.
This doesn't tell me a lot. In particular, I can't tell how accessing objects via XPCSafeJSObject is any different from accessing them directly.
Edit: I understand that the purpose of the wrappers in general is to protect privileged code from unprivileged code. What I don't understand (and doesn't seem to be documented) is how exactly XPCSafeJSObject does this.
Does it just drop privileges before accessing a property?
Actually, XPCSafeJSObjectWrapper is used for all content objects, including windows and documents (which is in fact where it's most often needed). I believe it was invented mainly to stop XSS attacks from automatically turning into privilege-escalation attacks (by doing XSS against the browser itself). At least now, if an XSS attack is found (and people will unfortunately keep looking), it doesn't compromise the whole browser. It's a natural development of XPCNativeWrapper, which was originally a manual (and therefore prone to accidental misuse by extensions) way for the browser to defend itself from XSS attacks.
The wrapper just ensures that any code that gets evaluated gets evaluated without chrome privileges. Accessing objects directly without this wrapper can allow for code to run with chrome privileges, which then lets that code do just about anything.
The purpose of the wrappers in general is to protect privileged code when interacting with unprivileged code. The author of the unprivileged code might redefine a JavaScript object to do something malicious, like redefining the getter of a property to execute something bad as a side effect. When the privileged code tried to access the property, it would execute the bad code with full privileges. The wrapper prevents this. This page describes the idea.
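A sketch of the kind of trap this guards against (the object and the malicious function are hypothetical):

    // Unprivileged content code plants a getter with a side effect:
    var payload = {};
    Object.defineProperty(payload, 'data', {
      get: function () {
        stealSecrets();           // hypothetical; runs with the reader's privileges
        return 'looks harmless';
      }
    });

    // If chrome code reads payload.data directly, the getter runs as chrome.
    // Behind the wrapper, the property access is performed without chrome
    // privileges, so the side effect cannot escalate.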
XPCSafeJSObject provides a wrapper for non-natively-implemented JavaScript objects (i.e. not window, document, etc., but user-defined objects).
Edit: For how it's implemented, check out the source code (it's not loading completely for me at the moment.) Also search for XPCSafeJSObject on DXR for other relevant source files.