How to get anything following the domain in a url, using Javascript

How to get anything following the domain in a url, using Javascript - javascript

What is the best way to get the "anything" part after the domain part, using Javascript:
http://www.domain.com/anything
http://www.domain.com/#anything
http://www.domain.com/any/thing
For http://www.domain.com/#anything I would have to use window.location.hash. But for http://www.domain.com/anything I would have to use window.location.pathname.
I'm using:
window.location.href.replace(window.location.origin, "").slice(1)
Are there any caveats with this solution? Is there a better way?

Caveats:
location.origin is not supported by IE.
Other improvements: .slice is actually calling Array.prototype.slice. A method call that requires a prototype lookup is bound to be slower than accessing the element you need directly, escpeciallly in your case, where the slice method is returning an array with just 1 element anyway. So:
You could use location.pathname, but be weary: the standard reads:
pathname
This attribute represents the path component of the Location's URI which consists of everything after the host and port up to and excluding the first question mark (?) or hash mark (#).
but I think the easiest, most X-browser way of getting what you want is actually simply doing this:
var queryString = location.href.split(location.host)[1];
//optionally removing the leading `/`
var queryString = location.href.split(location.host)[1].replace(/^\//,'');
It's very similar to what you have now, except for the fact that I'm not using location.origin, which, as shown on MDN is not supported by MS's IE...
Another benefit is that I'm not calling Array.prototype.slice, which returns an array, and requires a prototype-lookup, which is marginally slower, too...

window.location.pathname + window.location.search + window.location.hash
I think this one is a little bit better. You dont have to use any functions here...

Related

Replacing hash through location.replace considered harmful?

In an application that I encounter hash replacement is done through:
var loc = window.location + ''
loc = loc.substr(0, loc.indexOf('#'))
loc += '#somehash'
location.replace(loc)
instead of:
location.hash = '#somehash'
Now DOMinator Pro gives a 'URL Redirection JSExecution' warning about this as we call location.replace with data that comes from window.location. What I'm wondering now is whether this is a real threat, as I can't think of a way an attacker can abuse this to perform a URL redirect exploit.
Is this a real attack vector, or a false positive?

I agree with the other answers that this is probably a false positive. Still, it's certainly easy to make mistakes with such code, as Bergi's slice example shows.
In modern browsers, this code could be more cleanly written as:
var loc = new URL("#somehash", location);
location.replace(loc);
This avoids any possibility of subtly messing up the URL manipulation.

I'm gonna say this is a false positive attack vector according to this article, the only known way to inject the website is throught adding the #maliciouscode into your URL bar, however your function ignores the # fragments, so no code is really pasted in the website.

There is an attack vector, though I cannot think of a way it cannot be used in your case.
However, the suggested code is definitely clearer, less error-prone, and therefore safer.
I could contrive an example attack if you had used the slice method instead of substr. Should make no difference, should it? Then try it out with the location http://example.com (no trailing slash). If you'd execute the script with
loc = loc.slice(0, loc.indexOf('#'));
then you'll end up at http://example.co#somehash, which might be registered by the attacker.

Recovering built-in methods that have been overwritten

Let's say that our script is included in a web-page, and a prior script (that already executed) did this:
String.prototype.split = function () {
return 'U MAD BRO?';
};
So, the split string method has been overwritten.
We would like to use this method, so we need to recover it somehow. Of course, we could just define our own implementation of this method and use that instead. However, for the sake of this question, let's just say that we really wanted to recover the browser's implementation of that method.
So, the browser has an implementation of the split method (in native code, I believe), and this implementation is assigned to String.prototype.split whenever a new web-page is loaded.
We want that implementation! We want it back in String.prototype.split.
Now, I already came up with one solution - it's a hack, and it appears to be working, but it may have flaws, I would have to test a bit... So, in the meantime, can you come up with a solution to this problem?

var iframe = document.createElement("iframe");
document.documentElement.appendChild(iframe);
var _window = iframe.contentWindow;
String.prototype.split = _window.String.prototype.split;
document.documentElement.removeChild(iframe);
Use iframes to recover methods from host objects.
Note there are traps with this method.
"foo".split("") instanceof Array // false
"foo".split("") instanceof _window.Array // true
The best way to fix this is to not use instanceof, ever.
Also note that var _split = String.prototype.split as a <script> tag before the naughty script or not including the naughty script is obvouisly a far better solution.

How to read href attribute with JQuery

I use $(this).attr('href') in JQuery to read the string in href="" attribute.
I have those kind of links:
Firefox and Chrome return me the code correctly. IE return me: http://127.0.0.1/1
How can i do?

You don't need to use the attr function when you're accessing a native property. You could use the anchor element's properties directly to get the pathname (i.e. the HREF without the domain or query string)
Basically: this.pathname
There's a bit of an inconsistency between browsers (some will show a leading forward slash in pathname and others won't). To get around this, just get rid of any potential leading slashes:
this.pathname.replace(/^\//,'')
A working example: http://jsfiddle.net/jonathon/3ET6p/
Even if you choose the other answers, I recommend you use the native .href on the object.
Just as an extra note to this. I fired up a VM of IE6 and all is well :)

Try via DOM element:
$(this)[0].href; or $(this)[0].getAttribute("href");
If the result is still the same, I suggest you using something not starting with the number.
.

I'd suggest checking which browser is being used. If it's IE, check if the current domain is "127.0.0.1". If it's not, do what #Saul suggested.

REGEX / replace only works once

I'm using REGEX and js replace to dynamically populate a variable in a href. The following code works the first time it is used on the page, but if a different variable is passed to the function, it does not replace ANYTHING.
function change(fone){
$("a[href^='/application']").each(function(){
this.href = this.href.replace(/device=.*/,"device="+ fone);
});
}

The problem is that this.href actually returns a full absolute URL. So even your HTML is <a href="/foo"> the .href property will return http://mydomain.com/foo.
So your href attributes is being populated with a full absolute URL, and then the a[href^='/application'] selector doesn't match anymore, because the href attribute starts with the domain name, instead of /application.

.href returns a fully qualified URL, e.g. `http://www.mydomain.com/application/...'. So the selector doesn't work the 2nd time around since your domain relative URL has been replaced with a full URL, and it's looking for things that start with "/application".
Use $(this).attr('href').replace... instead.
Fiddle here: http://jsfiddle.net/pcm5K/3/

As Squeegy says, you're changing the href the first time around so it no longer begins with /application - the second time around it begins with http://.
You can use the jQuery Attribute Contains Selector to get the links, and it's probably also better practice to use a capture group to do the replacement. Like so:
$("a[href*='/application']").each(function(){
this.href = this.href.replace(/(device=)\w*/, "$1" + fone);
});

You'll need to add the g flag to match all instances of the pattern, so your regular expression will look like this:
/device=.*/g
and your code will look like:
this.href = this.href.replace(/device=.*/g,"device="+ fone);

The reason is that unless all you links start as "/device=." the regex wont work.
you need to use /.*device=.*/
the lack of global flag is not the problem. its the backslash in your pattern.

Best way to safely read query string parameters?

We have a project that generates a code snippet that can be used on various other projects. The purpose of the code is to read two parameters from the query string and assign them to the "src" attribute of an iframe.
For example, the page at the URL http://oursite/Page.aspx?a=1&b=2 would have JavaScript in it to read the "a" and "b" parameters. The JavaScript would then set the "src" attribute of an iframe based on those parameters. For example, "<iframe src="http://someothersite/Page.aspx?a=1&b=2" />"
We're currently doing this with server-side code that uses Microsoft's Anti Cross-Scripting library to check the parameters. However, a new requirement has come stating that we need to use JavaScript, and that it can't use any third-party JavaScript tools (such as jQuery or Prototype).
One way I know of is to replace any instances of "<", single quote, and double quote from the parameters before using them, but that doesn't seem secure enough to me.
One of the parameters is always a "P" followed by 9 integers.
The other parameter is always 15 alpha-numeric characters.
(Thanks Liam for suggesting I make that clear).
Does anybody have any suggestions for us?
Thank you very much for your time.

Upadte Sep 2022: Most JS runtimes now have a URL type which exposes query parameters via the searchParams property.
You need to supply a base URL even if you just want to get URL parameters from a relative URL, but it's better than rolling your own.
let searchParams/*: URLSearchParams*/ = new URL(
myUrl,
// Supply a base URL whose scheme allows
// query parameters in case `myUrl` is scheme or
// path relative.
'http://example.com/'
).searchParams;
console.log(searchParams.get('paramName')); // One value
console.log(searchParams.getAll('paramName'));
The difference between .get and .getAll is that the second returns an array which can be important if the same parameter name is mentioned multiple time as in /path?foo=bar&foo=baz.
Don't use escape and unescape, use decodeURIComponent.
E.g.
function queryParameters(query) {
var keyValuePairs = query.split(/[&?]/g);
var params = {};
for (var i = 0, n = keyValuePairs.length; i < n; ++i) {
var m = keyValuePairs[i].match(/^([^=]+)(?:=([\s\S]*))?/);
if (m) {
var key = decodeURIComponent(m[1]);
(params[key] || (params[key] = [])).push(decodeURIComponent(m[2]));
}
}
return params;
}
and pass in document.location.search.
As far as turning < into <, that is not sufficient to make sure that the content can be safely injected into HTML without allowing script to run. Make sure you escape the following <, >, &, and ".
It will not guarantee that the parameters were not spoofed. If you need to verify that one of your servers generated the URL, do a search on URL signing.

Using a whitelist-approach would be better I guess.
Avoid only stripping out "bad" things. Strip out anything except for what you think is "safe".
Also I'd strongly encourage to do a HTMLEncode the Parameters. There should be plenty of Javascript functions that can this.

you can use javascript's escape() and unescape() functions.

Several things you should be doing:
Strictly whitelist your accepted values, according to type, format, range, etc
Explicitly blacklist certain characters (even though this is usually bypassable), IF your whitelist cannot be extremely tight.
Encode the values before output, if youre using Anti-XSS you already know that a simple HtmlEncode is not enough
Set the src property through the DOM - and not by generating HTML fragment
Use the dynamic value only as a querystring parameter, and not for arbitrary sites; i.e. hardcode the name of the server, target page, etc.
Is your site over SSL? If so, using a frame may cause inconsistencies with SSL UI...
Using named frames in general, can allow Frame Spoofing; if on a secure site, this may be a relevant attack vector (for use with phishing etc.)

You can use regular expressions to validate that you have a P followed by 9 integers and that you have 15 alphanumeric values. I think that book that I have at my desk of RegEx has some examples in JavaScript to help you.
Limiting the charset to only ASCII values will help, and follow all the advice above (whitelist, set src through DOM, etc.)

Develop Reference

JavaScript is the programming language of the Web.

How to get anything following the domain in a url, using Javascript - javascript

window.location.pathname + window.location.search + window.location.hash I think this one is a little bit better. You dont have to use any functions here...

Related

Replacing hash through location.replace considered harmful?

Recovering built-in methods that have been overwritten

How to read href attribute with JQuery

REGEX / replace only works once

Best way to safely read query string parameters?

Categories

Resources