Generate canonical / real URL based on base.href or location

Is there a method/function to get the canonical / transformed URL, respecting any base.href setting of the page?
I can get the base URL (in jQuery) via $("base").attr("href"), and I could use string methods to resolve a URL relative to it, but:
$("base").attr("href") is a plain string with no host, path etc. attributes (like window.location has)
putting this together manually is rather tedious
E.g., given a base.href of "http://example.com/foo/" and a relative URL "/bar.js", the result should be: "http://example.com/bar.js"
If base.href is not present, the URL should be resolved relative to window.location instead.
Is there a standard method available for this already?
(I'm looking for this because jQuery.getScript fails when using a relative URL like "/foo.js" while a BASE tag is in use: FF3.6 makes an OPTIONS request, and nginx cannot handle this. When using the full URL (base.href.host + "/foo.js"), it works.)

Does this do the trick?
function resolveUrl(url) {
    if (!url) {
        throw new Error("url is undefined or empty");
    }
    var reParent = /[\-\w]+\/\.\.\//, // matches a foo/../ expression
        reDoubleSlash = /([^:])\/\//g; // matches // anywhere but in the protocol
    // replace all // except the one in the protocol with /
    url = url.replace(reDoubleSlash, "$1/");
    var baseEl = document.getElementsByTagName("base")[0];
    // fall back to the current location when there is no <base href>
    var base = (baseEl && baseEl.href) || location.href;
    // if the url is already absolute we do nothing
    if (!/^https?:\/\//i.test(url)) {
        if (url.charAt(0) === "/") {
            // root-relative: keep only the scheme and host of the base
            url = base.replace(/^([a-z][a-z0-9+.\-]*:\/\/[^\/?#]*).*$/i, "$1") + url;
        } else {
            // path-relative: resolve against the directory of the base
            var dir = base.replace(/[?#].*$/, "");
            url = dir.substring(0, dir.lastIndexOf("/") + 1) + url;
        }
    }
    // reduce all 'xyz/../' to just ''
    while (reParent.test(url)) {
        url = url.replace(reParent, "");
    }
    return url;
}
It is modified from some code I had around, so it hasn't been tested.

js-uri is a useful JavaScript library to accomplish this: http://code.google.com/p/js-uri/
You would still need to handle the base.href vs window.location parts, but the rest could be accomplished with the library.
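As an alternative to hand-rolled parsing or a library, here is a minimal sketch that relies on the standard URL constructor, assuming the runtime provides it (modern browsers and Node.js do). The helper name resolveAgainstBase is made up for illustration. In a page, document.baseURI could be passed as the base, since it already reflects a <base href> and otherwise falls back to the document's own URL.

```javascript
// Hypothetical helper: resolve `url` against an explicit base URL.
// Assumption: the standard URL constructor is available in this runtime.
function resolveAgainstBase(url, base) {
    if (!url) {
        throw new Error("url is undefined or empty");
    }
    // The URL constructor performs standard relative-reference resolution
    // (root-relative paths, "../" segments, already-absolute URLs).
    return new URL(url, base).href;
}

// In a browser one would typically call:
//   resolveAgainstBase("/bar.js", document.baseURI);

console.log(resolveAgainstBase("/bar.js", "http://example.com/foo/"));
// → http://example.com/bar.js
console.log(resolveAgainstBase("../bar.js", "http://example.com/foo/baz/"));
// → http://example.com/foo/bar.js
```

Passing the base explicitly keeps the helper free of DOM access, so it can be tried outside a page as well.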

Related

window.location not replaced but concatenated

I have this code:
$(window).ready(function() {
    var url = window.location.href;
    if (url.includes("#/projet/")) {
        var projectId = url.substring(url.indexOf("#") + 1).split("/").slice(2, 3).toString();
        window.location.href = "projects/" + projectId;
    }
});
I'm redirected, but the window.location is not replaced, just concatenated.
For instance, if my URL is localhost:3000/users/212323/dashboard, after the JavaScript redirection I get localhost:3000/users/212323/projects/123456 instead of localhost:3000/projects/123456.
I don't understand why the href is concatenated rather than replaced. Do you have an idea?

window.location.href = 'someurl' works the same way as clicking a link to someurl in an <a> tag.
When given a relative path (i.e. one without a leading /), the browser resolves it against the current URL: the last path segment of the current URL is dropped and the relative path is appended to what remains.
The simple fix in your case is to prepend the /:
window.location.href = "/projects/" + projectId;
Note, though, that the site may stop working if it is moved to a location other than the server root. That is why many web frameworks use full URLs and some kind of base URL to get the links right.

You need to add a / to the beginning of the URL, otherwise the browser interprets it as relative to the current URL.
window.location.href = "/projects/" + projectId;
The extra / at the start tells the browser to start from the root url.
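The difference between the two forms can be checked with the standard URL constructor, which applies the same resolution rules as navigation. This is only an illustrative sketch: the project id 123456 and the localhost URL are taken from the question, and the http:// scheme is assumed.

```javascript
// Current page URL from the question (scheme assumed to be http).
var current = "http://localhost:3000/users/212323/dashboard";

// Path-relative: "dashboard" is dropped, the rest of the path is kept.
var pathRelative = new URL("projects/123456", current).href;

// Root-relative: the whole path is replaced, only the origin is kept.
var rootRelative = new URL("/projects/123456", current).href;

console.log(pathRelative); // → http://localhost:3000/users/212323/projects/123456
console.log(rootRelative); // → http://localhost:3000/projects/123456
```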

Getting the proper URL in an angular project

I have a working JS plugin that's written in jQuery. For my plugin to work, I need to get the URL, but without the anchor that references an element by id. That is, I only want http://example.com/content/1/ instead of, say, http://example.com/content/1/#comments.
I am doing this with the following function :
var getProperURL = function() {
    return window.location.protocol + '//' + window.location.hostname + window.location.pathname;
};
This works most of the time. However, I ran this on an angular project, and I only get the protocol and the hostname. How do I do this for AngularJs?

To retrieve only the path without the hash, you can use $location.path().
See the official documentation for $location.path():
This method is getter / setter.
Return path of current url when called without any parameter.
Change path when called with parameter and return $location.
Note: Path should always begin with forward slash (/), this method will add the forward slash if it is missing.
(By contrast, $location.absUrl() returns the full URL representation with all segments encoded according to the rules specified in RFC 3986.)
Inject $location and retrieve all the data you need. The guide has plenty of examples:
// given url http://example.com/#/some/path?foo=bar&baz=xoxo
var path = $location.path();
// => "/some/path"
and in your website:
// given url http://example.com/content/1/#comments
var path = $location.path();
// => "/content/1"
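Independent of AngularJS, the "URL without the hash" part can also be sketched with the standard URL API. The helper name urlWithoutHash is hypothetical; in a page one would pass window.location.href. Like the function in the question it drops the query string too, but unlike it, it keeps a non-default port because it uses origin instead of hostname. Note that this does not help when the application keeps its route inside the fragment (hashbang mode), since the real path is then just "/".

```javascript
// Hypothetical helper: return scheme, host, port and path of `href`,
// dropping the query string and the fragment.
// Assumption: the standard URL constructor is available.
function urlWithoutHash(href) {
    var u = new URL(href);
    return u.origin + u.pathname;
}

console.log(urlWithoutHash("http://example.com/content/1/#comments"));
// → http://example.com/content/1/
```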

Ensure URL is relative before navigating via JavaScript's location.replace()

I have a login page https://example.com/login#destination where destination is the target URL the user was trying to navigate to when they were required to log in.
(i.e. https://example.com/destination)
The JavaScript I was thinking about using was
function onSuccessfulLogin() {
    location.replace(location.hash.substring(1) || 'default');
}
This would result in an XSS vulnerability, by an attacker providing the link
https://example.com/login#javascript:..
Also I need to prevent navigation to a lookalike site after login.
https://example.com/login#https://looks-like-example.com
or https://example.com/login#//looks-like-example.com
How can I adjust onSuccessfulLogin to ensure the URL provided in the hash # portion is a relative URL, and not starting with javascript:, https:, // or any other absolute navigation scheme?
One thought is to evaluate the URL, and see if location.origin remains unchanged before navigating. Can you suggest how to do this, or a better approach?

From OWASP recommendations on Preventing Unvalidated Redirects and Forwards:
It is recommended that any such destination input be mapped to a value, rather than the actual URL or portion of the URL, and that server side code translate this value to the target URL.
So a safe approach would be mapping some keys to actual URLs:
// https://example.com/login#destination
var keyToUrl = {
    destination: 'https://example.com/destination',
    defaults: 'https://example.com/default'
};
function onSuccessfulLogin() {
    var hash = location.hash.substring(1);
    var url = keyToUrl[hash] || keyToUrl.defaults;
    location.replace(url);
}
You could also consider providing only path part of the URL and appending it with a hostname in the code:
// https://example.com/login#destination
function onSuccessfulLogin() {
    var path = location.hash.substring(1);
    var url = 'https://example.com/' + path;
    location.replace(url);
}
I would stick to the mapping though.

That is a very good point about the XSS vulnerability.
Scheme names start with a letter and may otherwise contain letters, digits, +, - and . (RFC 3986), so a regex like /^[a-z][a-z0-9+.\-]*:/i would check for those. Alternatively, if we're feeling more inclusive, /^[^:\/?]+:/ matches anything other than :, / or ? followed by a :. Then we can combine that with /^\/\// to test for a protocol-relative URL, which gives us:
// Either
var rexIsProtocol = /(?:^[a-z][a-z0-9+.\-]*:)|(?:^\/\/)/i;
// Or
var rexIsProtocol = /(?:^[^:\/?]+:)|(?:^\/\/)/i;
Then the test is like this:
var url = location.hash.substring(1).trim(); // trim to deal with whitespace
if (rexIsProtocol.test(url)) {
    // It starts with a protocol
} else {
    // It doesn't
}
That said, for the XSS concern the only one I think you need to be particularly bothered by is the javascript: pseudo-protocol, so you might just test for that.
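The idea from the question (resolve the value first, then check that the origin is unchanged) can be sketched with the standard URL constructor. This is an illustrative sketch rather than a vetted security control: the name safeRedirectTarget and its parameters are hypothetical, and it assumes the login page itself is served over http or https.

```javascript
// Hypothetical helper: resolve `hashValue` against the current URL and
// accept it only if it stays on the same origin and uses http(s).
// Anything else (javascript:, other hosts, unparsable input) yields the fallback.
function safeRedirectTarget(hashValue, currentHref, fallback) {
    var current = new URL(currentHref);
    var fallbackUrl = new URL(fallback, current).href;
    if (!hashValue) {
        return fallbackUrl;
    }
    var target;
    try {
        target = new URL(hashValue, current);
    } catch (e) {
        return fallbackUrl; // not parsable as a URL at all
    }
    var sameOrigin = target.origin === current.origin;
    var isHttp = target.protocol === "http:" || target.protocol === "https:";
    return (sameOrigin && isHttp) ? target.href : fallbackUrl;
}

var page = "https://example.com/login#destination";
console.log(safeRedirectTarget("destination", page, "default"));
// → https://example.com/destination
console.log(safeRedirectTarget("javascript:alert(1)", page, "default"));
// → https://example.com/default
console.log(safeRedirectTarget("//looks-like-example.com", page, "default"));
// → https://example.com/default
```

In the page this would be used as location.replace(safeRedirectTarget(location.hash.substring(1), location.href, 'default')). Letting the URL parser do the resolution avoids having to enumerate every scheme or slash variant by hand, which is the main weakness of prefix-matching with a regex.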

javascript window.open without http://

I have a small tool built with Delphi that collects URLs from a file or from the clipboard, and then builds a file called test.htm with content like this:
<!DOCTYPE html>
<html>
<body>
<p>Click the button retrieve the links....</p>
<button onclick="myFunction()">Click me</button>
<p id="demo"></p>
<script>
function myFunction() {
    window.open('http://www.speedtest.net/', '_blank');
    window.open('www.speedtest.net/', '_blank');
    // and so on...
}
</script>
</body>
</html>
The idea is to click on the button, and then a new tab (or window) is created for every url inside myFunction.
This works, but with one small problem.
In the code example there are 2 URLs, one with the http:// prefix and one without it. The first URL works as expected and creates a new tab (or window) with the following URL:
http://www.speedtest.net
The second window.open does not work as I expected. It opens the following URL in the new tab (or window):
file:///c:/myApplication/www.speedtest.net
As you have already figured out, the application is an executable in c:\myApplication
So my question is: is there a way to use window.open to create a new tab (or window) without the path of the application being put in front of the URL?
If this is not possible with window.open, is there another way to do this?
Or is the only way to have the application put http:// in front of every URL that does not have it already?

As you suggested, the only way is to add the http protocol to each URL which is missing it. It's a pretty simple and straightforward solution with other benefits to it.
Consider this piece of code:
function windowOpen(url, name, specs) {
    if (!url.match(/^https?:\/\//i)) {
        url = 'http://' + url;
    }
    return window.open(url, name, specs);
}
What I usually do is also allow specs to be passed as an object, which in my opinion is much more manageable than a string, and even set defaults for specs if needed. You can also generate the name automatically and make that argument optional in case you don't need it.
Here's an example of what the next stage of this function may look like.
function windowOpen(url, name, specs) {
    if (!url.match(/^https?:\/\//i)) {
        url = 'http://' + url;
    }
    // name is optional
    if (typeof name === 'object') {
        specs = name;
        name = null;
    }
    if (!name) {
        name = 'window_' + Math.random();
    }
    if (typeof specs === 'object') {
        for (var specs_keys = Object.keys(specs), i = 0, specs_array = [];
                i < specs_keys.length; i++) {
            specs_array.push(specs_keys[i] + '=' + specs[specs_keys[i]]);
        }
        specs = specs_array.join(',');
    }
    return window.open(url, name, specs);
}

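I think the best way would be to use "//" + url.
In that case it doesn't matter which protocol (http or https) the target ends up using. Note that a protocol-relative URL inherits the scheme of the current page, so this only helps when the page itself is served over http or https; from a file:// page it resolves to file://www.speedtest.net/.
url = url.match(/^https?:/) ? url : '//' + url;
window.open(url, '_blank');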

The only way to do this is to have the application put the http:// in front of every url that does not have it already.

For the behavior you're describing, you have to include the protocol in the URL passed to window.open. You could use a ternary operator to add the protocol only if it isn't already there:
url = url.match(/^http[s]?:\/\//) ? url : 'http://' + url;
Note that some sites need https rather than http, so this is not a complete solution.

I made a small change to the function from iMoses' answer, which worked for me.
Check for both the http and the https protocol, and only prepend http:// when neither is present:
if (!url.match(/^http:\/\//i) && !url.match(/^https:\/\//i)) {
    url = 'http://' + url;
}
I hope this makes it more accurate for other situations!
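The checks from the answers above can be collected into one small helper. This is a sketch under assumptions: the name withDefaultScheme and its defaultScheme parameter are hypothetical, and only http, https and protocol-relative inputs are recognised as already having (or implying) a scheme.

```javascript
// Hypothetical helper: make sure `url` carries an explicit scheme so that
// window.open does not resolve it relative to the current page.
function withDefaultScheme(url, defaultScheme) {
    defaultScheme = defaultScheme || "http";
    url = url.trim();
    if (/^https?:\/\//i.test(url)) {
        return url; // already absolute, leave untouched
    }
    if (/^\/\//.test(url)) {
        return defaultScheme + ":" + url; // protocol-relative: add scheme only
    }
    return defaultScheme + "://" + url; // bare host/path
}

console.log(withDefaultScheme("www.speedtest.net/"));
// → http://www.speedtest.net/
console.log(withDefaultScheme("//www.speedtest.net/", "https"));
// → https://www.speedtest.net/
```

It would then be called as window.open(withDefaultScheme(url), '_blank'). Turning a protocol-relative input into an explicit scheme matters for pages opened from file://, where "//host" would otherwise stay on the file scheme.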

Expand partial URL into full (like in image.src)

I'm trying to write an onerror handler for images that replaces them with a loading image and then periodically tries to reload them. The problem I'm having is that if the loading image fails to load, it goes into an infinite loop of failure. I'm trying to deal with this by checking if the URL is the loading image:
if (photo.src != loadingImage) {
    // Try to reload the image
}
Unfortunately, loadingImage can be a relative URL (/images/loadingImage.jpg), but photo.src is always a full URL (http://example.com/images/loadingImage.jpg). Is there any way to generate this full URL without passing the function any more information? Obviously I could pass it the host name, or require full URLs, but I'd like to keep this function's interface as simple as possible.
EDIT:
Basically what I want is to guarantee that if I do photo.src = loadingImage, that this will be true: photo.src === loadingImage. The constraint is that I know nothing about loadingImage except that it's a valid URL (it could be absolute, relative to the server, or relative to the current page). photo.src can be any (absolute) URL, so not necessarily on the same domain as loadingImage.

Here are a couple of methods people have used to convert relative URLs to absolute ones in JavaScript:
StackOverflow - Getting an absolute URL from a relative one
Debuggable.com - Relative URLs in Javascript
Alternatively, have you considered doing the opposite: converting the absolute URL to a relative one? If loadingImage always contains the entire path section of the URL, then something like this would probably work:
var relativePhotoSrc = photo.src;
if (relativePhotoSrc.indexOf("/") > 0 && relativePhotoSrc.indexOf("http://") == 0) {
    relativePhotoSrc = relativePhotoSrc.replace("http://", "");
    relativePhotoSrc = relativePhotoSrc.substring(relativePhotoSrc.indexOf("/"), relativePhotoSrc.length);
}
alert(relativePhotoSrc);
if (relativePhotoSrc != loadingImage && photo.src != loadingImage) {
    // Try to reload the image
}
There's probably a slightly more efficient/reliable way to do the string manipulation with a regular expression, but this seems to get the job done in my tests.

How about this? The photo should either be a full URL or relative to the current document.
var url;
// There are probably other conditions to add here to make this bulletproof
if (photo.src.indexOf("http://") == 0 ||
        photo.src.indexOf("https://") == 0 ||
        photo.src.indexOf("//") == 0) {
    url = photo.src;
} else {
    url = location.href.substring(0, location.href.lastIndexOf('/')) + "/" + photo.src;
}

Just check if they end with the same string:
var u1 = '/images/loadingImage.jpg';
var u2 = 'http://example.com/images/loadingImage.jpg'
var len = Math.min(u1.length, u2.length);
var eq = u1.substring(u1.length - len) == u2.substring(u2.length - len);
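A stricter comparison than a suffix match is to expand the candidate the same way the browser expands image.src and then compare the full URLs. The sketch below assumes the standard URL constructor is available; the name isSameUrl and the explicit baseHref parameter are hypothetical. In a page, document.baseURI could be supplied as the base so that callers do not need to pass any extra information.

```javascript
// Hypothetical helper: does `candidate` (absolute, root-relative or
// page-relative) resolve to the same URL as the absolute `absoluteSrc`?
function isSameUrl(candidate, absoluteSrc, baseHref) {
    return new URL(candidate, baseHref).href === new URL(absoluteSrc).href;
}

var base = "http://example.com/gallery/page.html";
console.log(isSameUrl("/images/loadingImage.jpg",
    "http://example.com/images/loadingImage.jpg", base)); // → true
console.log(isSameUrl("/images/loadingImage.jpg",
    "http://other.test/images/loadingImage.jpg", base));  // → false
```

Unlike the suffix comparison, this does not treat an image with the same path on a different host as the loading image.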
