How to extract the hostname portion of a URL in JavaScript

Is there a really easy way to start from a full URL:
document.location.href = "http://aaa.bbb.ccc.com/asdf/asdf/sadf.aspx?blah"
And extract just the host part:
aaa.bbb.ccc.com
There's gotta be a JavaScript function that does this reliably, but I can't find it.

Suppose that you have a page with this address: http://sub.domain.com/virtualPath/page.htm (possibly served on a non-default port such as 8080).
Use the following properties in page code to read each part:
Property: Result
window.location.host: sub.domain.com:8080 (just sub.domain.com when the port is the protocol default, 80 for http)
window.location.hostname: sub.domain.com
window.location.protocol: http:
window.location.port: 8080 (an empty string when the port is the protocol default)
window.location.pathname: /virtualPath/page.htm
window.location.origin: http://sub.domain.com (might include :port too *)
Update: about .origin
* As the reference states, browser compatibility for window.location.origin is not clear. I checked it in Chrome: it returned http://sub.domain.com:port if the port is anything but 80, and http://sub.domain.com if the port is 80.
Special thanks to @torazaburo for mentioning that to me.
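The default-port behaviour described above can be checked outside a page with the URL constructor, which exposes the same property names as window.location. This is only a sketch: describeLocation is a hypothetical helper name, and it assumes an environment with a global URL (modern browsers, Node.js).

```javascript
// Hypothetical helper: summarise the parts of any absolute URL string.
function describeLocation(href) {
  const url = new URL(href);
  return {
    host: url.host,         // hostname plus port, if the port is not the default
    hostname: url.hostname, // hostname only
    port: url.port,         // "" when the port is the protocol default
    pathname: url.pathname,
    origin: url.origin,
  };
}

const custom = describeLocation("http://sub.domain.com:8080/virtualPath/page.htm");
console.log(custom.host);     // "sub.domain.com:8080"
console.log(custom.pathname); // "/virtualPath/page.htm"

const standard = describeLocation("http://sub.domain.com:80/virtualPath/page.htm");
console.log(standard.host);   // "sub.domain.com" (the default port is dropped)
console.log(standard.port);   // ""
```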

You could concatenate the location protocol and the host:
var root = location.protocol + '//' + location.host;
For a URL such as 'http://stackoverflow.com/questions', it will return 'http://stackoverflow.com'.

The accepted answer didn't work for me, since I wanted to be able to work with any arbitrary URL, not just the current page URL.
Take a look at the URL object:
var url = new URL("http://aaa.bbb.ccc.com/asdf/asdf/sadf.aspx?blah");
url.protocol; // "http:"
url.hostname; // "aaa.bbb.ccc.com"
url.pathname; // "/asdf/asdf/sadf.aspx"
url.search; // "?blah"
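One thing to keep in mind with the URL object is that the constructor throws on input it cannot parse, such as a string without a scheme. A minimal sketch of a wrapper follows; getHostnameOrNull is a hypothetical name, not part of any API.

```javascript
// Hypothetical wrapper: return the hostname, or null if the string is not an absolute URL.
function getHostnameOrNull(input) {
  try {
    return new URL(input).hostname;
  } catch (err) {
    // new URL() throws a TypeError for unparsable input
    return null;
  }
}

console.log(getHostnameOrNull("http://aaa.bbb.ccc.com/asdf/asdf/sadf.aspx?blah")); // "aaa.bbb.ccc.com"
console.log(getHostnameOrNull("not a url")); // null
```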

Use document.location object and its host or hostname properties.
alert(document.location.hostname); // alerts "stackoverflow.com"

There are two ways. The first is a variant of another answer here, but this one accounts for non-default ports:
function getRootUrl() {
var defaultPorts = {"http:":80,"https:":443};
return window.location.protocol + "//" + window.location.hostname
+ (((window.location.port)
&& (window.location.port != defaultPorts[window.location.protocol]))
? (":"+window.location.port) : "");
}
But I prefer this simpler method (which works with any URI string):
function getRootUrl(url) {
return url.toString().replace(/^(.*\/\/[^\/?#]*).*$/,"$1");
}

Let's suppose you have this url path:
http://localhost:4200/landing?query=1#2
You can read each part from the location properties, as follows:
window.location.hash: "#2"
window.location.host: "localhost:4200"
window.location.hostname: "localhost"
window.location.href: "http://localhost:4200/landing?query=1#2"
window.location.origin: "http://localhost:4200"
window.location.pathname: "/landing"
window.location.port: "4200"
window.location.protocol: "http:"
window.location.search: "?query=1"
Now we can conclude you're looking for:
window.location.hostname

Try
document.location.host
or
document.location.hostname

Use
window.location.origin
and for: "http://aaa.bbb.ccc.ddd.com/sadf.aspx?blah"
you will get: http://aaa.bbb.ccc.ddd.com (the origin has no trailing slash)

There is another hack I use and never saw in any StackOverflow response :
using "src" attribute of an image will yield the complete base path of your site.
For instance :
var dummy = new Image;
dummy.src = '$'; // using '' will fail on some browsers
var root = dummy.src.slice(0,-1); // remove trailing '$'
On a URL like http://domain.com/somesite/index.html,
root will be set to http://domain.com/somesite/.
This also works for localhost or any valid base URL.
Note that this will cause a failed HTTP request on the $ dummy image.
You can use an existing image instead to avoid this, with only slight code changes.
Another variant uses a dummy link, with no side effect on HTTP requests :
var dummy = document.createElement ('a');
dummy.href = '';
var root = dummy.href;
I did not test it on every browser, though.

Check this:
alert(window.location.hostname);
this will return the host name, such as www.domain.com
and:
window.location.host
will return the host name with the port, like www.example.com:8080 (the port is omitted when it is the default for the protocol, e.g. 80 for http)
For complete reference check Mozilla developer site.

I know this is a bit late, but I made a clean little function with a little ES6 syntax
function getHost(href){
return Object.assign(document.createElement('a'), { href }).host;
}
It could also be written without the shorthand property syntax (note that Object.assign itself is still ES6, not ES5):
function getHost(href){
return Object.assign(document.createElement('a'), { href: href }).host;
}
Of course IE doesn't support Object.assign, but in my line of work, that doesn't matter.

I would like to add something. If someone wants the whole URL with its path, as I needed, they can use:
var fullUrl = window.location.protocol + "//" + window.location.hostname + window.location.pathname;

Regex provides much more flexibility.
//document.location.href = "http://aaa.bbb.ccc.com/asdf/asdf/sadf.aspx?blah"
//1.
var r = new RegExp(/http:\/\/[^/]+/);
var match = r.exec(document.location.href) //gives http://aaa.bbb.ccc.com
//2.
var r = new RegExp(/http:\/\/[^/]+\/[^/]+/);
var match = r.exec(document.location.href) //gives http://aaa.bbb.ccc.com/asdf
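The two patterns above only match http:// and would run into the query string when a URL has no path. As a sketch of one possible variant (ORIGIN_PATTERN and matchOrigin are hypothetical names), the pattern below also accepts https and stops at the first "/", "?" or "#":

```javascript
// Scheme (http or https), "//", then everything up to the first "/", "?" or "#".
const ORIGIN_PATTERN = /^https?:\/\/[^\/?#]+/i;

// Hypothetical helper: return the matched scheme + authority, or null if there is no match.
function matchOrigin(href) {
  const match = ORIGIN_PATTERN.exec(href);
  return match ? match[0] : null;
}

console.log(matchOrigin("https://aaa.bbb.ccc.com/asdf/asdf/sadf.aspx?blah")); // "https://aaa.bbb.ccc.com"
console.log(matchOrigin("http://example.com?x=1")); // "http://example.com"
console.log(matchOrigin("/relative/path")); // null
```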

My solution works in all web browsers, including Microsoft Internet Explorer, and doesn't use any regular expressions. It's inspired by Noah Cardoza's and Martin Konecny's solutions:
function getHostname(href) {
if (typeof URL === 'object') {
// workaround for MS IE 11 (Noah Cardoza's solution but without using Object.assign())
var dummyNode = document.createElement('a');
dummyNode.href = href;
return dummyNode.hostname;
} else {
// Martin Konecny's solution
return new URL(href).hostname;
}
}

You can split the URL string using /
const exampleURL = "Https://exampleurl.com/page1/etc/etc"
const URLsplit = exampleURL.split("/")
console.log(URLsplit)
console.log(URLsplit[2])
Result: exampleurl.com
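A caveat with splitting on "/": index 2 holds the whole authority part, so any user info or port comes along with the host name. The sketch below uses a made-up sample URL to compare the split result with what the URL object reports.

```javascript
// Made-up sample URL with user info and a non-default port.
const sampleURL = "https://user@exampleurl.com:8443/page1/etc/etc";

// Splitting on "/" yields the full authority, not just the host name.
const authority = sampleURL.split("/")[2];
console.log(authority); // "user@exampleurl.com:8443"

// The URL object separates the pieces.
const parsed = new URL(sampleURL);
console.log(parsed.hostname); // "exampleurl.com"
console.log(parsed.port);     // "8443"
```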

Related

How to split url to get url path in JavaScript

I have constructed a URL path that points to a different hostname, www.mysite.com. For example:
var myMainSite = 'www.mymainsite.com' + '/somepath';
so this is equivalent to www.mymainsite.com/path/path/needthispath/somepath.
How I'm doing it now is like the code below and this gives me a bunch of indexes of the url in the console.log.
var splitUrl = myMainSite.split('/');
console.log looks like:
0: http://
1: www.
2: mysite.com
3: path
4: path
5: needthispath
6: somepath
and I concat them like splitUrl[5]+'/'+splitUrl[6] and it doesn't look pretty at all.
So my question is how to split/remove url location http://www.mymainsite.com/ to get the url path needthispath/somepath in js? Is there a quicker and cleaner way of doing this?
First solution (URL object)
The URL object can be used for parsing, constructing, normalizing, encoding URLs, and so on.
var url = 'http://www.mymainsite.com/somepath/path2/path3/path4';
var pathname = new URL(url).pathname;
console.log(pathname);
The URL interface represents an object providing static methods used
for creating object URLs.
See the documentation for URL interface on Mozilla MDN
The Browser support is pretty good in 2017 (~ 90% but not IE11 nor below)
Second solution (a kind of a hack)
var urlHack = document.createElement('a');
urlHack.href = 'http://www.mymainsite.com/somepath/path2/path3/path4';
console.log(urlHack.pathname);
// you can even call this object with these properties:
// protocol, host, hostname, port, pathname, hash, search, origin
Why don't you use the split function and work from there.
The split function will break your URL out fully and from there you just need to look for the second last and last items.
Here is an example:
var initial_url = 'http://www.mymainsite.com/path/path/needthispath/somepath';
var url = initial_url.split( '/' );
var updated_url = document.location.hostname + '/' + url[ url.length - 2 ] + '/' + url[ url.length - 1 ];
You can use the URL API, though support is variable.
Alternatively, you could use URI.js.
Both allow you to get different parts of an URL, as well as build new URLs from parts.
function url($url) {
var url = $url.split( '//' );
if (url[0] === "http:" || url[0] === "https:") {
var protocol = url[0] + "//";
var host = url[1].split( '/' )[0];
url = protocol + host;
var path = $url.split(url)[1];
return {
protocol: protocol,
host: host,
path: path
};
}
}
var $url = url("http://www.mymainsite.com/path/path/needthispath/somepath");
console.log($url.protocol); // http://
console.log($url.host); // www.mymainsite.com
console.log($url.path); // /path/path/needthispath/somepath
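For the specific goal in the question (getting needthispath/somepath), the URL object's pathname can be combined with split and slice. This is a sketch only: lastPathSegments is a hypothetical helper name, and count is assumed to be a positive integer.

```javascript
// Hypothetical helper: return the last `count` path segments of an absolute URL.
function lastPathSegments(href, count) {
  // filter(Boolean) drops the empty strings produced by leading or trailing slashes
  const segments = new URL(href).pathname.split("/").filter(Boolean);
  return segments.slice(-count).join("/");
}

console.log(
  lastPathSegments("http://www.mymainsite.com/path/path/needthispath/somepath", 2)
); // "needthispath/somepath"
```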

how to check port number in url string?

Can I check whether the port number is existing in a given URL string or not?
Like sometimes a user can type 202.567.89.254:8088 or http://202.567.89.254:8088/ or http://202.567.89.254.
Out of all the above options, if the port number exists, then do nothing; otherwise append 8080 by default with a trailing slash (8080/).
Is it possible in JavaScript?
You can try to use the location object and use:
location.port
The HTMLHyperlinkElementUtils.port property is a USVString containing
the port number of the URL.
Here you go the easiest way, set the href attribute.
var parser = document.createElement('a');
parser.href = "http://example.com:3000/pathname/?search=test#hash";
console.log(parser.protocol); // => "http:"
console.log(parser.hostname); // => "example.com"
console.log(parser.port); // => "3000"
console.log(parser.pathname); // => "/pathname/"
console.log(parser.host); // => "example.com:3000"
read more here
You can use location.port property.
if (location.port) {
  // then there is a port
} else {
  // no port
}
Try below code
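// Assumes url includes a scheme (e.g. "http://host:8088"), so a port yields three ":"-separated parts
var urlSplit = url.split(":");
var port;
if (urlSplit.length == 3) {
  // parseInt ignores anything after the digits, such as a trailing "/"
  port = parseInt(urlSplit[2], 10);
} else {
  port = 80;
}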
You can check Location
Location.port will serve your purpose
See if this works for you:
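function appendPort(url){
  // ":<digits>" at the end of the string, optionally followed by a trailing slash
  if(!url.match(/:\d+\/?$/)){
    return url.replace(/\/?$/, "") + ":8080/";
  }
  // a port is already present, so return the URL unchanged
  return url;
}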
If you want to do that on user entered location:
if(!location.port){
location.port = 8080; // doing this will take care of rest of the URL component
}
:)
use location.port. Sample example below.
function appendPort(){
if(location.port.length === 0){
location.host = location.host + ":8080/";
}
}
Use this:
if(location.port){
//then there is port
//you may alert() if you want
}
else{
location.port=8080;
}
You can use the Location object of JS:
location.port
Just inspect location in the debugger and you will get host, hostname, href, origin, pathname, port, protocol and many more values.
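Most of the answers above read the port of the current page, while the question is about a string typed by a user. A possible sketch for that case follows. withDefaultPort is a hypothetical name, the 192.0.2.10 address is a placeholder, and a missing scheme is assumed to mean http. One caveat: the URL object reports an explicit default port (such as :80 for http) as an empty string, so that case is treated here as "no port".

```javascript
// Hypothetical helper: make sure a user-entered address carries a port.
function withDefaultPort(input, defaultPort = "8080") {
  // Assumption: input without "scheme://" is meant as http.
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(input);
  const url = new URL(hasScheme ? input : "http://" + input);
  if (url.port === "") {
    url.port = defaultPort;
  }
  // href always ends the authority with "/" when the path is empty
  return url.href;
}

console.log(withDefaultPort("192.0.2.10:8088"));         // "http://192.0.2.10:8088/"
console.log(withDefaultPort("http://192.0.2.10:8088/")); // "http://192.0.2.10:8088/"
console.log(withDefaultPort("http://192.0.2.10"));       // "http://192.0.2.10:8080/"
```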

How to get base url with jquery or javascript?

In joomla php there I can use $this->baseurl to get the base path, but I wanted to get the base path in jquery.
The base path may be any of the following example:
http://www.example.com/
http://localhost/example
http://www.example.com/sub/example
The example may also change.
I think this will work well for you:
var base_url = window.location.origin;
var host = window.location.host;
var pathArray = window.location.pathname.split( '/' );
This one will help you...
var getUrl = window.location;
var baseUrl = getUrl.protocol + "//" + getUrl.host + "/" + getUrl.pathname.split('/')[1];
This will get base url
var baseurl = window.location.origin+window.location.pathname;
document.baseURI returns base URL also respecting the value in <base/> tag
https://developer.mozilla.org/en-US/docs/Web/API/Node/baseURI
This is an extremely old question, but here are the approaches I personally use ...
Get Standard/Base URL
As many have already stated, this works for most situations.
var url = window.location.origin;
Get Absolute Base URL
However, this simple approach can be used to strip off any port numbers.
var url = "http://" + location.host.split(":")[0];
Edit: To address the concern, posed by Jason Rice, the following can be used to automatically insert the correct protocol type ...
var url = window.location.protocol + "//" + location.host.split(":")[0];
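The same port-stripping idea can be sketched with the URL object, which keeps hostname and port apart so no splitting on ":" is needed (splitting would also break on bracketed IPv6 hosts). originWithoutPort is a hypothetical helper name.

```javascript
// Hypothetical helper: protocol + "//" + hostname, with any port left out.
function originWithoutPort(href) {
  const url = new URL(href);
  return `${url.protocol}//${url.hostname}`;
}

console.log(originWithoutPort("https://localhost:8443/app/index.html")); // "https://localhost"
console.log(originWithoutPort("http://[::1]:3000/x"));                   // "http://[::1]"
```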
Set Base URL
As a bonus -- the base URL can then be redefined globally.
document.head.innerHTML = document.head.innerHTML + "<base href='" + url + "' />";
the easiest way to get base url in JavaScript
window.location.origin
This is not possible from javascript, because this is a server-side property. Javascript on the client cannot know where joomla is installed. The best option is to somehow include the value of $this->baseurl into the page javascript and then use this value (phpBaseUrl).
You can then build the url like this:
var loc = window.location;
var baseUrl = loc.protocol + "//" + loc.hostname + (loc.port? ":"+loc.port : "") + "/" + phpBaseUrl;
var getUrl = window.location;
var baseurl = getUrl.origin; //or
var baseurl = getUrl.origin + '/' +getUrl.pathname.split('/')[1];
But you can't say that the baseurl() of CodeIgniter(or php joomla) will return the same value, as it is possible to change the baseurl in the .htaccess file of these frameworks.
For example :
If you have an .htaccess file like this in your localhost :
RewriteEngine on
RewriteBase /CodeIgniter/
RewriteCond $1 !^(index.php|resources|robots.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L,QSA]
The $this->baseurl() will return http://localhost/CodeIgniter/
Had the same issue a while ago, my problem was, I simply needed the base url. There are a lot of detailed options here but to get around this, simply use the window.location object. Actually type this in the browser console and hit enter to select your options there. Well for my case it was simply:
window.location.origin
I've run into this need on several Joomla project. The simplest way I've found to address is to add a hidden input field to my template:
<input type="hidden" id="baseurl" name="baseurl" value="<?php echo $this->baseurl; ?>" />
When I need the value in JavaScript:
var baseurl = document.getElementById('baseurl').value;
Not as fancy as using pure JavaScript but simple and gets the job done.
You can easily get it with:
var currentUrl = window.location.href;
or, if you want only the origin (protocol, hostname and port), use:
var originUrl = window.location.origin;
The format is hostname + pathname + search.
So the URL (without the protocol) is:
var url = window.location.hostname + window.location.pathname + window.location.search
For your case:
window.location.hostname is "stackoverflow.com"
window.location.pathname is "/questions/25203124/how-to-get-base-url-with-jquery-or-javascript"
window.location.search is ""
So basically the base URL is the hostname, window.location.hostname.
I am surprised that none of the answers consider the base URL if it was set in a <base> tag. All current answers try to get the host name or server name or first part of the address. This is the complete logic, which also considers the <base> tag (which may refer to another domain or protocol):
function getBaseURL(){
var elem=document.getElementsByTagName("base")[0];
if (typeof(elem) != 'undefined' && elem != null){
return elem.href;
}
return window.location.origin;
}
Jquery format:
function getBaseURL(){
if ($("base").length){
return $("base").attr("href");
}
return window.location.origin;
}
Without getting involved with the logic above, the shorthand solution which considers both <base> tag and window.location.origin:
Js:
var a=document.createElement("a");
a.href=".";
var baseURL= a.href;
Jquery:
var baseURL= $('<a href=".">')[0].href
Final note: for a local file on your computer (not on a host), window.location.origin only returns file://, but the shorthand solution above returns the complete correct path.
Here's a short one:
const base = new URL('/', location.href).href;
console.log(base);
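The second argument of the URL constructor is the base against which the first argument is resolved, so the same trick works for any page URL, not only location.href. A short sketch, with siteRoot and currentDirectory as hypothetical helper names:

```javascript
// "/" resolves to the root of the site that serves pageUrl.
function siteRoot(pageUrl) {
  return new URL("/", pageUrl).href;
}

// "." resolves to the directory that contains pageUrl.
function currentDirectory(pageUrl) {
  return new URL(".", pageUrl).href;
}

const page = "http://www.example.com/sub/example/index.html";
console.log(siteRoot(page));         // "http://www.example.com/"
console.log(currentDirectory(page)); // "http://www.example.com/sub/example/"
```

Note that neither call can know where an application such as Joomla is installed on the server; they only reflect the structure of the URL itself.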
I would recommend that everyone create an HTML base tag in development and assign the href dynamically, so that in production, whatever host a client uses, it will automatically adapt to it:
<html>
<head>
<title>Some page title</title>
<script type="text/javascript">
var head = document.getElementsByTagName('head')[0];
var base = document.createElement("base");
base.href = window.document.location.origin;
head.appendChild(base);
</script>
</head>
...
So if you are on localhost:8080, you will reach every linked or referenced file from the base, e.g. http://localhost:8080/some/path/file.html
If you are on www.example.com, it will be http://www.example.com/some/path/file.html
Also note that, whatever location you're on, you should not use relative references (such as ../) in hrefs; e.g. a parent-location reference resolves to http://localhost:8080/, not http://localhost:8080/some/path/.
Pretend you reference all hyperlinks as full paths without the base URL.
window.location.origin+"/"+window.location.pathname.split('/')[1]+"/"+page+"/"+page+"_list.jsp"
almost same as Jenish answer but a little shorter.
I was just on the same stage and this solution works for me
In the view
<?php
$document = JFactory::getDocument();
$document->addScriptDeclaration('var base = \''.JURI::base().'\'');
$document->addScript('components/com_name/js/filter.js');
?>
In js file you access base as a variable for example in your scenario:
console.log(base) // will print
// http://www.example.com/
// http://localhost/example
// http://www.example.com/sub/example
I do not remember where I take this information to give credit, if I find it I will edit the answer
A simpler answer is here, window.location.href = window.location.origin;
Easy
$('<img src=>')[0].src
Generates a img with empty src-name forces the browser to calculate the base-url by itself, no matter if you have /index.html or anything else.
Here's something quick that also works with file:// URLs.
I came up with this one-liner:
[((1!=location.href.split(location.href.split("/").pop())[0].length?location.href.split(location.href.split("/").pop())[0]:(location.protocol,location.protocol+"//" + location.host+"/"))).replace(location.protocol+"//"+location.protocol+"//"+location.protocol+"://")]
You mentioned that the example.com may change so I suspect that actually you need the base url just to be able to use relative path notation for your scripts. In this particular case there is no need to use scripting - instead add the base tag to your header:
<head>
<base href="http://www.example.com/">
</head>
I usually generate the link via PHP.
In case anyone would like to see this broken out into a very robust function
function getBaseURL() {
var loc = window.location;
var baseURL = loc.protocol + "//" + loc.hostname;
if (typeof loc.port !== "undefined" && loc.port !== "") baseURL += ":" + loc.port;
// strip leading /
var pathname = loc.pathname;
if (pathname.length > 0 && pathname.substr(0,1) === "/") pathname = pathname.substr(1, pathname.length - 1);
var pathParts = pathname.split("/");
if (pathParts.length > 0) {
for (var i = 0; i < pathParts.length; i++) {
if (pathParts[i] !== "") baseURL += "/" + pathParts[i];
}
}
return baseURL;
}
Split and join the URL:
const s = 'http://free-proxy.cz/en/abc'
console.log(s.split('/').slice(0,3).join('/'))
Getting the base url
|Calls controller from js
function getURL() {
var windowurl = window.location.href;
var baseUrl = windowurl.split('://')[1].split('/')[0]; //split function
var xhr = new XMLHttpRequest();
var url='http://'+baseUrl+'/url from controller';
xhr.open("GET", url);
xhr.send(); //object use to send
xhr.onreadystatechange=function() {
if(xhr.readyState==4 && this.status==200){
//console.log(xhr.responseText); //the response of the request
document.getElementById("id from where you called the function").innerHTML = xhr.responseText;
}
}
}
Put this in your header, so it will be available whenever you need it.
var base_url = "<?php echo base_url();?>";
You will get http://localhost:81/your-path-file or http://localhost/your-path-file.

remove url parameters with javascript or jquery

I am trying to use the youtube data api to generate a video playlist.
However, the video urls require a format of:
youtube.com/watch?v=3sZOD3xKL0Y
but what the api generates is:
youtube.com/watch?v=3sZOD3xKL0Y&feature=youtube_gdata
So what I need to do is be able to select everything after and including the ampersand(&) and remove it from the url.
Any way to do this with javascript and some sort of regular expression?
What am I missing?
Why not:
url.split('?')[0]
Hmm... Looking for better way... here it is
var onlyUrl = window.location.href.replace(window.location.search,'');
Example: http://jsfiddle.net/SjrqF/
var url = 'youtube.com/watch?v=3sZOD3xKL0Y&feature=youtube_gdata';
url = url.slice( 0, url.indexOf('&') );
or:
Example: http://jsfiddle.net/SjrqF/1/
var url = 'youtube.com/watch?v=3sZOD3xKL0Y&feature=youtube_gdata';
url = url.split( '&' )[0];
Use this function:
var getCleanUrl = function(url) {
return url.replace(/#.*$/, '').replace(/\?.*$/, '');
};
// get rid of hash and params
console.log(getCleanUrl('https://sidanmor.com/?firstname=idan&lastname=mor'));
If you want all the href parts, use this:
var url = document.createElement('a');
url.href = 'https://developer.mozilla.org/en-US/search?q=URL#search-results-close-container';
console.log(url.href); // https://developer.mozilla.org/en-US/search?q=URL#search-results-close-container
console.log(url.protocol); // https:
console.log(url.host); // developer.mozilla.org
console.log(url.hostname); // developer.mozilla.org
console.log(url.port); // (blank - https assumes port 443)
console.log(url.pathname); // /en-US/search
console.log(url.search); // ?q=URL
console.log(url.hash); // #search-results-close-container
console.log(url.origin); // https://developer.mozilla.org
//user113716 code is working but i altered as below. it will work if your URL contain "?" mark or not
//replace URL in browser
if(window.location.href.indexOf("?") > -1) {
var newUrl = refineUrl();
window.history.pushState("object or string", "Title", "/"+newUrl );
}
function refineUrl()
{
//get full url
var url = window.location.href;
//keep everything before the "?"
var value = url.slice( 0, url.indexOf('?') );
//strip the base URL (a value rendered on the server) so only the path remains
value = value.replace('@System.Web.Configuration.WebConfigurationManager.AppSettings["BaseURL"]','');
return value;
}
This worked for me:
window.location.replace(window.location.pathname)
No splits. :) The foolproof way is to let the browser's built-in URL and URLSearchParams functions do the heavy lifting for you.
// Summary: once params is built (steps 1-3 below), this one line replaces the query string in all current browsers
window.history.replaceState({}, '', `${location.pathname}?${params}`);
// 1. Get your URL
let url = new URL('https://tykt.org?unicorn=1&printer=2&scanner=3');
console.log("URL: "+ url.toString());
// 2. Get your params
let params = new URLSearchParams(url.search);
console.log("querys: " + params.toString());
// 3. Delete the printer param; it is now gone from the query string
params.delete('printer');
console.log("Printer Removed: " + params.toString());
Now putting it all together in your live browser:
// 4. The steps above show how to get your params; now replace them in your current browser
window.history.replaceState({}, '', `${location.pathname}?${params}`);
Sample working Javascript Fiddle here
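The same steps can be wrapped in a small function that works on any URL string and does not touch the browser history. This is a sketch under that assumption; removeQueryParams is a hypothetical name.

```javascript
// Hypothetical helper: return href with the named query parameters removed.
function removeQueryParams(href, names) {
  const url = new URL(href);
  for (const name of names) {
    url.searchParams.delete(name); // updates url.search in place
  }
  return url.href;
}

console.log(
  removeQueryParams("https://youtube.com/watch?v=3sZOD3xKL0Y&feature=youtube_gdata", ["feature"])
); // "https://youtube.com/watch?v=3sZOD3xKL0Y"
```

When the last parameter is removed, the "?" disappears from the result as well, so no separate clean-up step is needed.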
You could use a RegEx to match the value of v and build the URL yourself since you know the URL is youtube.com/watch?v=...
http://jsfiddle.net/akURz/
var url = 'http://youtube.com/watch?v=3sZOD3xKL0Y';
alert(url.match(/v\=([a-z0-9]+)/i));
Well, I am using this:
stripUrl(urlToStrip){
let stripped = urlToStrip.split('?')[0];
stripped = stripped.split('&')[0];
stripped = stripped.split('#')[0];
return stripped;
}
or:
stripUrl(urlToStrip){
return urlToStrip.split('?')[0].split('&')[0].split('#')[0];
}
For example, suppose the current URL (window.location.href) is:
example.com/list/search?q=Somethink
and you need this URL instead:
example.com/list/edit
Cut everything from /search onward, which leaves example.com/list, then append /edit:
var url = window.location.href;
url = url.split('/search')[0];
url = url + '/edit';
This is a simple solution. :-)

Trying to Validate URL Using JavaScript

I want to validate a URL and display message. Below is my code:
$("#pageUrl").keydown(function(){
$(".status").show();
var url = $("#pageUrl").val();
if(isValidURL(url)){
$.ajax({
type: "POST",
url: "demo.php",
data: "pageUrl="+ url,
success: function(msg){
if(msg == 1 ){
$(".status").html('<img src="images/success.gif"/><span><strong>SiteID:</strong>12345678901234456</span>');
}else{
$(".status").html('<img src="images/failure.gif"/>');
}
}
});
}else{
$(".status").html('<img src="images/failure.gif"/>');
}
});
function isValidURL(url){
var RegExp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
if(RegExp.test(url)){
return true;
}else{
return false;
}
}
My problem is now it will show an error message even when entering a proper URL until it matches regular expression, and it return true even if the URL is something like "http://wwww".
I appreciate your suggestions.
Someone mentioned the Jquery Validation plugin, seems overkill if you just want to validate the url, here is the line of regex from the plugin:
return this.optional(element) || /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i.test(value);
Here is where they got it from: http://projects.scottsplayground.com/iri/
Pointed out by @nhahtdh: this has been updated to:
// Copyright (c) 2010-2013 Diego Perini, MIT licensed
// https://gist.github.com/dperini/729294
// see also https://mathiasbynens.be/demo/url-regex
// modified to allow protocol-relative URLs
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
source: https://github.com/jzaefferer/jquery-validation/blob/c1db10a34c0847c28a5bd30e3ee1117e137ca834/src/core.js#L1349
It's not practical to parse URLs using regex. A full implementation of the RFC1738 rules would result in an enormously long regex (assuming it's even possible). Certainly your current expression fails many valid URLs, and passes invalid ones.
Instead:
a. use a proper URL parser that actually follows the real rules. (I don't know of one for JavaScript; it would probably be overkill. You could do it on the server side though). Or,
b. just trim away any leading or trailing spaces, then check it has one of your preferred schemes on the front (typically ‘http://’ or ‘https://’), and leave it at that. Or,
c. attempt to use the URL and see what lies at the end, for example by sending it an HTTP HEAD request from the server side. If you get a 404 or connection error, it's probably wrong.
it return true even if url is something like "http://wwww".
Well, that is indeed a perfectly valid URL.
If you want to check whether a hostname such as ‘wwww’ actually exists, you have no choice but to look it up in the DNS. Again, this would be server-side code.
function validateURL(textval) {
var urlregex = /^(https?|ftp):\/\/([a-zA-Z0-9.-]+(:[a-zA-Z0-9.&%$-]+)*@)*((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|([a-zA-Z0-9-]+\.)*[a-zA-Z0-9-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(:[0-9]+)*(\/($|[a-zA-Z0-9.,?'\\+&%$#=~_-]+))*$/;
return urlregex.test(textval);
}
This can return true for URLs like:
http://stackoverflow.com/questions/1303872/url-validation-using-javascript
or:
http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2
I also wrote a URL validation function based on RFC 1738 and RFC 3986 to check http and https URLs. I tried to keep it modular, so it can be better maintained and adapted to your own requirements.
The RegExp in one line is shown at the end of this post.
The RegExp accepts HTTP and HTTPS URLs with some international domains or an IPv4 address. IPv6 is not supported yet.
window.isValidURL = (function() {// wrapped in self calling function to prevent global pollution
//URL pattern based on rfc1738 and rfc3986
var rg_pctEncoded = "%[0-9a-fA-F]{2}";
var rg_protocol = "(http|https):\\/\\/";
var rg_userinfo = "([a-zA-Z0-9$\\-_.+!*'(),;:&=]|" + rg_pctEncoded + ")+" + "@";
var rg_decOctet = "(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])"; // 0-255
var rg_ipv4address = "(" + rg_decOctet + "(\\." + rg_decOctet + "){3}" + ")";
var rg_hostname = "([a-zA-Z0-9\\-\\u00C0-\\u017F]+\\.)+([a-zA-Z]{2,})";
var rg_port = "[0-9]+";
var rg_hostport = "(" + rg_ipv4address + "|localhost|" + rg_hostname + ")(:" + rg_port + ")?";
// chars sets
// safe = "$" | "-" | "_" | "." | "+"
// extra = "!" | "*" | "'" | "(" | ")" | ","
// hsegment = *[ alpha | digit | safe | extra | ";" | ":" | "@" | "&" | "=" | escape ]
var rg_pchar = "a-zA-Z0-9$\\-_.+!*'(),;:@&=";
var rg_segment = "([" + rg_pchar + "]|" + rg_pctEncoded + ")*";
var rg_path = rg_segment + "(\\/" + rg_segment + ")*";
var rg_query = "\\?" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";
var rg_fragment = "\\#" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";
var rgHttpUrl = new RegExp(
"^"
+ rg_protocol
+ "(" + rg_userinfo + ")?"
+ rg_hostport
+ "(\\/"
+ "(" + rg_path + ")?"
+ "(" + rg_query + ")?"
+ "(" + rg_fragment + ")?"
+ ")?"
+ "$"
);
// export public function
return function (url) {
if (rgHttpUrl.test(url)) {
return true;
} else {
return false;
}
};
})();
RegExp in one line:
var rg = /^(http|https):\/\/(([a-zA-Z0-9$\-_.+!*'(),;:&=]|%[0-9a-fA-F]{2})+@)?(((25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])(\.(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])){3})|localhost|([a-zA-Z0-9\-\u00C0-\u017F]+\.)+([a-zA-Z]{2,}))(:[0-9]+)?(\/(([a-zA-Z0-9$\-_.+!*'(),;:@&=]|%[0-9a-fA-F]{2})*(\/([a-zA-Z0-9$\-_.+!*'(),;:@&=]|%[0-9a-fA-F]{2})*)*)?(\?([a-zA-Z0-9$\-_.+!*'(),;:@&=\/?]|%[0-9a-fA-F]{2})*)?(\#([a-zA-Z0-9$\-_.+!*'(),;:@&=\/?]|%[0-9a-fA-F]{2})*)?)?$/;
In a similar situation I got away with this:
someUtils.validateURL = function(url) {
var parser = document.createElement('a');
try {
parser.href = url;
return !!parser.hostname;
} catch (e) {
return false;
}
};
i.e. why reinvent the wheel if browsers can do it for you? But, of course, this will only work in the browser.
The parser exposes the various parts of the parsed URL exactly as the browser would interpret them:
parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port; // => "8080"
parser.pathname; // => "/path/"
parser.search; // => "?search=test"
parser.hash; // => "#hash"
parser.host; // => "example.com:8080"
Using these, you can refine your validation function to suit your requirements. The only drawback is that it accepts relative URLs and fills in the current page's host and port. You can use this to your advantage, though, by re-assembling the URL from its parts and always passing it in full to your AJAX service.
What validateURL rejects is a malformed URL: e.g. http:\:8883 returns false, but :1234 is valid and is interpreted as http://pagehost.example.com/:1234, i.e. as a relative path.
UPDATE
This approach no longer works in Chrome and other WebKit browsers. Even when the URL is invalid, hostname is filled with some value, e.g. one taken from the base. It still helps to parse the parts of a URL, but it cannot be used to validate one.
A possibly better approach that avoids writing your own parser is to call var parsedURL = new URL(url) and catch exceptions; see e.g. the URL API. It is supported by all major browsers and Node.js, although it is still marked experimental.
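As a rough sketch of how the "always pass the URL in full" idea could look with the URL constructor instead of an anchor element: the helper name toAbsoluteUrl and the example base URL are made up for illustration, and the base would normally be the current page address.

```javascript
// Hypothetical helper (not from the answer above): resolve user input against
// a base URL and return the full, absolute form, or null if parsing fails.
function toAbsoluteUrl(input, base) {
  try {
    // The second argument is only consulted when `input` is a relative URL.
    return new URL(input, base).href;
  } catch (e) {
    // The URL constructor throws a TypeError for input it cannot parse.
    return null;
  }
}

// A relative input picks up scheme, host and port from the base.
console.log(toAbsoluteUrl('/path/?search=test#hash', 'http://example.com:8080/'));
// A host containing a space is rejected, so this logs null.
console.log(toAbsoluteUrl('http://exa mple.com/', 'http://example.com:8080/'));
```

Note that, just like the anchor-element trick, this still accepts relative input; it only makes the resolution explicit and reports unparseable input instead of silently falling back to the base.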
The best regex I found is from AngularJS (http://angularjs.org/):
var urlregex = /^(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?$/;
This is what worked for me:
function validateURL(value) {
return /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*@)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|@)|\/|\?)*)?$/i.test(value);
}
From there, it is just a matter of calling the function to get true or false back:
validateURL(urltovalidate);
I know this is quite an old question, but since it does not have an accepted answer, I suggest using the URI.js library: https://github.com/medialize/URI.js
You can use it to check for a malformed URI with a try/catch block:
function isValidURL(url)
{
try {
(new URI(url));
return true;
}
catch (e) {
// Malformed URI
return false;
}
}
Of course, it will consider something like "%#" a well-formed relative URI, so I suggest reading the URI.js API to perform more checks. For example, to make sure that the user entered a well-formed absolute URL, you can do this:
function isValidURL(url)
{
try {
var uri = new URI(url);
// URI has a scheme and a host
return (!!uri.scheme() && !!uri.host());
}
catch (e) {
// Malformed URI
return false;
}
}
Install an npm package such as
https://www.npmjs.com/package/valid-url
and use it to validate your URL.
You can use the URL API, which was recently standardized. Browser support is patchy in older browsers, so check compatibility first (see the link). new URL(str) is guaranteed to throw a TypeError for invalid URLs.
As stated above, http://wwww is a valid URL.
The URL API can be used to validate the structure of a URL string.
An error is thrown when trying to parse an invalid URL string into a URL object. This can be abstracted into a helper function (TypeScript snippet below):
function isValidURL(url: string): boolean {
  try {
    // The parameter must not be named URL, or it would shadow the constructor
    new URL(url);
    return true;
  } catch (err) {
    return false;
  }
}
isValidURL('https://www.google.com'); // returns true
isValidURL('localhost:3000'); // returns true
isValidURL('not-a-valid-url'); // returns false
isValidURL('google.com'); // returns false (see the caveats below)
If you strictly want only HTTP / web links to be valid, simply add a condition to the return statement:
...
const parsed = new URL(url);
return parsed.protocol === 'https:' || parsed.protocol === 'http:';
...
Granted, this approach comes with a few caveats:
No support for the URL API in Internet Explorer (could be fixed with a polyfill)
Without additional checks, URLs without a protocol are seen as invalid unless they contain a port (e.g. google.com is invalid but google.com:3000 is accepted, because google.com: is parsed as the scheme). This may be unintended behaviour for some use cases.
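For reference, a minimal plain-JavaScript sketch of the HTTP-only variant described above, written out as one complete function. The name isValidHttpUrl is an illustrative choice, not part of the original snippet.

```javascript
// Sketch: accept only absolute http(s) URLs, relying on the URL constructor
// for parsing. The function name is hypothetical.
function isValidHttpUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch (err) {
    // The constructor throws a TypeError when the string is not an absolute URL.
    return false;
  }
  // Inputs such as "localhost:3000" parse successfully, but with the
  // scheme "localhost:", so the protocol check filters them out.
  return parsed.protocol === 'http:' || parsed.protocol === 'https:';
}

console.log(isValidHttpUrl('https://www.google.com')); // true
console.log(isValidHttpUrl('localhost:3000'));         // false
console.log(isValidHttpUrl('google.com'));             // false
```

With the protocol check in place, the host:port caveat no longer lets input like google.com:3000 through, at the cost of rejecting every non-HTTP scheme.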
If you're looking for a more reliable regex, check out RegexLib. Here's the page you'd probably be interested in:
http://regexlib.com/Search.aspx?k=url
As for the error messages showing while the person is still typing, change the event from keydown to blur and then it will only check once the person moves to the next element.
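A small sketch of the keydown-to-blur change, assuming the field is a standard DOM input element. The helper name validateOnBlur and its callback parameters are hypothetical; any of the validation functions on this page could be passed in as validate.

```javascript
// Hypothetical helper: run `validate` once the field loses focus instead of
// on every keystroke, and hand the boolean result to `onResult`.
function validateOnBlur(input, validate, onResult) {
  input.addEventListener('blur', function () {
    onResult(validate(input.value));
  });
}

// Possible usage in a page (element id and callback are assumptions):
// validateOnBlur(document.getElementById('url'), validateURL, function (ok) {
//   document.getElementById('url-error').hidden = ok;
// });
```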
var urlRegex = /^https?:\/\/(www\.)?[A-Za-z0-9]+([\-\.]{1}[A-Za-z0-9]+)*\.[A-Za-z]{2,40}(:[0-9]{1,40})?(\/.*)?$/i;
My solution:
function isValidUrl(t)
{
  // test() returns a boolean, whereas match() returns an array or null
  return /^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+)(:(\d+))?\/?/i.test(t);
}
Demo : http://jsbin.com/uzimeb/1/edit
function checkURL(value) {
  // Regex literal instead of new RegExp("..."): inside a string literal, escapes such as \. lose their backslash
  var urlregex = /^(http|https|ftp)\:\/\/([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*@)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(\/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$/;
  return urlregex.test(value);
}
I have found a great resource for comparing different solutions:
https://mathiasbynens.be/demo/url-regex
According to that page, only the solution from diegoperini passes all tests. Here is that regex (in PHP/PCRE syntax):
_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS
I tried a lot of URL validators found via Google and none of them worked for me. For example, I want input like 'aa.com' to count as valid. I prefer a simple check for a dot in the string.
function isValidUri(str) {
var dotIndex = str.indexOf('.');
return (dotIndex > 0 && dotIndex < str.length - 2);
}
The dot must not be at the start of the string and must be followed by at least two characters (for now there are no one-character top-level domain names).
Here's a regular expression which might fit the bill (it's very long):
/^(?:\u0066\u0069\u006C\u0065\u003A\u002F{2}(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\
u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\
u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0
039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)|[\u0041-\u005A\u0061-\u007A][\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002B\u002D\u002E]*\u003A(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u00
31[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u
0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\
u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)(?:\u003F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?(?:\u0023(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?)$/
There are some caveats to its usage, namely it does not validate URIs which contain additional information after the user name (e.g. "username:password"). Also, only IPv6 addresses can be contained within the IP literal syntax and the "IPvFuture" syntax is currently ignored and will not validate against this regular expression. Port numbers are also constrained to be between 0 and 65,535. Also, only the file scheme can use triple slashes (e.g. "file:///etc/sysconfig") and can ignore both the query and fragment parts of a URI. Finally, it is geared towards regular URIs and not IRIs, hence the extensive focus on the ASCII character set.
This regular expression could be expanded upon, but it's already complex and long enough as it is. I also cannot guarantee it's going to be "100% accurate" or "bug free", but it should correctly validate URIs for all schemes.
You will need to do additional verification for any scheme-specific requirements or do URI normalization as this regular expression will validate a very broad range of URIs.
Try editing your isValidURL function as follows:
function isValidURL(url) {
var encodedURL = encodeURIComponent(url);
var isValid = false;
$.ajax({
url: "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22" + encodedURL + "%22&format=json",
type: "get",
async: false,
dataType: "json",
success: function(data) {
isValid = data.query.results != null;
},
error: function(){
isValid = false;
}
});
return isValid;
}
This should do the trick.
