Please read - this isn't just simply doing json_encode from php to javascript.
Our billing system uses Authorize.net. Using the Authorize.net API, in PHP we create a token request. The data in the request passes customer info, balance, billing address, etc - that data is sent directly with PHP. In response we get a token to be processed in PHP - and then that token is embedded into the HTML form input.
After the credit card form is submitted, we get a javascript json response back to the form to be processed by other JS functions. All works fine until we have a customer with the & in their company name (IE: Bar & Grill)
The & triggers an error only in the json response back to the form - specifically the error we get is: unexpected end of json input which causes the rest of the scripts to error out.
So, the issue is, does the customer data in the PHP token request need to be urlencoded - or is there a special way to handle the json response? From what I can tell, Authorize simply returns the exact customer data in the json response - so if we url encode it on the front end (before the token request is sent), then does that mean we also need to url decode the json response.
Its kind of a chicken and an egg which came first problem.
Authorize.net Token Request (in PHP):
$customerData = new AnetAPI\CustomerDataType();
$customerData->setType("business");
$customerData->setId($ss['authCustID']);
$customerData->setEmail($ii['cEmail']);
// Set the Bill To info for new payment type
$billTo = new AnetAPI\CustomerAddressType();
$billTo->setFirstName($ii['cFirstName']);
$billTo->setLastName($ii['cLastName']);
$billTo->setCompany($ii['cName']); // #1 <----- IE: "Bar & Grill"
$billTo->setAddress($ii['cAddress']. " " .$ii['cAddress1']);
$billTo->setCity($ii['cCity']);
$billTo->setState($ii['cState']);
$billTo->setZip($ii['cZip']);
$billTo->setCountry("USA");
// set shipping profile
$shipTo = clone $billTo ;
// extend billTop profile
$billTo->setFaxNumber('8005551212') ;
$billTo->setPhoneNumber($ii['cPhone']);
//create a transaction
$transactionRequestType = new AnetAPI\TransactionRequestType();
$transactionRequestType->setTransactionType("authCaptureTransaction");
$transactionRequestType->setAmount($ss['balance']);
$transactionRequestType->setCustomer($customerData) ;
$transactionRequestType->setOrder($order) ;
$transactionRequestType->setBillTo($billTo) ;
$transactionRequestType->setShipTo($shipTo) ;
// Build transaction request
$request = new AnetAPI\GetHostedPaymentPageRequest();
$request->setMerchantAuthentication($merchantAuthentication);
$request->setRefId($refId);
$request->setTransactionRequest($transactionRequestType);
//execute request
$controller = new AnetController\GetHostedPaymentPageController($request);
$response = $controller->executeWithApiResponse(\net\authorize\api\constants\ANetEnvironment::PRODUCTION); // SANDBOX or PRODUCTION
$gToken=[] ;
if (($response != null) && ($response->getMessages()->getResultCode() == "Ok")) {
//echo $response->getToken()."\n";
$gToken["Error"] = 0 ;
$gToken["Token"] = $response->getToken() ;
} else {
//echo "ERROR : Failed to get hosted payment page token\n";
$errorMessages = $response->getMessages()->getMessage();
//echo "RESPONSE : " . $errorMessages[0]->getCode() . " " .$errorMessages[0]->getText() . "\n";
$gToken["Error"] = 1 ;
$gToken["errMsg"] = $errorMessages[0]->getCode() . ": " .$errorMessages[0]->getText() ;
}
Form/JSON response handler:
AuthorizeNetPopup.onReceiveCommunication = function (querystr) {
var params = parseQueryString(querystr);
//console.log(params) ;
switch (params["action"]) {
case "successfulSave":
AuthorizeNetPopup.closePopup();
break;
case "cancel":
AuthorizeNetPopup.closePopup();
break;
case "transactResponse":
// 'response' is a string value
// encode it as an object, to be passed to global function
// that will decode it again for PHP
console.log(params["response"]) ;
var response = JSON.parse(params["response"]); // #2 <--- ERROR: unexpected end of json input
//var response = params["response"];
httpReceipt(response) ;
AuthorizeNetPopup.closePopup();
break;
case "resizeWindow":
var w = parseInt(params["width"]);
var h = parseInt(params["height"]);
var ifrm = document.getElementById("iframeAuthorizeNet");
ifrm.style.width = w.toString() + "px";
ifrm.style.height = h.toString() + "px";
centerPopup();
break;
}
};
Do I url encode data at #1 (in php section) or do I url encode the json response before processing it (at #2) - or both? Very confused on how to handle this. Do we need to encode the data of eachparameter being added to the token request - or can we just encode the entire request before its submitted? Regardless of which end of the communication gets encoding/decoding applied, what are the proper encoding/decoding calls?
UPDATE
To illustrate flow:
paymentPage.html ->PHP generates token, embeds token into page form, also has iframe for paymentPageFrame.html
paymentPageFrame.html -> iFrame communicator page, relays msgs between paymentPage.html and authorize.net
paymentPage.html -> javascript onReceiveCommunication to process messages coming from paymentPageFrame.html
Turns out that authorize.net is returning a URL string to paymentPageFrame.html - the string is not urlencoded. The string is then getting passed back to the parent onReceiveCommunication at which point its being parsed with a custom parser:
function parseQueryString(str) {
var vars = [];
var arr = str.split('&');
var pair;
for (var i = 0; i < arr.length; i++) {
pair = arr[i].split('=');
vars.push(pair[0]);
vars[pair[0]] = unescape(pair[1]);
}
return vars;
}
This custom parser is then causing a data value that has the & in it (IE: "company" : "Bar & Grill") to split the name in half ultimately leading to the invalid/unexpected end of json input.
The string being passsed back to the iFrame, which in turn gets passed to the parent is:
action=transactResponse&response={"accountType":"Visa","accountNumber":"XXXX0623","transId":"62830720474","responseCode":"1","authorization":"185778","shipTo":{"firstName":"Bob","lastName":"Jones","company":"Backyard Bar & Grill","address":"123 Main St ","city":"Tampa","state":"FL","zip":"33606","country":"USA"},"orderDescription":"2020-12-01 bill for All Regions/All Sites","customerId":"1002-0","totalAmount":"1.50","orderInvoiceNumber":"11386-0","dateTime":"2/2/2021 4:57:53 PM","refId":"ref1612285039"}
Now that the string is in the parent page, I am trying to figure the best way to encode it then parse it so it won't break on parsing when a data value has & in it (IE: Backyard Bar & Grill)
So far I am trying, without sucess:
var urlstr = encodeURIComponent(response) ; // encode it first
var newUrl = new URLSearchParams(urlstr) ; // then proper parse it
But when I try to access the parameters, it returns null:
consoole.log(newUrl.get('action')) ;
> null
Your problem is that you split on the & characters which is also inside the JSON response. For a quick solution I suggest something like:
var str = 'action=transactResponse&response={"accountType":"Visa","accountNumber":"XXXX0623","transId":"62830720474","responseCode":"1","authorization":"185778","shipTo":{"firstName":"Bob","lastName":"Jones","company":"Backyard Bar & Grill","address":"123 Main St ","city":"Tampa","state":"FL","zip":"33606","country":"USA"},"orderDescription":"2020-12-01 bill for All Regions/All Sites","customerId":"1002-0","totalAmount":"1.50","orderInvoiceNumber":"11386-0","dateTime":"2/2/2021 4:57:53 PM","refId":"ref1612285039"}';
var response = JSON.parse(str.replace('action=transactResponse&response=', ''));
console.log(response);
console.log(response.shipTo.company);
JS fiddle: https://jsfiddle.net/vx9zr1b7/
This just replaces away the first parameter and the name of the second, so only the value of the second (and thereby the value of "response") will remain. Note that this only works under the assumption that the response is always in this very same format, and does not have any additional parameters in the response.
I suggest you put it inside your case "transactResponse": block.
This will parse you the JSON:
{
accountNumber: "XXXX0623",
accountType: "Visa",
authorization: "185778",
customerId: "1002-0",
dateTime: "2/2/2021 4:57:53 PM",
orderDescription: "2020-12-01 bill for All Regions/All Sites",
orderInvoiceNumber: "11386-0",
refId: "ref1612285039",
responseCode: "1",
shipTo: {
address: "123 Main St ",
city: "Tampa",
company: "Backyard Bar & Grill",
country: "USA",
firstName: "Bob",
lastName: "Jones",
state: "FL",
zip: "33606"
},
totalAmount: "1.50",
transId: "62830720474"
}
Note that the & became an & in the process of parsing the JSON. Therefore, if you are doing anything else with it than just displaying it in the browser, I suggest you html_entity_decode() the value on the PHP side.
After much tinkering around, i finally wrote my own parser that:
A: if exists, capture the response JSON first
B: then parse the remainder of the URL params:
C: return all params and response object as a single object
var str = 'action=transactResponse&response={"accountType":"Visa","accountNumber":"XXXX0623","transId":"62830720474","responseCode":"1","authorization":"185778","shipTo":{"firstName":"Preston","lastName":"Rolinger","company":"Backyard & Grill","address":"2808 W Foster Ave S ","city":"Tampa","state":"FL","zip":"33611","country":"USA"},"orderDescription":"2020-12-01 bill for All Regions/All Sites","customerId":"1002-0","totalAmount":"1.50","orderInvoiceNumber":"11386-0","dateTime":"2/2/2021 4:57:53 PM","refId":"ref1612285039"}&moreKey="other value"' ;
function parseResponse(str) {
var resInfo = [], res = {} ;
if (str.match("&response=")) {
var params = str.split("&response=") ; // leading params & response obj
if (params[1].match("}&")) {
resInfo = params[1].split("}&") ; // splits off anything at end of obj
res.response = JSON.parse(resInfo[0] + "}") ; // add } to complete obj string again
} else {
res.response = JSON.parse(params[1]) ; // else it already is the json string
}
if (resInfo[1]) {
params = params[0]+ "&" +resInfo[1] ; // join url params back together
} else {
params = params[0] ;
}
} else {
params = str ; // else the str has no "response" in it, process just params
}
params = new URLSearchParams(encodeURI(params)) ; //encode then parse
params.forEach(function(value, key) {
res[key] = value ;
}) ;
return res ;
}
var response = parseResponse(str) ;
console.log(response) ;
I want to validate a URL and display message. Below is my code:
$("#pageUrl").keydown(function(){
$(".status").show();
var url = $("#pageUrl").val();
if(isValidURL(url)){
$.ajax({
type: "POST",
url: "demo.php",
data: "pageUrl="+ url,
success: function(msg){
if(msg == 1 ){
$(".status").html('<img src="images/success.gif"/><span><strong>SiteID:</strong>12345678901234456</span>');
}else{
$(".status").html('<img src="images/failure.gif"/>');
}
}
});
}else{
$(".status").html('<img src="images/failure.gif"/>');
}
});
function isValidURL(url){
var RegExp = /(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/;
if(RegExp.test(url)){
return true;
}else{
return false;
}
}
My problem is now it will show an error message even when entering a proper URL until it matches regular expression, and it return true even if the URL is something like "http://wwww".
I appreciate your suggestions.
Someone mentioned the Jquery Validation plugin, seems overkill if you just want to validate the url, here is the line of regex from the plugin:
return this.optional(element) || /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i.test(value);
Here is where they got it from: http://projects.scottsplayground.com/iri/
Pointed out by #nhahtdh This has been updated to:
// Copyright (c) 2010-2013 Diego Perini, MIT licensed
// https://gist.github.com/dperini/729294
// see also https://mathiasbynens.be/demo/url-regex
// modified to allow protocol-relative URLs
return this.optional( element ) || /^(?:(?:(?:https?|ftp):)?\/\/)(?:\S+(?::\S*)?#)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,})).?)(?::\d{2,5})?(?:[/?#]\S*)?$/i.test( value );
source: https://github.com/jzaefferer/jquery-validation/blob/c1db10a34c0847c28a5bd30e3ee1117e137ca834/src/core.js#L1349
It's not practical to parse URLs using regex. A full implementation of the RFC1738 rules would result in an enormously long regex (assuming it's even possible). Certainly your current expression fails many valid URLs, and passes invalid ones.
Instead:
a. use a proper URL parser that actually follows the real rules. (I don't know of one for JavaScript; it would probably be overkill. You could do it on the server side though). Or,
b. just trim away any leading or trailing spaces, then check it has one of your preferred schemes on the front (typically ‘http://’ or ‘https://’), and leave it at that. Or,
c. attempt to use the URL and see what lies at the end, for example by sending it am HTTP HEAD request from the server-side. If you get a 404 or connection error, it's probably wrong.
it return true even if url is something like "http://wwww".
Well, that is indeed a perfectly valid URL.
If you want to check whether a hostname such as ‘wwww’ actually exists, you have no choice but to look it up in the DNS. Again, this would be server-side code.
function validateURL(textval) {
var urlregex = /^(https?|ftp):\/\/([a-zA-Z0-9.-]+(:[a-zA-Z0-9.&%$-]+)*#)*((25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]?)(\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])){3}|([a-zA-Z0-9-]+\.)*[a-zA-Z0-9-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(:[0-9]+)*(\/($|[a-zA-Z0-9.,?'\\+&%$#=~_-]+))*$/;
return urlregex.test(textval);
}
This can return true for URLs like:
http://stackoverflow.com/questions/1303872/url-validation-using-javascript
or:
http://regexlib.com/DisplayPatterns.aspx?cattabindex=1&categoryId=2
I written also a URL validation function base on rfc1738 and rfc3986 to check http and https urls. I try to hold this modular, so it can be better maintained and adapted to own requirements.
The RegExp in one line is show at end of this post.
The RegExp accept HTTP and HTTPS URLs with some international domain or IPv4 number. IPv6 is not supported yet.
window.isValidURL = (function() {// wrapped in self calling function to prevent global pollution
//URL pattern based on rfc1738 and rfc3986
var rg_pctEncoded = "%[0-9a-fA-F]{2}";
var rg_protocol = "(http|https):\\/\\/";
var rg_userinfo = "([a-zA-Z0-9$\\-_.+!*'(),;:&=]|" + rg_pctEncoded + ")+" + "#";
var rg_decOctet = "(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])"; // 0-255
var rg_ipv4address = "(" + rg_decOctet + "(\\." + rg_decOctet + "){3}" + ")";
var rg_hostname = "([a-zA-Z0-9\\-\\u00C0-\\u017F]+\\.)+([a-zA-Z]{2,})";
var rg_port = "[0-9]+";
var rg_hostport = "(" + rg_ipv4address + "|localhost|" + rg_hostname + ")(:" + rg_port + ")?";
// chars sets
// safe = "$" | "-" | "_" | "." | "+"
// extra = "!" | "*" | "'" | "(" | ")" | ","
// hsegment = *[ alpha | digit | safe | extra | ";" | ":" | "#" | "&" | "=" | escape ]
var rg_pchar = "a-zA-Z0-9$\\-_.+!*'(),;:#&=";
var rg_segment = "([" + rg_pchar + "]|" + rg_pctEncoded + ")*";
var rg_path = rg_segment + "(\\/" + rg_segment + ")*";
var rg_query = "\\?" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";
var rg_fragment = "\\#" + "([" + rg_pchar + "/?]|" + rg_pctEncoded + ")*";
var rgHttpUrl = new RegExp(
"^"
+ rg_protocol
+ "(" + rg_userinfo + ")?"
+ rg_hostport
+ "(\\/"
+ "(" + rg_path + ")?"
+ "(" + rg_query + ")?"
+ "(" + rg_fragment + ")?"
+ ")?"
+ "$"
);
// export public function
return function (url) {
if (rgHttpUrl.test(url)) {
return true;
} else {
return false;
}
};
})();
RegExp in one line:
var rg = /^(http|https):\/\/(([a-zA-Z0-9$\-_.+!*'(),;:&=]|%[0-9a-fA-F]{2})+#)?(((25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])(\.(25[0-5]|2[0-4][0-9]|[0-1][0-9][0-9]|[1-9][0-9]|[0-9])){3})|localhost|([a-zA-Z0-9\-\u00C0-\u017F]+\.)+([a-zA-Z]{2,}))(:[0-9]+)?(\/(([a-zA-Z0-9$\-_.+!*'(),;:#&=]|%[0-9a-fA-F]{2})*(\/([a-zA-Z0-9$\-_.+!*'(),;:#&=]|%[0-9a-fA-F]{2})*)*)?(\?([a-zA-Z0-9$\-_.+!*'(),;:#&=\/?]|%[0-9a-fA-F]{2})*)?(\#([a-zA-Z0-9$\-_.+!*'(),;:#&=\/?]|%[0-9a-fA-F]{2})*)?)?$/;
In a similar situation I got away with this:
someUtils.validateURL = function(url) {
var parser = document.createElement('a');
try {
parser.href = url;
return !!parser.hostname;
} catch (e) {
return false;
}
};
i.e. why invent the wheel if browsers can do it for you? But, of course, this will only work in the browser.
there are various parts of parsed URL exactly how browser would interpret it:
parser.protocol; // => "http:"
parser.hostname; // => "example.com"
parser.port; // => "8080"
parser.pathname; // => "/path/"
parser.search; // => "?search=test"
parser.hash; // => "#hash"
parser.host; // => "example.com:3000"
Using these you can improve your validating function depending on the requirements. The only drawback is that it will accept relative URLs and use current page server's host and port. But you can use it for your advantage, by re-assembling the URL from parts and always passing it in full to your AJAX service.
What validateURL won't accept is invalid URL, e.g. http:\:8883 will return false, but :1234 is valid and is interpreted as http://pagehost.example.com/:1234 i.e. as a relative path.
UPDATE
This approach is no longer working with Chrome and other WebKit browsers. Even when URL is invalid, hostname is filled with some value, e.g. taken from base. It still helps to parse parts of URL, but will not allow to validate one.
Possible better no-own-parser approach is to use var parsedURL = new URL(url) and catch exceptions. See e.g. URL API. Supported by all major browsers and NodeJS, although still marked experimental.
best regex I found from http://angularjs.org/
var urlregex = /^(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?$/;
This is what worked for me:
function validateURL(value) {
return /^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i.test(value);
}
from there is is just a matter of calling the function to get a true or false back:
validateURL(urltovalidate);
I know it's quite an old question but since it does not have any accepted answer, I suggest you to use the URI.js framework: https://github.com/medialize/URI.js
You can use it to check for malformed URI using a try/catch block:
function isValidURL(url)
{
try {
(new URI(url));
return true;
}
catch (e) {
// Malformed URI
return false;
}
}
Of course it will consider something like "%#" as a well formed relative URI... So I suggest you read the URI.js API to perform more checks, for example if you want to make sure that the user entered a well formed absolute URL you may do like this:
function isValidURL(url)
{
try {
var uri = new URI(url);
// URI has a scheme and a host
return (!!uri.scheme() && !!uri.host());
}
catch (e) {
// Malformed URI
return false;
}
}
Import in an npm package like
https://www.npmjs.com/package/valid-url
and use it to validate your url.
You can use the URL API that is recently standard. Browser support is sketchy at best, see the link. new URL(str) is guaranteed to throw TypeError for invalid URLs.
As stated above, http://wwww is a valid URL.
The URL API can be used to validate the structure of a URL string.
An error is thrown when trying to serialise an invalid URL string into a URL object. This could be abstracted into a helper function (Typescript snippet below):
function isValidURL(URL: string) : boolean {
try {
new URL(string);
return true;
} catch (err) { return false; }
}
isValidURL('https://www.google.com'); // returns true
isValidURL('localhost:3000'); // returns true
isValidURL('not-a-valid-url'); // returns false
isValidURL('google.com'); // returns false (see footnote)
If you strictly want HTTP / web links to be valid, we can simply add a condition to the return statement:
...
const url = new URL(string);
return url.protocol === 'https:' || url.protocol === 'http:';
...
Granted, this approach comes with a few caveats:
No support for the URL API in Internet Explorer (could be fixed with a polyfill)
Without additional checks, URLs without either a protocol or port are seen as invalid (e.g. google.com is invalid but google.com:3000 is OK). This may be an unintended behaviour for some usecases.
If you're looking for a more reliable regex, check out RegexLib. Here's the page you'd probably be interested in:
http://regexlib.com/Search.aspx?k=url
As for the error messages showing while the person is still typing, change the event from keydown to blur and then it will only check once the person moves to the next element.
var RegExp = (/^HTTP|HTTP|http(s)?:\/\/(www\.)?[A-Za-z0-9]+([\-\.]{1}[A-Za-z0-9]+)*\.[A-Za-z]{2,40}(:[0-9]{1,40})?(\/.*)?$/);
My solution:
function isValidUrl(t)
{
return t.match(/^(http|https|ftp):\/\/(([A-Z0-9][A-Z0-9_-]*)(\.[A-Z0-9][A-Z0-9_-]*)+)(:(\d+))?\/?/i)
}
Demo : http://jsbin.com/uzimeb/1/edit
function checkURL(value) {
var urlregex = new RegExp("^(http|https|ftp)\://([a-zA-Z0-9\.\-]+(\:[a-zA-Z0-9\.&%\$\-]+)*#)*((25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9])\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[1-9]|0)\.(25[0-5]|2[0-4][0-9]|[0-1]{1}[0-9]{2}|[1-9]{1}[0-9]{1}|[0-9])|([a-zA-Z0-9\-]+\.)*[a-zA-Z0-9\-]+\.(com|edu|gov|int|mil|net|org|biz|arpa|info|name|pro|aero|coop|museum|[a-zA-Z]{2}))(\:[0-9]+)*(/($|[a-zA-Z0-9\.\,\?\'\\\+&%\$#\=~_\-]+))*$");
if (urlregex.test(value)) {
return (true);
}
return (false);
}
I have found a great resource for comparing different solutions:
https://mathiasbynens.be/demo/url-regex
According to that page, only solution from diegoperini passes all tests. Here is that regex:
_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?#)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS
I checked a lot of url validators in google and no one works for me. For example I'd like to see valid on links like 'aa.com'. I like silly check for dot sign in string.
function isValidUri(str) {
var dotIndex = str.indexOf('.');
return (dotIndex > 0 && dotIndex < str.length - 2);
}
It should not stay on beginning and end of string (for now we don't have top level domain names with one character).
Here's a regular expression which might fit the bill (it's very long):
/^(?:\u0066\u0069\u006C\u0065\u003A\u002F{2}(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)|[\u0041-\u005A\u0061-\u007A][\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002B\u002D\u002E]*\u003A(?:\u002F{2}(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*\u0040)?(?:\u005B(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){6}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){5}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){4}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A)?[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){3}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,3}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,4}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035]))|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,5}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}|(?:(?:[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4}\u003A){0,6}[\u0030-\u0039\u0041-\u0046\u0061-\u0066]{1,4})?\u003A{2})\u005D|(?:(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])\u002E){3}(?:[\u0030-\u0039]|[\u0031-\u0039][\u0030-\u0039]|\u0031[\u0030-\u0039]{2}|\u0032[\u0030-\u0034][\u0030-\u0039]|\u0032\u0035[\u0030-\u0035])|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?\u002E)+[\u0041-\u005A\u0061-\u007A\u0030-\u0039](?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D]+)?[\u0041-\u005A\u0061-\u007A\u0030-\u0039])?))(?:\u003A(?:\u0030-\u0035\u0030-\u0039{0,4}|\u0036\u0030-\u0034\u0030-\u0039{3}|\u0036\u0035\u0030-\u0034\u0030-\u0039{2}|\u0036\u0035\u0035\u0030-\u0032\u0030-\u0039|\u0036\u0035\u0035\u0033\u0030-\u0035))?(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*|\u002F(?:(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)?|(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])+(?:\u002F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)*)(?:\u003F(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?(?:\u0023(?:[\u0041-\u005A\u0061-\u007A\u0030-\u0039\u002D\u002E\u005F\u007E\u0021\u0024\u0026\u0027\u0028\u0029\u002A\u002B\u002C\u003B\u003D\u003A\u0040\u002F\u003F]|\u0025[\u0030-\u0039\u0041-\u0046\u0061-\u0066][\u0030-\u0039\u0041-\u0046\u0061-\u0066])*)?)$/
There are some caveats to its usage, namely it does not validate URIs which contain additional information after the user name (e.g. "username:password"). Also, only IPv6 addresses can be contained within the IP literal syntax and the "IPvFuture" syntax is currently ignored and will not validate against this regular expression. Port numbers are also constrained to be between 0 and 65,535. Also, only the file scheme can use triple slashes (e.g. "file:///etc/sysconfig") and can ignore both the query and fragment parts of a URI. Finally, it is geared towards regular URIs and not IRIs, hence the extensive focus on the ASCII character set.
This regular expression could be expanded upon, but it's already complex and long enough as it is. I also cannot guarantee it's going to be "100% accurate" or "bug free", but it should correctly validate URIs for all schemes.
You will need to do additional verification for any scheme-specific requirements or do URI normalization as this regular expression will validate a very broad range of URIs.
Try edit your isValidURL function as follows:
function isValidURL(url) {
var encodedURL = encodeURIComponent(url);
var isValid = false;
$.ajax({
url: "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%3D%22" + encodedURL + "%22&format=json",
type: "get",
async: false,
dataType: "json",
success: function(data) {
isValid = data.query.results != null;
},
error: function(){
isValid = false;
}
});
return isValid;
}
This should do the trick.