Regular expression failed in URL address test - javascript

I am trying to build a regular expression to check for a valid URL. So far every address I tested passed, but the following (valid) addresses fail:
url = "http://example.com/tr/vvf/index.php/docs/po/trf"
//url = "http://example-a.mydomain.com/test/ny" also not working
var pattern = new RegExp("(https|ftp|http)://[\w-]+(\.[\w-]+)+([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?");
pattern.test(url)
I think it is because of the index.php/doc... Any ideas how to fix it?

Just use a regex literal instead of the RegExp constructor:
var pattern = /(https|ftp|http):\/\/[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?/;
The RegExp constructor takes a string, which requires double escaping, so \w has to be written as \\w in it.
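A minimal sketch of the escaping issue described above, reusing only the host part of the question's pattern. The names broken, fixed and literal are illustrative and do not come from the original post.

```javascript
// In a string literal "\w" collapses to "w" and "\." to "." before
// the RegExp constructor ever sees the pattern.
const broken = new RegExp("(https|ftp|http)://[\w-]+(\.[\w-]+)+");

// Doubling the backslashes keeps them in the pattern.
const fixed = new RegExp("(https|ftp|http)://[\\w-]+(\\.[\\w-]+)+");

// A regex literal needs no doubling, only escaped forward slashes.
const literal = /(https|ftp|http):\/\/[\w-]+(\.[\w-]+)+/;

const url = "http://example.com/tr/vvf/index.php/docs/po/trf";

console.log(broken.test(url));  // false: [w-]+ only accepts "w" and "-"
console.log(fixed.test(url));   // true
console.log(literal.test(url)); // true

// Hosts starting with "www" still satisfy the broken pattern, which may be
// why other addresses appeared to validate correctly.
console.log(broken.test("http://www.example.com")); // true
```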

Related

Regexp match everything after a word

I am trying to extract the base64 string from a data-url. The string looks like this, so I am trying to extract everything after the word base64
test = ''
So I want to extract the following from the above string
,/9j/4AAQSkZJRgABAQAAAQAB
Here is my regex
const base64rgx = new RegExp('(?<=base64)(?s)(.*$)');
console.log(test.match(base64rgx))
But this fails with the error:
VM3616:1 Uncaught SyntaxError: Invalid regular expression: /(?<=base64)(?s)(.*$)/: Invalid group
var test = '';
var regex = /(?<=base64).+/;
var r = test.match(regex);
console.log(r);
Here's the regex: https://regex101.com/r/uzyu0a/1
It appears that lookbehinds are not supported in your environment, and JavaScript does not accept the inline (?s) modifier either. The good news is that you don't need a lookaround here; the following pattern should work:
test = ''
var regex = /base64(.*)/; // greedy capture: a trailing lazy (.*?) would always capture an empty string
var match = regex.exec(test);
console.log(match[1]);
I'm not sure precisely how much of the string after base64 you want to capture. If, for example, you don't want the comma, or the 9j portion, then we can easily modify the pattern to handle that.
You may not need a regular expression: just find the base64 marker with indexOf and take the rest with substring
var test = '';
var data = test.substring(test.indexOf('base64') + 6); // 6 is the length of the "base64" string
console.log(data);
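Both approaches above can be sketched side by side. The sample data URL below is an assumption built from the fragment quoted in the question (the original string was not preserved), and the function names are hypothetical.

```javascript
// Assumed sample input; the real data URL from the question is not available.
const sample = "data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB";

// Capture-group approach: no lookbehind needed.
// [\s\S]* matches any character, including newlines, without the s flag.
function afterBase64Regex(dataUrl) {
  const m = /base64([\s\S]*)/.exec(dataUrl);
  return m ? m[1] : null;
}

// Plain string approach: locate the marker and slice off everything after it.
function afterBase64Index(dataUrl) {
  const marker = "base64";
  const i = dataUrl.indexOf(marker);
  return i === -1 ? null : dataUrl.slice(i + marker.length);
}

console.log(afterBase64Regex(sample)); // ",/9j/4AAQSkZJRgABAQAAAQAB"
console.log(afterBase64Index(sample)); // ",/9j/4AAQSkZJRgABAQAAAQAB"
```

Both helpers return null when the marker is missing, which avoids the silent wrong offset that indexOf(...) + 6 produces when indexOf returns -1.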

Why is this javascript url validator failing?

I have the following code to validate if a person entered a "valid" url in a textbox:
function validateURL(textval) {
var urlregex = new RegExp(
"^(http:\/\/www.|https:\/\/www.|ftp:\/\/www.|www.){1}([0-9A-Za-z]+\.)");
return urlregex.test(textval);
}
A user is getting an error because this returns false for what seems like a valid URL:
http://a.website.com/issues/i#browse/TEST-111
Can someone confirm why this example wouldn't pass the "valid url" test?
The main trouble with the regex is that the www. part is obligatory in the pattern.
If you want to make it optional, use a ? quantifier with a group around it ((?:www\.)?):
^(?:(?:(?:ftp|https?):\/\/)?)(?:www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*
This will match http://a.website.com part. To match the whole string, you can use:
^(?:(?:(?:ftp|https?):\/\/)?)(www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*(?:\/[^\/]*)*$
See demo
var re = /^(?:(?:(?:ftp|https?):\/\/)?)(www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*(?:\/[^\/]*)*$/;
var str = 'http://a.website.com/issues/i#browse/TEST-111';
var m = re.exec(str); // declare m instead of leaking an implicit global
if (m !== null) {
document.getElementById("res").innerHTML = m[0];
}
<div id="res"/>
Your regex requires that the host name portion starts with www. (this is not a requirement for URLs in general). The URL you are testing does not include www..
There are many other reasons why the regex is broken (you don't test past the first character after www., your attempt to do so bans many characters that are allowed in URLs, etc) but that is why the URL you have isn't passing.
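As a rough illustration of the two answers above, the sketch below runs the question's validator and the relaxed pattern against the same inputs. The function names originalValidateURL and relaxedValidateURL are hypothetical.

```javascript
// The validator from the question, unchanged. Inside the string literal
// "\/" becomes "/" and "\." becomes ".", so the dots match any character.
function originalValidateURL(textval) {
  const urlregex = new RegExp(
    "^(http:\/\/www.|https:\/\/www.|ftp:\/\/www.|www.){1}([0-9A-Za-z]+\.)");
  return urlregex.test(textval);
}

// The relaxed pattern from the answer: scheme and www. are both optional.
function relaxedValidateURL(textval) {
  const re = /^(?:(?:(?:ftp|https?):\/\/)?)(www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*(?:\/[^\/]*)*$/;
  return re.test(textval);
}

const reported = "http://a.website.com/issues/i#browse/TEST-111";

console.log(originalValidateURL(reported));                 // false: no "www."
console.log(originalValidateURL("http://www.website.com")); // true
console.log(relaxedValidateURL(reported));                  // true
```

Note that the relaxed pattern is permissive: because the scheme, www. and further labels are all optional, a bare word such as hello also passes, so it may need tightening depending on what counts as "valid" for the form.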

Regex expression to match the First url after a space followed

I want to match the first URL that is followed by a space, using a regular expression, while the user types in the input box.
For example:
if I type www.google.com, it should be matched only once a space follows the URL,
i.e. www.google.com<SPACE>
Code
$(".site").keyup(function()
{
var site=$(this).val();
var exp = /^http(s?):\/\/(\w+:{0,1}\w*)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/;
var find = site.match(exp);
var url = find? find[0] : null;
if (url === null){
var exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?/g;
var find = site.match(exp);
url = find? 'http://'+find[0] : null;
}
});
Please help, Thanks in advance
You should be using a better regex to correctly match the query and fragment parts of your URL. Have a look at "What is the best regular expression to check if a string is a valid URL?" for a correctly structured IRI/URI regex test.
But here's a rudimentary version:
var regex = /[-\w]+(\.[a-z]{2,})+(\/?)([^\s]+)/g;
var text = 'test google.com/?q=foo basdasd www.url.com/test?q=asdasd#cheese something else';
console.log(text.match(regex));
Expected Result:
["google.com/?q=foo", "www.url.com/test?q=asdasd#cheese"]
If you really want to check for URLs, make sure you include scheme, port, username & password checks just to be safe.
In the context of what you're trying to achieve, you should really put in some delay so that you don't impact browser performance. Regex tests can be expensive when you use complex rules especially so when running the same rule every time a new character is entered. Just think about what you're trying to achieve and whether or not there's a better solution to get there.
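One common way to add the delay mentioned above is a debounce wrapper, which runs the expensive check only after typing has paused. This is a generic sketch, not code from the original answer; debounce and the 300 ms wait are assumptions chosen for illustration.

```javascript
// Returns a wrapper that postpones fn until waitMs milliseconds have
// passed without another call. Each new call restarts the timer.
function debounce(fn, waitMs) {
  let timer = null;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => fn.apply(this, args), waitMs);
  };
}

// Example wiring, assuming jQuery and the .site input from the question:
//   $(".site").keyup(debounce(function () {
//     var site = $(this).val();
//     // ...run the URL regex here...
//   }, 300));
```

The wrapper forwards this and the arguments, so the handler body from the question can stay as it is.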
With a lookahead:
var exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?(?= )/g;
I only added this "(?= )" to your regex.
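A small sketch of the lookahead idea, using a trimmed variant of the pattern above (the trailing optional group is dropped and the g flag removed so only the first URL is returned). The helper name firstUrlBeforeSpace is hypothetical.

```javascript
// Returns the first URL-like token that is immediately followed by a space,
// or null if no such token exists yet.
function firstUrlBeforeSpace(text) {
  const exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(?= )/;
  const found = text.match(exp);
  return found ? found[0] : null;
}

console.log(firstUrlBeforeSpace("www.google.com"));              // null: no space yet
console.log(firstUrlBeforeSpace("www.google.com "));             // "www.google.com"
console.log(firstUrlBeforeSpace("see www.google.com/maps now")); // "www.google.com/maps"
```

The lookahead (?= ) checks for the space without consuming it, so the returned match does not include the trailing space.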

How to use inline regex modifier in VB.NET

I'm using jquery validate for client side email validation.
Of course I also have server side validation and want to use the same regex as jquery does to validate the input on the server side.
I found this regex in the source of jquery validate:
// http://docs.jquery.com/Plugins/Validation/Methods/email
email: function( value, element ) {
// contributed by Scott Gonzalez: http://projects.scottsplayground.com/email_address_validation/
return this.optional(element) || /^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$/i.test(value);
},
Note the /i at the end of the regex, which makes the whole thing case insensitive.
I have a complex site with C# libraries and VB libraries.
In both libraries I need to implement this email validation.
In C# I'm using this code:
Regex RegexEmailAddress = new Regex(@"^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$", RegexOptions.Compiled | RegexOptions.Singleline | RegexOptions.IgnoreCase);
Note RegexOptions.IgnoreCase at the end.
However, In my VB code I need a string that hold the regex pattern.
Public Const MAIL_REGEX As String = "^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$"
So far I couldn't find any working adjustment to make this regex case insensitive.
I tried adding (?i) in front of the string, but it is logged as an invalid regex when used on my website.
Adding /i at the end of the pattern also gives an invalid regex.
Update:
I tried another method with inline regex modifier I found in this SO question.
Case sensitive: ^[0-9]\s(lbs|kg|kgs)$
Case insensitive: (?i:^[0-9]\s(lbs|kg|kgs)$)
But it's also not working.
Here's the javascript error I get:
Uncaught SyntaxError: Invalid regular expression: /(?i:^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$)/: Invalid group
Update 2
I'm using RegularExpressionValidator
RegularExpressionValidator rxvalEmail = new RegularExpressionValidator();
rxvalEmail.ID = "rxvalEmail";
rxvalEmail.ValidationExpression = SomeHelperInVB.MAIL_REGEX;
rxvalEmail.ControlToValidate = "txtEmail";
So how can I make my regex case insensitive using an inline regex modifier? Or is there any other solution?
I made a simple console application and by using (?i) it works just fine:
Module Module1
Public Const MAIL_REGEX As String = "^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$"
Public Const MAIL_REGEX_I As String = "(?i)^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))@((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))$"
Sub Main()
Dim r_sensitive As New System.Text.RegularExpressions.Regex(MAIL_REGEX)
Dim r_insensitive As New System.Text.RegularExpressions.Regex(MAIL_REGEX_I)
' Returns true
Console.WriteLine("ok@ok.com: " + r_sensitive.IsMatch("ok@ok.com").ToString)
' Returns false
Console.WriteLine("NOT_ok@ok.com: " + r_sensitive.IsMatch("NOT_ok@ok.com").ToString)
' Returns true
Console.WriteLine("ok@ok.com: " + r_insensitive.IsMatch("ok@ok.com").ToString)
' Returns true
Console.WriteLine("NOT_ok@ok.com: " + r_insensitive.IsMatch("NOT_ok@ok.com").ToString)
Console.Read()
End Sub
End Module
Update:
OK, so you are using RegularExpressionValidator to do validation on both client and server.
Since the regex syntax differs between VB and JavaScript, I'm not sure it's possible to do what you want. In VB you put (?i) at the beginning of the expression, whereas in JavaScript you add the i flag when constructing the expression (http://www.w3schools.com/jsref/jsref_obj_regexp.asp).
Maybe you need to use another expression, which doesn't require a "case-insensitive modifier". This one is from Expresso (http://www.ultrapico.com/expresso.htm):
([a-zA-Z0-9_\-\.]+)@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.)|(([a-zA-Z0-9\-]+\.)+))([a-zA-Z]{2,4}|[0-9]{1,3})
Or, another option would be to replace [a-z] with [a-zA-Z] in your current expression.
Neither of the two expressions works with Internationalized Domain Names, like "any.name@domännamn.se" (notice the "ä" in the domain name).
Some information that might help others:
I found out why the regex modifiers aren't working for RegularExpressionValidator when debugging the client side code used for validation.
The code below shows how client validation for regex is done using RegularExpressionValidator control:
function RegularExpressionValidatorEvaluateIsValid(val) {
var value = ValidatorGetValue(val.controltovalidate);
if (ValidatorTrim(value).length == 0)
return true;
var rx = new RegExp(val.validationexpression); // Error: Invalid regular expression:....
var matches = rx.exec(value);
return (matches != null && value == matches[0]);
}
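new RegExp("pattern/i") treats the whole string as the pattern, so it behaves like /pattern\/i/: the trailing /i is matched literally instead of being applied as a flag. The regex that was intended is /pattern/i, which would need the flag passed as a second argument, and the validator never does that. Inline modifiers such as (?i) are rejected by the JavaScript engine as an invalid group, as the error above shows.
=> Conclusion: modifiers can't work with RegularExpressionValidator. I'll probably use a CustomValidator control instead, with custom server-side and client-side code.
Another option would be to replace [a-z] with [a-zA-Z] in the regex.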
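The client-side behaviour described above can be checked with a short sketch. It uses a deliberately tiny pattern (^[a-z]+$) in place of the full email regex, and the variable names are illustrative only.

```javascript
// What the validator effectively does: the whole string is the pattern,
// so "/i" is part of the pattern text and no flag is set.
const slashInPattern = new RegExp("^[a-z]+$/i");
console.log(slashInPattern.flags);       // ""
console.log(slashInPattern.test("ABC")); // false

// Flags only take effect as the second constructor argument,
// which the generated validator code does not pass.
const withFlag = new RegExp("^[a-z]+$", "i");
console.log(withFlag.test("ABC"));       // true

// Widening the character classes needs no flag at all, so the same
// pattern string behaves the same way in .NET and in the browser.
const widened = new RegExp("^[a-zA-Z]+$");
console.log(widened.test("ABC"));        // true
```

This is why replacing [a-z] with [a-zA-Z] is the option that keeps a single shared pattern string for both the server and the client.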

Javascript/RegExp: Lookbehind Assertion is causing a "Invalid group" error

I'm doing a simple Lookbehind Assertion to get a segment of the URL (example below) but instead of getting the match I get the following error:
Uncaught SyntaxError: Invalid regular expression: /(?<=\#\!\/)([^\/]+)/: Invalid group
Here is the script I'm running:
var url = window.location.toString();
// url == "http://my.domain.com/index.php/#!/write-stuff/something-else"
// lookbehind to only match the segment after the hash-bang.
var regex = /(?<=\#\!\/)([^\/]+)/i;
console.log('test this url: ', url, 'we found this match: ', url.match( regex ) );
the result should be write-stuff.
Can anyone shed some light on why this regex group is causing this error? Looks like a valid RegEx to me.
I know of alternatives on how to get the segment I need, so this is really just about helping me understand what's going on here rather than getting an alternative solution.
Thanks for reading.
J.
I believe JavaScript does not support positive lookbehind. You will have to do something more like this:
<script>
var regex = /\#\!\/([^\/]+)/;
var url = "http://my.domain.com/index.php/#!/write-stuff/something-else";
var match = regex.exec(url);
alert(match[1]);
</script>
JavaScript engines without ES2018 lookbehind support reject the (?<=) syntax, which is what's causing the error. However, you can mimic it with various techniques: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
You could also use String.prototype.match() instead of RegExp.prototype.exec() when the global (/g) and sticky (/y) flags are not set.
var regex = /\#\!\/([^\/]+)/;
var url = "http://my.domain.com/index.php/#!/write-stuff/something-else";
var match = url.match(regex); // ["#!/write-stuff", "write-stuff", index: 31, etc.,]
console.log(match[1]); // "write-stuff"
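A sketch that combines the two ideas above: the capture-group version works everywhere, and a lookbehind version is tried only where the engine accepts it. The function names are hypothetical, and building the lookbehind through the RegExp constructor is an assumption made so that an unsupported engine raises a catchable error instead of failing at parse time.

```javascript
// Capture-group version: works in engines with or without lookbehind.
function segmentAfterHashBang(url) {
  const m = /#!\/([^\/]+)/.exec(url);
  return m ? m[1] : null;
}

// Lookbehind version with a fallback for engines that reject (?<=...).
function segmentWithLookbehind(url) {
  let re;
  try {
    re = new RegExp("(?<=#!/)[^/]+");
  } catch (e) {
    // Lookbehind unsupported here: fall back to the capture group.
    return segmentAfterHashBang(url);
  }
  const m = re.exec(url);
  return m ? m[0] : null;
}

const url = "http://my.domain.com/index.php/#!/write-stuff/something-else";
console.log(segmentAfterHashBang(url));  // "write-stuff"
console.log(segmentWithLookbehind(url)); // "write-stuff"
```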
