Why quantifier doesn't work [duplicate] - javascript

This question already has an answer here:
Restricting character length in a regular expression
(1 answer)
Closed 6 years ago.
I have the following regex:
^(?:[\w]+?\/)?[\w]+?#[\w]+?\.[\w]+?$
Now I need to limit the overall string length to 25 characters.
I tried the following:
^((?:[\w]+?\/)?[\w]+?#[\w]+?\.[\w]+?){0,25}$
But it still matches strings longer than 25 characters. Why?

A quantifier is applied to the token preceding it. In your case, that's an entire group, and that group can match much more than a single character.
Do the length check separately, using a positive lookahead assertion:
^(?=.{0,25}$)(?:\w+\/)?\w+#\w+\.\w+$
As you can see, your regex can also be simplified quite a bit (no need for lazy quantifiers and character classes).
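As a quick check, here is a minimal sketch of the lookahead version in JavaScript. The sample inputs and variable names are made up for illustration.

```javascript
// Lookahead-based length limit: (?=.{0,25}$) caps the whole string at
// 25 characters, and the rest of the pattern validates the format.
const limited = /^(?=.{0,25}$)(?:\w+\/)?\w+#\w+\.\w+$/;

// Hypothetical sample inputs; both have a valid format.
const shortInput = "dir/name#file.ext";
const longInput = "directory/a_very_long_name#file.ext";

console.log(shortInput.length <= 25, limited.test(shortInput)); // true true
console.log(longInput.length <= 25, limited.test(longInput));   // false false
```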

Why not simply check the length separately?
if (inputStr.length <= 25 && /^(?:[\w]+?\/)?[\w]+?#[\w]+?\.[\w]+?$/.test(inputStr)) {
  // your logic
}

If we call your original regex A
^(?:[\w]+?\/)?[\w]+?#[\w]+?\.[\w]+?$
Then your next regex can be expressed more simply as
^(A){0,25}$
So you're not limiting the string to 25 characters; you're matching A between 0 and 25 times.
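To see the repetition in action, here is a small sketch. The input string is a made-up example, and A holds the original pattern without its anchors.

```javascript
// The original pattern without anchors, called A in the answer above.
const A = String.raw`(?:[\w]+?\/)?[\w]+?#[\w]+?\.[\w]+?`;

// Equivalent of ^(A){0,25}$ : A repeated 0 to 25 times.
const repeated = new RegExp(`^(${A}){0,25}$`);

// Hypothetical input: one valid string repeated three times.
const input = "name#file.ext".repeat(3);

console.log(input.length);         // 39
console.log(repeated.test(input)); // true, although the string is longer than 25
console.log(repeated.test(""));    // true, because zero repetitions are allowed
```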

Related

Why is this regex: /^\D(?=\w{5})(?=\d{2})/ not matching "bana12"? [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 2 years ago.
The objective is to match strings that are more than 5 characters long, do not begin with a digit, and contain two consecutive digits. I thought my regex was enough to do that, but it does not match "bana12".
This regex does the job:
var pwRegex = /^\D(?=\w{5})(?=\w*\d{2})/;
Isn't this regex more restrictive than mine? Why do I have to specify that the two or more digits are preceded by zero or more characters?
It is less restrictive than yours.
After the \D, there are 2 lookaheads. For your regex, they are
(?=\w{5})(?=\d{2})
This means that the thing after the non-digit must satisfy both of them. That is,
there must be 5 word characters immediately after the non-digit, and
there must be 2 digits immediately after the non-digit.
In "bana12", the text immediately after the non-digit b is ana12. Its first two characters, an, are not digits, so your regex does not match.
The working regex however has these two lookaheads:
(?=\w{5})(?=\w*\d{2})
It asserts that there must be these two things immediately after the \D:
5 word characters, and
a bunch of word characters, followed by two digits
ana12 fits both of those descriptions.
Try this Regex101 Demo. Look at step 6 in the regex debugger. That is when it tries to match the second lookahead.
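The same comparison can be run directly in JavaScript. This is only a sketch; the variable names and the extra input "b12345" are added for illustration.

```javascript
// Both lookaheads start at the same position, right after the \D.
const original = /^\D(?=\w{5})(?=\d{2})/;   // digits must come immediately
const working = /^\D(?=\w{5})(?=\w*\d{2})/; // digits may come later

console.log(original.test("bana12")); // false: "an" is not two digits
console.log(working.test("bana12"));  // true: \w* skips "ana", then "12"
console.log(original.test("b12345")); // true: "12" directly follows "b"
```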
You were on the right track in using lookaheads and in how your pattern starts, but it is missing a few things. Consider this version:
^\D(?=.*\d{2})\w{4,}$
Here is an explanation of the pattern:
^ from the start of the string
\D match any non digit character
(?=.*\d{2}) then lookahead and assert that two consecutive digits occur
\w{4,} finally match four or more word characters (total of 5 or more characters)
$ end of the string
The major piece missing from your current attempt is that it only consumes a single non-digit character at the beginning. You need a pattern that can match 5 or more characters.
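A few sample inputs show how this pattern behaves. The inputs are hypothetical, chosen to exercise each part of the pattern.

```javascript
// \D, then a lookahead for two consecutive digits, then 4+ word characters.
const pattern = /^\D(?=.*\d{2})\w{4,}$/;

console.log(pattern.test("bana12"));  // true
console.log(pattern.test("bana1"));   // false: no two consecutive digits
console.log(pattern.test("1bana12")); // false: starts with a digit
console.log(pattern.test("ba12"));    // false: only 3 characters after the first
```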

Regex: not providing length of certain string places [duplicate]

This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 1 year ago.
What is the difference between:
(.+?)
and
(.*?)
when I use it in my php preg_match regex?
They are called quantifiers.
* 0 or more of the preceding expression
+ 1 or more of the preceding expression
By default a quantifier is greedy, which means it matches as many characters as possible.
A ? after a quantifier makes it "ungreedy" (lazy), which means it matches as few characters as possible.
Example greedy/ungreedy
For example on the string "abab"
a.*b will match "abab" (preg_match_all will return one match, the "abab")
while a.*?b will match only the starting "ab" (preg_match_all will return two matches, "ab")
You can test your regexes online e.g. on Regexr, see the greedy example here
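The question is about PHP, but the greedy/lazy behaviour is the same in JavaScript. A minimal sketch using String.prototype.matchAll in place of preg_match_all:

```javascript
const text = "abab";

// Greedy: .* grabs as much as it can, so there is a single match.
const greedy = [...text.matchAll(/a.*b/g)].map((m) => m[0]);

// Lazy: .*? stops at the first possible b, so there are two matches.
const lazy = [...text.matchAll(/a.*?b/g)].map((m) => m[0]);

console.log(greedy); // [ 'abab' ]
console.log(lazy);   // [ 'ab', 'ab' ]
```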
The first (+) is one or more characters. The second (*) is zero or more characters. Both are non-greedy (?) and match anything (.).
In regex, {i,f} means "between i and f repetitions of the preceding token". Let's take a look at the following examples:
{3,7} means between 3 and 7 repetitions
{0,10} means up to 10 repetitions (some engines, such as Python's re module, also accept {,10}; JavaScript and PCRE, which PHP's preg functions use, have traditionally treated {,10} as literal text)
{3,} means at least 3 repetitions with no upper limit
{0,} means any number of repetitions, including none
{5} means exactly 5
Regex also has shorthands for the most common cases:
+ is shorthand for {1,}
* is shorthand for {0,}
? is shorthand for {0,1}
This means + requires at least 1 match, * accepts any number of matches including none, and ? accepts zero or one match.
Credit: Codecademy.com
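Here is a short sketch of the brace quantifiers in JavaScript, with made-up variable names. Note that JavaScript (without the u flag) does not accept an omitted lower bound: it reads {,2} as literal characters, so the lower bound has to be written out as {0,2}.

```javascript
const threeToFive = /^a{3,5}$/;
const atLeastThree = /^a{3,}$/;
const exactlyFive = /^a{5}$/;

console.log(threeToFive.test("aa"), threeToFive.test("aaaa"), threeToFive.test("aaaaaa")); // false true false
console.log(atLeastThree.test("a".repeat(50)));                   // true
console.log(exactlyFive.test("aaaa"), exactlyFive.test("aaaaa")); // false true

// {,2} is not a quantifier here; it matches the literal text "{,2}".
const noLowerBound = /^a{,2}$/;
console.log(noLowerBound.test("aa"), noLowerBound.test("a{,2}")); // false true
```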
+ matches at least one character
* matches any number (including 0) of characters
The ? indicates a lazy expression, so it will match as few characters as possible.
A + matches one or more instances of the preceding pattern. A * matches zero or more instances of the preceding pattern.
So basically, if you use a + there must be at least one instance of the pattern, if you use * it will still match if there are no instances of it.
Consider the following string to match:
ab
The pattern (ab.*) matches, and the capture group contains ab.
The pattern (ab.+) does not match, because there is no character after ab for .+ to consume.
But if you change the string to the following, (ab.+) matches and captures aba:
aba
+ requires at least one occurrence; * allows zero as well.
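The same two patterns, checked in JavaScript as a minimal sketch (the variable names are illustrative):

```javascript
const star = /(ab.*)/;
const plus = /(ab.+)/;

console.log(star.exec("ab")[1]);  // "ab": .* is allowed to match nothing
console.log(plus.exec("ab"));     // null: .+ needs at least one more character
console.log(plus.exec("aba")[1]); // "aba"
```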
A star is very similar to a plus, the only difference is that while the plus matches 1 or more of the preceding character/group, the star matches 0 or more.
I think the previous answers fail to highlight a simple example.
For example, suppose we have an array:
numbers = [5, 15]
The regex ^[0-9]+ matches both 5 and 15, because each contains at least one digit.
^[0-9]* also matches both, and in addition it matches an empty string, because * accepts zero occurrences of the preceding expression while + requires at least one. To require at least two digits, so that only 15 matches, use ^[0-9]{2,}.
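A short check of both patterns against that array, with an empty string and a non-numeric string added here for illustration:

```javascript
const inputs = ["5", "15", "", "abc"];

const plusMatches = inputs.filter((s) => /^[0-9]+/.test(s));
const starMatches = inputs.filter((s) => /^[0-9]*/.test(s));
const twoOrMore = inputs.filter((s) => /^[0-9]{2,}/.test(s));

console.log(plusMatches); // [ '5', '15' ]
console.log(starMatches); // [ '5', '15', '', 'abc' ] (zero digits is enough)
console.log(twoOrMore);   // [ '15' ]
```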

Regex for Should consist of 8 to 20 numeric characters including spaces and one hyphen, and hyphen cannot be the first or the last character

I was trying to write a regex for the rule above. I tried it this way, but it's not working:
"(?=^([0-9]|-){8,20}$)^[0-9]+[/-/][0-9]+$"
You could make the problem much simpler by first creating a regex that tests that a string consists of 8 to 20 characters (numeric, space or hyphen), where the hyphen cannot be the first or the last character:
/^[\d\s][\d\s\-]{6,18}[\d\s]$/
and then test that the string contains a single hyphen:
/^[^\-]*\-[^\-]*$/
and then that it contains a space:
/\s/
I appreciate this is a slightly different answer from what you were asking for, but I thought it might help.
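The three checks can be combined in a small helper. This is only a sketch: the function name isValidNumberString and the sample inputs are hypothetical, and it follows this answer's reading of the rule, in which a space is required.

```javascript
// Hypothetical helper combining the three regexes from the answer above.
function isValidNumberString(input) {
  const shape = /^[\d\s][\d\s\-]{6,18}[\d\s]$/; // 8 to 20 allowed characters, no hyphen at either end
  const oneHyphen = /^[^\-]*\-[^\-]*$/;         // exactly one hyphen
  const hasSpace = /\s/;                        // at least one whitespace character
  return shape.test(input) && oneHyphen.test(input) && hasSpace.test(input);
}

console.log(isValidNumberString("111-111 11 111")); // true
console.log(isValidNumberString("-1234567 8"));     // false: starts with a hyphen
console.log(isValidNumberString("12-34-56 78"));    // false: two hyphens
console.log(isValidNumberString("1234-5678"));      // false: no space
console.log(isValidNumberString("1-2 3"));          // false: too short
```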
I don't think it's possible to check the length AND the format of the string with only one regex. However, you could achieve this with 2 steps, for example:
var formatRegex = /^\d+(?:\d|\s)*-(?:\d|\s)+$/;
var myText = '111-111 11 111';
if (formatRegex.test(myText)) {
  if (myText.length >= 8 && myText.length <= 20) {
    console.log('ok');
  }
}
^[0-9](?:([0-9]{5,17}-)|(?=[0-9]*[-][0-9]*$)[0-9-]{6,18})[0-9]$
This regex works.
Explanation:
It checks that the string starts with [0-9] and ends with [0-9].
For the rest of the string, it uses the pattern (?:([0-9]{5,17}-)|(?=[0-9]*[-][0-9]*$)[0-9-]{6,18})
where ([0-9]{5,17}-) checks for 5 to 17 digits followed by one -, and (?=[0-9]*[-][0-9]*$)[0-9-]{6,18} checks for 6 to 18 digits or hyphens, with the lookahead requiring exactly one - in the remaining text.
demo:
https://regex101.com/r/eO3jK8/1

JavaScript: \\d{4} RegExp allows more than 4 digits [duplicate]

This question already has answers here:
Match exact string
(3 answers)
Closed 4 years ago.
I have the following validation for a year value from a text input:
if (!year.match(new RegExp('\\d{4}'))){
...
}
match() returns null if the input has 0 to 3 digits. That's OK.
With 4 digits it returns a value. That's OK.
With more than 4 digits it returns a value again, which is NOT OK.
The documentation says {n} means an exact number, but it works like:
exact+
With this ugly validation it works fine:
if (!year.match(new RegExp('\\d{4}')) || year.length>4){
...
}
I wish to use the RegExp object only.
Yes, it allows more than 4 digits, because a partial match is enough. Use ^ and $ to mark the beginning and the end of the string.
if (!year.match(new RegExp('^\\d{4}$'))){
...
}
If you include ^ in your regex it matches the beginning of the string, while $ matches the end, so all up:
^\d{4}$
Will match only against beginning-of-string plus four digits plus end-of-string.
Note that regex literal syntax is generally a bit simpler than using new RegExp():
/^\d{4}$/
// is the equivalent of
new RegExp('^\\d{4}$')
Note that in the literal syntax you don't have to escape backslashes like with the string you pass to the new RegExp(). The forward slashes are not part of the expression itself, you can think of them like quotation marks for regexes.
Also, if you just want to check if a string matches a pattern (yes or no) without extracting what actually matched you should use the .test() method as follows:
if (!/^\d{4}$/.test(year)) {
...
}
It's matching the first four digits, and the fact that there are remaining digits is neither here nor there. You need to change your regex so it stops after these four digits, say, by using the string anchors:
^\d{4}$
Try instead:
'^\\d{4}$'
What you had will match anything with 4 digits anywhere, such as asd1234asd or 123456789
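A quick side-by-side of the unanchored and anchored patterns, as a minimal sketch (the variable names are illustrative):

```javascript
const unanchored = new RegExp('\\d{4}'); // four digits anywhere in the string
const anchored = /^\d{4}$/;              // the whole string must be four digits

console.log(unanchored.test("123456789"));  // true: partial match
console.log(unanchored.test("asd1234asd")); // true: partial match
console.log(anchored.test("123456789"));    // false
console.log(anchored.test("2024"));         // true
console.log(anchored.test("123"));          // false
```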
