Do consecutive lookaheads match based on first matched character - javascript

I came across the answers below while trying to understand how consecutive lookaheads work. My understanding seems to be contradictory, and I was hoping someone could help clarify.
The answer here suggests that all the lookaheads specified must be present for the first matched character (Why consecutive lookaheads do not always work answer by Sam Whan)
If I apply that to the solution in this answer:
How to print a number with commas as thousands separators in JavaScript:
function numberWithCommas(x) {
return x.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}
it means that it's looking for a non-boundary character that is followed by a sequence of digits whose length is a multiple of 3 and, at the same time, followed by characters that are not digits.
e.g. 12345
I know that a comma should go after the 2, but this seems contradictory: the 2 has 3 digits following it, which satisfies the first lookahead, yet the second lookahead contradicts that because the 2 is not supposed to be followed by any digits.
I'm sure I'm misunderstanding something. Any help is appreciated. Thanks!

This regex:
/\B(?=(\d{3})+(?!\d))/g
has only one positive lookahead condition; the negative lookahead is nested inside that first lookahead.
Here are details:
\B: Match position where \b doesn't match (e.g. between word characters)
(?=: Start lookahead
(\d{3})+: Match one or more sets of 3 digits
(?!\d): Inner negative lookahead to assert that there is no digit after the matched sets of 3 digits
): End lookahead
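To see why the comma lands after the 2 in 12345, one option is to test the whole pattern at each position in turn. This is only an illustrative sketch: lookaheadHoldsAt is a hypothetical helper (it is not part of the original answer), and it assumes an engine that supports the sticky y flag so the match can be pinned to a single index.

```javascript
// Same function as in the question.
function numberWithCommas(x) {
  return x.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Hypothetical helper: does the pattern match at exactly this index?
// The sticky (y) flag forces the attempt to start at lastIndex.
function lookaheadHoldsAt(str, index) {
  const re = /\B(?=(\d{3})+(?!\d))/y;
  re.lastIndex = index;
  return re.test(str);
}

const sample = "12345";
for (let i = 0; i <= sample.length; i++) {
  // Only index 2 (between "12" and "345") reports true:
  // "345" is one group of 3 digits and no digit follows it.
  console.log(i, JSON.stringify(sample.slice(i)), lookaheadHoldsAt(sample, i));
}

console.log(numberWithCommas(12345));   // "12,345"
console.log(numberWithCommas(1234567)); // "1,234,567"
```

At index 1 the group (\d{3}) can consume "234", but the inner (?!\d) then sees the 5 and rejects, and there is no shorter way to satisfy (\d{3})+, so that position fails as a whole.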
However, note that it is much better to use the following code to format your number with thousands separators:
console.log( parseFloat('1234567.89').toLocaleString('en') )

Related

Regular expressions: prohibit the use of characters [duplicate]

I have a regex
/^([a-zA-Z0-9]+)$/
This allows only alphanumerics, but it also accepts input that consists only of numbers or only of letters. I want the field to accept only alphanumeric values, and the value must contain at least one letter and at least one number.
Why not first apply the whole test, and then add individual tests for characters and numbers? Anyway, if you want to do it all in one regexp, use positive lookahead:
/^(?=.*[0-9])(?=.*[a-zA-Z])([a-zA-Z0-9]+)$/
This RE will do:
/^(?:[0-9]+[a-z]|[a-z]+[0-9])[a-z0-9]*$/i
Explanation of RE:
Match either of the following:
At least one number, then one letter or
At least one letter, then one number plus
Any remaining numbers and letters
(?:...) creates an unreferenced group
/i is the ignore-case flag, so that a-z == a-zA-Z.
I can see that other responders have given you a complete solution. Problem with regexes is that they can be difficult to maintain/understand.
An easier solution would be to retain your existing regex, then create two new regexes to test for your "at least one alphabetic" and "at least one numeric".
So, test for this :-
/^([a-zA-Z0-9]+)$/
Then this :-
/\d/
Then this :-
/[A-Z]/i
If your string passes all three regexes, you have the answer you need.
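As a rough sketch of that three-step check, the tests can be wrapped in one function. The name passesAllThree is made up for illustration; the three regexes are the ones listed above.

```javascript
// Hypothetical wrapper around the three separate tests.
function passesAllThree(value) {
  const onlyAlphanumeric = /^([a-zA-Z0-9]+)$/; // existing regex from the question
  const hasDigit = /\d/;                       // at least one numeric
  const hasLetter = /[A-Z]/i;                  // at least one alphabetic
  return onlyAlphanumeric.test(value) && hasDigit.test(value) && hasLetter.test(value);
}

for (const s of ["abc123", "abc", "123", "abc-123"]) {
  console.log(JSON.stringify(s), passesAllThree(s));
}
```

Only "abc123" passes here; the others each fail one of the three tests, which also makes it easy to report which rule was broken.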
The accepted answer did not work for me because it does not allow special characters.
This worked perfectly for me:
^(?=.*[0-9])(?=.*[a-zA-Z])(?=\S+$).{6,20}$
at least one digit is required
at least one letter is required (lower or upper case)
everything else is optional (6 to 20 non-whitespace characters in total)
Thank you.
While the accepted answer is correct, I find this regex a lot easier to read:
REGEX = "([A-Za-z]+[0-9]|[0-9]+[A-Za-z])[A-Za-z0-9]*"
This solution accepts at least 1 number and at least 1 character:
[^\w\d]*(([0-9]+.*[A-Za-z]+.*)|[A-Za-z]+.*([0-9]+.*))
And an idea with a negative check.
/^(?!\d*$|[a-z]*$)[a-z\d]+$/i
^(?! at start look ahead if string does not
\d*$ contain only digits | or
[a-z]*$ contain only letters
[a-z\d]+$ matches one or more letters or digits until $ end.
Have a look at this regex101 demo
(the i flag turns on caseless matching: a-z matches a-zA-Z)
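For what it's worth, the negative check can be compared against the positive-lookahead regex from the earlier answer on a handful of inputs. This is just a quick sanity sketch; the sample strings are made up, and agreement on them does not prove the two patterns are equivalent in general.

```javascript
// Positive version (two lookaheads) and negative version (one lookahead).
const positiveCheck = /^(?=.*[0-9])(?=.*[a-zA-Z])([a-zA-Z0-9]+)$/;
const negativeCheck = /^(?!\d*$|[a-z]*$)[a-z\d]+$/i;

// Made-up samples: mixed, digits only, letters only, empty, and invalid characters.
const samples = ["abc123", "A1", "abc", "123", "", "ab_1", "1 a"];

for (const s of samples) {
  console.log(JSON.stringify(s), positiveCheck.test(s), negativeCheck.test(s));
}

// True when both regexes give the same verdict for every sample.
const allAgree = samples.every((s) => positiveCheck.test(s) === negativeCheck.test(s));
console.log("agree on all samples:", allAgree);
```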
Maybe a bit late, but this is my RE:
/^(\w*(\d+[a-zA-Z]|[a-zA-Z]+\d)\w*)+$/
Explanation:
\w* -> 0 or more word characters (letters, digits, or underscore), at the beginning
\d+[a-zA-Z]|[a-zA-Z]+\d -> one or more digits followed by a letter OR one or more letters followed by a digit
\w* -> 0 or more word characters, again
I hope it was understandable
What about simply:
/[0-9][a-zA-Z]|[a-zA-Z][0-9]/
Worked like a charm for me...
Edit following comments:
Well, that was some shortsightedness of my own late at night; apologies for the inconvenience...
The - incomplete - underlying idea was that only one "transition" from a digit to an alpha or from an alpha to a digit was needed somewhere to answer the question.
But the next regex should do the job for a string comprised only of alphanumeric characters:
/^[0-9a-zA-Z]*([0-9][a-zA-Z]|[a-zA-Z][0-9])[0-9a-zA-Z]*$/
which in JavaScript can be further simplified as:
/^[0-9a-z]*([0-9][a-z]|[a-z][0-9])[0-9a-z]*$/i
IMHO it's more straightforward to read and understand than some other answers (no backtracking and the like).
Hope this helps.
If you need the digit to be at the end of any word, this worked for me:
/\b([a-zA-Z]+[0-9]+)\b/g
\b word boundary
[a-zA-Z] any letter
[0-9] any digit
+ one or more of the preceding item; the g flag searches for all matches (show all results)

Negative lookahead RegEx limited to an exact number of characters

How can I limit a negative lookahead RegEx to an exact number of characters?
For example, this sentence should be denied...
This car is not that fast!
while this one should be allowed...
The car you are about to see may not be that fast, but it's very beautiful!
The RegEx should match any sentence that contains the word 'car', except the ones that include the word 'not' in the following 10 characters. This is the case of the first sentence, where there are only 4 characters in between the 'car' and 'not' words. So this sentence should be denied.
The second sentence, however, has more than 10 characters in between the 'car' and 'not' words, so it should pass the RegEx negative assertion.
Basically, what I am looking for is a negative lookahead RegEx that is limited to a certain number of characters.
Indeed, you can use negative look-ahead:
.*?car(?!.{0,10}not).*
If "car" and "not" in this rule are supposed to be separate words and not just substrings of any sequence, then add the appropriate \b:
.*?\bcar\b(?!.{0,10}\bnot\b).*
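Here is a small check of the word-boundary version against the two sentences from the question. It is only a sketch; the variable names are arbitrary.

```javascript
// "car" as a whole word, not followed within 0 to 10 characters by the word "not".
const carWithoutNearbyNot = /.*?\bcar\b(?!.{0,10}\bnot\b).*/;

const denied = "This car is not that fast!";
const allowed =
  "The car you are about to see may not be that fast, but it's very beautiful!";

console.log(carWithoutNearbyNot.test(denied));  // false: "not" starts 4 characters after "car"
console.log(carWithoutNearbyNot.test(allowed)); // true: "not" is more than 10 characters away
```

Note that the lookahead is evaluated at each occurrence of car, so a sentence with a second car that is not followed closely by not would still match.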
Negative lookahead assertion, https://regex101.com/r/mD9JeR/11:
.*?car(?![\w\s]{0,10}not).*
This checks whether car is followed by 0 to 10 word or whitespace characters ([\w\s]) and then not. If so, it won't match.
Positive lookahead Assertion, just as an FYI - https://regex101.com/r/mD9JeR/12
.*?car(?=[\w\s]{10,}not).*

Why is this regex: /^\D(?=\w{5})(?=\d{2})/ not matching "bana12"? [duplicate]

The objective is to match strings that are greater than 5 characters long, do not begin with numbers, and have two consecutive digits. I thought my regex was enough to do that, but it is not matching "bana12".
This regex does the job:
var pwRegex = /^\D(?=\w{5})(?=\w*\d{2})/;
Isn't this regex more restrictive than mine? Why do I have to specify that the two or more digits are preceded by zero or more characters?
It is less restrictive than yours.
After the \D, there are 2 lookaheads. For your regex, they are
(?=\w{5})(?=\d{2})
This means that the thing after the non-digit must satisfy both of them. That is,
there must be 5 word characters immediately after the non-digit, and
there must be 2 digits immediately after the non-digit.
There is ana12 immediately after the non-digit in the string. Its first two characters, an, are not 2 digits, so your regex does not match.
The working regex however has these two lookaheads:
(?=\w{5})(?=\w*\d{2})
It asserts that there must be these two things immediately after the \D:
5 word characters, and
a bunch of word characters, followed by two digits
ana12 fits both of those descriptions.
Try this Regex101 Demo. Look at step 6 in the regex debugger. That is when it tries to match the second lookahead.
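A quick way to see that both lookaheads are tested at the same position is to run the two patterns side by side. The input "b12ana" is a made-up example, chosen because its two digits sit directly after the leading non-digit.

```javascript
// The asker's regex: both lookaheads start right after the leading non-digit.
const original = /^\D(?=\w{5})(?=\d{2})/;
// The working regex: \w* lets the second lookahead skip ahead to the digits.
const working = /^\D(?=\w{5})(?=\w*\d{2})/;

console.log(original.test("bana12")); // false: "an" follows "b", not two digits
console.log(working.test("bana12"));  // true:  \w* consumes "ana", then "12" matches
console.log(original.test("b12ana")); // true:  "12" comes immediately after "b"
console.log(working.test("b12ana"));  // true
```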
You were on the right track to maybe use lookaheads, and also with the correct start of your pattern, but it is missing a few things. Consider this version:
^\D(?=.*\d{2})\w{5,}$
Here is an explanation of the pattern:
^ from the start of the string
\D match any non-digit character
(?=.*\d{2}) then look ahead and assert that two consecutive digits occur
\w{5,} finally match five or more word characters (six or more characters in total, since the string must be longer than 5 characters)
$ end of the string
The major piece missing from your current attempt is that it only matches one non-digit character at the beginning. You need to provide a pattern that can match the remaining characters as well.

Regular Expressions: Positive and Negative Lookahead Solution

I was doing the freeCodeCamp RegEx challenge, which says:
Use lookaheads in the pwRegex to match passwords that are greater than 5 characters long, do not begin with numbers, and have two consecutive digits
So far, the solution I found that passed all the tests was:
/^[a-z](?=\w{5,})(?=.*\d{2}\.)/i
However, when I tried
/^[a-z](?=\w{5,})(?=\D*\d{2,}\D*)/i
I failed the test trying to match astr1on11aut (but passed with astr11on1aut). Can someone help explain how (?=\D*\d{2,}\D*) failed this test?
So can someone help me explaining how (?=\D*\d{2,}\D*) failed this test?
\D matches any char except a digit, so from the start of the string you cannot get past a single digit to reach the 2 digits.
Explanation
astr1on11aut
    ^    Cannot pass the single 1 before reaching 11
astr11on1aut
    ^^   Can match 2 digits first
Note
the expression could be shortened to (?=\D*\d{2}), as it does not matter whether there are 2, 3 or more digits because 2 is the minimum, and the \D* after it is optional.
the first expression ^[a-z](?=\w{5,})(?=.*\d{2}\.) cannot match the example data because it expects to match a literal . after the 2 digits.
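The difference between \D* and .* in the second lookahead can be reproduced with the two test strings. This is only a sketch; the variable names are arbitrary, and the .* variant drops the trailing \. from the first expression.

```javascript
// \D* can only skip non-digits, so it gets stuck on the lone "1".
const withNonDigits = /^[a-z](?=\w{5,})(?=\D*\d{2,}\D*)/i;
// .* can skip anything (except line breaks), including single digits.
const withAnyChars = /^[a-z](?=\w{5,})(?=.*\d{2})/i;

console.log(withNonDigits.test("astr1on11aut")); // false: "str" then "1o" is not two digits
console.log(withNonDigits.test("astr11on1aut")); // true:  "str" then "11"
console.log(withAnyChars.test("astr1on11aut"));  // true
console.log(withAnyChars.test("astr11on1aut"));  // true
```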
It failed because of the \d{2,}.
Let's have a look at the regexp match process.
/^[a-z](?=\w{5,})
means: start with [a-z], and the length is at least 6; no problem.
(?=\D*\d{2,}\D*)
means the first letter should be followed by these parts:
[0 or more non-digits][2 or more digits][0 or more non-digits]
Now let's have a look at the test case:
astr1on11aut
// ^[a-z] matches "a"
// length matches
// [0 or more non-digits]: "str"
// \d{2,} failed: only 1 digit
The (?= positive lookahead means "immediately followed by".
The first regular expression is wrong; it should be
/^[a-z](?=\w{5,})(?=.*\d{2}.*)/i
When it comes to lookahead usage in regex, remember to anchor them all at the start of the pattern. See Lookarounds (Usually) Want to be Anchored. This will help you avoid a lot of issues when dealing with password regexps.
Now, let's see what requirements you have:
"are greater than 5 characters long" => (?=.{6}) (it requires any 6 chars other than line break chars immediately to the right of the current location)
"do not begin with numbers" => (?!\d) (no digit allowed immediately to the right)
"have two consecutive digits" => (?=.*\d{2}) (a chunk of two digits is required after any 0 or more chars other than line break chars, as many as possible, to the right of the current location).
So, what you may use is
^(?!\d)(?=.{6})(?=.*\d{2})
Note the most expensive part is placed at the end of the pattern so that it could fail quicker if the input is non-matching.
See the regex demo.
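A few made-up inputs show how each anchored lookahead contributes. This is a sketch only; the pattern is the one above, wrapped in JavaScript regex delimiters.

```javascript
// All three lookaheads are evaluated at the start of the string.
const pwRegex = /^(?!\d)(?=.{6})(?=.*\d{2})/;

const cases = {
  "bana12": true,        // 6 chars, starts with a letter, contains "12"
  "astr1on11aut": true,  // the lone "1" is skipped by .* before "11"
  "ab12": false,         // fewer than 6 characters
  "1banana22": false,    // begins with a digit
  "abcdef": false,       // no two consecutive digits
};

for (const [input, expected] of Object.entries(cases)) {
  console.log(JSON.stringify(input), pwRegex.test(input), "expected:", expected);
}
```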

Regex match valid Phone Number

I'm quite new to regex, and not sure what I'm doing wrong exactly.
I'm looking for a regex that matches the following number format:
Matching requirements:
Must start with either 0 or 3
Must be between 7 and 11 digits
Must not allow ascending digits. e.g. 0123456789, 01234567
Must not allow repeated digits. e.g. 011111111, 3333333333, 0000000000
This is what I came up with:
^(?=(^[0,3]{1}))(?!.*(\d)\1{3,})(?!^(?:0(?=1|$))?(?:1(?=2|$))?(?:2(?=3|$))?(?:3(?=4|$))?(?:4(?=5|$))?(?:5(?=6|$))?(?:6(?=7|$))?(?:7(?=8|$))?(?:8(?=9|$))?9?$).{7,11}$
The above regex fails condition No. 4. I'm not sure why, though.
Any help would be appreciated.
Thanks
A few notes about the pattern that you tried
You can omit the {1} and the comma in [0,3]
In the lookahead (?!.*(\d)\1{3,}), the (\d) is the second capturing group, because (?=(^[0,3]{1})) contains the first capturing group, so it should be \2 instead of \1
In the lookahead, you can omit the comma in {3,}
In the match itself you use .{7,11} where the dot would match any character except a newline. You could use \d instead to match only digits
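The effect of the wrong group number can be shown with a cut-down version of the pattern. This sketch leaves out the ascending-digits lookahead on purpose, so it is not the full solution; it only isolates the \1 versus \2 difference.

```javascript
// \1 is the leading 0 or 3 captured by the first lookahead,
// so this rejects "a digit followed by three copies of the FIRST character".
const wrongBackref = /^(?=(^[03]))(?!.*(\d)\1{3,})\d{7,11}$/;
// \2 is the digit captured by (\d), which is what was intended.
const fixedBackref = /^(?=(^[03]))(?!.*(\d)\2{3,})\d{7,11}$/;

console.log(wrongBackref.test("011111111"));  // true:  no "000" in the string, so it slips through
console.log(fixedBackref.test("011111111"));  // false: "1111" is caught
console.log(wrongBackref.test("0000000000")); // false: happens to work, \1 is "0" here
console.log(fixedBackref.test("0000000000")); // false
```

This would explain why only some of the repeated-digit examples fail with the original pattern: strings made entirely of the first digit are still rejected by accident.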
Your pattern might look like
^(?=(^[03]))(?!.*(\d)\2{3})(?!^(?:0(?=1|$))?(?:1(?=2|$))?(?:2(?=3|$))?(?:3(?=4|$))?(?:4(?=5|$))?(?:5(?=6|$))?(?:6(?=7|$))?(?:7(?=8|$))?(?:8(?=9|$))?9?$)\d{7,11}$
Regex demo
Or leave out the first lookahead and move it into the match, changing the quantifier to \d{6,10} and repeating capture group \1 instead of \2
^(?!.*(\d)\1{3})(?!(?:0(?=1|$))?(?:1(?=2|$))?(?:2(?=3|$))?(?:3(?=4|$))?(?:4(?=5|$))?(?:5(?=6|$))?(?:6(?=7|$))?(?:7(?=8|$))?(?:8(?=9|$))?9?$)[03]\d{6,10}$
Regex demo
Edit
Based on the comments, the string not having 4 ascending digits:
^(?!.*(\d)\1{3})[03](?!\d*(?:0123|1234|2345|3456|4567|5678|6789))\d{6,10}$
Regex demo
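As a rough check of that last pattern, it can be run against the examples from the question plus one made-up valid number ("0421578910"). This is a sketch, not an exhaustive test.

```javascript
// No digit repeated 4 times in a row, starts with 0 or 3,
// no run of 4 ascending digits after the first digit, 7 to 11 digits in total.
const phoneRegex =
  /^(?!.*(\d)\1{3})[03](?!\d*(?:0123|1234|2345|3456|4567|5678|6789))\d{6,10}$/;

const expectations = {
  "0123456789": false, // ascending digits
  "01234567": false,   // ascending digits
  "011111111": false,  // repeated digits
  "3333333333": false, // repeated digits
  "0000000000": false, // repeated digits
  "019856": false,     // only 6 digits
  "0421578910": true,  // made-up number that satisfies every rule
};

for (const [input, expected] of Object.entries(expectations)) {
  console.log(input, phoneRegex.test(input), "expected:", expected);
}
```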
A solution for the JavaScript regex flavor (using a named capture group) would be
/^[03](?!123456(7(8(9|$)|$)|$))(?!(?<d>.)\k<d>+$)[0-9]{6,10}$/
Explanations
^[03] starts at the beginning of the string, then reads either 0 or 3
(?!123456(7(8(9|$)|$)|$)) makes sure that, after this first char, there is no ascending sequence (if such a sequence can be read, the negative lookahead fails)
(?!(?<d>.)\k<d>+$) is another negative lookahead: it ensures that the first char read (flagged d) is not repeated again and again until the end of the string
[0-9]{6,10}$ finally reads 6 to 10 digits (the first one was already read)
A few tests:
"0123456789: No match"
"01234567: No match"
"01234568: No match"
"011111111: No match"
"33333333: No match"
"333333233 is valid"
"042157891023 is valid"
"019856: No match"
"0123451245 is valid"
