Javascript RegEx assertion - javascript

I have this example:
/(?=\d)(?=[a-z])/.test("3a") which returns false
but this
/(?=\d)(?=.*[a-z])/.test("3a") works.
Can you explain this?

Let me break down what you are doing:
Test string: "3a"
Example 1: /(?=\d)(?=[a-z])/
(?=\d) is a positive lookahead that the next character is a digit
(?=[a-z]) is a positive lookahead that the next character is in range a-z
This is impossible and will always return false as it is asserting that the next character is both a-z and a digit which it cannot be.
Example 2: /(?=\d)(?=.*[a-z])/
(?=\d) is a positive lookahead that the next character is a digit
(?=.*[a-z]) is a positive lookahead asserting that, somewhere at or after the position where the match starts, there is a character in the range a-z
This succeeds on 3a because, starting the match at 3, the next character is a digit and 3a fulfills the .*[a-z] assertion.
It may or may not be important to point out that, because these are lookaheads, you are not actually consuming any characters: the match itself is an empty string. I don't know what it is you are really trying to do.
If you want to test that there is a-z after a digit you can put it into one assertion:
/(?=\d[a-z])/
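The behaviour described above can be checked directly. This is only a minimal sketch; the variable names are made up for illustration:

```javascript
// Both lookaheads are evaluated at the same position, so they must agree
// on what the very next character is.
const samePosition = /(?=\d)(?=[a-z])/;     // next char is a digit AND a letter: impossible
const letterLater = /(?=\d)(?=.*[a-z])/;    // next char is a digit, a letter follows somewhere
const digitThenLetter = /(?=\d[a-z])/;      // a digit immediately followed by a letter

console.log(samePosition.test("3a"));       // false
console.log(letterLater.test("3a"));        // true
console.log(digitThenLetter.test("3a"));    // true

// Lookaheads consume nothing, so the overall match is an empty string at index 0.
const emptyMatch = letterLater.exec("3a");
console.log(JSON.stringify(emptyMatch[0]), emptyMatch.index); // "" 0
```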

Your first pattern
/(?=\d)(?=[a-z])/.test("3a")
is asserting that both a digit and letter occur in the same place. Obviously, this will never be true. On the other hand, your second pattern:
/(?=\d)(?=.*[a-z])/.test("3a")
asserts that a digit occurs at the current position, and it also asserts that a letter occurs somewhere at or after that position. This matches for an input of 3a.

Related

Limit 10 characters is numbers and only 1 dot

I'm having a regex problem when validating input.
The requirement: limit the input to 10 characters (digits) including the dot, and allow only 1 dot.
My current code allows up to 10 characters both before and after the dot:
^[0-9]{1,10}\.?[0-9]{0,10}$
Thanks for the support.
You could assert 10 chars in the string being either . or a digit.
Then you can match optional digits, and optionally match a dot and again optional digits:
^(?=[.\d]{10}$)\d*(?:\.\d*)?$
The pattern matches:
^ Start of string
(?=[.\d]{10}$) Positive lookahead, assert 10 chars . or digit till the end of string
\d* Match optional digits
(?:\.\d*)? Optionally match a . and optional digits
$ End of string
See a regex demo.
If the pattern should not end on a dot:
^(?=[.\d]{10}$)\d*(?:\.\d+)?$
Regex demo
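A quick way to try the two variants above is to run them over a few sample strings. This is just a sketch with made-up inputs. Note that {10} in the lookahead demands exactly 10 characters; if "limit" is meant as "at most 10", {1,10} would presumably be the adjustment to make.

```javascript
const allowTrailingDot = /^(?=[.\d]{10}$)\d*(?:\.\d*)?$/;
const noTrailingDot = /^(?=[.\d]{10}$)\d*(?:\.\d+)?$/;

// Each line prints: input, result of the first pattern, result of the second pattern.
for (const s of ["1234567890", "12345.6789", "123456789.", "1234.56.89", "123.45"]) {
  console.log(s, allowTrailingDot.test(s), noTrailingDot.test(s));
}
// 1234567890 true true
// 12345.6789 true true
// 123456789. true false
// 1234.56.89 false false
// 123.45 false false
```

The last input fails both patterns only because it is 6 characters long, not because of its shape.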
The decimal point throws a wrench into most single pattern approaches. I would probably use an alternation here:
^(?:\d{1,10}|(?=\d*\.)(?!\d*\.\d*\.)[0-9.]{2,11})$
This pattern says to match:
^ from the start of the number
(?:
\d{1,10} a pure 1 to 10 digit integer
| OR
(?=\d*\.) assert that one dot is present
(?!\d*\.\d*\.) assert that ONLY one dot is present
[0-9.]{2,11} match a 1 to 10 digit float
)
$ end of the number
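The alternation can be exercised with a handful of inputs. A minimal sketch follows; the sample values and the name numberPattern are assumptions chosen for illustration:

```javascript
const numberPattern = /^(?:\d{1,10}|(?=\d*\.)(?!\d*\.\d*\.)[0-9.]{2,11})$/;

// [input, expected result]
const cases = [
  ["1234567890", true],   // 10 digits, no dot: first alternative
  ["12345678901", false], // 11 digits and no dot: neither alternative applies
  ["12345.67890", true],  // 10 digits plus one dot: second alternative
  ["1.2.3", false],       // the second dot trips the negative lookahead
  [".", false],           // a lone dot is shorter than {2,11}
];

for (const [input, expected] of cases) {
  const actual = numberPattern.test(input);
  console.log(input, actual, actual === expected ? "as expected" : "UNEXPECTED");
}
```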
You can use a lookahead to achieve your goals.
First, looking at your regex, you've used [0-9] to represent all digit characters. We can shorten this to \d, which means the same thing.
Then, we can focus on the requirement that there be only one dot. We can test for this with the following pattern:
^\d*\.?\d*$
\d* means any number of digit characters
\.? matches one literal dot, optionally
\d* matches any number of digit characters after the dot
$ anchors this to the end of the string, so the match can't just end before the second dot, it actually has to fail if there's a second dot
Now, we don't actually want to consume all the characters involved in this match, because then we wouldn't be able to ensure that there are <=10 characters. Here's where the lookahead comes in: We can use the lookahead to ensure that our pattern above matches, but not actually perform the match. This way we verify that there is only one dot, but we haven't actually consumed any of the input characters yet. A lookahead would look like this:
^(?=\d*\.?\d*$)
Next, we can ensure that there aren't more than 10 characters total. Since we already made sure there are only dots and digits with the above pattern, we can just match up to 10 of any characters for simplicity, like so:
^.{1,10}$
Putting these two patterns together, we get this:
^(?=\d*\.?\d*$).{1,10}$
This will only match number inputs which have 10 or fewer characters and have no more than one dot.
If you would like to ensure that, when there is a dot, there is also a digit accompanying it, we can achieve this by adding another lookahead. The only input that passes the checks so far without containing a digit is a lone dot (.), so we can just explicitly rule this case out with a negative lookahead like so:
(?!\.$)
Adding this back in to our main expression, we get:
^(?=\d*\.?\d*$)(?!\.$).{1,10}$
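The three stages built up above can be compared side by side. This is a sketch only; the constant names are invented for the example:

```javascript
const oneDotOnly = /^\d*\.?\d*$/;                   // stage 1: digits with at most one dot
const withLimit = /^(?=\d*\.?\d*$).{1,10}$/;        // stage 2: same check, plus 1 to 10 characters
const noLoneDot = /^(?=\d*\.?\d*$)(?!\.$).{1,10}$/; // stage 3: also reject a bare "."

console.log(oneDotOnly.test("1.2.3"));      // false (second dot)
console.log(withLimit.test("123.45"));      // true
console.log(withLimit.test("12345678901")); // false (11 characters)
console.log(withLimit.test("."));           // true  (still allowed at stage 2)
console.log(noLoneDot.test("."));           // false
console.log(noLoneDot.test("5."));          // true  (a dot with a digit next to it)
```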

Regular Expressions: Positive and Negative Lookahead Solution

I was doing a freeCodeCamp RegEx challenge, which asks:
Use lookaheads in the pwRegex to match passwords that are greater than 5 characters long, do not begin with numbers, and have two consecutive digits
So far, the solution I found that passed all the tests was:
/^[a-z](?=\w{5,})(?=.*\d{2}\.)/i
However, when I tried
/^[a-z](?=\w{5,})(?=\D*\d{2,}\D*)/i
I failed the test trying to match astr1on11aut (but passed with astr11on1aut). So can someone help me understand how (?=\D*\d{2,}\D*) failed this test?
"So can someone help me explaining how (?=\D*\d{2,}\D*) failed this test?"
\D matches any char except a digit, so from the start of the string \D* cannot step over a single digit to get to the 2 consecutive digits.
Explanation
astr1on11aut                          astr11on1aut
    ^                                     ^^
Can not pass 1 before reaching 11     Can match 2 digits first
Note
the expression could be shortened to (?=\D*\d{2}) as it does not matter if there are 2, 3 or more digits because 2 is the minimum, and the \D* after it is optional.
the first expression ^[a-z](?=\w{5,})(?=.*\d{2}\.) can not match the example data because it expects to match a . literally after 2 digits.
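The difference between \D* and .* in front of \d{2} can be reproduced with the two test strings from the question. A small sketch; the constant names are only illustrative:

```javascript
// \D* can only skip non-digits, so the lone "1" in "astr1on11aut" blocks it.
const withNonDigits = /^[a-z](?=\w{5,})(?=\D*\d{2})/i;
// .* can skip any character, digits included, and backtrack until \d{2} fits.
const withAnyChars = /^[a-z](?=\w{5,})(?=.*\d{2})/i;

console.log(withNonDigits.test("astr1on11aut")); // false
console.log(withNonDigits.test("astr11on1aut")); // true
console.log(withAnyChars.test("astr1on11aut"));  // true
console.log(withAnyChars.test("astr11on1aut"));  // true
```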
Failed for the \d{2,}
Then let's have a look at the regexp match process
/^[a-z](?=\w{5,})
means start with [a-z], the length is at least 6, no problem.
(?=\D*\d{2,}\D*)
means the first letter should be followed by these parts:
[0 or more no-digit][2 or more digits][0 or more no-digits]
and let's have a look at the test case
astr1on11aut
// ^[a-z] matches "a"
// length matches
// [0 or more no-digit]: "str"
// \d{2,} failed: only 1 digit
The ?= positive lookahead means "immediately followed by".
The first regular expression is wrong, it should be
/^[a-z](?=\w{5,})(?=.*\d{2}.*)/i
When it comes to lookahead usage in regex, remember to anchor them all at the start of the pattern. See Lookarounds (Usually) Want to be Anchored. This will help you avoid a lot of issues when dealing with password regexps.
Now, let's see what requirements you have:
"are greater than 5 characters long" => (?=.{6}) (it requires any 6 chars other than line break chars immediately to the right of the current location)
"do not begin with numbers" => (?!\d) (no digit allowed immediately to the right)
"have two consecutive digits" => (?=.*\d{2}) (a chunk of two digits is required after any 0 or more chars other than line break chars, as many as possible, immediately to the right of the current location).
So, what you may use is
^(?!\d)(?=.{6})(?=.*\d{2})
Note the most expensive part is placed at the end of the pattern so that it could fail quicker if the input is non-matching.
See the regex demo.
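The anchored pattern above can be tried against a few passwords. This is a sketch under assumptions: pwRegex is the name used in the challenge text, and every sample other than astr1on11aut is made up for the example.

```javascript
const pwRegex = /^(?!\d)(?=.{6})(?=.*\d{2})/;

console.log(pwRegex.test("astr1on11aut")); // true
console.log(pwRegex.test("abc123"));       // true
console.log(pwRegex.test("12abcde"));      // false (begins with a digit)
console.log(pwRegex.test("ab12"));         // false (fewer than 6 characters)
console.log(pwRegex.test("abcdefgh"));     // false (no two consecutive digits)
```

All three lookaheads are evaluated at position 0, which is why each requirement can be read off independently.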

Do consecutive lookaheads match based on first matched character

I came across the answers (below) while trying to understand how consecutive lookaheads work. My understanding seems to be contradictory, and I was hoping someone could help clarify.
The answer here suggests that all the lookaheads specified must hold at the first matched character (Why consecutive lookaheads do not always work, answer by Sam Whan)
If I apply that to the solution in this answer:
How to print a number with commas as thousands separators in JavaScript:
function numberWithCommas(x) {
    return x.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}
it means that it's looking for a non-boundary position that is followed by a sequence of digits whose length is a multiple of 3 and, at the same time, followed by a character that is not a digit.
e.g. 12345
I know that a comma should go after the 2, but this seems contradictory: the 2 has 3 digits following it, satisfying the first lookahead, yet the second lookahead says it must not be followed by any digits.
I'm sure I'm misunderstanding something. Any help is appreciated. Thanks!
This regex:
/\B(?=(\d{3})+(?!\d))/g
It has only one positive lookahead condition; the negative lookahead sits inside that first lookahead.
Here are details:
\B: Match position where \b doesn't match (e.g. between word characters)
(?=: Start lookahead
(\d{3})+: Match one or more sets of 3 digits
(?!\d): Inner negative lookahead asserting that no digit follows the matched sets of 3 digits
): End lookahead
However, do note that it is much better to use the following code to format your number as a string with thousands separators:
console.log( parseFloat('1234567.89').toLocaleString('en') )
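To see that the inner (?!\d) is checked after the groups of three rather than at the comma position, the lookahead body can be tested on its own against what follows each candidate position in 12345. A minimal sketch; the standalone lookahead constant is an illustration, not part of the original answer:

```javascript
function numberWithCommas(x) {
  return x.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

console.log(numberWithCommas(12345));   // 12,345
console.log(numberWithCommas(1234567)); // 1,234,567
console.log(numberWithCommas(123));     // 123

// Same lookahead, anchored so it is tested only at the start of the given text.
const lookahead = /^(?=(\d{3})+(?!\d))/;
console.log(lookahead.test("2345")); // false: after "234" comes "5", a digit (position 1|2345)
console.log(lookahead.test("345"));  // true: after "345" comes the end of the string (12|345)

// The pattern knows nothing about decimal points, so fraction digits get grouped too.
console.log(numberWithCommas(1234.5678)); // 1,234.5,678
```

The last line is one reason a locale-aware formatter is the safer choice.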

what is difference between these two syntax in my code

What is the difference between the following two syntaxes in a regular expression?
Please give an example.
(?=.*\d)
and
.*(?=\d)
The first one is just an assertion, a positive look-ahead saying "there must be zero or more characters followed by a digit." If you match it against a string containing at least one digit, it will tell you whether the assertion is true, but the matched text will just be an empty string.
The second one searches for a match, with an assertion (a positive-lookahead) after the match saying "there must be a digit." The matched text will be the characters before the last digit in the string (including any previous digits, because .* is greedy, so it'll consume digits up until the last one, because the last one is required by the assertion).
Note the difference in the match object results:
var str = "foo42";
test("rex1", /(?=.*\d)/, str);
test("rex2", /.*(?=\d)/, str);
function test(label, rex, str) {
    console.log(label, "test result:", rex.test(str));
    console.log(label, "match object:", rex.exec(str));
}
Output (for those who can't run snippets):
rex1 test result: true
rex1 match object: [
""
]
rex2 test result: true
rex2 match object: [
"foo4"
]
Notice how the match result in the second case was foo4 (from the string foo42), but blank in the first case.
(?=...) is a positive lookahead. Both of these expressions succeed on any text that contains a digit. The difference, though, is that (?=...) doesn't consume ("eat") any characters as it matches. For practical purposes, if this is the only thing your regex contains and you only need a true/false result, they'll accept the same strings. However, .*(?=\d) would be a more correct expression, unless there's more to it than what you put in the question.
Where it really matters is when you're using capturing groups or where you're using the content of the matched text after running the regular expression:
If you want to capture all text before the number, but not the number itself, and use it after, you could do this:
(.*?(?=\d))
The ? makes the match non-greedy, so it will only match up to the first number. All text leading up to the number will be in the match result as the first group.
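The effect of the non-greedy quantifier on the captured group can be seen by comparing it with the greedy form on the same input. A small sketch; the input foo42 is borrowed from the earlier answer and the variable names are illustrative:

```javascript
const input = "foo42";

const lazy = /(.*?(?=\d))/.exec(input);   // stops at the first position followed by a digit
const greedy = /(.*(?=\d))/.exec(input);  // runs as far as it can while a digit still follows

console.log(lazy[1]);   // foo
console.log(greedy[1]); // foo4
```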
Please find the difference below
In detail
.* means matches any character (except newline)
(?=\d) means Positive Lookahead - Assert that the regex below can be matched
\d match a digit [0-9]
(?=.*\d)
  CapturingGroup
    MatchOnlyIfFollowedBy
      Sequence: match all of the followings in order
        Repeat
          AnyCharacterExcept\n
          zero or more times
        Digit
.*(?=\d)
  Sequence: match all of the followings in order
    Repeat
      AnyCharacterExcept\n
      zero or more times
    CapturingGroup
      MatchOnlyIfFollowedBy
        Digit

Regular expression for alphanumeric with atleast one digit

I am trying to match an alphanumeric string with at least one digit.
The second condition is that its minimum length should be 3.
For example, the following strings should match
111
12345
ABCD1
123A
11AA11
And the following should not match
ABCD
AB
12
1A
I have got myself to a point where I can get the first condition right.
That is, having minimum one digit:
([a-zA-z0-9]*[0-9]+[a-zA-z0-9]*)
But I don't have any idea to specify a minimum length. If I try using {3},
it will require minimum of 3 numbers.
Try using a positive lookahead to ascertain that there's at least one digit, and use {3,} to indicate that it should match at least 3 characters:
/^(?=.*\d)[a-z\d]{3,}$/i
You can use lookahead to assure your expression contains a digit, and then match the minimum-three-chars:
/^(?=.*?\d)[a-zA-Z0-9]{3,}$/
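Either pattern can be checked against the sample strings from the question. A minimal sketch using the first answer's pattern; the array names are made up for the example:

```javascript
const alnumWithDigit = /^(?=.*\d)[a-z\d]{3,}$/i;

const shouldMatch = ["111", "12345", "ABCD1", "123A", "11AA11"];
const shouldNotMatch = ["ABCD", "AB", "12", "1A"];

console.log(shouldMatch.every((s) => alnumWithDigit.test(s)));   // true
console.log(shouldNotMatch.some((s) => alnumWithDigit.test(s))); // false
```

The lookahead handles the "at least one digit" rule, which leaves {3,} free to count every character instead of only the digits.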
