Regex - how to ignore order of the matched groups? [duplicate] - javascript

This question already has answers here:
Password REGEX with min 6 chars, at least one letter and one number and may contain special characters
(10 answers)
Closed 2 years ago.
I'm trying to create a regex validation for a password which is meant to be:
6+ characters long
Has at least one a-z
Has at least one A-Z
Has at least one 0-9
So, in other words, the match will have :
at least one a-z, A-Z, 0-9
at least 3 any other characters
I've come up with:
((.*){3,}[a-z]{1,}[A-Z]{1,}[0-9]{1,})
It seems pretty simple and logical to me, but two things go wrong:
The quantifier {3,} on (.*) somehow doesn't work and breaks the whole regex. At first I had {6,} at the end, but then it applied to the inner groups, so it required [A-Z]{6,} instead of [A-Z]{1,}.
When I remove {3,} the regex works, but it matches only if the groups appear in order, so it matches aaBB11 but not BBaa11.

This is a use case where I wouldn't use a single regular expression, but multiple simpler ones.
Still, to answer your question: If you only want to validate that the password matches those criteria, you could use lookaheads:
^(?=.{6})(?=.*?[a-z])(?=.*?[A-Z])(?=.*?[0-9])
You're basically looking for a position from which you look at
6 characters (and maybe more to follow, doesn't matter): (?=.{6})
maybe something, then a lowercase letter: (?=.*?[a-z])
maybe something, then an uppercase letter: (?=.*?[A-Z])
maybe something, then a digit: (?=.*?[0-9])
The order of appearance is arbitrary due to the maybe something parts.
(Note that I've interpreted 6 characters long as at least 6 characters long.)
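Both ideas from this answer can be sketched in JavaScript. This is only an illustration: the names passwordPattern and isValidPassword are made up here, and the sample passwords are arbitrary.

```javascript
// Single regex: every lookahead is evaluated from position 0,
// so the order in which the character types appear does not matter.
const passwordPattern = /^(?=.{6})(?=.*?[a-z])(?=.*?[A-Z])(?=.*?[0-9])/;

// The same rules as several simpler checks (hypothetical helper).
function isValidPassword(s) {
  return s.length >= 6 &&
    /[a-z]/.test(s) &&
    /[A-Z]/.test(s) &&
    /[0-9]/.test(s);
}

for (const candidate of ["aaBB11", "BBaa11", "11aaBB", "aabb11", "aB1"]) {
  console.log(candidate, passwordPattern.test(candidate), isValidPassword(candidate));
}
```

The first three candidates pass both checks regardless of order; "aabb11" fails for lack of an upper-case letter and "aB1" fails on length.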

I believe this is what you want:
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])[!-~]{6,}$
If we follow your spec to the letter, your validation password looks like this:
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{6,}$
However, we need to improve on this, because apart from the number, lower-case and upper-case letter, are you really willing to accept any character? For instance, can the user use a character in the Thai language? A space character? A tab? Didn't think so. :)
If you want to allow all the printable ASCII characters apart from space, instead of a dot, we can use this character range: [!-~]
How does it work?
The ^ anchor makes sure we start the match at the start of the string
The (?=.*[a-z]) lookahead ensures we have a lower-case character
The (?=.*[A-Z]) lookahead ensures we have an upper-case character
The (?=.*[0-9]) lookahead ensures we have a digit
The [!-~]{6,} matches six or more printable ASCII characters that are not space.
The $ ensures we have reached the end of the string (otherwise, the password could contain more characters that are not allowed).
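To see the difference between the dot and the [!-~] range, here is a small JavaScript sketch. The variable names anyChar and printableAscii are invented for this comparison.

```javascript
// Spec followed to the letter: any character except a line break is allowed.
const anyChar = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{6,}$/;

// Restricted variant: only printable ASCII from "!" to "~" (no space, no tab).
const printableAscii = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])[!-~]{6,}$/;

for (const candidate of ["aaBB11", "aa BB11", "aB1\txyz", "aB1!xyz"]) {
  console.log(JSON.stringify(candidate), anyChar.test(candidate), printableAscii.test(candidate));
}
```

Passwords containing a space or a tab get through the dot version but are rejected by the [!-~] version.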

You could use this pattern: ^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9]).{6,}

Related

Regex creation to allow, disallow few characters

I am new to regex, and I have this use case:
Allow characters, numbers.
Zero or one question mark allowed at a time (? - valid; consecutive question marks (??) are not allowed).
test - valid
?test - valid
??test - invalid
?test?test - valid
???test - invalid
test??test - invalid
Exclude the $ sign.
[a-zA-Z0-9?] - seems this doesn't work
Thanks.
Try the following regular expression: ^(?!.*\?\?)[a-zA-Z0-9?]+$
First we use a negative lookahead, (?!.*\?\?), which rejects the whole string if two consecutive question marks appear anywhere in it (a negative lookahead does not consume characters).
Since question mark has special meaning in regular expressions (Quantifier — Matches between zero and one times), each question mark is escaped using backslash.
The plus sign at the end is a Quantifier — Matches between one and unlimited times, as many times as possible
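A quick JavaScript check of this pattern against the cases from the question (the variable name noDoubleQuestion is just for illustration):

```javascript
// The negative lookahead rejects any string containing "??";
// the character class then restricts what is allowed at all.
const noDoubleQuestion = /^(?!.*\?\?)[a-zA-Z0-9?]+$/;

const cases = ["test", "?test", "??test", "?test?test", "???test", "test??test", "te$t"];
for (const input of cases) {
  console.log(input, noDoubleQuestion.test(input) ? "valid" : "invalid");
}
```

Note that a lone "?" is also accepted by this pattern, since the character class includes the question mark.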
Your description can be broken down into the regex:
^(?:\??[a-zA-Z0-9])+\??$
You say characters and your description shows letters and numbers only, but it's possible \w (word characters) may be used instead - this includes underscore
It's between ^ and $, meaning the whole field must match (no partial matches, although if you want those you can remove the anchors). The + means there must be at least one match (so an empty string won't match). The capturing group ((\??[a-zA-Z0-9])) says I must see either a question mark followed by a letter or digit, or just a letter or digit, repeated many times, and the final \?? allows the string to end with a single question mark.
You probably don't want capturing groups here, so we can start that with ?: to prevent capture leading to:
^(?:\??[a-zA-Z0-9])+\??$
Matches
test
?test
?test?test
test?
Doesn't match
??test
???test
test??test
test??
<empty string>
?
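The two lists above can be verified with a short JavaScript sketch (the array names are made up for this example):

```javascript
const pattern = /^(?:\??[a-zA-Z0-9])+\??$/;

const shouldMatch = ["test", "?test", "?test?test", "test?"];
const shouldFail = ["??test", "???test", "test??test", "test??", "", "?"];

// Every entry in shouldMatch must pass and every entry in shouldFail must not.
const allMatch = shouldMatch.every((s) => pattern.test(s));
const noneMatch = shouldFail.every((s) => !pattern.test(s));
console.log(allMatch, noneMatch);
```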

Regex pattern for a String contains "at least one digit and allows only two special chars"

I have a requirement like this for a regex pattern for a string:
1. String length >= 8.
2. Contains at least one digit.
3. Contains exactly two special characters.
4. The remaining characters are letters.
I have tried this:
"/^(?=.*[0-9]+)(?=.*[##$%]{2})[0-9##$%A-Za-z]{8,}$/g"
But with these examples I am getting:
1. "Example1##" --true (passed my test)
2. "Example2#" --false (passed my test)
3. "Example3####$#" --true (failed my test)
==> In the 3rd case it is accepting more than 2 special characters.
How can I achieve my requirement? Please help me solve this.
You can use
^(?=\D*\d)(?=(?:\w*[##$%]){2}(?!\w*[##$%]))[\d##$%A-Za-z]{8,}$
https://regex101.com/r/9iyoIX/2
(?=\D*\d) - at least one digit
(?=(?:\w*[##$%]){2}(?!\w*[##$%])) - Repeat (optional word characters followed by a special character) 2 times, then negative lookahead for more special characters
[\d##$%A-Za-z]{8,} - String is at least 8 characters long and, when followed by a closing $ anchor, contains only the desired types of characters
Note that \d is preferable to [0-9] - easier to read. Also, when you're looking for a certain number of characters of a particular type in a string, a negated character set has better performance than .*. For example, when looking for a digit, better to use
(?=\D*\d)
than
(?=.*\d)
The negated-class version is more efficient: the regex fails sooner when there's no match.
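Here is a hedged JavaScript sketch of the "exactly two special characters" check. It assumes the special characters are #, $ and % (the set as it appears in the pattern above) and adds a closing $ anchor so trailing characters are rejected; the name exactlyTwoSpecials is invented.

```javascript
// Assumption: the special characters are #, $ and %.
const exactlyTwoSpecials =
  /^(?=\D*\d)(?=(?:\w*[#$%]){2}(?!\w*[#$%]))[\d#$%A-Za-z]{8,}$/;

const samples = ["Example1##", "Example2#", "Example3####$#", "Exam#ple1#", "Example##"];
for (const s of samples) {
  console.log(s, exactlyTwoSpecials.test(s));
}
```

Only "Example1##" and "Exam#ple1#" satisfy all four rules; the others have too few or too many special characters, or no digit.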

Regex which accepts alphanumerics only, except for one hyphen in the middle

I am trying to construct a regular expression which accepts alphanumerics only ([a-zA-Z0-9]), except for a single hyphen (-) in the middle of the string, with a minimum of 9 characters and a maximum of 20 characters.
I have verified the following expression, which accepts a hyphen in the middle.
/^[a-zA-Z0-9]+\-?[a-zA-Z0-9]+$/
How can I set the minimum 9 and maximum 20 characters for the above regex? I have already used quantifiers + and ? in the above expression.
How would I apply {9,20} to the above expression? Are there any other suggestions for the expression?
/^[a-zA-Z0-9]+\-?[a-zA-Z0-9]+$/
can be simplified to
/^[a-z0-9]+(?:-[a-z0-9]+)?$/i
since if there is no dash then you don't need to look for more letters after it, and you can use the i flag to match case-insensitively and avoid having to reiterate both lower-case and upper-case letters.
Then split your problem into two cases:
9-20 alphanumerics
10-21 characters, all of which are alphanumerics except one dash
You can check the second using a positive lookahead like
/^(?=.{10,21}$)/i
to check the number of characters without consuming them.
Combining these together gives you
/^(?:[a-z0-9]{9,20}|(?=.{10,21}$)[a-z0-9]+-[a-z0-9]+)$/i
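A small JavaScript sketch exercising the combined pattern; the sample strings are arbitrary and the variable name idPattern is made up.

```javascript
// Either 9-20 alphanumerics, or 10-21 characters containing exactly one
// inner dash (so the dash is counted on top of the 9-20 alphanumerics).
const idPattern = /^(?:[a-z0-9]{9,20}|(?=.{10,21}$)[a-z0-9]+-[a-z0-9]+)$/i;

const samples = [
  "abcdefghi",   // 9 alphanumerics
  "abcde-fghi",  // 10 characters, one inner dash
  "abcd-efgh",   // only 8 alphanumerics plus a dash
  "-abcdefghi",  // dash at the start
  "abc--defgh",  // two dashes
];
for (const s of samples) {
  console.log(s, idPattern.test(s));
}
```

The first two samples are accepted and the last three are rejected.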
You can do this, provided you don't need the - to be exactly in the middle:
/^(?=[^-]+-?[^-]+$)[a-zA-Z\d-]{9,20}$/
[^-] matches any character that is not -
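As a rough JavaScript illustration of this lookahead version (the name hyphenPattern is invented): note that here the hyphen counts toward the 9-20 character limit, because {9,20} applies to the whole string.

```javascript
// The lookahead allows at most one hyphen, with at least one other
// character on each side; the main pattern enforces charset and length.
const hyphenPattern = /^(?=[^-]+-?[^-]+$)[a-zA-Z\d-]{9,20}$/;

for (const s of ["abcdefghi", "abcd-efgh", "-abcdefgh", "abcdefgh-", "abc--defg"]) {
  console.log(s, hyphenPattern.test(s));
}
```

"abcd-efgh" is accepted here even though it contains only 8 alphanumerics, which may or may not be what the requirement intends.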

Please explain some Javascript Regular Expressions

I'm learning Javascript via an online tutorial, but nowhere on that website or any other I googled for was the jumble of symbols explained that makes up a regular expression.
Check if all numbers: /^[0-9]+$/
Check if all letters: /^[a-zA-Z]+$/
And the hardest one:
Validate Email: /^[\w-.+]+\@[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
What do all the slashes and dollar signs and brackets mean? Please explain.
(By the way, what languages are required to create a flexible website? I know a bit of Javascript and wanna learn jQuery and PHP. Anything else needed?)
Thanks.
There are already a number of good sites that explain regular expressions so I'll just dive a bit into how each of the specific examples you gave translate.
Check if all numbers: ^ anchors the start of the expression (e.g. start at the beginning of the text). Without it a match could be found anywhere. [0-9] finds the characters in that character class (e.g. the numbers 0-9). The + after the character class just means "one or more". The ending $ anchors the end of the text (e.g. the match should run to the end of the input). So if you put that together, that regular expression would allow for only 1 or more numbers in a string. Note that the anchors are important as without them it might match something like "foo123bar".
Check if all letters: Pretty much the same as above but the character classes are different. In this example the character class [a-zA-Z] represents all lowercase and uppercase characters.
The last one actually isn't any more difficult than the other two; it's just longer. This answer is getting quite long so I'll just explain the new symbols. A \w in a character class will match word characters (which are defined per regex implementation but are generally 0-9a-zA-Z_ at least). The backslash before the @ escapes the @ so that it isn't seen as a token in the regex. A period outside a character class will match any character, so .+ would match one or more of any character (e.g. a, 1, Z, 1a, etc); inside the brackets, as in [\w-.+], the period and plus are literal. The last part of the regex ({2,4}) defines an interval expression. This means that it can match a minimum of 2 of the thing that precedes it, and a maximum of 4.
Hope you got something out of the above.
There is an awesome explanation of regular expressions at http://www.regular-expressions.info/ including notes on language and implementation specifics.
Let me explain:
Check if all numbers: /^[0-9]+$/
So, the first thing we see is the "/" at the beginning and the end. This is a delimiter, and only serves to show the beginning and end of the regular expression.
Next, we have a "^", which means the beginning of the string. [0-9] means a digit from 0-9. + is a quantifier, which modifies the term in front of it; in this case, it means you can have one or more of something, so you can have one or more digits from 0-9.
Finally, we end with "$", which is the opposite of "^", and means the end of the string. So put that all together and it basically makes sure that between the start and end of the string there is nothing but one or more digits from 0-9.
Check if all letters: /^[a-zA-Z]+$/
We notice this is very similar, but instead of checking for numbers 0-9, it checks for letters a-z (lowercase) and A-Z (uppercase).
And the hardest one:
Validate Email: /^[\w-.+]+\@[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
"\w" means a word character (a letter, digit or underscore). Inside the square brackets, the hyphen, period and plus sign are literal characters, so the part before the @ can be any mix of word characters, hyphens, periods and plus signs.
The new thing here is escape characters. Many symbols cannot be used literally without escaping them by placing a backslash in front, as is the case with "\@". This means it is looking directly for the symbol "@".
Next it looks for letters, digits, periods and hyphens, then a period (this one seems incorrect: it should be escaped too, though it will still work, since an unescaped period matches any symbol). Numbers inside {} give the number of repetitions allowed for the previous term, so there should be 2-4 characters from [a-zA-Z0-9] (this part is the top-level domain, such as .com, .ca, or .info). Note there's another error here: [a-zA-z0-9] should be [a-zA-Z0-9] (capital Z).
Oh, and check out that site listed above, it is a great set of tutorials too.
Regular Expressions is a complex beast and, as already pointed out, there are quite a few guides off of google you can go read.
To answer the OP questions:
Check if all numbers: /^[0-9]+$/
Regexps here are all delimited with //, much like strings are quoted with '' or "".
^ means start of string or line (depending on what options you have about multiline matching)
[...] are called character classes. Anything in [] is a list of single matching characters at that position in this case 0-9. The minus sign has a special meaning of "sequence of characters between". So [0-9] means "one of 0123456789".
+ means "1 or more" of the preceding match (in this case [0-9]), so one or more digits
$ means end of string/line match.
So in summary: find any string that contains only digits, i.e. '0123a' will not match, as [0-9]+ fails to match the a before $.
Check if all letters: /^[a-zA-Z]+$/
Hopefully [A-Za-z] makes sense now (A-Z = ABCDEF...XYZ and a-z abcdef...xyz)
Validate Email: /^[\w-.+]+\@[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Not all regexp parsers know the \w sequence. JavaScript, Java and Perl do support it, as far as I know.
I have already covered /^ at the beginning. For this [] match we are looking for
\w, -, . and +. I think that regexp is incorrect: either the minus sign should be escaped with \ or it should be at the end of the [] (i.e. [\w+.-]). But that is an aside; they are basically attempting to allow any of abcdefghijklmnopqrstuvwxyz0123456789-.+ (\w also covers upper-case letters and the underscore)
so fred.smith-foo+wee@mymail.com will match but fred.smith%foo+wee@mymail.com won't (the % is not matched by [\w.+-]).
\@ is the literal at sign (it is escaped because Perl expands @ as an array variable reference).
[a-zA-Z0-9.-]+ is almost the same as [\w.-]+ (\w additionally matches the underscore). Very much like the user part of the match, but it does not match +. So this matches foo.com. and google.co. but not my+foo.com or my***domain.co.
. means match any one character. This again is incorrect, as fred@foo%com will match because . matches %*^%$£! etc. This should have been written as \.
The last character class [a-zA-z0-9]{2,4} looks for 2, 3 or 4 of the characters specified in the character class (much like + means "1 or more", {2,4} means at least 2 with a maximum of 4 of the preceding match). So 'foo' matches, '11' matches, '11111' does not match and 'information' does not.
The "tweaked" regexp should be:
/^[\w.+-]+\@[a-zA-Z0-9.-]+\.[a-zA-Z0-9]{2,4}$/
I'm not doing a tutorial on RegEx's, that's been done really well already, but here are what your expressions mean.
/^<something>$/ String begins, has something in the middle, and then immediately ends.
/^foo$/.test('foo'); // true
/^foo$/.test('fool'); // false
/^foo$/.test('afoo'); // false
+ One or more of something:
/a+/.test('cot');//false
/a+/.test('cat');//true
/a+/.test('caaaaaaaaaaaat');//true
[<something>] Include any characters found between the brackets (includes ranges like 0-9, a-z, and A-Z, as well as special codes like \w for 0-9a-zA-Z_).
/^[0-9]+/.test('f00')//false
/^[0-9]+/.test('000')//true
{x,y} between X and Y occurrences
/^[0-9]{1,2}$/.test('12');// true
/^[0-9]{1,2}$/.test('1');// true
/^[0-9]{1,2}$/.test('d');// false
/^[0-9]{1,2}$/.test('124');// false
So, that should cover everything, but for good measure:
/^[\w-.+]+\@[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Begins with at least one character from \w, -, +, or ., followed by an @, followed by at least one character in the set a-zA-Z0-9.-, followed by one character of anything (. means anything; they meant \.), followed by 2-4 characters of a-zA-z0-9
As a side note, this regular expression to check emails is not only dated, but it is very, very, very incorrect.

Regex to match card code input

How can I write a regex to match strings following these rules?
1 letter followed by 4 letters or numbers, then
5 letters or numbers, then
3 letters or numbers followed by a number and one of the following signs: ! & # ?
I need to allow input as a 15-character string or as 3 groups of 5 chars separated by one space.
I'm implementing this in JavaScript.
I'm not going to write out the whole regex for you since this is homework, but here are some hints which should help you out:
Use character classes. [A-Z] matches all uppercase. [a-z] matches all lowercase. [0-9] matches numbers. You can combine them like so [A-Za-z0-9].
Use quantifiers like {n} so [A-Z]{3} gives you 3 uppercase letters.
You can put other characters in character classes. Let's say you wanted to match % or # or @, you could do [%#@] which would match any of those characters.
Some meta-characters (characters which have special meaning in the context of regular expressions) will need to be escaped like so: \$ (since $ matches the end of a line)
^ and $ match the beginning and end of the line respectively.
\s matches white-space, but if you sanitize your input, you shouldn't need to use this.
Flags after the regex do special things. For example in /[a-z]/i, the i ignores case.
This should be it:
/^[a-z][a-z0-9]{4} ?[a-z0-9]{5} ?[a-z0-9]{3}[0-9][!&#?]$/i
Feel free to change 0-9 and [0-9] with \d if you see fit.
The regex is simple and readable enough. ^ and $ make sure this is a whole match, so there aren't extra characters before or after the code, and the /i flag allows upper or lower case letters.
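A brief JavaScript sketch of this pattern follows. The second pattern, strictCardCode, is not from the answer: it is an assumed stricter variant that captures the optional space and reuses it with \1, so that either both separators are present or neither is.

```javascript
// Pattern from the answer: each space is optional independently.
const cardCode = /^[a-z][a-z0-9]{4} ?[a-z0-9]{5} ?[a-z0-9]{3}[0-9][!&#?]$/i;

// Assumed stricter variant: the backreference \1 forces the second
// separator to equal the first (both a space, or both empty).
const strictCardCode = /^[a-z][a-z0-9]{4}( ?)[a-z0-9]{5}\1[a-z0-9]{3}[0-9][!&#?]$/i;

const samples = [
  "a1b2c3d4e5f6g7!",    // 15 characters, no spaces
  "a1b2c 3d4e5 f6g7!",  // three groups of five
  "a1b2c 3d4e5f6g7!",   // only one separator
  "11b2c3d4e5f6g7!",    // starts with a digit
];
for (const s of samples) {
  console.log(s, cardCode.test(s), strictCardCode.test(s));
}
```

The input with a single separator is accepted by the original pattern but rejected by the stricter one.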
I would start with a tutorial.
Pay attention to the quantifiers (like {N}) and character classes (like [a-zA-Z])
^[a-zA-Z][a-zA-Z0-9]{4} ?[a-zA-Z0-9]{5} ?[a-zA-Z0-9]{3}[0-9][!&#?]$
