What would be a valid regex for strings that don't contain a double letter?
My solution is: b(ab)*a + a(ba)*b.
But I don't think it is correct, because it doesn't include the single-letter strings a or b.
Can someone help me?
You can achieve this with a negative lookahead:
const re = /^(?!.*?(.).*?\1)[a-z]*$/;
let s1 = "abcdefgh", s2 = "abcdefga";
console.log(re.test(s1)); // true: no character repeats
console.log(re.test(s2)); // false: "a" appears twice
How it works:
/^(?!.*?(.).*?\1)[a-z]*$/
^ asserts position at start of the string
Negative Lookahead (?!.*?(.).*?\1): Assert that the Regex below does not match
  .*? matches any character (except for line terminators)
    *? Quantifier: matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
  1st Capturing Group (.)
    . matches any character (except for line terminators)
  .*? matches any character (except for line terminators)
    *? Quantifier: matches between zero and unlimited times, as few times as possible, expanding as needed (lazy)
  \1 matches the same text as most recently matched by the 1st capturing group
Match a single character present in the list below [a-z]*
  * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
  a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
$ asserts position at the end of the string
No g flag is used: with g, test() advances lastIndex after each successful match, so repeated calls on the same regex object can return misleading results.
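The pattern above rejects any character that repeats anywhere in the string. If "double letter" is meant as two identical letters directly next to each other, a narrower lookahead may be closer to the question. The following is only a sketch under that assumption (lowercase a-z input); the names noAdjacentDouble and hasNoDoubleLetter are made up for illustration.

```javascript
// (.)\1 matches one character immediately followed by the same character,
// so the negative lookahead fails only on adjacent doubles such as "bb".
const noAdjacentDouble = /^(?!.*(.)\1)[a-z]*$/;

// Hypothetical helper: true when no two neighbouring characters are identical.
function hasNoDoubleLetter(s) {
  return noAdjacentDouble.test(s);
}

console.log(hasNoDoubleLetter("abab")); // alternating letters, no double
console.log(hasNoDoubleLetter("abba")); // contains "bb"
console.log(hasNoDoubleLetter("a"));    // single letter
```

Under this reading "abab" is accepted, whereas the any-repeat pattern would reject it because a and b each occur twice.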
I want to write a regex that will find substrings of length 10-15 where all characters are [A-Z0-9] and it must contain at least 1 letter and one number (spaces are ok but not other special characters). Some examples:
ABABABABAB12345 should match
ABAB1234ABA32 should match
ABA BA BABAB12345 should match
1234567890987 should not match
ABCDEFGHIJK should not match
ABABAB%ABAB12?345 should not match
So far the best two candidates I have come up with are:
(?![A-Z]{10,15}|[0-9]{10,15})[0-9A-Z]{10,15} - this fails because if the string has 10 consecutive numbers/letters it will not match, even though the 15 character string has a mix (e.g. ABABABABAB12345).
(?=.*[0-9])(?=.*[A-Z])([A-Z0-9]+){10,15} - this fails because it will match 15 consecutive letters as long as there is a number later in the string (even though it is outside the match) and vice versa (e.g. 123456789098765 abcde will match 123456789098765).
(I need to do this in Python and JS.)
If each string is on its own line, then you can use start/end anchors to construct the regex:
^(?=.*[0-9])(?=.*[A-Z])(?:\s*[A-Z0-9]\s*){10,15}$
^ - start of line
(?=.*[0-9]) - lookahead, must contain a number
(?=.*[A-Z]) - lookahead, must contain a letter
(?: - start a non-capturing group
\s*[A-Z0-9]\s* - a letter or number with optional surrounding whitespace
) - end non-capturing group
{10,15} - Pattern occurs 10 to 15 times
$ - end of line
See a live example here: https://regex101.com/r/eWX2Qo/1
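As a quick check of the anchored pattern, the sketch below runs it in JavaScript against the six examples from the question, one string at a time (so ^ and $ anchor to the whole string; for multi-line input the m flag would be needed). The names lineRe, samples and results are illustrative only.

```javascript
const lineRe = /^(?=.*[0-9])(?=.*[A-Z])(?:\s*[A-Z0-9]\s*){10,15}$/;

const samples = [
  "ABABABABAB12345",
  "ABAB1234ABA32",
  "ABA BA BABAB12345",
  "1234567890987",
  "ABCDEFGHIJK",
  "ABABAB%ABAB12?345",
];

// For each sample, record whether the whole string satisfies the pattern.
const results = samples.map((s) => lineRe.test(s));
console.log(results);
```

The first three samples pass and the last three fail, which agrees with the expectations listed in the question.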
This doesn't account for ABA BA BABAB12345, but this still might help.
Based on what you're trying to match, it looks like you want there to be a mix.
What you can do is use two lookaheads: one looking for a digit in the following 15 characters, and another looking for a letter in the same span. If both succeed, the pattern then matches a run of 10 to 15 digits and letters.
(?=.{0,14}\d)(?=.{0,14}[A-Z])[A-Z\d]{10,15}
https://regex101.com/r/qw1Q0S/1
(?=.{0,14}\d) - at least one of the next 15 characters has to be a digit
(?=.{0,14}[A-Z]) - at least one of the next 15 characters has to be a capital letter
[A-Z\d]{10,15} - match 10 to 15 letters and digits if the previous conditions are true
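To see how the two-lookahead pattern behaves as a substring search, here is a small JavaScript sketch. The helper name firstMatch and the last sample string are made up for this example; they are not from the original answer.

```javascript
const mixRe = /(?=.{0,14}\d)(?=.{0,14}[A-Z])[A-Z\d]{10,15}/;

// Hypothetical helper: return the matched substring, or null when nothing matches.
function firstMatch(s) {
  const m = mixRe.exec(s);
  return m ? m[0] : null;
}

console.log(firstMatch("ABABABABAB12345"));   // mixed letters and digits
console.log(firstMatch("1234567890987"));     // digits only
console.log(firstMatch("ABA BA BABAB12345")); // spaces split the run
console.log(firstMatch("ABCDEFGHIJ 1"));      // digit lies beyond the letter run
```

Two things are worth noticing. For the spaced sample only the tail after the last space is matched, which is why spaces need separate handling. And because each lookahead inspects a fixed 15-character window rather than the text actually matched, the last sample yields a letters-only match: the digit is seen past the space. If that matters, the match may need a follow-up check.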
Edit with an improved answer:
To account for the spaces, you can tweak the above concept.
(?=(?:. *+){0,14}\d)(?=(?:. *+){0,14}[A-Z])(?:[A-Z\d] *){10,15}
Above, the lookaheads matched .{0,14}. Here . has been changed to (?:. *+), a non-capturing group that matches . followed by 0 or more spaces.
So putting it together:
Lookahead 1:
(?=(?:. *+){0,14}\d)
This matches 0 to 14 characters, each of which may or may not be followed by spaces, which effectively ignores spaces. It also uses a possessive quantifier ( *+) when matching spaces to prevent the engine from backtracking once spaces are matched. The pattern would work without the + modifier, but would take more than double the steps to match on the example.
Lookahead 2:
(?=(?:. *+){0,14}[A-Z])
Same as lookahead 1, but now testing for a capital letter instead of a digit.
If lookahead 1 and lookahead 2 both match, then the engine will be left in a place where our matches can potentially be made.
Actual match:
(?:[A-Z\d] *){10,15}
This matches the capital letters and numbers, but now also 0 or more spaces. The only drawback is that a trailing space will be included in your match, although that's easily handled in post-processing.
Edit:
All whitespace (\r, \n, \t and space) can be accounted for by using \s instead of a literal space.
Depending on the amount of whitespace in the input, the possessive quantifier is necessary to prevent catastrophic backtracking. One test input using possessive quantifiers completes in 22,332 steps, while the same input with a regular quantifier fails to match anything due to catastrophic backtracking.
It should be noted that the possessive quantifier *+ is not supported in JavaScript, and Python's builtin re module only supports it from Python 3.11 onward; it is supported by Python's regex module:
>>> import regex
>>> pattern = r'(?=(?:.\s*+){0,14}\d)(?=(?:.\s*+){0,14}[A-Z])(?:[A-Z\d]\s*){10,15}'
>>> regex.search(pattern, 'AAAAAAAAAA\n2')
<regex.Match object; span=(0, 12), match='AAAAAAAAAA\n2'>
>>>
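Since JavaScript has no possessive quantifiers, one commonly used workaround is to capture the spaces inside a lookahead and then consume them with a backreference. A lookahead is not re-entered on backtracking, so (?=( *))\1 behaves much like a possessive run of spaces. The sketch below applies that idea to the two lookaheads; the substitution is an assumption of this example rather than part of the original answer, and spacedRe and matchSpaced are made-up names.

```javascript
// (?=( *))\1 stands in for the possessive quantifier on spaces: the lookahead
// grabs every space at once and the backreference consumes exactly that text.
// Group 1 belongs to the digit lookahead, group 2 to the letter lookahead.
const spacedRe =
  /(?=(?:.(?=( *))\1){0,14}\d)(?=(?:.(?=( *))\2){0,14}[A-Z])(?:[A-Z\d] *){10,15}/;

// Hypothetical helper: return the match without the trailing space the
// pattern may include, or null when nothing matches.
function matchSpaced(s) {
  const m = spacedRe.exec(s);
  return m ? m[0].trimEnd() : null;
}

console.log(matchSpaced("ABA BA BABAB12345")); // spaced sample, matched as a whole
console.log(matchSpaced("ABABAB%ABAB12?345")); // special characters, no match
```

The trimEnd() call covers the trailing-space drawback mentioned earlier; the capture groups exist only to support the backreferences.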
Has the right stuff
function lfunko() {
  const a = ["ABABABABAB12345","ABAB1234ABA32","ABA BA BABAB12345","1234567890987","ABCDEFGHIJK","ABABAB%ABAB12?345"];
  const o = a.map((s) => {
    const chars = s.split("");
    const ll = chars.filter((c) => /[A-Z]/.test(c)).length;     // letters
    const ln = chars.filter((c) => /[0-9]/.test(c)).length;     // digits
    const ot = chars.filter((c) => /[^A-Z0-9]/.test(c)).length; // anything else, spaces included
    const sum = ll + ln;
    // At least one letter and one digit, 10 to 15 characters in total, nothing else allowed.
    const ok = ll >= 1 && ln >= 1 && sum >= 10 && sum <= 15 && ot === 0;
    return `${s} - ${ok ? "TRUE" : "FALSE"}`;
  });
  console.log(JSON.stringify(o));
}
Execution log
11:18:20 PM Notice Execution started
11:18:21 PM Info ["ABABABABAB12345 - TRUE","ABAB1234ABA32 - TRUE","ABA BA BABAB12345 - FALSE","1234567890987 - FALSE","ABCDEFGHIJK - FALSE","ABABAB%ABAB12?345 - FALSE"]
11:18:21 PM Notice Execution completed
Your requirement of [A-Z0-9] does not include spaces, so the third example should be false.
The expected results should be:
ABABABABAB12345 should match
ABAB1234ABA32 should match
ABA BA BABAB12345 should not match (has spaces)
1234567890987 should not match
ABCDEFGHIJK should not match
ABABAB%ABAB12?345 should not match
Hi all, I am writing a password regular expression for use with the JavaScript test() method. My current solution is:
/^(?=.*\d)^(?=.*[!#$%'*+\-/=?^_{}|~])(?=.*[A-Z])(?=.*[a-z])\S{8,15}$/gm
It enforces the following rules:
May contain any character except a space
At least 8 characters long but not more than 15 characters
At least one uppercase and one lowercase letter
At least one digit and one special character
But I am not able to implement the rule below for . (period, dot, full stop):
. (dot, period, full stop) is allowed provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively.
Can anyone help me sort out this problem? Thanks in advance.
You may move the \S{8,15} part with the $ anchor to the positive lookahead and place it as the first condition (to fail the whole string if it has spaces, or the length is less than 8 or more than 15) and replace that pattern with [^.]+(?:\.[^.]+)* consuming subpattern.
/^(?=\S{8,15}$)(?=.*\d)(?=.*[!#$%'*+\/=?^_{}|~-])(?=.*[A-Z])(?=.*[a-z])[^.]+(?:\.[^.]+)*$/
See the regex demo
Details:
^ - start of string
(?=\S{8,15}$) - the first condition that requires the string to have no whitespaces and be of 8 to 15 chars in length
(?=.*\d) - there must be a digit after any 0+ chars
(?=.*[!#$%'*+\/=?^_{}|~-]) - there must be one symbol from the defined set after any 0+ chars
(?=.*[A-Z]) - an uppercase ASCII letter is required
(?=.*[a-z]) - a lowercase ASCII letter is required
[^.]+(?:\.[^.]+)* - 1+ chars other than ., followed with 0 or more sequences of a . followed with 1 or more chars other than a dot (note that we do not have to add \s into these 2 negated character classes as the first lookahead already prevalidated the whole string, together with its length)
$ - end of string.
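The breakdown above can be exercised with a few sample passwords. This is only a sketch: the sample strings and the names passwordRe and cases are invented for illustration, and the expected values follow from the rules as described.

```javascript
const passwordRe =
  /^(?=\S{8,15}$)(?=.*\d)(?=.*[!#$%'*+\/=?^_{}|~-])(?=.*[A-Z])(?=.*[a-z])[^.]+(?:\.[^.]+)*$/;

// Sample input mapped to the result the rules call for.
const cases = {
  "Abc.def1!": true,   // single interior dot
  "Abcdef1!": true,    // no dot at all
  ".Abcdef1!": false,  // leading dot
  "Abcdef1!.": false,  // trailing dot
  "Abc..def1!": false, // consecutive dots
  "Abc def1!": false,  // contains a space
  "Ab1!": false,       // shorter than 8 characters
};

for (const [input, expected] of Object.entries(cases)) {
  console.log(input, passwordRe.test(input) === expected ? "ok" : "MISMATCH");
}
```

Note that the regex is created without the g flag, so calling test() repeatedly on the same object is safe.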
Please explain the meaning of the following JavaScript regular expression in detail:
/^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/
This is the meaning.
/^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/
1st Alternative: ^\b_((?:__|[\s\S])+?)_\b
  ^ assert position at start of the string
  \b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
  _ matches the character _ literally
  1st Capturing group ((?:__|[\s\S])+?)
    (?:__|[\s\S])+? Non-capturing group
      Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
      1st Alternative: __
        __ matches the characters __ literally
      2nd Alternative: [\s\S]
        [\s\S] match a single character present in the list below
          \s match any white space character [\r\n\t\f ]
          \S match any non-white space character [^\r\n\t\f ]
  _ matches the character _ literally
  \b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
2nd Alternative: ^\*((?:\*\*|[\s\S])+?)\*(?!\*)
  ^ assert position at start of the string
  \* matches the character * literally
  2nd Capturing group ((?:\*\*|[\s\S])+?)
    (?:\*\*|[\s\S])+? Non-capturing group
      Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
      1st Alternative: \*\*
        \* matches the character * literally
        \* matches the character * literally
      2nd Alternative: [\s\S]
        [\s\S] match a single character present in the list below
          \s match any white space character [\r\n\t\f ]
          \S match any non-white space character [^\r\n\t\f ]
  \* matches the character * literally
  (?!\*) Negative Lookahead - Assert that it is impossible to match the regex below
    \* matches the character * literally
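To make the breakdown concrete, the sketch below runs the pattern against a few strings. Judging from the pattern alone, it appears to pick up text wrapped in single underscores or single asterisks at the very start of a string; that reading is an inference, and the helper innerText and the sample inputs are made up for this example.

```javascript
const emRe = /^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/;

// Hypothetical helper: return the captured inner text from whichever
// alternative matched (group 1 for _..._, group 2 for *...*), or null.
function innerText(s) {
  const m = emRe.exec(s);
  if (!m) return null;
  return m[1] !== undefined ? m[1] : m[2];
}

console.log(innerText("_hello_ world")); // underscore form
console.log(innerText("*hello* world")); // asterisk form
console.log(innerText("_hello_world"));  // closing _ is not at a word boundary
console.log(innerText("plain text"));    // no delimiter at the start
```

The third call shows the effect of the trailing \b: the closing underscore is followed by a word character, so the first alternative cannot end there.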
Well, in a really nice form:
You can check this out at Regex 101.
I'm using this regex to match some strings:
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
I'm confused about why it allows two dots, like ..
My understanding is that it allows only one dash or none: (-)?
Any number of digits or none: (\d+)?
One dot or none: (\.)?
Why does it allow .. or even .4.6?
Testing done in http://www.regextester.com/
[^\s] means anything that is not a whitespace. This includes dots. Trying to match .. will get you:
[^\s] matches .
(-)? doesn't match
(\d+)? doesn't match
(\.)? matches .
(\d+)? doesn't match
I'll assume you wanted to match numbers (possibly negative/floating):
^-?\d+(\.\d+)?$
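A side-by-side run makes the difference visible. In this sketch the trailing / from the question's pattern is dropped on the assumption that it was a leftover delimiter; the names original and numeric and the sample inputs are illustrative.

```javascript
const original = /^([^\s](-)?(\d+)?(\.)?(\d+)?)$/; // pattern from the question
const numeric = /^-?\d+(\.\d+)?$/;                 // suggested replacement

const inputs = ["..", ".4.6", "-4.6", "42", "4."];

// One row per input: [input, accepted by original, accepted by numeric].
const table = inputs.map((s) => [s, original.test(s), numeric.test(s)]);
console.log(table);
```

The original accepts all five inputs because its first token takes any non-whitespace character, while the replacement accepts only -4.6 and 42.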
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
Assert position at the beginning of the string ^
Match the regex below and capture its match into backreference number 1 ([^\s](-)?(\d+)?(\.)?(\d+)?)
Match any single character that is NOT present in the list below and that is NOT a line break character (line feed) [^\s]
A single character from the list “\s” (case sensitive) \s
Match the regex below and capture its match into backreference number 2 (-)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “-” literally -
Match the regex below and capture its match into backreference number 3 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit 0..9 \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Match the regex below and capture its match into backreference number 4 (\.)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “.” literally \.
Match the regex below and capture its match into backreference number 5 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit 0..9 \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Assert position at the very end of the string $
Match the character “/” literally /
Created with RegexBuddy
As I mentioned in my comment, [^\s] is a negated character class that matches ., and as there is another (\.)? pattern, the regex can match 2 consecutive dots (since all of the parts except for [^\s] are optional).
In order not to match strings like .4.5 or .. you just need to add the . to the [^\s] negated character class:
^([^\s.](-)?(\d+)?(\.)?(\d+)?)$
      ^
See demo. This will not let any . in the initial capturing group.
You can use a lookahead to only disallow the first character as a dot:
^(?!\.)([^\s](-)?(\d+)?(\.)?(\d+)?)$
See another demo
All explanation is available at the online regex testers:
In order to match the numbers in the format you expect, use:
^(?:[-]?\d+\.?\d*|-)$
Human-readable explanation:
^ - start of string and then there are 2 alternatives...
[-]? - optional hyphen
\d+ - 1 or more digits
\.? - optional dot
\d* - 0 or more digits
| -OR-
- - a hyphen
$ - end of string
See demo
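The explanation above can be checked with a handful of inputs. This is a sketch only; the name looseNumber and the sample strings are assumptions made for the example.

```javascript
const looseNumber = /^(?:[-]?\d+\.?\d*|-)$/;

// Sample input mapped to whether the pattern should accept it.
const checks = {
  "-": true,      // a lone hyphen is the second alternative
  "12": true,     // digits only
  "12.": true,    // the dot is optional and may end the string
  "-3.5": true,   // hyphen, digits, dot, digits
  ".5": false,    // at least one digit must come before the dot
  "..": false,
  ".4.6": false,
  "3.5.1": false, // only one dot is allowed
};

for (const [input, expected] of Object.entries(checks)) {
  console.log(JSON.stringify(input), looseNumber.test(input) === expected ? "ok" : "MISMATCH");
}
```

Accepting - and 12. suggests the pattern is aimed at input that is still being typed; for validating a finished number, a stricter form that requires digits after the dot may be a better fit.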
I see this line of code and the regular expression just panics me...
quickExpr = /^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)/
Can someone please explain little by little what it does?
Thanks,G
Here's what I can extract:
^ beginning of string.
(?: non-capturing group.
[^#<]* any number of consecutive characters that aren't # or <.
(<[\w\W]+>) a group that matches strings like <anything_goes_here>.
[^>]* any number of characters in sequence that aren't a >.
The part after the | denotes a second alternative to try if the first one fails. That one is #([\w\-]*)$:
# matches the # character. Not that complex.
([\w\-]*) is a group that matches any number of word characters or dashes. Basically Things-of-this-form
$ marks the end of the string.
I'm no regex pro, so please correct me if I am wrong.
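One way to check the reading above is to look at which capture group participates for different inputs. The sketch below assumes the expression is meant to tell HTML-like strings apart from #id strings; the classify helper, its return shape, and the sample inputs are made up for illustration.

```javascript
const quickExpr = /^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)/;

// Hypothetical helper: report which alternative matched and what it captured.
function classify(s) {
  const m = quickExpr.exec(s);
  if (!m) return { kind: "other" };
  if (m[1] !== undefined) return { kind: "html", value: m[1] }; // first alternative
  return { kind: "id", value: m[2] };                           // second alternative
}

console.log(classify("  <p>text</p>  ")); // markup with surrounding whitespace
console.log(classify("#my-id"));          // hash followed by word characters or dashes
console.log(classify("div.foo"));         // neither form
```

Because [\w\W]+ is greedy, group 1 runs from the first < to the last > in the string, so it captures the whole tag sequence rather than just the opening tag.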
^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)
Assert position at the start of the string «^»
Match the regular expression below «(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)»
Match either the regular expression below (attempting the next alternative only if this one fails) «[^#<]*(<[\w\W]+>)[^>]*$»
Match a single character NOT present in the list "#<" «[^#<]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below and capture its match into backreference number 1 «(<[\w\W]+>)»
Match the character "<" literally «<»
Match a single character present in the list below «[\w\W]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a "word character" (letters, digits, etc.) «\w»
Match a single character that is a "non-word character" «\W»
Match the character ">" literally «>»
Match any character that is not a ">" «[^>]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «#([\w\-]*)$»
Match the character "#" literally «#»
Match the regular expression below and capture its match into backreference number 2 «([\w\-]*)»
Match a single character present in the list below «[\w\-]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a "word character" (letters, digits, etc.) «\w»
A - character «\-»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Created with RegexBuddy