Javascript Regex Input Validation to Prevent Duplicate Characters


I am attempting to validate text input with the following requirements:
allowed characters & length /^\w{8,15}$/
must contain /[a-z]+/
must contain /[A-Z]+/
must contain /[0-9]+/
must not contain consecutive repeated characters (i.e. aba=pass and aab=fail)
Each test would return true when used with .test().
With modest familiarity, I am able to write the first 4 tests, albeit individually. The 5th test is not working out; a negative lookahead (which is what I believe I need to be using) is proving challenging.
Here are a few value/result examples:
re.test("Fail1");//returns false, too short
re.test("StringFailsRule1");//returns false, too long
re.test("Fail!");//returns false, invalid !
re.test("FAILRULE2");//returns false, missing [a-z]+
re.test("failrule3");//returns false, missing [A-Z]+
re.test("failRuleFour");//returns false, missing [0-9]+
re.test("failRule55");//returns false, repeat of "5"
re.test("TestValue1");//returns true
Finally, the ideal would be a single combined test used to enforce all requirements.

This uses negative and positive lookaheads (zero-length assertions) for your tests, and the \w{8,15} part validates the allowed characters and the length.
^(?!.*(.)\1)(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])\w{8,15}$
For your fifth rule I used a negative lookahead to make sure that a capture group of any character is never followed by itself.
NODE EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
. any character except \n
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\1 what was matched by capture \1
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[a-z] any character of: 'a' to 'z'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[A-Z] any character of: 'A' to 'Z'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[0-9] any character of: '0' to '9'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
\w{8,15} word characters (a-z, A-Z, 0-9, _)
(between 8 and 15 times (matching the most
amount possible))
--------------------------------------------------------------------------------
$ the end of the string (in JavaScript, without the
m flag, no trailing \n is allowed here)
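
As a quick check, the combined pattern can be run against the examples from the question. This is only a minimal sketch: the names `re` and `cases` are illustrative, and the expected values are the ones listed in the question.

```javascript
// Combined pattern from the answer above.
const re = /^(?!.*(.)\1)(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])\w{8,15}$/;

// Inputs and expected results taken from the question.
const cases = [
  ["Fail1", false],            // too short
  ["StringFailsRule1", false], // too long
  ["Fail!", false],            // invalid character (and too short)
  ["FAILRULE2", false],        // no lowercase letter
  ["failrule3", false],        // no uppercase letter
  ["failRuleFour", false],     // no digit
  ["failRule55", false],       // "5" repeated consecutively
  ["TestValue1", true],
];

for (const [input, expected] of cases) {
  const actual = re.test(input);
  const note = actual === expected ? "" : " (unexpected)";
  console.log(`${input}: ${actual}${note}`);
}
```

Note that the negative lookahead only rejects a character immediately followed by itself, which is what the aba=pass / aab=fail rule asks for; a character that reappears later in the string is still accepted.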

Related

What maximum length should I allow to avoid catastrophic backtracking?

The line is approximately 7915621 characters long and is actually the view state value of an ASPX website.
I get the original HTML of the site, then pass it line by line to the extract function, and as soon as it reaches the view_state line containing that long string, the regex becomes stuck.
Here is the regex pattern that gets stuck:
/[\w\.]+\#[\w]+(?:\.[\w]{3}|\.[\w]{2}\.[\w]{2})\b/gi
I thought about setting a maximum line length to skip this line and any others like it, but I can't think of an optimal size, as I care about false positives.

[\w\.]+ can start matching at so many positions in your document that processing all of them with your expression becomes a problem.
Reducing the number of positions where a match can start is a possible solution, e.g. by using a word boundary.
(?:\.\w{3}|\.\w{2}\.\w{2}) can be streamlined as \.\w{2}(?:\w|\.\w{2}).
Use
/\b[\w.]+#\w+\.\w{2}(?:\w|\.\w{2})\b/gi
Or, get rid of the brackets
/\b\w+(?:\.\w+)*#\w+\.\w{2}(?:\w|\.\w{2})\b/gi
EXPLANATION
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
# '#'
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\w{2} word characters (a-z, A-Z, 0-9, _) (2
times)
--------------------------------------------------------------------------------
(?: group, but do not capture:
--------------------------------------------------------------------------------
\w word characters (a-z, A-Z, 0-9, _)
--------------------------------------------------------------------------------
| OR
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
\w{2} word characters (a-z, A-Z, 0-9, _) (2
times)
--------------------------------------------------------------------------------
) end of grouping
--------------------------------------------------------------------------------
\b the boundary between a word char (\w) and
something that is not a word char
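
The second pattern can be tried out as below. This is a rough sketch, not a benchmark: the sample text is made up (the real input is not shown in the question), and the long run of `a` characters is only a stand-in for the view-state line. It shows that the pattern still finds the intended tokens and returns no match on a long line without `#`; it does not measure timing.

```javascript
// Second suggested pattern from the answer above.
const pattern = /\b\w+(?:\.\w+)*#\w+\.\w{2}(?:\w|\.\w{2})\b/gi;

// Hypothetical sample text, invented for illustration.
const sample = "see john.doe#example.com and jane#site.co.uk for details";
const found = sample.match(pattern);
console.log(found); // [ 'john.doe#example.com', 'jane#site.co.uk' ]

// Stand-in for a very long line of word characters containing no '#'.
// With the leading \b, a match attempt can only begin at a word boundary,
// so the interior positions of this run are rejected immediately.
const longLine = "a".repeat(50000);
const longResult = longLine.match(pattern);
console.log(longResult); // null
```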

I want to limit number of subdomain in Regular Expression

I want to limit subdomains to 3 levels only. The regex below fails:
([\.]?[a-z]*){3}
My Target: abc.def.ghi
but the regex above also accepts abc.def.ghi. (notice the trailing .)

Use
^(?:[a-z]+(?:\.[a-z]+){0,2})?$
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
[a-z]+ any character of: 'a' to 'z' (1 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (between 0 and
2 times (matching the most amount
possible)):
--------------------------------------------------------------------------------
\. '.'
--------------------------------------------------------------------------------
[a-z]+ any character of: 'a' to 'z' (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
){0,2} end of grouping
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
$ the end of the string (in JavaScript, without the
m flag, no trailing \n is allowed here)
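
A short sketch of how the pattern behaves on a few inputs; the variable names and the sample strings other than abc.def.ghi are illustrative.

```javascript
// Pattern from the answer above: one to three lowercase labels separated by dots.
const upToThreeLabels = /^(?:[a-z]+(?:\.[a-z]+){0,2})?$/;

const samples = [
  "abc",             // true
  "abc.def",         // true
  "abc.def.ghi",     // true
  "abc.def.ghi.",    // false: trailing dot
  "abc.def.ghi.jkl", // false: four levels
  ".abc",            // false: leading dot
  "",                // true: the outer group is optional
];

for (const s of samples) {
  console.log(JSON.stringify(s), upToThreeLabels.test(s));
}
```

Because the outer group is optional, the empty string is accepted. If that is not wanted, dropping the outer `(?:...)?` wrapper, i.e. `^[a-z]+(?:\.[a-z]+){0,2}$`, requires at least one label.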

Want to remove / ( -) . from phone number strings

I want to remove symbols from phone numbers. Sometimes the number is in the format 151-454-6545, sometimes (545)-(564)-(5465), and sometimes 548.445.8454. I am using
val.replace(/(\d{3})(\d{3})(\d{4})/, '($1) -$2-$3')
for the replacement, but it doesn't remove the dots. How can I remove the dots as well? The expected output looks like 545-455-4545.

I suggest using a non-digit expression to replace the symbols with a '-' string:
val.replace(/^\D+/, '')
.replace(/\D+$/, '')
.replace(/\D+/g, '-')
Let me know if it does what you need.
EDIT: the first two replacements trim leading and trailing non-digits (including whitespace).
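
The three chained replacements can be wrapped in a small helper to see what they do on the formats from the question. The function name `dashSeparate` is hypothetical and only used for this sketch.

```javascript
// Hypothetical helper wrapping the three replacements above.
function dashSeparate(val) {
  return val
    .replace(/^\D+/, "")   // drop leading non-digits
    .replace(/\D+$/, "")   // drop trailing non-digits
    .replace(/\D+/g, "-"); // turn each remaining run of non-digits into one dash
}

console.log(dashSeparate("151-454-6545"));       // 151-454-6545
console.log(dashSeparate("(545)-(564)-(5465)")); // 545-564-5465
console.log(dashSeparate("548.445.8454"));       // 548-445-8454
console.log(dashSeparate("5484458454"));         // 5484458454
```

As the last line shows, this approach only normalizes separators that are already present; it does not insert dashes into an unseparated number and does not check the 3-3-4 digit grouping.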

Here is a version with only one regex:
https://regex101.com/r/Wavw45/1
regex
[^\d\n]*(\d{3})[^\d\n]+(\d{3,4})[^\d\n]+(\d{4})[^\d\n]*
replace (or whatever pattern you want)
($1) -$2-$3

Use
.replace(/^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*$/, '$1-$2-$3')
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
\d{3} digits (0-9) (3 times)
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \2:
--------------------------------------------------------------------------------
\d{3} digits (0-9) (3 times)
--------------------------------------------------------------------------------
) end of \2
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
( group and capture to \3:
--------------------------------------------------------------------------------
\d{4} digits (0-9) (4 times)
--------------------------------------------------------------------------------
) end of \3
--------------------------------------------------------------------------------
\D* non-digits (all but 0-9) (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
$ the end of the string (in JavaScript, without the
m flag, no trailing \n is allowed here)
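
For illustration, the single replacement can be wrapped in a helper and applied to the formats from the question. The name `formatPhone` is hypothetical; the last two inputs are made-up examples of strings the pattern does not match.

```javascript
// Hypothetical helper around the replacement from the answer above.
function formatPhone(val) {
  return val.replace(/^\D*(\d{3})\D*(\d{3})\D*(\d{4})\D*$/, "$1-$2-$3");
}

console.log(formatPhone("151-454-6545"));       // 151-454-6545
console.log(formatPhone("(545)-(564)-(5465)")); // 545-564-5465
console.log(formatPhone("548.445.8454"));       // 548-445-8454
console.log(formatPhone("5484458454"));         // 548-445-8454

// Strings that do not consist of exactly 3 + 3 + 4 digits are returned unchanged.
console.log(formatPhone("12345"));              // 12345
console.log(formatPhone("151-454-6545 ext 9")); // 151-454-6545 ext 9
```

Since the pattern is anchored at both ends, `replace` leaves non-matching input untouched, so a caller that needs validation would have to compare the result against the expected format separately.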

Regex that allows PascalCased with one or more consecutive uppercase letters and numbers

I'm trying to check user input for PascalCased names and the check below works, but I would also like to allow one or more consecutive uppercase letters, e.g. UNOrganization, and numbers in between, e.g. A2BOrganization.
So all of the following should be allowed: ABCWord, A3BWord, OtherWord, Word3, Word3AB (last unlikely but if possible fine)
if (value.match(/^[A-Z][a-z]+(?:[A-Z][a-z]+)*$/)) {
//logic here
}
Regex is a little beyond me, and the logic to parse the strings by hand would be too long for my needs. I know this can be done in a one-liner with regex, so hopefully someone more savvy can help me.

Use
^[A-Z]+[a-z]*(?:\d*(?:[A-Z]+[a-z]*)?)*$
If you require at least one lowercase letter in the input string:
^(?=.*[a-z])[A-Z]+[a-z]*(?:\d*(?:[A-Z]+[a-z]*)?)*$
Explanation
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
.* any character except \n (0 or more times
(matching the most amount possible))
--------------------------------------------------------------------------------
[a-z] any character of: 'a' to 'z'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
[A-Z]+ any character of: 'A' to 'Z' (1 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
[a-z]* any character of: 'a' to 'z' (0 or more
times (matching the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
\d* digits (0-9) (0 or more times (matching
the most amount possible))
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
[A-Z]+ any character of: 'A' to 'Z' (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
[a-z]* any character of: 'a' to 'z' (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
)? end of grouping
--------------------------------------------------------------------------------
)* end of grouping
--------------------------------------------------------------------------------
$ the end of the string (in JavaScript, without the
m flag, no trailing \n is allowed here)
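
A small sketch comparing the two variants on the names from the question plus a few extra inputs chosen for illustration. The names `basic` and `needsLower` are hypothetical.

```javascript
// The two patterns from the answer above.
const basic = /^[A-Z]+[a-z]*(?:\d*(?:[A-Z]+[a-z]*)?)*$/;
const needsLower = /^(?=.*[a-z])[A-Z]+[a-z]*(?:\d*(?:[A-Z]+[a-z]*)?)*$/;

// Names the question wants to allow: both variants accept all of them.
const allowed = ["UNOrganization", "A2BOrganization", "ABCWord", "A3BWord",
                 "OtherWord", "Word3", "Word3AB"];
for (const name of allowed) {
  console.log(name, basic.test(name), needsLower.test(name)); // true true
}

// Extra inputs, picked for illustration.
console.log(basic.test("word"), needsLower.test("word")); // false false (starts lowercase)
console.log(basic.test("9A"), needsLower.test("9A"));     // false false (starts with a digit)
console.log(basic.test("AAA"), needsLower.test("AAA"));   // true false (no lowercase letter)
```

The only difference between the variants is the leading lookahead, so they disagree exactly on inputs without any lowercase letter, such as AAA.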

/^(?=.*[a-z])[A-Z]\d*[a-z]*(([A-Z\d]*[A-Z]|[A-Z][A-Z\d]*|\d+$)[a-z]*)*$/
seems to fit your requirements. See tests below.
The idea is to start with a capital letter followed by zero or more digits and zero or more lowercase letters. Then enter the main group, repeated zero or more times. This group matches one or more capital letters or digits followed by zero or more lowercase letters. The alternation handles allowing digits at the end and disallowing lone digits sandwiched between two lowercase characters.
const pat = /^(?=.*[a-z])[A-Z]\d*[a-z]*(([A-Z\d]*[A-Z]|[A-Z][A-Z\d]*|\d+$)[a-z]*)*$/;
const tests = [
"UNOrganization",
"A2BOrganization",
"ABCWord",
"A3BWord",
"OtherWord",
"Word3",
"Word3AB",
"a",
"A",
"Aa",
"aA",
"AAA",
"aaa",
"9A",
"A4",
"A4a",
"AaA",
"AaAa",
"Aa1",
"AaA1",
"Aa1a",
"",
];
const len = 2 + Math.max(...tests.map(e => e.length));
tests.forEach(e =>
console.log(`${`'${e}'`.padStart(len)} => ${pat.test(e)}`)
);

Another pattern:
^[A-Z]+[A-Za-z]*(?:[0-9]+(?:[A-Z]+[A-Za-z]*)*)*$
[A-Z]+[A-Za-z]*: at least one capital letter, optionally followed by a sequence of letters of any case, ...
(?: ... )*: ... optionally followed by a sequence of:
[0-9]+: one or more digits...
(?: ... )*: optionally followed by a sequence of:
[A-Z]+[A-Za-z]*: at least one capital letter, optionally followed by a sequence of letters of any case
Regex101.com working examples (replaced ^ and $ by word delimiters \b).
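
This pattern can be exercised the same way. The sketch below uses the anchored form shown above; the name `alt` and the rejected inputs are illustrative choices.

```javascript
// Anchored pattern from this answer.
const alt = /^[A-Z]+[A-Za-z]*(?:[0-9]+(?:[A-Z]+[A-Za-z]*)*)*$/;

// Names from the question: all accepted.
for (const name of ["ABCWord", "A3BWord", "OtherWord", "Word3", "Word3AB"]) {
  console.log(name, alt.test(name)); // true
}

// After a digit run, the next letter has to be a capital.
console.log(alt.test("Word3ab")); // false
console.log(alt.test("3Word"));   // false: must start with a capital letter
console.log(alt.test("word"));    // false: must start with a capital letter
console.log(alt.test("A"));       // true: no lowercase letter is required
```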

Regex to exclude an entire line match if certain characters found

I'm stuck on the cleanest way to accomplish two bits of regex. Every solution I've come up with so far seems clunky.
Example text
Match: Choose: blah blah blah 123 for 100'ish characters, this matches
NoMatch: Choose: blah blah blah 123! for 100'ish characters?, .this potential match fails for the ! ? and .
The first regex (?:^\w+?:)(((?![.!?]).)*)$ needs to:
Match a line containing any word followed by a : so long as !?. are not found in the same line (the word: will always be at the beginning of a line)
Ideally, match every part of the line from the example EXCEPT Choose:. Matching the whole line is still a win.
The second regex ^(^\w+?:)(?:(?![.!?]).)*$ needs to:
Match a line containing any word followed by a : so long as !?. are not found in the same line (the word: will always be at the beginning of a line)
Match only Choose:
The regex is in a greasemonkey/tampermonkey script.

Use
^\w+:(?:(?!.*[.!?])(.*))?
EXPLANATION
NODE EXPLANATION
--------------------------------------------------------------------------------
^ the beginning of the string
--------------------------------------------------------------------------------
\w+ word characters (a-z, A-Z, 0-9, _) (1 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
: ':'
--------------------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
--------------------------------------------------------------------------------
(?! look ahead to see if there is not:
--------------------------------------------------------------------------------
.* any character except \n (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
[.!?] any character of: '.', '!', '?'
--------------------------------------------------------------------------------
) end of look-ahead
--------------------------------------------------------------------------------
( group and capture to \1:
--------------------------------------------------------------------------------
.* any character except \n (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
) end of \1
--------------------------------------------------------------------------------
)? end of grouping
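
A minimal sketch of what this pattern returns for the two example lines from the question (the second line is shortened slightly). The variable names are illustrative.

```javascript
// Pattern from the answer above.
const linePattern = /^\w+:(?:(?!.*[.!?])(.*))?/;

const good = "Choose: blah blah blah 123 for 100'ish characters, this matches";
const bad = "Choose: blah blah blah 123! for 100'ish characters?, .this potential match fails";

const m1 = good.match(linePattern);
console.log(m1[0] === good); // true: the whole line is matched
console.log(m1[1]);          // everything after "Choose:" (including the leading space)

const m2 = bad.match(linePattern);
console.log(m2[0]);          // Choose:
console.log(m2[1]);          // undefined: the optional group did not participate
```

Because the group after the colon is optional, the pattern still matches the `word:` prefix on a line containing `.`, `!` or `?`. A caller that wants to skip such lines entirely would check whether group 1 is defined.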

Does this do what you want?
(?:^\w+:)((?:(?![!?.]).)*)$
What makes you feel that this is clunky?
(?: ... ) non-capturing group
^ start with
\w+: a series of one or more word characters followed by a :
( ... )$ capturing group that continues to the end
(?: ... )* non-capturing group, repeated zero or more times, with
(?! ... ) negative look-ahead: no following character can be
[!?.] either ?, ! or .
. followed by any character
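
For comparison, a short sketch of this pattern on the same two kinds of line. The sample strings are shortened versions of the question's examples and the variable names are illustrative.

```javascript
// Pattern from this answer: each character after the colon is only consumed
// if it is not one of ! ? or .
const tempered = /(?:^\w+:)((?:(?![!?.]).)*)$/;

const okLine = "Choose: blah blah blah 123, this matches";
const badLine = "Choose: blah blah blah 123! this fails";

const okMatch = okLine.match(tempered);
console.log(okMatch[1]); // " blah blah blah 123, this matches"

const badMatch = badLine.match(tempered);
console.log(badMatch);   // null: the $ anchor cannot be reached past the "!"
```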

For the first pattern, you could first check that there is no ! ? or . present using a negative lookahead. Then capture in the first group 1+ word chars and : and the rest of the line in group 2.
^(?![^!?.\n\r]*[!?.])(\w+:)(.*)$
^ Start of string
(?! Negative lookahead, assert what is on the right is not
[^!?.\n\r]*[!?.] Match 0+ times any char except the listed ones using a negated character class, then match either ! ? or .
) Close lookahead
(\w+:) Capture group 1, match 1+ word chars and a colon
(.*) Capture group 2, match any char except a newline 0+ times
$ End of string
For the second part, if you want a match only for Choose:, you could use the negative lookahead only without a capturing group.
^(?![^!?.\n\r]*[!?.])\w+:
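
Both patterns can be applied to a multi-line text as sketched below. The `g` and `m` flags are an assumption added here so that `^` and `$` work per line; the answer does not state which flags it uses. The sample text and variable names are made up.

```javascript
// Made-up input with one line that contains a forbidden character.
const text = [
  "Choose: blah blah 123, this matches",
  "Choose: blah blah 123! this fails",
  "Pick: another line without the listed punctuation",
].join("\n");

// First pattern: prefix in group 1, rest of the line in group 2.
// The g and m flags are an assumption, not part of the answer.
const withRest = /^(?![^!?.\n\r]*[!?.])(\w+:)(.*)$/gm;
const rows = [...text.matchAll(withRest)].map((m) => [m[1], m[2]]);
console.log(rows);
// [ [ 'Choose:', ' blah blah 123, this matches' ],
//   [ 'Pick:', ' another line without the listed punctuation' ] ]

// Second pattern: match only the prefix of the accepted lines.
const prefixOnly = /^(?![^!?.\n\r]*[!?.])\w+:/gm;
const prefixes = text.match(prefixOnly);
console.log(prefixes); // [ 'Choose:', 'Pick:' ]
```

The negated class excludes `\n` and `\r`, so the lookahead never scans past the current line, which is what makes the per-line use with the `m` flag work.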
