regular expressions - javascript

Is it possible to merge these two regular expressions?
/^[A-Za-z]\S{3,30}$/
/^(?:(\w)(?!\1\1))+$/
I would like a string of:
only letters,
length between 3 and 30,
no spaces,
no letter repeated more than twice in a row (e.g. 'ddd' // false, 'dtdddyyt' // false, 'dtddyyt' // true).

You can use the second pattern as a negative lookahead assertion once to not match ddd in the string.
^(?!\S*(\w)\1\1)[A-Za-z]\S{3,30}$
^ Start of string
(?! Negative lookahead
\S*(\w)\1\1 Match optional non whitespace chars, capture a word char and match the same with 2 backreferences
) Close lookahead
[A-Za-z]\S{3,30} Match a single char A-Za-z and 3-30 non whitespace chars
$ End of string
const regex = /^(?!\S*(\w)\1\1)[A-Za-z]\S{3,30}$/;
[
"ddad",
"dtddyyt",
"adaddd",
"dtdddyyt",
"ddd"
].forEach(s => console.log(`${s} --> ${regex.test(s)}`));
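One caveat worth checking: as written, [A-Za-z]\S{3,30} accepts 4 to 31 characters and allows any non-whitespace character after the first letter, which looks looser than the requirements in the question. The sketch below compares it with a stricter variant; strictRegex and the extra test strings are assumptions made for illustration, not part of the original answer.

```javascript
// Pattern from the answer above.
const looseRegex = /^(?!\S*(\w)\1\1)[A-Za-z]\S{3,30}$/;
// Hypothetical stricter variant: letters only, 3 to 30 of them,
// and no letter three times in a row.
const strictRegex = /^(?![A-Za-z]*([A-Za-z])\1\1)[A-Za-z]{3,30}$/;

// "abc" has only 3 chars, "a1b2" contains digits.
const samples = ["abc", "a1b2", "dtddyyt", "dtdddyyt"];
samples.forEach(s =>
  console.log(`${s} --> loose: ${looseRegex.test(s)}, strict: ${strictRegex.test(s)}`)
);
```

If the looser behaviour is what you want, keep the original pattern; the stricter one only narrows the character set and the length bounds.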

You can use
/^(?:([A-Za-z])(?!\1{2})){3,30}$/
/^(?:(\p{Alphabetic})(?!\1{2})){3,30}$/u
Note:
only letters - [A-Za-z] / \p{L} or \p{Alphabetic} (available in ECMAScript 2018+ compliant JS environments with /u flag)
length between 3 and 30 - {3,30}
no spaces - this condition is already covered by the first one
no to the repetition of the same letter more than two consecutive times - ([A-Za-z])(?!\1{2}).
JavaScript test:
const rx = /^(?:([A-Za-z])(?!\1{2})){3,30}$/;
const texts = ["ddad","dtddyyt","adaddd","dtdddyyt","ddd"];
for (let text of texts) {
console.log(text, "=>", rx.test(text));
}
More details:
^ - start of string
(?: - start of a non-capturing group (used as a container of a pattern sequence):
([A-Za-z]) - Group 1: an ASCII letter (\p{L} / \p{Alphabetic} matches any Unicode letter)
(?!\1{2}) - right after the letter, there should not be two occurrences of the same letter
) - end of the group
{3,30} - match three to thirty consecutive occurrences of the pattern sequence inside the non-capturing group
$ - end of string.
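To see what the Unicode-aware form changes in practice, here is a minimal sketch using \p{L} with the u flag next to the ASCII-only pattern. The sample strings are assumptions chosen for illustration, and the accented letters are written as escapes so that each one is a single precomposed code point.

```javascript
// Unicode-aware variant; \p{L} requires the u flag (ES2018+).
const unicodeRx = /^(?:(\p{L})(?!\1{2})){3,30}$/u;
// ASCII-only variant for comparison.
const asciiRx = /^(?:([A-Za-z])(?!\1{2})){3,30}$/;

const words = [
  "\u00C9l\u00E9a",       // "Éléa": four letters, no triple repeat
  "\u00E9\u00E9\u00E9",   // "ééé": the same letter three times in a row
  "ab1",                  // contains a digit
];
for (const word of words) {
  console.log(word, "=> unicode:", unicodeRx.test(word), "ascii:", asciiRx.test(word));
}
```

Note that the backreference compares code points exactly, so an uppercase and a lowercase form of the same letter do not count as a repetition unless you also add the i flag.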


How do I use RegEx to find parens that contain BOTH numbers and letters, not just one or the other

In this example...
(5) (dogs) (5 dogs) (dogs 5)
I would like to only match to...
(5 dogs) -or- (dogs 5)
The numbers could be any number of digits, contain commas, decimal points, math operators, dollar signs, etc. The only thing I need to pay attention to is that there are both numbers and alpha characters present.
I started with this modification of an example provided by hrs using this for the RegEx...
\(((letter).*(number))\)|((number).*(letter))\)
to only capture this...
(number letter) -or- (letter number)
but not...
(number) (letter)
by modifying the expression to be...
\(((^[a-zA-Z]).*(^[0-9]))\)|((^[0-9]).*(^[a-zA-Z]))\)
...but obviously I don't know what I'm doing.
You can use positive lookaheads to assert that each set of parentheses contains both a digit and a letter:
\((?=[^)\d]*\d)(?=[^)a-z]*[a-z])[^)]+\)
The first lookahead, (?=[^)\d]*\d), asserts that a digit follows any number of characters that are neither ) nor a digit; the second, (?=[^)a-z]*[a-z]), does the same for a letter. Both must succeed before [^)]+ matches the characters between the ( and ).
In Javascript:
const str = '(5) (dogs) (5 dogs) (dogs 5)'
const regex = /\((?=[^)\d]*\d)(?=[^)a-z]*[a-z])[^)]+\)/ig
console.log(str.match(regex))
As an alternative with a single lookahead:
\((?=[^)a-z]*[a-z])[^\d)]*\d[^)]*\)
Explanation
\( Match (
(?= Positive lookahead
[^)a-z]*[a-z] Match any char except ) or a-z, then match a-z
) Close the lookahead
[^\d)]*\d Match any char except a digit or ) and then match a digit
[^)]* Match any char except )
\) Match )
const s = '(5) (dogs) (5 dogs) (dogs 5)';
const regex = /\((?=[^)a-z]*[a-z])[^\d)]*\d[^)]*\)/ig;
console.log(s.match(regex));
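If you need only the text between the parentheses, one option is to wrap [^)]+ in a capturing group and read it with matchAll. This is a sketch under that assumption: the capturing group, the names parenRx and contents, and the extra "($1,000.50 total)" sample are additions for illustration.

```javascript
const text = "(5) (dogs) (5 dogs) (dogs 5) ($1,000.50 total)";
// Same two-lookahead pattern, with a capturing group around the contents.
const parenRx = /\((?=[^)\d]*\d)(?=[^)a-z]*[a-z])([^)]+)\)/gi;

// matchAll requires the g flag; m[1] is the text between ( and ).
const contents = Array.from(text.matchAll(parenRx), m => m[1]);
console.log(contents);
```

Commas, decimal points and dollar signs pass through because the lookaheads only require one digit and one letter somewhere before the closing parenthesis.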

Regex for an alphanumeric string of specific length with at least one letter and at least one digit

I want to match alphanumeric string of specific length with at least one letter and at least one digit.
For example, adfyg432 should match: it contains both letters and digits, and the minimum length is 8.
I used this expression but it won't work:
^([A-Za-z]{1,}\d{1,}){8,}$
Your current pattern repeats a group 8 or more times. That group by itself matches 1 or more chars a-z followed by 1 or more digits.
That means that the minimum string length to match is 16 chars in pairs of at least 2. So for example a string like a1aa1a1a1a1a1a1a1 would match.
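A quick check of the pattern from the question illustrates this; the third test string is an assumption added to show the 16-character minimum.

```javascript
// Pattern from the question: a "letters then digits" group repeated 8+ times.
const original = /^([A-Za-z]{1,}\d{1,}){8,}$/;

// Only one letters+digits group, so the {8,} quantifier cannot be satisfied.
console.log(original.test("adfyg432"));
// Eight groups: a1, aa1, then a1 six more times.
console.log(original.test("a1aa1a1a1a1a1a1a1"));
// Eight two-character groups, the shortest possible match (16 chars).
console.log(original.test("a1b2c3d4e5f6g7h8"));
```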
You could write the pattern using 2 lookahead assertions to assert a length of at least 8 and assert at least a char a-z.
Then match at least a digit. Using a case insensitive match:
^(?=[a-z\d]{8,}$)(?=\d*[a-z])[a-z]*\d[a-z\d]*$
In parts, the pattern matches:
^ Start of string
(?=[a-z\d]{8,}$) Positive lookahead, assert 8 or more chars a-z or digits till end of string
(?=\d*[a-z]) Positive lookahead to assert at least a char a-z
[a-z]* Match optional chars a-z
\d Match at least a single digit
[a-z\d]* Match optional chars a-z or digits
$ End of string
const regex = /^(?=[a-z\d]{8,}$)(?=\d*[a-z])[a-z]*\d[a-z\d]*$/i;
[
"AdfhGg432",
"Abc1aaa"
].forEach(s =>
console.log(`Match "${s}": ${regex.test(s)}`)
)
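If the minimum length varies, the same pattern can be built from a parameter. The helper below is a hypothetical sketch (makeValidator and its argument check are not part of the answer); it only substitutes the number into the first lookahead.

```javascript
// Hypothetical helper: builds the pattern above for a given minimum length.
function makeValidator(minLength) {
  // At least one letter plus one digit means the minimum cannot be below 2.
  if (!Number.isInteger(minLength) || minLength < 2) {
    throw new RangeError("minLength must be an integer >= 2");
  }
  // Backslashes are doubled because the pattern is built from a string.
  return new RegExp(
    `^(?=[a-z\\d]{${minLength},}$)(?=\\d*[a-z])[a-z]*\\d[a-z\\d]*$`,
    "i"
  );
}

const atLeast8 = makeValidator(8);
["adfyg432", "abcdefgh", "12345678", "Abc1aaa"].forEach(s =>
  console.log(`Match "${s}": ${atLeast8.test(s)}`)
);
```

Here "abcdefgh" has no digit, "12345678" has no letter, and "Abc1aaa" is only 7 characters long, so only the first string should pass.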

positive lookahead

Use lookaheads to match a string that is greater than 5 characters long and has two consecutive digits.
I know the solution should be
/(?=\w{6,})(?=\D*\d{2})/
But why the second element is
(?=\D*\d{2})
Instead of
(?=\d{2})
Please help me to understand this.
Actually, /(?=\w{6,})(?=\D*\d{2})/ does not ensure there will be a match in a string with 2 consecutive digits.
Check this demo:
var reg = /(?=\w{6,})(?=\D*\d{2})/;
console.log(reg.test("Matches are found 12."))
console.log(reg.test("Matches are not found 1 here 12."))
This happens because \D* only matches non-digit chars, and once the \w{6,} matches, (?=\D*\d{2}) wants to find the two digits right after any 0+ non-digit chars. That is not the case in the second string, where a single digit appears before the two consecutive ones.
So, (?=\w{6,})(?=\D*\d{2}) matches a location in the string that is immediately followed with 6 or more word chars and any 0+ non-digit chars followed with 2 digits.
The correct regex to validate if a string contains 6 or more word chars and two consecutive digits anywhere in the string is
var reg = /^(?=.*\w{6,})(?=.*\d{2})/;
Or, to support multiline strings:
var reg = /^(?=[^]*\w{6,})(?=[^]*\d{2})/;
where [^] matches any char. Also, [^] can be replaced with [\s\S] / [\d\D] or [\w\W].
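The difference only shows up when the input contains line breaks. The sketch below compares the two patterns on an assumed two-line string; the third variant uses the s (dotAll) flag from ECMAScript 2018, which is another way to let . match line breaks and is not mentioned in the answer.

```javascript
// Assumed sample: the 6+ word chars and the two digits are on the second line.
const multiline = "some words\nabcdef 12";

const dotRx = /^(?=.*\w{6,})(?=.*\d{2})/;      // . stops at line breaks
const anyRx = /^(?=[^]*\w{6,})(?=[^]*\d{2})/;  // [^] matches any char
const dotAllRx = /^(?=.*\w{6,})(?=.*\d{2})/s;  // s flag: . also matches line breaks

console.log(dotRx.test(multiline));
console.log(anyRx.test(multiline));
console.log(dotAllRx.test(multiline));
```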
And to match a string that is greater than 5 characters long and has two consecutive digits, you may use
var reg = /^(?=.*\d{2}).{6,}$/
or, to support multiline strings:
var reg = /^(?=[\s\S]*\d{2})[\s\S]{6,}$/
where
^ - start of string
(?=[\s\S]*\d{2}) - there must be two digits anywhere after 0+ chars to the right of the current location
[\s\S]{6,} - six or more chars (that is, more than 5)
$ - end of string.
The lookahead has to allow the 2 digits anywhere in the input. If you used just (?=\d{2}) then the 2 digits would have to be at the beginning.
You could also use (?=.*\d{2}). The point is that \d{2} has to be preceded by something that can match the rest of the input before the digits.
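A side-by-side run may make the difference clearer. In this sketch all three patterns are anchored with ^ so that they describe the whole string; that anchor, the variable names and the sample strings are assumptions added for illustration.

```javascript
const onlyAtStart = /^(?=\w{6,})(?=\d{2})/;       // digits must be the first two chars
const afterNonDigits = /^(?=\w{6,})(?=\D*\d{2})/; // digits may follow non-digits only
const anywhere = /^(?=\w{6,})(?=.*\d{2})/;        // digits may follow anything

for (const s of ["12abcd", "abcd12", "a1bc23"]) {
  console.log(
    s,
    "=> start:", onlyAtStart.test(s),
    "after non-digits:", afterNonDigits.test(s),
    "anywhere:", anywhere.test(s)
  );
}
```

The last string is the interesting one: the lone 1 stops \D* before it can reach 23, so only the .* version accepts it.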

“combine” 2 regex with a logic or?

I have two patterns for javascript:
/^[A-z0-9]{10}$/ - string of exactly length of 10 of alphanumeric symbols.
and
/^\d+$/ - any number of at least length of one.
How do I write a single expression that matches either a 10-character alphanumeric string or a number of any length?
var pattern = /^([A-z0-9]{10})|(\d+)$/;
doesn't work for some reason. It passes at least
pattern.test("123kjhkjhkj33f"); // true
which is not number and not of length of 10 for A-z0-9 string.
Note that your ^([A-z0-9]{10})|(\d+)$ pattern matches either 10 chars from the A-z0-9 ranges at the start of the string (the ^ only applies to the first alternative, ([A-z0-9]{10})), or (|) 1 or more digits at the end of the string (the $ only applies to the second alternative, (\d+)).
Also note that A-z is a typo: [A-z] matches more than ASCII letters, because the range also covers the characters [, \, ], ^, _ and ` that sit between Z and a in the ASCII table.
You need to fix it as follows:
var pattern = /^(?:[A-Za-z0-9]{10}|\d+)$/;
or with the i modifier:
var pattern = /^(?:[a-z0-9]{10}|\d+)$/i;
Note that grouping is important here: the (?:...|...) group makes both anchors apply to each alternative.
Details
^ - start of string
(?: - a non-capturing alternation group:
[A-Za-z0-9]{10} - 10 alphanumeric chars
| - or
\d+ - 1 or more digits
) - end of the grouping construct
$ - end of string
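Both points can be checked directly. In the sketch below, broken is the pattern from the question and fixed is the grouped version; the variable names and the last sample string (which uses the punctuation covered by A-z) are assumptions for illustration.

```javascript
const broken = /^([A-z0-9]{10})|(\d+)$/;   // anchors bind to one branch each
const fixed = /^(?:[A-Za-z0-9]{10}|\d+)$/; // anchors bind to the whole group

const inputs = [
  "123kjhkjhkj33f", // 14 chars, not a number
  "abcdefghij",     // exactly 10 alphanumeric chars
  "12345",          // a number
  "abc_[]^`xy",     // 10 chars, but with punctuation from the A-z range
];
for (const input of inputs) {
  console.log(input, "=> broken:", broken.test(input), "fixed:", fixed.test(input));
}
```

The first and last inputs pass the broken pattern but are rejected by the fixed one.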

remove unwanted groups of characters from string using regex

Given: 1999 some text here 1.3i [more]
Needed: some text here
The following regex - replace(/[\d{4} |\d\.*$]/,'') - failed, it just removed the first digit. Any idea why and how to fix it?
var s = "1999 some text here 1.3i [more]"
console.log(s.replace(/[\d{4} |\d\.*$]/,''))
The regex you have removes the first digit only because it matches just 1 char - either a digit, {, 4, }, space, |, ., * or $ (as [...] formed a character class), just once (there is no global modifier).
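To see the character class at work, the sketch below applies it with the g flag to an assumed sample string that contains several of those literal characters, and then repeats the original non-global replace.

```javascript
// Inside [...] the quantifier, alternation and anchor symbols are plain characters.
const charClass = /[\d{4} |\d\.*$]/g;
const literalHits = "a{b|c$d*".match(charClass);
console.log(literalHits);

// Without the g flag only the first matching char (the leading "1") is removed.
const input = "1999 some text here 1.3i [more]";
const onceReplaced = input.replace(/[\d{4} |\d\.*$]/, "");
console.log(onceReplaced);
```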
You may use
/^\d{4}\s+|\s*\d\..*$/g
Basically, remove the [ and ] that form a character class, add g modifier to perform multiple replacements, and add .* (any char matching pattern) at the end.
Details:
First alternative:
- ^ - start of string
- \d{4} - 4 digits
- \s+ - 1+ whitespaces
Second alternative:
- \s* - 0+ whitespaces
- \d - a digit
- \. - a dot
- .* - any 0+ chars up to...
- $ - the end of the string
var rx = /^\d{4}\s+|\s*\d\..*$/g;
var str = "1999 some text here 1.3i [more]";
console.log(str.replace(rx, ''));
