How to match only specific characters in a given string with regex? - javascript

I want to validate a value that contains only digits, with these rules:
the length should be 11.
the first digit should be 0.
the second digit should be 1.
the third digit should be 0, 1, 2 or 5.
the fourth digit to the end can be any digit.
if the third digit is 1, then the last two digits (10th, 11th) should be the same.
if the third digit is 2, the 8th and 9th digits should be the same.
Input strings and expected results:
01012345678 -----> allowed.
0101234a5678 -----> not allowed, letter exists.
01112345688 -----> allowed, 10th and 11th are the same.
01112345677 -----> allowed, 10th and 11th are the same.
01112345666 -----> allowed, 10th and 11th are the same.
01112345689 -----> not allowed, 10th and 11th are different.
01112345-678 -----> not allowed, hyphen exists.
01298765532 -----> allowed, 8th and 9th are the same.
01298765732 -----> not allowed, 8th and 9th are different.
01298765mm432 -----> not allowed, letter exists.
01500011122 -----> allowed.
020132156456136 -----> not allowed, more than 11 digits.
01530126453333 -----> not allowed, more than 11 digits.
00123456789 -----> not allowed, second digit is not 1.
This is my attempt on regex101, ^01[0125][0-9]{8}$ (https://regex101.com/r/cIcD0R/1), but it only works for some of the cases and ignores the others.

You could make use of an alternation with 2 capture groups and backreferences:
^01(?:[05]\d{8}|1\d{6}(\d)\1|2\d{4}(\d)\2\d\d)$
Explanation
^ Start of string
01 Match literally
(?: Non capture group for the alternatives
[05]\d{8} Match either 0 or 5 and 8 digits
| Or
1\d{6}(\d)\1 Match 1, then 6 digits, capture a single digit in group 1 followed by a backreference to match the same digit
| Or
2\d{4}(\d)\2\d\d Match 2, then 4 digits, capture a single digit in group 2 followed by a backreference to match the same digit, then match the last 2 digits
) Close the non capture group
$ End of string
See a regex101 demo
const regex = /^01(?:[05]\d{8}|1\d{6}(\d)\1|2\d{4}(\d)\2\d\d)$/;
[
"01012345678",
"0101234a5678",
"01112345688",
"01112345677",
"01112345666",
"01112345689",
"01112345-678",
"01298765532",
"01298765732",
"01298765mm432",
"01500011122",
"020132156456136",
"01530126453333",
"00123456789"
].forEach(s => console.log(`${s} => ${regex.test(s)}`))
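As a quick check, the pattern can be run against the allowed/not-allowed table from the question. This is only a minimal sketch: the `cases` and `failures` names are illustrative, and the expected values are copied from the question.

```javascript
const regex = /^01(?:[05]\d{8}|1\d{6}(\d)\1|2\d{4}(\d)\2\d\d)$/;

// [input, expected] pairs taken from the question's table
const cases = [
  ["01012345678", true],
  ["0101234a5678", false],
  ["01112345688", true],
  ["01112345677", true],
  ["01112345666", true],
  ["01112345689", false],
  ["01112345-678", false],
  ["01298765532", true],
  ["01298765732", false],
  ["01298765mm432", false],
  ["01500011122", true],
  ["020132156456136", false],
  ["01530126453333", false],
  ["00123456789", false],
];

// Collect every input whose result differs from the expectation
const failures = cases.filter(([input, expected]) => regex.test(input) !== expected);
console.log(failures.length === 0 ? "all cases pass" : failures);
```

Keeping the expectations next to the inputs makes it easier to notice a regression when the pattern is changed later.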

If you're only looking for a regex to filter certain numbers, without error messaging, this answer is probably not for you.
For validation purposes, a regex might not be the best way to go. With one giant regex you can only show one universal error message, which might confuse a user whose input already meets some of the criteria.
Instead, split up the criteria so you can show the user relevant error messages.
function isValid(input, criteria) {
  const errors = [];
  for (const [isValid, error] of criteria) {
    if (!isValid(input)) errors.push(error);
  }
  return [!errors.length, errors];
}
const criteria = [
  [input => input.length === 11,
    "must have a length of 11"],
  [input => input.match(/^\d*$/),
    "must only contain digits (0-9)"],
  [input => input[0] === "0",
    "must have 0 as 1st digit"],
  [input => input[1] === "1",
    "must have 1 as 2nd digit"],
  [input => ["0","1","2","5"].includes(input[2]),
    "must have 0, 1, 2 or 5 as 3rd digit"],
  [input => input[2] !== "1" || input[9] === input[10],
    "the 10th and 11th digit must be the same if the 3rd digit is 1"],
  [input => input[2] !== "2" || input[7] === input[8],
    "the 8th and 9th digit must be the same if the 3rd digit is 2"],
];
document.forms["validate-number"].addEventListener("submit", function (event) {
  event.preventDefault();
  const form = event.target;
  const inputs = form.elements.inputs.value.split("\n");
  inputs.forEach(input => console.log(input, ...isValid(input, criteria)));
});
<form id="validate-number">
<textarea name="inputs" rows="14" cols="15">01012345678
0101234a5678
01112345688
01112345677
01112345666
01112345689
01112345-678
01298765532
01298765732
01298765mm432
01500011122
020132156456136
01530126453333
00123456789</textarea>
<br />
<button>validate</button>
</form>
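The same criteria list can also be used without a form, for example in Node. The following is a rough sketch under a few assumptions: `validate` is a hypothetical helper name, it returns an object instead of the `[valid, errors]` pair used above, and the digit check uses `RegExp.prototype.test` so the predicate returns a boolean.

```javascript
// Each criterion is a [predicate, message] pair, as in the answer above.
const criteria = [
  [input => input.length === 11, "must have a length of 11"],
  [input => /^\d*$/.test(input), "must only contain digits (0-9)"],
  [input => input[0] === "0", "must have 0 as 1st digit"],
  [input => input[1] === "1", "must have 1 as 2nd digit"],
  [input => ["0", "1", "2", "5"].includes(input[2]),
    "must have 0, 1, 2 or 5 as 3rd digit"],
  [input => input[2] !== "1" || input[9] === input[10],
    "the 10th and 11th digit must be the same if the 3rd digit is 1"],
  [input => input[2] !== "2" || input[7] === input[8],
    "the 8th and 9th digit must be the same if the 3rd digit is 2"],
];

// Hypothetical helper: run every predicate and keep the messages of those that fail.
function validate(input) {
  const errors = criteria
    .filter(([check]) => !check(input))
    .map(([, message]) => message);
  return { valid: errors.length === 0, errors };
}

for (const input of ["01012345678", "01112345689", "0101234a5678"]) {
  const { valid, errors } = validate(input);
  console.log(input, valid ? "ok" : errors.join("; "));
}
```

Because every failing criterion is reported, an input such as 0101234a5678 produces both the length message and the digits-only message.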

With your shown samples, please try the following regex. Here is the online demo for the regex used.
^01(?:(?:[05][0-9]{8})|(?:1[0-9]{6}([0-9])\1)|(?:2[0-9]{4}([0-9])\2[0-9]{2}))$
Here is the JS code for the above regex, using a forEach loop with the test function in it.
const regex = /^01(?:(?:[05][0-9]{8})|(?:1[0-9]{6}([0-9])\1)|(?:2[0-9]{4}([0-9])\2[0-9]{2}))$/;
[
"01012345678",
"0101234a5678",
"01112345688",
"01112345677",
"01112345666",
"01112345689",
"01112345-678",
"01298765532",
"01298765732",
"01298765mm432",
"01500011122",
"020132156456136",
"01530126453333",
"00123456789"
].forEach(element =>
console.log(`${element} ----> ${regex.test(element)}`)
);
Explanation: a detailed explanation of the regex used.
^01 ##Matching 01 at the start of the value.
(?: ##Starting the outer non-capturing group here.
(?: ##In a non-capturing group
[05][0-9]{8} ##Matching 0 OR 5 followed by any 8 digits.
)
| ##Putting an OR condition here.
(?: ##In a non-capturing group
1[0-9]{6}([0-9])\1 ##Matching 1 followed by 6 digits, followed by a single digit (in a capturing group), and making sure the next digit matches it.
)
| ##Putting an OR condition here.
(?: ##In a non-capturing group
2[0-9]{4}([0-9])\2[0-9]{2} ##Matching 2 followed by 4 digits, followed by 1 digit in a capturing group, followed by the same digit again, followed by any 2 digits.
)
)$ ##Closing the outer non-capturing group here at the end of the value.
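If the numbered backreferences \1 and \2 are hard to follow, named capture groups are one possible alternative. This is a sketch, not part of the answer above: the group names `tail` and `mid` are made up for illustration, and it assumes a JavaScript engine with named-group support (ES2018 or later).

```javascript
// Same alternation, but each repeated digit is captured under a name
// and matched again with \k<name> instead of \1 or \2.
const namedRegex =
  /^01(?:[05]\d{8}|1\d{6}(?<tail>\d)\k<tail>|2\d{4}(?<mid>\d)\k<mid>\d{2})$/;

console.log(namedRegex.test("01112345688")); // third digit 1, last two digits equal
console.log(namedRegex.test("01298765732")); // third digit 2, 8th and 9th digits differ

// The captured digit is available by name on a successful match.
const match = "01298765532".match(namedRegex);
console.log(match.groups.mid);
```

The two groups need different names here, because reusing one name in several alternatives is only accepted by very recent engines.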

Related

Regex for decimal prices with or without spaces

I have a problem with my price regex which I'm trying to change. I want it to allow numbers like:
11111,64
2 122,00
123,12
123 345,23
For now I have something like this, but it won't accept numbers without spaces.
'^\d{1,3}( \d{3}){0,10}[,]*[.]*([0-9]{0,2})?$'
I tried changing ( \d{3}) to (\s{0,1}\d{3}) but it still doesn't work :(
All problems are easier if you break them into pieces.
First we have to match the non-decimal part:
1
100
1 000
10 000 000
The first group is 1 to 3 digits, or \d{1,3}.
We still need to account for the following groups, which may or may not be there. In regex, "zero or more" is *, which gives \d{1,3}(\s\d{3})*. In the second part we put a whitespace character in front of the three digits, so it now looks for a space between groups of 3.
To complete this part we add \d+ as an alternative, for a flat block of digits without spaces.
Last, we have to match the decimal part, made optional with ?: ^(\d{1,3}(\s\d{3})*|\d+)(,\d+)?$
/^(\d{1,3}(\s\d{3})*|\d+)(,\d+)?$/.test(str)
Test it some more here: https://regex101.com/r/NKAVLk/1
[
'1',
'123',
'11111,64',
'2 122,00',
'123,12',
'123 345,23',
'2 12,00', // no match, pair of 2 digits
'2 1234,00', // no match, pair of 4 digits
'1,123' // no match, > 2 digits after comma
].forEach(str => {
let t = /^(?:\d{1,3}( \d{3})*|\d+)(,\d{0,2})?$/.test(str);
console.log(str + ' => ' + t);
});
Output:
1 => true
123 => true
11111,64 => true
2 122,00 => true
123,12 => true
123 345,23 => true
2 12,00 => false
2 1234,00 => false
1,123 => false
Explanation of regex:
^ -- anchor at start
(?: -- non-capture group start (for logical OR)
\d{1,3} -- expect 1 to 3 digits
( \d{3})* -- expect 0+ patterns of space and 3 digits
| -- logical OR
\d+ -- expect 1+ digits
) -- non-capture group end
(,\d{0,2})? -- optional pattern of comma and 0 to 2 digits
$ -- anchor at end

How to match 2 separate numbers in Javascript

I have this regex that should match when there's two numbers in brackets
/(P|C\(\d+\,{0,1}\s*\d+\))/g
for example:
C(1, 2) or P(2 3) //expected to match
C(43) or C(43, ) // expect not to match
but it also matches the ones with only 1 number, how can i fix it?
You have a couple of issues. Firstly, your regex will match either P on its own or C followed by numbers in parentheses; you should replace P|C with [PC] (you could use (?:P|C) but [PC] is more performant, see this Q&A). Secondly, since your regex makes both the , and spaces optional, it can match 43 without an additional number (the 4 matches the first \d+ and the 3 the second \d+). You need to force the string to either include a , or at least one space between the numbers. You can do that with this regex:
[PC]\(\d+[ ,]\s*\d+\)
Demo on regex101
Try this regex
[PC]\(\d+(?:,| +) *\d+\)
Click for Demo
Explanation:
[PC]\( - matches either P( or C(
\d+ - matches 1+ digits
(?:,| +) - matches either a , or 1+ spaces
*\d+ - matches 0+ spaces followed by 1+ digits
\) - matches )
You can relax the separator between the numbers by allowing any combination of commas and whitespace, using \d+[,\s]+\d+. Test case:
const regex = /[PC]\(\d+[,\s]+\d+\)/g;
[
'C(1, 2) or P(2 3)',
'C(43) or C(43, )'
].forEach(str => {
let m = str.match(regex);
console.log(str + ' ==> ' + JSON.stringify(m));
});
Output:
C(1, 2) or P(2 3) ==> ["C(1, 2)","P(2 3)"]
C(43) or C(43, ) ==> null
Your regex should require the presence of at least one delimiting character between the numbers.
I suppose you want to get the numbers out of it separately, like in an array of numbers:
let tests = [
"C(1, 2)",
"P(2 3)",
"C(43)",
"C(43, )"
];
for (let test of tests) {
console.log(
test.match(/[PC]\((\d+)[,\s]+(\d+)\)/)?.slice(1)?.map(Number)
);
}
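When one string can contain several such expressions, String.prototype.matchAll returns every match together with its capture groups. A small sketch, where `extractPairs` and the shape of the returned objects are assumptions made for illustration:

```javascript
// Global pattern with three capture groups: the letter and the two numbers.
const pairRegex = /([PC])\((\d+)[,\s]+(\d+)\)/g;

// Hypothetical helper: list every P(...) or C(...) that holds exactly two numbers.
function extractPairs(text) {
  return [...text.matchAll(pairRegex)].map(([, kind, a, b]) => ({
    kind,
    numbers: [Number(a), Number(b)],
  }));
}

console.log(extractPairs("C(1, 2) or P(2 3)"));
console.log(extractPairs("C(43) or C(43, )")); // []
```

matchAll requires the g flag and works on a copy of the regex, so the shared pattern's lastIndex is not affected between calls.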

Regex for numbers with spaces and + sign in front

If I want to accept only numbers, I use this regex:
^[0-9]*$
but the problem is that numbers like
+1 00
are not matched, and my regex reports them as invalid.
The user should type only digits, but a single space between groups of digits is allowed, and a + sign at the beginning is optional.
So acceptable is:
+1 11 1 1 11
or
1 11 1 1 11
Unacceptable is (more than one space in a row):
+1 11  1 1 11
or
1 11  1 1 11
Please help.
You may try using this regex pattern:
^\+?\d+(?:[ ]?\d+)*$
Sample script:
console.log(/^\+?\d+(?:[ ]?\d+)*$/.test('+1 11 1 1 11')); // true
console.log(/^\+?\d+(?:[ ]?\d+)*$/.test('1 11 1 1 11')); // true
console.log(/^\+?\d+(?:[ ]?\d+)*$/.test('+1 11  1 1 11')); // false (two spaces in a row)
console.log(/^\+?\d+(?:[ ]?\d+)*$/.test('1 11  1 1 11')); // false (two spaces in a row)
The regex pattern used here says to:
^ from the start of the string
\+? match an optional leading +
\d+ then match one or more digits
(?:[ ]?\d+)* followed by an optional single space and more digits,
zero or more times
$ end of the string
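A possible follow-up step is to validate the input and then strip the spaces for storage. The sketch below makes two assumptions: `normalizeNumber` is a hypothetical helper, and the pattern is a variant with a mandatory space, (?: \d+)*, which accepts the same strings as the pattern above while avoiding the nested repetition of \d+ that can backtrack heavily on long invalid input.

```javascript
// Variant of the pattern: optional +, then digit groups separated by exactly one space.
const numberRegex = /^\+?\d+(?: \d+)*$/;

// Hypothetical helper: return the number without spaces, or null if the format is wrong.
function normalizeNumber(input) {
  return numberRegex.test(input) ? input.replace(/ /g, "") : null;
}

console.log(normalizeNumber("+1 11 1 1 11")); // "+1111111"
console.log(normalizeNumber("1 11 1 1 11"));  // "1111111"
console.log(normalizeNumber("1 11  1 1 11")); // null (two spaces in a row)
```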
Close!
/^[0-9]{1,}$/g
^ = start of the string
[0-9] = match a single digit 0-9
{1,} = repeat the previous token one or more times
$ = end of the string, so any spaces, letters or other non-digits make the match fail
or even
/^[0-9]+$/g
or even (preferred)
/^-?[1-9]\d*\.?(\d+)?$/g
You should not match anything more but numerals.
function CheckInt(inputNum) {
if (inputNum.toString().match(/^-?[1-9]\d*\.?(\d+)?$/g)) {
console.log(`${inputNum} is a number (INT)`);
} else {
console.log(`${inputNum} is not a number (INT)`);
}
}
CheckInt("a");
CheckInt("b");
CheckInt("c");
CheckInt("102 020");
CheckInt("102-1029");
CheckInt(5400);
CheckInt(-2);
CheckInt(20);
CheckInt(2042992540);

Regex uppercase separation but not separating more than 1 next to each other

I have an array of values which I have to split at their uppercase letters. But in some cases a value has 2, 3 or 4 consecutive uppercase letters that I must not separate. Here are some values:
ERISACheckL
ERISA404cCheckL
F401kC
DisclosureG
SafeHarborE
To be clear, the result must be:
ERISA Check L
ERISA 404c Check L
F 401k C
Disclosure G
Safe Harbor E
I tried using:
value.match(/[A-Z].*[A-Z]/g).join(" ")
But of course it is not working for consecutive uppercase letters.
One option is to match 1 or more uppercase characters while asserting that what is directly to the right is not a lowercase character, or to match the position where what is on the left is a char a-z or a digit and what is on the right is an uppercase char.
Then use split, with a capture group around the first part of the pattern to keep it in the result.
([A-Z]+(?![a-z]))|(?<=[\da-z])(?=[A-Z])
( Capture group 1 (To be kept using split)
[A-Z]+(?![a-z]) Match 1+ uppercase chars asserting what is directly to the right is not a-z
) Close group 1
| Or
(?<=[\da-z])(?=[A-Z]) Get the position where what is directly to the left is either a-z or a digit and what is directly to the right is A-Z
Regex demo
const pattern = /([A-Z]+(?![a-z]))|(?<=[\da-z])(?=[A-Z])/;
[
"ERISACheckL",
"ERISA404cCheckL",
"F401kC",
"DisclosureG",
"SafeHarborE"
].forEach(s => console.log(s.split(pattern).filter(Boolean).join(" ")))
Another option is to use an alternation | matching the different parts:
[A-Z]+(?![a-z])|[A-Z][a-z]*|\d+[a-z]+
[A-Z]+(?![a-z]) Match 1+ uppercase chars asserting what is directly to the right is not a-z
| Or
[A-Z][a-z]* Match A-Z optionally followed by a-z to also match single uppercase chars
| Or
\d+[a-z]+ match 1+ digits and 1+ chars a-z
Regex demo
const pattern = /[A-Z]+(?![a-z])|[A-Z][a-z]*|\d+[a-z]+/g;
[
"ERISACheckL",
"ERISA404cCheckL",
"F401kC",
"DisclosureG",
"SafeHarborE"
].forEach(s => console.log(s.match(pattern).join(" ")))
function formatString(str) {
  // Pad each capitalized word or digit+letter token with spaces, then collapse repeated spaces
  return str.replace(/([A-Z][a-z]+|\d+[a-z]+)/g, ' $1 ').replace(/ {2,}/g, ' ').trim();
}
// test
[
'ERISACheckL',
'ERISA404cCheckL',
'F401kC',
'DisclosureG',
'SafeHarborE'
].forEach(item => {
console.log(formatString(item));
});
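The alternation-based pattern from this thread can be wrapped in a function and compared with the results requested in the question. This is a sketch only: `splitByCase` and `expected` are illustrative names, and falling back to the unchanged value when nothing matches is an assumption, since match returns null in that case.

```javascript
const pattern = /[A-Z]+(?![a-z])|[A-Z][a-z]*|\d+[a-z]+/g;

// Hypothetical helper: join the matched parts with spaces,
// or return the value unchanged if the pattern finds nothing.
function splitByCase(value) {
  const parts = value.match(pattern);
  return parts ? parts.join(" ") : value;
}

// Expected results copied from the question
const expected = {
  ERISACheckL: "ERISA Check L",
  ERISA404cCheckL: "ERISA 404c Check L",
  F401kC: "F 401k C",
  DisclosureG: "Disclosure G",
  SafeHarborE: "Safe Harbor E",
};

for (const [input, want] of Object.entries(expected)) {
  console.log(input, "=>", splitByCase(input), splitByCase(input) === want);
}
```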

Trying to filter a string for multiple values with REGEX

I need to match multiples groups from multiples lines according to a structured source string.
The string is formatted with one name per line, but with some other values, in this order:
May have a number before the name starting each line;
May have some junk separators between the number and the name;
The name may have any character, including symbols as parentheses, apostrophes, etc;
May have a code between parentheses with 3 or 4 letters after the name (don't bother with the possibility of the name having 3 or 4 letters between parentheses, this will not happen);
May have an asterisk at the end of the line, before the line break.
I need to retrieve these 4 groups for each line. This is what I'm trying:
/^(\d+)?(?:[ \t]?[x:.=]?)[ \t]?(.+?)(?=[ \t]?(\(\w{3,4}\))?[ \t]?(\*))$/igm
To catch the number:
^(\d+)?
To clean the possible separators:
(?:[ \t]?[x:.=]?)
Filtering the space between each group:
[ \t]?
The name (and the rest):
(.+?(?=[ \t]?(\(\w{3,4}\))?[ \t]?(\*)?))
The problem is, obviously, with this last one. It's catching all together (groups 2, 3 and 4). As you can see, I'm trying the two last optional groups as positive lookaheads to separate them from the name.
What am I doing wrong or how would be the better way to achieve the result?
EDIT
String sample:
2 John Smith
3 Messala Oliveira (NMN) *
Mary Pop *
Joshua Junior (pMHH)
What I need:
[ "2", "John Smith", "", "" ],
[ "3", "Messala Oliveira", "(NMN)", "*" ],
[ "", "Mary Pop", "", "*" ],
[ "", "Joshua Junior", "(pMHH)", "" ],
You need to wrap the capturing groups that can be present or absent with optional non-capturing groups:
/^(?:(\d+)[ \t]*)?(.*?)(?:[ \t](\(\w{3,4}\)))?(?:[ \t](\*))?$/igm
See the regex demo.
Details:
^ - start of string
(?:(\d+)[ \t]*)? - an optional non-capturing group matching
(\d+) - (Group 1) 1+ digits
[ \t]* - 0+ spaces or tabs (if \s is used, 0+ whitespaces)
(.*?) - Group 2 capturing any 0+ chars other than line break chars, as few as possible
(?:[ \t](\(\w{3,4}\)))? - an optional group matching
[ \t] - a space or tab
(\(\w{3,4}\)) - Group 3 capturing a (, 3 or 4 word chars, )
(?:[ \t](\*))? - another optional group matching a space or tab and capturing into Group 4 a * symbol.
$ - end of string.
If you test the strings separately, the [ \t] can be replaced with a simpler \s:
var regex = /^(?:(\d+)\s*)?(.*?)(?:\s(\(\w{3,4}\)))?(?:\s(\*))?$/i;
var strs = ['2 John Smith','3 Messala Oliveira (NMN) *','Mary Pop *','Joshua Junior (pMHH)'];
for (var i = 0; i < strs.length; i++) {
  var m = regex.exec(strs[i]);
  if (m !== null) {
    // Optional groups that did not participate are undefined; normalize them to ""
    var res = [m[1] || "", m[2], m[3] || "", m[4] || ""];
    console.log(res);
  }
}
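To process the whole multi-line source in one pass, the [ \t] version of the pattern with the g and m flags can be combined with matchAll. A minimal sketch, assuming the sample lines from the question joined with "\n"; the names `lineRegex`, `source` and `rows` are illustrative.

```javascript
// [ \t] is used instead of \s so a match can never run across a line break.
const lineRegex = /^(?:(\d+)[ \t]*)?(.*?)(?:[ \t](\(\w{3,4}\)))?(?:[ \t](\*))?$/gm;

const source = [
  "2 John Smith",
  "3 Messala Oliveira (NMN) *",
  "Mary Pop *",
  "Joshua Junior (pMHH)",
].join("\n");

// Destructuring defaults turn groups that did not match (undefined) into "".
const rows = [...source.matchAll(lineRegex)].map(
  ([, number = "", name, code = "", star = ""]) => [number, name, code, star]
);

console.log(rows);
```

Each row has the four requested fields in order: number, name, code and asterisk.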
