I'm trying to wrap my head around Regex, but I'm having some trouble with the basics.
I want to check whether the last character in a string is either a "0" or a "5", but I also want to check whether the second-to-last character (if it exists) is odd.
If it matters, I'm trying to do this in Javascript for some form validation. I have the following Regex to satisfy my first condition of checking the last character and making sure it's a "0" or a "5"
/([0|5]$)/g
But how do I properly add a 2nd condition to see if the 2nd to last character exists and is odd? Something like the following...?
/([0|5]$)([1|3|5|7|9]$-1)/g
If someone doesn't mind helping me out here and also explain to me what each part of their regex is doing, I'd be very grateful.
I'd go with /(?<=[13579])[05]$|^[05]$/.
This uses two alternatives. One checks for an odd digit in the second-to-last position when there are at least two characters in the string, and one checks for a single-character string. Both are anchored with $, so only the end of the string is inspected.
Breaking this down:
(?<=[13579]) - a positive lookbehind for exactly one odd digit
[05] - match a 0 or a 5 directly following the lookbehind
$ - the end of the string
| - denotes an OR
^ - the start of the string
[05] - match a 0 or a 5
$ - the end of the string
This can be seen in the following:
var re = /(?<=[13579])[05]$|^[05]$/;
console.log(re.test('12345')); // 12345 should return `false`
console.log(re.test('12335')); // 12335 should return `true`
console.log(re.test('1')); // 1 should return `false`
console.log(re.test('5')); // 5 should return `true`
And also seen on Regex101 here.
You're thinking about it the wrong way.
Try this:
/([13579])([05])$/g
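One caveat when a pattern like this is reused for form validation: with the g flag, test() is stateful, because the regex remembers lastIndex between calls. A minimal sketch of that behaviour (the variable names are illustrative and not from the thread):

```javascript
// With the g flag, test() resumes searching from lastIndex on the next call.
const withGlobal = /([13579])([05])$/g;
const first = withGlobal.test("15");  // matches; lastIndex moves to 2
const second = withGlobal.test("15"); // search starts at index 2, so no match

// Without the g flag, every call starts from the beginning of the string.
const withoutGlobal = /([13579])([05])$/;
const third = withoutGlobal.test("15");
const fourth = withoutGlobal.test("15");

console.log(first, second, third, fourth); // true false true true
```

For a simple yes/no check on a single value, dropping the g flag avoids this surprise.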
If you want to check whether the last character in a string is either a "0" or a "5", and also whether the second-to-last character (if it exists) is odd, I think you do not need the capturing groups.
You could use an alternation and character classes for your requirements.
(?:\D[05]|[13579][05]|^[05])$
That would match:
(?: Non-capturing group
\D[05] Match a non-digit followed by 0 or 5
| Or
[13579][05] Match an odd digit followed by 0 or 5
| Or
^[05] Match 0 or 5 at the beginning of the string
) Close non-capturing group
$ Assert the end of the string
const strings = [
"00",
"11",
"text1",
"text10",
"text00",
"text5",
"10",
"05",
"15",
"99",
"12345",
"12335",
"0000",
"0010",
"5",
"1",
"0",
];
let pattern = /(?:[13579][05]|\D[05]|^[05])$/;
strings.forEach((s) => {
console.log(s + " ==> " + pattern.test(s));
});
/(^|[13579])[05]$/
Explained:
[05]$ means "0 or 5 followed by end of string"
(^|[13579]) means "beginning of string OR 1 or 3 or 5 or 7 or 9"
Tested in console:
re.test('aaa0') - false
re.test('aa15') - true
re.test('aa20') - false
re.test('0') - true
Is this what you were after?
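For form validation, the pattern above could be wrapped in a small helper. This is only a sketch: the names ENDS_VALID and hasValidEnding are made up for illustration, and the input is coerced to a string on the assumption that a form field might hand over a number.

```javascript
// Hypothetical helper; the pattern itself is the one from the answer above.
const ENDS_VALID = /(^|[13579])[05]$/;

function hasValidEnding(value) {
  // String() lets the helper accept numbers as well as strings.
  return ENDS_VALID.test(String(value));
}

console.log(hasValidEnding("5"));     // true: single character, 0 or 5
console.log(hasValidEnding("12335")); // true: odd digit before the final 5
console.log(hasValidEnding("12345")); // false: even digit before the final 5
console.log(hasValidEnding("1"));     // false: does not end in 0 or 5
```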
As you said
I want to check to see if a the last character in a string is either a "0" or a "5", but I also want to check to is if the second to last character (if it exists) is odd
Try this :
// Optional run of digits ending in an odd digit, then a final 0 or 5.
var rgx = /^(?:\d*[13579])?[05]$/;
function test(strings) {
for (var i = 0; i < strings.length; i++) {
var res = strings[i].match(rgx);
if (res) {
console.log("match");
} else {
console.log("not match");
}
}
}
var arr = ["12335", "12350", "45", "10", "12337", "11", "01", "820"];
test(arr);
You would want to do:
/(^|[13579])([05])$/
https://regex101.com/r/nMX7L2/4
Inside a character class, | is a literal pipe rather than an OR, so [1|3|5|7|9] would also accept "|". List the digits directly instead.
1st Capturing Group (^|[13579])
1st Alternative ^
^ asserts position at start of the string
2nd Alternative [13579]
Match a single character present in the list 13579
2nd Capturing Group ([05])
Match a single character present in the list 05
$ asserts position at the end of the string
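To see why pipes do not belong inside the brackets, the sketch below compares the question's [0|5] class with a plain [05]. The variable names are illustrative only.

```javascript
// Inside [...] the pipe is an ordinary character, so this class has three members: 0, | and 5.
const withPipes = /[0|5]$/;
const digitsOnly = /[05]$/;

console.log(withPipes.test("12|"));  // true: the trailing pipe is accepted
console.log(digitsOnly.test("12|")); // false
console.log(digitsOnly.test("120")); // true
```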
Related
I'm trying to match words that consist only of characters in this character class: [A-z'\\/%], excluding cases where:
they are between < and >
they are between [ and ]
they are between { and }
So, say I've got this funny string:
[beginning]<start>How's {the} /weather (\\today%?)[end]
I need to match the following strings:
[ "How's", "/weather", "\\today%" ]
I've tried using this pattern:
/[A-z'/\\%]*(?![^{]*})(?![^\[]*\])(?![^<]*>)/gm
But for some reason, it matches:
[ "[beginning]", "", "How's", "", "", "", "/weather", "", "", "\\today%", "", "", "[end]", "" ]
I'm not sure why my pattern allows stuff between [ and ], since I used (?![^\[]*\]), and a similar approach seems to work for not matching {these cases} and <these cases>. I'm also not sure why it matches all the empty strings.
Any wisdom? :)
There are essentially two problems with your pattern:
Never use A-z in a character class if you intend to match only letters, because it will match more than just letters (see note 1). Instead, use a-zA-Z (or A-Za-z).
Using the * quantifier after the character class will allow empty matches. Use the + quantifier instead.
So, the fixed pattern should be:
[A-Za-z'/\\%]+(?![^{]*})(?![^\[]*\])(?![^<]*>)
Demo.
1 The [A-z] character class means "match any character with an ASCII code between 65 and 122". The problem is that codes 91 through 96 ([, \, ], ^, _ and `) are not letters, which is why the original pattern matches characters like '[' and ']'.
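The note can be checked directly. The following sketch loops over the character codes that sit between 'Z' and 'a' and collects the ones that [A-z] accepts (the variable name between is just for illustration):

```javascript
// Codes 91..96 lie between 'Z' (90) and 'a' (97) in ASCII.
const between = [];
for (let code = 91; code <= 96; code++) {
  const ch = String.fromCharCode(code);
  if (/[A-z]/.test(ch)) between.push(ch);
}
console.log(between.join(" ")); // [ \ ] ^ _ `

// The explicit ranges accept none of them.
console.log(between.some((ch) => /[A-Za-z]/.test(ch))); // false
```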
Split it with regular expression:
let data = "[beginning]<start>How's {the} /weather (\\today%?)[end]";
let matches = data.split(/\s*(?:<[^>]+>|\[[^\]]+\]|\{[^\}]+\}|[()])\s*/);
console.log(matches.filter(v => "" !== v));
You can match all the cases that you don't want using an alternation and place the character class in a capturing group to capture what you want to keep.
A class that starts with [^ is a negated character class: it matches any character except the ones listed.
(?:\[[^\][]*]|<[^<>]*>|{[^{}]*})|([A-Za-z'/\\%]+)
Explanation
(?: Non capture group
\[[^\][]*] Match from opening till closing []
| Or
<[^<>]*> Match from opening till closing <>
| Or
{[^{}]*} Match from opening till closing {}
) Close non capture group
| Or
([A-Za-z'/\\%]+) Repeat the character class 1+ times to prevent empty matches and capture in group 1
Regex demo
const regex = /(?:\[[^\][]*]|<[^<>]*>|{[^{}]*})|([A-Za-z'/\\%]+)/g;
const str = `[beginning]<start>How's {the} /weather (\\\\today%?)[end]`;
let m;
while ((m = regex.exec(str)) !== null) {
if (m[1] !== undefined) console.log(m[1]);
}
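In environments that support String.prototype.matchAll, the same idea can be written without the explicit exec loop. This is a sketch using the pattern from this answer; the names pattern, input and words are illustrative, and the input here contains a single backslash before "today".

```javascript
const pattern = /(?:\[[^\][]*]|<[^<>]*>|{[^{}]*})|([A-Za-z'/\\%]+)/g;
const input = "[beginning]<start>How's {the} /weather (\\today%?)[end]";

// Keep only the matches where group 1 participated; the bracketed parts leave it undefined.
const words = Array.from(input.matchAll(pattern), (m) => m[1])
  .filter((word) => word !== undefined);

console.log(words); // [ "How's", '/weather', '\\today%' ]
```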
Could you please tell me why my condition is always true? I am trying to validate my value using a regex. I have a few conditions:
Name should not contain the text "test"
Name should not contain three consecutive characters, for example "abc", "pqr", "xyz"
Name should not contain the same character three times, for example "aaa", "ccc", "zzz"
I do it like this:
https://jsfiddle.net/aoerLqkz/2/
var val = 'ab dd'
if (/test|[^a-z]|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i.test(val)) {
alert( 'match')
} else {
alert( 'false')
}
I tested my code with the following strings and got an unexpected result:
input string "abc": output is fine: "match"
input string "aaa": output is fine: "match"
input string "aa a": output is "match". Why does it match? There is a space between the characters.
input string "sa c": output is "match". Why does it match? The characters are different and there is a space between them.
The string sa c includes a space, and the pattern [^a-z] (anything that is not a to z) matches that space.
Possibly you want to use ^ and $ so your pattern also matches the start and end of the string instead of looking for a match anywhere inside it.
there is space between them why it matched ????
Because of the [^a-z] part of your regular expression, which matches the space:
> /[^a-z]/i.test('aa a');
true
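When it is unclear which alternative is responsible, exec() shows the matched text and its position. The sketch below uses a shortened version of the question's pattern (the abc|bcd|... list is left out for brevity, which is an assumption made for this example), and describeMatch is a hypothetical helper name.

```javascript
// Shortened pattern: only the first three alternatives from the question.
const shortened = /test|[^a-z]|(.)\1\1/i;

function describeMatch(value) {
  const m = shortened.exec(value);
  return m ? { matched: m[0], index: m.index } : null;
}

console.log(describeMatch("aa a")); // { matched: ' ', index: 2 }
console.log(describeMatch("sa c")); // { matched: ' ', index: 2 }
console.log(describeMatch("abd"));  // null
```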
The issue is the [^a-z]. This means that any string that has a non-letter character anywhere in it will be a match. In your example, it is matching the space character.
The solution? Simply remove |[^a-z]. Without it, your regex meets all three criteria.
test checks if the value contains the word 'test'.
abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz checks if the value contains three sequential letters.
(.)\1\1 checks if any character is repeated three times.
Complete regex:
/test|(.)\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz/i
I find it helpful to use a regex tester, like https://www.regexpal.com/, when writing regular expressions.
NOTE: I am assuming that the second criterion actually means "three consecutive letters", not "three consecutive characters" as it is written. If that is not true, then your regex doesn't meet the second criterion, since it only checks for three consecutive letters.
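The long abc|bcd|...|xyz list does not have to be typed by hand. As a sketch, it can be generated from the alphabet and joined into the same pattern; the names alphabet, runs and forbidden are illustrative.

```javascript
const alphabet = "abcdefghijklmnopqrstuvwxyz";

// Collect every run of three consecutive letters: abc, bcd, ..., xyz.
const runs = [];
for (let i = 0; i + 3 <= alphabet.length; i++) {
  runs.push(alphabet.slice(i, i + 3));
}

// Backslashes are doubled because the pattern is built from a string.
const forbidden = new RegExp("test|(.)\\1\\1|" + runs.join("|"), "i");

console.log(runs.length);            // 24
console.log(forbidden.test("aa a")); // false
console.log(forbidden.test("sa c")); // false
console.log(forbidden.test("abc"));  // true
console.log(forbidden.test("aaa"));  // true
```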
I would not do this with regular expressions. The expression will keep getting more complicated, and you lose the flexibility you would have if you programmed the checks yourself.
The rules you describe suggest the concept of a string derivative. The derivative of a string is the distance between each pair of successive characters. It is especially useful for password security checks and for measuring string variation in general.
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
//these strings have the same derivative: [0,0,0,0]
console.log(derivative('aaaaa'));
console.log(derivative('bbbbb'));
//these strings also have the same derivative: [1,1,1,1]
console.log(derivative('abcde'));
console.log(derivative('mnopq'));
//up and down: [1,-1,1,-1]
console.log(derivative('ababa'));
With this in mind you can apply each of your rules to each string.
// Rules:
// 1. Name should not contain test "text"
// 2. Name should not contain three consecutive characters example "abc" , "pqr" ,"xyz"
// 3. Name should not contain the same character three times example "aaa", "ccc" ,"zzz"
const derivative = (str) => {
const result = [];
for(let i=1; i<str.length; i++){
result.push(str.charCodeAt(i) - str.charCodeAt(i-1));
}
return result;
};
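// True when `sub` occurs as a contiguous run inside `master`.
// The surrounding commas avoid false hits such as "-1,1" containing "1,1".
const arrayContains = (master, sub) =>
("," + master.join(",") + ",").includes("," + sub.join(",") + ",");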
const rule1 = (text) => !text.includes('text');
const rule2 = (text) => !arrayContains(derivative(text),[1,1]);
const rule3 = (text) => !arrayContains(derivative(text),[0,0]);
const testing = [
"smthing textual",'abc','aaa','xyz','12345',
'1111','12abb', 'goodbcd', 'weeell'
];
const results = testing.map((input)=>
[input, rule1(input), rule2(input), rule3(input)]);
console.log(results);
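The same rules can also be checked without joining arrays into strings, by counting how many successive characters differ by a fixed step. This is a sketch of that variation, not the answer's own code; hasRun and its parameters are hypothetical names.

```javascript
// Returns true when `str` contains `length` characters in a row whose
// character codes each differ from the previous one by exactly `step`.
function hasRun(str, step, length = 3) {
  let run = 1;
  for (let i = 1; i < str.length; i++) {
    const diff = str.charCodeAt(i) - str.charCodeAt(i - 1);
    run = diff === step ? run + 1 : 1;
    if (run >= length) return true;
  }
  return false;
}

console.log(hasRun("abc", 1));    // true: a, b, c are sequential
console.log(hasRun("aaa", 0));    // true: same character three times
console.log(hasRun("bab", 1));    // false
console.log(hasRun("weeell", 0)); // true: "eee"
```

A step of 1 covers the "three consecutive characters" rule and a step of 0 covers the "same character three times" rule.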
Based on the 3 conditions in the post, the following regex should work.
Regex: ^(?:(?!test|([a-z])\1\1|abc|bcd|cde|def|efg|fgh|ghi|hij|ijk|jkl|klm|lmn|mno|nop|opq|pqr|qrs|rst|stu|tuv|uvw|vwx|wxy|xyz).)*$
Demo
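An alternative shape for the same check is a single negative lookahead at the start of the string, instead of one lookahead per character. The sketch below is a variation on the regex above, not the answer's exact pattern; the sequence list is generated rather than typed out, and the names sequences, validName and isValid are illustrative.

```javascript
// Build abc, bcd, ..., xyz from character codes (97 is "a").
const sequences = Array.from({ length: 24 }, (_, i) =>
  String.fromCharCode(97 + i, 98 + i, 99 + i)
);

// One lookahead at the start: fail if a forbidden part appears anywhere later on the line.
const validName = new RegExp(
  "^(?!.*(?:test|([a-z])\\1\\1|" + sequences.join("|") + "))",
  "i"
);

const isValid = (value) => validName.test(value);

console.log(isValid("ab dd"));  // true
console.log(isValid("abc"));    // false
console.log(isValid("aaa"));    // false
console.log(isValid("latest")); // false
```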
I've been trying to write a regex for these strings:
case1: test-123456789 should get 56789
case2: test-1234-123456789 should get 56789
case3: test-12345 should fail or not giving anything
What I need is a way to get only the last 5 digits, and only from a group of 9 digits.
So far I have this:
case.match(/\d{5}$/)
It works for the first 2 cases but not for the last one.
You may use
/\b\d{4}(\d{5})$/
See the regex demo. Get Group 1 value.
Details
\b - word boundary (to make sure the digit chunk is exactly 9 digits long) - if the digit chunk at the end of the string can be longer, remove \b
\d{4} - four digits
(\d{5}) - Group 1: five digits
$ - end of string.
JS demo:
var strs = ['test-123456789','test-1234-123456789','test-12345'];
var rx = /\b\d{4}(\d{5})$/;
for (var s of strs) {
var m = s.match(rx);
if (m) {
console.log(s, "=>", m[1]);
} else {
console.log("Fail for ", s);
}
}
You can try this:
var test="test-123456789";
console.log((test.match(/[^\d]\d{4}(\d{5})$/)||{1: null/*default value if not found*/})[1]);
This approach supplies a default value when nothing matches (see the inline comment in the code above).
You can use a positive lookbehind (?<= ) to assert that your group of 5 digits is preceded by a group of 4 digits without including them in the result.
/(?<=\d{4})\d{5}$/
var inputs = [
"test-123456789", // 56789
"test-1234-123456789", // 56789
"test-12345", //fail or not giving anything
]
var rgx = /(?<=\d{4})\d{5}$/
inputs.forEach(str => {
console.log(rgx.exec(str))
})
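Note that a lookbehind for four digits also accepts a trailing run longer than nine digits. If the run must be exactly nine digits long, one option is a negative lookbehind in front of it. The sketch below assumes an engine with lookbehind support, and lastFiveOfNine is a hypothetical helper name.

```javascript
// Returns the last five digits when the string ends in exactly nine digits, else null.
function lastFiveOfNine(input) {
  // (?<!\d) rejects a tenth digit directly before the nine-digit run.
  const m = /(?<!\d)\d{4}(\d{5})$/.exec(input);
  return m ? m[1] : null;
}

console.log(lastFiveOfNine("test-123456789"));      // 56789
console.log(lastFiveOfNine("test-1234-123456789")); // 56789
console.log(lastFiveOfNine("test-12345"));          // null
console.log(lastFiveOfNine("test-1234567890"));     // null
```

Unlike \b, the (?<!\d) check also accepts input such as "test_123456789", because it only looks at whether the preceding character is a digit.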
I need to match multiple groups from multiple lines of a structured source string.
The string is formatted with one name per line, but with some other values, in this order:
May have a number before the name starting each line;
May have some junk separators between the number and the name;
The name may have any character, including symbols as parentheses, apostrophes, etc;
May have a code between parentheses with 3 or 4 letters after the name (don't bother with the possibility of the name itself having 3 or 4 letters between parentheses, this will not happen)
May have an asterisk at the end of the line, before the line break.
I need to retrieve these 4 groups for each line. This is what I'm trying:
/^(\d+)?(?:[ \t]?[x:.=]?)[ \t]?(.+?)(?=[ \t]?(\(\w{3,4}\))?[ \t]?(\*))$/igm
To catch the number:
^(\d+)?
To clean the possible separators:
(?:[ \t]?[x:.=]?)
Filtering the space between each group:
[ \t]?
The name (and the rest):
(.+?(?=[ \t]?(\(\w{3,4}\))?[ \t]?(\*)?))
The problem is, obviously, with this last one. It's catching all together (groups 2, 3 and 4). As you can see, I'm trying the two last optional groups as positive lookaheads to separate them from the name.
What am I doing wrong or how would be the better way to achieve the result?
EDIT
String sample:
2 John Smith
3 Messala Oliveira (NMN) *
Mary Pop *
Joshua Junior (pMHH)
What I need:
[ "2", "John Smith", "", "" ],
[ "3", "Messala Oliveira", "(NMN)", "*" ],
[ "", "Mary Pop", "", "*" ],
[ "", "Joshua Junior", "(pMHH)", "" ],
You need to wrap the capturing groups that can be present or absent with optional non-capturing groups:
/^(?:(\d+)[ \t]*)?(.*?)(?:[ \t](\(\w{3,4}\)))?(?:[ \t](\*))?$/igm
See the regex demo.
Details:
^ - start of a line (the m flag makes ^ and $ match at line boundaries)
(?:(\d+)[ \t]*)? - an optional non-capturing group matching
(\d+) - (Group 1) 1+ digits
[ \t]* - 0+ spaces or tabs (if \s is used, 0+ whitespaces)
(.*?) - Group 2 capturing any 0+ chars other than line break characters, as few as possible
(?:[ \t](\(\w{3,4}\)))? - an optional group matching
[ \t] - a space or tab
(\(\w{3,4}\)) - Group 3 capturing a (, 3 or 4 word chars, )
(?:[ \t](\*))? - another optional group matching a space or tab and capturing into Group 4 a * symbol.
$ - end of a line (because of the m flag).
If you test the strings separately, the [ \t] can be replaced with a simpler \s:
var regex = /^(?:(\d+)\s*)?(.*?)(?:\s(\(\w{3,4}\)))?(?:\s(\*))?$/i;
var strs = ['2 John Smith','3 Messala Oliveira (NMN) *','Mary Pop *','Joshua Junior (pMHH)'];
for (var i = 0; i < strs.length; i++) {
var m = regex.exec(strs[i]);
if (m !== null) {
// Optional groups that did not participate are undefined; report them as "".
console.log([m[1] || "", m[2], m[3] || "", m[4] || ""]);
}
}
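The multi-line form of the pattern can also be run once over the whole text. The sketch below adds named groups (num, name, code and star are names chosen for this example, not part of the answer) and uses matchAll, which is an assumption about the target environment.

```javascript
const lineRe = /^(?:(?<num>\d+)[ \t]*)?(?<name>.*?)(?:[ \t](?<code>\(\w{3,4}\)))?(?:[ \t](?<star>\*))?$/gm;

const source = [
  "2 John Smith",
  "3 Messala Oliveira (NMN) *",
  "Mary Pop *",
  "Joshua Junior (pMHH)",
].join("\n");

// One row per line; groups that did not participate become empty strings.
const rows = Array.from(source.matchAll(lineRe), ({ groups }) => [
  groups.num || "",
  groups.name,
  groups.code || "",
  groups.star || "",
]);

console.log(rows);
```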
I have this code :
var tlTemp=new Array();
tlTemp.push("00 - 01:??:?? - TL 1");
tlTemp.push("00 - 12:??:?? - TL 2");
for(i=0; i<tlTemp.length; i++) {
var removedTL = tlTemp[i].match(/^(\d\d) - (\?\?|10|0\d):(\?\?|[0-5]\d):(\?\?|[0-5]\d) - (.*)/);
if(removedTL!=null) {
alert("ok");
}
else
{
alert("no");
return;
}
}
and I don't understand why the first string prints ok and the second (so similar) prints no. Why?
The relevant part of the regexp, the one that covers the place where the two strings differ, is:
(\?\?|10|0\d)
It matches:
??
10
0x where x is a digit
So 12 does not match.
Now, also there is TL 2 instead of TL 1 but in the regexp this is defined as:
(.*)
which matches everything so that is not causing the problem.
Because your regular expression explicitly excludes it.
This section:
/^(\d\d) - (\?\?|10|0\d)
constrains matches to strings starting with two digits, a space, a dash, and a space, and then either "??", "10", or "0" followed by a digit.
This part of your regular expression, (\?\?|10|0\d), should be changed to (\?\?|10|\d\d): the zero becomes a \d. In the first string that part is 01, which matches 0\d, while the second string has 12, which matches none of the original alternatives. Note that \d\d accepts any two digits, so the 10 alternative becomes redundant.
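If that field is meant to be an hour, a tighter alternative is to allow only 00 to 23. This is a sketch under that assumption (the question does not say what range is intended), and timelineRe and samples are illustrative names.

```javascript
// Assumption: the second field is an hour from 00 to 23, or "??" when unknown.
const timelineRe = /^(\d\d) - (\?\?|[01]\d|2[0-3]):(\?\?|[0-5]\d):(\?\?|[0-5]\d) - (.*)/;

const samples = [
  "00 - 01:??:?? - TL 1",
  "00 - 12:??:?? - TL 2",
  "00 - 24:??:?? - TL 3",
];

const outcomes = samples.map((s) => (timelineRe.test(s) ? "ok" : "no"));
console.log(outcomes); // [ 'ok', 'ok', 'no' ]
```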