Adding zero to non leaded zero datetime string regex - javascript

I have the following datetime string: 2020-5-1 1:2. I used the pattern (\W)(\d{1}) to match any single digit, i.e. one without a leading zero: 5, 1, 1, 2. This demo shows that the pattern captures them in Group 2 of every match.
Using JavaScript's String replace method, I tried to turn the sample datetime string into 2020-05-01 01:02. This jsbin runs the following snippet:
var txt = '2020-5-1 1:2'
var output = [];
output[0] = txt.replace(/(\W)(\d{1})/gi,'0$1');
output[1] = txt.replace(/(\W)(\d{1})/gi,'0$2');
console.log(output);
// The output: ["20200-0-0 0:", "202005010102"]
In the first output entry the behavior is unexpected: instead of adding 0 to the match, it replaced the digit with 0! How can I solve this issue?

You only used a single placeholder in the replacement pattern, but in the regex pattern, you consumed two substrings with two capturing groups, so one is lost.
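As a minimal sketch of that point, the question's own pattern produces the expected result once both groups are put back in the replacement. The variable names are illustrative, and this only covers the sample string: the pattern would still pad the first digit of a two-digit field such as -10.

```javascript
var txt = '2020-5-1 1:2';

// Group 1 (the separator) and Group 2 (the digit) are both restored,
// with a literal 0 between them. Because the pattern has only two groups,
// "$10" is read as "$1" followed by "0".
var viaPlaceholders = txt.replace(/(\W)(\d{1})/g, '$10$2');

// Same idea with a replacer function, which avoids the "$10" ambiguity.
var viaCallback = txt.replace(/(\W)(\d{1})/g, function (match, sep, digit) {
  return sep + '0' + digit;
});

console.log(viaPlaceholders); // "2020-05-01 01:02"
console.log(viaCallback);     // "2020-05-01 01:02"
```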
To add 0 before single digits you may use
txt.replace(/\b\d\b/g,'0$&')
txt.replace(/(^|\D)(\d)(?!\d)/g,'$10$2')
txt.replace(/(?<!\d)\d(?!\d)/g,'0$&') // ECMAScript 2018+ (lookbehind support)
Here, \b\d\b matches a digit that is neither preceded nor followed by an ASCII letter, digit or _. The substitution is 0 followed by the whole match value, $&.
The (^|\D)(\d)(?!\d) pattern captures the start of string or a non-digit char into Group 1, then captures a digit into Group 2. Then, (?!\d) makes sure there is no digit immediately to the right. The substitution is $10$2: the Group 1 value, 0, and then the Group 2 value.
The (?<!\d)\d(?!\d) pattern matches any digit that has no digit immediately before or after it, and the substitution is the same as in Case 1.
JS demo:
var txt = '2020-5-1 1:2';
console.log( txt.replace(/\b\d\b/g,'0$&') )
console.log( txt.replace(/(^|\D)(\d)(?!\d)/g,'$10$2') )
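The three patterns agree on the sample string but differ when a single digit touches a letter or underscore. A small sketch with a made-up input ('x5-1' is an assumption, not from the question) shows the difference; the lookbehind variant needs an ES2018+ engine.

```javascript
var sample = 'x5-1';

// \b\d\b: "5" is glued to the word char "x", so there is no word boundary
// before it and it is left alone.
var byWordBoundary = sample.replace(/\b\d\b/g, '0$&');

// (^|\D)(\d)(?!\d): "x" is a non-digit, so "5" qualifies and gets padded.
var byGroups = sample.replace(/(^|\D)(\d)(?!\d)/g, '$10$2');

// (?<!\d)\d(?!\d): only neighbouring digits matter, so "5" is padded too.
var byLookarounds = sample.replace(/(?<!\d)\d(?!\d)/g, '0$&');

console.log(byWordBoundary); // "x5-01"
console.log(byGroups);       // "x05-01"
console.log(byLookarounds);  // "x05-01"
```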

Related

JS Regex multiple capturing groups return all matches

I'm trying to create a regex to extract data from a string. My sample string: dn1:pts-sc1.1. The format of the data I expect: ['pts', 'sc', '1.1'], so basically every set of letters after the : and the numbers at the end.
What I have right now:
/^[^:]+:(?:([a-z]+)-?)+([\d\.]+)$/g
Unfortunately, it returns only the last set of letters.
['sc', '1.1']
I also tried to add + to the first capturing group:
/^[^:]+:(?:([a-z]+)+-?)+([\d\.]+)$/g
The result is the same. The only difference is that regex101 gives me this comment:
A repeated capturing group will only capture the last iteration. Put a capturing group around the repeated group to capture all iterations or use a non-capturing group instead if you're not interested in the data
--edit
examples of input string:
dn2.33:sc-pts-tt-as3.43
dn2.33:sc3.43
dn2.33:sc-tt-as3.43
So basically I don't know the number of letter groups.
You cannot get an arbitrary number of groups; their number is fixed by the number of capturing groups in your pattern. Instead, you may match and capture the hyphen-separated values into one group, then split it on - to get the individual items and build the result dynamically:
var strs = ['dn2.33:sc-pts-tt-as3.43','dn2.33:sc3.43','dn2.33:sc-tt-as3.43'];
var rx = /^[^:]+:([a-z]+(?:-[a-z]+)*)([\d.]+)$/; // Define the regex
for (var s of strs) {
var res = []; // The resulting array variable
var m = rx.exec(s); // Run the regex search
if (m) { // If there is a match...
res = m[1].split('-'); // Split Group 1 value with - and assign to res
res.push(m[2]); // Add Group 2 value to the resulting array
}
console.log(s, "=>", res);
}
The pattern - ^[^:]+:([a-z]+(?:-[a-z]+)*)([\d.]+)$ - will match the following:
^ - start of string
[^:]+ - 1+ chars other than :
: - a colon
([a-z]+(?:-[a-z]+)*) - Group 1 (it will be abc-def-ghij...): 1 or more letters followed with 0+ consecutive sequences of - and 1+ letters (add /i modifier to make the pattern case insensitive)
([\d.]+) - Group 2 (it can simply be pushed into the resulting array as m[2]): 1 or more digits or . chars
$ - end of string.
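The same match-then-split idea can be wrapped in a small function and checked against the sample from the question. This is only a sketch: extractParts is a hypothetical helper name, and returning an empty array on no match is an assumption.

```javascript
// Hypothetical helper; the regex is the one described above.
function extractParts(input) {
  var m = /^[^:]+:([a-z]+(?:-[a-z]+)*)([\d.]+)$/.exec(input);
  if (!m) return [];                     // assumption: no match yields []
  return m[1].split('-').concat(m[2]);   // letter groups, then the number
}

console.log(extractParts('dn1:pts-sc1.1')); // ["pts", "sc", "1.1"]
console.log(extractParts('dn2.33:sc3.43')); // ["sc", "3.43"]
console.log(extractParts('no match here')); // []
```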

Why isn't this group capturing all items that appear in parentheses?

I'm trying to create a regex that will capture a string not enclosed by parentheses in the first group, followed by any amount of strings enclosed by parentheses.
e.g.
2(3)(4)(5)
Should be: 2 - first group, 3 - second group, and so on.
What I came up with is this regex: (I'm using JavaScript)
([^()]*)(?:\((([^)]*))\))*
However, when I enter a string like A(B)(C)(D), I only get the A and D captured.
https://regex101.com/r/HQC0ib/1
Can anyone help me out on this, and possibly explain where the error is?
Since you cannot use a \G anchor in a JS regex (to match consecutive matches), and there is no per-group capture stack as in the .NET or PyPI regex libraries, you need a two-step approach: 1) match each streak of text as a whole, then 2) post-process it to get the required values.
var s = "2(3)(4)(5) A(B)(C)(D)";
var rx = /[^()\s]+(?:\([^)]*\))*/g;
var res = [], m;
while(m=rx.exec(s)) {
res.push(m[0].split(/[()]+/).filter(Boolean));
}
console.log(res);
I added \s to the negated character class [^()] since I added the examples as a single string.
Pattern details
[^()\s]+ - 1 or more chars other than (, ) and whitespace
(?:\([^)]*\))* - 0 or more sequences of:
\( - a (
[^)]* - 0+ chars other than )
\) - a )
The splitting regex is [()]+ that matches 1 or more ) or ( chars, and filter(Boolean) removes empty items.
You cannot have an undetermined number of capture groups. The number of capture groups you get is determined by the regular expression, not by the input it parses. A capture group that occurs within another repetition will indeed only retain the last of those repetitions.
If you know the maximum number of repetitions you can encounter, then just repeat the pattern that many times, and make each of them optional with a ?. For instance, this will capture up to 4 items within parentheses:
([^()]*)(?:\(([^)]*)\))?(?:\(([^)]*)\))?(?:\(([^)]*)\))?(?:\(([^)]*)\))?
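A quick sketch of how that fixed-size pattern behaves, including its limit. The captureItems name is hypothetical; unused optional groups come back as undefined and are filtered out here.

```javascript
var rx = /([^()]*)(?:\(([^)]*)\))?(?:\(([^)]*)\))?(?:\(([^)]*)\))?(?:\(([^)]*)\))?/;

// Hypothetical helper: returns the groups that actually participated.
function captureItems(input) {
  var m = rx.exec(input); // never null: every part of the pattern is optional
  return m.slice(1).filter(function (g) { return g !== undefined; });
}

console.log(captureItems('A(B)(C)(D)'));       // ["A", "B", "C", "D"]
console.log(captureItems('2(3)'));             // ["2", "3"]
// A fifth parenthesized item is silently ignored:
console.log(captureItems('A(1)(2)(3)(4)(5)')); // ["A", "1", "2", "3", "4"]
```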
It's not an error. It's just that in regex, when you repeat a capture group (...)*, only the last occurrence will be put in the backreference.
For example:
On a string "a,b,c,d", if you match /(,[a-z])+/ then the back reference of capture group 1 (\1) will give ",d".
If you want it to return more, then you could surround it in another capture group.
--> With /((?:,[a-z])+)/ then \1 will give ",b,c,d".
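The two cases can be checked directly in a console. In this sketch the variable names are illustrative, and the final split step is an addition showing one way to get the individual items back.

```javascript
// Repeated capture group: only the last iteration is kept in group 1.
var lastOnly = 'a,b,c,d'.match(/(,[a-z])+/);

// Outer capture group around a repeated non-capturing group: the whole run.
var wholeRun = 'a,b,c,d'.match(/((?:,[a-z])+)/);

console.log(lastOnly[1]); // ",d"
console.log(wholeRun[1]); // ",b,c,d"

// Splitting the captured run recovers the separate items.
var items = wholeRun[1].split(',').filter(Boolean);
console.log(items); // ["b", "c", "d"]
```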
To get those numbers between the parentheses you could also just try to match the word characters.
For example:
var str = "2(3)(14)(B)";
var matches = str.match(/\w+/g);
console.log(matches);

How to use RegEx to ignore the first period and match all subsequent periods?

How to use RegEx to ignore the first period and match all subsequent periods?
For example:
1.23 (no match)
1.23.45 (matches the second period)
1.23.45.56 (matches the second and third periods)
I am trying to limit users from entering invalid numbers. So I will be using this RegEx to replace matches with empty strings.
I currently have /[^.0-9]+/ but it is not enough to disallow . after an (optional) initial .
Constrain the number between the start ^ and end anchor $, then specify the number pattern you require. Such as:
/^\d+\.?\d*$/
This allows 1 or more digits, followed by an optional period, then optional digits.
I suggest using a regex that will match 1+ digits, a period, and then any number of digits and periods capturing these 2 parts into separate groups. Then, inside a replace callback method, remove all periods with an additional replace:
var ss = ['1.23', '1.23.45', '1.23.45.56'];
var rx = /^(\d+\.)([\d.]*)$/;
for (var s of ss) {
var res = s.replace(rx, function($0,$1,$2) {
return $1+$2.replace(/\./g, '');
});
console.log(s, "=>", res);
}
Pattern details:
^ - start of string
(\d+\.) - Group 1 matching 1+ digits and a literal .
([\d.]*) - Group 2 matching zero or more chars that are digits or literal dots
$ - end of string.
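The pattern above requires at least one digit before the first period. Since the question mentions an optional initial period, here is an alternative sketch that does not anchor on leading digits: it keeps whichever period comes first and removes the rest. keepFirstPeriod is a hypothetical helper name, not part of the answer above.

```javascript
// Hypothetical helper: keeps the first period and drops all later ones.
function keepFirstPeriod(input) {
  var first = input.indexOf('.'); // position of the period to keep (-1 if none)
  return input.replace(/\./g, function (match, offset) {
    return offset === first ? match : '';
  });
}

console.log(keepFirstPeriod('1.23'));       // "1.23"
console.log(keepFirstPeriod('1.23.45'));    // "1.2345"
console.log(keepFirstPeriod('1.23.45.56')); // "1.234556"
console.log(keepFirstPeriod('.5.5'));       // ".55"
```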

Replace last character of a matched string using regex

I need to post-process lines in a file by replacing the last character of a string matching a certain pattern.
The string in question is:
BRDA:2,1,0,0
I'd like to replace the last digit, changing it from 0 to 1. The second and third digits are variable, but the strings I want to affect always start with BRDA:2.
I know I can match the entire string using regex like so
/BRDA:2,\d,\d,1
How would I get at that last digit for performing a replace on?
Thanks
You may match and capture the parts of the string you want to keep with capturing groups, so that you can restore them in the replacement with backreferences. The part you need to replace or remove should be matched but not captured.
So, use
var s = "BRDA:2,1,0,0"
s = s.replace(/(BRDA:2,\d+,\d+,)\d+/, "$11")
console.log(s)
If you need to match the whole string you also need to wrap the pattern with ^ and $:
s = s.replace(/^(BRDA:2,\d+,\d+,)\d+$/, "$11")
Details:
^ - a start of string anchor
(BRDA:2,\d+,\d+,) - Capturing group #1 matching:
BRDA:2, - a literal substring BRDA:2,
\d+ - 1 or more digits
, - a comma
\d+, - 1+ digits and a comma
\d+ - 1+ digits.
The replacement, $11, is parsed by the JS regex engine as the $1 backreference to the value kept in Group 1 followed by a literal 1, because the pattern contains only one capturing group.
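If the $11 form looks too easy to misread, two alternatives avoid the digit ambiguity altogether. This is a sketch, not part of the original answer: the group name prefix is made up for illustration, and named groups need an ES2018+ engine.

```javascript
var s = 'BRDA:2,1,0,0';

// Replacer function: the captured prefix arrives as an argument.
var viaCallback = s.replace(/^(BRDA:2,\d+,\d+,)\d+$/, function (match, prefix) {
  return prefix + '1';
});

// Named group: "$<prefix>" cannot be confused with a following digit.
var viaNamedGroup = s.replace(/^(?<prefix>BRDA:2,\d+,\d+,)\d+$/, '$<prefix>1');

console.log(viaCallback);   // "BRDA:2,1,0,1"
console.log(viaNamedGroup); // "BRDA:2,1,0,1"
```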

Regex split string of numbers at finding of Alpha Characters

OK Regex is one of the most confusing things to me. I'm trying to do this in Javascript. I have a search field that the user will enter a series of characters. Codes are either:
999MC111
or just
999MC
There are ALWAYS 2 alpha characters, BUT there may be 1-4 digits at the front and sometimes 1-4 digits at the end.
If the code ENDS with the Alpha characters, then I run a certain ajax script. If there are Numbers + 2 letters + numbers....it runs a different ajax script.
My struggle is I know \d is for 2 digits....but it may not always be 2 digits.
So what would my regex be to split this into an array or something?
I think the correct regex would be (/^([0-9]+)([a-zA-z]+)([0-9]+)$/
But how do I make sure there are ONLY 2 alpha characters in the middle?
Thanks
You could use the regex /\d$/ to determine if it ends with a digit.
\d matches a digit character, and $ matches the end of the string. The / characters enclose the expression.
Try running this in your javascript console, line by line.
var values = ['999MC111', '999MC', '999XYZ111']; // some test values
// does it end in digits?
!!values[0].match(/\d$/); // evaluates to true
!!values[1].match(/\d$/); // evaluates to false
To specify an exact number of repetitions you must use braces {}: if you know that there are 2 alphabetic characters you put {2}, and if there could be 0-4 digits you put {0,4}. The letter class should be written [a-zA-Z], because the range A-z also includes the punctuation characters that sit between Z and a in ASCII.
^([0-9]{0,4})([a-zA-Z]{2})([0-9]{0,4})$
The above RegEx evaluates as follows:
999MC ---> TRUE
999MC111 ---> TRUE
999MAC111 ---> FALSE
MC ---> TRUE
Capturing groups are created by wrapping subexpressions in parentheses.
As you can see in the following link:
http://regexr.com?2vfhv
you obtain this:
3 capturing groups:
group 1: ([0-9]{0,4})
group 2: ([a-zA-Z]{2})
group 3: ([0-9]{0,4})
The regex /^\d{1,4}[a-zA-Z]{2}\d{0,4}$/ matches a series of 1-4 digits, followed by a series of 2 alpha characters, followed by another series of 0-4 digits.
This regex: /^\d{1,4}[a-zA-Z]{2}$/ matches a series of 1-4 digits, followed only by 2 alpha characters.
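Adding capturing groups to the first of these regexes lets one call both validate the code and tell the two cases apart. In this sketch parseCode and the returned property names are hypothetical, and which ajax script to run for each case is left to the caller.

```javascript
// Hypothetical helper: returns the parts of a code, or null if it is invalid.
function parseCode(code) {
  var m = /^(\d{1,4})([a-zA-Z]{2})(\d{0,4})$/.exec(code);
  if (!m) return null;
  return {
    lead: m[1],            // 1-4 leading digits
    letters: m[2],         // exactly 2 letters
    trail: m[3],           // 0-4 trailing digits ('' when absent)
    hasTrail: m[3] !== ''  // false when the code ends with the letters
  };
}

console.log(parseCode('999MC111')); // { lead: "999", letters: "MC", trail: "111", hasTrail: true }
console.log(parseCode('999MC'));    // { lead: "999", letters: "MC", trail: "", hasTrail: false }
console.log(parseCode('999MAC111')); // null
```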
OK, so I didn't really care about the middle 2 characters; all that really mattered was the first set of numbers and the last set of numbers (if any).
So essentially I just needed to deal with digits. So I did this:
var lead = '123mc444'; //For example purposes
var regex = /(\d+)/g;
var result = (lead.match(regex));
var memID = result[0]; //First set of numbers is member id
if(result[1] != undefined) {
var leadID = result[1];
}
