Regex calling backreferences from all group iterations - javascript

I'm catching international numbers and running a regex to replace the characters people like to put between numbers.
I'm using the below RegEx:
[+]([0-9]{1,3})(([\s\-\.\(\)]*)([0-9]*)([\s\-\.\(\)]*)){1,3}
It works great but when I use a repeated group, it only catches the last iteration. When I use the regex101 site to debug my regular expression, I see:
A repeated capturing group will only capture the last iteration. Put a
capturing group around the repeated group to capture all iterations
I want to take the advice but I'm not sure how I can put a capturing group around the repeated group. See: https://regex101.com/r/pT3cK9/1

As stated in the comments, the simplest way to clean phone numbers would be to define a list of unwanted characters and replace each run of them with a single space:
'+94 (666) 999-5555'.replace(/[ .()-]+/g, ' '); // +94 666 999 5555
'+42 555.123.4567'.replace(/[ .()-]+/g, ' '); // +42 555 123 4567
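If you do need the individual groups, the regex101 advice can be followed roughly like this. This is only a sketch: `phoneParts` is a hypothetical helper name, and the pattern is a simplified variant of the one in the question (each repetition is made non-capturing and requires at least one digit). The outer group wraps the whole repetition, so it holds every iteration, and a second pass pulls out the digit runs.

```javascript
// Hypothetical helper: splits an international number into its digit groups.
function phoneParts(input) {
  // Group 1: country code.
  // Group 2: wraps the whole repeated part, so it contains all iterations
  // instead of only the last one.
  const m = /^[+]([0-9]{1,3})((?:[\s\-.()]*[0-9]+){1,3})/.exec(input);
  if (!m) return null;
  // Second pass: extract each run of digits from the wrapped group.
  const groups = m[2].match(/[0-9]+/g);
  return [m[1], ...groups];
}

console.log(phoneParts('+94 (666) 999-5555')); // [ '94', '666', '999', '5555' ]
console.log(phoneParts('+42 555.123.4567'));   // [ '42', '555', '123', '4567' ]
```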

Related

RegExp capturing non-match

I have a regex for a game that should match strings in the form of go [anything] or [cardinal direction], and capture either the [anything] or the [cardinal direction]. For example, the following would match:
go north
go foo
north
And the following would not match:
foo
go
I was able to do this using two separate regexes: /^(?:go (.+))$/ to match the first case, and /^(north|east|south|west)$/ to match the second case. I tried to combine the regexes to be /^(?:go (.+))|(north|east|south|west)$/. The regex matches all of my test cases correctly, but it doesn't correctly capture for the second case. I tried plugging the regex into RegExr and noticed that even though the first case wasn't being matched against, it was still being captured.
How can I correct this?
Try using the positive lookbehind feature to find the word "go".
(north|east|south|west|(?<=go ).+)$
Note that this solution prevents you from including ^ at the start of the regex, because the text "go" is not actually included in the group.
You have to move the closing parenthesis to the end of the pattern to have both patterns between anchors, or else you would allow a match before one of the cardinal directions and it would still capture the cardinal direction at the end of the string.
Then in the JavaScript you can check for the group 1 or group 2 value.
^(?:go (.+)|(north|east|south|west))$
Regex demo
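A minimal sketch of checking the two groups in JavaScript, assuming the anchored pattern above. `parseCommand` is a hypothetical name used only for illustration. Only one of the two groups takes part in any given match, and the other is `undefined`.

```javascript
// Hypothetical helper: returns the captured target, or null if there is no match.
function parseCommand(input) {
  const m = /^(?:go (.+)|(north|east|south|west))$/.exec(input);
  if (!m) return null;
  // Group 1 is set for "go <anything>", group 2 for a bare cardinal direction.
  return m[1] ?? m[2];
}

console.log(parseCommand('go north')); // north
console.log(parseCommand('go foo'));   // foo
console.log(parseCommand('north'));    // north
console.log(parseCommand('foo'));      // null
console.log(parseCommand('go'));       // null
```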
Using a lookbehind assertion (if supported), you might also get a match only instead of capture groups.
In that case, you can match the rest of the line, asserting go to the left at the start of the string, or match only 1 of the cardinal directions:
(?<=^go ).+|^(?:north|east|south|west)$
Regex demo
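With the lookbehind version there are no groups to inspect, so the whole match is the value you want. A short sketch, assuming an engine with lookbehind support (`targetOf` is a hypothetical name):

```javascript
// Lookbehind variant: the match itself is the target, no capture groups needed.
const commandRe = /(?<=^go ).+|^(?:north|east|south|west)$/;

// Hypothetical helper: returns the matched text, or null if nothing matches.
function targetOf(input) {
  const m = input.match(commandRe);
  return m ? m[0] : null;
}

console.log(targetOf('go north')); // north
console.log(targetOf('north'));    // north
console.log(targetOf('foo'));      // null
```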

How to include dot once as last character in regex?

I'm trying to make regex in JavaScript that matches numbers between 1-100 and includes two decimals.
Example numbers that need to be included in the regex:
1
1.1
1.15
0.5
100
100.00
Example numbers that need to be excluded:
101
100.01
100.1
55.999
This is my regex at the moment:
^(?:100(?:\.00?)?|\d?\d(?:\.\d\d?)?)$
This works otherwise but I also need to include two numbers followed by a decimal or 100 followed by a decimal like this:
14.
9.
100.
This is because I'm running the regex check each time a key is pressed in an input field, and even though a number like 0.5 is allowed by the current regex, I can't type it in because 0. is not allowed.
You will need to use two separate regexps here, one for the live input validation (just what you described in the question, that will let you input allowed values), and another one for a final "on-submit" validation (that will check the validity of the whole input string).
Otherwise, you won't be able to type in values like 0.5.
The regex for live input validation is:
/^(?:100(?:\.0?0?)?|\d\d?(?:\.\d?\d?)?)$/
See this regex demo. Note how the ? quantifiers make patterns optional, especially the \d and 0 after \. patterns.
The regex for final validation of numbers from 0.01 to 100 is:
/^(?!0*(?:\.0*)?$)(?:100(?:\.00?)?|\d?\d?(?:\.\d\d?)?)$/
See this regex demo.
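A rough sketch of how the two patterns might be wired up. Both regexes are copied from this answer; the function names are hypothetical, and how you hook them to your input and submit handlers is up to you.

```javascript
// Live pattern: tolerates unfinished input such as "0." or "100.".
const liveRe = /^(?:100(?:\.0?0?)?|\d\d?(?:\.\d?\d?)?)$/;
// Final pattern: only complete values from 0.01 to 100.
const finalRe = /^(?!0*(?:\.0*)?$)(?:100(?:\.00?)?|\d?\d?(?:\.\d\d?)?)$/;

// Hypothetical helpers: call the first on every keystroke, the second on submit.
const isAcceptableWhileTyping = (value) => liveRe.test(value);
const isValidOnSubmit = (value) => finalRe.test(value);

console.log(isAcceptableWhileTyping('0.'));    // true
console.log(isAcceptableWhileTyping('100.1')); // false
console.log(isValidOnSubmit('0.'));            // false
console.log(isValidOnSubmit('0.5'));           // true
console.log(isValidOnSubmit('100.01'));        // false
```

Note that the live pattern rejects an empty string, so a keystroke handler would probably need to allow clearing the field separately.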
Make the numbers after the dot optional, too.
^(?:100(?:\.00?)?|\d?\d(?:\.\d{0,2})?)$
If you want to disallow 0 before the dot if the numbers after are all zero or nothing, probably include a separate negative lookahead for that:
^(?:100(?:\.00?)?|(?!0\.0*$)\d?\d(?:\.\d{0,2})?)$
Nothing here or in your original regex should disallow 0.5; perhaps add more debugging details if that is genuinely something you are grappling with.
You may use this regex for your task:
^(?:0?\.(?:\d?[1-9]|[1-9]\d)|[1-9]\d?(?:\.\d{1,2})?|100(?:\.0{1,2})?)$
RegEx Demo
With the samples you've shown, please try the following.
^(?:100(?:\.0{1,2})?|(?:(?:\d\d?)(?:\.\d{1,2})?))$
Online demo for above regex
Explanation: Adding detailed explanation for above.
^ ##Matching from starting of value here.
(?: ##Starting the outer non-capturing group.
100(?:\.0{1,2})?| ##Matching 100, optionally followed by a dot and 1 to 2 zeroes, OR
(?: ##Starting the inner non-capturing group.
(?:\d\d?) ##In a non-capturing group, matching one digit followed by an optional second digit.
(?:\.\d{1,2})? ##In an optional non-capturing group, matching a dot followed by 1 or 2 digits.
) ##Closing the inner non-capturing group.
)$ ##Closing the outer non-capturing group at the end of the value.

Regex finding second string

I'm attempting to get the last word in the following strings.
After about 45 minutes I can't seem to find the right combination of slashes, dashes and brackets.
The closest I've got is
/(?![survey])[a-z]+/gi
It matches the following strings, except that for "required" it returns "quired". I assume that's because the letters r and e are in the word survey.
survey[1][title]
survey[1][required]
survey[2][anotherString]
You're using a character set, which will exclude any of the characters from being the first character in the match, which isn't what you want. Using plain negative lookahead would be a start:
(?!survey)[a-z]+
But you also want to match the final word, which can be done by matching word characters that are followed with \]$ - that is, by a ] and the end of the string:
[a-z]+(?=\]$)
https://regex101.com/r/rLvsY5/1
If you want to be more efficient, match the whole string, but capture what comes between the square brackets in a capturing group - the last repeated captured group will be in the result:
survey(?:\[(\w+)\])+
https://regex101.com/r/rLvsY5/2
One way to solve this is to match the full line and only capture the part you need.
survey\[\d+\]\[([a-z]+)\]
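A quick sketch of both approaches in JavaScript, using the sample strings from the question. The helper names are hypothetical, and the `i` flag is added here so that mixed-case words like anotherString match.

```javascript
const inputs = ['survey[1][title]', 'survey[1][required]', 'survey[2][anotherString]'];

// Lookahead approach: letters that are followed by "]" at the end of the string.
function lastWordByLookahead(s) {
  const m = /[a-z]+(?=\]$)/i.exec(s);
  return m ? m[0] : null;
}

// Capture approach: match the full line and capture only the final word.
function lastWordByCapture(s) {
  const m = /survey\[\d+\]\[([a-z]+)\]/i.exec(s);
  return m ? m[1] : null;
}

console.log(inputs.map(lastWordByLookahead)); // [ 'title', 'required', 'anotherString' ]
console.log(inputs.map(lastWordByCapture));   // [ 'title', 'required', 'anotherString' ]
```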

Regex exact match on number, not digit

I have a scenario where I need to find and replace a number in a large string using javascript. Let's say I have the number 2 and I want to replace it with 3 - it sounds pretty straight forward until I get occurrences like 22, 32, etc.
The string may look like this:
"note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2."
I want to turn it into this:
"note[3] 3 3_ someothertext_3 note[32] 3finally_2222 but how about mymomsays3."
Obviously this means .replace('2','3') is out of the picture so I went to regex. I find it easy to get an exact match when I am dealing with string start to end ie: /^2$/g. But that is not what I have. I tried grouping, digit only, wildcards, etc and I can't get this to match correctly.
Any help on how to exactly match a number (where 0 <= number <= 500 is possible, but no constraints needed in regex for range) would be greatly appreciated.
The task is to find (and replace) "single" digit 2, not embedded in
a number composed of multiple digits.
In regex terms, this can be expressed as:
Match digit 2.
Previous char (if any) can not be a digit.
Next char (if any) can not be a digit.
The regex for the first condition is straightforward - just 2.
In other flavours of regex, e.g. PCRE, to forbid the previous
char you could use negative lookbehind, but JavaScript regex only
gained lookbehind in ES2018, so older engines do not support it.
So, to circumvent this, we must:
Put a capturing group matching either start of text or something
other than a digit: (^|\D).
Then put regex matching just 2: 2.
The last condition, fortunately, can be expressed as negative lookahead,
because even Javascript regex support it: (?!\d).
So the whole regex is:
(^|\D)2(?!\d)
Having found such a match, you have to replace it with the content
of the first capturing group and 3 (the replacement digit).
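The steps above could be sketched as follows for an arbitrary number. `replaceNumber` is a hypothetical helper, and it assumes `from` is a plain non-negative integer, so it needs no regex escaping. A replacer function is used instead of a replacement string to avoid any ambiguity between the group reference and the digits that follow it.

```javascript
// Hypothetical helper: replaces the standalone number `from` with `to`.
// Assumes `from` is a non-negative integer (nothing to escape for the regex).
function replaceNumber(text, from, to) {
  // (^|\D)  start of text or a non-digit, captured so it can be put back
  // (?!\d)  the number must not be followed by another digit
  const re = new RegExp(`(^|\\D)${from}(?!\\d)`, 'g');
  return text.replace(re, (match, before) => before + to);
}

const input = 'note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2.';
console.log(replaceNumber(input, 2, 3));
// note[3] 3 3_ someothertext_3 note[32] 3finally_2222 but how about mymomsays3.
```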
You can use negative look-ahead:
(\D|^)2(?!\d)
Replace with: ${1}3 (in a JavaScript replacement string, write this as $13)
If look behind is supported:
(?<!\d)2(?!\d)
Replace with: 3
See regex in use here
(\D|\b)2(?!\d)
(\D|\b) Capture either a non-digit character or a position that matches a word boundary
(?!\d) Negative lookahead ensuring what follows is not a digit
Alternatives:
(^|\D)2(?!\d) # Thanks to @Wiktor in the comments below
(?<!\d)2(?!\d) # At the time of writing works in Chrome 62+
const regex = /(\D|\b)2(?!\d)/g
const str = `note[2] 2 2_ someothertext_2 note[32] 2finally_2222 but how about mymomsays2.`
const subst = "$13"
console.log(str.replace(regex, subst))

Whitespace causing problems in regex to capture addresses

I'm having trouble creating a regex to capture Icelandic home addresses.
Icelandic addresses can have a couple of formats
address 3
address 3b
add-ress
add-ress 2453
ad dr ess
Basically almost any form of a sentence and then an optional number and letter.
I have come up with the following regex.
^(\D+)\s*?(\d+\w*)?
Now this works pretty well except that the \D+ is greedy and always consumes the whitespace between the number and the street/house name.
I've tried many different quantifiers and also tried positive and negative lookarounds without success.
I know I can always trim the whitespace from the address in code after it has been captured, but I want to know if there is any way to do this properly using regex.
I would just group the separating space with the optional number group, but make sure it's excluded from the captured group.
^(\D+)(?:\s+(\d+\w*))?$
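A small sketch of that pattern in use, run against the sample addresses from the question. `parseAddress` and the returned property names are hypothetical.

```javascript
// Hypothetical helper: splits an address into street name and optional house number.
function parseAddress(address) {
  // The separating whitespace sits inside the optional non-capturing group,
  // so it ends up in neither captured group.
  const m = /^(\D+)(?:\s+(\d+\w*))?$/.exec(address);
  if (!m) return null;
  return { street: m[1], number: m[2] ?? null };
}

console.log(parseAddress('address 3'));     // { street: 'address', number: '3' }
console.log(parseAddress('address 3b'));    // { street: 'address', number: '3b' }
console.log(parseAddress('add-ress 2453')); // { street: 'add-ress', number: '2453' }
console.log(parseAddress('ad dr ess'));     // { street: 'ad dr ess', number: null }
```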
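Instead of "one or more non-digits" (\D+), you want "one or more non-digits, of which the last one is also non-whitespace", i.e. "zero or more non-digits, plus one non-whitespace–non-digit" (\D*[^\d\s]). Keep the $ anchor at the end: without it, the lazy \s*? matches nothing and the optional number group is simply skipped.
^(\D*[^\d\s])\s*?(\d+\w*)?$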
