Placing each matched value in its own capturing group - javascript

I've been at this for too long, trying to figure out how to match a comma-delimited string of values, while breaking apart the values into their own capturing groups. Here are my requirements:
No leading comma
Terms can be alphanumeric, with between 1 and 7 characters
Min: 1 term; Max: unlimited
Unlimited whitespace between terms and commas
No trailing comma
I'm so close, but I can't get every term in the string into its own capture group. Instead, the repeated capturing group keeps only its last matched term in group #1, rather than placing each match into its own group. So here's my example:
abc1234, def5678, ghi9012
I would expect abc1234 to be group #1, def5678 to be group #2, and ghi9012 to be group #3. Instead, using the expression below, I get def5678 in group #1 and ghi9012 in group #2.
/(?:([A-z0-9]{1,7})\s*,\s*)+([A-z0-9]{1,7})/g
Link to RegExr example
I'm pretty sure I haven't set up my capturing/non-capturing groups correctly. Any help would be greatly appreciated.

This should do it for you. With the extraction regex, each value ends up in group 1, already trimmed of surrounding whitespace.
Let me know if you need one for quoted fields.
Note that the extraction regex can't enforce the 1-7 character requirement,
unless the string is validated ahead of time.
Validation regex:
# /^(?!,)(?:(?:(?:^|,)\s*)[a-zA-Z0-9]{1,7}(?:\s*(?:(?=,)|$)))+$/
^
(?! , ) # reject a comma in the very first position
(?:
(?: # start of string or separating comma + optional whitespace
(?: ^ | , )
\s*
)
[a-zA-Z0-9]{1,7} # alpha-num, 1-7 chars
(?: # trailing optional whitespaces
\s*
(?:
(?= , )
| $
)
)
)+
$
Extraction regex:
# /(?:(?:^|,)\s*)([^,]*?)(?:\s*(?:(?=,)|$))/
(?: # leading comma + optional whitespaces
(?: ^ | , )
\s*
)
( [^,]*? ) # (1), non-quoted field
(?: # trailing optional whitespaces
\s*
(?:
(?= , )
| $
)
)
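A repeated capturing group only keeps its last iteration, so a single match cannot hand back one group per term. The usual workaround in JavaScript is to run the extraction regex globally and collect group 1 from each match. Here is a minimal sketch of that idea, assuming an engine with `String.prototype.matchAll`; the `parseTerms` name and the per-term check are illustrative and not part of the answer above.

```javascript
// Extraction regex from the answer, with the g flag so every field is visited.
const EXTRACT = /(?:(?:^|,)\s*)([^,]*?)(?:\s*(?:(?=,)|$))/g;

// Assumed per-term rule taken from the question: alphanumeric, 1 to 7 characters.
const TERM = /^[a-zA-Z0-9]{1,7}$/;

// Hypothetical helper: returns the trimmed terms, or null if any term breaks the rule.
function parseTerms(input) {
  // Group 1 of each match holds one trimmed field.
  const terms = Array.from(input.matchAll(EXTRACT), (m) => m[1]);
  // An empty field (leading or trailing comma, empty input) fails TERM.
  return terms.length > 0 && terms.every((t) => TERM.test(t)) ? terms : null;
}

console.log(parseTerms("abc1234, def5678, ghi9012"));
```

Checking each extracted term separately is one way to get the 1-7 character rule back, since the extraction pattern itself accepts any run of non-comma characters.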

Regex to exclude an entire line match if certain characters found

I'm stuck on the cleanest way to accomplish two bits of regex. Every solution I've come up with so far seems clunky.
Example text
Match: Choose: blah blah blah 123 for 100'ish characters, this matches
NoMatch: Choose: blah blah blah 123! for 100'ish characters?, .this potential match fails for the ! ? and .
The first regex (?:^\w+?:)(((?![.!?]).)*)$ needs to:
Match a line containing any word followed by a : so long as !?. are not found in the same line (the word: will always be at the beginning of a line)
Ideally, match every part of the line from the example EXCEPT Choose:. Matching the whole line is still a win.
The second regex ^(^\w+?:)(?:(?![.!?]).)*$ needs to:
Match a line containing any word followed by a : so long as !?. are not found in the same line (the word: will always be at the beginning of a line)
Match only Choose:
The regex is in a greasemonkey/tampermonkey script.
Use
^\w+:(?:(?!.*[.!?])(.*))?
See proof.
EXPLANATION
NODE       EXPLANATION
^          the beginning of the string
\w+        word characters (a-z, A-Z, 0-9, _) (1 or more times, matching the most amount possible)
:          ':'
(?:        group, but do not capture (optional, matching the most amount possible):
  (?!      look ahead to see if there is not:
    .*     any character except \n (0 or more times, matching the most amount possible)
    [.!?]  any character of: '.', '!', '?'
  )        end of look-ahead
  (        group and capture to \1:
    .*     any character except \n (0 or more times, matching the most amount possible)
  )        end of \1
)?         end of grouping
Does this do what you want?
(?:^\w+:)((?:(?![!?.]).)*)$
What makes you feel that this is clunky?
(?: ... ) non-capturing group
^ start with
\w+: a series of one or more word characters followed by a :
( ... )$ capturing group that continues to the end
(?: ... )* non-capturing group, repeated zero or more times, with
(?! ... ) negative look-ahead: no following character can be
[!?.] either ?, ! or .
. followed by any character
For the first pattern, you could first check that there is no ! ? or . present using a negative lookahead. Then capture in the first group 1+ word chars and : and the rest of the line in group 2.
^(?![^!?.\n\r]*[!?.])(\w+:)(.*)$
^ Start of string
(?! Negative lookahead, assert what is on the right is not
[^!?.\n\r]*[!?.] Match 0+ times any char except the listed ones using a negated character class, then match either ! ? or .
) Close lookahead
(\w+:) Capture group 1, match 1+ word chars and a colon
(.*) Capture group 2, match any char except a newline 0+ times
$ End of string
Regex demo
For the second part, if you want a match only for Choose:, you could use the negative lookahead only without a capturing group.
^(?![^!?.\n\r]*[!?.])\w+:
Regex demo
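Since the regex is meant for a greasemonkey/tampermonkey script, here is a small sketch of how the two lookahead patterns above could be applied to one line at a time in JavaScript. The helper names `restOfLine` and `labelOnly` are made up for illustration.

```javascript
// Pattern 1: group 1 is the "word:" label, group 2 is the rest of the line.
const WITH_REST = /^(?![^!?.\n\r]*[!?.])(\w+:)(.*)$/;
// Pattern 2: matches only the "word:" label.
const LABEL = /^(?![^!?.\n\r]*[!?.])\w+:/;

// Hypothetical helper: text after the label, or null if the line has ! ? or .
function restOfLine(line) {
  const m = WITH_REST.exec(line);
  return m ? m[2] : null;
}

// Hypothetical helper: the label itself, or null if the line has ! ? or .
function labelOnly(line) {
  const m = LABEL.exec(line);
  return m ? m[0] : null;
}

console.log(labelOnly("Choose: blah blah blah 123 for 100'ish characters, this matches"));
```

Both helpers expect a single line; to scan a multi-line block in one pass, the patterns would need the `m` flag so that `^` and `$` anchor at line breaks.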

How to match non-escaped quoted strings and also non-quoted strings?

I have a string that contains single, double, and escaped quotations:
Telling myself 'you are \'great\' ' and then saying "thank you" feels "a \"little\" nice"
I would like a single regex to pull out:
single quoted strings
double quoted strings
strings not in quotes
Expected Result: the following groups
Telling myself
you are \'great\'
and then saying
thank you
feels
a \"little\" nice
Requirements: don't return quotes, and ignore escaped quotes
What I have so far:
Regex #1 to return single and double quotes (source):
((?<![\\])['"])((?:.(?!(?<![\\])\1))*.?)\1
Regex #2 to return non-quoted strings:
((?<![\\])['"]|^).*?((?<![\\])['"]|$)
Problems:
I am unable to make regex #2 put the non-quoted string into a consistent group
I am unable to combine regex #1 and #2 to return all strings in one regex function
How about something like this:
(?<!\\)'(.+?)(?<!\\)'|(?<!\\)"(.+?)(?<!\\)"|(.+?)(?='|"|$)
Demo.
The basic idea is to try the quoted alternatives first, so that whatever is left over is the text that was not enclosed in quotes. All the matched strings (without their quotes) end up in the capturing groups.
Shortened version:
(?<!\\)(['"])(.+?)(?<!\\)\1|(.+?)(?='|"|$)
Demo.
If you don't want to use capturing groups, you may adjust it to work with Lookarounds like the following:
(?<=(?<!\\)').+?(?=(?<!\\)')|(?<=(?<!\\)").+?(?=(?<!\\)")|(?<=^|['"]).+?(?=(?<!\\)['"]|$)
Demo.
Shortened version:
(?<=(?<!\\)(['"])).+?(?=(?<!\\)\1)|(?<=^|['"]).+?(?=(?<!\\)['"]|$)
Demo.
JS version
/(?:"([^"\\]*(?:\\[\S\s][^"\\]*)*)"|'([^'\\]*(?:\\[\S\s][^'\\]*)*)'|([^'"\\]+)|(\\[\S\s]))/
https://regex101.com/r/5xfs7q/1
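To show how the JS version might be driven, here is a rough sketch that walks all matches and takes whichever of the four groups participated. The `splitQuoted` name, the trimming, and the dropping of empty pieces are my own assumptions about the desired output, not something the pattern does by itself.

```javascript
// JS pattern from above, with the g flag added so every piece is visited.
const TOKEN = /(?:"([^"\\]*(?:\\[\S\s][^"\\]*)*)"|'([^'\\]*(?:\\[\S\s][^'\\]*)*)'|([^'"\\]+)|(\\[\S\s]))/g;

// Hypothetical helper: returns quoted and unquoted pieces without the quotes.
function splitQuoted(text) {
  const parts = [];
  for (const m of text.matchAll(TOKEN)) {
    // Exactly one of groups 1-4 is defined for each match.
    const value = (m[1] ?? m[2] ?? m[3] ?? m[4]).trim();
    if (value !== "") parts.push(value);
  }
  return parts;
}

const sample = String.raw`Telling myself 'you are \'great\' ' and then saying "thank you" feels "a \"little\" nice"`;
console.log(splitQuoted(sample));
```

Escaped quotes stay in the output exactly as written (backslash included), which matches the expected result listed in the question.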
PCRE - Pro level, super version ..
(?|(?|\s*((?:[^'"\\]|(?:\\[\S\s][^'"\\]*))+)(?<!\s)\s*|\s+(*SKIP)(*FAIL))|(?<!\\)(?|"([^"\\]*(?:\\[\S\s][^"\\]*)*)"|'([^'\\]*(?:\\[\S\s][^'\\]*)*)')|([\S\s]))
https://regex101.com/r/Tdyd3y/1
This is the cleanest, nicest one I've ever seen.
It trims surrounding whitespace and, thanks to the branch reset groups, returns every result in a single capture group.
Explained
(?| # BReset
(?| # BReset
\s* # Wsp trim
( # (1 start), Non-quoted data
(?:
[^'"\\]
| (?: \\ [\S\s] [^'"\\]* )
)+
) # (1 end)
(?<! \s )
\s* # Wsp trim
| # or,
\s+ (*SKIP) (*FAIL) # Skip intervals with all whitespace
)
|
(?<! \\ ) # Not an escape behind
(?| # BReset
"
( # (1 start), double quoted string data
[^"\\]*
(?: \\ [\S\s] [^"\\]* )*
) # (1 end)
"
| # or,
'
( # (1 start), single quoted string data
[^'\\]*
(?: \\ [\S\s] [^'\\]* )*
) # (1 end)
'
)
|
( [\S\s] ) # (1), Pass through, single char
# Un-balanced " or ' or \ at EOF
)

Livecycle RegExp - trouble with decimal

Within Livecycle, I am validating that the number entered is between 0 and 10, with quarter-hour fractions allowed. With the help of this post, I've written the following.
if (!xfa.event.newText.match(/^(([10]))$|^((([0-9]))$|^((([0-9]))\.?((25)|(50)|(5)|(75)|(0)|(00))))$/))
{
xfa.event.change = "";
};
The problem is that periods are not being accepted. I have tried wrapping the \. in parentheses but that did not work either. The field is a text field with no special formatting, and the code is in the change event.
Yikes, that's a convoluted regex. This can be simplified a lot:
/^(?:10|[0-9](?:\.(?:[27]?5)?0*)?)$/
Explanation:
^ # Start of string
(?: # Start of group:
10 # Either match 10
| # or
[0-9] # Match 0-9
(?: # optionally followed by this group:
\. # a dot
(?:[27]?5)? # either 25, 75 or 5 (also optional)
0* # followed by optional zeroes
)? # As said before, make the group optional
) # End of outer group
$ # End of string
Test it live on regex101.com.
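For what it's worth, the original pattern demands one of the fraction alternatives after the optional dot, so an intermediate value such as `1.` is rejected while the user is still typing, which would explain why periods never get through a change event. The simplified pattern makes everything after the dot optional. A quick sketch for checking that outside Livecycle, with `isAllowedHours` being a made-up name:

```javascript
// Simplified pattern from the answer above.
const QUARTER_HOURS = /^(?:10|[0-9](?:\.(?:[27]?5)?0*)?)$/;

// Hypothetical helper: true if the text typed so far is acceptable.
function isAllowedHours(text) {
  return QUARTER_HOURS.test(text);
}

// Values a user might pass through while typing "9.25".
for (const text of ["9", "9.", "9.2", "9.25"]) {
  console.log(text, isAllowedHours(text));
}
```

Note that `9.2` on its own is not accepted, so in a change event the user cannot type `9.25` one character at a time unless partial fractions are allowed some other way; that may or may not matter for how the field is filled in.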

Javascript Regular expression for currency amount with spaces

I have this regular expression
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
It matches $55.5 just fine, but some of my test data has values like $ 55.5 (that is, with a space after the $ sign).
The answers on this link are not working for me.
Currency / Percent Regular Expression
So, how can I change it to accept the spaces as well?
Try following RegEx:
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
Let me know if it worked!
Demo Here
TLDR:
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
The science bit
Ok, I'm guessing that you didn't construct the original regular expression, so here are the pieces of it, with the addition marked:
^ # match from the beginning of the string
[',",\+,<,>,\(,\*,\-,%]? # optionally one of these symbols
( # start a group
[£,$,€]? # optionally one of these symbols
\s* # <--- NEW ADDITION: optional whitespace (zero or more)
\d+ # then one or more decimal digits
( # start group
[\,,\.] # comma or a dot
\d+ # then one or more decimal digits
)? # group optional (comma/dot and digits or neither)
[£,$,€]? # optionally one of these symbols
\s* # optionally whitespace
[\-,\/,\,,\.,\+]? # optionally one of these symbols
[\/]? # optionally a /
\s* # optionally whitespace
)+ # this whole group one or more times
[',",\+, <,>,\),\*,\-,%]? # optionally one of these symbols
$ # match to the end of the string
Much of this is poking about matching stuff around the currency amount, so you could reduce that.
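A quick way to see the effect of the added `\s*` is to run both versions against the two inputs from the question. This is only a sanity-check sketch; the constant names are mine.

```javascript
// Pattern from the question (no whitespace allowed after the currency sign).
const ORIGINAL = /^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/;

// Same pattern with \s* inserted between the currency sign and the digits.
const AMENDED = /^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/;

for (const amount of ["$55.5", "$ 55.5"]) {
  console.log(amount, ORIGINAL.test(amount), AMENDED.test(amount));
}
```

Both patterns accept `$55.5`; only the amended one accepts `$ 55.5`.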

Regex validation for comma separated string

I need to validate an input string client-side.
Here is an example of the string:
1:30-1:34, 1:20-1:22, 1:30-1:37,
It's basically time codes for a video.
Can this be done with regex?
Banging my head against the wall...
^(?:\b\d+:\d+-\d+:\d+\b(?:, )?)+$
would probably work; at least it matches your example. But you might need to add a few edge cases to make the rules for matching/not matching clearer.
^ # Start of string
(?: # Try to match...
\b # start of a "word" (in this case, number)
\d+ # one or more digits
: # a :
\d+ # one or more digits
- # a dash
\d+ # one or more digits
: # a :
\d+ # one or more digits
\b # end of a "word"
(?:, )? # optional comma and space
)+ # repeat one or more times
$ # until the end of the string
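For client-side use, the pattern drops straight into a JavaScript regex literal. A minimal sketch, with `isValidTimecodes` as a made-up name; note that, as written, the separator must be a comma followed by exactly one space, so a trailing comma only passes if a space follows it.

```javascript
// Pattern from the answer above, unchanged.
const TIMECODES = /^(?:\b\d+:\d+-\d+:\d+\b(?:, )?)+$/;

// Hypothetical helper: true if the whole input is a list of m:ss-m:ss ranges.
function isValidTimecodes(input) {
  return TIMECODES.test(input);
}

console.log(isValidTimecodes("1:30-1:34, 1:20-1:22, 1:30-1:37"));
```

If inputs with other spacing should be accepted, the `(?:, )?` part is the piece to loosen, for example by allowing optional whitespace around the comma.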
The following is a simple representation. I have assumed that the string has the exact same form as you have shown. This may be a good starting point for you. I'll improve the regex if you provide more specific requirements.
([0-9]+:[0-9]{1,2}-[0-9]+:[0-9]{1,2},\s*)+
Explanation (inspired by Tim's answer above)
[0-9]+      # One or more digits
:           # A colon
[0-9]{1,2}  # A single digit or a pair of digits
-           # A dash, followed by a second time in the same form
,           # A comma
\s*         # Optional whitespace
