jQuery Regex - Finding All Joined Words - javascript

jQuery Regex
/((\b([a-zA-Z]{0,15})\b)([^a-z0-9\$_]))/g
My Attempt So Far: https://regex101.com/r/d3VUpG/1
Example test string:
(options.method==="
|options.method==="
=options.method==="HEAD"
options.method.options.method==="HEAD"
What I'm Trying To Achieve
Any run of dot-connected words should be returned as $1, for example:
options.method - would be $1
options.method.options.method - would also be $1
Question
How can I find all words connected with a dot (.) so that I can wrap them in a span, as in the example below?
.replace(//gi,'<span class="join">$1</span>')

You can use the following expression:
/((?:\w+\.)+\w+)/g
Explanation:
( - Start of capturing group 1
(?: - Start of a non-capturing group
\w+\. - Match [a-zA-Z0-9_] characters one or more times followed by a literal . character
)+ - End of the non-capturing group; match the group one or more times
\w+ - Match [a-zA-Z0-9_] characters one or more times
) - End of capturing group 1
In other words, the non-capturing group, (?:\w+\.)+, matches a substring like options. one or more times, and the final \w+ matches the last word, which has no literal . character after it. Since a single capturing group wraps everything, you can wrap your span tag around the first group, $1.
Live Example
string.replace(/((?:\w+\.)+\w+)/g, '<span class="join">$1</span>');
As mentioned above, \w includes underscore, numbers and letters ([a-zA-Z0-9_]), so if you only want to match letter characters, then you could swap out \w with [a-z] and use the case-insensitive flag:
/((?:[a-z]+\.)+[a-z]+)/gi
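As a minimal sketch, the replacement can be run against the test strings from the question. The helper name wrapJoinedWords is made up for illustration; the regex and the span markup are the ones given above.

```javascript
// Hypothetical helper: wraps every run of dot-connected words in a span.
function wrapJoinedWords(text) {
  return text.replace(/((?:\w+\.)+\w+)/g, '<span class="join">$1</span>');
}

// Test strings taken from the question.
const samples = [
  '(options.method==="',
  '=options.method==="HEAD"',
  'options.method.options.method==="HEAD"',
];

for (const sample of samples) {
  console.log(wrapJoinedWords(sample));
}
```

Because \w also matches digits, a decimal such as 3.14 would be wrapped as well; the [a-z] variant avoids that.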

Related

How do i write a RegEx that starts reading from behind?

I have a series of words I try to capture.
I have the following problem:
The string ends with a fixed set of words
It is not clearly defined how many words the string consists of. However, it should capture all words that start with an uppercase letter (German language). Therefore, the left anchor should be the first word starting with a lowercase letter.
Example (bold is what I try to capture):
I like Apple Bananas And Cars.
building houses Might Be Salty + Hard said Jessica.
This is the RegEx I have tried so far; it only works if the "non-capture" string does not include any uppercase words:
/(?:[a-zäöü]*)([\p{L} +().&]+[Cars|Hard])/gu
You might start the match with an uppercase character, allowing German uppercase chars as well, and then optionally repeat matching either words that start with an uppercase character or a "special" character.
Then end the match with an alternation matching either Hard or Cars.
(?<!\S)[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]*(?:\s+(?:[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]*|[+()&]))*\s+(?:Hard|Cars)\b
Explanation
(?<!\S) Assert a whitespace boundary to the left to prevent starting the match after a non whitespace char
[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]* Match a word that starts with an uppercase char
(?: Non capture group to match as a whole part
\s+ Match 1+ whitespace chars
(?: Non capture group
[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]* Match a word that starts with uppercase
| Or
[+()&] Match one of the "special" chars
) Close the non capture group
)* Close the non capture group and optionally repeat it
\s+ Match 1+ whitespace chars
(?:Hard|Cars) Match one of the alternatives
\b A word boundary to prevent a partial word match
See a regex demo.
Use \p{Lu} for uppercase letters:
(?:[\p{Lu}+()&][\p{L}+()&]* )+(?:Cars|Hard)
See live demo (showing matching umlauted letters and ß).
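Both patterns can be tried on the two example sentences from the question. This is only a sketch: the helper firstMatch is a made-up name, and it assumes a JavaScript engine that supports lookbehind and Unicode property escapes.

```javascript
// First pattern from above (requires lookbehind support in the regex engine).
const explicitUpper =
  /(?<!\S)[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]*(?:\s+(?:[A-ZÄÖÜß][a-zA-ZäöüßÄÖÜẞ]*|[+()&]))*\s+(?:Hard|Cars)\b/g;

// Second pattern from above; \p{Lu} and \p{L} need the u flag in JavaScript.
const unicodeUpper = /(?:[\p{Lu}+()&][\p{L}+()&]* )+(?:Cars|Hard)/gu;

// Example sentences from the question.
const sentences = [
  'I like Apple Bananas And Cars.',
  'building houses Might Be Salty + Hard said Jessica.',
];

// Hypothetical helper: returns the first match of a pattern, or null.
function firstMatch(pattern, text) {
  const found = text.match(pattern);
  return found ? found[0] : null;
}

for (const sentence of sentences) {
  console.log(firstMatch(explicitUpper, sentence), '|', firstMatch(unicodeUpper, sentence));
}
```

In both sentences the leading lowercase words are skipped, and the match runs from the first uppercase word of the final run up to Cars or Hard.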

JavaScript Regular Expression to repeat element match in string

Yes, I am a newbie. I have looked online and cannot seem to find the answer to the following; I know it must be simple.
I have a simple string and need to match spaced capitals, e.g. T G D ... repeating.
Secondly, I need to match capitals with a dot between them and no space, e.g. T.G.D ... repeating.
I have the current string = str.match(/ [A-Z] [A-Z] | [A-Z].[A-Z]/g)
but this only matches the first two, e.g. T G. I need it to match the whole pattern wherever it occurs, e.g. T G D E F L ... repeating, as one match.
Likewise, it only matches T.G and nothing after it. I need it to match T.G.D.L.T ... repeating (which may or may not end with a dot).
Any help will be appreciated.
You might use a capturing group matching either a dot or a space in a character class ([. ])
Match the first 2 capitals and capture the space or dot in group 1. Then optionally repeat what is captured using a backreference followed by a capital A-Z.
\b[A-Z]([. ])[A-Z](?:\1[A-Z])*\b
\b Word boundary
[A-Z]([. ])[A-Z] Match A-Z, capture in group 1 matching either space or .
(?:\1[A-Z])* Repeat 0+ times matching what it captured in group 1 followed by A-Z
\b Word boundary
Regex demo
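A short sketch of the backreference approach in JavaScript. The sample text is made up for illustration; the pattern is the one from the answer above.

```javascript
// Pattern from the answer above: the separator captured in group 1
// (a dot or a space) must be reused between all following capitals.
const initials = /\b[A-Z]([. ])[A-Z](?:\1[A-Z])*\b/g;

// Made-up sample text containing both forms.
const text = 'spaced T G D E F L then dotted T.G.D.L.T here';

const found = text.match(initials);
console.log(found);
```

Because the separator is fixed by the backreference, a mixed sequence such as T.G D stops at the point where the separator changes.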

Match everything Between two Characters except when there is a Blank line

I am trying to find a regex pattern that matches everything between one or two dollar signs, \$.*\$|\${2}.*\${2}, except when there is a blank line (it's either two or one, can't be this: \$.*\$\$). Below, I provide examples of what I want to match and what I want to skip. The match should include/exclude everything.
Examples of what I want to match:
$$ \abc + ko$$
$*-ls$
Here the single dollar sign has an escape character before it, so it won't break the match.
$$
654a\$
$$
$123
a*/\
[]{}$
Examples of what I want to exclude:
$$
asd
$$
$asdasd$$
Again, I want to match everything if they are bound by one $ or two $ at each side, unless there is (are) empty line(s) in between.
So far I have figured out how to match the ones occurring on a single line, but I am struggling with how to allow line breaks while excluding matches that contain an empty line.
Here is what I have:
^\${2}.*[^\\$]\${2}$|^\$.*[^\\$]\$$
Demo
You may use
/^[^\S\r\n]{0,3}(\${1,2})(?:(?!\1|^$)[\s\S])+?\1[^\S\r\n]*$/gm
See the regex demo
Details
^ - start of a line (since m makes ^ match line start positions)
[^\S\r\n]{0,3} - zero to three occurrences of any whitespace but CR and LF
(\${1,2}) - Group 1 holding one or two $ chars
(?:(?!\1|^$)[\s\S])+? - any char ([\s\S]), 1 or more occurrences, but as few as possible (due to the lazy +? quantifier), that neither starts the same sequence as captured in Group 1 (\1) nor sits at the start of an empty line (^$)
\1 - the same value as in Group 1 ($ or $$)
[^\S\r\n]* - zero or more occurrences of any whitespace but CR and LF
$ - end of a line (since m makes $ match line end positions)
For your example data, you might use
(?<!\S)(\$\$?+)[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)*(?:\r?\n(?![^\S\r\n]*$)[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)*)*\1(?!\S)
Explanation
(?<!\S) Assert a whitespace boundary on the left
(\$\$?+) Capture group 1, match $ or $$ where the second one is possessive (prevent backtracking)
[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)* Match any char except $ or newline or a $ when not directly followed by another $
(?: Non capture group
\r?\n(?![^\S\r\n]*$) Match a newline, assert not a line consisting of only spaces
[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)* Same pattern as above
)* Close the group and repeat 0+ times
\1 Backreference to what is captured in group 1
(?!\S) Assert a whitespace boundary on the right
Regex demo
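The first pattern can be exercised in JavaScript as sketched below. The second pattern relies on a possessive quantifier (\$?+), which JavaScript regular expressions do not support, so it is left out here. The helper name findDollarSpans and the shortened inputs are assumptions made for illustration.

```javascript
// First pattern from above, written as a JavaScript regex literal.
const dollarSpan = /^[^\S\r\n]{0,3}(\${1,2})(?:(?!\1|^$)[\s\S])+?\1[^\S\r\n]*$/gm;

// Hypothetical helper: returns all delimited spans found in the text.
function findDollarSpans(text) {
  return text.match(dollarSpan) || [];
}

console.log(findDollarSpans('$*-ls$'));            // single line, one $ on each side
console.log(findDollarSpans('$$\n654a\\$\n$$'));   // multi-line, no blank line inside
console.log(findDollarSpans('$$\nasd\n\n$$'));     // blank line between the delimiters
console.log(findDollarSpans('$asdasd$$'));         // mismatched delimiters
```

The first two inputs are returned as one match each, while the last two yield an empty array.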

How to match any string that contains no consecutively repeating letter

My regular expression should match if there aren't any consecutive letters that are the same.
For example:
"ploplir" should match
"ploppir" should not match
So I use this regular expression:
/([.])\1{1,}/
But it does the exact opposite of what I want. How can I make the match work correctly?
Code
See regex in use here
\b(?!\w*(\w)\1)\w+\b
var r = /\b(?!\w*(\w)\1)\w+\b/g
var s = "ploplir ploppir"
console.log(s.match(r))
Explanation
\b Assert position as a word boundary
(?!\w*(\w)\1) Negative lookahead ensuring what follows doesn't match
\w* Match any number of word characters
(\w) Capture a word character into capture group 1
\1 Match the same text as most recently matched by the 1st capture group
\w+ Match one or more word characters
\b Assert position as a word boundary
Maybe you could use lookarounds to check if there are no consecutive letters in the string:
^(?!.*(.)(?=\1)).*$
Explanation
From the beginning of the string ^
A negative look ahead (?!
Which asserts that following .* a character (.) is not followed by the same character (?=\1) using the group reference \1
Close the negative lookahead
Match zero or more characters .*
The end of the string
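A quick check of the lookaround pattern against the two words from the question, as a minimal sketch (the variable name noRepeat is made up):

```javascript
// Pattern from above: matches only if no character is immediately repeated.
const noRepeat = /^(?!.*(.)(?=\1)).*$/;

console.log(noRepeat.test('ploplir')); // no doubled character
console.log(noRepeat.test('ploppir')); // contains "pp"
```

Note that (.) accepts any character, not only letters, so two consecutive spaces or digits would also cause the test to fail.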

Explain this regex js

I'm using this regex to match some strings:
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
I'm confused about why it accepts two dots, like ..
My understanding is that it allows one dash or none: (-)?
Any number of digits, or none: (\d+)?
One dot or none: (\.)?
So why does it accept .. or even .4.6?
Testing done in http://www.regextester.com/
[^\s] means anything that is not a whitespace. This includes dots. Trying to match .. will get you:
[^\s] matches .
(-)? doesn't match
(\d+)? doesn't match
(\.)? matches .
(\d+)? doesn't match
I'll assume you wanted to match numbers (possibly negative/floating):
^-?\d+(\.\d+)?$
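The difference between the two patterns can be seen by testing a few inputs. This sketch assumes the trailing / in the question's regex is just the closing delimiter; the variable names and sample inputs are made up.

```javascript
// Regex from the question (trailing / treated as the delimiter, an assumption).
const original = /^([^\s](-)?(\d+)?(\.)?(\d+)?)$/;
// Stricter pattern suggested above.
const strict = /^-?\d+(\.\d+)?$/;

const inputs = ['..', '.4.6', '12.5', '-3', '5.'];
for (const input of inputs) {
  console.log(input, original.test(input), strict.test(input));
}
```

The original accepts all five inputs because [^\s] consumes the first character whatever it is, while the stricter pattern only accepts 12.5 and -3.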
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
Assert position at the beginning of the string ^
Match the regex below and capture its match into backreference number 1 ([^\s](-)?(\d+)?(\.)?(\d+)?)
Match any single character that is NOT present in the list below and that is NOT a line break character (line feed) [^\s]
A single character from the list “\s” (case sensitive) \s
Match the regex below and capture its match into backreference number 2 (-)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “-” literally -
Match the regex below and capture its match into backreference number 3 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit (0-9) \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Match the regex below and capture its match into backreference number 4 (\.)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “.” literally \.
Match the regex below and capture its match into backreference number 5 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit (0-9) \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Assert position at the very end of the string $
Match the character “/” literally /
Created with RegexBuddy
As I mentioned in my comment, [^\s] is a negated character class that also matches a dot, and as there is another (\.)? pattern, the regex can match 2 consecutive dots (since all of the parts except for [^\s] are optional).
In order not to match strings like .4.5 or .. you just need to add the . to the [^\s] negated character class:
^([^\s.](-)?(\d+)?(\.)?(\d+)?)$
See demo. This will not let any . in the initial capturing group.
You can use a lookahead to only disallow the first character as a dot:
^(?!\.)([^\s](-)?(\d+)?(\.)?(\d+)?)$
See another demo
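Both fixes can be compared side by side. This is a sketch with made-up variable names and sample inputs:

```javascript
// Fix 1 from above: the dot is excluded inside the negated character class.
const excludeDotInClass = /^([^\s.](-)?(\d+)?(\.)?(\d+)?)$/;
// Fix 2 from above: a negative lookahead forbids a leading dot.
const lookaheadGuard = /^(?!\.)([^\s](-)?(\d+)?(\.)?(\d+)?)$/;

const inputs = ['..', '.4.6', '4.6', 'x'];
for (const input of inputs) {
  console.log(input, excludeDotInClass.test(input), lookaheadGuard.test(input));
}
```

Both variants reject .. and .4.6 and accept 4.6. They also still accept a non-numeric first character such as x, because the first character class is not limited to digits.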
A full explanation is available at the online regex testers.
In order to match the numbers in the format you expect, use:
^(?:[-]?\d+\.?\d*|-)$
Human-readable explanation:
^ - start of string and then there are 2 alternatives...
[-]? - optional hyphen
\d+ - 1 or more digits
\.? - optional dot
\d* - 0 or more digits
| -OR-
- - a hyphen
$ - end of string
See demo
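A few inputs run through this last pattern, as a minimal sketch (variable names and sample inputs are made up):

```javascript
// Pattern from above: an optional hyphen, digits, an optional dot and
// optional fraction digits, or a lone hyphen.
const numberLike = /^(?:[-]?\d+\.?\d*|-)$/;

const inputs = ['12.5', '-3', '5.', '-', '..', '.4.6', '.5'];
for (const input of inputs) {
  console.log(input, numberLike.test(input));
}
```

The pattern accepts partially typed values such as 5. and a lone -, which may be useful when validating while the user is still typing, but it rejects .5 because at least one digit is required before the dot.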
