Match everything Between two Characters except when there is a Blank line - javascript

I am trying to find a regex pattern that matches everything between one or two dollar signs, \$.*\$|\${2}.*\${2}, except when there is a blank line (it's either two or one, can't be this: \$.*\$\$). Below, I provide examples of what I want to match and what I want to skip. The match should include/exclude everything.
Examples of what I want to match:
$$ \abc + ko$$
$*-ls$
Here the single dollar sign has an escape character before it, so it won't break the match.
$$
654a\$
$$
$123
a*/\
[]{}$
Examples of what I want to exclude:
$$
asd
$$
$asdasd$$
Again, I want to match everything if they are bound by one $ or two $ at each side, unless there is (are) empty line(s) in between.
So far I have figured out how to match the ones that occur on a single line, but I am struggling to include line breaks while excluding matches that contain an empty line.
Here is what I have:
^\${2}.*[^\\$]\${2}$|^\$.*[^\\$]\$$
Demo

You may use
/^[^\S\r\n]{0,3}(\${1,2})(?:(?!\1|^$)[\s\S])+?\1[^\S\r\n]*$/gm
See the regex demo
Details
^ - start of a line (since m makes ^ match line start positions)
[^\S\r\n]{0,3} - zero to three occurrences of any whitespace but CR and LF
(\${1,2}) - Group 1 holding one or two $ chars
(?:(?!\1|^$)[\s\S])+? - any char ([\s\S]), 1 or more occurrences, but as few as possible (due to the lazy +? quantifier), where each char must not start the same sequence as captured in Group 1 (\1) and must not sit at an empty-line position (^$, a line start immediately followed by a line end)
\1 - the same value as in Group 1 ($ or $$)
[^\S\r\n]* - zero or more occurrences of any whitespace but CR and LF
$ - end of a line (since m makes $ match line end positions)
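As a quick check, the pattern can be run against samples like the ones in the question. This is only a sketch: the helper name findDollarBlocks and the sample strings (in particular the blank line placed inside the excluded sample) are assumptions made for illustration.

```javascript
// Pattern from the answer above: g finds every match, m makes ^ and $ work per line.
const dollarBlock = /^[^\S\r\n]{0,3}(\${1,2})(?:(?!\1|^$)[\s\S])+?\1[^\S\r\n]*$/gm;

// Hypothetical helper: returns all matched blocks, or an empty array if none.
function findDollarBlocks(text) {
  return text.match(dollarBlock) || [];
}

// Samples that should match (single-line and multi-line, no blank line inside).
console.log(findDollarBlocks('$$ \\abc + ko$$').length);        // 1
console.log(findDollarBlocks('$*-ls$').length);                 // 1
console.log(findDollarBlocks('$$\n654a\\$\n$$').length);        // 1
console.log(findDollarBlocks('$123\na*/\\\n[]{}$').length);     // 1

// Samples that should be skipped: a blank line inside, or mismatched delimiters.
console.log(findDollarBlocks('$$\n\nasd\n$$').length);          // 0
console.log(findDollarBlocks('$asdasd$$').length);              // 0
```

String.prototype.match with a global regex returns the full matches only, which is enough here since the whole block is what should be kept or skipped.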

For your example data, you might use
(?<!\S)(\$\$?+)[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)*(?:\r?\n(?![^\S\r\n]*$)[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)*)*\1(?!\S)
Explanation
(?<!\S) Assert a whitespace boundary on the left
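(\$\$?+) Capture group 1, match $ or $$ where the second one is possessive (prevent backtracking). Note that JavaScript's RegExp does not support possessive quantifiers, so this pattern needs a PCRE-style engine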
[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)* Match any char except $ or newline or a $ when not directly followed by another $
(?: Non capture group
\r?\n(?![^\S\r\n]*$) Match a newline, assert not a line consisting of only spaces
[^\r\n$]*(?:\$(?!\$)[^\r\n$]*)* Same pattern as above
)* Close the group and repeat 0+ times
\1 Backreference to what is captured in group 1
(?!\S) Assert a whitespace boundary on the right
Regex demo

Related

RegExp avoid double space and space before characters

I'm trying to write a regular expression in order to not allow double spaces anywhere in a string, and also force a single space before a MO or GO mandatory, with no space allowed at the beginning and at the end of the string.
Example 1 : It is 40 GO right
Example 2 : It is 40GO wrong
Example 3 : It is 40  GO wrong
Here's what I've done so far ^[^ ][a-zA-Z0-9 ,()]*[^;'][^ ]$, which prevents spaces at the beginning and at the end, and also the ";" character. This one works like a charm.
My issue is not allowing double spaces anywhere in the string, and also forcing spaces right before MO or GO characters.
After a few hours of research, I've tried these (starting from the previous RegExp I wrote):
To prevent the double spaces: ^[^ ][a-zA-Z0-9 ,()]*((?!.* {2}).+)[^;'][^ ]$
To force a single space before MO: ^[^ ][a-zA-Z0-9 ,()]*(?=\sMO)*[^;'][^ ]$
But neither of the last two actually work. I'd be thankful to anyone that helps me figure this out
The lookahead (?!.* {2}) can be omitted. Instead, start and end the match with a non-whitespace character, and put a single space inside an optionally repeated group.
If the string cannot contain ' or ;, note that [^;'][^ ]$ only requires that the second-to-last character is not one of them.
You can omit that part anyway, as the character class [a-zA-Z0-9,()] does not match ; or '.
Note that character classes like [^ ] and [^;'] each consume exactly one character, so the pattern you tried imposes a minimum length.
Instead, you can rule out the presence of GO or MO preceded by a non whitespace character.
^(?!.*\S[MG]O\b)[a-zA-Z0-9,()]+(?: [a-zA-Z0-9,()]+)*$
The pattern matches:
^ Start of string
(?!.*\S[MG]O\b) Negative lookahead, assert that there is no non-whitespace character followed by either MO or GO to the right. The word boundary \b prevents a partial word match
[a-zA-Z0-9,()]+ Start the match with 1+ occurrences of any of the listed characters (Note that there is no space in it)
(?: [a-zA-Z0-9,()]+)* Optionally repeat the same character class with a leading space
$ End of string
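A small sketch of how this pattern behaves on the examples from the question. The function name isValidLabel is hypothetical, and the double-space sample is an assumption about what the "wrong" example was meant to show.

```javascript
// Pattern from the answer above.
const unitSpacing = /^(?!.*\S[MG]O\b)[a-zA-Z0-9,()]+(?: [a-zA-Z0-9,()]+)*$/;

// Hypothetical helper wrapping the test.
function isValidLabel(text) {
  return unitSpacing.test(text);
}

console.log(isValidLabel('It is 40 GO'));   // true: single spaces, space before GO
console.log(isValidLabel('It is 40GO'));    // false: GO directly after a non-whitespace char
console.log(isValidLabel('It is 40  GO'));  // false: double space
console.log(isValidLabel(' It is 40 GO'));  // false: leading space
console.log(isValidLabel('It is 40 GO '));  // false: trailing space
```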
Regex demo

Regular Expression Strict Test

I am trying to create a regex where the user has to enter exactly the expected value, nothing more and nothing less.
Here is my regex:
/[a-zA-Z0-9][a-zA-Z0-9\-]*\.myshopify\.com/
When I test this with, for example, myshop.myshopify.coma, it returns true. The input myshop.myshopify.com myshop123.myshopify.com also returns true.
What I am trying to achieve is that if the user enters myshop.myshopify.coma or myshop321.myshopify.com myshop123.myshopify.coma, it should not match.
It should only match when the entire input looks exactly like this: [anything except ()=>%$ etc].myshopify.com
What should I include in my regex to strictly test for exactly one thing?
You can use boundary-type assertions to match the beginning (^) and the end ($) of the input, to make sure the input matches fully.
const pattern = /^[a-zA-Z0-9][a-zA-Z0-9\-]*\.myshopify\.com$/
console.log(pattern.test('myshop.myshopify.com')) // true
console.log(pattern.test('myshop.myshopify.coma')) // false
console.log(pattern.test('myshop.myshopify.com myshop123.myshopify.com')) // false
You'd currently allow for input like "A---", so besides the good point about start and end line anchors, you'd maybe want to reconsider your pattern. Maybe something like:
^[a-z\d]+(?:-[a-z\d]+)*\.myshopify\.com$
See the online demo
^ - Start line anchor.
[a-z\d]+ - 1+ any alnum character.
(?: - Open non-capture group:
-[a-z\d]+ - A literal hyphen followed by 1+ alnum chars.
)* - Close non-capture group and match it zero or more times.
\.myshopify\.com - Match a ".myshopify.com" literally.
$ - End line anchor.
A 2nd option would be to use a negative lookahead to achieve the same concept:
^(?!-|.*-[-.])[a-z\d-]+\.myshopify\.com$
See the online demo
^ - Start line anchor.
(?! - Negative lookahead for:
- - A leading hyphen
| - Or:
.*-[-.] - Any character other than newline zero or more times, up to a hyphen followed by either another hyphen or a literal dot.
) - Close negative lookahead.
[a-z\d-]+ - 1+ alnum or hyphen characters.
\.myshopify\.com - Match a ".myshopify.com" literally.
$ - End line anchor.
In both cases I used the case-insensitive flag: /<pattern>/i. The demos also set the global flag, but it is left out here because a global regex keeps state in lastIndex between test() calls. See a sample below:
const patt1 = /^[a-z\d]+(?:-[a-z\d]+)*\.myshopify\.com$/i
console.log(patt1.test('myshop.myshopify.com')) // true
console.log(patt1.test('myshop-.myshopify.com')) // false, hyphen right before the dot
const patt2 = /^(?!-|.*-[-.])[a-z\d-]+\.myshopify\.com$/i
console.log(patt2.test('myshop.myshopify.com')) // true
console.log(patt2.test('myshop-.myshopify.com')) // false, rejected by the lookahead
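A note on combining the g flag with test(): a global regex remembers where the last match ended (lastIndex) and starts the next test() from there, so repeated calls on the same regex object can alternate between true and false. The sketch below illustrates this; the variable names are made up for the example.

```javascript
// Same pattern, once with the g flag and once without.
const withGlobal = /^[a-z\d]+(?:-[a-z\d]+)*\.myshopify\.com$/gi;
const withoutGlobal = /^[a-z\d]+(?:-[a-z\d]+)*\.myshopify\.com$/i;

const shop = 'myshop.myshopify.com';

// With g, a successful test() leaves lastIndex at the end of the match.
const firstCall = withGlobal.test(shop);       // true
const indexAfterFirst = withGlobal.lastIndex;  // 20, the length of the input
const secondCall = withGlobal.test(shop);      // false: the search started at index 20
const thirdCall = withGlobal.test(shop);       // true again: lastIndex was reset to 0

// Without g, every call starts from the beginning of the input.
const stableCalls = [withoutGlobal.test(shop), withoutGlobal.test(shop)];

console.log(firstCall, indexAfterFirst, secondCall, thirdCall, stableCalls);
```

For pure validation, leaving out g (or creating a fresh regex per call) avoids the surprise.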

Regular expression to match line separated size strings

I am writing a regular expression to validate an input string, which is a line-separated list of sizes ([width]x[height]).
Valid input example:
300x200
50x80
100x100
The regular expression I initially came up with is (https://regex101.com/r/H9JDjA/1):
^(\d+x\d+[\r\n|\r|\n]*)+$
This regular expression matches my input but also matches this invalid input (size can't be 100x100x200):
300x200
50x80
100x100x200
Adding a word boundary at the end seems to have fixed this issue:
^(\d+x\d+[\r\n|\r|\n]*\b)+$
My questions:
Why does the initial regular expression without the word boundary fail? It looks like I am matching one or more instances of a \d+(number), followed by character 'x', followed by a \d+(number), followed by one or more new lines from various operating systems.
How can I validate input that has multiple trailing newline characters? The following doesn't work for inputs like this:
500x500\n100x100\n\n\n384384
^(\d+x\d+[\r\n|\r|\n]\b)+|[\r\n|\r|\n]$
Isolate the problem with this target 100x100x200
For now, forget about the anchors in the regex.
The minimum regex is \d+x\d+ since it only has to be satisfied once
for a match to take place.
The maximum is something like this \d+x\d+ (?: (?:\r?\n | \r)* \d+x\d+ )*
Since \r?\n|\r is optional, it can be reduced to this \d+x\d+ (?: \d+x\d+ )*
The result, when you applied to the target string is:
100x100x200 matches.
But, since you've anchored the regex ^$, it is forced to break up
the middle 100 to make it match.
100x10 from \d+x\d+
0x200 from (?: \d+x\d+ )*
So, that is why the first regex seemingly matches 100x100x200.
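The split described above can be observed directly in JavaScript, because a repeated capturing group keeps the text of its last repetition. A minimal sketch (the variable names are only illustrative):

```javascript
// The question's first regex, unchanged.
const firstAttempt = /^(\d+x\d+[\r\n|\r|\n]*)+$/;

const result = '100x100x200'.match(firstAttempt);

// The whole input matches...
console.log(result[0]); // '100x100x200'
// ...and group 1 holds the last repetition, which starts inside the middle number.
console.log(result[1]); // '0x200'
```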
To avoid all of that, just require a line break between them, and
make the trailing linebreaks optional (if you need to validate the whole
string, otherwise leave it and the end anchor off).
^\d+x\d+(?:(?:\r?\n|\r)+\d+x\d+)*(?:\r?\n|\r)*$
A better view of it
^
\d+ x \d+
(?:
(?: \r? \n | \r )+
\d+ x \d+
)*
(?: \r? \n | \r )*
$
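Here is a sketch of that regex used as a whole-string validator in JavaScript. The function name isSizeList is an assumption for the example; the inputs are taken from the question.

```javascript
// Sizes separated by one or more line breaks, optional trailing line breaks.
const sizeList = /^\d+x\d+(?:(?:\r?\n|\r)+\d+x\d+)*(?:\r?\n|\r)*$/;

// Hypothetical helper: true only if the entire input is a valid list.
function isSizeList(input) {
  return sizeList.test(input);
}

console.log(isSizeList('300x200\n50x80\n100x100'));          // true
console.log(isSizeList('300x200\n50x80\n100x100x200'));      // false
console.log(isSizeList('500x500\n100x100\n\n\n'));           // true: trailing newlines allowed
console.log(isSizeList('500x500\n100x100\n\n\n384384'));     // false: last line has no "x"
```

No m flag is used, so ^ and $ anchor the whole string rather than each line.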
Your initial regular expression "fails" because of the +:
^(\d+x\d+[\r\n|\r|\n]*)+$
-----------------------^ here
Your parenthesized pattern (\d+x\d+[\r\n|\r|\n]*) says: match one or more digits, followed by an "x", followed by one or more digits, followed by zero or more newlines. The + after it says: match the entire parenthesized pattern one or more times. For an input like 100x200x300, the engine backtracks and splits the middle number, matching 100x20 and then 0x300, so it looks like it matches the entire line.
If you're simply trying to extract dimensions from a newline-separated string, I would use the following regular expression with a multiline flag:
^(\d+x\d+)$
https://regex101.com/r/H9JDjA/2
Side note: In your expression, [\r\n|\r|\n] is actually saying match any one instance of \r, \n, |, \r, |, or \n (i.e. it's quite redundant, and you probably aren't meaning to match |). If you want to match a sequential set of any combination of \r or \n, you can simply use [\r\n]+.
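To illustrate the side note: inside a character class, | is an ordinary character, so the class as written also accepts a literal pipe. A short sketch with illustrative variable names:

```javascript
// Character class as written in the question vs. the simplified one.
const asWritten = /^[\r\n|\r|\n]+$/;
const simplified = /^[\r\n]+$/;

console.log(asWritten.test('\r\n\n'));   // true
console.log(simplified.test('\r\n\n'));  // true
console.log(asWritten.test('|'));        // true: the pipe is part of the class
console.log(simplified.test('|'));       // false
```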
You can use multiline modifier, which should make life easier:
var input = "\n\
300x200x400\n\
50x80\n\
\n\
\n\
300x200\n\
50x80\n\
100x100x200x100\n";
var allSizes = input.match(/^\d+x\d+/gm); // with the m flag, ^ matches at the start of every line
for (var size of allSizes)
  console.log(size);
Prints:
300x200
50x80
300x200
50x80
100x100
Try this regex out
^[0-9]{1,4}x[0-9]{1,4}|[(\r\n|\r|\n)]+$
It'll match these inputs.
1x1
10x10
100x100
2000x2938
\n
\r
\r\n
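but not this 100x100x200 as a whole. Note that ^ applies only to the first alternative and $ only to the second, so the pattern still finds the partial match 100x100 at the start of that string.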

jQuery Regex - Finding All Joined Words

jQuery Regex
/((\b([a-zA-Z]{0,15})\b)([^a-z0-9\$_]))/g
My Attempt So Far: https://regex101.com/r/d3VUpG/1
Example test string:
(options.method==="
|options.method==="
=options.method==="HEAD"
options.method.options.method==="HEAD"
What I'm Trying TO Achieve
Returned as $1 the value of any connected words such as:
options.method - Would = $1
options.method.options.method - Would also = $1
Question
How can I find all words connected with a dot (.) to then wrap in a span like the below example;
.replace(//gi,'<span class="join">$1</span>')
You can use the following expression:
/((?:\w+\.)+\w+)/g
Explanation:
( - Start of capturing group 1
(?: - Start of a non-capturing group
\w+\. - Match [a-zA-Z0-9_] characters one or more times followed by a literal . character
)+ - End of the non-capturing group; match the group one or more times
\w+ - Match [a-zA-Z0-9_] characters one or more times
) - End of capturing group 1
So in other words, the non-capturing group, (?:\w+\.)+, will match a substring like option. one or more times followed by a final \w+ which will match the final word without a literal . character following it. Since there is only one capturing group wrapping everything, you can wrap your span tag around the first group, $1.
Live Example
string.replace(/((?:\w+\.)+\w+)/g, '<span class="join">$1</span>');
As mentioned above, \w includes underscore, numbers and letters ([a-zA-Z0-9_]), so if you only want to match letter characters, then you could swap out \w with [a-z] and use the case-insensitive flag:
/((?:[a-z]+\.)+[a-z]+)/gi
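A runnable sketch of the replacement using the test strings from the question. The function name wrapJoinedWords is hypothetical.

```javascript
// Wraps every dot-connected run of words in a span.
function wrapJoinedWords(source) {
  return source.replace(/((?:\w+\.)+\w+)/g, '<span class="join">$1</span>');
}

console.log(wrapJoinedWords('(options.method==="'));
// (<span class="join">options.method</span>==="

console.log(wrapJoinedWords('options.method.options.method==="HEAD"'));
// <span class="join">options.method.options.method</span>==="HEAD"
```

A word without a dot, such as HEAD, is left untouched because the non-capturing group must match at least once.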

Explain this regex js

I'm using this regex to match some strings:
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
I'm confused about why it permits two dots, like ..
What I understand is that it allows only 1 dash or none: (-)?
Any number of digits or none: (\d+)?
One dot or none: (\.)?
Why does it allow .. or even .4.6?
Testing done in http://www.regextester.com/
[^\s] means anything that is not a whitespace. This includes dots. Trying to match .. will get you:
[^\s] matches .
(-)? doesn't match
(\d+)? doesn't match
(\.)? matches .
(\d+)? doesn't match
I'll assume you wanted to match numbers (possibly negative/floating):
^-?\d+(\.\d+)?$
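The difference can be checked side by side. In this sketch the question's regex is written without its stray trailing slash, and the variable names are made up for the example.

```javascript
// The question's pattern: the leading [^\s] accepts any non-whitespace char, including a dot.
const original = /^([^\s](-)?(\d+)?(\.)?(\d+)?)$/;
// The suggested pattern for (possibly negative, possibly decimal) numbers.
const numeric = /^-?\d+(\.\d+)?$/;

for (const input of ['..', '.4.6', '-4.6', '42']) {
  console.log(input, original.test(input), numeric.test(input));
}
// ..    true  false
// .4.6  true  false
// -4.6  true  true
// 42    true  true
```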
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
Assert position at the beginning of the string ^
Match the regex below and capture its match into backreference number 1 ([^\s](-)?(\d+)?(\.)?(\d+)?)
Match any single character that is NOT present in the list below [^\s]
A whitespace character (spaces, tabs, and line breaks) \s
Match the regex below and capture its match into backreference number 2 (-)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “-” literally -
Match the regex below and capture its match into backreference number 3 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit 0..9 \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Match the regex below and capture its match into backreference number 4 (\.)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “.” literally \.
Match the regex below and capture its match into backreference number 5 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single digit 0..9 \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Assert position at the very end of the string $
Match the character “/” literally /
Created with RegexBuddy
As I mentioned in my comment, [^\s] is a negated character class that also matches a dot, and as there is another (\.)? pattern, the regex can match 2 consecutive dots (since all of the parts except for [^\s] are optional).
In order not to match strings like .4.5 or .. you just need to add the . to the [^\s] negated character class:
^([^\s.](-)?(\d+)?(\.)?(\d+)?)$
      ^
See demo. This will not let any . in the initial capturing group.
You can use a lookahead to only disallow the first character as a dot:
^(?!\.)([^\s](-)?(\d+)?(\.)?(\d+)?)$
See another demo
A full explanation is available in the online regex testers.
In order to match the numbers in the format you expect, use:
^(?:[-]?\d+\.?\d*|-)$
Human-readable explanation:
^ - start of string and then there are 2 alternatives...
[-]? - optional hyphen
\d+ - 1 or more digits
\.? - optional dot
\d* - 0 or more digits
| -OR-
- - a hyphen
$ - end of string
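A brief sketch of what this last pattern accepts and rejects; the inputs are chosen for illustration and the variable name is hypothetical.

```javascript
// Number with optional leading hyphen and optional fraction, or a lone hyphen.
const numberOrHyphen = /^(?:[-]?\d+\.?\d*|-)$/;

console.log(numberOrHyphen.test('-'));     // true: second alternative
console.log(numberOrHyphen.test('-4.6'));  // true
console.log(numberOrHyphen.test('4.'));    // true: the dot may be the last char
console.log(numberOrHyphen.test('.4'));    // false: at least one digit must come first
console.log(numberOrHyphen.test('..'));    // false
console.log(numberOrHyphen.test('4.6.1')); // false: only one dot is allowed
```

Allowing a lone hyphen and a trailing dot is useful when validating input while the user is still typing.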
See demo
