Javascript RegEx Templating Edge Case - javascript

I have a RegEx in JavaScript that is close to doing what I want, but I can't figure out the last piece, which fails on an edge case. Here is the RegEx that I have so far:
/\$\{(.+?(}\(.+?\)|}))/g
The idea is to use this RegEx in a templating system that replaces templated variables in a string with their values. Here is an example of the edge case issue:
"Here is a template string ${G:SomeVar:G${G:SomeVar:G} that value gets injected in."
The problem is the RegEx is matching this:
"${G:SomeVar:G${G:SomeVar:G}"
What I want it to match is this:
"${G:SomeVar:G}"
How would I get the RegEx to match the expected variable in this edge case?

You have an alternation in your pattern to either stop at } or also match a following (...) after it.
As the dot can match any character, you can use a negated character class instead to exclude matching { } ( ).
If you want to match ${G:SomeVar:G} but also ${G:SomeVar:G}(test), you can add an optional non-capturing group after it.
For a match only, you can omit the capture groups.
\$\{[^{}]*}(?:\([^()]*\))?
See a regex101 demo.
If the format of the string with the : and the same character before and after it should be matched, you can use a capture group with a backreference:
\$\{([A-Z]):[^{}]*?:\1}(?:\([^()]*\))?
See a regex101 demo.
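As a minimal sketch of how the first pattern above might be used for substitution in JavaScript: the values table and the render function are hypothetical names introduced for illustration, and a capture group has been added around the variable name so the callback can look it up.

```javascript
// Hypothetical lookup table; the key format is an assumption for illustration.
const values = { "G:SomeVar:G": "42" };

// Pattern from the answer above, with one capture group added around the
// variable name so the replacement callback can look it up.
const templateVar = /\$\{([^{}]*)}(?:\([^()]*\))?/g;

// Hypothetical helper: replaces every known variable, leaves unknown ones as-is.
function render(template) {
  return template.replace(templateVar, (whole, name) =>
    Object.prototype.hasOwnProperty.call(values, name) ? values[name] : whole
  );
}

const input =
  "Here is a template string ${G:SomeVar:G${G:SomeVar:G} that value gets injected in.";

// With the g flag, match() returns only the full matches.
const matches = input.match(templateVar);
const output = render(input);

console.log(matches); // only the well-formed variable is matched
console.log(output);
```

Because the negated class cannot cross a {, the unterminated ${G:SomeVar:G prefix is skipped and stays in the output unchanged.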

Instead of matching anything with (.+?), change it to not match another opening brace or dollar sign, [^{$].
\$\{([^{$]+?(}\(.+?\)|}))

Javascript Regex Conditional

I have strings like this:
#WTK-56491650H #=> want to capture '56491650H'
#M123456 #=> want to capture 'M123456'
I want to match everything after the #, unless there is a dash, in which case I want everything after the dash. I have a feeling I'm close but maybe not. I've found a lot of material about JavaScript regex conditionals, but I can never get the if/then/else part to work. It only matches after the # and that's it.
This is what I have so far:
/((?=-{1})-(.+)|(?!-{0)#(.+))/
And the demo: https://regex101.com/r/bY0yC6/1
You can use this regex with an optional match to consume everything between # and -:
/#(?:[^-]*-)?([^#-]+)$/mg
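A short sketch of how this pattern might be applied to the two sample strings from the question, assuming they arrive as separate lines of one input string (that input layout is an assumption, as is the use of matchAll, which needs an ES2020 engine):

```javascript
// Pattern from the answer above: optionally consume "prefix-" after the #,
// then capture the remainder of the line.
const idPattern = /#(?:[^-]*-)?([^#-]+)$/mg;

// The two sample strings from the question, one per line.
const text = "#WTK-56491650H\n#M123456";

// matchAll requires the g flag; group 1 of each match is the wanted part.
const ids = Array.from(text.matchAll(idPattern), (m) => m[1]);

console.log(ids);
```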
Here's a solution which uses non-capturing groups (?:stuff) which I prefer so I don't have to dig through the result groups to find the string I'm interested in.
(?:#)(?:[\w\d]+-)?([\w\d]+)
First it throws out the # character, then throws out the stuff up to and including the - character, if it is there, then groups the rest as your match.
With a single regular expression, your full match will always contain the hash and/or dash because you are using it to define an acceptable string, but the groupings of a match can provide you the information that you're looking for.
You want the string to start with a hash, so your regex should contain the #.
Next, you don't want anything before and including a dash: (.*-)?. The question mark is there because this part is optional (i.e. there may be no dash).
Finally, we can grab everything that is left into a final group, which will be your answer: (.*).
The full expression is then #(.*-)?(.*), as pointed out by Lux.
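The steps above can be sketched as follows. The extractId helper is a hypothetical name; note that the wanted text ends up in group 2, because the optional prefix is itself a capturing group.

```javascript
// #        literal hash
// (.*-)?   optional: everything up to and including the last dash
// (.*)     whatever is left, which is the part we want
const hashPattern = /#(.*-)?(.*)/;

// Hypothetical helper: returns the wanted part, or null if there is no hash.
function extractId(s) {
  const m = hashPattern.exec(s);
  // Group 2 holds the answer; group 1 is undefined when there is no dash.
  return m ? m[2] : null;
}

const withDash = extractId("#WTK-56491650H");
const withoutDash = extractId("#M123456");
const noHash = extractId("M123456");

console.log(withDash, withoutDash, noHash);
```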

Select a character if some character from a list is before the character

I have this regular expression:
/([a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This regex selects č- from my text:
sme! a Želiezovce 2015: Spoloíč-
ne pre Európu. Oslávili aj 940.
But I want to select only - (without č) (if some character from the list [a-záäéěíýóôöúüůĺľŕřčšťžňď] is before the -).
In engines that support lookbehind you would use one:
/(?<=[a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This matches -$\s* only if it's preceded by one of the characters in the list.
However, JavaScript only gained lookbehind in ES2018, so on older engines the workaround is to match the preceding character and put the part of the regular expression after it in a capturing group:
var match = /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-$\s*)/gmi.exec(string);
When you use this, match[1] will contain the part of the string beginning with the hyphen.
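A minimal sketch comparing the two approaches on the sample text from the question. It assumes an engine with ES2018 lookbehind support (current Node.js and browsers), and the final replace assumes the goal is to rejoin the hyphenated word, which the question does not state.

```javascript
const text = "sme! a Želiezovce 2015: Spoloíč-\nne pre Európu. Oslávili aj 940.";
const letters = "a-záäéěíýóôöúüůĺľŕřčšťžňď";

// Workaround: match the preceding letter, capture only the hyphen part.
const withGroup = new RegExp("[" + letters + "](-$\\s*)", "gmi");
const groupMatch = withGroup.exec(text);

// Lookbehind: the letter is required but not part of the match.
const withLookbehind = new RegExp("(?<=[" + letters + "])-$\\s*", "gmi");
const lookbehindMatch = withLookbehind.exec(text);

// Assumed goal: drop the hyphen and the line break to rejoin the word.
const joined = text.replace(new RegExp("(?<=[" + letters + "])-$\\s*", "gmi"), "");

console.log(JSON.stringify(groupMatch[0]), JSON.stringify(groupMatch[1]));
console.log(JSON.stringify(lookbehindMatch[0]));
console.log(joined);
```

In both cases the \s* after $ consumes the line break, so the hyphen part is the hyphen plus the newline.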
First, in regex everything you put in parentheses is captured as a group, so the match array contains the full matched string at index 0, followed by the text of each group, ordered by opening parenthesis from left to right.
/[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$\s*/gmi
would return the following match array for your string: ["č-\n", "-"] (the full match includes the line break consumed by \s*), so you can extract the specific data you need from the groups.
Also, the $ character asserts the end of a line when the multiline flag is set. It consumes nothing, so the \s* after it still matches the line break and any whitespace at the start of the next line.
If you don't need that whitespace in the match, the regex can be reduced to /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$/gmi

Unable to find a string matching a regex pattern

While trying to submit a form, a JavaScript regex validation always fails for my string.
Regex:- ^(([a-zA-Z]:)|(\\\\{2}\\w+)\\$?)(\\\\(\\w[\\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
I have tried the following strings against it:
abc.jpg,
abc:.jpg,
a:.jpg,
a:asdas.jpg,
What string could possibly match this regex?
This regex won't match against anything because of that $? in the middle of the string.
Apparently applying the optional quantifier ? to the end-of-string anchor $ is not valid (if you paste it into https://regex101.com/ it does report an error). Even if the JavaScript parser ignores the error and keeps the regex as it is, you would still be matching an end of string in the middle of a string that is supposed to continue.
With one level of escaping removed it was meant to be \$ (a literal dollar sign), but as written it won't work.
If you want your string to be accepted at any cost, you can probably use Firebug or a similar developer tool and edit the string inside the JavaScript code (assuming there is no server-side check as well, and that it is not wrong too). If you ignore the $? then a matching string will be \\\\w\\\\ww.jpg (but since the . is unescaped even \\\\w\\\\ww%jpg is a match)
Of course, I wrote this answer assuming the escaping is indeed the one you showed in the question. If you need to find a matching pattern for the correctly escaped one ^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(\.jpeg|\.JPEG|\.jpg|\.JPG)$ then you can use this tool to find one http://fent.github.io/randexp.js/ (though it will find weird matches). A matching pattern is c:\zz.jpg
If you are just looking for a regular expression to match what you got there, go ahead and test this out:
(\w+:?\w*\.[jpe?gJPE?G]+,)
That should match exactly what you are looking for. Remove the optional comma at the end if you feel like it, of course.
If you remove escape level, the actual regex is
^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$
After the ^ start anchor comes the first group (([a-zA-Z]:)|(\\{2}\w+)\$?), which matches either a letter followed by a colon, or two backslashes followed by one or more word characters and an optional literal $. Some of the parentheses inside are unnecessary.
The second part (\\(\w[\w].*))+ matches a backslash followed by two word characters; \w[\w] looks odd because it is equivalent to \w\w (the second \w does not need a character class). That is followed by any number of any characters, and the whole group repeats one or more times.
In the last part (.jpeg|.JPEG|.jpg|.JPG) one probably forgot to escape the dot for matching a literal. \. should be used. This part can be reduced to \.(JPE?G|jpe?g).
It would match something like
A:\12anything.JPEG
\\1$\anything.jpg
Play with it at regex101. A more readable version could be
^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$
Also read the explanation on regex101 to understand any pattern, it's helpful!
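To check the analysis above, here is a small sketch that runs the unescaped original and the more readable version against a few strings. The sample list mixes strings from the question and the answers with one extra string (c:\zzz%jpg) added here to show the effect of the unescaped dot.

```javascript
// Original pattern with one level of string escaping removed.
const original = /^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w].*))+(.jpeg|.JPEG|.jpg|.JPG)$/;
// More readable version from the answer above, with the dot escaped.
const simplified = /^([a-zA-Z]:|\\{2}\w+\$?)(\\\w{2}.*)+\.(jpe?g|JPE?G)$/;

const samples = [
  "A:\\12anything.JPEG",  // A:\12anything.JPEG
  "\\\\1$\\anything.jpg", // \\1$\anything.jpg
  "c:\\zz.jpg",           // c:\zz.jpg
  "abc.jpg",              // no drive letter or UNC prefix
  "a:asdas.jpg",          // no backslash after the colon
  "c:\\zzz%jpg",          // added here: no literal dot before the extension
];

// One row per sample: [string, original result, simplified result].
const pathResults = samples.map((s) => [s, original.test(s), simplified.test(s)]);

for (const [s, a, b] of pathResults) {
  console.log(s, "original:", a, "simplified:", b);
}
```

The last sample is accepted only by the original, because its unescaped . matches the % character.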

Javascript regex: how to not capture an optional string on the right side

For example /(www\.)?(.+)(\.com)?/.exec("www.something.com") will result in 'something.com' at index 2 of the resulting array (index 1 holds 'www.'). But what if we want to capture only 'something' in a capturing group?
Clarifications:
The above string is just an example; we don't want to assume anything about the suffix string (.com above). It could just as well be orange.
This part alone can be solved in C# by matching from right to left (I don't know of a way of doing that in JS though), but that will end up including www. in the match!
Sure, this problem as such is easily solvable mixing regex with other string methods like replace / substring. But is there a solution with only regex?
(?:www\.)?(.+?)(?:\.com|$)
This will give only something in the groups. Just make the other groups non-capturing. See demo.
https://regex101.com/r/rO0yD8/4
Just removing the last character (?) from the regex does the trick:
https://regex101.com/r/uR0iD2/1
The last ? allows a valid output without the (\.com) matching anything, so the (.+) can match all the characters after the www..
Another option is to replace the greedy quantifier +, which always tries to match as many characters as possible, with the lazy +?, which tries to match as few characters as possible:
(www\.)?(.+?)(\.com)?$
https://regex101.com/r/oY7fE0/2
Note that it is necessary to force a match with the entire string through the end of line anchor ($).
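A brief sketch of the difference between the greedy and lazy forms, including what happens when the $ anchor is left out. The www.orange input is an extra example added here to cover the case without a .com suffix.

```javascript
const input = "www.something.com";

// Greedy: (.+) swallows ".com", so the optional suffix group matches nothing.
const greedy = /(www\.)?(.+)(\.com)?/.exec(input);

// Lazy with end anchor: (.+?) grows only as far as needed to reach the end.
const lazy = /(www\.)?(.+?)(\.com)?$/.exec(input);

// Lazy without the anchor: (.+?) is satisfied after a single character.
const lazyNoAnchor = /(www\.)?(.+?)(\.com)?/.exec(input);

// No ".com" suffix: the optional group is simply skipped.
const noSuffix = /(www\.)?(.+?)(\.com)?$/.exec("www.orange");

console.log(greedy[2], lazy[2], lazyNoAnchor[2], noSuffix[2]);
```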
If you only want to capture "something", use non-capturing groups for the other sections, and make the middle group lazy and anchored so it does not swallow the suffix:
/(?:www\.)?(.+?)(?:\.com)?$/.exec("www.something.com")
The ?: denotes the groups as non-capturing.

How to make a JS RegEx not to match if a certain string appear?

I am trying to build a RegEx in JavaScript that does not match if a certain string appears. I want to match the curly bracket { but only if it is not preceded by the string else.
I tried ^ *[^else]* *{.*$ but this fails to match whenever any character of the string else appears. For example, it also fails to match this:
erai {
I want to match every case where { appears except the case else {.
Could you please help me? Here is my DEMO
You can use a negative lookahead. This is supported by JavaScript:
(?!\s*else).+ *({).*$|
JavaScript RegEx doesn't support ifs but we can use a trick for it to work:
(?!RegExp)
That's the first part, if RegExp (which is a regex) doesn't appear, then we do the code after that:
.+ *({).*$
That's the RegEx we run. Broken down, it is:
.+ Match anything
 * Match zero or more spaces
({) Capture the {
.*$ Match anything till the end
Now this won't work unless we add a | (an OR) at the end. This will trick it into working like an if statement.
What you are looking for is called negative lookbehind.
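A minimal sketch of the negative lookbehind approach, assuming an engine with ES2018 lookbehind support. A lookahead-only alternative is included for older engines; it is anchored at the start of the line, because an unanchored negative lookahead can still succeed by starting the match one character later. Both patterns and the sample lines other than erai { and else { are illustrative assumptions.

```javascript
// Matches a { that is not preceded by the word "else" plus optional spaces.
// Assumes ES2018 lookbehind support.
const braceNotAfterElse = /(?<!\belse\s*)\{/;

// Lookahead-only alternative: accepts a whole line containing { only if no
// "else {" occurs anywhere on that line.
const lineWithoutElseBrace = /^(?!.*\belse\s*\{).*\{.*$/m;

const lines = ["erai {", "else {", "} else {", "if (x) {"];

const braceResults = lines.map((line) => ({
  line,
  lookbehind: braceNotAfterElse.test(line),
  lookahead: lineWithoutElseBrace.test(line),
}));

console.log(braceResults);
```

The two patterns differ on lines that contain several braces: the lookbehind form judges each { on its own, while the lookahead form rejects the whole line as soon as one else { is present.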
