I am a beginner in RegEx, so I am reading the RegEx info page on Stack Overflow. It gives this example:
/(d).\1/ matches and captures 'dad' in "abcdadef", while
/(?:.d){2}/ matches but doesn't capture 'cdad'.
I tried:
var pattern=/(d).\1/
var val="abcdadef";
console.log(pattern.exec(val));
It shows the array ["dad", "d"], but I don't know why.
The info page says it only captures "dad", so why are there two values in the array?
And what is the use of '\1' at the end of the pattern?
Please give me more info on how to use it.
Thanks :-)
When you use (), you're telling the regex engine to match the pattern between the parentheses and store the matched text as a capturing group. Each match has capturing groups of its own. A regex match object is normally a collection that contains the entire match of the regex, followed by the capturing groups of that match.
Edit: As per your comment below, here's another pattern, (m).\1, and the text we're executing the regex against is mum.
In this example, the regex will attempt to do the following:
(m) matches the literal m; because we used (), the match is stored in a capturing group. This capturing group will appear in the match collection later.
. matches any character other than a newline, so in our case it matches the literal u.
\1 is a backreference: it attempts to match the same text that the first capturing group matched, which is the literal m in our case.
The final result will be the regex match mum, and the only capturing group will be m.
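To make the shape of the match array concrete, here is a minimal sketch that runs the patterns discussed above. The variable names are only illustrative.

```javascript
// Capturing group plus backreference:
// (d) captures "d", . matches "a", and \1 matches the captured "d" again.
const captured = /(d).\1/.exec("abcdadef");
console.log(Array.from(captured)); // [ 'dad', 'd' ] -> full match, then group 1

// Non-capturing group: (?:.d){2} still matches, but stores no group.
const uncaptured = /(?:.d){2}/.exec("abcdadef");
console.log(Array.from(uncaptured)); // [ 'cdad' ] -> only the full match

// The (m).\1 pattern run against "mum".
const mum = /(m).\1/.exec("mum");
console.log(mum[0], mum[1]); // mum m
```

Index 0 always holds the whole match, which is why the array has two entries even though the pattern contains a single capturing group.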
Related
How do I get string word from ({#word#}) using regex in Nodejs?
The regex I'm using right now is: /\(({#[^)]+#})\)/g
It gives me string {#word#} from ({#word#}).
How do I get the word out?
Use this regex:
/\(\{#(.*?)#\}\)/g
and grab the first capturing group match:
/\(\{#(.*?)#\}\)/g.exec("({#test#})")[1] === "test"
Capturing groups are expressions between parentheses in a regex that will save the part of the text that matches it. You can have multiple, but in this case we only need one.
The .*? is simple but not efficient; if you have a more specific regex for the word, use that instead.
Learn more about capturing groups:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#grouping-back-references
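If the input can contain several ({#...#}) markers, one possible approach is to collect group 1 from every match. This is a sketch only: the input string is made up for illustration, and String.prototype.matchAll requires a reasonably recent runtime (for example Node.js 12 or later).

```javascript
// The pattern from the answer above; the g flag is required by matchAll.
const wordPattern = /\(\{#(.*?)#\}\)/g;

// Hypothetical input containing two markers.
const input = "first ({#alpha#}) then ({#beta#})";

// Each match object holds the full match at [0] and the captured word at [1].
const words = Array.from(input.matchAll(wordPattern), (m) => m[1]);
console.log(words); // [ 'alpha', 'beta' ]
```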
Can you help me write a regex that gives me a word without the specified prefix and suffix?
Every word starts with a dot (.) and ends with 'Zacher', e.g.:
.mobileZacher => output should be mobile
.carZacher => output should be car
.StevenZacher => output should be Steven
I tried str.replace(/(?:.)|(?:Zacher)/, ''), but it replaces only the dot.
Try the following regex:
str.replace(/\.(.+?)Zacher/, '$1')
We're looking for a dot character, then matching everything up to the first occurrence of Zacher, and replacing the whole match with the string in between.
You can also replace the (.+?) part (which accepts any character) with ([a-zA-Z]+?) to match only letters.
Or make it case-insensitive with the i flag:
str.replace(/\.([a-z]+?)Zacher/i, '$1')
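As a rough sketch, the replacement can be wrapped in a small helper. The function name stripAffixes and the ^ and $ anchors are assumptions added here for illustration; they are not part of the answer above.

```javascript
// Hypothetical helper: removes a leading "." and a trailing "Zacher".
// The ^ and $ anchors (an assumption) make sure the whole word has that shape;
// anything else is returned unchanged.
function stripAffixes(word) {
  return word.replace(/^\.([a-z]+?)Zacher$/i, "$1");
}

console.log(stripAffixes(".mobileZacher")); // mobile
console.log(stripAffixes(".StevenZacher")); // Steven
console.log(stripAffixes("plain"));         // plain (no match, so no change)
```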
I would extract the group between . and Zacher using this RegEx:
\.(.*)Zacher
The backslash is used to escape the . character.
It tells the regex engine not to interpret the . as a wildcard (its standard function in RegEx) but as a literal ".".
Then I'd use it in a string replace.
Since we want the first (and only) captured group, we'll use $1:
str.replace(/\.(.*)Zacher/, '$1')
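The greedy (.*) and the lazy (.+?) give the same result for a single word, but they differ when the suffix occurs more than once. The sketch below uses a made-up input to show the difference.

```javascript
// Hypothetical input in which "Zacher" appears twice.
const twoWords = ".aZacher.bZacher";

// Greedy: (.*) grabs as much as possible, so the match runs to the LAST "Zacher".
const greedy = twoWords.replace(/\.(.*)Zacher/, "$1");
console.log(greedy); // aZacher.b

// Lazy: (.+?) grabs as little as possible, so the match stops at the FIRST "Zacher".
const lazy = twoWords.replace(/\.(.+?)Zacher/, "$1");
console.log(lazy); // a.bZacher

// With a single word both variants agree.
console.log(".carZacher".replace(/\.(.*)Zacher/, "$1")); // car
```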
If you want to know more, this kind of result is obtained using RegEx grouping.
The grouping syntax uses parentheses: (something_in_here).
Here's a brief explanation from Mozilla Documentation:
(x) Matches x and remembers the match. These are called capturing groups.
For example, /(foo)/ matches and remembers "foo" in "foo bar".
The capturing groups are numbered according to the order of left parentheses of capturing groups, starting from 1. The matched substring can be recalled from the resulting array's elements [1], ..., [n] or from the predefined RegExp object's properties $1, ..., $9.
Capturing groups have a performance penalty. If you don't need the matched substring to be recalled, prefer non-capturing parentheses (see below).
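The numbering rule quoted above is easiest to see with nested groups. The pattern and input below are made up for illustration only.

```javascript
// Groups are numbered by the position of their opening parenthesis:
// group 1 = ((\d{4})-(\d{2})), group 2 = (\d{4}), group 3 = (\d{2}).
const nested = /((\d{4})-(\d{2}))/.exec("released 2015-06");
console.log(Array.from(nested)); // [ '2015-06', '2015-06', '2015', '06' ]

// A non-capturing group (?:...) is skipped in the numbering,
// so here the month becomes group 1.
const skipped = /(?:\d{4})-(\d{2})/.exec("released 2015-06");
console.log(Array.from(skipped)); // [ '2015-06', '06' ]
```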
I suggest you experiment with your RegEx using RegExr.
If you want to learn more while doing exercises, RegExOne was of great help to me.
I'm trying to validate a string using a regex for the following requirements:
It's a (Unix) file path which should not contain any symbols, whitespace, etc.
String that starts with ./
String ends with .xml
Allowed char = a-zA-Z_-0-9./ and a group ##ENV
##ENV can be used anywhere in the file name
I'm new to regex. The regex below is what I managed to come up with, but it covers only the first scenario. I want the text before and after the ##ENV group to be captured, and both should be optional.
/^(.\/[a-zA-Z_\-0-9\.]+)+(##ENV)\.(xml)$/
Eg. scenarios
'./app/settings/conf-##ENV.xml'
'./app/settings/##ENV-conf.xml'
'./app/settings/system-##ENV-conf.xml'
'./app/settings/##ENV.xml'
I have added sample code to test on JSFiddle below.
Resources:
JS Fiddle: https://jsfiddle.net/u4pcnbLz/3/
Online Regex tool: https://regex101.com/
I would suggest using a regex like this:
(\.\/(?:[a-zA-Z_0-9-\/]*))##ENV((?:[a-zA-Z_0-9-\/]*)\.xml)
To explain what's going on here:
(\.\/(?:[a-zA-Z_0-9-\/]*))
has an inner non-capturing group (?:[a-zA-Z_0-9-\/]*) that matches any letters, numbers, underscores, hyphens, or forward slashes zero or more times. The outer (capturing) group also contains \.\/, which first matches one period followed by one forward slash.
After matching ##ENV:
((?:[a-zA-Z_0-9-\/]*)\.xml)
has an inner non-capturing group (?:[a-zA-Z_0-9-\/]*) that matches any letters, numbers, underscores, hyphens, or forward slashes zero or more times. The outer (capturing) group also contains \.xml, which afterwards matches one period followed by the word xml.
Hope this will work.
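As a quick check, the suggested pattern can be run against the four scenarios from the question. This is only a sketch: the variable names are illustrative, and the pattern is used exactly as written above, without ^ and $ anchors.

```javascript
// The pattern from the answer above, unchanged.
const envPattern = /(\.\/(?:[a-zA-Z_0-9-\/]*))##ENV((?:[a-zA-Z_0-9-\/]*)\.xml)/;

// The four example paths from the question.
const paths = [
  "./app/settings/conf-##ENV.xml",
  "./app/settings/##ENV-conf.xml",
  "./app/settings/system-##ENV-conf.xml",
  "./app/settings/##ENV.xml",
];

// For each path, keep the text before ##ENV (group 1) and after it (group 2).
const results = paths.map((p) => {
  const m = envPattern.exec(p);
  return m ? [m[1], m[2]] : null;
});

results.forEach((r, i) => console.log(paths[i], "->", r));
```

For the third path this yields './app/settings/system-' and '-conf.xml'. Because the pattern is not anchored, it may be worth adding ^ and $ if the goal is strict validation of the whole string.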
Regex: ^\.\/([a-z\/\-]+)##ENV[a-z\/\-]*(\.xml)$
I have this regular expression:
/([a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This regex selects č- from my text:
sme! a Želiezovce 2015: Spoloíč-
ne pre Európu. Oslávili aj 940.
But I want to select only - (without č) (if some character from the list [a-záäéěíýóôöúüůĺľŕřčšťžňď] is before the -).
In other languages you would use a lookbehind
/(?<=[a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/gmi
This matches -$\s* only if it's preceded by one of the characters in the list.
However, JavaScript engines that predate ES2018 don't support lookbehind, so the workaround there is to use a capturing group for the part of the regular expression after it.
var match = /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-$\s*)/gmi.exec(string);
When you use this, match[1] will contain the part of the string beginning with the hyphen.
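A minimal sketch of both approaches on the sample text from the question follows. The variable names are illustrative, and the lookbehind variant assumes an engine with ES2018 lookbehind support (for example a recent Node.js or browser).

```javascript
// Sample text from the question, with an explicit line break after the hyphen.
const sample = "sme! a Želiezovce 2015: Spoloíč-\nne pre Európu. Oslávili aj 940.";

// Capturing-group workaround: the letter is matched but only "-$\s*" is captured.
const viaGroup = /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-$\s*)/mi.exec(sample);
console.log(JSON.stringify(viaGroup[0])); // "č-\n"  (full match, including the letter)
console.log(JSON.stringify(viaGroup[1])); // "-\n"   (group 1: hyphen plus line break)

// Lookbehind variant: the letter is asserted but not part of the match.
const viaLookbehind = sample.match(/(?<=[a-záäéěíýóôöúüůĺľŕřčšťžňď])-$\s*/mi);
console.log(JSON.stringify(viaLookbehind[0])); // "-\n"
```

Note that the captured text includes the line break, because $ only asserts the end of the line and \s* then consumes the newline.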
First, every part of a regex that you put in parentheses is captured during matching, so the match array contains the full matched string at position 0, followed by the text captured by each parenthesized group, from left to right.
/[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$\s*/gmi
would return the following match array for your string: ["č-\n", "-"] (the full match includes the line break consumed by \s*), so you can extract the specific data you need from your match.
Also, the $ character asserts the end of a line when you use the multiline flag. It does not consume the line break itself, so the \s* after it matches the line break and any whitespace at the start of the next line.
If you want the match to stop at the end of the line, use /[a-záäéěíýóôöúüůĺľŕřčšťžňď](-)$/gmi
/test-test-test/test.aspx
Hi there,
I am having a bit of difficulty retrieving the first part of the above URL:
test-test-test
I tried /[\w+|-]/g, but it matches the last part, test.aspx, as well.
Please help.
Thanks
One way of doing it is to use the DOM parser approach described here: https://stackoverflow.com/a/13465791/970247.
Then you could access the segments of the URL with, for example: myURL.segments; // = Array = ['test-test-test', 'test.aspx']
You need to use a positive lookahead assertion. A | inside a character class matches a literal | symbol; it doesn't act as an alternation operator, so I suggest you remove it.
[\w-]+(?=\/)
(?=\/) is a positive lookahead assertion, which asserts that the match must be followed by a forward slash. In our case only test-test-test is followed by a forward slash, so it gets matched. [\w-]+ matches one or more word characters or hyphens; + repeats the previous token one or more times.
Example:
> "/test-test-test/test.aspx".match(/[\w-]+(?=\/)/g)
[ 'test-test-test' ]
[\w+|-] is wrong, should be [\w-]+. "A series of characters that are either word characters or hyphens", not "a single character that is a word character, a plus, a pipe, or a hyphen".
The g flag means global match, so naturally all matches will be found instead of just the first one. So you should remove that.
> '/test-test-test/test.aspx'.match(/[\w-]+/)
< ["test-test-test"]
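The points made in these answers can be compared side by side. The snippet below is a small sketch with illustrative variable names; it only uses the patterns already mentioned in this thread.

```javascript
const urlPath = "/test-test-test/test.aspx";

// Original attempt: the class matches ONE character at a time
// (a word character, "+", "|" or "-"), so g returns single characters.
const singleChars = urlPath.match(/[\w+|-]/g);
console.log(singleChars.length); // 22

// Quantifier moved outside the class, still global: every run is returned.
const allRuns = urlPath.match(/[\w-]+/g);
console.log(allRuns); // [ 'test-test-test', 'test', 'aspx' ]

// Without the g flag only the first run is returned.
const firstRun = urlPath.match(/[\w-]+/)[0];
console.log(firstRun); // test-test-test

// With the lookahead, only runs followed by "/" match, even with g.
const beforeSlash = urlPath.match(/[\w-]+(?=\/)/g);
console.log(beforeSlash); // [ 'test-test-test' ]
```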