javascript regexp can't find way to access grouped results - javascript

re = /\/?(\w+)\/(\w+)/
s = '/projects/new'
s.match(re)
I have this regular expression, which I will use to extract the branch name (e.g., projects) and the 'tip' name (e.g., new).
I read that one can access the grouped results with $1, $2, and so on, but I can't seem to get it to work, at least in the Firebug console.
When I run the above code, then run
RegExp.$1
it shows
""
The same goes for $2.
Any ideas?
Thanks!

Without the g flag, str.match(regexp) returns the same as regexp.exec(str). And that is:
The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured. If the match fails, the exec method returns null.
So you can do this:
var match = s.match(re);
match[1]
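As a minimal sketch of the above, here is the question's pattern (with the slashes escaped) wrapped in a small helper. The name parsePath and the returned object shape are hypothetical and only for illustration. It also guards against the null that match returns when nothing matches:

```javascript
// Hypothetical helper: pull the two path segments out of a string like '/projects/new'.
function parsePath(s) {
  var re = /\/?(\w+)\/(\w+)/;   // no g flag, so match() behaves like exec()
  var match = s.match(re);
  if (match === null) {
    return null;                // no match: there is no array to index into
  }
  // match[0] is the whole matched text; match[1] and match[2] are the groups
  return { branch: match[1], tip: match[2] };
}

var parsed = parsePath('/projects/new');
console.log(parsed.branch); // "projects"
console.log(parsed.tip);    // "new"
```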

match gives you an array of the matched expressions:
> s.match(re)[1]
"projects"
> s.match(re)[2]
"new"

You are accessing the array of matches the wrong way. Do something like:
re = /\/?(\w+)\/(\w+)/
s = '/projects/new'
var j = s.match(re)
alert(j[1]);
alert(j[2]);

Related

Javascript exec maintaining state

I am currently trying to build a little templating engine in JavaScript by replacing tags in an HTML5 tag using find and replace with a regex.
I am using exec on my regular expression and looping over the results. I am wondering why the regular expression breaks in its current form with the /g flag but is fine without it.
Check the broken example and remove the /g flag on the regular expression to view the correct output.
var TemplateEngine = function(tpl, data) {
var re = /(?:<|<)%(.*?)(?:%>|>)/g, match;
while(match = re.exec(tpl)) {
tpl = tpl.replace(match[0], data[match[1]])
}
return tpl;
}
https://jsfiddle.net/stephanv/u5d9en7n/
Can somebody explain to me in a little more depth why my example breaks exactly on:
<p><%more%></p>
The reason is explained in javascript string exec strange behavior.
The solution you need is actually a String.replace with a callback as a replacement:
var TemplateEngine = function(tpl, data) {
var re = /(?:<|<)%(.*?)(?:%>|>)/g, match;
return tpl.replace(re, function($0, $1) {
return data[$1] ? data[$1] : $0;
});
}
See the updated fiddle
Here, the regex finds all non-overlapping matches in the string, sequentially, and passes the match to the callback method. $0 is the full match and $1 is the Group 1 contents. If data[$1] exists, it is used to replace the whole match, else, the whole match is inserted back.
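To see the callback approach run end to end, here is a hedged sketch. The function name renderTemplate, the simplified pattern (literal <% ... %> tags only), and the sample data are assumptions for illustration, not the asker's code. It also uses an own-property check instead of a truthiness test, so that falsy values such as 0 are still substituted:

```javascript
// Sketch of replace-with-callback; renderTemplate is a hypothetical name.
function renderTemplate(tpl, data) {
  var re = /<%(.*?)%>/g; // simplified pattern: only literal <% ... %> tags
  return tpl.replace(re, function (whole, key) {
    // Substitute when the key exists in data, otherwise put the tag back unchanged.
    return Object.prototype.hasOwnProperty.call(data, key) ? String(data[key]) : whole;
  });
}

var rendered = renderTemplate(
  '<p><%name%></p><p><%count%></p><p><%more%></p>',
  { name: 'Ann', count: 0 }
);
console.log(rendered); // "<p>Ann</p><p>0</p><p><%more%></p>"
```

Because replace walks the original string once and builds a new one, there is no lastIndex bookkeeping to get out of sync with a string that changes during the loop.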
See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex. When using the g flag, the object that you store the regex in (re) keeps track of the position of the last match in its lastIndex property, and the next time you use that object the search starts from the position of lastIndex.
To solve this, you could either manually reset the lastIndex property each time, or not save the regex in an object and use it inline like so:
while(match = /(?:<|<)%(.*?)(?:%>|>)/g.exec(tpl)) {
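The lastIndex behaviour can be observed in isolation. This is a small sketch with made-up strings, not the asker's template: the same global regex object is reused on a string that has become shorter, which is roughly what happens when tpl is modified inside the loop.

```javascript
var re = /a/g;

var first = re.exec('aXa');
console.log(first.index, re.lastIndex); // 0 1

// Same regex object, different (shorter) string: the search still starts at index 1.
var stale = re.exec('a');
console.log(stale, re.lastIndex);       // null 0 (a failed match resets lastIndex)

re.exec('aXa');                         // lastIndex is 1 again
re.lastIndex = 0;                       // manual reset before searching a changed string
var fresh = re.exec('a');
console.log(fresh.index);               // 0
```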

regex javascript doesn't return parenthesis

When I am trying to do my regex in js:
var matc = source.match(/sometext(\d+)/g);
The result I get is "sometext5615", "sometext5616", etc.
But what I want is to get "5615", "5616", etc.
Do you have any idea how to get only what is inside the parentheses?
String.prototype.match has two different behaviors:
If the regex doesn't have the global g flag, it returns regex.exec(str). That means that, if there is a match, you will get an array where the 0 key is the match, the key 1 is the first capture group, the 2 key is the second capture group, and so on.
If the regex has the global g flag, it returns an array with all matches, but without the capturing groups.
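The two behaviours side by side, as a quick sketch (the sample string is made up for illustration):

```javascript
var source = 'sometext5615 and sometext5616'; // hypothetical input

var single = source.match(/sometext(\d+)/);   // no g: first match plus its groups
console.log(single[0], single[1]);            // "sometext5615" "5615"

var all = source.match(/sometext(\d+)/g);     // g: every full match, no groups
console.log(all);                             // ["sometext5615", "sometext5616"]
```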
Therefore, if you didn't use the global flag g, you could use the following to get the first capture group
var matc = (source.match(/sometext(\d+)/) || [])[1];
However, since you use the global flag, you must iterate all matches manually:
var rg = /sometext(\d+)/g,
match;
while(match = rg.exec(source)) {
match[1]; // Do something with it, e.g. push it to an array
}
At the time this answer was written, JavaScript did not have a "match all" for global matches, so you could not use g with match and also get the capture groups (String.prototype.matchAll was added later, in ES2020). The simplest solution would be to remove the g and then just use matc[1] to get 5615, etc.
If you need to match several of these within the same string, then your best bet would be to do a "search and don't replace":
var matc = [];
source.replace(/sometext(\d+)/g, function (_, num) {
matc.push(num);
});
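In engines that support ES2020, String.prototype.matchAll gives the same result without abusing replace. A short sketch, with a made-up input string; matchAll requires the g flag and yields one exec-style array per match:

```javascript
var input = 'sometext5615 and sometext5616'; // hypothetical input

// Array.from consumes the iterator and maps each match array to its first group.
var numbers = Array.from(input.matchAll(/sometext(\d+)/g), function (m) {
  return m[1];
});
console.log(numbers); // ["5615", "5616"]
```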

Regular Expression, Get Sub Pattern

Why am I not able to grab the subpattern? The console displays undefined when I am expecting hello to be output. If I change matches[1] to matches[0] I get {{hello}}. So, why can I not access the subpattern?
var str = "{{hello}}";
var matches = str.match(/{{(.+)}}/ig);
console.log(matches[1]);
Try:
str.match(/{{(.+)}}/i);
instead.
It seems like you're looking for the behavior of RegExp.exec. MDN states this:
If the regular expression does not include the g flag, returns the same result as regexp.exec(string). ...
If the regular expression includes the g flag, the method returns an Array containing all matches.
Since you had the g flag, the RegExp was trying to find all global matches (basically ignoring your groupings), returning ['{{hello}}'].
If you remove the g flag (or alternatively use /{{(.+)}}/i.exec(str)), you can get your groupings returned.
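A quick comparison of the three variants on the question's string, as a sketch:

```javascript
var str = '{{hello}}';

var withoutG = str.match(/{{(.+)}}/i);   // no g: groups are included
console.log(withoutG[1]);                // "hello"

var viaExec = /{{(.+)}}/i.exec(str);     // exec always includes groups
console.log(viaExec[1]);                 // "hello"

var withG = str.match(/{{(.+)}}/ig);     // g: full matches only
console.log(withG.length, withG[1]);     // 1 undefined
```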

Iterate through regular expression array in Javascript

How can I use an array of regular expressions and iterate over that array with the 'exec' operation? I initialized an array with various regular expressions like this:
var arrRegex = new Array(/(http:\/\/(?:.*)\/)/g, /(http:\/\/(?:.*)\/)/g);
Now I created a for loop that does this:
for(i=0;i<arrRegex.length;i++){
arrRegex[i].exec(somestring);
}
The thing is that this doesn't seem to work. I don't want to hardcode it like this:
(/(http:\/\/(?:.*)\/)/g).exec(somestring);
When using the array option, the '.exec' function returns null. When I use the hardcoded option it returns the matches as I wanted.
The exec() returns the match so you should be able to capture it.
somestring = 'http://stackoverflow.com/questions/11491489/iterate-through-regular-expression-array-in-javascript';
var arrRegex = new Array(/(http:\/\/(?:.*)\/)/g, /(http:\/\/(?:.*)\/)/g);
for (i = 0; i < arrRegex.length; i++) {
match = arrRegex[i].exec(somestring);
}
match is an array, with the following structure:
{
[0] = 'string matched by the regex'
[1] = 'match from the first capturing group'
[2] = 'match from the second capturing group'
... and so on
}
Take a look at this jsFiddle http://jsfiddle.net/HHKs2/1/
You can also use test() instead of exec() as a shorthand for exec() != null. test() will return a boolean variable depending on whether the regex matches part of the string or not.
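One caveat worth sketching: on a regex with the g flag, test() advances lastIndex just like exec() does, so repeated calls on the same regex object can alternate between true and false. This may be related to the null seen in the question if the stored regexes are reused, though that is a guess. The URL below is a made-up sample:

```javascript
var url = 'http://example.com/path'; // hypothetical sample input

var plainRe = /http:\/\//;           // no g flag: every call starts at index 0
var plainResults = [plainRe.test(url), plainRe.test(url)];
console.log(plainResults);           // [true, true]

var globalRe = /http:\/\//g;         // g flag: each call resumes at lastIndex
var globalResults = [globalRe.test(url), globalRe.test(url), globalRe.test(url)];
console.log(globalResults);          // [true, false, true]
```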
What you probably want to do is to capture the first group:
for(i=0;i<arrRegex.length;i++){
var someotherstring = arrRegex[i].exec(somestring)[1];
// do something with it ...
}
BTW: That is my guess; I'm not sure what you are trying to do. But if you are trying to get the host name of a URL, you should use /(http:\/\/(?:.*?)\/)/g. The question mark after .* makes the previous quantifier (*) ungreedy.
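The difference between the greedy .* and the ungreedy .*? can be seen on a shortened version of the URL from above (a sketch; the g flag is left off so each call starts from the beginning):

```javascript
var somestring = 'http://stackoverflow.com/questions/11491489/iterate';

// Greedy: .* grabs as much as possible, so the match runs to the last slash.
var greedy = /(http:\/\/(?:.*)\/)/.exec(somestring);
console.log(greedy[1]); // "http://stackoverflow.com/questions/11491489/"

// Ungreedy: .*? grabs as little as possible, so the match stops at the first slash after the host.
var lazy = /(http:\/\/(?:.*?)\/)/.exec(somestring);
console.log(lazy[1]);   // "http://stackoverflow.com/"
```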

Regex to extract substring, returning 2 results for some reason

I need to do a lot of regex things in JavaScript but am having some issues with the syntax, and I can't seem to find a definitive resource on this. For some reason, when I do:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test)
it shows
"afskfsd33j, fskfsd33"
I'm not sure why it's giving this output of the original and the matched string. I am wondering how I can get it to give just the match (essentially extracting the part I want from the original string).
Thanks for any advice
match returns an array.
The default string representation of an array in JavaScript is the elements of the array separated by commas. In this case the desired result is in the second element of the array:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test[1]);
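A small sketch that makes the array and its string form visible without alert:

```javascript
var result = 'afskfsd33j'.match(/a(.*)j/);

console.log(Array.isArray(result)); // true
console.log(String(result));        // "afskfsd33j,fskfsd33" (elements joined by commas)
console.log(result[1]);             // "fskfsd33"
```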
Each group defined by parentheses () is captured during processing, and each captured group's content is pushed into the result array in the same order as the groups start within the pattern. See more at http://www.regular-expressions.info/brackets.html and http://www.regular-expressions.info/refcapture.html (choose the right language to see supported features).
var source = "afskfsd33j"
var result = source.match(/a(.*)j/);
result: ["afskfsd33j", "fskfsd33"]
The reason why you received this exact result is the following:
The first value in the array is the first substring that matches the entire pattern. So it starts with "a", followed by any number of any characters, and ends with the last "j" that can be reached after the starting "a", because .* is greedy.
The second value in the array is the captured group defined by the parentheses. In your case the group contains the entire pattern match minus the content defined outside the parentheses, so exactly "fskfsd33".
If you want to get rid of the second value in the array, you can define the pattern like this:
/a(?:.*)j/
where "?:" means that the text matched by the group in parentheses will not be part of the resulting array.
Another option in this simple case is to write the pattern without any group, because a group is not necessary at all:
/a.*j/
If you just want to check whether the source text matches the pattern and do not care which text was found, then you may try:
var result = /a.*j/.test(source);
The result is then only true or false. For more info see http://www.javascriptkit.com/javatutors/re3.shtml
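The options above compared in one place, as a sketch:

```javascript
var source = 'afskfsd33j';

var capturing = source.match(/a(.*)j/);       // full match plus one group
console.log(capturing.length);                // 2

var nonCapturing = source.match(/a(?:.*)j/);  // group is matched but not stored
console.log(nonCapturing.length, nonCapturing[0]); // 1 "afskfsd33j"

var matched = /a.*j/.test(source);            // boolean only, no array at all
console.log(matched);                         // true
```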
I think your problem is that the match method is returning an array. The 0th item in the array is the full matched text (here the same as the original string); the 1st through nth items correspond to the 1st through nth parenthesised groups. Your "alert()" call is showing the entire array.
Just get rid of the parentheses and that will give you an array with one element:
Change this line
var test = tesst.match(/a(.*)j/);
To this
var test = tesst.match(/a.*j/);
If you add parentheses, the match() function will return two items: one for the whole expression and one for the expression inside the parentheses.
Also, according to the developer.mozilla.org docs:
If you only want the first match found, you might want to use RegExp.exec() instead.
You can use the below code:
RegExp(/a.*j/).exec("afskfsd33j")
I've just had the same problem.
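You get the text twice in your result when the whole pattern is wrapped in a capture group (in parentheses): once as the full match and once as the group. With the 'g' (global) modifier, match() drops the groups, but exec() still returns them.
The first item is always the full match, which is normally fine when using match(reg) on a short string. However, when using a construct like: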
while ((result = reg.exec(string)) !== null){
console.log(result);
}
the results are a little different.
Try the following code:
var regEx = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
var result = sampleString.match(regEx);
console.log(JSON.stringify(result));
// ["1 cat","2 fish"]
var reg = new RegExp('[0-9]+ (cat|fish)','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null) {
console.dir(JSON.stringify(result))
};
// '["1 cat","cat"]'
// '["2 fish","fish"]'
var reg = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null){
console.dir(JSON.stringify(result))
};
// '["1 cat","1 cat","cat"]'
// '["2 fish","2 fish","fish"]'
(tested on recent V8 - Chrome, Node.js)
The best answer is currently a comment which I can't upvote, so credit to @Mic.
