Regex to extract substring, returning 2 results for some reason - javascript

I need to do a lot of regex work in JavaScript but am having some trouble with the syntax, and I can't seem to find a definitive resource on it. For some reason, when I do:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test)
it shows
"afskfsd33j, fskfsd33"
I'm not sure why it's giving both the original and the matched string. How can I get it to give just the match (essentially extracting the part I want from the original string)?
Thanks for any advice

match returns an array.
The default string representation of an array in JavaScript is the elements of the array separated by commas. In this case the desired result is in the second element of the array:
var tesst = "afskfsd33j"
var test = tesst.match(/a(.*)j/);
alert (test[1]);
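A minimal sketch of what is happening, reusing the string from the question. The variable names are only illustrative, and plain variables stand in for alert() so it also runs outside a browser:

```javascript
// match() without the g flag returns an array: [whole match, group 1, ...]
const source = "afskfsd33j";
const match = source.match(/a(.*)j/);

// Converting the array to a string joins its elements with commas,
// which is what alert(test) displayed.
const asText = String(match);

// match() returns null when nothing matches, so guard before indexing.
const inner = match ? match[1] : null;

console.log(asText); // afskfsd33j,fskfsd33
console.log(inner);  // fskfsd33
```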

Each group defined by parentheses () is captured during processing, and the content of each captured group is pushed into the result array in the same order as the groups start within the pattern. See more at http://www.regular-expressions.info/brackets.html and http://www.regular-expressions.info/refcapture.html (choose the right language to see supported features)
var source = "afskfsd33j"
var result = source.match(/a(.*)j/);
result: ["afskfsd33j", "fskfsd33"]
The reason why you received this exact result is the following:
The first value in the array is the first string found that matches the entire pattern. So it starts with "a", followed by any number of any characters, and ends with a "j". Because .* is greedy, it ends at the last "j" that follows the starting "a".
The second value in the array is the captured group defined by the parentheses. In your case the group contains the entire pattern match minus the content defined outside the parentheses, so exactly "fskfsd33".
If you want to get rid of second value in array you may define pattern like this:
/a(?:.*)j/
where "?:" means that the characters matched inside the parentheses will not be added to the resulting array.
Another option in this simple case is to write the pattern without any group, because a group is not needed at all:
/a.*j/
If you just want to check whether the source text matches the pattern and do not care which text was found, then you may try:
var result = /a.*j/.test(source);
The result will then be only true or false. For more info see http://www.javascriptkit.com/javatutors/re3.shtml
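To see the variants from this answer side by side, here is a small sketch. The second test string "a1j2j" is made up purely to show how the greedy .* behaves:

```javascript
const text = "afskfsd33j";

// Capturing group: whole match plus one captured substring.
const capturing = text.match(/a(.*)j/);
// Non-capturing group: only the whole match is stored.
const nonCapturing = text.match(/a(?:.*)j/);
// test() only reports whether the pattern matches at all.
const found = /a.*j/.test(text);

// .* is greedy and runs to the LAST "j"; .*? stops at the first one.
const greedy = "a1j2j".match(/a.*j/)[0];
const lazy = "a1j2j".match(/a.*?j/)[0];

console.log(capturing.length, nonCapturing.length, found); // 2 1 true
console.log(greedy, lazy); // a1j2j a1j
```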

I think your problem is that the match method is returning an array. The 0th item in the array is the entire matched text (here it happens to be the whole original string), and the 1st through nth items correspond to the 1st through nth parenthesised groups. Your "alert()" call is showing the entire array.

Just get rid of the parentheses; that will give you an array with one element.
Change this line
var test = tesst.match(/a(.*)j/);
To this
var test = tesst.match(/a.*j/);
If you add parentheses, the match() function will return two entries: one for the whole expression and one for the expression inside the parentheses.
Also, according to the developer.mozilla.org docs:
If you only want the first match found, you might want to use
RegExp.exec() instead.
You can use the code below:
RegExp(/a.*j/).exec("afskfsd33j")

I've just had the same problem.
With match(), you only get the text twice in your result if you include a match group (in brackets) and leave out the 'g' (global) modifier; with 'g', match() returns just the full matches and drops the groups.
The first item is always the full match, which is normally fine when using match(reg) on a short string. However, when using a construct like:
while ((result = reg.exec(string)) !== null){
console.log(result);
}
the results are a little different.
Try the following code:
var regEx = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
var result = sampleString.match(regEx);
console.log(JSON.stringify(result));
// ["1 cat","2 fish"]
var reg = new RegExp('[0-9]+ (cat|fish)','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null) {
console.dir(JSON.stringify(result))
};
// '["1 cat","cat"]'
// '["2 fish","fish"]'
var reg = new RegExp('([0-9]+ (cat|fish))','g'), sampleString="1 cat and 2 fish";
while ((result = reg.exec(sampleString)) !== null){
console.dir(JSON.stringify(result))
};
// '["1 cat","1 cat","cat"]'
// '["2 fish","2 fish","fish"]'
(tested on recent V8 - Chrome, Node.js)
The best answer is currently a comment which I can't upvote, so credit to #Mic.
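If the goal is every match together with its groups, String.prototype.matchAll (ES2020) may be a tidier alternative to the exec() loop. This is only a sketch; the pattern is a variation on the one above with the number in its own group:

```javascript
const sampleText = "1 cat and 2 fish";
// matchAll() requires the g flag and throws a TypeError without it.
const pattern = /([0-9]+) (cat|fish)/g;

// Each item is an exec-style array: [whole match, group 1, group 2].
const pairs = Array.from(sampleText.matchAll(pattern), (m) => [m[1], m[2]]);

console.log(JSON.stringify(pairs)); // [["1","cat"],["2","fish"]]
```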

Related

search text in array using regex in javascript

I am very new to searching for text in arrays. Some elements in my array are ranges, i.e. anything can follow a certain prefix, such as AA and A here. I have a multi-dimensional array and I want to search for the text in each sub-array, so I wrote something like this. I put AA* in the array so that only the first 2 characters have to match, and A* so that only one character has to match.
arr = [
["AA*","ABC","XYZ"] ,
["A*","AXY","AAJ"]
]
var text = "AA3";
for ($i=0; $i<arr.length; $i++ ){
var new_array = [];
new_array = arr[$i];
new_array.filter(function(array_element) {
var result = new RegExp(array_element).test(text);
if( result == true){
console.log(arr[$i]);
}
});
}
What I want is this: when text = "AA3", or anything else that starts with a double A (AA[anything]), the output should be the first array, ["AA*","ABC","XYZ"], but I am getting both arrays. When text = "A3", the output should be the second array, ["A*","AXY","AAJ"], but again I am getting both. If text = "ABC" or text = "AAJ", it should output the first or second array respectively. I don't know how to write regex; is there any other way I can implement this?
Thanks in advance, any advice will be helpful.
Summary
In short, the issue is "*"! The * found in the members of each sub-array is why you're getting both arrays each time.
Detailed Info
Regexp is one concept most developers find hard to understand (I am one of them, btw 😅).
I'll start off with an excerpt from the intro to Regexp on MDN:
Regexp are patterns used to match character combinations in strings - MDN
With that in mind, you want to understand what goes on in your code.
When you create a regex like /A*/ and test it against "AA3", it can match "", "A", "AA", and so on, so test() returns true. You would want stricter matching, for example anchoring with ^ or $, or explicitly matching a digit with \d.
I rewrote your code as a function below:
arr = [
["AA*", "ABC", "XYZ"],
["A*", "AXY", "AAJ"],
];
findInArray(arr, "AA3") // returns []: no item contains the literal text "AA3"
findInArray(arr, "AAJ") // returns the second array
findInArray(arr, "ABC") // returns the first array
function findInArray(array, value) {
return array.filter((subArray) =>
subArray.some((item) => {
const check = new RegExp(value);
return check.test(item);
})
);
}
Problem
The problem lies in the fact that you use each of the strings as a regex.
For a string with a * wildcard, this evaluates to zero or more matches of the immediately preceding item, so "AA*" and "A*" both match any text that contains an "A" (and "A*" on its own matches any text at all).
For a string consisting solely of alphanumerics, this is an unanchored substring test, which is true whenever the text contains that string anywhere.
For strings containing characters that are part of regex syntax, this could result in errors or unintended behavior.
MDN article on RegExp quantifiers
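A quick sketch of these three cases. The strings "XYZ", "AB" and "xxABxx" are made up for illustration:

```javascript
const text = "AA3";

// "A*" means "zero or more A", so it can match the empty string anywhere.
const starMatchesAnything = new RegExp("A*").test("XYZ");
// Both wildcard entries from the question therefore match "AA3".
const firstHit = new RegExp("AA*").test(text);
const secondHit = new RegExp("A*").test(text);
// A plain string behaves as an unanchored substring search.
const substringHit = new RegExp("AB").test("xxABxx");

// A quantifier with nothing before it is not a valid pattern.
let threw = false;
try {
  new RegExp("*");
} catch (e) {
  threw = e instanceof SyntaxError;
}

console.log(starMatchesAnything, firstHit, secondHit, substringHit, threw);
// true true true true true
```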
Rewrite
Assumptions:
The value with a * wildcard is always only at the 0th position,
There is only one such wildcard in its string,
The question mentions that for text = 'AAJ' only the 2nd array shall be returned, but both the AA* from the 1st array and AAJ from the 2nd would seem to match this text.
As such, I assume the wildcard can only stand for a number (as other examples seem to suggest).
Code:
const abc = (arrs, text) => {
return arrs.filter(arr => {
const regex = new RegExp(`^${arr[0].replace('*', '\\d+')}$`);
return regex.test(text) || arr.includes(text);
})
}
const arr = [
["AA*", "ABC", "XYZ"],
["A*", "AXY", "AAJ"]
];
console.log(
`1=[${abc(arr, "AA3")}]
2=[${abc(arr, "ABC")}]
3=[${abc(arr, "AAJ")}]`);
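If the wildcard can appear in any position and entries might contain regex metacharacters, a variation along these lines may be safer. It is only a sketch: wildcardToRegExp and findRows are hypothetical names, and it keeps the assumption above that * stands for one or more digits.

```javascript
// Hypothetical helper (not from the answer): builds an anchored pattern
// from one entry, treating "*" as "one or more digits" as assumed above.
function wildcardToRegExp(entry) {
  // Escape every regex metacharacter so the entry is matched literally.
  const escaped = entry.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The escaped wildcard now reads "\*"; swap it for \d+.
  return new RegExp("^" + escaped.replace(/\\\*/g, "\\d+") + "$");
}

const rows = [["AA*", "ABC", "XYZ"], ["A*", "AXY", "AAJ"]];
const findRows = (text) =>
  rows.filter((row) => row.some((entry) => wildcardToRegExp(entry).test(text)));

console.log(findRows("AA3")); // first row only
console.log(findRows("AAJ")); // second row only
```

Escaping first means an entry such as "A.B" only matches the literal text "A.B" rather than any three characters starting with A and ending with B.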

Extract Twitter handlers from string using regex in JavaScript

I would like to extract the Twitter handle names from a text string using a regex. I believe I am almost there, except for the ">" that ends up in my output. How can I improve my regex and drop the ">" from my output?
Here is an example of a text string value:
"PlaymakersZA, Absa, DiepslootMTB"
The desired output would be an array consisting of the following:
PlaymakersZA, Absa, DiepslootMTB
Here is an example of my regex:
var array = str.match(/>[a-z-_]+/ig)
Thank you!
You can use match groups in your regex to indicate the part you wish to extract.
I set up this JSFiddle to demonstrate.
Basically, you surround the part of the regex that you want to extract in parentheses: />([a-z-_]+)/ig, save it as an object, and call .exec() as long as there are still matches. Index 1 of the resulting array holds the first match group's result. Index 0 is the whole match, and the following indices are subsequent match groups, if any.
var str = "PlaymakersZA, Absa, DiepslootMTB";
var regex = />([a-z-_]+)/ig
var array = regex.exec(str);
while (array != null) {
alert(array[1]);
array = regex.exec(str);
}
You could just strip all the HTML
var str = "PlaymakersZA, Absa, DiepslootMTB";
$handlers = str.replace(/<[^>]*>|\s/g,'').split(",");
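The sample string above appears without its markup, so the following sketch assumes, purely for illustration, that each handle is the text of an anchor tag. The html value is made up and is not the asker's actual input:

```javascript
// Hypothetical input: the real markup is not shown in the question.
const html =
  '<a href="#">PlaymakersZA</a>, <a href="#">Absa</a>, <a href="#">DiepslootMTB</a>';

// Capture the text that follows ">" and keep only group 1 of every match,
// so the ">" itself never reaches the output.
const handles = Array.from(html.matchAll(/>([a-z_-]+)/gi), (m) => m[1]);

console.log(handles); // [ 'PlaymakersZA', 'Absa', 'DiepslootMTB' ]
```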

JavaScript regex back references returning an array of matches from single capture group (multiple groups)

I'm fairly sure after spending the night trying to find an answer that this isn't possible, and I've developed a work around - but, if someone knows of a better method, I would love to hear it...
I've gone through a lot of iterations on the code, and the following is just a line of thought really. At some point I was using the global flag, I believe, in order for match() to work, and I can't remember if it was necessary now or not.
var str = "#abc#def#ghi&jkl";
var regex = /^(?:#([a-z]+))?(?:&([a-z]+))?$/;
The idea here, in this simplified code, is the optional group 1, of which there is an unspecified amount, will match #abc, #def and #ghi. It will only capture the alpha characters of which there will be one or more. Group 2 is the same, except matches on & symbol. It should also be anchored to the start and end of the string.
I want to be able to back reference all matches of both groups, ie:
result = str.match(regex);
alert(result[1]); //abc,def,ghi
alert(result[1][0]); //abc
alert(result[1][1]); //def
alert(result[1][2]); //ghi
alert(result[2]); //jkl
My mate says this works fine for him in .net, unfortunately I simply can't get it to work - only the last matched of any group is returned in the back reference, as can be seen in the following:
(additionally, making either group optional makes a mess, as does setting global flag)
var str = "#abc#def#ghi&jkl";
var regex = /(?:#([a-z]+))(?:&([a-z]+))/;
var result = str.match(regex);
alert(result[1]); //ghi
alert(result[1][0]); //g
alert(result[2]); //jkl
The following is the solution I arrived at, capturing the whole portion in question, and creating the array myself:
var str = "#abc#def#ghi&jkl";
var regex = /^([#a-z]+)?(?:&([a-z]+))?$/;
var result = regex.exec(str);
alert(result[1]); //#abc#def#ghi
alert(result[2]); //jkl
var result1 = result[1].toString();
result[1] = result1.split('#')
alert(result[1][1]); //abc
alert(result[1][2]); //def
alert(result[1][3]); //ghi
alert(result[2]); //jkl
That's simply not how .match() works in JavaScript. The returned array is an array of simple strings. There's no "nesting" of capture groups; you just count the ( symbols from left to right.
The first string (at index [0]) is always the overall matched string. Then come the capture groups, one string (or undefined, if the group did not participate in the match) per array element.
You can, as you've done, rearrange the result array to your heart's content. It's just an array.
edit — oh, and the reason your result[1][0] was "g" is that array indexing notation applied to a string gets you the individual characters of the string.
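One possible alternative to splitting by hand, sketched here on the assumption that String.prototype.matchAll (ES2020) is available: run one global pattern per marker and collect group 1 from every match. The last lines show that a repeated capture group keeps only its final repetition.

```javascript
const str = "#abc#def#ghi&jkl";

// One global pattern per marker; each match contributes its group 1.
const hashParts = Array.from(str.matchAll(/#([a-z]+)/g), (m) => m[1]);
const ampParts = Array.from(str.matchAll(/&([a-z]+)/g), (m) => m[1]);

// A repeated capture group only remembers its last repetition.
const repeated = str.match(/^(?:#([a-z]+))*(?:&([a-z]+))?$/);

console.log(hashParts);   // [ 'abc', 'def', 'ghi' ]
console.log(ampParts);    // [ 'jkl' ]
console.log(repeated[1]); // ghi
```

This does not validate the overall shape of the string the way the anchored pattern does, so the anchored test may still be wanted as a separate check.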

javascript regexp can't find way to access grouped results

re = /\/?(\w+)\/(\w+)/
s = '/projects/new'
s.match(re)
I have this regular expression which I will use to sieve out the branch name, e.g., projects, and the 'tip' name, e.g., new
I read that one can have access to the grouped results with $1, $2, and so on, but I can't seem to get it to work, at least in Firebug console
When I run the above code, then run
RegExp.$1
it shows
""
Same goes on for $2.
any ideas?
Thanks!
Without the g flag, str.match(regexp) returns the same as regexp.exec(str). And that is:
The returned array has the matched text as the first item, and then one item for each capturing parenthesis that matched containing the text that was captured. If the match fails, the exec method returns null.
So you can do this:
var match = s.match(re);
match[1]
match gives you an array of the matched expressions:
> s.match(re)[1]
"projects"
> s.match(re)[2]
"new"
If you are accessing the array of matches the wrong way, do something like:
re = /\/?(\w+)\/(\w+)/
s = '/projects/new'
var j = s.match(re)
alert(j[1]);
alert(j[2]);
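A short sketch of two ways to read the groups without touching RegExp.$1, which MDN lists as a deprecated legacy feature. The names branch and tip are just illustrative labels taken from the question's wording:

```javascript
const path = "/projects/new";

// Skip index 0 (the whole match) and name the two groups directly.
// The "|| []" fallback avoids a TypeError when nothing matches.
const [, branch, tip] = path.match(/\/?(\w+)\/(\w+)/) || [];

// Named groups (ES2018) expose the same values through .groups.
const named = path.match(/\/?(?<branch>\w+)\/(?<tip>\w+)/);

console.log(branch, tip);                           // projects new
console.log(named.groups.branch, named.groups.tip); // projects new
```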

Help with regex in javascript

I have this sample text, which is retrieved from the class name on an html element:
rich-message err-test1 erroractive
rich-message err-test2 erroractive
rich-message erroractive err-test1
err-test2 rich-message erroractive
I am trying to match the "test1"/"test2" data in each of these examples. I am currently using the following regex, which matches the "err-test1" type of word. I can't figure out how to limit it to just the data after the hyphen(-).
/err-(\S*)/ig
Head hurts from banging against this wall.
From what I am reading, your code already works.
RegExp.prototype.exec() returns an array of results on success and null on failure.
The first element in the array (index 0) holds the entire matched text, after which the contents of all () enclosed groups are pushed into the array.
var string = 'rich-message err-test1 erroractive';
var regex = new RegExp('err-(\\S*)', 'ig'); // the backslash must be doubled inside a string literal
var result = regex.exec(string);
alert(result[0]); // shows "err-test1"
alert(result[1]); // shows "test1"
You could try 'err-([^ \n\r]*)' - but are you sure that it is the regex that is the problem? Are you using the whole result, not just the first capture?
The stuff after the - should be in the results array. The first item is all the matching text (e.g. "err-test1") and the next items are the matches from the capture parentheses (e.g. "test1").
myregex = /err-(\S*)/ig;
mymatch = myregex.exec("data with matches in it");
testnum = mymatch[1];
Here's a reference site.
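Putting the answers together, a minimal sketch over the four sample class names from the question. The (?:^|\s) prefix is an added assumption so that only a class name starting with "err-" is picked up:

```javascript
const classNames = [
  "rich-message err-test1 erroractive",
  "rich-message err-test2 erroractive",
  "rich-message erroractive err-test1",
  "err-test2 rich-message erroractive",
];

const ids = classNames.map((cls) => {
  // No g flag here: a global regex keeps state in lastIndex between exec() calls.
  const m = /(?:^|\s)err-(\S+)/i.exec(cls);
  // exec() returns null when there is no match, so check before indexing.
  return m ? m[1] : null;
});

console.log(ids); // [ 'test1', 'test2', 'test1', 'test2' ]
```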
