Regex working in regex tester but not in JS (wrong matches) - javascript

This is actually the first time I have encountered this problem.
I am trying to parse a string for key-value pairs where the separator can be one of several characters. It works fine in every regex tester but not in my current JS project. I already found out that JS regex works differently than, for example, PHP's. But I couldn't figure out what to change in mine.
My regex is the following:
[(\?|\&|\#|\;)]([^=]+)\=([^\&\#\;]+)
it should match:
#foo=bar#site=test
MATCH 1
1. [1-4] `foo`
2. [5-8] `bar`
MATCH 2
1. [9-13] `site`
2. [14-18] `test`
and the JS is:
'#foo=bar#site=test'.match(/[(\?|\&|\#|\;)]([^=]+)\=([^\&\#\;]+)/g);
Result:
["#foo=bar", "#site=test"]
To me it looks like the grouping is not working properly.
Is there a way around this?

When the regex has the g flag, String#match returns only the full matches and leaves out the capture groups. Instead, you loop through regex.exec:
var match;
while (match = regex.exec(str)) {
// Use each result
}
If (like me) that assignment in the condition bothers you, you can use a !! to make it clearly a test:
var match;
while (!!(match = regex.exec(str))) {
// Use each result
}
Example:
var regex = /[(\?|\&|\#|\;)]([^=]+)\=([^\&\#\;]+)/g;
var str = '#foo=bar#site=test';
var match;
while (!!(match = regex.exec(str))) {
console.log("result", match);
}
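On newer runtimes the same loop can be written with String.prototype.matchAll (ES2020), which yields one match array per hit with the capture groups included. This is only a minimal sketch, assuming such a runtime is available. parsePairs is a hypothetical helper name, and the separator class is simplified to [?&#;] because the parentheses and pipes inside the original brackets are treated as literal characters, not as grouping or alternation:

```javascript
// Assumes an ES2020+ runtime where String.prototype.matchAll exists.
// The pattern needs the g flag, otherwise matchAll throws a TypeError.
const pairPattern = /[?&#;]([^=]+)=([^&#;]+)/g;

// Hypothetical helper: collects key/value pairs into a plain object.
function parsePairs(text) {
  const pairs = {};
  for (const m of text.matchAll(pairPattern)) {
    // m[0] is the whole match, m[1] the key, m[2] the value
    pairs[m[1]] = m[2];
  }
  return pairs;
}

console.log(parsePairs('#foo=bar#site=test'));
```

If the runtime is older, the exec loop above does the same job.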

I wouldn't rely on a complex regex, which is hard to read and can fail for obscure reasons. Instead, use simple functions to split the string:
var str = '#foo=bar#site=test';
// split it by # and loop over the pieces
str.split('#').forEach(function (splitted) {
  // split by =
  var splitted2 = splitted.split('=');
  if (splitted2.length === 2) {
    // here you have
    // splitted2[0] = foo
    // splitted2[1] = bar
    // and in the next iteration
    // splitted2[0] = site
    // splitted2[1] = test
  }
});
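The split approach above only handles # as a separator, while the question allows ?, &, # and ;. A rough sketch of how it could be extended, assuming values may themselves contain = (so only the first = in each piece is treated as the delimiter). parseBySplitting is a made-up name for illustration:

```javascript
// Hypothetical helper: split on any of the separators, then on the first "=".
function parseBySplitting(text) {
  const result = {};
  text.split(/[?&#;]/).forEach(function (chunk) {
    const eq = chunk.indexOf('=');
    // skip empty pieces and pieces without a key before "="
    if (eq > 0) {
      result[chunk.slice(0, eq)] = chunk.slice(eq + 1);
    }
  });
  return result;
}

console.log(parseBySplitting('#foo=bar#site=test'));
```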

Related

Using Variable in Regex Character Set

I'm trying to use a variable (save) as a regex character set but keep getting null.
function mutation(arr) {
var save = arr[1];
var rgx = /[save]/gi;
return arr[0].match(rgx).join('') == arr[0];
}
mutation(["Mary", "Army"]);
The goal of the function is to see if all the letters of arr[1] are contained in arr[0], returning true or false. The function does work as I want when I manually put arr[1] into the character set (it returns true in this situation); I just can't get it to work with the variable.
Your current approach won't work because a regex literal (/.../) cannot interpolate a variable: /[save]/ matches the literal characters s, a, v and e rather than the contents of save. But we can still use the RegExp constructor to build the pattern. For the sample data you showed us, here is a regex pattern which would work:
^(?!.*[^Army]).*$
In other words, we assert that the first string, Mary, contains no character outside the set built from the second string, Army.
function mutation(arr) {
var save = arr[1];
var rgx = "^(?!.*[^" + save + "]).*$";
var re = new RegExp(rgx, "gi");
return re.test(arr[0]);
}
console.log(mutation(["Mary", "Army"]));
console.log(mutation(["Jon Skeet", "Tim Biegeleisen"]));
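One caveat with building a character class from a variable: characters such as ], \, ^ and - have a special meaning inside [...], so an input containing them can change or break the pattern. A possible sketch that escapes them first. Both function names are hypothetical, and this assumes the check is case-insensitive as in the answer above:

```javascript
// Hypothetical helper: escape the characters that are special inside [...].
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]^-]/g, '\\$&');
}

// True when every character of `candidate` appears in `allowed` (ignoring case).
function onlyUsesLettersOf(candidate, allowed) {
  const re = new RegExp('^[' + escapeForCharClass(allowed) + ']*$', 'i');
  return re.test(candidate);
}

console.log(onlyUsesLettersOf('Mary', 'Army'));
console.log(onlyUsesLettersOf('Jon Skeet', 'Tim Biegeleisen'));
```

Without the escaping, an allowed string like a-c would be read as the range a to c instead of the three characters a, - and c.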

javascript Reg Backwards not works

I want to match a number in a string:
'abc#2003, or something else #2017'
I want to get the result [2003, 2017] with the match function.
let strReg = 'abc#2003, or something else #2017';
let reg = new RegExp(/(?=(#\d+))\1/);
strReg.match(reg) //[ '#2003 ', '#2017 ' ]
let reg1 = new RegExp(/(?=#(\d+))\1/)
strReg.match(reg1) //null, but I expect [2003, 2017]
The result means '\1' matches after '?=': ?=()\1 works, ?=#()\1 does not.
JavaScript only supports backwards; what should I do to match '#' but ignore it?
I take it that you want an array of the results, so...
var s = "abc#2003, or something else #2017 not the 2001 though";
var re = /#(\d+)/g;
var result = [];
var match = re.exec(s);
while (match !== null) {
result.push(parseInt(match[1]));
match = re.exec(s);
}
console.log(result);
Outputs:
Array [ 2003, 2017 ]
match[0] is the entire match, match[1] is the captured group.
Also, see How do you access the matched groups in a JavaScript regular expression?
Inspired by javascript regex - look behind alternative?, if you want to do it as almost a one-liner:
var re = /(\d+)(?=#)/g; /* write the regex backwards */
var result = [];
s.split('').reverse().join('').match(re).forEach(function (el) { result.push(parseInt(el.split('').reverse().join(''))); });
console.log(result.reverse());
Caveat: Who wrote this programming saying? “Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.”
A small change to your code does the job:
/#(\d+)/g
The number following # is captured in group 1, as you required.
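For what it's worth, engines that implement ES2018 support lookbehind directly, so the # can be required without being part of the match. A small sketch, assuming such an engine (older browsers will throw a SyntaxError on this pattern). numbersAfterHash is just an illustrative name:

```javascript
// Assumes ES2018 lookbehind support: (?<=#) requires a "#" right before
// the digits but does not include it in the match.
function numbersAfterHash(text) {
  const found = text.match(/(?<=#)\d+/g) || []; // match returns null when nothing is found
  return found.map(Number);
}

console.log(numbersAfterHash('abc#2003, or something else #2017 not the 2001 though'));
// [ 2003, 2017 ]
```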

Regex extracting multiple matches for string [duplicate]

I'm trying to obtain all possible matches from a string using regex with javascript. It appears that my method of doing this is not matching parts of the string that have already been matched.
Variables:
var string = 'A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y';
var reg = /A[0-9]+B[0-9]+Y:A[0-9]+B[0-9]+Y/g;
Code:
var match = string.match(reg);
All matched results I get:
A1B1Y:A1B2Y
A1B5Y:A1B6Y
A1B9Y:A1B10Y
Matched results I want:
A1B1Y:A1B2Y
A1B2Y:A1B3Y
A1B5Y:A1B6Y
A1B6Y:A1B7Y
A1B9Y:A1B10Y
A1B10Y:A1B11Y
In my head, I want A1B1Y:A1B2Y to be a match along with A1B2Y:A1B3Y, even though A1B2Y in the string will need to be part of two matches.
Without modifying your regex, you can set it to start matching at the beginning of the second half of the match after each match using .exec and manipulating the regex object's lastIndex property.
var string = 'A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y';
var reg = /A[0-9]+B[0-9]+Y:A[0-9]+B[0-9]+Y/g;
var matches = [], found;
while (found = reg.exec(string)) {
matches.push(found[0]);
reg.lastIndex -= found[0].split(':')[1].length;
}
console.log(matches);
//["A1B1Y:A1B2Y", "A1B2Y:A1B3Y", "A1B5Y:A1B6Y", "A1B6Y:A1B7Y", "A1B9Y:A1B10Y", "A1B10Y:A1B11Y"]
As per Bergi's comment, you can also take the index of the last match and increment it by 1, so that instead of resuming from the second half of the match, the regex starts attempting to match from the second character of each match onwards:
reg.lastIndex = found.index+1;
The final outcome is the same. Though, Bergi's update has a little less code and performs slightly faster. =]
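Put together, the index+1 variant could look roughly like this. overlappingMatches is a hypothetical wrapper, and it copies the regex so that the caller's lastIndex is left alone:

```javascript
// Hypothetical helper: returns all matches, including overlapping ones,
// by restarting the search one character after the start of each match.
function overlappingMatches(text, pattern) {
  // Work on a copy and make sure the g flag is set, since exec only
  // honours lastIndex for global (or sticky) regexes.
  const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
  const re = new RegExp(pattern.source, flags);
  const out = [];
  let found;
  while ((found = re.exec(text)) !== null) {
    out.push(found[0]);
    re.lastIndex = found.index + 1;
  }
  return out;
}

const sampleString = 'A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y';
console.log(overlappingMatches(sampleString, /A[0-9]+B[0-9]+Y:A[0-9]+B[0-9]+Y/));
```

Because lastIndex always moves past the start of the previous match, the loop also terminates for patterns that can match the empty string.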
You cannot get the direct result from match, but it is possible to produce the result via RegExp.exec and with some modification to the regex:
var regex = /A[0-9]+B[0-9]+Y(?=(:A[0-9]+B[0-9]+Y))/g;
var input = 'A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y'
var arr;
var results = [];
while ((arr = regex.exec(input)) !== null) {
results.push(arr[0] + arr[1]);
}
I used zero-width positive look-ahead (?=pattern) in order not to consume the text, so that the overlapping portion can be rematched.
Actually, it is possible to abuse the replace method to achieve the same result:
var input = 'A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y'
var results = [];
input.replace(/A[0-9]+B[0-9]+Y(?=(:A[0-9]+B[0-9]+Y))/g, function ($0, $1) {
results.push($0 + $1);
return '';
});
However, since it is replace, it does extra useless replacement work.
Unfortunately, it's not quite as simple as a single string.match.
The reason is that you want overlapping matches, which the /g flag doesn't give you.
You could use lookahead:
var re = /A\d+B\d+Y(?=:A\d+B\d+Y)/g;
But now you get:
string.match(re); // ["A1B1Y", "A1B2Y", "A1B5Y", "A1B6Y", "A1B9Y", "A1B10Y"]
The reason is that lookahead is zero-width, meaning that it just says whether the pattern comes after what you're trying to match or not; it doesn't include it in the match.
You could use exec to try and grab what you want. If a regex has the /g flag, you can run exec repeatedly to get all the matches:
// using re from above to get the overlapping matches
var m;
var matches = [];
var re2 = /A\d+B\d+Y:A\d+B\d+Y/g; // make another regex to get what we need
while ((m = re.exec(string)) !== null) {
// m is a match object, which has the index of the current match
matches.push(string.substring(m.index).match(re2)[0]);
}
// matches now contains:
// [
//   "A1B1Y:A1B2Y",
//   "A1B2Y:A1B3Y",
//   "A1B5Y:A1B6Y",
//   "A1B6Y:A1B7Y",
//   "A1B9Y:A1B10Y",
//   "A1B10Y:A1B11Y"
// ]
Alternatively, you could split the original string on :, then loop through the resulting array, pulling out the pairs where array[i] and array[i+1] both match the pattern you want.
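That split-and-loop idea might be sketched like this. adjacentPairs is an invented name, and note one assumption: each piece is tested against an anchored pattern, which is slightly stricter than the unanchored regex in the question:

```javascript
// Hypothetical helper: split on ":" and keep every neighbouring pair
// in which both items look like A<digits>B<digits>Y.
function adjacentPairs(text) {
  const item = /^A\d+B\d+Y$/; // no g flag, so test() keeps no state between calls
  const parts = text.split(':');
  const pairs = [];
  for (let i = 0; i + 1 < parts.length; i++) {
    if (item.test(parts[i]) && item.test(parts[i + 1])) {
      pairs.push(parts[i] + ':' + parts[i + 1]);
    }
  }
  return pairs;
}

console.log(adjacentPairs('A1B1Y:A1B2Y:A1B3Y:A1B4Z:A1B5Y:A1B6Y:A1B7Y:A1B8Z:A1B9Y:A1B10Y:A1B11Y'));
```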

Counting all the occurrences of a substring in a string using regular expression

I've seen many examples of this but they didn't help. I have the following string:
var str = 'asfasdfasda'
and I want to extract the following
asfa asfasdfa asdfa asdfasda asda
i.e. all substrings starting with 'a' and ending with 'a'.
here is my regular expression
/a+[a-z]*a+/g
but this always returns me only one match:
[ 'asfasdfasda' ]
Can someone point out the mistake in my implementation?
Thanks.
Edit: Corrected the number of substrings needed. Please note that overlapping and duplicate substrings are required as well.
To capture overlapping matches you need a lookahead regex, then grab captured groups #1 and #2:
/(?=(a.*?a))(?=(a.*a))/gi
Explanation:
(?=...) is called a lookahead which is a zero-width assertion like anchors or word boundary. It just looks ahead but doesn't move the regex pointer ahead thus giving us the ability to grab overlapping matches in groups.
Code:
var re = /(?=(a.*?a))(?=(a.*a))/gi;
var str = 'asfasdfasda';
var m;
var result = {};
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex)
re.lastIndex++;
result[m[1]]=1;
result[m[2]]=1;
}
console.log(Object.keys(result));
//=> ["asfa", "asfasdfasda", "asdfa", "asdfasda", "asda"]
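The regex engine doesn't go back over text it has already consumed, so the a that ends one match can't also start the next one. A workaround with a modified input string: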
var str = 'asfaasdfaasda'; // you need to have extra 'a' to mark the start of next string
var substrs = str.match(/a[b-z]*a/g); // notice the regular expression is changed.
alert(substrs)
You can count it this way:
var str = "asfasdfasda";
var regex = /a+[a-z]*a+/g, result, indices = [];
while ((result = regex.exec(str))) {
console.log(result.index); // you can instead count the values here.
}
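If the goal really is every substring that starts and ends with 'a', a plain loop over the positions of 'a' may be simpler than any regex, since a single regex pass reports at most a fixed number of results per starting position. A minimal sketch. substringsBetween is a hypothetical name, and it returns every start/end combination, including the whole string when it begins and ends with the character:

```javascript
// Hypothetical helper: list every substring that starts and ends with `ch`.
function substringsBetween(text, ch) {
  // First collect the index of every occurrence of ch.
  const positions = [];
  for (let i = 0; i < text.length; i++) {
    if (text[i] === ch) positions.push(i);
  }
  // Then pair each occurrence with every later occurrence.
  const out = [];
  for (let s = 0; s < positions.length; s++) {
    for (let e = s + 1; e < positions.length; e++) {
      out.push(text.slice(positions[s], positions[e] + 1));
    }
  }
  return out;
}

const found = substringsBetween('asfasdfasda', 'a');
console.log(found, found.length);
```

The count is then just the length of the returned array.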

Regular Expression to find complex markers

I want to use JavaScript's regular expression something like this
/marker\d+"?(\w+)"?\s/gi
In a string like this:
IDoHaveMarker1"apple" IDoAlsoHaveAMarker352pear LastPointMakingmarker3134"foo"
And I want it to return an array like this:
[ "apple", "pear", "foo" ]
The quotes are to make clear they are strings. They shouldn't be in the result.
If you are asking about how to actually use the regex:
To get all captures of multiple (global) matches you have to use a loop and exec in JavaScript:
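var input = 'IDoHaveMarker1"apple" IDoAlsoHaveAMarker352pear LastPointMakingmarker3134"foo"'; // sample string from the question
var regex = /marker\d+"?(\w+)/gi;
var result = [];
var match;
while (match = regex.exec(input)) {
  result.push(match[1]);
}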
(Note that you can omit the trailing "?\s? if you are only interested in the capture, since they are optional anyway, so they don't affect the matched result.)
And no, g will not allow you to do all of that in one call. If you had omitted g then exec would return the same match every time.
As Blender mentioned, if you want to rule out things like Marker13"something Marker14bar (unmatched ") you need to use another capturing group and a backreference. Note that this will push your desired capture to index 2:
var regex = /marker\d+("?)(\w+)\1/gi;
var result = [];
var match;
while (match = regex.exec(input)) {
result.push(match[2]);
}
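On an ES2020+ runtime the backreference version can also be written without the explicit loop by using String.prototype.matchAll. A small sketch under that assumption. markerValues is a made-up name and the sample string is the one from the question:

```javascript
// Assumes String.prototype.matchAll is available (ES2020+).
// Group 1 is the optional opening quote, group 2 the value we want.
function markerValues(text) {
  const re = /marker\d+("?)(\w+)\1/gi;
  return Array.from(text.matchAll(re), function (m) {
    return m[2];
  });
}

const sample = 'IDoHaveMarker1"apple" IDoAlsoHaveAMarker352pear LastPointMakingmarker3134"foo"';
console.log(markerValues(sample));
```

One thing to watch with this pattern: \w also matches digits, so for input like Marker13"something the engine can backtrack to \d+ matching only 1 and capture 3 with an empty quote group. The unmatched quote is therefore not rejected outright.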
