Why does `pattern.test(name)` give opposite results on consecutive calls? [duplicate] - javascript

This question already has answers here:
Why does a RegExp with global flag give wrong results?
(7 answers)
Closed 7 years ago.
Why does this code return true first and then false?
var pattern = new RegExp("mstea", 'gi'), name = "Amanda Olmstead";
console.log('1', pattern.test(name));
console.log('1', pattern.test(name));

The g flag is for repeated searches. It turns the regular expression object into an iterator. If you only want to use test to check whether your string matches the pattern, remove this flag:
var pattern = new RegExp("mstea", 'i'), name = "Amanda Olmstead";
Unlike replace or match, the test function does not consume the whole iteration, which leaves the regex in a "bad" state. You should probably never use the g flag together with test.

You don't want to use the gi flags in combination with pattern.test. The g flag makes the regex keep track of where the last match ended so that the search can be resumed from there. Instead, you should use:
var pattern = new RegExp("mstea", 'i'), name = "Amanda Olmstead";
console.log('1', pattern.test(name));
console.log('1', pattern.test(name));
Also, you can use /.../[flags] syntax for regex, like so:
var pattern = /mstea/i;

Because you set the g modifier.
Remove it for your case.
var pattern = new RegExp("mstea", 'i'), name = "Amanda Olmstead";

It isn't a bug.
The g flag causes the next match attempt to start in the substring after the previous match. That is why it returns false on every even attempt.
First attempt:
It is testing "Amanda Olmstead"
Second attempt:
It is testing "d" //match found in previous attempt (performs substring there)
Third attempt:
It is testing "Amanda Olmstead" again //no match found in previous attempt
... so on
The MDN page for RegExp.exec states:
If your regular expression uses the "g" flag, you can use the exec
method multiple times to find successive matches in the same string.
When you do so, the search starts at the substring of str specified by
the regular expression's lastIndex property
MDN page for test states:
As with exec (or in combination with it), test called multiple times
on the same global regular expression instance will advance past the
previous match.
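The alternation described in this answer can be observed by logging lastIndex after each call. This is a minimal sketch using the pattern and string from the question; the `fullName` and `trace` names are only for illustration.

```javascript
// Same pattern and string as in the question.
const pattern = new RegExp("mstea", "gi");
const fullName = "Amanda Olmstead";

const trace = [];
for (let i = 0; i < 4; i++) {
  const result = pattern.test(fullName);
  // After a successful match, lastIndex points just past the match;
  // after a failed match, it is reset to 0.
  trace.push({ result, lastIndex: pattern.lastIndex });
}

console.log(trace);
// result/lastIndex per call: true/14, false/0, true/14, false/0
```

The match "mstea" ends at index 14, so the second call only looks at the trailing "d" and fails, which resets the regex for the third call.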

Related

javascript regular expression error in test function? [duplicate]

What is the meaning of the g flag in regular expressions?
What is the difference between /.+/g and /.+/?
g is for global search. Meaning it'll match all occurrences. You'll usually also see i which means ignore case.
Reference: global - JavaScript | MDN
The "g" flag indicates that the regular expression should be tested against all possible matches in a string.
Without the g flag, it'll only test for the first.
Additionally, make sure to check cchamberlain's answer below for details on how it sets the lastIndex property, which can cause unexpected side effects when re-using a regex against a series of values.
Example in Javascript to explain:
> 'aaa'.match(/a/g)
[ 'a', 'a', 'a' ]
> 'aaa'.match(/a/)
[ 'a', index: 0, input: 'aaa' ]
As @matiska pointed out, the g flag sets the lastIndex property as well.
A very important side effect of this is that if you reuse the same regex instance against a matching string, it will eventually fail, because each search starts at lastIndex rather than at the beginning of the string.
// regular regex
const regex = /foo/;
// same regex with global flag
const regexG = /foo/g;
const str = " foo foo foo ";
const test = (r) => console.log(
r,
r.lastIndex,
r.test(str),
r.lastIndex
);
// Test the normal one 4 times (success)
test(regex);
test(regex);
test(regex);
test(regex);
// Test the global one 4 times
// (3 passes and a fail)
test(regexG);
test(regexG);
test(regexG);
test(regexG);
g is the global search flag.
The global search flag makes the RegExp search for a pattern throughout the string, creating an array of all occurrences it can find matching the given pattern.
So the difference between /.+/g and /.+/ is that the g version will find every occurrence instead of just the first.
There is no difference between /.+/g and /.+/ as long as the string contains no line breaks, because they will both only ever match the whole string once. The g makes a difference if the regular expression could match more than once or contains groups, in which case .match() will return an array of the matches instead of an array of the groups.
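A short sketch of the point about groups, using a made-up sample string. It also shows a case where /.+/ and /.+/g do differ: the dot does not match line breaks, so multi-line input can be matched more than once.

```javascript
const text = "a1 b2";

// Without g: the first match plus its capture groups.
const single = text.match(/([a-z])(\d)/);
console.log(single[0], single[1], single[2]); // a1 a 1

// With g: every full match; capture groups are dropped.
const all = text.match(/([a-z])(\d)/g);
console.log(all); // [ 'a1', 'b2' ]

// The dot stops at "\n", so g matters for multi-line strings.
const lines = "one\ntwo";
const withoutG = lines.match(/.+/);
const withG = lines.match(/.+/g);
console.log(withoutG.length, withG.length); // 1 2
```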
g -> returns all matches
without g -> returns first match
example:
'1 2 1 5 6 7'.match(/\d+/) returns ["1", index: 0, input: "1 2 1 5 6 7", groups: undefined]. As you see we can only take first match "1".
'1 2 1 5 6 7'.match(/\d+/g) returns an array of all matches ["1", "2", "1", "5", "6", "7"].
Besides the already mentioned meaning of the g flag, it also influences the regexp.lastIndex property:
The lastIndex is a read/write integer property of regular expression
instances that specifies the index at which to start the next match.
(...) This property is set only if the regular expression instance
used the "g" flag to indicate a global search.
Reference: Mozilla Developer Network
The g in a regular expression defines a global search, meaning that it searches for all occurrences across the whole string instead of stopping at the first one.
Here is an example based on a string. Suppose we want to remove all occurrences of "o" from "hello world":
"hello world".replace(/o/g,'');
In my case, I needed to re-evaluate the string from the first character each time, so I had to remove the global flag from /my_regexp/g to stop lastIndex from moving.
As mentioned on MDN:
Be sure that the global (g) flag is set, or lastIndex will never be advanced.
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/exec#specifications
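The MDN remark quoted above can be checked with a small sketch (the sample string and variable names are assumptions for illustration). Without g, exec keeps returning the first match and lastIndex stays at 0, so a `while` loop over exec would never end. With g, each call resumes where the previous one stopped.

```javascript
const str = "1 2 3";

// Without g, exec ignores lastIndex and always returns the first match.
const plain = /\d+/;
const first = plain.exec(str)[0];
const second = plain.exec(str)[0];
console.log(first, second, plain.lastIndex); // 1 1 0

// With g, each call resumes at lastIndex, so the loop terminates.
const globalRe = /\d+/g;
const found = [];
let m;
while ((m = globalRe.exec(str)) !== null) {
  found.push(m[0]);
}
console.log(found); // [ '1', '2', '3' ]
```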

NodeJS Regex match 4 digit number in string with 'OR' words [duplicate]

This question already has answers here:
Why does a RegExp with global flag give wrong results?
(7 answers)
Closed 3 years ago.
I need to match a string that contains a year, using a regex. The template is 'Year-<4 digits>-"high OR low"-level'.
I've built this regex: /Year-\d{4}-\b(low|high)\b-level/gi;
In online regex testers my strings pass the check. Sample code:
const template = /Year-\d{4}-\b(low|high)\b-level/gi;
const txtArr = ['Year-2019-low-level', 'Year-2019-high-level', 'Year-low-level', 'Year-high-level', 'Year-2018-low-level', 'Year-2018-low-level']
for (const s of txtArr) {
console.log(template.test(s), s);
}
I expect 2 of the sample strings to fail and 4 to pass, but only 2 of them pass. They don't pass in the browser console either; I tried Firefox and Chrome. I can't understand why.
Also, if I copy the string that is not passing the match and just make
console.log(template.test('Year-2018-low-level'), 'Year-2018-low-level');
it passes! I've got only one idea: looks like in every iteration of loop something is not reset in regex, and it keeps something in memory, that is not letting match pass.
P.S. I even copied the same string which must pass the test to array, like that:
const txtArr = ['Year-2019-low-level', 'Year-2019-low-level', 'Year-2019-low-level', 'Year-2019-low-level', 'Year-2019-low-level', 'Year-2019-low-level']
and the results are true-false-true-false-true... Why? And how to fix?
I found an explanation here: https://siderite.dev/blog/careful-when-reusing-javascript-regexp.html
"The moral of the story is to be careful of constructs like _reg.test(input);
when _reg is a global regular expression. It will attempt to match from the index of the last match in any previous string."
So the problem comes from the way the global statement is treated.
The author of the blog also describes the very same problem you have:
"Here is a case that was totally weird. Imagine a javascript function that returns an array of strings based on a regular expression match inside a for loop. In FireFox it would return half the number of items that it should have."
To avoid this problem, you could either not use the global flag, or instantiate a new regex on each iteration:
const txtArr = ['Year-2019-low-level', 'Year-2019-high-level', 'Year-low-level', 'Year-high-level', 'Year-2018-low-level', 'Year-2018-low-level']
for (const s of txtArr) {
console.log(/Year-\d{4}-\b(low|high)\b-level/gi.test(s), s);
}
An alternative is to use !!s.match(template) instead of template.test(s), so you don't need to modify your regex.
Working example: https://codesandbox.io/s/zen-carson-z9cq6
An explanation to the weird behavior:
The RegExp object keeps track of the lastIndex where a match occurred,
so on subsequent matches it will start from the last used index,
instead of 0.
from this StackOverflow question: Why does a RegExp with global flag give wrong results?
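Two of the workarounds mentioned in these answers can be sketched side by side with the regex and array from the question. Building the non-global copy from `template.source` and `template.flags` is an assumption on my part for the case where the original regex cannot be edited; it is not something the answers spell out.

```javascript
const template = /Year-\d{4}-\b(low|high)\b-level/gi;
const txtArr = ['Year-2019-low-level', 'Year-2019-high-level', 'Year-low-level',
  'Year-high-level', 'Year-2018-low-level', 'Year-2018-low-level'];

// Option 1: derive a non-global copy once and reuse it for every string.
const nonGlobal = new RegExp(template.source, template.flags.replace("g", ""));
const viaCopy = txtArr.map((s) => nonGlobal.test(s));

// Option 2: String.prototype.match resets lastIndex itself for a global regex.
const viaMatch = txtArr.map((s) => s.match(template) !== null);

console.log(viaCopy);  // [ true, true, false, false, true, true ]
console.log(viaMatch); // [ true, true, false, false, true, true ]
```

Both variants give the four passes and two failures the question expected.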
I changed your regex and it's working with this one:
const template = /Year-\d{4}-(low|high)-level/

Javascript exec maintaining state

I am currently trying to build a little templating engine in Javascript by finding and replacing tags in an HTML5 tag with a regex.
I am using exec on my regular expression and looping over the results. I am wondering why the regular expression breaks in its current form, with the g flag, but works fine without it.
Check the broken example and remove the /g flag on the regular expression to view the correct output.
var TemplateEngine = function(tpl, data) {
var re = /(?:<|<)%(.*?)(?:%>|>)/g, match;
while(match = re.exec(tpl)) {
tpl = tpl.replace(match[0], data[match[1]])
}
return tpl;
}
https://jsfiddle.net/stephanv/u5d9en7n/
Can somebody explain in a little more depth why exactly my example breaks on:
<p><%more%></p>
The reason is explained in javascript string exec strange behavior.
The solution you need is actually a String.replace with a callback as a replacement:
var TemplateEngine = function(tpl, data) {
var re = /(?:<|<)%(.*?)(?:%>|>)/g, match;
return tpl.replace(re, function($0, $1) {
return data[$1] ? data[$1] : $0;
});
}
See the updated fiddle
Here, the regex finds all non-overlapping matches in the string, sequentially, and passes the match to the callback method. $0 is the full match and $1 is the Group 1 contents. If data[$1] exists, it is used to replace the whole match, else, the whole match is inserted back.
Check this link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/lastIndex. When using the g flag the object that you store the regex in (re) will keep track of the position of the last match in the lastIndex property and the next time you use that object the search will start from the position of lastIndex.
To solve this you could either manually reset the lastIndex property each time or not save the regex in an object and use it inline like so:
while(match = /(?:<|<)%(.*?)(?:%>|>)/g.exec(tpl)) {
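One way the loop from the question can skip a tag is sketched below. The tag pattern is simplified to `/<%(.*?)%>/g`, and the template string, the data, and the function names `brokenEngine` and `fixedEngine` are assumptions for illustration, not the code from the fiddle. The point is that `tpl` changes inside the loop while lastIndex still refers to a position in the previous version of the string.

```javascript
// Hypothetical data and template, chosen so the replacement is shorter than the tag.
const data = { greeting: "Hi", more: "!" };
const tpl = "<%greeting%><%more%>";

// Loop in the style of the question: exec with g on a string that is being rewritten.
function brokenEngine(input) {
  const re = /<%(.*?)%>/g;
  let out = input;
  let match;
  while ((match = re.exec(out)) !== null) {
    // lastIndex is now 12, but the new string is only 10 characters long,
    // so the next exec fails before it ever sees <%more%>.
    out = out.replace(match[0], data[match[1]]);
  }
  return out;
}

// replace with a callback walks the original string exactly once.
function fixedEngine(input) {
  return input.replace(/<%(.*?)%>/g, (whole, key) =>
    Object.prototype.hasOwnProperty.call(data, key) ? data[key] : whole
  );
}

console.log(brokenEngine(tpl)); // Hi<%more%>
console.log(fixedEngine(tpl));  // Hi!
```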

Javascript RegExp Producing Unusual Results with Global Modifer [duplicate]

This question already has answers here:
Why does a RegExp with global flag give wrong results?
(7 answers)
Closed 9 years ago.
I was able to work around this issue as it turned out I didn't need /g. But I was wondering if anyone would be able to explain why the following behavior occurred.
x = RegExp( "w", "gi" )
x.test( "Women" )
= true
x.test( "Women" )
= false
It would continue to alternate between true and false when evaluating the expression. Which was an issue because I was using the same compiled RegExp on a list of strings, leading some to evaluate to false when they should have been true.
You should not be using global modifier in a regex used for test, because it preserves the index of the last search and starts the next test from there.
I'd asked the same question.
When you use the g flag, the regex stores the end position of the match in its lastIndex property. The next time you call test() or exec(), the regex will start from that index in the string to try and find a match. (String.prototype.match() with a global regex resets lastIndex to 0 itself.)
When no match is found, exec() returns null (test() returns false), and lastIndex is reset to 0. This is why your test kept alternating. It would match the W, and then lastIndex would be set to 1. The next time you called it, no match would be found, and lastIndex would be reset.
A pitfall related to this is when your regex can match the empty string. In that case, lastIndex will not change, and if you are getting all matches, there will be an infinite loop. In this case you should manually adjust lastIndex if it matched the empty string.
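A minimal sketch of the manual adjustment for empty matches, assuming a pattern (`/a*/g`) and an input string picked only to trigger the case. Without the `lastIndex++` line, the loop would keep returning the same empty match at index 0.

```javascript
const re = /a*/g;
const str = "baac";
const matches = [];
let m;

while ((m = re.exec(str)) !== null) {
  matches.push(m[0]);
  // An empty match leaves lastIndex where the match started,
  // so step forward by hand to avoid looping forever.
  if (m[0] === "") {
    re.lastIndex++;
  }
}

console.log(matches); // [ '', 'aa', '', '' ]
```

The result is the same list that `str.match(/a*/g)` produces, since match performs this advance internally.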
https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp/test
As with exec (or in combination with it), test called multiple times on the same global regular expression instance will advance past the previous match.
Essentially, the RegExp object x keeps track of its last match internally. When you call .test again, it attempts to match starting after the "w"
Of course this is only true of a regex object instance.
> /w/gi.test('Women')
true
> /w/gi.test('Women')
true

Why does Javascript's regex.exec() not always return the same value? [duplicate]

This question already has answers here:
Why does a RegExp with global flag give wrong results?
(7 answers)
Closed 6 years ago.
In the Chrome or Firebug console:
reg = /ab/g
str = "abc"
reg.exec(str)
==> ["ab"]
reg.exec(str)
==> null
reg.exec(str)
==> ["ab"]
reg.exec(str)
==> null
Is exec somehow stateful and depends on what it returned the previous time? Or is this just a bug? I can't get it to happen all the time. For example, if 'str' above were "abc abc" it doesn't happen.
A JavaScript RegExp object is stateful.
When the regex is global, if you call a method on the same regex object, it will start from the index past the end of the last match.
When no more matches are found, the index is reset to 0 automatically.
To reset it manually, set the lastIndex property.
reg.lastIndex = 0;
This can be a very useful feature. You can start the evaluation at any point in the string if desired, or if in a loop, you can stop it after a desired number of matches.
Here's a demonstration of a typical approach to using the regex in a loop. It takes advantage of the fact that exec returns null when there are no more matches by performing the assignment as the loop condition.
var re = /foo_(\d+)/g,
str = "text foo_123 more text foo_456 foo_789 end text",
match,
results = [];
while (match = re.exec(str))
results.push(+match[1]);
DEMO: http://jsfiddle.net/pPW8Y/
If you don't like the placement of the assignment, the loop can be reworked, like this for example...
var re = /foo_(\d+)/g,
str = "text foo_123 more text foo_456 foo_789 end text",
match,
results = [];
do {
match = re.exec(str);
if (match)
results.push(+match[1]);
} while (match);
DEMO: http://jsfiddle.net/pPW8Y/1/
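As a further variation that this answer does not mention, String.prototype.matchAll (ES2020) can collect the same results without a hand-written loop. It requires the g flag and works on an internal copy of the regex, so the original's lastIndex is left alone. The second half of the sketch illustrates the earlier remark that lastIndex can be set by hand to start the search at a chosen position; the value 12 is just an index after the first match in this sample string.

```javascript
const re = /foo_(\d+)/g;
const str = "text foo_123 more text foo_456 foo_789 end text";

// matchAll yields one match object per occurrence, capture groups included.
const results = Array.from(str.matchAll(re), (match) => Number(match[1]));
console.log(results);      // [ 123, 456, 789 ]
console.log(re.lastIndex); // 0

// Start an exec search past the first occurrence by setting lastIndex.
re.lastIndex = 12;
const next = re.exec(str);
console.log(next[1], next.index); // 456 23
```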
From MDN docs:
If your regular expression uses the "g" flag, you can use the exec method multiple times to find successive matches in the same string. When you do so, the search starts at the substring of str specified by the regular expression's lastIndex property (test will also advance the lastIndex property).
Since you are using the g flag, exec continues from the last matched string until it gets to the end (returns null), then starts over.
Personally, I prefer to go the other way around with str.match(reg)
Multiple Matches
If your regex needs the g flag (global match), you will need to reset the index (position of the last match) by using the lastIndex property.
reg.lastIndex = 0;
This is because exec() stops at each occurrence so that you can run it again on the remaining part. This behavior also exists with test():
If your regular expression uses the "g" flag, you can use the exec
method multiple times to find successive matches in the same string.
When you do so, the search starts at the substring of str specified by
the regular expression's lastIndex property (test will also advance
the lastIndex property)
Single Match
When there is only one possible match, you can simply rewrite your regex by omitting the g flag; without it, the search always starts at index 0.
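If the g flag has to stay on a shared regex, the reset can be wrapped in a small helper. `testFromStart` is a hypothetical name, and this is only a sketch of the idea using the regex and string from the question.

```javascript
// Hypothetical helper: always test from the beginning of the string,
// whatever state an earlier call left on the regex.
function testFromStart(re, s) {
  re.lastIndex = 0;
  return re.test(s);
}

const reg = /ab/g;
const outcomes = [1, 2, 3, 4].map(() => testFromStart(reg, "abc"));
console.log(outcomes); // [ true, true, true, true ]
```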
