Regular expressions in JavaScript with the global flag

It appears that the RegExp intrinsic is stateful.
Calling test() twice on the same string yields different results when the global flag g is supplied, because each call advances the search along the string.
So:
var r = /(\d{3})/g;
console.log(r.test('123')); // true
console.log(r.test('123')); // false - because the search has moved past the first match
But if I add an intermediate test, I get the following:
var r = /(\d{3})/g;
console.log(r.test('123')); // true
console.log(r.test('456')); // false - the search starts at lastIndex 3, finds nothing, and resets lastIndex to 0
console.log(r.test('123')); // true!
So is it correct to say that RegExp instances operate on the principle of considering only the last string evaluated? If the string differs from the last, it is effectively reset?

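So is it correct to say that RegExp instances operate on the principle of considering only the last string evaluated?
Not quite. The RegExp instance does not remember the string at all. With the g flag it only remembers lastIndex, the position at which the next search starts.
If the string differs from the last, it is effectively reset?
Only indirectly. A failed match resets lastIndex to 0. In the second example, the call on '456' starts at index 3, finds nothing, returns false, and resets lastIndex, which is why the third call succeeds.
If the global flag is omitted, is the regular expression reset in between tests?
Effectively, yes. Without the g (or y) flag, lastIndex is ignored and every search starts at index 0.
Check out RegExp#lastIndex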
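A minimal sketch of how lastIndex moves across the three calls from the question. The log array and the loop are only there for illustration and are not part of the original code.

```javascript
// Record lastIndex before and after each test() call on a global regex.
const r = /(\d{3})/g;
const log = [];
for (const input of ["123", "456", "123"]) {
  const before = r.lastIndex;
  const result = r.test(input);
  log.push({ input, before, result, after: r.lastIndex });
}
console.log(log);

// Without the g flag, lastIndex is not advanced and every call starts at 0.
const plain = /(\d{3})/;
const plainResults = [plain.test("123"), plain.test("123")];
console.log(plainResults, plain.lastIndex);
```

The regex never compares strings; the only state carried between calls is the numeric offset.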

Related

Why does [[]][0]++ work but []++ throws a run-time exception?

Why does the first line work while the second line throws a run-time exception?
The first line:
[[]][0]++; //this line works fine
The second line:
[]++; //this line throws an exception
[[]][0]++
is equivalent to
var tmp = [[]];
tmp[0] = Number(tmp[0]) + 1; // ++ converts to a number first; a plain tmp[0]+1 would concatenate to "1"
tmp[0] is an empty array, which is cast to the number 0, which increments to 1.
This only works because <array>[<index>]++ looks valid. It takes some type juggling, but it gets there.
But []++ is outright invalid. There's no way to make it make sense.
[] = []+1;
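The left-hand side here is not something ++ can assign to: an array literal is a value, not a reference. (Since ES2015, [] = ... does parse, but only as a destructuring pattern, which ++ never uses.)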
The ++ operator (or indeed any postfix operator) requires the operand to be a "reference" - that is, a value that can be assigned to. [] is a literal, so you can't assign to it. [[]][0] is a valid reference to an element of a temporary array.
0++; // not valid, `0` is a literal.
var a = [];
a++; // ok, `a` is assignable
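A small sketch of the difference, assuming a current engine. The invalid expression is wrapped in the Function constructor so that the error can be caught instead of stopping the script; the variable names are made up for this example.

```javascript
// Incrementing an element reference: the empty array is converted to a number first.
const tmp = [[]];
const oldValue = tmp[0]++; // postfix ++ yields the old value as a number
console.log(oldValue, tmp[0]);

// `[]++` is rejected by the engine, so it is built from a string here
// and the resulting error is captured.
let errorName = null;
try {
  new Function("[]++;")();
} catch (e) {
  errorName = e.name; // the exact error type depends on the engine version
}
console.log(errorName);
```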
This is a rare case in which Javascript does something that actually makes sense. Consider
x[3]++; // Valid
3++; // Not valid
If this makes sense to you, then what is surprising about
[[]][0]++; // valid
[]++; // not valid
<array>[index] is "a place" that you can assign or increment. That's all. The fact that you can increment a[<expr>] doesn't imply that you can increment <expr>.
The absurd part is that you can increment [] at all: the array is converted to the empty string "" and then to the number 0. This is the well-known problem of absurd implicit conversions in Javascript. Those implicit conversion rules are a big wart of Javascript and imply, for example, that 1 == [1] and that both []==false and (![])==false.
Javascript is pure nonsense in a lot of places... but not really here.
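The conversions mentioned above can be checked directly. This is only an illustrative sketch; the conversions object and its property names are invented for the example.

```javascript
// Each property evaluates one of the implicit conversions discussed above.
const conversions = {
  emptyArrayAsString: String([]),           // array to string
  emptyArrayAsNumber: Number([]),           // array to string to number
  oneEqualsArrayOfOne: 1 == [1],            // loose equality converts [1] to "1", then to 1
  emptyArrayEqualsFalse: [] == false,       // both sides end up as the number 0
  notEmptyArrayEqualsFalse: (![]) == false, // an array is truthy, so ![] is false
};
console.log(conversions);
```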

Javascript regex test weirdness

Can anyone explain this (using node version 4.2.4 repl)?
var n; //undefined
/^[a-z0-9]+$/.test(n); // true!
/^[a-f0-9]+$/.test(n); // false
The variable passed to .test() is first converted to string. This is specified in the spec:
http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.6.3
which points to:
http://www.ecma-international.org/ecma-262/5.1/#sec-15.10.6.2
which says:
Let R be this RegExp object.
Let S be the value of ToString(string).
So basically you're testing:
/^[a-z0-9]+$/.test("undefined"); // true!
/^[a-f0-9]+$/.test("undefined"); // false
It should now be obvious why the second test returns false. The letters u, n and i are not included in the test pattern.
Note: The function ToString() in the spec refers to the type coercion function in the underlying implementation (most probably C or C++ though there exist other implementations of js in other languages like Java and Go). It does not refer to the global function toString() in js. As such, that second line in the spec basically means that undefined will be treated as "" + undefined which returns "undefined".
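A short sketch of that coercion, using String() as a visible stand-in for the internal ToString step. The null case is an extra illustration that is not in the question.

```javascript
const n = undefined;

// String(n) mirrors what test() does internally with its argument.
const asString = String(n);
const wide = /^[a-z0-9]+$/.test(n);   // same as testing the text "undefined"
const narrow = /^[a-f0-9]+$/.test(n); // u, n and i fall outside a-f

// null is coerced the same way, to the text "null".
const nullCase = /^[a-z0-9]+$/.test(null);

console.log(asString, wide, narrow, nullCase);
```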
Probably it's converting undefined to a string. So:
var pattern1 = /^[a-z0-9]+$/
var pattern2 = /^[a-f0-9]+$/
pattern1.test("undefined") // true: every character is a lowercase letter
pattern2.test("undefined") // false: d, e and f are in range, but u, n and i are not
RegExp.test treats n as the string "undefined". So, the range [a-f] does not cover all the characters of the string "undefined". In your case, the minimum range that passes the regexp check would be [a-u]
var n; //undefined
console.log(/^[a-u]+$/.test(n)); // true
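If the intent is to reject missing values rather than match their string form, one possible approach is to check the type before testing. The helper name isLowerAlnum is hypothetical and not from the thread.

```javascript
// Hypothetical helper: only real strings are tested, everything else is rejected.
function isLowerAlnum(value) {
  return typeof value === "string" && /^[a-z0-9]+$/.test(value);
}

console.log(isLowerAlnum(undefined)); // not a string, so no coercion happens
console.log(isLowerAlnum("abc123"));
```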

Local Regex Match Variable Not Updated

I am looping through an array of objects and mapping them to my own custom objects. I am extracting data via regular expressions. My first run through the loop works fine, but in subsequent iterations, although they match, the match variables do not get set.
Here is one of the regex's:
var gameRegex = /(\^?)([A-z]+)\s?(\d+)?\s+(at\s)?(\^?)([A-z]+)\s?(\d+)?\s\((.*)\)/g;
Here is the initial part of my loop:
for(var i = 1; i <= data.count; i++) {
var gameMatch = gameRegex.exec(data["left" + i]);
var idMatch = idRegex.exec(data["url" + i]);
First time through, gameMatch and idMatch have values. The following iterations do not work even though I have tested that they do work.
Is there something about regular expressions, maybe especially in Node.js, that I need to do if I use them more than once?
When you have a regular expression with a global flag /.../g and use exec() with it, JavaScript sets a property named lastIndex on that regex.
s = "abab";
r = /a/g;
r.exec(s); // ["a"]
r.lastIndex // 1
r.exec(s); // ["a"]
r.lastIndex // 3
r.exec(s); // null
r.lastIndex // 0
This is meant to be used for multiple matches in the same string. You could call exec() again and again and with every call lastIndex is increased - defining automagically where the next execution will start:
while (match = r.exec(s)) {
console.log(match);
}
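In your loop, lastIndex is left pointing past the first match after the first invocation of exec(). Since you pass in a different string every time, the next search starts at that stale offset instead of at 0, so the expression can fail even though the string would match from the start.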
There are two ways to solve this:
manually set the r.lastIndex = 0 every time or
remove the g global flag
In your case the latter option would be the right one.
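Both fixes can be sketched with a simplified pattern. The input strings and the regex below are made up for illustration and stand in for the data and gameRegex from the question.

```javascript
const inputs = ["foo 1", "foo 2", "foo 3"]; // hypothetical data

// Global regex reused across different strings: the stale lastIndex makes the second call fail.
const globalRe = /foo (\d)/g;
const stale = inputs.map((s) => {
  const m = globalRe.exec(s);
  return m ? m[1] : null;
});

// Fix 1: reset lastIndex before each use.
const reset = inputs.map((s) => {
  globalRe.lastIndex = 0;
  const m = globalRe.exec(s);
  return m ? m[1] : null;
});

// Fix 2: drop the g flag, so lastIndex is ignored.
const plainRe = /foo (\d)/;
const plain = inputs.map((s) => {
  const m = plainRe.exec(s);
  return m ? m[1] : null;
});

console.log(stale, reset, plain);
```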
Further reading:
.exec() on the MDN
.exec() on the MSDN
"How to Use The JavaScript RegExp Object" on regular-expressions.info

javascript string exec strange behavior [duplicate]

I have a function in my object which is called regularly.
parse : function(html)
{
    var regexp = /...some pattern.../
    var match = regexp.exec(html);
    while (match != null)
    {
        ...
        match = regexp.exec(html);
    }
    ...
    var r = /...pattern.../g;
    var m = r.exec(html);
}
With unchanged html, m is null on every other call. Let's say:
parse(html);// ok
parse(html);// m is null!!!
parse(html);// ok
parse(html);// m is null!!!
// ...and so on...
Is there any index or something that has to be reset on html? I'm really confused. Why does match always return the proper result?
This is a common behavior when you deal with patterns that have the global g flag, and you use the exec or test methods.
In this case the RegExp object will keep track of the lastIndex where a match was found, and then on subsequent matches it will start from that lastIndex instead of starting from 0.
Edit: In response to your comment asking why the RegExp object is not re-created when you call the function again:
This is the behavior described for regular expression literals in the ECMAScript 3 specification. Let me quote it:
§ 7.8.5 - Regular Expression Literals
...
The object is created before evaluation of the containing program or function begins. Evaluation of the literal produces a reference to that object; it does not create a new object.
...
You can check this with a simple test:
function createRe() {
    var re = /foo/g;
    return re;
}
createRe() === createRe(); // true under the ES3 rule (same object); false in ES5 and later
Under that rule you can be sure it is the same object, because "two regular expression literals in a program evaluate to regular expression objects that never compare as === to each other even if the two literals' contents are identical", e.g.:
/foo/ === /foo/; // always false...
However, this behavior was followed by all browsers except IE, which initialized a new RegExp object every time. ECMAScript 5 later changed the rule so that a new object is created each time the literal is evaluated, so current engines behave the way IE did.
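Which rule an engine follows can be checked with the same identity test. As a sketch, assuming an engine that implements ECMAScript 5 or later, where a regex literal produces a fresh object on every evaluation:

```javascript
function createRe() {
  var re = /foo/g;
  return re;
}

// Compare the objects returned by two separate calls.
const sameObject = createRe() === createRe();
console.log(sameObject); // false when the literal is re-evaluated on each call
```

In such an engine, a literal declared inside the function starts with lastIndex 0 on every call, so the alternating null only shows up when the regex object itself is shared between calls.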
To avoid this behavior, reset lastIndex once you are done with the match, so the next call starts from the beginning again:
var r = /...pattern.../g;
var m = r.exec(html);
r.lastIndex=0;
This worked for me.
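A sketch of the alternating result and of the reset, assuming the regex object is shared between calls. The pattern, the input and the function names are hypothetical.

```javascript
const shared = /b+/g; // hypothetical pattern, one object reused by every call

// Uses the shared regex as-is, so lastIndex carries over between calls.
function firstMatchNaive(html) {
  const m = shared.exec(html);
  return m ? m[0] : null;
}

// Resets lastIndex first, so every call searches from the start.
function firstMatchReset(html) {
  shared.lastIndex = 0;
  const m = shared.exec(html);
  return m ? m[0] : null;
}

const naive = [1, 2, 3, 4].map(() => firstMatchNaive("abba"));
const fixed = [1, 2, 3, 4].map(() => firstMatchReset("abba"));
console.log(naive, fixed);
```

Resetting before the call rather than after it also covers the case where an earlier caller left the regex in an unknown state.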

Using regular expression ,if and else conflict with .test() function

In the given code, best.test(password) returns true, but when I use it in an if() condition, it is treated as false.
Code:
if(best.test(password)) // It is treated as false here.
{
document.write(best.test(password));
tdPwdStrength.innerHTML="best"+best.test(password); // But it is actually true and returns true.
}
Please suggest!
What is best? Is it a ‘global’ RegExp, that is, one with the g flag set?
If so, then each time you call test or exec you will get a different answer, as it remembers the previous string index and searches from there:
var r= /a/g; // or new RegExp('a', 'g')
alert(r.test('aardvark')); // true. matches first `a`
alert(r.test('aardvark')); // true. matches second `a`
alert(r.test('aardvark')); // true. matches third `a`
alert(r.test('aardvark')); // false! no more matches found
alert(r.test('aardvark')); // true. back to the first `a` again
JavaScript's RegExp interface is full of confusing little traps like this. Be careful.
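One way to sidestep the trap is to call test() once and reuse the stored result. The pattern and password below are made up, since the question does not show what best contains.

```javascript
const best = /[A-Z]/g;      // hypothetical pattern with the g flag
const password = "Secret1"; // hypothetical input

// Two consecutive calls on the same string disagree.
const firstCall = best.test(password);
const secondCall = best.test(password);

// Reset, test once, and reuse the stored result everywhere it is needed.
best.lastIndex = 0;
const isBest = best.test(password);
const message = isBest ? "best" + isBest : "not best";

console.log(firstCall, secondCall, message);
```

Dropping the g flag from the pattern has the same effect, since test() only needs to know whether any match exists.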
