dynamic variable written into regular expression - javascript

I have a simple regular expression:
str.match(/SK=([\w\-]+)/i);
I would like the SK part to be dynamic so I can return the match for SK=, WP= etc...
So I'm looking for something like:
var attr = 'SK';
str.match(/' + attr + '=([\w\-]+)/i);

Use the constructor syntax to create your regex:
var myRegex = new RegExp(attr + "=([\\w\\-]+)", "i");
and then:
str.match(myRegex);
See here (emphasis mine): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
The literal notation provides compilation of the regular expression when the expression is evaluated. Use literal notation when the regular expression will remain constant. For example, if you use literal notation to construct a regular expression used in a loop, the regular expression won't be recompiled on each iteration.
The constructor of the regular expression object, for example, new RegExp("ab+c"), provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
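As a minimal sketch of the constructor approach, the snippet below wraps it in a function. The names escapeRegExp and getAttr and the sample query string are hypothetical, chosen only for illustration. It also escapes the dynamic part first, on the assumption that attr might one day contain regex metacharacters:

```javascript
// Hypothetical helper: escape regex metacharacters in a dynamic fragment
// so that it is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: return the value following "<attr>=", or null.
function getAttr(str, attr) {
  // Backslashes are doubled because the pattern is written as a string.
  const re = new RegExp(escapeRegExp(attr) + "=([\\w-]+)", "i");
  const m = str.match(re);
  return m ? m[1] : null;
}

const query = "SK=abc-123&WP=xyz_9"; // sample input, assumed format
console.log(getAttr(query, "SK")); // abc-123
console.log(getAttr(query, "wp")); // xyz_9 (the "i" flag ignores case)
```

If attr only ever holds plain names such as SK or WP, the escaping step changes nothing and can be dropped.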

Related

Passing a variable to javascript regex [duplicate]

I am studying RegExp, and everywhere I look I see two syntaxes:
new RegExp("[abc]")
And
/[abc]/
And, when modifiers are used, what is the purpose of the additional backslash (\)?
/\[abc]/g
Neither form gives me any errors, but I wonder whether there is any difference between the two. If so, what is it, and which is best to use?
I looked at "Differences between Javascript regexp literal and constructor", but I didn't find an explanation there of what the difference is and which is best.
The key difference is that a regex literal can't accept dynamic input, i.e. from variables, whereas the constructor can, because the pattern is specified as a string.
Say you wanted to match one or more words from an array in a string:
var words = ['foo', 'bar', 'orange', 'platypus'];
var str = "I say, foo, what a lovely platypus!";
str.match(new RegExp('\\b('+words.join('|')+')\\b', 'g')); //["foo", "platypus"]
This would not be possible with a literal /pattern/, as anything between the two forward slashes is interpreted literally; we'd have to specify the allowed words in the pattern itself, rather than reading them in from a dynamic source (the array).
Note also the need to double-escape (i.e. \\) special characters when specifying patterns in this way, because we're doing so in a string - the first backslash must be escaped by the second so one of them makes it into the pattern. If there were only one, it would be interpreted by JS's string parser as an escaping character, and removed.
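To make the double-escaping point concrete, here is a small sketch that compares what the string parser leaves behind in each case. The variable names and the sample text are made up for illustration:

```javascript
// "\b" inside a string literal is the backspace character, not a
// backslash followed by "b", so the regex never sees a word boundary.
const singleEscaped = "\bfoo\b";
// "\\b" is a backslash followed by "b", which the regex engine reads as \b.
const doubleEscaped = "\\bfoo\\b";

console.log(singleEscaped.length); // 5 (backspace, f, o, o, backspace)
console.log(doubleEscaped.length); // 7 (\, b, f, o, o, \, b)

const text = "I say, foo, what a lovely platypus!";
console.log(new RegExp(doubleEscaped).test(text)); // true
console.log(new RegExp(singleEscaped).test(text)); // false
```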
As you can see, the RegExp constructor syntax requires a string to be passed. A \ in the string escapes the following character. Thus,
new RegExp("\s") // This gives the regex `/s/`: "\s" is not a string escape sequence, so the backslash is dropped.
will produce the regex /s/, which matches a literal s.
Note: to add modifiers/flags, pass the flags as second parameter to the constructor function.
By contrast, the literal syntax /\s/ produces the regex exactly as written.
The RegExp constructor syntax allows a regular expression to be created dynamically.
So, when the regex needs to be built dynamically, use the RegExp constructor syntax; otherwise, use the regex literal syntax.
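One way to check what the constructor actually received is to inspect the resulting object's source and flags properties. A short sketch (the variable names are only illustrative):

```javascript
const unescaped = new RegExp("\s");     // the string passed in is just "s"
const escaped = new RegExp("\\s", "g"); // the string is \s; flags go in the second argument
const literal = /\s/g;

console.log(unescaped.source); // s
console.log(escaped.source);   // \s
console.log(escaped.flags);    // g
console.log(escaped.source === literal.source); // true
```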
They are more or less the same, but "Regular expression literals should be used when possible" because they are easier to read and do not require escaping like a string literal does.
Escaping example:
new RegExp("\\d+");
/\d+/;
Using the RegExp constructor is suitable when the pattern is computed dynamically, e.g. when it is provided by the user.
Source SonarLint Rule.
There are two ways of defining regular expressions:
1. Through an object constructor: the pattern can be changed at runtime.
2. Through a literal: the pattern is compiled when the script is loaded, which gives better performance.
The literal is the best to use with known regular expressions, while the constructor is better for dynamically constructed regular expressions such as those from user input.
You can use either of the two; the resulting regular expressions are handled in exactly the same way.
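When the constructor is needed, one reasonable habit is to build the regex once and reuse it, rather than constructing it again on every iteration. The sketch below assumes a hypothetical countMatchingLines function and assumes the keyword contains only word characters, so no escaping is done:

```javascript
// Hypothetical helper: count the lines that contain keyword as a whole word.
function countMatchingLines(lines, keyword) {
  // Built once, outside the loop. No "g" flag, so test() keeps no
  // position state between calls.
  const re = new RegExp("\\b" + keyword + "\\b", "i");
  let count = 0;
  for (const line of lines) {
    if (re.test(line)) count += 1;
  }
  return count;
}

console.log(countMatchingLines(["Foo bar", "food", "a foo"], "foo")); // 2
```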

Not able to escape "?" but "\" works fine in javascript regex

I have following example (in node):
var reg = new RegExp("aa\?b", 'g');
var msgg = "aa?b"
if(msgg.match(reg)){
console.log(1);
} else {
console.log(0);
}
This prints 0, because match returns null. I don't understand why it works if the ? is replaced with \ but not with ? itself.
Is ? somehow more special than other characters?
You need to double escape, like this:
var reg = new RegExp("aa\\?b", 'g');
Or use RegExp literal:
var reg = /aa\?b/g;
The reason is that the JavaScript string "\?" evaluates to "?", because \? is not a recognized string escape sequence. Hence your RegExp receives a bare question mark, which it treats as a quantifier, and not an escaped one. Double escaping ensures the "\" reaches the regex as a literal backslash.
If you pass a regex literal instead of a string to the constructor, your regexp works:
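A quick sketch of what each version ends up matching (variable names are illustrative only):

```javascript
// The string parser drops the backslash before the regex engine sees it.
console.log("aa\?b" === "aa?b"); // true

const unescaped = new RegExp("aa\?b"); // same as /aa?b/: "a", an optional "a", then "b"
const escaped = new RegExp("aa\\?b");  // same as /aa\?b/: a literal question mark

console.log(unescaped.test("aa?b")); // false
console.log(unescaped.test("aab"));  // true
console.log(escaped.test("aa?b"));   // true
```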
var reg = new RegExp(/aa\?b/, 'g');
var msgg = "aa?b"
console.log(msgg.match(reg))
From MDN:
There are 2 ways to create a RegExp object: a literal notation and a constructor. To indicate strings, the parameters to the literal notation do not use quotation marks while the parameters to the constructor function do use quotation marks. So the following expressions create the same regular expression:
/ab+c/i;
new RegExp('ab+c', 'i');
new RegExp(/ab+c/, 'i');
The literal notation provides a compilation of the regular expression when the expression is evaluated. Use literal notation when the regular expression will remain constant. For example, if you use literal notation to construct a regular expression used in a loop, the regular expression won't be recompiled on each iteration.
The constructor of the regular expression object, for example, new RegExp('ab+c'), provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
Starting with ECMAScript 6, new RegExp(/ab+c/, 'i') no longer throws a TypeError ("can't supply flags when constructing one RegExp from another") when the first argument is a RegExp and the second flags argument is present. A new RegExp from the arguments is created instead.
When using the constructor function, the normal string escape rules (preceding special characters with \ when included in a string) are necessary. For example, the following are equivalent:
var re = /\w+/;
var re = new RegExp('\\w+');
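As a rough check of the equivalences quoted above, the three forms can be compared through their source and flags properties. This sketch assumes an ES2015 or later runtime, where passing a RegExp together with flags is allowed:

```javascript
const fromLiteral = /ab+c/i;
const fromString = new RegExp("ab+c", "i");
const fromRegExp = new RegExp(/ab+c/, "i"); // throws before ES2015

for (const re of [fromLiteral, fromString, fromRegExp]) {
  console.log(re.source, re.flags, re.test("xABBC")); // ab+c i true
}

// The escaping rule only affects how the pattern is written, not the result.
console.log(/\w+/.source === new RegExp("\\w+").source); // true
```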

Why are regular expression strings not encapsulated in quotes in Javascript?

Aside from JavaScript, all uses of regular expressions I have seen take something like (for finding a number in brackets) "\\[[0-9]+\\]" or r"\[[0-9]+\]". That string is then passed to a function like Contains("\\[[0-9]+\\]", "[1009] is a number."). Regex strings in JavaScript are not encapsulated at all, so I see things like var patt = /w3schools/i. Why is this? How does JavaScript tell the difference between this and other content? Why not just use normal strings?
Why is this?
That's just how regex literals work. Regular expressions are objects in JS, not plain strings.
How does Javascript tell the difference between this and other content?
That's just how the language grammar is defined. In fact it makes it much easier to tell the difference between a string and a regex than in other languages.
Why not just use normal strings?
Because escaping works differently. Other languages use "raw" strings for this, which JavaScript doesn't (didn't) have. Instead, they introduced a literal notation for regular expressions, using / as a delimiter (borrowed from Perl).
Of course, you still can use normal strings, and create a regex object using the RegExp constructor, but for static expressions the literal syntax is much simpler.
Well, they are not strings to begin with. They are regex literals.
How does Javascript tell the difference between this and other content?
Just like the " are used to delimit string literals, or [...] are used to delimit array literals, / are used to delimit regular expression literals.
Why not just use normal strings?
Regular expressions have different special characters and different escaping rules. That's why you have to use double escapes if you use a string with RegExp (e.g. "\\[[0-9]+\\]"). Many people get that wrong, and it's a bit confusing.
So it makes sense to have a representation of regular expression that is not "inside" of another abstraction (strings).
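For comparison, here is a sketch of the bracketed-number pattern from the question written three ways. It assumes an ES2015 or later runtime, where String.raw can serve as a raw-string substitute; the variable names are illustrative:

```javascript
const fromLiteral = /\[[0-9]+\]/;                   // regex literal, single escapes
const fromString = new RegExp("\\[[0-9]+\\]");      // normal string, double escapes
const fromRaw = new RegExp(String.raw`\[[0-9]+\]`); // raw template, single escapes

const text = "[1009] is a number.";
for (const re of [fromLiteral, fromString, fromRaw]) {
  console.log(re.source, text.match(re)[0]); // \[[0-9]+\] [1009]
}
```

All three produce the same pattern; the literal form simply avoids the string layer.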
Regular expressions in JavaScript are objects, not strings.
var regex = /[0-9]/;
console.log(typeof regex); // "object"
Regular expressions are patterns used to match character combinations in strings. In JavaScript, regular expressions are also objects. These patterns are used with the exec and test methods of RegExp, and with the match, replace, search, and split methods of String. This chapter describes JavaScript regular expressions.
Regular Expressions
The opening and closing / are not part of the expression; they just mark a regex literal, just as {} marks an object literal.

Quoting regex literals in javascript? Why not?

In this answer to a question, and lots of other places, I see unquoted strings in javascript.
For example:
var re = /\[media id="?(\d+)"?\]/gi;
Why shouldn't it instead be:
var re = '/\[media id="?(\d+)"?\]/gi';
Is it some kind of special handling of regular expressions, or can any string be declared like that?
var re = /\[media id="?(\d+)"?\]/gi;
is a regex literal, not a string.
This syntax is only for regular expressions, not for strings.
Because, in JavaScript, Regex is a built-in type, not a string pattern that is passed to some parser as it is in, e.g., C# or Java.
That means that when you write var regex = /pattern/, JavaScript automatically uses that literal as a regular expression pattern, making regex an object of the RegExp type.
See: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions
Is it some kind of special handling of regular expressions?
Yes, regular expressions get special handling. As MDN points out, there is a built-in JavaScript regular expression type, with its own syntax for literals.
or can any string be declared like that?
No. Since regular expressions are objects and are not strings, if you tried to write a string with a regular expression literal you would get a regular expression object, not a string.
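A short sketch of the difference, using a simplified pattern and made-up variable names. Wrapping the literal in quotes gives an ordinary string, and feeding that string to the constructor makes the slashes and the trailing g part of the pattern:

```javascript
const literal = /\d+/g;   // a RegExp object
const quoted = "/\\d+/g"; // just a string containing the characters /\d+/g

console.log(literal instanceof RegExp); // true
console.log(typeof quoted);             // string

console.log("id 42".match(literal)); // [ '42' ]

// Here the slashes and the "g" are matched literally, not treated as delimiters or flags.
const fromQuoted = new RegExp(quoted);
console.log(fromQuoted.test("id 42")); // false
console.log(fromQuoted.test("/42/g")); // true
```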
