Quoting regex literals in javascript? Why not? - javascript

In this answer to a question, and lots of other places, I see unquoted strings in javascript.
For example:
var re = /\[media id="?(\d+)"?\]/gi;
Why shouldn't it instead be:
var re = '/\[media id="?(\d+)"?\]/gi';
Is it some kind of special handling of regular expressions, or can any string be declared like that?

var re = /\[media id="?(\d+)"?\]/gi;
is a regex literal, not a string.

This syntax is only for regular expressions, not for strings.

Because in JavaScript a regex is a built-in type, not a string pattern that is passed to some parser as in, e.g., C# or Java.
That means that when you write var regex = /pattern/, JavaScript automatically uses that literal as a regular expression pattern, making regex an object of the RegExp type.
See: https://developer.mozilla.org/en/JavaScript/Guide/Regular_Expressions

Is it some kind of special handling of regular expressions?
Yes, regular expressions get special handling. As MDN points out, there is a built-in JavaScript regular expression type, with its own syntax for literals.
or can any string be declared like that?
No. Since regular expressions are objects and are not strings, if you tried to write a string with a regular expression literal you would get a regular expression object, not a string.
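A minimal sketch of the difference, using the pattern from the question (the sample input `[media id="42"]` is an assumption made for illustration). It shows that the unquoted form is a RegExp object, while the quoted form is an ordinary string whose backslashes are consumed by the string parser:

```javascript
// A regex literal evaluates to a RegExp object.
var re = /\[media id="?(\d+)"?\]/gi;
console.log(typeof re);            // "object"
console.log(re instanceof RegExp); // true

// Wrapping the same characters in quotes yields an ordinary string.
// The string parser also drops the backslashes in \[, \d and \],
// because they are not recognised string escapes.
var notRe = '/\[media id="?(\d+)"?\]/gi';
console.log(typeof notRe); // "string"
console.log(notRe);        // /[media id="?(d+)"?]/gi

// Only the RegExp object can be used for matching.
var m = re.exec('[media id="42"]'); // sample input, assumed for illustration
console.log(m[1]); // "42"
```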

Related

Passing a variable to javascript regex [duplicate]

I am studying RegExp, but everywhere I look I see two syntaxes:
new RegExp("[abc]")
And
/[abc]/
And when modifiers are used, what is the purpose of the additional backslash (\)?
/\[abc]/g
I am not getting any bug with these two but I wonder is there any difference between these two. If yes then what is it and which is best to use?
I referred Differences between Javascript regexp literal and constructor but there I didn't find an explanation of which is best and what the difference is.
The key difference is that a regex literal can't accept dynamic input, i.e. from variables, whereas the constructor can, because the pattern is specified as a string.
Say you wanted to match one or more words from an array in a string:
var words = ['foo', 'bar', 'orange', 'platypus'];
var str = "I say, foo, what a lovely platypus!";
str.match(new RegExp('\\b('+words.join('|')+')\\b', 'g')); //["foo", "platypus"]
This would not be possible with a literal /pattern/, as anything between the two forward slashes is interpreted literally; we'd have to specify the allowed words in the pattern itself, rather than reading them in from a dynamic source (the array).
Note also the need to double-escape (i.e. \\) special characters when specifying patterns in this way, because we're doing so in a string - the first backslash must be escaped by the second so one of them makes it into the pattern. If there were only one, it would be interpreted by JS's string parser as an escaping character, and removed.
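One caveat with dynamic input can be sketched as follows: if a word contains regex metacharacters, they are interpreted as pattern syntax once the words are joined. The helper name `escapeRegExp` and the sample words are assumptions made for this illustration, not part of the answer above. The `\b` anchors are omitted here because a word like `c++` ends in non-word characters.

```javascript
// Hypothetical helper: backslash-escape every regex metacharacter in a string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Sample words (assumed); two of them contain metacharacters.
var words = ['foo', 'c++', 'a.b'];
var pattern = words.map(escapeRegExp).join('|');
console.log(pattern); // foo|c\+\+|a\.b

var re = new RegExp('(' + pattern + ')', 'g');
var found = 'foo, c++ and a.b but not axb'.match(re);
console.log(found); // ["foo", "c++", "a.b"]
```

Without the escaping step, `a.b` would also match `axb`, and `c++` would be read as `c` followed by two quantifiers, which is a syntax error in the pattern.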
As you can see, the RegExp constructor syntax requires a string to be passed. In a string, \ is used to escape the following character. Thus,
new RegExp("\s") // This gives the regex `/s/` since s is escaped.
will produce the regex s.
Note: to add modifiers/flags, pass the flags as second parameter to the constructor function.
By contrast, /\s/, the literal syntax, produces the regex you would expect: one that matches a whitespace character.
The RegExp constructor syntax allows a regular expression to be created dynamically.
So, when the regex needs to be crafted dynamically, use the RegExp constructor syntax; otherwise use the regex literal syntax.
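The escaping behaviour described above can be checked through the `source` and `flags` properties of the resulting objects. This is a small sketch; the variable names and sample inputs are chosen for illustration only:

```javascript
// In a string literal, "\s" is just "s": the backslash is consumed by the string parser.
var lost = new RegExp("\s");
console.log(lost.source);      // "s"
console.log(lost.test("a b")); // false (the input contains no letter "s")

// Doubling the backslash lets one backslash reach the regex engine.
var kept = new RegExp("\\s", "g"); // flags go in the second argument
console.log(kept.source);          // "\s"
console.log(kept.flags);           // "g"

var replaced = "a b c".replace(kept, "_");
console.log(replaced);             // "a_b_c"

// The literal needs no doubling and yields the same pattern.
var sameSource = /\s/g.source === kept.source;
console.log(sameSource);           // true
```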
They are more or less the same, but "Regular expression literals should be used when possible" because they are easier to read and do not require escaping as a string literal does.
Escaping example:
new RegExp("\\d+");
/\d+/;
Using the RegExp constructor is suitable when the pattern is computed dynamically, e.g. when it is provided by the user.
Source: SonarLint rule.
There are two ways of defining regular expressions:
Through an object constructor: can be changed at runtime.
Through a literal: compiled at load of the script, with better performance.
The literal is the best to use with known regular expressions, while the constructor is better for dynamically constructed regular expressions such as those from user input.
You can use either of the two, and they will be handled in exactly the same way.

Why are regular expression strings not encapsulated in quotes in Javascript?

Aside from JavaScript, all instances of regular expressions use something like (for finding a number in brackets) "\\[[0-9]+\\]" or r"\[[0-9]+\]". That string is then used in a function like Contains("\\[[0-9]+\\]", "[1009] is a number."). Regex strings in JavaScript are not encapsulated at all, so I see things like var patt = /w3schools/i. Why is this? How does JavaScript tell the difference between this and other content? Why not just use normal strings?
Why is this?
That's just how regex literals work. Regular expressions are objects in JS, not plain strings.
How does Javascript tell the difference between this and other content?
That's just how the language grammar is defined. In fact it makes it much easier to tell the difference between a string and a regex than in other languages.
Why not just use normal strings?
Because escaping works differently. Other languages use "raw" strings for this, which JavaScript doesn't (didn't) have. Instead, JavaScript introduced a literal notation for regular expressions, using / as a delimiter (borrowed from Perl).
Of course, you still can use normal strings, and create a regex object using the RegExp constructor, but for static expressions the literal syntax is much simpler.
Well, they are not strings to begin with. They are regex literals.
How does Javascript tell the difference between this and other content?
Just like the " are used to delimit string literals, or [...] are used to delimit array literals, / are used to delimit regular expression literals.
Why not just use normal strings?
Regular expressions have different special characters and different escaping rules. That's why you have to use double escapes if you use a string with RegExp (e.g. "\\[[0-9]+\\]"). Many people get that wrong, and it's a bit confusing.
So it makes sense to have a representation of regular expression that is not "inside" of another abstraction (strings).
Regular expressions in JavaScript are objects, not strings.
var regex = /[0-9]/;
console.log(typeof regex); // "object"
Regular expressions are patterns used to match character combinations in strings. In JavaScript, regular expressions are also objects. These patterns are used with the exec and test methods of RegExp, and with the match, replace, search, and split methods of String. This chapter describes JavaScript regular expressions.
Source: MDN, Regular Expressions
The opening and closing / are not part of the expression; they just mark a regex literal, in the same way that {} marks an object literal.
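The points above can be sketched with the bracketed-number pattern from the question. The `source` property shows that the delimiters are not part of the pattern. The `String.raw` variant is an addition for illustration, assuming an ES2015 or later environment; it is the closest thing in modern JavaScript to the "raw" strings of other languages.

```javascript
// The slashes only delimit the literal; they are not stored in the pattern.
var literal = /\[[0-9]+\]/;
console.log(literal.source); // "\[[0-9]+\]"

// The same pattern from a normal string needs doubled backslashes.
var fromString = new RegExp("\\[[0-9]+\\]");

// With String.raw (ES2015+), backslashes are kept as written.
var fromRaw = new RegExp(String.raw`\[[0-9]+\]`);

console.log(fromString.source === literal.source); // true
console.log(fromRaw.source === literal.source);    // true
console.log(literal.test("[1009] is a number."));  // true
```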

dynamic variable written into regular expression

I have a simple regular expression:
str.match(/SK=([\w\-]+)/i);
I would like the SK part to be dynamic so I can return the match for SK=, WP= etc...
So I'm looking for something like:
var attr = 'SK';
str.match(/' + attr + '=([\w\-]+)/i);
Use the constructor syntax to create your regex:
var myRegex = new RegExp(attr + "=([\\w\\-]+)", "i"); // keep the "=" from the original pattern
and then:
str.match(myRegex);
See here (emphasis mine): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
The literal notation provides compilation of the regular expression when the expression is evaluated. Use literal notation when the regular expression will remain constant. For example, if you use literal notation to construct a regular expression used in a loop, the regular expression won't be recompiled on each iteration.
The constructor of the regular expression object, for example, new RegExp("ab+c"), provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
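Putting the pieces together, a minimal sketch of the dynamic version might look like this. The function name `matchAttr` and the sample string are hypothetical, and the sketch assumes `attr` contains no regex metacharacters (otherwise it would need escaping first):

```javascript
// Hypothetical helper: return the value following "<attr>=" in str, or null.
// Assumes attr contains no regex metacharacters.
function matchAttr(str, attr) {
  var re = new RegExp(attr + "=([\\w\\-]+)", "i");
  var m = str.match(re);
  return m ? m[1] : null;
}

var str = "page?SK=abc-123&WP=xyz_9"; // sample input, assumed for illustration
console.log(matchAttr(str, "SK")); // "abc-123"
console.log(matchAttr(str, "WP")); // "xyz_9"
console.log(matchAttr(str, "ZZ")); // null
```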

Regex To Sort A String Containing Digits

I have a string which contains digits. I need to sort this string using regular expression.
var myString = "85762034834126745305743";
I'm looking for a complete solution that uses only regular expressions. I just need your thoughts on whether this can be achieved.
Regular expressions are not suited for this kind of task. Plain old JavaScript is a lot simpler and easier:
"85762034834126745305743".split("").sort().join("") // "00122333344445556677788"
