Not able to escape "?" but "\" works fine in javascript regex - javascript

I have following example (in node):
var reg = new RegExp("aa\?b", 'g');
var msgg = "aa?b"
if(msgg.match(reg)){
console.log(1);
} else {
console.log(0);
}
This prints 0 or returns null. I don't understand why it works if ? is replaced with \ but not in case of ?.
Is ? somehow more special than the other characters?

You need to double escape, like this:
var reg = new RegExp("aa\\?b", 'g');
Or use RegExp literal:
var reg = /aa\?b/g;
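The reason is that the JavaScript string "\?" evaluates to "?", because \? is not a recognized string escape sequence. Hence your RegExp receives the pattern aa?b, in which the unescaped ? is a quantifier that makes the second a optional, so it cannot match the text "aa?b". Doubling the backslash puts a literal "\" into the string, and the regex engine then sees \? as an escaped question mark.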
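A quick way to see the two layers of escaping is to inspect the string before it reaches the RegExp constructor. This is only an illustrative sketch; the variable names are made up for the example.

```javascript
// The string literal "aa\?b" drops the backslash, because \? is not a
// recognized string escape. The constructor therefore sees the pattern aa?b.
const singleEscaped = "aa\?b";
const doubleEscaped = "aa\\?b";

// Same as /aa?b/: an "a", then an optional "a", then "b".
const loose = new RegExp(singleEscaped);
// Same as /aa\?b/: "aa", a literal question mark, then "b".
const strict = new RegExp(doubleEscaped);

console.log(singleEscaped === "aa?b"); // true
console.log(loose.test("aa?b"));       // false
console.log(loose.test("aab"));        // true
console.log(strict.test("aa?b"));      // true
```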

Your regexp also works if you pass the literal notation, instead of a string, to the constructor:
var reg = new RegExp(/aa\?b/, 'g');
var msgg = "aa?b"
console.log(msgg.match(reg))
From MDN:
There are 2 ways to create a RegExp object: a literal notation and a constructor. To indicate strings, the parameters to the literal notation do not use quotation marks while the parameters to the constructor function do use quotation marks. So the following expressions create the same regular expression:
/ab+c/i;
new RegExp('ab+c', 'i');
new RegExp(/ab+c/, 'i');
The literal notation provides a compilation of the regular expression when the expression is evaluated. Use literal notation when the regular expression will remain constant. For example, if you use literal notation to construct a regular expression used in a loop, the regular expression won't be recompiled on each iteration.
The constructor of the regular expression object, for example, new RegExp('ab+c'), provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
Starting with ECMAScript 6, new RegExp(/ab+c/, 'i') no longer throws a TypeError ("can't supply flags when constructing one RegExp from another") when the first argument is a RegExp and the second flags argument is present. A new RegExp from the arguments is created instead.
When using the constructor function, the normal string escape rules (preceding special characters with \ when included in a string) are necessary. For example, the following are equivalent:
var re = /\w+/;
var re = new RegExp('\\w+');
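The equivalences quoted above can be checked directly by comparing the source and flags properties of the resulting objects. The snippet below is a small sketch with made-up variable names, assuming an ES2015 or later engine.

```javascript
// All three forms below compile the same pattern: \w+ (one or more word characters).
const fromLiteral = /\w+/;
const fromString = new RegExp("\\w+");      // backslash doubled inside the string
const fromRegExp = new RegExp(/\w+/, "g");  // ES2015+: copies the pattern, replaces the flags

console.log(fromLiteral.source === fromString.source); // true
console.log(fromRegExp.source === fromLiteral.source); // true
console.log(fromRegExp.flags);                         // g
```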

Related

Error to test a string with regex [duplicate]

This question already has an answer here:
regular expression for [allchars]
(1 answer)
Closed 7 years ago.
I want to find an occurrence of a word in a long string. This word changes with every iteration of the for loop:
array_word = '....'
for(var i.........) {
var regex=new RegExp('/.*'+array_word[i]+'.*/');
if(regex.test(array_word[i])) {
return true;
}
}
The problem could be that I used the wrong regex because the program doesn't return true, can anyone help me?
You don't need to use forward slashes.
var regex = new RegExp('.*'+array_word[i]+'.*');
The .* is not needed in this case either, because test() already looks for a match anywhere in the string.
var regex = new RegExp(array_word[i]);
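One caveat when the pattern comes straight from a variable: if the word contains regex metacharacters such as . or (, they are interpreted as pattern syntax rather than as text. The following is a minimal sketch of one common workaround; escapeRegExp, containsAnyWord, and the sample data are hypothetical names introduced here and are not part of the original question.

```javascript
// Hypothetical helper: put a backslash in front of every regex metacharacter.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: true if any of the words occurs somewhere in the text.
function containsAnyWord(text, words) {
  for (var i = 0; i < words.length; i++) {
    var regex = new RegExp(escapeRegExp(words[i]));
    if (regex.test(text)) {
      return true;
    }
  }
  return false;
}

// "(approx.)" would be an invalid or misleading pattern without escaping.
var found = containsAnyWord("price: 3.50 (approx.)", ["4x50", "(approx.)"]);
// With escaping, the dot in "3.50" no longer matches the "4" in "3450".
var notFound = containsAnyWord("price: 3450", ["3.50"]);

console.log(found);    // true
console.log(notFound); // false
```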
From the docs,
There are 2 ways to create a RegExp object: a literal notation and a constructor. To indicate strings, the parameters to the literal notation do not use quotation marks while the parameters to the constructor function do use quotation marks. So the following expressions create the same regular expression:
/ab+c/i;
new RegExp('ab+c', 'i');
new RegExp(/ab+c/, 'i');
The literal notation provides compilation of the regular expression when the expression is evaluated. Use literal notation when the regular expression will remain constant. For example, if you use literal notation to construct a regular expression used in a loop, the regular expression won't be recompiled on each iteration.

build Regex string with js

<script>
var String = "1 Apple and 13 Oranges";
var regex = /[^\d]/g;
var regObj = new RegExp(regex);
document.write(String.replace(regObj,''));
</script>
And it works fine: it returns all the digits in the string.
However, when I put quote marks around the regex like this:
var regex = "/[^\d]/g";
this doesn't work.
How can I turn a string to a working regex in this case?
Thanks
You can create regular expressions in two ways, using the regular expression literal notation, or RegExp constructor. It seems you have mixed up the two. :)
Here is the literal way:
var regex = /[^\d]/g;
In this case you don't have to use quotes. The / characters at the ends serve as delimiters, and you specify the flags after the closing one.
Here is how to use the RegExp constructor, to which you pass the pattern and the (optional) flags as strings. When you use strings, you have to escape any special characters inside them using a '\'.
Since the '\' (backslash) is itself a special character in string literals, whether single- or double-quoted, you have to escape it with another backslash.
var regex = new RegExp("[^\\d]", "g");
Hope this makes sense.
As the backslash (\) has a special meaning in strings (e.g. "\n", "\t", etc.), you need to escape that symbol when you pass the pattern to RegExp:
var regex = "[^\\d]";
Also, the expression flags (e.g. g, i, etc.) must be passed as a separate parameter to RegExp.
So overall:
var regex = "[^\\d]";
var flags = "g";
var regObj = new RegExp(regex, flags);
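Applied to the string from the question, the constructor form behaves like the literal. The sketch below also shows String.raw (available since ES2015) as an optional way to avoid doubling the backslash; that alternative is an addition here, not something the answers above propose.

```javascript
var text = "1 Apple and 13 Oranges";

// Pattern and flags are passed separately; the backslash is doubled in the string.
var fromString = new RegExp("[^\\d]", "g");

// String.raw keeps backslashes exactly as written, so no doubling is needed.
var fromRaw = new RegExp(String.raw`[^\d]`, "g");

var digitsA = text.replace(fromString, "");
var digitsB = text.replace(fromRaw, "");

console.log(digitsA);                              // 113
console.log(digitsB);                              // 113
console.log(fromString.source === fromRaw.source); // true
```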

javascript casting a string of regex into a type object

I have a box where the user inputs a regex, and in Javascript I take that value and have another string tested with it like so: (this is an abstraction of my real issue)
var regex = $('input').val();
regex.test('some string');
The only way I know to make sure to cast the regex to a safe Object type, is to use eval().
Is that the best way of casting it?
Use the RegExp constructor to create a pattern.
// The next line would escape special characters. Since you want to support
// manually created RegExps, the next line is commented:
// regex = regex.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
regex = new RegExp(regex);
// ^ Optionally, add the flags as a second argument, eg:
//regex=new RegExp(regex, 'i'); //Case-insensitive
UPDATE
You seem to misunderstand the usage of the RegExp constructor. The "slash-notation" is the "primitive" way to create a regular expression. For comparison, consider the following (calling String and Boolean without new yields primitives):
"123" === String(123)
true === Boolean(1)
// Because a RegExp is an object, the strict comparison `===` evaluates to
// false unless both operands are the same object.
// Example: /\d/ == /\d/ evaluates to false
// To compare regex patterns, use the `source` property
/[a-z]/i.source === (new RegExp("[a-z]", "i")).source
The RegExp constructor takes two arguments, with the second one being optional:
String Pattern (without leading and trailing slashes)
String (optional) Flags A combination of:
i (ignore case)
g (global match)
m (multi-line (rarely used)).
Examples (new is optional):
Using constructor                 Using slash-notation        # Notice:
RegExp('[0-9]')                   /[0-9]/                     # no slashes at RegExp
RegExp('/path/to/file\\.html$')   /\/path\/to\/file\.html$/   # the escaped / and the doubled \
RegExp('i\'m', 'i')               /i'm/i                      # \' vs ', 'i' vs /i
Implementing a "RegExp" form field using slash-notation
var input = $('input').val(); //Example: '/^[0-9]+$/i'
// Using a RegEx to implement a Reg Exp, ironically..
var parts = input.match(/^\/([\S\s]+)\/([gim]{0,3})$/);
parts = parts || [, input, ""]; // If the previous match is null,
// treat the string as a slash-less RegEx
var regex = new RegExp(parts[1], parts[2]);
regex.test('some string');
try
var regex = new RegExp( $('input').val() );
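Because the pattern comes from user input, it may not be a valid regular expression, in which case the RegExp constructor throws a SyntaxError. The following is a minimal sketch of guarding against that without eval(); compileUserPattern is a hypothetical helper name, and the hard-coded strings stand in for the value read from the input box.

```javascript
// Hypothetical helper: compile user-supplied pattern text, or return null
// when the text (or the flags) is not valid regular expression syntax.
function compileUserPattern(patternText, flags) {
  try {
    return new RegExp(patternText, flags || "");
  } catch (err) {
    if (err instanceof SyntaxError) {
      return null; // e.g. an unbalanced parenthesis
    }
    throw err; // anything else is unexpected, so pass it on
  }
}

var good = compileUserPattern("^[0-9]+$");
var bad = compileUserPattern("([0-9]+");

console.log(good.test("12345")); // true
console.log(bad);                // null
```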

Building regexp from JS variables not working

I am trying to build a regexp from static text plus a variable in javascript. Obviously I am missing something very basic, see comments in code below. Help is very much appreciated:
var test_string = "goodweather";
// One regexp we just set:
var regexp1 = /goodweather/;
// The other regexp we built from a variable + static text:
var regexp_part = "good";
var regexp2 = "\/" + regexp_part + "weather\/";
// These alerts now show the 2 regexp are completely identical:
alert (regexp1);
alert (regexp2);
// But one works, the other doesn't ??
if (test_string.match(regexp1))
alert ("This is displayed.");
if (test_string.match(regexp2))
alert ("This is not displayed.");
First, the answer to the question:
The other answers are nearly correct, but fail to consider what happens when the text to be matched contains a literal backslash, (i.e. when: regexp_part contains a literal backslash). For example, what happens when regexp_part equals: "C:\Windows"? In this case the suggested methods do not work as expected (The resulting regex becomes: /C:\Windows/ where the \W is erroneously interpreted as a non-word character class). The correct solution is to first escape any backslashes in regexp_part (the needed regex is actually: /C:\\Windows/).
To illustrate the correct way of handling this, here is a function which takes a passed phrase and creates a regex with the phrase wrapped in \b word boundaries:
// Given a phrase, create a RegExp object with word boundaries.
function makeRegExp(phrase) {
// First escape any backslashes in the phrase string.
// i.e. replace each backslash with two backslashes.
phrase = phrase.replace(/\\/g, "\\\\");
// Wrap the escaped phrase with \b word boundaries.
var re_str = "\\b"+ phrase +"\\b";
// Create a new regex object with "g" and "i" flags set.
var re = new RegExp(re_str, "gi");
return re;
}
// Here is a condensed version of same function.
function makeRegExpShort(phrase) {
return new RegExp("\\b"+ phrase.replace(/\\/g, "\\\\") +"\\b", "gi");
}
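To make the difference concrete, here is a small sketch that runs the "C:\Windows" case with and without the backslash escaping. The function is repeated from makeRegExpShort above so that the snippet runs on its own; the sample text and the other variable names are made up for illustration.

```javascript
// Same approach as makeRegExpShort, repeated so this snippet is self-contained.
function makeRegExpShort(phrase) {
  return new RegExp("\\b" + phrase.replace(/\\/g, "\\\\") + "\\b", "gi");
}

var phrase = "C:\\Windows";              // the 10 characters C:\Windows
var text = "Look in C:\\Windows first";  // contains one literal backslash

// Without escaping, the pattern contains \W, the non-word character class.
var naive = new RegExp("\\b" + phrase + "\\b", "i");
var escaped = makeRegExpShort(phrase);

var naiveHit = naive.test(text);
var escapedHit = escaped.test(text);

console.log(naive.source);   // the pattern the regex engine sees without escaping
console.log(escaped.source); // the pattern with each backslash doubled
console.log(naiveHit);       // false
console.log(escapedHit);     // true
```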
A more in-depth discussion follows.
In-depth discussion, or "What's up with all these backslashes!?"
JavaScript has two ways to create a RegExp object:
/pattern/flags - You can specify a RegExp Literal expression directly, where the pattern is delimited using a pair of forward slashes followed by any combination of the three pattern modifier flags: i.e. 'g' global, 'i' ignore-case, or 'm' multi-line. This type of regex cannot be created dynamically.
new RegExp("pattern", "flags") - You can create a RegExp object by calling the RegExp() constructor function and pass the pattern as a string (without forward slash delimiters) as the first parameter and the optional pattern modifier flags (also as a string) as the second (optional) parameter. This type of regex can be created dynamically.
The following example demonstrates creating a simple RegExp object using both of these methods. Let's say we wish to match the word "apple". The regex pattern we need is simply: apple. Additionally, we wish to set all three modifier flags.
Example 1: Simple pattern having no special characters: apple
// A RegExp literal to match "apple" with all three flags set:
var re1 = /apple/gim;
// Create the same object using RegExp() constructor:
var re2 = new RegExp("apple", "gim");
Simple enough. However, there are significant differences between these two methods with regard to the handling of escaped characters. The regex literal syntax is quite handy because you only need to escape forward slashes - all other characters are passed directly to the regex engine unaltered. However, when using the RegExp constructor method, you pass the pattern as a string, and there are two levels of escaping to be considered; first is the interpretation of the string and the second is the interpretation of the regex engine. Several examples will illustrate these differences.
First let's consider a pattern which contains a single literal forward slash. Let's say we wish to match the text sequence: "and/or" in a case-insensitive manner. The needed pattern is: and/or.
Example 2: Pattern having one forward slash: and/or
// A RegExp literal to match "and/or":
var re3 = /and\/or/i;
// Create the same object using RegExp() :
var re4 = new RegExp("and/or", "i");
Note that with the regex literal syntax, the forward slash must be escaped (preceded with a single backslash) because with a regex literal, the forward slash has special meaning (it is a special metacharacter which is used to delimit the pattern). On the other hand, with the RegExp constructor syntax (which uses a string to store the pattern), the forward slash does NOT have any special meaning and does NOT need to be escaped.
Next let's consider a pattern which includes the special \b word boundary regex metasequence. Say we wish to create a regex to match the word "apple" as a whole word only (so that it won't match "pineapple"). The pattern (as seen by the regex engine) needs to be: \bapple\b:
Example 3: Pattern having \b word boundaries: \bapple\b
// A RegExp literal to match the whole word "apple":
var re5 = /\bapple\b/;
// Create the same object using RegExp() constructor:
var re6 = new RegExp("\\bapple\\b");
In this case the backslash must be escaped when using the RegExp constructor method, because the pattern is stored in a string, and to get a literal backslash into a string, it must be escaped with another backslash. However, with a regex literal, there is no need to escape the backslash. (Remember that with a regex literal, the only special metacharacter is the forward slash.)
Backslash SOUP!
Things get even more interesting when we need to match a literal backslash. Let's say we want to match the text sequence: "C:\Program Files\JGsoft\RegexBuddy3\RegexBuddy.exe". The pattern to be processed by the regex engine needs to be: C:\\Program Files\\JGsoft\\RegexBuddy3\\RegexBuddy\.exe. (Note that the regex pattern to match a single backslash is \\ i.e. each must be escaped.) Here is how you create the needed RegExp object using the two JavaScript syntaxes:
Example 4: Pattern to match literal back slashes:
// A RegExp literal to match the ultimate Windows regex debugger app:
var re7 = /C:\\Program Files\\JGsoft\\RegexBuddy3\\RegexBuddy\.exe/;
// Create the same object using RegExp() constructor:
var re8 = new RegExp(
"C:\\\\Program Files\\\\JGsoft\\\\RegexBuddy3\\\\RegexBuddy\\.exe");
This is why the /regex literal/ syntax is generally preferred over the new RegExp("pattern", "flags") method - it completely avoids the backslash soup that can frequently arise. However, when you need to dynamically create a regex, as the OP needs to here, you are forced to use the new RegExp() syntax and deal with the backslash soup. (It's really not that bad once you get your head wrapped 'round it.)
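The claim that each pair of examples produces the same regex can be checked by comparing the source property of the two objects. The sketch below does this for Examples 3 and 4, using a shortened version of the path for readability; the path variable is an assumption added for the test.

```javascript
// Example 3: the doubled backslashes in the string collapse to single ones.
var re5 = /\bapple\b/;
var re6 = new RegExp("\\bapple\\b");

// Example 4 (shortened path): four backslashes in the string literal become
// two in the pattern, and those two match one literal backslash in the text.
var re7 = /C:\\Program Files\\JGsoft/;
var re8 = new RegExp("C:\\\\Program Files\\\\JGsoft");

var path = "C:\\Program Files\\JGsoft"; // one backslash between path parts

console.log(re5.source === re6.source); // true
console.log(re7.source === re8.source); // true
console.log(re6.test("pineapple"));     // false
console.log(re6.test("an apple"));      // true
console.log(re8.test(path));            // true
```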
RegexBuddy to the rescue!
RegexBuddy is a Windows app that can help with this backslash soup problem - it understands the regex syntaxes and escaping requirements of many languages and will automatically add and remove backslashes as required when pasting to and from the application. Inside the application you compose and debug the regex in native regex format. Once the regex works correctly, you export it using one of the many "copy as..." options to get the needed syntax. Very handy!
You should use the RegExp constructor to accomplish this:
var regexp2 = new RegExp(regexp_part + "weather");
Here's a related question that might help.
The forward slashes are just JavaScript syntax for enclosing regular expression literals. If you use a normal string as the regex, you shouldn't include them, because they will be matched as literal characters. Therefore you should just build the regex like this:
var regexp2 = regexp_part + "weather";
I would use:
var regexp2 = new RegExp(regexp_part+"weather");
The way you have written it, the code effectively does:
var regexp2 = "/goodweather/";
And then it calls:
test_string.match("/goodweather/")
which passes match a string (whose slashes are treated as literal characters to find) and not the regex you wanted:
test_string.match(/goodweather/)
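The behavior follows from how match() treats a string argument: it converts it with new RegExp(string), so any slashes in the string become part of the pattern. A short sketch, reusing the variable names from the question (withSlashes and built are made-up names):

```javascript
var test_string = "goodweather";
var regexp_part = "good";

// match() compiles a string argument as a pattern, so the slashes in
// "/goodweather/" are literal characters that must appear in the text.
var withSlashes = "/" + regexp_part + "weather/";
var built = new RegExp(regexp_part + "weather");

console.log(test_string.match(withSlashes));          // null
console.log("a/goodweather/b".match(withSlashes)[0]); // /goodweather/
console.log(test_string.match(built)[0]);             // goodweather
```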
While this solution may be overkill for this specific question, if you want to build RegExps programmatically, compose-regexp can come in handy.
This specific problem would be solved by using
import {sequence} from 'compose-regexp'
const weatherify = x => sequence(x, /weather/)
Strings are escaped, so
weatherify('.')
returns
/\.weather/
But it can also accept RegExps
weatherify(/./u)
returns
/.weather/u
compose-regexp supports the whole range of RegExp features, and lets you build RegExps from sub-parts, which helps with code reuse and testability.

Easy Javascript Regex Question

Why doesn't this assign prepClass to the string selectorClass with underscores instead of non alpha chars? What do I need to change it to?
var regex = new RegExp("/W/", "g");
var prepClass = selectorClass.replace(regex, "_");
A couple of things:
If you use the RegExp constructor, you don't need the slashes, you are maybe confusing it with the syntax of RegExp literals.
You want to match the \W character class.
The following will work:
var regex = new RegExp("\\W", "g");
The RegExp constructor accepts a string containing the pattern. Note that you must double the backslash in the string literal in order to get a single backslash followed by a W (\W) in the pattern.
Or you could simply use the literal notation:
var regex = /\W/g;
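For illustration, here is what the original and the corrected patterns do to a sample value. The selectorClass value below is made up, since the question does not show one.

```javascript
var selectorClass = "my.class name#1"; // hypothetical sample value

var wrong = new RegExp("/W/", "g"); // matches the three literal characters /W/
var right = new RegExp("\\W", "g"); // matches any non-word character

var unchanged = selectorClass.replace(wrong, "_");
var prepClass = selectorClass.replace(right, "_");

console.log(unchanged);                   // my.class name#1
console.log(prepClass);                   // my_class_name_1
console.log("a/W/b".replace(wrong, "_")); // a_b
```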
Recommended read:
Regular Expressions (MDC)
