I'm aware of the differences between using a RegExp constructor and a regular expression literal, but is the use of one versus the other just a matter of preference?
Or are there instances where one should use the RegExp constructor versus a regular expression literal? If so, is there an example of this?
MDN says:
Regular expression literals provide compilation of the regular expression when the script is loaded. When the regular expression will remain constant, use this for better performance.
Using the constructor function provides runtime compilation of the regular expression. Use the constructor function when you know the regular expression pattern will be changing, or you don't know the pattern and are getting it from another source, such as user input.
In practice, I use the literal form for simple regular expressions. For more complex expressions, I put them together piece-wise with string concatenation and use the constructor.
BONUS: For those complicated regular expressions, a tool like debuggex really helps.
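To illustrate the two forms side by side, here is a minimal sketch. The variable names and the date pattern are assumptions chosen for the example, not something from MDN. Note that backslashes have to be doubled when the pattern is written as a string for the constructor.

```javascript
// Literal form: the pattern is fixed in the source.
const isoDate = /^\d{4}-\d{2}-\d{2}$/;

// Constructor form: the pattern is assembled from string pieces at runtime.
// Inside a string literal, each regex backslash must be written as "\\".
const year = "\\d{4}";
const month = "\\d{2}";
const day = "\\d{2}";
const isoDateBuilt = new RegExp("^" + year + "-" + month + "-" + day + "$");

console.log(isoDate.test("2024-01-31"));      // true
console.log(isoDateBuilt.test("2024-01-31")); // true
console.log(isoDateBuilt.test("2024-1-31"));  // false
```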
Related
Earlier questions on StackOverflow discuss escaping for JavaScript regular expressions, e.g.:
How to escape regular expression in javascript?
Escape string for use in Javascript regex
An implementation suggested is the following:
RegExp.quote = function(str) {
  return str.replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
};
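For illustration, a helper like this is typically combined with the RegExp constructor so that user text is matched literally. The sketch below uses a standalone function named escapeRegExp (a hypothetical name) instead of patching RegExp, with the same character class as above.

```javascript
// Hypothetical standalone version of the helper quoted above.
function escapeRegExp(str) {
  // Prefix every regex metacharacter with a backslash.
  return str.replace(/([.?*+^$[\]\\(){}|-])/g, "\\$1");
}

const userText = "1+1=2";
const literalMatch = new RegExp(escapeRegExp(userText));

console.log(escapeRegExp("a.b"));        // a\.b
console.log(literalMatch.test("1+1=2")); // true
console.log(literalMatch.test("11=2"));  // false ("+" is no longer a quantifier)
```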
Given that regular expressions in the two languages are not identical, is anyone aware of a JavaScript method that properly escapes strings to be used for Java regular expressions?
There's no need for any escaping at all. Those questions are about what needs to be done when the regular expression is being constructed as a string in the source language. Since you're reading the string from an input field, there's no layer of interpretation to worry about.
Just send the string to the server, where it will be discovered to be a valid regex or not.
edit — though I can't think of any, the real thing to worry about might be any sort of "injection" attack that could be conducted through this avenue. Seems to me that if you're just passing a regex to Pattern.compile() there aren't any side-effect channels that could be exploited.
I have an arithmetic expression
((20+30)-25)/5
I want to validate it using a regular expression. The expression can only contain integers, floating-point numbers, operators and parentheses.
How can I write a regular expression to validate it? Please help, or suggest any other way to validate that string using JavaScript.
As I said in a comment, this is impossible using one JavaScript regular expression. However, you can do it using a loop: replace subexpressions with atoms, repeat until you get an atom. If you can't reduce any more, and whatever is left is not an atom, it does not validate. This is actually pretty much the same procedure you'd do to evaluate it (just skipping the abstract syntax tree). You can search for \(\d+\)|\d+[-+/*]\d+ and replace with 0:
Example:
((20+30)-25)/5
((0)-25)/5
(0-25)/5
(0)/5
0/5
0
Done
If you failed to match and didn't have just 0, it's a fail.
(To evaluate as opposed to validate, you'd just replace each match with its actual value rather than a dummy stand-in; everything else is the same.)
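The reduction loop described above can be sketched as follows. This is only a sketch under some assumptions: the function name is made up, only non-negative integers without whitespace are handled, and a lone number counts as a valid atom. One deviation from the pattern above is labeled in the code: lookarounds keep a parenthesized atom from being unwrapped next to a digit, since otherwise "(2)5+3" would collapse to "25+3" and validate by accident.

```javascript
// Hypothetical helper; integers only, no whitespace, no unary minus.
function isValidArithmetic(expr) {
  // Same idea as \(\d+\)|\d+[-+/*]\d+ above, plus an added guard (my assumption):
  // "(N)" is only reduced when it is not directly adjacent to a digit.
  const reducible = /(?<!\d)\(\d+\)(?!\d)|\d+[-+/*]\d+/;
  let current = expr;
  // Each replacement shortens the string, so the loop always terminates.
  while (reducible.test(current)) {
    current = current.replace(reducible, "0");
  }
  // Valid only if everything collapsed into a single integer atom.
  return /^\d+$/.test(current);
}

console.log(isValidArithmetic("((20+30)-25)/5")); // true
console.log(isValidArithmetic("((20+30)-25)5"));  // false
console.log(isValidArithmetic("(1+2"));           // false
```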
JavaScript's "eval" function is the best validator.
Try this:
eval("((20+30)-25)5");
and you will get a sufficiently detailed error description.
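A minimal sketch of wrapping that call so the error can be inspected instead of thrown. The function name, the returned object shape, and the character whitelist are my assumptions. The whitelist is there because eval runs whatever code it is given, so it should only ever see digits, operators, parentheses, dots and whitespace.

```javascript
// Hypothetical wrapper around eval for arithmetic-only input.
function evalCheck(expr) {
  // Assumed safety filter: refuse anything that is not plain arithmetic,
  // because eval would otherwise execute arbitrary code.
  if (!/^[\d\s.()+\-*\/]+$/.test(expr)) {
    return { ok: false, error: "Disallowed character in expression" };
  }
  try {
    const value = eval(expr);
    return { ok: true, value: value };
  } catch (e) {
    // For malformed input this is typically a SyntaxError.
    return { ok: false, error: e.message };
  }
}

console.log(evalCheck("((20+30)-25)/5").ok); // true
console.log(evalCheck("((20+30)-25)5").ok);  // false
```

Keep in mind that this accepts whatever JavaScript accepts, which is slightly more than school arithmetic (for example "1+-2").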
You will only be able to do this with regular expressions if you impose a maximum depth to the parenthesis nesting. Otherwise, the set of arithmetic expressions forms a context free language but not a regular language.
If I had to use regex, the approach I would use is to write a regular grammar for your set of arithmetic expressions and then convert that to a regular expression.
Another approach is to write a recursive descent parser, which is a fairly simple project and works very nicely for arithmetic expressions.
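To give an idea of the recursive descent approach, here is a small sketch. The grammar, the function name and the error messages are assumptions made for the example. It evaluates while it parses, and throws a SyntaxError when the input does not fit the grammar.

```javascript
// Grammar assumed for this sketch:
//   expr   := term (('+' | '-') term)*
//   term   := factor (('*' | '/') factor)*
//   factor := NUMBER | '(' expr ')'
function parseArithmetic(input) {
  // Tokens: numbers (integer or decimal), operators, parentheses.
  // The trailing \S turns any other non-space character into a token
  // so that it is reported as an error instead of being ignored.
  const tokens = input.match(/\d+(?:\.\d+)?|[-+*/()]|\S/g) || [];
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  function factor() {
    const tok = next();
    if (tok === undefined) throw new SyntaxError("Unexpected end of input");
    if (/^\d/.test(tok)) return parseFloat(tok);
    if (tok === "(") {
      const value = expr();
      if (next() !== ")") throw new SyntaxError("Expected ')'");
      return value;
    }
    throw new SyntaxError("Unexpected token " + tok);
  }

  function term() {
    let value = factor();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const rhs = factor();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  function expr() {
    let value = term();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const rhs = term();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  const result = expr();
  // Leftover tokens mean the input was not a single complete expression.
  if (pos < tokens.length) throw new SyntaxError("Unexpected token " + peek());
  return result;
}

console.log(parseArithmetic("((20+30)-25)/5")); // 5
```

Calling parseArithmetic("((20+30)-25)5") throws, because the trailing 5 is left over after the expression has been parsed.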
Just out of curiosity, is it possible to parse a string that is totally made out of random but valid regular expressions with a single regular expression?
given the string of regex:
<[^>]*>\xA9
parses to:
<[^>]*>
\xA9
in which the first one matches an HTML tag and the second one matches a copyright symbol.
Edit:
I found a similar question asked on SO claiming that it may be possible. Here, I'm referring only to regex in JavaScript (ECMA-262).
No, it is not possible: regular expression language allows parenthesized expressions representing capturing and non-capturing groups, lookarounds, etc., where parentheses must be balanced. It is not possible even in theory to write a regular expression that verifies if parentheses are balanced in a given string. Without an ability to do that you wouldn't know where one regexp ends and the other one starts.
In general, regex grammar is relatively complex. To get an idea of just how complex it is, take a look at the parser in the source of Java's Pattern class.
I hope this question isn't too broad, but then again I would expect the regular expression engines of JavaScript (and other languages) to share most of their functionality with what is considered standard or expected regular expression behavior.
I made a statement about C# having unique regular expression capabilities in this post :: RegEx match open tags except XHTML self-contained tags
Specifically, here is the statement:
C# is unique when it comes to regular expressions in that it supports Balancing Group Definitions.
See Matching Balanced Constructs with .NET Regular Expressions
See .NET Regular Expressions: Regex and Balanced Matching
See Microsoft's docs on Balancing Group Definitions
I'm curious what unique regular expression capabilities JavaScript has, if any.
Although JavaScript’s regular expression library supports the features that are considered common (see comparison table), there is one particular expression that I haven’t seen in others:
/[^]/
This matches any character, including line breaks, just like /[\s\S]/ (or any other union of complementary character classes). It is handy because JavaScript historically had no s modifier, as other flavors do, to make . match line breaks too (the s, or dotAll, flag was only added in ES2018).
Similar to that:
/[]/
This evaluates to an empty character set and can’t match anything at all.
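A quick sketch to try both expressions out. The variable names are just for the example.

```javascript
const anyChar = /[^]/; // negated empty class: matches any character
const nothing = /[]/;  // empty class: can never match

console.log(anyChar.test("\n"));      // true
console.log(/./.test("\n"));          // false, . skips line breaks by default
console.log(/a[^]b/.test("a\nb"));    // true
console.log(nothing.test("abc"));     // false
console.log(nothing.test(""));        // false
```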
JavaScript regexes are a subset of Perl regexes.
Meaning it has no unique features, but it is missing quite a few.
JavaScript regular expressions are modeled on Perl's regular expressions.
See: http://www.regular-expressions.info/javascript.html
JavaScript's regex engine is merely a subset of Perl's engine, meaning that it doesn't add anything new and is missing many of the features Perl contains.
You can read more about it here: http://www.regular-expressions.info/javascript.html.
I am trying to create a regular expression that checks for letters, numbers, and underscores. In .NET, I can do "^\w+$". However, I am not that familiar with the JavaScript syntax. Can somebody help me out?
Thank you!
One obvious difference is that in JavaScript, you write the regex as /pattern/flags -- this is Perl-style. Your "example" would then be ^\w+$ → /^\w+$/.
For example, replace multiple e's with one e, case-insensitive search (hence the i flag):
var s='qweEEerty';
s=s.replace(/e+/i, 'e');
Returns: qwerty.
That same expression will work in JavaScript (there are some differences between .NET regular expressions and JavaScript regular expressions but not in this example).
I recommend that you read Using Regular Expressions with JavaScript and ActionScript to learn a bit more about JavaScript's regular expression implementation.
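As a small sketch of the pattern from the question in JavaScript (the variable name and sample strings are made up for the example). One difference worth knowing: without special flags, \w in JavaScript only covers ASCII letters, digits and the underscore, whereas .NET's \w also matches Unicode letters by default.

```javascript
// Letters, digits and underscores only, at least one character.
const wordOnly = /^\w+$/;

console.log(wordOnly.test("user_name42")); // true
console.log(wordOnly.test("user name"));   // false (space)
console.log(wordOnly.test(""));            // false (+ needs at least one character)
console.log(wordOnly.test("café"));        // false (é is outside ASCII \w)
```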