Match special characters including square braces - javascript

I want a regex for a text field in ExtJS (maskRe) that matches any Java code.
I've used
maskRe:/^[A-Za-z0-9 _=//~'"|{}();*:?+,.]*$/
I also want to include [ and ], but it seems that /[, /], //[ and //] do not work.
Any input, please?

The problem is you need to escape your forward slash. Change // to \/:
/^[A-Za-z0-9 _=\/~'"|{}();*:?+,.]*$/
However, this regular expression does not match all Java code. Java source can contain almost any Unicode character: int møøse = 42; is valid Java.

To strip special characters of their magic powers you have to escape them by putting a backslash \ in front of the character. I.e. to match [ you type \[.
And since the backslash acts as a special character as well, to match it literally you escape it the same way: \\.
And since you used / as the pattern delimiter, you need to escape its occurrences within the pattern:
/^[A-Za-z0-9 _=\/~'"|{}();*:?+,.]*$/
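Putting the escaping rules above together, a minimal sketch of the extended mask might look like this. It assumes the goal is simply to add [, ] and a literal backslash to the original character set; the sample strings are made up for illustration.

```javascript
// Original character set plus \[ \] (square brackets) and \\ (a literal backslash).
// The forward slash is escaped as \/ because / delimits the regex literal.
const maskRe = /^[A-Za-z0-9 _=\/~'"|{}();*:?+,.\[\]\\]*$/;

console.log(maskRe.test("int[] a = {1, 2};")); // true
console.log(maskRe.test("a\\b"));              // true: the string is a\b
console.log(maskRe.test("a < b"));             // false: < is not in the set
```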

The way to escape regex meta-characters is using a backslash (\), not a forwards slash (/).
[,] should be \[,\]
// should be \/

Related

Find single backslashes followed by alphabet

I need a regex that finds all single backslashes followed by a letter of the alphabet.
So I want to find backslashes that exist in patterns like these:
\a
\f
\test
and not in these patterns:
\\a
\"
Thanks
Updated:
As @Amadan points out in the comments below, JavaScript did not implement lookbehind when this answer was written (it was added to the language in ES2018), which basically breaks my original answer in older engines.
There is an approach suggested in this stackoverflow post that may be a reasonable path to take for this problem.
Basically the poster suggests reversing the string and using lookahead to match. If we were to do that, then we would want to match a string of alphabetic characters followed by a single backslash, but not followed by multiple backslashes. The regex for that would look like this:
/[a-zA-Z]+\\(?![\\]+)/g
[a-zA-Z]+ - match one or more alphabetic characters
\\ - followed by a single backslash
(?![\\]+) - not followed by one or more backslashes
g - match it globally (more than one occurrence)
The downside of this approach (aside from having to reverse your string) is that you can't match only the backslash, but will have to also match the alphabetic characters that come before it (since JS doesn't have lookbehind).
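As a rough sketch of the reverse-and-lookahead idea described above (the helper name reverse and the sample input are assumptions made for illustration):

```javascript
// Hypothetical helper: reverse a string character by character.
function reverse(s) {
  return s.split("").reverse().join("");
}

// String.raw keeps the backslashes exactly as typed: \a \\b \test \"
const input = String.raw`\a \\b \test \"`;

// Match on the reversed string, then reverse each match back.
const matches = (reverse(input).match(/[a-zA-Z]+\\(?![\\]+)/g) || []).map(reverse);

// Matches come back in reverse order of appearance.
console.log(matches); // [ '\\test', '\\a' ]
```

Note that each match includes the letters as well as the backslash, which is the limitation mentioned above.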
Original Answer (using lookbehind):
/(?<!\\)\\[a-zA-Z]+/g (using negative lookbehind) will match a single backslash followed by one or more letters of the alphabet, regardless of case. This regular expression breaks down as follows:
(?<!\\)\\ - use negative lookbehind to match a \ that is not preceded by a \
[a-zA-Z]+ - match one or more letters of the alphabet, regardless of case
g - match it globally
If you only want to match the \ and not the alphabetic characters, then you can use positive lookahead. The regex for that would look like: /(?<!\\)\\(?=[a-zA-Z]+)/g and would break down like this:
(?<!\\)\\ - use negative lookbehind to match a \ that is not preceded by a \
(?=[a-zA-Z]+) - and is followed by one or more alphabetic characters
g - match it globally
If you only want the regex to match backslashes at the beginning of a line, prepend a ^ to it.
You can use a tool like Rubular to test and play with regular expressions.
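For engines that do support lookbehind (ES2018 and later, which includes current Node.js releases), the two lookbehind patterns can be tried directly. This is only a sketch; the sample text is made up.

```javascript
// String.raw keeps the backslashes exactly as typed: \a \\b \test \"
const text = String.raw`\a \\b \test \"`;

// Backslash plus the letters that follow it.
const withLetters = text.match(/(?<!\\)\\[a-zA-Z]+/g);
console.log(withLetters); // [ '\\a', '\\test' ]

// Only the backslash itself: replace each single backslash with a forward slash.
const slashesOnly = text.replace(/(?<!\\)\\(?=[a-zA-Z]+)/g, "/");
console.log(slashesOnly); // /a \\b /test \"
```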

Regex to allow special characters

I need a regex that will allow alphabets, hyphen (-), quote ('), dot (.), comma(,) and space. this is what i have now
^[A-Za-z\s\-]$
Thanks
I removed \s from your regex since you said space, and not white space. Feel free to put it back by replacing the space at the end with \s. Otherwise it's pretty simple:
^[A-Za-z\-'., ]+$
It matches start of the string. Any character in the set 1 or more times, and end of the string. You don't have to escape . in a set, in case you were wondering.
You probably tried new RegExp("^[A-Za-z\s\-\.\'\"\,]$"). Yet, you have a string literal there, and the backslashes just escape the following characters - necessary only for the delimiting quote (and for backslashes).
"^[A-Za-z\s\-\.\'\"\,]$" === "^[A-Za-zs-.'\",]$" === '^[A-Za-zs-.\'",]$'
Yet, the range s-. is invalid. So you would need to escape the backslash to pass a string with a backslash in the RegExp constructor:
new RegExp("^[A-Za-z\\s\\-\\.\\'\\\"\\,]$")
Instead, regex literals are easier to read and write as you do not need to string-escape regex escape characters. Also, they are parsed only once during script "compilation", so nothing needs to be executed each time the line is evaluated. The RegExp constructor only needs to be used if you want to build regexes dynamically. So use
/^[A-Za-z\s\-\.\'\"\,]$/
and it will work. Also, you don't need to escape any of these chars in a character class - so it's just
/^[A-Za-z\s\-.'",]$/
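The difference between the string form and the literal form can be checked with a short sketch like the one below. A + quantifier is added here (an assumption about the intent) so that whole names match; the variable names are illustrative.

```javascript
// Double backslashes in the string produce single backslashes in the pattern.
const fromString = new RegExp("^[A-Za-z\\s\\-.',]+$");
const literal = /^[A-Za-z\s\-.',]+$/;
console.log(fromString.source === literal.source); // true

// With single backslashes the string collapses to ^[A-Za-zs-.'",]$ and the
// range s-. is out of order, so the constructor throws.
let singleEscapedError = null;
try {
  new RegExp("^[A-Za-z\s\-\.\'\"\,]$");
} catch (e) {
  singleEscapedError = e;
}
console.log(singleEscapedError instanceof SyntaxError); // true
```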
You are pretty close, try the following:
^[A-Za-z\s\-'.,]+$
Note that I assumed that you want to match strings that contain one or more of any of these characters, so I added + after the character class which mean "repeat the previous element one or more times".
Note that this will currently also allow tabs and line breaks in addition to spaces because \s will match any whitespace character. If you only want to allow spaces, change it to ^[A-Za-z \-'.,]+$ (just replaced \s with a space).
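The difference between \s and a plain space is easy to see with a tab character. A small sketch with made-up inputs:

```javascript
const anyWhitespace = /^[A-Za-z\s\-'.,]+$/; // \s also allows tabs and line breaks
const spacesOnly = /^[A-Za-z \-'.,]+$/;     // only the space character

console.log(anyWhitespace.test("Mary\tAnn"));         // true
console.log(spacesOnly.test("Mary\tAnn"));            // false
console.log(spacesOnly.test("Mary-Ann O'Neil, Jr.")); // true
```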

javascript replace() function strange behaviour with regexp

Am I doing something wrong, or is there a problem with JS replace?
<input type="text" id="a" value="(55) 55-55-55" />
document.write($("#a").val().replace(/()-/g,''));
prints (55) 555555
http://jsfiddle.net/Yb2yV/
how can i replace () and spaces too?
In a JavaScript regular expression, the ( and ) characters have special meaning. If you want to list them literally, put a backslash (\) in front of them.
If your goal is to get rid of all the (, ), -, and space characters, you could do it with a character class combined with an alternation (i.e., either-or) on \s, which stands for "whitespace":
document.write($("#a").val().replace(/[()\-]|\s/g,''));
(I didn't put backslashes in front of the () because you don't need to within a character class. I did put one in front of the - because within a character class, - has special meaning.)
Alternately, if you want to get rid of anything that isn't a digit, you can use \D:
document.write($("#a").val().replace(/\D/g,''));
\D means "not a digit" (note that it's a capital, \d in lower case is the opposite [any digit]).
More info on the MDN page on regular expressions.
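The same replacements can be tried on a plain string, without jQuery or the DOM. A minimal sketch using the value from the question:

```javascript
const phone = "(55) 55-55-55";

// Original pattern: () is an empty group, so only the dashes are removed.
console.log(phone.replace(/()-/g, ""));        // (55) 555555

// Character class for ( ) - combined with \s for whitespace.
console.log(phone.replace(/[()\-]|\s/g, ""));  // 55555555

// Remove everything that is not a digit.
console.log(phone.replace(/\D/g, ""));         // 55555555
```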
You need to use a character class
/[-() ]/
Using "-" as the first character solves the ambiguity because a dash is normally used for ranges (e.g. [a-zA-Z0-9]).
document.write($("#a").val().replace(/[\s()-]/g,''));
That will remove all whitespace (\s), parens, and dashes
Use this
.replace(/\(|\)|-| /g,'')
You have to escape the parentheses (i.e. \( instead of (). In your regexp, you want to list four items: \(, \), - and a space. Since you want to replace any one of them, not just the four of them together as a string, you have to put OR (|) between them.
It may be very bad, but a very basic approach would be:
document.write($("#a").val().replace(/(\()|(\))|-| /g,''));
| means OR,
\ is used for escaping reserved symbols
You want to match any character in the set, so you should use square brackets to make a character set:
document.write($("#a").val().replace(/[()\- ]/g,''));
Normally, parentheses have a special meaning in regular expressions, so they were being ignored in your regex, leaving just the dash. Normally, to get literal parentheses, you need to escape them with \ (but in a square bracket block, as above, you don't).
The dash above is escaped because it normally indicates a range in a character set, e.g., [a-z].
The parentheses form a capturing group in the regexp. You'd need to escape them (/\(\)-/) to match the sequence "()-". Yet I guess you want to use a character class, i.e. an expression that matches "(", ")" or "-"; for whitespace, include the \s shorthand:
value.replace(/[()\-\s]/g, "");
You might want to read some documentation or tutorial.

Strange javascript regular expressions

I have found the following regular expression
new RegExp("(^|\\s)hello(\\s|$)");
I referred to http://www.javascriptkit.com/jsref/escapesequence.shtml for regular expressions.
But I cannot see the \s escape sequence there. I know \s indicates a whitespace character.
But what does the preceding \ do? Which character is escaped?
I found similar regular expression in the Treewalker code in the following document http://ejohn.org/blog/getelementsbyclassname-speed-comparison/
The double \\ is to escape the backslash inside the string. In other words, \\ will be interpreted as \ for the regular expression.
The extra \ in this case is to escape the \ in the \s. Because we are inside a string declaration, you have to double up the \ to escape it. Once the string is processed and saved, it is reduced down to (^|\s)hello(\s|$)
The character immediately following the first \ is escaped. Normally \s escapes the s to mean "whitespace". In your example, the character which is escaped is \.
What you have is an expression which builds a regex (presumably to pass elsewhere) of (^|\s)hello(\s|$) — the word "hello" preceded either by whitespace or the start of the string, and followed by whitespace or the end of the string.
Essentially, the regex looks for the word hello with a boundary on each side: the start of the string or a whitespace character before it, and a whitespace character or the end of the string after it.
After the string literal is processed, the pattern the regex engine sees is:
(^|\s)hello(\s|$)
As others have said, the double \ escapes the single \ in the string literal, so the regex engine receives \s and treats it as whitespace, not as a literal backslash followed by s.
The ^ means start of input, the $ means end of input, and each | is alternation ("or") between the two options inside its group.
Lastly, your start and end markers are wrapped in (), which makes them capturing groups, so the text they matched is returned with the match. In JavaScript, with var m = myRegex.exec(someString), you can get at them by using:
m[1]
m[2]
1 being the leading group and 2 being the trailing one.
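A quick way to check what the pattern actually does is to print the string and run the regex. This is only a sketch; the sample sentences are made up.

```javascript
// Inside a string literal, \\ produces a single backslash.
const pattern = "(^|\\s)hello(\\s|$)";
console.log(pattern); // (^|\s)hello(\s|$)

const re = new RegExp(pattern);
console.log(re.test("say hello world")); // true
console.log(re.test("hello"));           // true: start and end of string
console.log(re.test("othello"));         // false: no boundary before hello

// The two groups capture whatever surrounded the word.
const m = re.exec("say hello world");
console.log(JSON.stringify(m[1]), JSON.stringify(m[2])); // " " " "
```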

Why this Regex, matches incorrect characters?

I need to match these characters. This quote is from an API documentation (external to our company):
Valid characters: 0-9 A-Z a-z & # - . , ( ) / : ; ' # "
I used this Regex to match characters:
^[0-9a-z&#-\.,()/:;'""#]*$
However, this wrongly matches characters like %, $, and many other characters. What's wrong?
You can test this regular expression online using http://regexhero.net/tester/, and this regular expression is meant to work in both .NET and JavaScript.
You are not escaping the dash -, which has a special meaning inside a character class. If you replace the dash with \- then the regex no longer matches the characters between # and . (such as % and $).
Move the literal - to the front of the character set:
^[-0-9a-z&#\.,()/:;'""#]*$
otherwise it is taken as specifying a range like when you use it in 0-9.
The - sign, when not escaped, has a special meaning in square brackets. #-\. is read as #-. (BTW, the backslash before the dot is not necessary in square brackets), which means "any character between # (ASCII 0x23) and . (ASCII 0x2E)". The correct notation is
^[0-9a-z&#\-.,()/:;'"#]*$
The special characters in a character class are the closing bracket (]), the backslash (\), the caret (^) and the hyphen (-).
As such, you should either escape them with a backslash (\), or put them in a position where there is no ambiguity and they do not need escaping. In the case of a hyphen, this would be the first or last position.
You also do not need to escape the dot (.).
Your regex thus becomes:
^[-0-9a-z&#.,()/:;'"#]*$
As a side note, there are many available regex evaluators which provide code hinting. This way, you can simply hover your mouse over your regular expression and it can be explained in English words.
One such free one is RegExr.
Typing your original regex in it and hovering over the hyphen shows:
Matches characters in the range '#-\'
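The effect of the accidental range can be reproduced in JavaScript with a short sketch. The forward slash is escaped here because the patterns are written as regex literals, and the doubled "" from the original (a .NET verbatim-string escape, presumably) is written as a single ".

```javascript
// #-\. is a range from # (0x23) to . (0x2E), which includes $ and %.
const buggy = /^[0-9a-z&#-\.,()\/:;'"#]*$/;
// With the hyphen first, it is a literal hyphen and there is no range.
const fixed = /^[-0-9a-z&#.,()\/:;'"#]*$/;

console.log(buggy.test("%"));   // true (unintended)
console.log(buggy.test("$"));   // true (unintended)
console.log(fixed.test("%"));   // false
console.log(fixed.test("$"));   // false
console.log(fixed.test("a-b")); // true
```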
Try that
^[0-9a-zA-Z\&\#\-\.\,\(\)\/\:\;\'\"\#]*$
