Regex: Select Everything Else - javascript

I would like to select everything that is not in [a-z0-9 ].
So if the string is:
Hello! How are you Mr. 007?
Then the result would be: !,?,.

Use a negated character class:
[^a-z0-9]
You will also need a case-insensitive modifier, or alternatively use this: [^a-zA-Z0-9].
Note that your expected result is incorrect: you missed the space character. If you want to also not match the space character you need to include it in the character class: [^a-zA-Z0-9 ].

With ^ at the beginning of a character class, you negate that class:
str.match(/[^a-z0-9 ]/gi)
You should also add a space (or \s, which covers space, tab, and other whitespace) and the i modifier to make the match case-insensitive.
Read more about regular expressions in JavaScript and regular expressions in general.
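As a quick check of the negated class against the sample string from the question, here is a minimal sketch. The variable names are illustrative and not from the original posts.

```javascript
// Sample string from the question.
const sample = "Hello! How are you Mr. 007?";

// Negated class with the space included, case-insensitive and global.
const withoutSpaces = sample.match(/[^a-z0-9 ]/gi);

// Same class without the space: spaces are now matched as well.
const withSpaces = sample.match(/[^a-z0-9]/gi);

console.log(withoutSpaces); // [ '!', '.', '?' ]
console.log(withSpaces.length); // 8 (three punctuation marks plus five spaces)
```

Keep in mind that match() with the g flag returns null rather than an empty array when nothing matches, so real code may want a fallback such as `|| []`.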

Related

Reg: Javascript regex

How can I match the pattern abc_[someArbitaryStringHere]_xyz?
To clarify, I would want the regex to match strings of the nature:
abc_xyz, abc_asdfsdf_xyz, abc_32rwrd_xyz etc.
I tried with /abc_*_xyz/ but this seems to be an incorrect expression.
Use
/^abc(?:_.*_|_)xyz$/
Be sure to include the ^ and $, they guard the beginning and end of the string. Otherwise strings like "123abc_foo_xyz" will match.
(?:_.*_|_) is a non-capturing group that matches either _[someArbitaryStringHere]_ or a single _.
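The effect of the ^ and $ anchors can be seen by running the same pattern with and without them. This is only a sketch; the names anchored and unanchored are made up for the illustration.

```javascript
// Pattern from the answer, with and without the anchors.
const anchored = /^abc(?:_.*_|_)xyz$/;
const unanchored = /abc(?:_.*_|_)xyz/;

console.log(anchored.test("abc_xyz")); // true
console.log(anchored.test("abc_asdfsdf_xyz")); // true
console.log(anchored.test("123abc_foo_xyz")); // false
console.log(unanchored.test("123abc_foo_xyz")); // true (matches a substring)
```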
Your regex would be,
abc(?:(?:_[^_]+)+)?_xyz
Assuming abc_xyz is indeed a string you want to match, and isn't just a typo, then your regex is:
/abc(?:_[^_]+)?_xyz/
This will match abc, then optionally match a _ followed by greedily matching anything but _s. After this optional part, it will match the ending _xyz.
If this is to match an entire string (as opposed to just extracting substrings from a bigger string), then you can just put ^ at the start and $ at the end, like so:
/^abc(?:_[^_]+)?_xyz$/
EDIT: Just noticed that JavaScript doesn't support possessive matching, only greedy. Changed ++ to +.
EDIT2: The above regexes also assume that your "arbitrary string" does not contain more underscores. They can be expanded to allow more rules.
For example, to allow just anything, a truly arbitrary string, try:
/abc(?:_.*)?_xyz/ or /^abc(?:_.*)?_xyz$/
But if you want to be really clever, and disallow consecutive underscores, you can do:
/abc(?:_[^_]+)*_xyz/ or /^abc(?:_[^_]+)*_xyz$/
And lastly, if you want to "only allow letters or numbers" in your arbitrary strings, just replace [^_] with [a-zA-Z0-9].
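To compare the variants above side by side, the following sketch runs a few inputs through the anchored forms. The constant names are assumptions chosen for the example.

```javascript
// At most one underscore-free segment between abc and _xyz.
const singleSegment = /^abc(?:_[^_]+)?_xyz$/;
// Any number of segments, but no empty ones (no consecutive underscores).
const manySegments = /^abc(?:_[^_]+)*_xyz$/;
// Same as above, restricted to letters and digits.
const alnumSegments = /^abc(?:_[a-zA-Z0-9]+)*_xyz$/;

console.log(singleSegment.test("abc_xyz")); // true
console.log(singleSegment.test("abc_a_b_xyz")); // false
console.log(manySegments.test("abc_a_b_xyz")); // true
console.log(manySegments.test("abc__xyz")); // false
console.log(alnumSegments.test("abc_32rwrd_xyz")); // true
console.log(alnumSegments.test("abc_foo!_xyz")); // false
```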
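The * means "match zero or more", but of what? It applies to the token before it, so in /abc_*_xyz/ it repeats the underscore. Give it a character class to repeat instead: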
/abc_[a-z0-9]*_xyz/im
The dot (.) matches any character except line terminators:
/abc_(.*)_xyz/im
You need to check for at least one underscore as well if you want to match abc_xyz:
abc_+.*xyz

regular expression incorrectly matching % and $

I have a regular expression in JavaScript to allow digits and the characters , . + ( ) - and space in a phone field.
My regex is [0-9-,.+() ]
It works for digits and the characters listed above, but it also allows characters like % and $, which are not in the list.
Even though you don't have to, I always make it a point to escape metacharacters (easier to read and less pain):
[0-9\-,\.+\(\) ]
But this won't work like you expect it to because it will only match one valid character while allowing other invalid ones in the string. I imagine you want to match the entire string with at least one valid character:
^[0-9\-,\.\+\(\) ]+$
Your original regex is not actually matching %. What it is doing is matching valid characters, but the problem is that it only matches one of them. So if you had the string 435%, it matches the 4, and so the regex reports that it has a match.
If you try to match it against just one invalid character, it won't match. So your original regex doesn't match the string %:
> /[0-9\-,\.\+\(\) ]/.test("%")
false
> /[0-9\-,\.\+\(\) ]/.test("44%5")
true
> "444%6".match(/[0-9\-,\.+\(\) ]/)
["4"] //notice that the 4 was matched.
Going back to the point about escaping, I find that it is easier to escape it rather than worrying about the different rules where specific metacharacters are valid in a character class. For example, an unescaped - is only valid in the following cases:
When used in an actual range in the proper order, such as [a-z] (but not [z-a]).
When used as the first or last character, or by itself, so [-a], [a-], or [-].
When used after a range like [0-9-,] or [a-d-j] (but keep in mind that [9-,] is invalid and [a-d-j] does not match the letters e through i).
For these reasons, I escape metacharacters to make it clear that I want to match the actual character itself and to remove ambiguities.
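The hyphen rules listed above can be checked directly in JavaScript. This is a small sketch; the helper compiles is a hypothetical name introduced only for this example.

```javascript
// A hyphen right after a complete range is literal: [a-d-j] is a-d, "-", and "j".
const afterRange = /[a-d-j]/;
console.log(afterRange.test("c")); // true
console.log(afterRange.test("-")); // true
console.log(afterRange.test("j")); // true
console.log(afterRange.test("e")); // false

// Hypothetical helper: reports whether a pattern source compiles at all.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    return false; // e.g. SyntaxError for a range out of order
  }
}

console.log(compiles("[a-z]")); // true
console.log(compiles("[-a]")); // true
console.log(compiles("[z-a]")); // false
console.log(compiles("[9-,]")); // false
```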
You just need to anchor your regex:
^[0-9-,.+() ]+$
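In a character class, special characters don't need to be escaped, except ] and - (and the backslash itself).
But ] and - can be left unescaped when:
] is the first character in the class, as in []] (this holds in PCRE and POSIX flavors; in JavaScript [] is an empty class, so ] always needs escaping there)
- is at the beginning [-abc] or at the end [abc-] of the class, or directly after a range [a-c-x]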
Escape characters with special meaning in your RegExp. If you're not sure and it isn't an alphabet character, it usually doesn't hurt to escape it, too.
If the whole string must match, include the start ^ and end $ of the string in your RegExp, too.
/^[\d\-,\.\+\(\) ]*$/
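Putting the anchoring advice together, a minimal validation sketch might look like the following. The names phoneChars and isPhoneLike are assumptions for the example, and the + quantifier is used here so that an empty string is rejected.

```javascript
// Whole string must consist of digits and the allowed punctuation only.
const phoneChars = /^[0-9\-,.+() ]+$/;

// Hypothetical helper wrapping the test.
function isPhoneLike(value) {
  return phoneChars.test(value);
}

console.log(isPhoneLike("(555) 123-4567")); // true
console.log(isPhoneLike("+1 555.123,4")); // true
console.log(isPhoneLike("435%")); // false
console.log(isPhoneLike("$100")); // false
console.log(isPhoneLike("")); // false
```

This only checks which characters are present; it does not check that the value is a plausible phone number.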

javascript replace() function strange behaviour with regexp

Am I doing something wrong, or is there a problem with JS replace?
<input type="text" id="a" value="(55) 55-55-55" />
document.write($("#a").val().replace(/()-/g,''));
prints (55) 555555
http://jsfiddle.net/Yb2yV/
How can I replace ( ) and spaces too?
In a JavaScript regular expression, the ( and ) characters have special meaning. If you want to list them literally, put a backslash (\) in front of them.
If your goal is to get rid of all the (, ), -, and space characters, you could do it with a character class combined with an alternation (i.e., either-or) on \s, which stands for "whitespace":
document.write($("#a").val().replace(/[()\-]|\s/g,''));
(I didn't put backslashes in front of the () because you don't need to within a character class. I did put one in front of the - because within a character class, - has special meaning.)
Alternately, if you want to get rid of anything that isn't a digit, you can use \D:
document.write($("#a").val().replace(/\D/g,''));
\D means "not a digit" (note that it's a capital, \d in lower case is the opposite [any digit]).
More info on the MDN page on regular expressions.
You need to use a character class
/[-() ]/
Using "-" as the first character solves the ambiguity because a dash is normally used for ranges (e.g. [a-zA-Z0-9]).
document.write($("#a").val().replace(/[\s()-]/g,''));
That will remove all whitespace (\s), parens, and dashes
Use this
.replace(/\(|\)|-| /g,'')
You have to escape the parentheses (i.e. \( instead of (). In your regexp, you want to list the four items \(, \), - and (space), and since you want to replace any one of them, not just a string of all four together, you have to use OR (|) between them.
It may be crude, but a very basic approach would be:
document.write($("#a").val().replace(/(\()|(\))|-| /g,''));
| means OR,
\ is used for escaping reserved symbols.
You want to match any character in the set, so you should use square brackets to make a character set:
document.write($("#a").val().replace(/[()\- ]/g,''));
Normally, parentheses have a special meaning in regular expressions: in your regex, () was an empty group that matched the empty string, leaving just the dash. To get literal parentheses, you normally need to escape them with \ (but inside a square-bracket set, as above, you don't).
The dash above is escaped because it normally indicates a range in a character set, e.g., [a-z].
The parentheses indicate a capturing group in the regexp. You'd need to escape them (/\(\)-/) to match the sequence "()-". Yet I guess you want to use a character class, i.e. an expression that matches "(", ")" or "-"; for whitespace, include the \s shorthand (with the dash placed last so it cannot be read as a range):
value.replace(/[()\s-]/g, "");
You might want to read some documentation or tutorial.

.replace(' ', '-') will only replace first whitespace

I am trying
to convert: 'any string separated with blankspaces' into
'any-string-separated-with-blankspaces'
I am trying with .replace(' ','-') but it only replaces the first one. Why? How can I replace all of them?
http://jsfiddle.net/7ycg3/
You need a regular expression for that
.replace(/\s/g,'-')
\s will match any kind of whitespace character. If you're strictly after a "normal" space, use
/ /g
instead.
You need to use a regular expression as the first parameter, using the /g modifier to make it replace all occurrences:
var replaced = input.replace(/ /g,'-');
If you want to replace any whitespace character instead of a literal space, you need to use \s instead of the space in the regex; and if you want to replace any number of consecutive spaces with one hyphen, then add + after the space or \s.
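The three behaviors described above (string argument, global regex, and collapsing runs of whitespace) can be sketched as follows. The variable names are assumptions made for the example.

```javascript
const phrase = "any string separated with blankspaces";

// A string pattern replaces only the first occurrence.
const firstOnly = phrase.replace(" ", "-");

// A regex with the g flag replaces every occurrence.
const allSpaces = phrase.replace(/ /g, "-");

// \s+ turns each run of whitespace (spaces, tabs, ...) into one hyphen.
const collapsed = "a  b\t c".replace(/\s+/g, "-");

console.log(firstOnly); // "any-string separated with blankspaces"
console.log(allSpaces); // "any-string-separated-with-blankspaces"
console.log(collapsed); // "a-b-c"
```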
It's not stated particularly clearly in the MDN docs for String.replace, but String.replace only does one replacement, unless the g flag is included in it, using a regular expression rather than a string:
To perform a global search and replace, either include the g switch in the regular expression or if the first parameter is a string, include g in the flags parameter.
(But be aware that the flags parameter is non-standard, as they also note there.)
Thus, you want tag.replace(/ /g,'-').
http://jsfiddle.net/7ycg3/1/
Use a regex with the /g modifier:
use /\s/g in place of ' ' in your question.

Regex not working as expected

What's wrong with this regular expression?
/^[a-zA-Z\d\s&#-\('"]{1,7}$/;
when I enter the following valid input, it fails:
a&'-#"2
It should also reject 2 consecutive spaces within the input.
The dash needs to be either escaped (\-) or placed at the end of the character class, or it will signify a range (as in A-Z), not a literal dash:
/^[A-Z\d\s&#('"-]{1,7}$/i
would be a better regex.
N.B.: [#-\(] would have matched #, $, %, &, ' or (.
To address the added requirement of not allowing two consecutive spaces, use a lookahead assertion:
/^(?!.*\s{2})[A-Z\d\s&#('"-]{1,7}$/i
(?!.*\s{2}) means "Assert that it's impossible to match (from the current position) any string followed by two whitespace characters". One caveat: The dot doesn't match newline characters.
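To see the original pattern fail and the corrected one with the lookahead succeed, here is a short sketch. The names broken, fixed, and sampleInput are made up for the illustration.

```javascript
// Original pattern: #-\( is read as a range, so a literal "-" is not allowed.
const broken = /^[a-zA-Z\d\s&#-\('"]{1,7}$/;

// Corrected pattern with the dash last and a lookahead against double whitespace.
const fixed = /^(?!.*\s{2})[A-Z\d\s&#('"-]{1,7}$/i;

// Input from the question.
const sampleInput = `a&'-#"2`;

console.log(broken.test(sampleInput)); // false
console.log(fixed.test(sampleInput)); // true
console.log(fixed.test("a b")); // true
console.log(fixed.test("a  b")); // false (two consecutive spaces)
console.log(fixed.test("abcdefgh")); // false (more than 7 characters)
```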
The - (hyphen) has a special meaning inside a character class, where it is used for specifying ranges. Did you mean to escape it?
/^[a-zA-Z\d\s&#\-\('"]{1,7}$/;
This RegExp matches your input.
You have an unescaped - in the middle of your character class. This means that you're actually searching for all characters between and including # and ( (which are #, $, %, &, ', and (). Either move it to the end or escape it with a backslash. Your regex should read:
/^[a-zA-Z\d\s&#\('"-]{1,7}$/
or
/^[a-zA-Z\d\s&#\-\('"]{1,7}$/
Remove the ; at the end and use:
^[a-zA-Z\d\s\&\#\-\(\'\"]+$
Your input does not match the regular expression. The problem here is the hyphen in your regexp. If you move it from its position after the '#' character to the start of the character class, like so:
/^[-a-zA-Z\d\s&#\('"]{1,7}$/;
everything is fine and dandy.
You can always use Rubular for checking your regular expressions. I use it on a regular (no pun intended) basis.
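The accidental range described in the answers above can also be inspected from code by listing which printable ASCII characters the class accepts. This is a sketch; accidentalRange and matched are illustrative names.

```javascript
// The unintended range from the question's character class.
const accidentalRange = /[#-\(]/;

// Collect every printable ASCII character the range accepts.
const matched = [];
for (let code = 0x20; code <= 0x7e; code++) {
  const ch = String.fromCharCode(code);
  if (accidentalRange.test(ch)) {
    matched.push(ch);
  }
}

console.log(matched.join(" ")); // # $ % & ' (
console.log(accidentalRange.test("-")); // false
```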
