Represent any set of characters in a regex expression - javascript

I want to treat expressions like these:
expression 1 -> sum$4,8,'x'$
expression 2 -> sum$2,15,'(x^3+3)/(x+1)'$
and i am using regular expressions to recognize the pattern:
sum\$[0-9]+[,][0-9]+[,]['][.]*[\w]*[']\$
But it only works for expression 1, why dot (.) which represents any character seems like is not working? Do I have to treat the parenthesis in a special way?

. is a meta character that will match any 1 character. There are other meta characters like \d which will match 1 letter between 0-9. SOME of the meta characters loose their special meaning inside character class []. So dot inside character class [] will not be a meta character any more and will be treated as literal dot.
.* and .*? are two different things. Former will greedily match everything and later is lazy. For eg. Take a string like: abbcbbbc. Now
a.*c will match the complete string abbcbbbc while a.*?c will match only abbc
You can try Something like this.
sum\$\d+,\d+,'.*?'\$
Demo: https://regex101.com/r/6PEfBh/1

The way [] work is they match characters exactly so in your case you have to have the . without [] like so:
sum\$[0-9]+[,][0-9]+[,]['].*[\w]*[']\$
This regex also matches expression 2

You can try this
sum\$[0-9]+[,][0-9]+[,]['][^']*[\w]*[']\$
Replaced [.] with [^']

Related

how to replace all occurrances of "\\" string in java script

This seems a very simple question but I haven't been able to get this to work.
How do I convert the following string:
var origin_str = "abc/!/!"; // Original string
var modified_str = "abc!!"; // replaced string
I tried this:
console.log(origin_str.replace(/\\/,''));
This only removes the first occurrence of backslash. I want to replaceAll. I followed this instruction in SO: How to replace all occurrences of a string in JavaScript?
origin_str.replace(new RegExp('\\', 'g'), '');
This code throws me an error SyntaxError: Invalid regular expression: /\/: \ at end of pattern. What's the regex for removing backslash in javascript.
A quick basic overview of regular expressions in JavaScript
When using regular expressions you can define the expression on two ways.
Either directly in the function or variable by using /regular expression/
Or by using the regExp contructor: new RegExp('regular expression').
Please note the difference between the two ways of defining. In the first the search pattern is encapsuled by forward slashes, while in the second one the search pattern is passed as a string.
Remember that regular expressions is in fact a search language with it's own syntax. Some characters are used to define actions: /, \, ^, $, . (dot), |, ?, *, +, (, ), [, {, ', ". These characters are called metacharacters and need to be escaped if you want them to be part of the search pattern. If not they will be treated as an option or generate script errors. Escaping is done by using the backslash. E.g. \\ escapes the second backslash and the search pattern will now search for backslashes.
There are a multitude of options you can add to your search pattern.:
Examples
adding \d will make the pattern search for a numeric value between [0-9] and/or the underscore. Simple regular expressions are parsed from left to right.
/javascript/
Searches for the word javascript in a string.
/[a-z]/
When a pattern is put between square bracket the search pattern searches for a character matching any one of the values inside the square brackets. This will find d in 229302d34330
You can build a regular expression with multiple blocks.
/(java)|(emca)script/
Find javascript or emcascript in a string. The | is the or operator.
/a/ vs. /a+/
The first matches the first a in aaabbb, the second matches a repetition of a until another character is found. So the second matches: aaa.
The plus sign + means find a one or more times. You can also use * which means zero or more times.
/^\d+$/
We've seen the \d earlier and also the plus sign. This means find one or more numeric characters. The ^ (caret) and $ (dollar sign) are new. The ^ says start searching from the begin of the string, while the $ says until the end of the string. This expression will match: 574545485 but not d43849343, 549854fff or 4348d8788.
Flags
Flags are operators and are declared after the regular expression /regular expression/flags
JavaScript has three flags you can use:
g (global) Searches multiples times for the pattern.
i (ignore case) Ignores case in pattern.
m (multiline) treat beginning and end characters (^ and $) as working over multiple lines (i.e., match the beginning or end of each line (delimited by \n or \r), not only the very beginning or end of the whole input string)
So a regular expression like this:
/d[0-9]+/ig
matches D094938 and D344783 in 98498D094938A37834D344783.
The i makes the search case-insensitive. Matching a D because of the d in the pattern. If D is followed by one or more numbers then the pattern is matched. The g flag commands the expression to look for the pattern globally or simply said: multiple times.
In your case #Qwerty provided the correct regex:
origin_str.replace(/\//g, "")
Where the search pattern is a single forward slash /. Escaped by the backslash to prevent script errors. The g flags commands the replace function to search for all occurrences of the forward slash in the string and replace them with an empty string "".
For a comprehensive tutorial and reference : http://www.regular-expressions.info/tutorial.html
Looking for this?
origin_str.replace(/\//g, "")
The syntax for replace is
.replace(/pattern/flags, replacement)
So in my case the pattern is \/ - an escaped slash
and g is global flag.

regular expression incorrectly matching % and $

I have a regular expression in JavaScript to allow numeric and (,.+() -) character in phone field
my regex is [0-9-,.+() ]
It works for numeric as well as above six characters but it also allows characters like % and $ which are not in above list.
Even though you don't have to, I always make it a point to escape metacharacters (easier to read and less pain):
[0-9\-,\.+\(\) ]
But this won't work like you expect it to because it will only match one valid character while allowing other invalid ones in the string. I imagine you want to match the entire string with at least one valid character:
^[0-9\-,\.\+\(\) ]+$
Your original regex is not actually matching %. What it is doing is matching valid characters, but the problem is that it only matches one of them. So if you had the string 435%, it matches the 4, and so the regex reports that it has a match.
If you try to match it against just one invalid character, it won't match. So your original regex doesn't match the string %:
> /[0-9\-,\.\+\(\) ]/.test("%")
false
> /[0-9\-,\.\+\(\) ]/.test("44%5")
true
> "444%6".match(/[0-9\-,\.+\(\) ]/)
["4"] //notice that the 4 was matched.
Going back to the point about escaping, I find that it is easier to escape it rather than worrying about the different rules where specific metacharacters are valid in a character class. For example, - is only valid in the following cases:
When used in an actual character class with proper-order such as [a-z] (but not [z-a])
When used as the first or last character, or by itself, so [-a], [a-], or [-].
When used after a range like [0-9-,] or [a-d-j] (but keep in mind that [9-,] is invalid and [a-d-j] does not match the letters e through f).
For these reasons, I escape metacharacters to make it clear that I want to match the actual character itself and to remove ambiguities.
You just need to anchor your regex:
^[0-9-,.+() ]+$
In character class special char doesn't need to be escaped, except ] and -.
But, these char are not escaped when:
] is alone in the char class []]
- is at the begining [-abc] or at the end [abc-] of the char class or after the last end range [a-c-x]
Escape characters with special meaning in your RegExp. If you're not sure and it isn't an alphabet character, it usually doesn't hurt to escape it, too.
If the whole string must match, include the start ^ and end $ of the string in your RegExp, too.
/^[\d\-,\.\+\(\) ]*$/

Can it be done with regex?

Having the following regex: ([a-zA-Z0-9//._-]{3,12}[^//._-]) used like pattern="([a-zA-Z0-9/._-]{3,12}[^/._-])" to validate an HTML text input for username, I wonder if is there anyway of telling it to check that the string has only one of the following: ., -, _
By that I mean, that I'm in need of regex that would accomplish the following (if possible)
alex-how => Valid
alex-how. => Not valid, because finishing in .
alex.how => Valid
alex.how-ha => Not valid, contains already a .
alex-how_da => Not valid, contains already a -
The problem with my current regex, is that for some reason, accepts any character at the end of the string that is not ._-, and can't figure it out why.
The other problem, is that it doesn't check to see that it contains only of the allowed special characters.
Any ideas?
Try this one out:
^(?!(.*[.|_|-].*){2})(?!.*[.|_|-]$)[a-zA-Z0-9//._-]{3,12}$
Regexpal link. The regex above allow at max one of ., _ or -.
What you want is one or more strings containing all upper, lower and digit characters
followed by either one or none of the characters in "-", ".", or "_", followed by at least one character:
^[a-zA-Z0-9]+[-|_|\.]{0,1}[a-zA-Z0-9]+$
Hope this will work for you:-
It says starts with characters followed by (-,.,_) and followed and end with characters
^[\w\d]*[-_\.\w\d]*[\w\d]$
Seems to me you want:
^[A-Za-z0-9]+(?:[\._-][A-Za-z0-9]+)?$
Breaking it down:
^: beginning of line
[A-Za-z0-9]+: one or more alphanumeric characters
(?:[\._-][A-Za-z0-9]+)?: (optional, non-captured) one of your allowed special characters followed by one or more alphanumeric characters
$: end of line
It's unclear from your question if you wanted one of your special characters (., -, and _) to be optional or required (e.g., zero-or-one versus exactly-one). If you actually wanted to require one such special character, you would just get rid of the ? at the very end.
Here's a demonstration of this regular expression on your example inputs:
http://rubular.com/r/SQ4aKTIEF6
As for the length requirement (between 3 and 12 characters): This might be a cop-out, but personally I would argue that it would make more sense to validate this by just checking the length property directly in JavaScript, rather than over-complicating the regular expression.
^(?=[a-zA-Z0-9/._-]{3,12}$)[a-zA-Z0-9]+(?:[/._-][a-zA-Z0-9]+)?$
or, as a JavaScript regex literal:
/^(?=[a-zA-Z0-9\/._-]{3,12})[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/
The lookahead, (?=[a-zA-Z0-9/._-]{3,12}$), does the overall-length validation.
Then [a-zA-Z0-9]+ ensures that the name starts with at least one non-separator character.
If there is a separator, (?:[/._-][a-zA-Z0-9]+)? ensures that there's at least one non-separator following it.
Note that / has no special meaning in a regex. You only have to escape it if you're using a regex literal (because / is the regex delimiter), and you escape it by prefixing with a backslash, not another forward-slash. And inside a character class, you don't need to escape the dot (.) to make it match a literal dot.
The dot in regex has a special meaning: "any character here".
If you mean a literal dot, you should escape it to tell the regex parser so.
Escape dot in a regex range

javascript replace() function strange behaviour with regexp

Am i doing sth wrong or there is a problem with JS replace ?
<input type="text" id="a" value="(55) 55-55-55" />​
document.write($("#a").val().replace(/()-/g,''));​
prints (55) 555555
http://jsfiddle.net/Yb2yV/
how can i replace () and spaces too?
In a JavaScript regular expression, the ( and ) characters have special meaning. If you want to list them literally, put a backslash (\) in front of them.
If your goal is to get rid of all the (, ), -, and space characters, you could do it with a character class combined with an alternation (e.g., either-or) on \s, which stands for "whitespace":
document.write($("#a").val().replace(/[()\-]|\s/g,''));​
(I didn't put backslashes in front of the () because you don't need to within a character class. I did put one in front of the - because within a character class, - has special meaning.)
Alternately, if you want to get rid of anything that isn't a digit, you can use \D:
document.write($("#a").val().replace(/\D/g,''));​
\D means "not a digit" (note that it's a capital, \d in lower case is the opposite [any digit]).
More info on the MDN page on regular expressions.
You need to use a character class
/[-() ]/
Using "-" as the first character solves the ambiguity because a dash is normally used for ranges (e.g. [a-zA-Z0-9]).
document.write($("#a").val().replace(/[\s()-]/g,''));​
That will remove all whitespace (\s), parens, and dashes
Use this
.replace(/\(|\)|-| /g,'')
You have to escape the parenthesis (i.e. \( instead of (). In your regexp, you want to list the four items: \(, \), '-' and (space) and as you want to replace any of them, not just a string of them four together, you have to use OR | between them.
May be very bad but a very basic approach would be,
document.write($("#a").val().replace(/(\()|(\))|-| |/g,''));​​
| means OR,
\ is used for escaping reserved symbols
You want to match any character in the set, so you should use square brackets to make a character set:
document.write($("#a").val().replace(/[()\- ]/g,''));
Normally, parentheses have a special meaning in regular expressions, so they were being ignored in your regex, leaving just the dash. Normally, to get literal parentheses, you need to escape them with \ (but in a square bracket block, as above, you don't).
The dash above is escaped because it has normally indicates range in a character set, e.g., [a-z].
The brackets indicate a capturing group in the regexp. You'd need to escape them (/\(\)-/) to match the sequence "()-". Yet I guess you want to use a character class, i.e. a expression that matches "(", ")" or "-"; for whitespaces include the \s shorthand:
value.replace(/[()-\s]/g, "");
You might want to read some documentation or tutorial.

Regex not working as expected

Whats wrong with this regular expression?
/^[a-zA-Z\d\s&#-\('"]{1,7}$/;
when I enter the following valid input, it fails:
a&'-#"2
Also check for 2 consecutive spaces within the input.
The dash needs to be either escaped (\-) or placed at the end of the character class, or it will signify a range (as in A-Z), not a literal dash:
/^[A-Z\d\s&#('"-]{1,7}$/i
would be a better regex.
N. B: [#-\(] would have matched #, $, %, &, ' or (.
To address the added requirement of not allowing two consecutive spaces, use a lookahead assertion:
/^(?!.*\s{2})[A-Z\d\s&#('"-]{1,7}$/i
(?!.*\s{2}) means "Assert that it's impossible to match (from the current position) any string followed by two whitespace characters". One caveat: The dot doesn't match newline characters.
The - (hyphen) has a special meaning inside a character class, used for specifying ranges. Did you mean to escape it?:
/^[a-zA-Z\d\s&#\-\('"]{1,7}$/;
This RegExp matches your input.
You have an unescaped - in the middle of your character class. This means that you're actually searching for all characters between and including # and ( (which are #, $, %, &, ', and (). Either move it to the end or escape it with a backslash. Your regex should read:
/^[a-zA-Z\d\s&#\('"-]{1,7}$/
or
/^[a-zA-Z\d\s&#\-\('"]{1,7}$/
remove the ; at the end and
^[a-zA-Z\d\s\&\#\-\(\'\"]+$
Your input does not match the regular expression. The problem here is the hyphen in you regexp. If you move it from its position after the '#' character to the start of the regex, like so:
/^[-a-zA-Z\d\s&#\('"]{1,7}$/;
everything is fine and dandy.
You can always use Rubular for checking your regular expressions. I use it on a regular (no pun intended) basis.

Categories

Resources