JavaScript Regex - custom characters and numbers - javascript

I am building a RegEx that is almost complete, but I cannot get it to reject digits (0-9).
So for example: Jones-Parry is valid but Jones-Parry1 is not. The regex at present looks like this:
^([\\w\\s,'\\-ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸçÇŒœßØøÅåÆæÞþÐð]){0,80}$
I have tried using \d and [0-9] but to no avail. All else is working with the regex aside from the numbers. It validates special characters etc.
Any pointers greatly appreciated!

The problem is that \w expands to [A-Za-z0-9_], which includes the digits 0-9. This is why strings containing digits pass your test.
Specify A-Za-z_ directly instead of \w in your character class, and digits will no longer be accepted.
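As a rough illustration of that substitution, the sketch below takes the character class from the question and swaps \w for A-Za-z_. The name `namePattern` is made up for this example. The pattern is written as a regex literal, so the backslashes are single rather than doubled, and the capturing group around the class is dropped because nothing reads it.

```javascript
// Character class from the question, with \w replaced by A-Za-z_
// so that digits are no longer accepted. `namePattern` is a hypothetical name.
const namePattern = /^[A-Za-z_\s,'\-ÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜŸäëïöüŸçÇŒœßØøÅåÆæÞþÐð]{0,80}$/;

console.log(namePattern.test("Jones-Parry"));  // true
console.log(namePattern.test("Jones-Parry1")); // false
```

Because the quantifier is {0,80}, the empty string still passes this check.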
As georg has pointed out in the comments, your regex is very weak: aside from the length limit, it only checks that the string contains no character outside your allowed set. A string of only spaces, or a string of only punctuation, would pass the test.
Anyway, I doubt validating names is a good idea in general. Many assumptions programmers make about names are wrong. Depending on your requirements, you can give the user a display-name field, where anything can be typed in, and a separate username field, where you only allow a strict set of characters.

Related

Easy way to remove what's NOT in regex

Is there an easy way to get the opposite of a regex or do I need to build a new regex that produces the opposite of what I have?
For example, I use this regex to make sure I'm getting proper currency values -- without $ sign.
/^[0-9]+(\.[0-9][0-9]?)?$/
I now want to remove everything that falls outside this pattern. For example, if a user enters letters or tries to enter two periods, e.g. 7.50.6, I want to remove the undesired characters. How do I do that?
I think you're going about this the wrong way. First of all, trying to hide input errors in such a way is a bad idea. If a user has to type a number and they put in an extra dot, what tells you which is the good part and which is the bad? You're better off telling the user there's something wrong with their input.
But typically, you use a regex by specifying what the input has to look like and, using capture groups, which significant portions you want to keep.
This is a capture group: ([a-z0-9])@example.com; this is a non-capture group: (?:hello|hi).
In case of a phone number, all that matters are the digits, so you can capture them and accept multiple forms of in-between characters. Here's a simple one for a postal code:
([A-Z][0-9][A-Z]) ?([0-9][A-Z][0-9])
Then all you have to do is combine the captured groups. If present, the space won't be captured.
Find more examples on MDN.
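To make the "combine the captured groups" step concrete, here is a minimal sketch built on the postal code pattern above. The function name `normalizePostalCode`, the added ^ and $ anchors, and the trim/upper-case step are assumptions made for this example, not part of the original answer.

```javascript
// Anchors (^ and $) are added here so the whole input must be a postal code.
const postalCodePattern = /^([A-Z][0-9][A-Z]) ?([0-9][A-Z][0-9])$/;

// Hypothetical helper: returns the six significant characters, or null
// when the input does not look like a postal code at all.
function normalizePostalCode(input) {
  const match = postalCodePattern.exec(input.trim().toUpperCase());
  // match[1] and match[2] are the two captured halves; the optional
  // space between them sits outside both groups, so it is not kept.
  return match ? match[1] + match[2] : null;
}

console.log(normalizePostalCode("K1A 0B1")); // K1A0B1
console.log(normalizePostalCode("k1a0b1"));  // K1A0B1
console.log(normalizePostalCode("7.50.6"));  // null
```

A null result is the point where the caller can tell the user the input is wrong instead of silently stripping characters.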

Regular expression to avoid control characters

I am working in Flex and using a RegExp to check values entered in the UI. I want to ensure that an entered value contains no control characters, and to show a warning if it does. Since we support many languages, I can't write a regex that lists every allowed character, so I need a blacklist of control characters instead. I tried ^[^\x00-\x1F\x7F\u2028\u2029]*$ but it matches successfully if there is any regular character other than a control character. I want it to return no match if even a single control character is present. What should I change in this regular expression?
I would appreciate any help.
You can use the following trick (put your negated set in a lookahead followed by a . and capture as a whole):
^((?=[^\x00-\x1F\x7F\u2028\u2029]).)*$
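The sketch below tries the lookahead pattern in JavaScript, on the assumption that the syntax carries over from Flex unchanged. It also shows a different technique that is not from the answer: search for a single forbidden character and negate the result. The names `noControlChars`, `controlChar`, and `isClean` are made up for this example.

```javascript
// The lookahead pattern from the answer above.
const noControlChars = /^((?=[^\x00-\x1F\x7F\u2028\u2029]).)*$/;

// Alternative (not from the answer): find one forbidden character and negate.
const controlChar = /[\x00-\x1F\x7F\u2028\u2029]/;

function isClean(value) {
  return !controlChar.test(value);
}

console.log(noControlChars.test("héllo wörld")); // true
console.log(noControlChars.test("tab\there"));   // false
console.log(isClean("héllo wörld"));             // true
console.log(isClean("tab\there"));               // false
```

The negated search avoids anchors and lookaheads entirely, which may make it easier to reason about when a match unexpectedly succeeds.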

Checking for a string which contains 3 consecutive letters and 2 digits in any order

I can't seem to wrap my head around this one and thought I'd ask for some help here!
Basically I am validating a password field and the requirements are as follows:
- Must contain 3 consecutive letters
- Must contain at least 2 digits
- Can be in any order (e.g. 1abc342, abc24g3, 11abcsjf)
Here is what I have so far but I believe it needs some tweaking:
/[a-z]{3}[0-9][0-9]/i
The regex you are describing can be written like so:
/(?=.*?[a-z]{3})(?=.*?\d.*?\d)/
The first lookahead searches for three letters in a row, in any position. The second lookahead looks for a digit in any position, followed by a digit further ahead.
You should probably do this in two separate regular expressions: one to test for three consecutive letters and one to test for at least two digits:
/[a-z]{3}/i
/\d.*\d/
Make sure both conditions are met. You could use lookahead to combine this into one regex, but I think two regexes is clearer code and a better solution.
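A minimal sketch of the two-regex approach, with the single lookahead pattern alongside for comparison. The digit check is written as \d.*\d so that it requires two digits, and the i flag is added to the lookahead version to mirror the case-insensitive letter check. The names `meetsPasswordRules` and `combined` are made up for this example.

```javascript
const hasThreeLetters = /[a-z]{3}/i; // three consecutive letters, any case
const hasTwoDigits = /\d.*\d/;       // a digit, then another digit somewhere later

function meetsPasswordRules(password) {
  return hasThreeLetters.test(password) && hasTwoDigits.test(password);
}

// The same two conditions expressed as lookaheads in one pattern.
const combined = /^(?=.*?[a-z]{3})(?=.*?\d.*?\d)/i;

for (const candidate of ["1abc342", "abc24g3", "11abcsjf", "abc1", "ab12cd"]) {
  // Prints the candidate followed by the result of each approach.
  console.log(candidate, meetsPasswordRules(candidate), combined.test(candidate));
}
```

The first three candidates are the examples from the question and pass both checks; "abc1" has only one digit and "ab12cd" has no run of three letters, so both fail.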
But if I may inject some opinion on the matter: Unless you have no control over this (client specified this), I'd highly recommend not imposing password restrictions like this. They actually make your password system far less secure, not more secure. Some reading on why:
http://jimpravetz.com/blog/2011/06/cheap-gpus-are-rendering-strong-passwords-use/
http://jimpravetz.com/blog/2012/02/stupid-password-rules/

Email Regular Expression - Excluded Specified Set

I have been researching a regular expression for the better part of about six hours today. For the life of me, I can not figure it out. I have tried what feels like about a hundred different approaches to no avail. Any help is greatly appreciated!
The basic rules:
1 - Exclude these characters in the address portion (before the @ symbol): "()<>@,;:\[]*&^%$#!{}/"
2 - The address can contain a ".", but not two in a row.
I have an elegant solution to rule number one; however, rule number two is killing me! Here is what I have so far. (I'm only including the portion up to the @ sign to keep it simple.) Also, it is important to note that this regular expression is being used in JavaScript, so no conditional IF is allowed.
/^[^()<>@,;:\\[\]*&^%$#!{}//]+$/
First of all, I would suggest you always choose what characters you want to allow instead of the opposite, you never know what dangerous characters you might miss.
Secondly, this is the regular expression I always use for validating emails and it works perfectly. Hope it helps you out.
/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$/i
Rule number 2
/^(?:\.?[^.])+\.?$/
which means any number of sequences of (an optional dot followed by a mandatory non dot) with an optional dot at the end.
Consider four two-character sequences:
xx matches as two non dot characters.
.x matches as an optional dot followed by a non-dot.
x. matches as a non-dot followed by an optional dot at the end.
.. does not match because there is no non-dot after the first dot.
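The four cases above can be checked directly. This is a small sketch; the name `noDoubleDots` and the two longer samples at the end of the list are additions for illustration.

```javascript
// Any number of (optional dot + mandatory non-dot), then an optional final dot.
const noDoubleDots = /^(?:\.?[^.])+\.?$/;

for (const sample of ["xx", ".x", "x.", "..", "a.b.c", "a..b"]) {
  // Prints each sample (quoted) followed by whether it matches.
  console.log(JSON.stringify(sample), noDoubleDots.test(sample));
}
```

Only ".." and "a..b" are rejected, since each contains a dot that is not followed by a non-dot character before the next dot.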
One thing to remember about email addresses is that dots can appear in tricky places
"..@"@.example.com
is a valid email address.
The "..@" is a perfectly valid quoted local-part production, and .example.com is just a way of saying example.com but resolved against the root DNS instead of using a host search path. example.com might resolve to example.com.myintranet.com if myintranet.com is on the host search path but .example.com always resolves to the absolute host example.com.
First of all, to your specifications:
^(?![\s\S]*\.\.)[^()<>@,;:\\[\]*&^%$#!{}/]+@.*$
It's just your regex with (?![\s\S]*\.\.) tacked onto the front. That's a negative lookahead, which makes the match fail if there are two consecutive periods anywhere in the string.
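As a sketch of the negative lookahead in isolation, the pattern below is restricted to the local part only, so it ends at $ instead of continuing into the domain. The name `localPart` is made up for this example, and the slash inside the character class is escaped to keep the regex literal unambiguous.

```javascript
// (?![\s\S]*\.\.) rejects any string containing two consecutive dots;
// the character class then excludes the forbidden characters from rule 1.
const localPart = /^(?![\s\S]*\.\.)[^()<>@,;:\\[\]*&^%$#!{}\/]+$/;

console.log(localPart.test("john.smith"));  // true
console.log(localPart.test("john..smith")); // false
console.log(localPart.test("john(smith"));  // false
```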
Properly matching email addresses is quite a bit harder, however.

How to detect what allowed character in current Regular Expression by using JavaScript?

In my web application, I created a framework that binds model data to controls on the page. Each model property has rules such as string length, not-null, and a regular expression. Before the page is submitted, the framework validates every bound control against the defined rules.
So I want to detect which characters are allowed by each regular-expression rule, as in the following examples.
"^[0-9]+$" allows only digit characters such as 1, 2, 3.
"^[a-zA-Z_][a-zA-Z_\-0-9]+$" allows only letters, digits, - and _ characters.
However, this function should not care about grouping or the position of allowed characters. It should only report which characters are possible.
Do you have any idea how to create this function?
PS. I know it is easy to write a specific function, such as a numeric-only one that allows only digit characters. But I need to share the same piece of code between the data tier (which contains all the model validators) and the UI tier without modifying anything.
Thanks
You can't solve this for the general case. Regexps don't generally ‘fail’ at a particular character, they just get to a point where they can't match any more, and have to backtrack to try another method of matching.
One could make a regex implementation that remembered which was the farthest it managed to match before backtracking, but most implementations don't do that, including JavaScript's.
A possible way forward would be to match first against ^pattern$, and if that failed match against ^pattern without the end-anchor. This would be more likely to give you some sort of match of the left hand part of the string, so you could count how many characters were in the match, and say the following character was ‘invalid’. For more complicated regexps this would be misleading, but it would certainly work for the simple cases like [a-zA-Z0-9_]+.
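A minimal sketch of that two-step idea follows. The function name `firstInvalidIndex` and its -1 convention for "fully valid" are assumptions made for this example, and the pattern is assumed to be passed as source text without its own ^ and $ anchors.

```javascript
// Returns -1 when the whole input matches, otherwise the index of the
// first character the anchored-at-start pattern could not consume.
function firstInvalidIndex(patternSource, input) {
  const full = new RegExp("^(?:" + patternSource + ")$");
  if (full.test(input)) return -1;

  // Retry without the end anchor and measure how far the match got.
  const prefix = new RegExp("^(?:" + patternSource + ")").exec(input);
  return prefix ? prefix[0].length : 0;
}

console.log(firstInvalidIndex("[a-zA-Z0-9_]+", "abc$def")); // 3
console.log(firstInvalidIndex("[a-zA-Z0-9_]+", "abc"));     // -1
console.log(firstInvalidIndex("[a-zA-Z0-9_]+", "$abc"));    // 0
```

As the answer warns, the reported index is only meaningful for simple patterns like this one; with alternation or backtracking-heavy patterns the prefix match can stop somewhere unrelated to the real problem.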
I must admit that I'm struggling to parse your question.
If you are looking for a regular expression that will match only if a string consists entirely of a certain collection of characters, regardless of their order, then your examples of character classes were quite close already.
For instance, ^[A-Za-z0-9]+$ will only allow strings that consist of letters A through Z (upper and lower case) and numbers, in any order, and of any length.
