What regex is good for email validation?


I'm using the following for email validation:
var filter = /^(\w+)(\.\w+)*@(\w+)(\.\w+)+$/;
Just noticed that it does not support xxxx+wildcard@gmail.com (it does not support the +wildcard part). Any way to get that added?

You should use \S+@\S+\.\S+, which will match anything with an @ and a dot.
Anything more than that will reject valid but obscure addresses.
Even this will reject valid but obscure addresses, such as "Test Me"@localhost.
However, these are never used in practice. [citation needed]
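A minimal sketch of that loose check, next to the stricter pattern from the question. The helper name looksLikeEmail is hypothetical, and the ^ and $ anchors are an addition so that the whole input has to match. It only filters out obviously malformed input.

```javascript
// Strict pattern from the question: \w does not include "+", so plus-addressing fails.
const strict = /^(\w+)(\.\w+)*@(\w+)(\.\w+)+$/;

// Loose pattern from the answer, anchored here (an assumption) to cover the whole string.
const loose = /^\S+@\S+\.\S+$/;

// Hypothetical helper: true when the input has the rough shape something@something.something.
function looksLikeEmail(input) {
  return loose.test(input.trim());
}

console.log(strict.test("xxxx+wildcard@gmail.com"));    // false
console.log(looksLikeEmail("xxxx+wildcard@gmail.com")); // true
console.log(looksLikeEmail('"Test Me"@localhost'));     // false: contains a space and has no dot
```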

I liked SLak's answer. I actually have a regular expression I wrote a while back that's even more open-ended.
^.+@.+\..+$
The idea behind it is similar. Don't try so hard. Instead, err on the side of accepting too much. To my knowledge this regex should accept every conceivable valid email address (and some invalid ones as well).

Related

Is this input sanitization regex safe?

I have an input field where I expect the user to enter the name of a place (city/town/village/whatever). I have this function, which is used to sanitize the content of the input field.
sanitizeInput: function (input) {
  return input.replaceAll(/[&/\\#,+()$~%.^'":*?<>{}]/g, "");
}
I want to remove all special characters that I do not expect to appear in a place name. I thought a blacklist regex would be better than a whitelist regex because there are still many characters that might appear in a place name.
My questions are:
Is this regex safe?
Could it be improved?
Do you see a way to attack the program using this regex?
EDIT: This is a tiny frontend-only project. There is no backend.

Your regex is fine for removing the special characters it lists.
The answers are:
1. The regex is safe, but since this is a Vue.js project, the JS function runs in the browser. The browser is not a safe place to sanitize user input, because the user controls it. You should sanitize on a backend server as well to be safe.
2. You cannot improve the regex itself in this example. Instead of a regex, you could also call indexOf for each special character (this may be faster, but it is more verbose and needs much more code).
Like:
str.indexOf('&') !== -1
str.indexOf('#') !== -1
Etc.
3. Same as answer 1: the regex is safe, but because it is used in browser JS, the code can be disabled, so please do server-side validation as well.
If you have any issue with this answer, please let me know by comment or reply.
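For comparison, here is a rough sketch of both approaches as standalone functions. The character list is copied from the question's regex. The helper name hasSpecialChar and the constant names are made up for illustration. Note that replace with a /g regex behaves the same as replaceAll here.

```javascript
// Characters to strip, copied from the question's regex.
const SPECIAL = /[&/\\#,+()$~%.^'":*?<>{}]/g;

// Same behaviour as the question's method: removes every listed character.
function sanitizeInput(input) {
  return input.replace(SPECIAL, "");
}

// indexOf-style alternative from the answer: reports whether any listed character is present.
const SPECIAL_CHARS = "&/\\#,+()$~%.^'\":*?<>{}";
function hasSpecialChar(str) {
  for (const ch of SPECIAL_CHARS) {
    if (str.indexOf(ch) !== -1) return true;
  }
  return false;
}

console.log(sanitizeInput("<b>St. John's</b>")); // "bSt Johnsb"
console.log(hasSpecialChar("Aix-en-Provence"));  // false
```

As the first output shows, the blacklist also strips the dot and apostrophe from a name like "St. John's", so it is worth checking whether that is acceptable for your place names.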

It is important to remember that front-end sanitization is mainly there to improve the user experience and protect against accidental input errors. There are ways to get past front-end controls. For this reason, rely on sanitizing data on the backend for security purposes. This may not answer your question directly, but depending on what you use for a backend, you may need to sanitize certain things yourself, or it may have built-in controls that make further sanitization unnecessary.
P.S. Please forgive my lack of references, but it is worth researching on your own.

Incorrect Email validation hints

I'm not a big fan of email validation regexes, as I myself have come across a lot of sites whose regexes are too strict, and as a result I have not been able to use my preferred email.
Basically I use only .+@.+ just to make sure they are not forgetting the @.
What I'd like to do though is to give the user hints if he/she has entered an email that is LIKELY incorrect, such as one with typos or weird characters.
So if they enter for instance mike3292@hotmaik.com then I can ask the user if he is sure, and maybe even hint at the right solution in some cases.
So basically what I want is to know if there is any existing source of top email providers and common spelling mistakes. Also maybe a regex for unusual characters to warn the user about, asking him to double check.
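To make the idea concrete, here is a rough sketch of such a hint. It compares the domain against a list of known providers using edit distance. The list below is a tiny hypothetical sample (no real source of provider names is given here), and the function names and the threshold of 2 edits are assumptions, not a recommendation.

```javascript
// Hypothetical, deliberately tiny list of provider domains.
const KNOWN_DOMAINS = ["gmail.com", "hotmail.com", "yahoo.com", "outlook.com"];

// Levenshtein edit distance between two strings (single-row dynamic programming).
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diag = row[0]; // distance for the prefixes a[0..i-1), b[0..j-1)
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      row[j] = Math.min(
        above + 1,                              // delete a[i-1]
        row[j - 1] + 1,                         // insert b[j-1]
        diag + (a[i - 1] === b[j - 1] ? 0 : 1)  // substitute or keep
      );
      diag = above;
    }
  }
  return row[b.length];
}

// Returns a suggested address when the domain is close to a known one, otherwise null.
function suggestEmail(email, maxEdits = 2) {
  const at = email.lastIndexOf("@");
  if (at < 1) return null;
  const local = email.slice(0, at);
  const domain = email.slice(at + 1).toLowerCase();
  if (KNOWN_DOMAINS.includes(domain)) return null; // already a known domain

  let best = null;
  let bestDistance = maxEdits + 1;
  for (const known of KNOWN_DOMAINS) {
    const d = editDistance(domain, known);
    if (d < bestDistance) {
      best = known;
      bestDistance = d;
    }
  }
  return best === null ? null : `${local}@${best}`;
}

console.log(suggestEmail("mike3292@hotmaik.com")); // "mike3292@hotmail.com"
```

The result is only a suggestion to show the user ("Did you mean ...?"). The address as typed should still be accepted.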

Regex is really bad for validating emails. If you want to do a full/real validation you'll need a very complicated expression.
What I would recommend is to simply make sure it matches .*@.*\..* which would check for ---@---.---
And have the user enter it twice.
It makes it easy for you, easy for the user, and not annoying. I wouldn't like it if a pop-up suggested my name was an invalid email address.

Email validation in JavaScript when there are (soon to be) 1000's of TLD's

I just read an article which states:
Internet domain addresses opened up to wave of new suffixes
Internet naming board approves huge expansion of approved domain extensions with .hotel, .bank, or .sport auctions likely.
Twenty-six years after .com was first unveiled to the world, officials have swept away tight regulations governing website naming, opening up a whole world of personalised web address suffixes.
But... I just learned how to validate email addresses by checking (among other variables) the number of characters used after the dot (e.g., .com, .fr, etc.). What now?
Analysts say they expect 500 to 1,000 domain suffixes, mostly for companies and products looking to stamp their mark on web addresses, but also for cities and generic names such as .bank or .hotel.
Maybe this is not a problem. But how are we going to validate email addresses? What’s the plan?

IMO, the answer is to screw email validation beyond <anything>@<anything>, and deal with failed delivery attempts and errors in the email address (both of which are going to happen anyway).
Related:
How far should one take e-mail address validation?

As I've answered elsewhere, this regex is pretty good at handling localization and the new TLDs:
(?!^[.+&'_-]*@.*$)(^[_\w\d+&'-]+(\.[_\w\d+&'-]*)*@[\w\d-]+(\.[\w\d-]+)*\.(([\d]{1,3})|([\w]{2,}))$)
It does validate Jean+François@anydomain.museum and 试@例子.测试.مثال.آزمایشی, but it does not validate weird abuse of those nonalphanumeric characters, for example '.+@you.com'.

Validating email addresses beyond a check for basic, rough syntax is pointless. No matter how good a job you do, you cannot know that an address is valid without sending mail to it and getting an expected reply. The syntax for email addresses is complex and hard to check properly, and turning away a valid email address because your validator is inadequate is a terrible user experience mistake.

See What is the best regular expression for validating email addresses?.
With the current TLDs it is already all but impossible to verify an email address using regex (and that's not the fault of the TLDs). So don't worry about new TLDs.

The way I see it, the number of TLDs, while much larger than today's, will still be finite and deterministic - so a regex that checks against a complete list of possible domain suffixes (whether that list is your own or, hopefully, provided by a reliable third-party such as ICANN) would do the trick.
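A minimal sketch of that list-based idea, done with a set lookup rather than one giant regex. The six suffixes below are a hypothetical sample, and the function name hasKnownTld is made up. A real list would have to be loaded from an authoritative source and kept up to date.

```javascript
// Hypothetical sample, not a complete list of TLDs.
const KNOWN_TLDS = new Set(["com", "org", "net", "fr", "info", "museum"]);

// True when the text after the last dot of the domain part is in the list.
function hasKnownTld(email) {
  const domain = email.slice(email.lastIndexOf("@") + 1);
  const dot = domain.lastIndexOf(".");
  if (dot === -1) return false; // no dot in the domain at all
  return KNOWN_TLDS.has(domain.slice(dot + 1).toLowerCase());
}

console.log(hasKnownTld("curator@modern.museum")); // true
console.log(hasKnownTld("someone@example.hotel")); // false until "hotel" is added to the list
```

The second call shows the maintenance cost: every new suffix is rejected until the list catches up.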

Regex Comma Separated Emails

I am trying to get this Regex statement to work
^([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*(\.[a-z]{2,3})+(\s?[,]\s?|$))+$
for a string of comma separated emails in a textbox using jQuery('#textbox').val(); which passes the values into the Regex statement to find errors for a string like:
"test@test.com, test1@test.com,test2@test.com"
But for some reason it is returning an error. I tried running it through http://regexpal.com/ but I'm unsure what is wrong.
NB: This is just a basic client-side test. I validate emails via the MailClass on the server side using .NET 4.0, so don't jump down my throat about this. The aim here is to eliminate simple errors.
Escaped Version:
^([_a-z0-9-]+(\\.[_a-z0-9-]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*(\\.[a-z]{2,3})+(\\s?[,]\\s?|$))+$

You can greatly simplify things by first splitting on commas, as Pablo said, then repeatedly applying the regex to validate each individual email. You can also then point out the one that's bad -- but there's a big caveat to that.
Take a look at the regex in the article Comparing E-mail Address Validating Regular Expressions. There's another even better regex that I couldn't find just now, but the point is a correct regex for checking email is incredibly complicated, because the rules for a valid email address as specified in the RFC are incredibly complicated.
In yours, this part (\.[a-z]{2,3})+ jumped out at me; I often see the two-or-three-letters group {2,3} used as an attempt to validate the top-level domain, but (1) your regex allows one or more of these groups and (2) you will exclude valid email addresses from domains such as .info or .museum. (Many sites reject my .us address because they thought only 3-letter domains were legal.)
My advice to reject seriously invalid addresses, while leaving the final validation to the server, is to allow basically (anything)@(anything).(anything) -- check only for an "at" and a "dot", and of course allow multiple dots.
EDIT: Example for "simple" regex
[^@]+@[^.]+(\.[^.]+)+
This matches
test@test.com
test1@test.com
test2@test.com
foo@bar.baz.co.uk
myname@modern.museum
And doesn't match foo@this....that
Note: Even this will reject some valid email addresses, because anything is allowed on the left of the @ - even another @ - if it's all escaped properly. But I've never seen that in 25 years of using email in Real Life.
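Putting the two suggestions together (split on commas first, then apply the simple regex to each entry), a rough sketch could look like this. The function name findBadEmails is hypothetical, the ^ and $ anchors are an addition so each entry has to match in full, and empty entries (for example after a trailing comma) are simply skipped.

```javascript
// Simple pattern from the answer, anchored (an assumption) so each entry must match in full.
const simpleEmail = /^[^@]+@[^.]+(\.[^.]+)+$/;

// Hypothetical helper: returns the entries that fail the rough check.
function findBadEmails(list) {
  return list
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry !== "" && !simpleEmail.test(entry));
}

console.log(findBadEmails("test@test.com, test1@test.com,test2@test.com")); // []
console.log(findBadEmails("foo@bar.baz.co.uk, foo@this....that"));          // [ 'foo@this....that' ]
```

Because the result lists the failing entries, the page can point out exactly which address to fix.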

Regex Javascript

I am using the below regex in JavaScript for password policy check:
^.*(?=.{8,})(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[@#$_])(?=.*[\d\W]).*$
I tried the above regex using an online regex checker:
http://www.nvcc.edu/home/drodgers/ceu/resources/test_regexp.asp
The positive test cases passed as expected and the negative test cases failed. But the same regex does not validate properly when deployed in the application.
For example:
Tracker#123 does not work, whereas tRacker#123 works
Asd56544#12 also works fine.
Can you please point out what's wrong in the regex above?

My advice is to separate this regex into several simple regexes.
You can define rules for your password, and give every rule its own regex.
For example
Rule №1. Minimal length of password = 8 characters (can be done without regex)
Rule №2. At least one digit is required. ( /[0-9]/ )
Rule №3. At least one letter is required ( /[a-z]/i)
Rule №4. Illegal characters for password ( regex for some characters you don't want users to use in passwords)
Rule №n - some little regex
(and so on)
With this approach, it will be easier to manage your validation later on. For example, after a year you may have to change your password policy. By then you will have forgotten what your big regex means (and will spend a lot of time changing it, or writing a new one). But with small separate regexes (one per rule) you can easily reconfigure your password policy.
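A rough sketch of the rule-per-regex approach. The rule set mirrors the lookaheads in the question's regex. The special-character set @ # $ _ is an assumption about what the question intended, and the names rules and failedRules are made up.

```javascript
// One small check per rule; each rule carries the message to show when it fails.
const rules = [
  { message: "at least 8 characters", test: (pw) => pw.length >= 8 },
  { message: "a lowercase letter", test: (pw) => /[a-z]/.test(pw) },
  { message: "an uppercase letter", test: (pw) => /[A-Z]/.test(pw) },
  { message: "a digit", test: (pw) => /[0-9]/.test(pw) },
  // Assumed special-character set; adjust to the actual policy.
  { message: "one of @ # $ _", test: (pw) => /[@#$_]/.test(pw) },
];

// Returns the messages of all rules the password violates (empty array = valid).
function failedRules(password) {
  return rules.filter((rule) => !rule.test(password)).map((rule) => rule.message);
}

console.log(failedRules("Tracker#123")); // []
console.log(failedRules("tracker"));
```

Changing the policy later means adding, removing, or editing one entry in the array, and the user can be told exactly which rule failed.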

Are you sure your syntax is correct?
Have a look at this JSFiddle; in it, all the test cases pass:
http://jsfiddle.net/pCLpX/
