Email Validation RegEx username/local name length check not running - javascript

I've debugged for a few hours now and have hit a wall - regex has never been my strong suit. I have been able to alter the following regex to restrict the domain to 255 characters, but I'm running into issues implementing a 64 character limit on the local/username portion of an email address. I've gone through regex101 replacing +s and *s and trying to understand what each pass is doing. However, even when I add a check against all non-whitespace characters with a limit of 64, it seems like the other checks pass and take precedence, although I'm not sure. Below is my current regex, without any of the 64 character checks that I've broken it with:
var emailCheck = new RegExp(/^((([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+(\.{0,1}([a-z]|\d|[!#\$%&'\*\+\-\/=\?\^_`{\|}~]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])+)*)|((\x22)((((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(([\x01-\x08\x0b\x0c\x0e-\x1f\x7f]|\x21|[\x23-\x5b]|[\x5d-\x7e]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(\\([\x01-\x09\x0b\x0c\x0d-\x7f]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]))))*(((\x20|\x09)*(\x0d\x0a))?(\x20|\x09)+)?(\x22)))#((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]){1,255}([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF]){1,255}([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.*$/i);
What I have so far can be seen at http://jsfiddle.net/mtqx0tz1/ , I've made other slight alterations (e.g. not allowing consecutive dots) but for the most part this regex comes from another stack post without the character limits.
Lastly, I'm aware this isn't the 'standard' so to speak and emails are checked server-side; however, I would like to be more safe than sorry, as well as work on some of my regex. Sorry if this question isn't worthy of an actual post - I'm simply not seeing where in my passes {1,64} is failing. At this point I'm thinking about just taking the substring up to the @ sign and checking its length that way, but it would be nice to include it in this statement since all the checks are done here to begin with.

I have used this regex validation (C#) and it works well.
The e-mail address is in the variable strIn.
try
{
    // The 250 ms timeout guards against pathological inputs.
    return Regex.IsMatch(strIn,
        @"^(?("")("".+?(?<!\\)""@)|(([0-9a-z]((\.(?!\.))|[-!#\$%&'\*\+/=\?\^`\{\}\|~\w])*)(?<=[0-9a-z])@))" +
        @"(?(\[)(\[(\d{1,3}\.){3}\d{1,3}\])|(([0-9a-z][-\w]*[0-9a-z]*\.)+[a-z0-9][\-a-z0-9]{0,22}[a-z0-9]))$",
        RegexOptions.IgnoreCase, TimeSpan.FromMilliseconds(250));
}
catch (RegexMatchTimeoutException)
{
    return false;
}
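On the 64-character limit from the question: one option is to check the lengths separately from the syntax, or to prepend a lookahead to the existing pattern. Below is a minimal JavaScript sketch, assuming the local part is everything before the last @. The helper name hasValidLengths and the localLimit pattern are made up for illustration, and the lookahead version would reject quoted local parts that themselves contain an @.

```javascript
// Hypothetical helper: checks only the length limits, not the address syntax.
function hasValidLengths(email) {
  const at = email.lastIndexOf("@");
  if (at < 1) return false; // no "@", or nothing before it
  const local = email.slice(0, at);
  const domain = email.slice(at + 1);
  return local.length <= 64 && domain.length >= 1 && domain.length <= 255;
}

// The same local-part limit written as a lookahead. It could be placed
// right after the leading ^ of a larger pattern.
const localLimit = /^(?=[^@]{1,64}@)/;

console.log(hasValidLengths("a".repeat(64) + "@example.com")); // true
console.log(hasValidLengths("a".repeat(65) + "@example.com")); // false
console.log(localLimit.test("a".repeat(65) + "@example.com")); // false
```

Because a lookahead consumes no characters, the rest of the pattern still sees the whole string, so the length check and the syntax checks do not interfere with each other.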

Related

Validating client-side data input using a pattern

I am currently working on a project whereby data can be added into a database via a website. Currently I have managed to configure it so that the form accepts title, postcode, vehicle reg and ID number.
Javascript validation is working fine for these entries, with the exception of ID number. All ID numbers are a specific format (2 numbers followed by a . followed by 4 numbers).
I cannot seem to work out how to define the pattern.
Due to the size of my code, I have not posted the full code here (all is validating except this ID validation), but I've provided a snip of the 'if' statement below which I'm trying to come up with.
if (inputElement.id == "wid") {
    pattern = /^[a-zA-Z0-9 ]*$/;
    feedback = "Only 2 numbers followed by a . followed by 4 numbers are permitted";
}
I know that the pattern isn't correct here but I have searched for hours trying to locate some easy to explain guidance and cannot find anything which appears to be relevant.
Any thoughts would be appreciated.
Thank you
You can use something like https://regex101.com/ to test your regexes and see an explanation of them.
I think your pattern should be this: /^[0-9]{2}\.[0-9]{4}$/.
The first part ([0-9]{2}) makes sure that the id starts with 2 digits, then comes a dot \. (which must be escaped, because otherwise it means "any character"), and then 4 digits [0-9]{4}. The ^ and $ anchors make sure nothing else comes before or after.
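A quick check of that pattern against a few inputs; the sample IDs and the variable name idPattern are made up for illustration:

```javascript
const idPattern = /^[0-9]{2}\.[0-9]{4}$/;

console.log(idPattern.test("12.3456"));  // true: 2 digits, a dot, 4 digits
console.log(idPattern.test("123456"));   // false: the dot is missing
console.log(idPattern.test("12.345"));   // false: only 3 digits after the dot
console.log(idPattern.test("12.34567")); // false: $ rejects the extra digit
```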

Regex in Google Apps Script practical issue. Forms doesn't read regex as it should

I hope it's just something I'm not doing right.
I've been using a simple script to create a form out of a spreadsheet. The script seems to be working fine. The output form is going to get some inputs from third parties so I can analyze them in my consulting activity.
Creating the form was not a big deal, and the structure is good to go. However, after getting the form creator script working, I started working on its validations, and that's where I'm stuck.
For text validations, I will need to use specific regexes. Many of the inputs my clients need to give me are going to be places' and/or people's names, so I should only allow them to use A-Z, single spaces, apostrophes and dashes.
My resulting regexes are:
//Regex allowing a **single name** with the first letter capitalized and the occasional use of "apostrophes" or "dashes".
const reg1stName = /^[A-Z]([a-z\'\-])+/
//Should allow (a single name/surname) like Paul, D'urso, Mac'arthur, Saint-Germaine ecc.
//Regex allowing **composite names and places names** with the first letter capitalized and the occasional use of "apostrophes" or "dashes". It must avoid double spaces, however.
const regNamesPlaces = /^[^\s]([A-Z]|[a-z]|\b[\'\- ])+[^\s]$/
//This should allow (names/surnames/places' names) like Giulius Ceasar, Joanne D'arc, Cosimo de'Medici, Cosimo de Medici, Jean-jacques Rousseau, Firenze, Friuli Venezia-giulia, L'aquila ecc.
Further in the script, these regexes are passed as the validation patterns for the form's text items, according to each case.
//Validation for single names
var val1stName = FormApp.createTextValidation()
.setHelpText("Only the person First Name Here! Use only (A-Z), a single apostrophe (') or a single dash (-).")
.requireTextMatchesPattern(reg1stName)
.build();
//Validation for composite names and places names
var valNamesPlaces = FormApp.createTextValidation()
.setHelpText(("Careful with double spaces, ok? Use only (A-Z), a single apostrophe (') or a single dash (-)."))
.requireTextMatchesPattern(regNamesPlaces)
.build();
Further yet, i have a "for" loop that creates the form based on the spreadsheets fields. Up to this point, things are working just fine.
for(var i=0;i<numberRows;i++){
var questionType = data[i][0];
if (questionType==''){
continue;
}
else if(questionType=='TEXTNamesPlaces'){
form.addTextItem()
.setTitle(data[i][1])
.setHelpText(data[i][2])
.setValidation(valNamesPlaces)
.setRequired(false);
}
else if(questionType=='TEXT1stName'){
form.addTextItem()
.setTitle(data[i][1])
.setHelpText(data[i][2])
.setValidation(val1stName)
.setRequired(false);
}
The problem is when I run the script and test the resulting form.
Both validation types get imported just fine (as can be seen in the form's edit mode), but when testing it in preview mode I get an error, as if the regex wasn't matching (sorry, the error message is in Portuguese; I forgot to translate it as I did with the code above):
A screenshot of the form in edit mode
A screenshot of the form in preview mode
However, if I manually remove the slashes ("/ /") around this regex, it starts working!
A screenshot of the form in edit mode, regex without slashes
A screenshot of the form in preview mode, regex without slashes
What am I doing wrong? I'm no professional dev, but in my understanding it makes no sense to write a regex without slashes.
If this is some Google Forms way of reading regexes, I still need all of this to be read by the Apps Script that creates this form. If I try to pass the regex without the slashes there, the script will not be able to read it.
const reg1stName = ^[A-Z]([a-z\'])+
const regNamesPlaces = ^[^\s]([A-Z]|[a-z]|\b[\'\- ])+[^\s]$
//Can't even be saved. Returns: SyntaxError: Unexpected token '^' (line 29, file "Code.gs")
Entering all the validations manually is not an option. Can anybody help me?
Thanks so much
This
/^[A-Z]([a-z\'\-])+/
will not work because the form treats the pattern as plain text, so the enclosing / characters are matched as literal characters instead of being read as regex delimiters.
This
^[A-Z]([a-z\'\-])+
also will not work, because if the name is hyphenated, you will only match up to the hyphen. This will match the 'Some-' in 'Some-Name', for example. Also, perhaps you want a name like 'Saint John' to pass also?
I recommend the following :)
^[A-Z][a-z]*[-\.' ]?[A-Z]?[a-z]*
^ anchors to the start of the string
[A-Z] matches exactly 1 capital letter
[a-z]* matches zero or more lowercase letters (this enables you to match a name like D'Urso)
[-\.' ]? matches zero or 1 instances of - (hyphen), . (period), ' (apostrophe) or a single space (the . (period) needs to be escaped with a backslash because . is special to regex)
[A-Z]? matches zero or 1 capital letter (in case there's a second capital in the name, like D'Urso, St John, Saint-Germaine)
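To keep writing the regex with slashes in the script while handing the form a pattern without them, one option is the standard .source property of a RegExp, which returns the pattern text without the delimiters. This is a sketch in plain JavaScript; it assumes that requireTextMatchesPattern expects the pattern as text, and the names namePattern, patternText and matched are made up for illustration.

```javascript
const namePattern = /^[A-Z][a-z]*[-\.' ]?[A-Z]?[a-z]*/;

// Converting the regex to a string keeps the slashes; .source drops them.
console.log(String(namePattern)); // starts and ends with "/"
const patternText = namePattern.source;
console.log(patternText);         // same pattern, no surrounding slashes

// Assumption: in the form script this text would be what gets passed on,
// e.g. .requireTextMatchesPattern(patternText)

// What the recommended pattern actually matches at the start of a string:
const matched = (s) => {
  const m = namePattern.exec(s);
  return m ? m[0] : null;
};
console.log(matched("D'Urso"));         // "D'Urso"
console.log(matched("Saint-Germaine")); // "Saint-Germaine"
console.log(matched("paul"));           // null (no leading capital)
```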

using regular expression to hide email address from spam bots

I am dynamically rendering multiple email addresses (mailto: links) on a webpage.
I obviously need to hide these from spam bots.
The simplest solution that I found is this:
link
This involves putting fake characters ("X") within the email address and then removing them once the link is clicked, copied or pasted.
It works; however, the drawback is that it removes all "x"s from the address. Since I cannot guarantee that my dynamically rendered emails will not contain "x", this solution as-is is not right for me.
A better solution would be to put 3 or more 'X' at the start/end of each email address and then use the above code to remove them once the link is clicked,
i.e.:
<a href="mailto:XXXcontact@domain.comXXX"
onmouseover="this.href=this.href.replace(/x/g,'');">link</a>
What I now need to do is use a regular expression to remove only the first 3 'x' from the email address when it is clicked.
I tried the below but it did not work:
<a href="mailto:xxxcontact@domain.comXXX"
onmouseover="this.href=this.href.replace(^[\s\S]{0,3});">link</a>
The replace method expects two parameters: first the regex you're matching against, and second the value you want to replace matches with. The regex must be written as a literal between slashes (/.../) and can carry flags that change how it matches. For instance, g matches globally, i.e. every occurrence in the string rather than just the first, and i matches in a case-insensitive manner.
The regex you're after here would probably be more along the lines of:
^(mailto\:)x{3}(.*)x{3}$
That is, you're aiming to capture mailto:, which is expected at the beginning of the string, then to discard 3 x or X chars, followed by capturing the email address, but not the 3 x or X chars that are expected at the end of the string.
This would fit into the replace method in the following manner:
.replace(/^(mailto\:)x{3}(.*)x{3}$/i, '$1$2')
That said, would it not be fair to say that an email address could be inclined to include x or X characters consecutively? If so, you should either replace each occurrence of x{3} and the corresponding matches that you're prepending/appending to the email address with something less likely to be contained in an email address, or devise an alternative approach to the problem.
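Applied to the href from the question, the replacement looks like this; the variable names are only for illustration:

```javascript
const href = "mailto:xxxcontact@domain.comXXX";

// $1 is "mailto:", $2 is the address without the padding.
// The i flag lets x{3} match both "xxx" and "XXX".
const cleaned = href.replace(/^(mailto:)x{3}(.*)x{3}$/i, "$1$2");

console.log(cleaned); // "mailto:contact@domain.com"
```

Any x inside the address itself is left alone, because only the 3 characters right after mailto: and the 3 at the very end are dropped.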
You could try something along the lines of
link
It would basically replace the occurrences of ^$^ instead of something as common as X or XXX.
I would avoid adding more or less common characters in your mail address for obfuscation purposes. Rather try some kind of very basic encryption, such as toggling the bits or taking the string char by char, and increasing the char code by a fixed value.
Example:
var mailto = "mailto:contact@domain.com";
var obfuscated = "";
for (let i = 0; i < mailto.length; i++) {
obfuscated += String.fromCharCode(mailto.charCodeAt(i) + 7);
}
//obfuscated now looks like this: "thps{vAjvu{hj{Gkvthpu5jvt"
//to reverse the process, do the same thing and subtract 7.
//You could extract the code to a method that you simply call with "onmouseover"
Hope this helps, despite not precisely answering your question :)
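For completeness, a round trip of that idea as a small function; the name shift and the offset parameter are made up for illustration, and this is only obfuscation, not real encryption:

```javascript
// Shifts every character code in text by offset (use a negative offset to undo).
function shift(text, offset) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    out += String.fromCharCode(text.charCodeAt(i) + offset);
  }
  return out;
}

const obfuscated = shift("mailto:contact@domain.com", 7);
const restored = shift(obfuscated, -7);

console.log(obfuscated); // "thps{vAjvu{hj{Gkvthpu5jvt"
console.log(restored);   // "mailto:contact@domain.com"
```

The page would then carry only the obfuscated string, and something like onmouseover="this.href = shift(this.dataset.mail, -7)" could restore it, assuming the obfuscated value is stored in a data-mail attribute.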

Regular expression fails to match the plus sign ('+') in Angular 2, but it works fine in testers

Here is the code:
export const PASSWORD_PATTERN: RegExp = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])[a-zA-Z0-9`~!##$%^&*()\-_+={[}\]|\\:;"'<,>.?/]{8,}$/;
This variable is being used like this elsewhere:
Validators.pattern(PASSWORD_PATTERN);
The intention is for this code to validate passwords by making sure that they contain one lowercase letter, one uppercase letter, and one number. Passwords may contain any number of special characters, and those characters are the ones that can be found on a standard keyboard (e.g. ~ * ( } ; + ). As of now, the regular expression will match passwords containing every single special character except for the plus sign ('+'). I've tried replacing '+' with '\+' and '\\+' in the regex, but that hasn't changed the result. At one point, I got rid of every special character in the regex except for the plus sign, to test it by itself, and once again using '+', '\+', and '\\+' in the regex wouldn't produce any matches for passwords containing a plus sign.
Using the regexp I pasted earlier, this password is considered a match:
Password1`~!##$%^&*()-_=[{]}\|;:'",<.>/?
While this password isn't considered a match:
Password1`~!##$%^&*()-_=[{]}\|;:'",<.>/?+
The only difference between those two passwords is the single plus sign at the end, and the second password isn't a match whether the regex contains +, \+, or \\+.
The regular expression is working completely on the backend, though it has been modified for the language being used primarily on the backend.
Not the prettiest thing, but this seems to work (or at least go in the right direction):
let characters = /((.*[A-Z])((.*[a-z].*\d)|(.*\d.*[a-z])).*)|((.*[a-z])((.*[A-Z].*\d)|(.*\d.*[A-Z])).*)|((.*\d)((.*[a-z].*[A-Z])|(.*[A-Z].*[a-z])).*)/;
// At least 8 characters.
let length = /.{8,}/;
let good1 = 'abc+Def8';
let good2 = '8b2cDde+';
let bad1 = 'abc+def8'; // no uppercase letter
let bad2 = 'Ab+cde';   // no digit, and too short
let bad3 = 'ab+cD2';   // too short
console.log(
characters.test(good1) && length.test(good1),
characters.test(good2) && length.test(good2),
characters.test(bad1) && length.test(bad1),
characters.test(bad2) && length.test(bad2),
characters.test(bad3) && length.test(bad3)
); // true true false false false
Testing it using this site, and the JS code at the bottom it looks like it is working. How are you testing to make sure the passwords are valid?
In the new regEx I changed all the symbols to [\S]. That will match all chars that are not whitespace. Do you want to limit the passwords to just the special chars that are normally used on a keyboard, or allow all of the possible ones? If the latter, you should use [\S]. In fact, there is no real reason why you shouldn't allow all characters (except end of line) in a password, in which case you should replace [\S] with a dot (.).
How are you handling the password value? Are you passing it to the back end in plain text? A nice way to handle it might be to check that the password matches the regex you have on the front end, hash the password using SHA-256 (and salt it with a unique user string if you want to go nice and overkill), then pass that back to the server. The server would then salt and hash the password again before storing it in a table to be compared against when the user next logs in.
This adds an extra layer of security for the user. Assuming you are using an SSL connection between the user and the server this is not really needed (ALWAYS USE SSL), but it is always nice to be a little more on the safe side when it comes to users' passwords. That being said, this will not prevent someone from logging in as a user if they pull off a successful man-in-the-middle attack, because when a password is hashed on the user's side, the hash simply becomes the credential that is sent to the server and validated against. However, this avoids you ever having knowledge of the user's actual password, so if someone does intercept the hash (or, say, the server leaks some of its pre-hashing data; looking at you, Heartbleed!), they can't easily go to a bunch of other sites and try the user's username/password combo.
const regEx = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])[\S]{8,}$/;
const good1 = 'A12345678a+'
const good2 = 'Aaascfsfas1##$#+#$%'
const example1 = 'Password1`~!##$%^&*()-_=[{]}\|;:",<.>/?'
const example2 = 'Password1`~!##$%^&*()-_=[{]}\|;:",<.>/?+'
const bad1 = 'a'
const bad2 = 'aasdfasfdsfads#+#$%$##'
console.log(
regEx.test(good1), regEx.test(good2),
regEx.test(example1), regEx.test(example2),
regEx.test(bad1), regEx.test(bad2)
)
It turns out that when I was appending the password to the request parameters I needed to wrap it in the encodeURIComponent() function. Not sure why, but all of the regular expression patterns I create are now working as expected.
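A likely explanation, assuming the password was being appended to a URL-encoded query string or form body: in that encoding a literal + is decoded as a space, so a password ending in + would arrive with a trailing space instead and fail a pattern that does not allow spaces. The sketch below uses the standard URLSearchParams class to stand in for whatever decodes the parameters on the receiving side; the parameter name password is made up.

```javascript
const password = "Password1+";

// Appended as-is: the "+" is read back as a space.
const raw = "password=" + password;
const receivedRaw = new URLSearchParams(raw).get("password");

// Appended through encodeURIComponent: "+" travels as %2B and survives.
const encoded = "password=" + encodeURIComponent(password);
const receivedEncoded = new URLSearchParams(encoded).get("password");

console.log(encoded);                     // "password=Password1%2B"
console.log(JSON.stringify(receivedRaw)); // "Password1 " (trailing space)
console.log(receivedEncoded);             // "Password1+"
```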

Performance issue while evaluating email address with a regular expression

I am using below regular expression to validate email address.
/^\w+([\.-]?\w+)*@\w+([\.-]?w+)*(\.\w{2,3})+$/
JavaScript code:
var email = 'myname@company.com';
var pattern = /^\w+([\.-]?\w+)*@\w+([\.-]?w+)*(\.\w{2,3})+$/;
if(pattern.test(email)){
return true;
}
The regex evaluates quickly when I provide the below invalid email:
aseflj#$kajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com
(I added #$ in the middle of the name)
However, when I try to evaluate this email it takes too much time and the browser hangs.
asefljkajsdfklasjdfklasjdfklasdfjklasdjfaklsdfjaklsdjfaklsfaksdjfkasdasdklfjaskldfjjdkfaklsdfjlak@company.com1
(I added com1 at the end)
I'm sure that the regex is correct, but I'm not sure why it's taking so much time to evaluate the second example. If I provide a shorter email it evaluates quickly. See the below example:
dfjjdkfaklsdfjlak@company.com1
Please help me fix the performance issue
Your regex runs into catastrophic backtracking. Since [\.-]? in ([\.-]?\w+)* is optional, the group can degenerate to (\w+)*, which is a classic case of catastrophic backtracking.
Removing the ? resolves the issue.
I also removed the redundant escape of . inside the character class, and changed the regex a bit.
^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$
Do note that many new generic TLDs have more than 3 characters. Even some of the gTLDs from before the expansion have more than 3 characters, such as .info.
And as it is, the regex also doesn't support internationalized domain names.
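A quick way to check the revised pattern against the kind of input from the question; the variable names and the generated 90-character local part are made up for illustration:

```javascript
const pattern = /^\w+([.-]\w+)*@\w+([.-]\w+)*\.\w{2,3}$/;

// A long local part followed by an invalid ending, like the input that hung.
const longInvalid = "a".repeat(90) + "@company.com1";

console.log(pattern.test("myname@company.com")); // true
console.log(pattern.test(longInvalid));          // false, and it returns immediately
```

With the separator no longer optional, each run of word characters can only be split at an actual . or -, so the engine has just one way to read the local part and gives up quickly when the end of the string does not fit.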
