I'm trying to create a regex pattern that allows the user to create a username with the following specifications. (For the purposes of this initial pattern, I'm only using the standard American English alphabet.)
The first character must be an alphabetic letter (uppercase or lowercase). [a-zA-Z]
The last character must be an alphanumeric character (uppercase or lowercase). [a-zA-Z0-9]
Any characters in between must be letters or numbers with one rule:
The user can use a period(.), dash(-), or underscore(_) but it must be followed by an alphanumeric character. So no repeats of one or more of these characters at a time.
I've tried the following regex pattern but am not getting the results I was hoping for. Thanks for taking the time to help me on this.
^[a-zA-Z]([a-zA-Z0-9]+[._-]?[a-zA-Z0-9]+)+$
EDIT
It might actually be working the way I expected. But I'm always getting two matches returned to me. The first one being the entire valid string, the second being a shortened version of the first string usually chopping off the first couple of characters.
Examples of valid inputs:
Spidy
Spidy.Man
Ama-za-zing_Spidy
Examples of invalid inputs:
Extreme___Spidy (repeated underscores)
The_-_Spidy (repeated special characters)
_ _ SPIDY _ _ (starts and ends with special characters)
Sounds like this pattern:
^[a-zA-Z]([._-]?[a-zA-Z0-9])*$
^[a-zA-Z]([._-]?[a-zA-Z0-9]+)*$
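A minimal sketch of how the first pattern above could be checked against the sample inputs from the question. The name isValidUsername is made up for illustration. The group is written as non-capturing, (?:...), on the assumption that the "second match" described in the edit is simply the capture group being returned alongside the full match.

```javascript
// Hypothetical helper; the pattern is the first one suggested above,
// with the group made non-capturing so only the full match is reported.
const USERNAME = /^[a-zA-Z](?:[._-]?[a-zA-Z0-9])*$/;

function isValidUsername(name) {
  return USERNAME.test(name);
}

const samples = [
  "Spidy",
  "Spidy.Man",
  "Ama-za-zing_Spidy",
  "Extreme___Spidy",
  "The_-_Spidy",
  "_ _ SPIDY _ _",
];

for (const s of samples) {
  console.log(JSON.stringify(s), isValidUsername(s));
}
```

The first three inputs should come back true and the last three false. A separator is only accepted when an alphanumeric character follows it, which is what rules out repeats and trailing separators.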
I have added a regex which will check for at least 8 characters, at least one number, one uppercase letter, one lowercase letter and one special character.
This is the regex used:
'(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#$!%*?&])[A-Za-zd$#$!%*?&].{8,}'
The above regex works fine for most of the scenarios but when I use
1oB!gb0s5
or
Pass#123
It fails. Can anyone tell me what the issue is here?
Here is the portion of your regex which actually consumes the input:
[A-Za-zd$#$!%*?&].{8,}
This means that the password must start with one of the characters in the above character class. It also means that a valid password must have nine or more characters, because the class counts for one, and {8,} means 8 or more. So the following would fail because it does not begin with any such character:
1oB!gb0s5
The second example you gave fails for a different reason, because it only has 8 characters:
Pass#123
I don't know exactly what logic you want here. If you just want to ensure that a password has a lowercase, uppercase, number, and special character, then maybe you can remove the leading character class and just stick with the lookaheads:
(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#$!%*?&]).{8,}
Here is a demo which shows that your two example passwords would pass using the above pattern.
Demo
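As a rough check in JavaScript, here are both patterns run against the two passwords from the question. The variable names are made up, and the ^ and $ anchors on the second pattern are an addition for whole-string checks with test(), not part of the pattern as posted.

```javascript
// The pattern from the question: a leading character class plus .{8,}
// means at least nine characters, the first of which must be in the class.
const original =
  /(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#$!%*?&])[A-Za-zd$#$!%*?&].{8,}/;

// Lookaheads only, then any eight or more characters.
// The anchors are an assumption added for whole-string validation.
const relaxed = /^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[$#$!%*?&]).{8,}$/;

for (const pw of ["1oB!gb0s5", "Pass#123"]) {
  console.log(pw, original.test(pw), relaxed.test(pw));
}
```

Both passwords should fail the original pattern and pass the relaxed one.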
I need a regular expression for validating a hashtag. Each hashtag should start with a hash sign ("#").
Valid inputs:
1. #hashtag_abc
2. #simpleHashtag
3. #hashtag123
Invalid inputs:
1. #hashtag#
2. #hashtag#hashtag
I have been trying with this regex /#[a-zA-z0-9]/ but it is accepting invalid inputs also.
Any suggestions for how to do it?
The current accepted answer fails in a few places:
It accepts hashtags that have no letters in them (i.e. "#11111", "#___" both pass).
It will exclude hashtags that are separated by spaces ("hey there #friend" fails to match "#friend").
It doesn't allow you to place a min/max length on the hashtag.
It doesn't offer a lot of flexibility if you decide to add other symbols/characters to your valid input list.
Try the following regex:
/(^|\B)#(?![0-9_]+\b)([a-zA-Z0-9_]{1,30})(\b|\r)/g
It'll close up the above edge cases, and furthermore:
You can change {1,30} to your desired min/max
You can add other symbols to the [0-9_] and [a-zA-Z0-9_] blocks if you wish to later
Here's a link to the demo.
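A small sketch of how this regex might be used to pull tags out of free text in JavaScript. The helper name extractHashtags is hypothetical; group 2 of each match holds the tag text without the leading #.

```javascript
// The regex from this answer. Group 2 holds the tag text without the '#'.
const HASHTAG = /(^|\B)#(?![0-9_]+\b)([a-zA-Z0-9_]{1,30})(\b|\r)/g;

// Hypothetical helper: collect every tag found in a piece of text.
function extractHashtags(text) {
  return Array.from(text.matchAll(HASHTAG), (m) => m[2]);
}

console.log(extractHashtags("hey there #friend"));
console.log(extractHashtags("#11111 and #___ are skipped, #tag_2 is kept"));
```

The first call should yield only "friend" and the second only "tag_2", since the negative lookahead rejects tags made up entirely of digits and underscores.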
To answer the current question...
There are 2 issues:
[A-z] allows more than just letter chars ([, \, ], ^, _, ` )
There is no quantifier after the character class and it only matches 1 char
Since you are validating the whole string, you also need anchors (^ and $) to ensure a full string match:
/^#\w+$/
See the regex demo.
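To make the two issues concrete, here is a short JavaScript check. It first shows that [A-z] accepts the six punctuation characters that sit between Z and a, then runs the anchored pattern over the inputs from the question. The names sloppy and isHashtag are made up for this sketch.

```javascript
// [A-z] spans code points 0x41..0x7A, which includes [ \ ] ^ _ and `.
const sloppy = /^[A-z]$/;
const extras = ["[", "\\", "]", "^", "_", "`"];
console.log(extras.every((c) => sloppy.test(c)));

// Whole-string validation: anchors plus a quantifier on the word class.
const isHashtag = (s) => /^#\w+$/.test(s);

for (const s of ["#hashtag_abc", "#simpleHashtag", "#hashtag123", "#hashtag#", "#hashtag#hashtag"]) {
  console.log(s, isHashtag(s));
}
```

The three valid inputs should pass and the two invalid ones should fail, because the second # is not a word character.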
If you want to extract specific valid hashtags from longer texts...
This is a bonus section as a lot of people seek to extract (not validate) hashtags, so here are a couple of solutions for you. Just mind that \w in JavaScript (and a lot of other regex libraries) is equal to [a-zA-Z0-9_]:
#\w{1,30}\b - a # char followed with one to thirty word chars followed with a word boundary
\B#\w{1,30}\b - a # char that is either at the start of string or right after a non-word char, then one to thirty word (i.e. letter, digit, or underscore) chars followed with a word boundary
\B#(?![\d_]+\b)(\w{1,30})\b - # that is either at the start of string or right after a non-word char, then one to thirty word (i.e. letter, digit, or underscore) chars (that cannot be just digits/underscores) followed with a word boundary
And last but not least, here is a Twitter hashtag regex from https://github.com/twitter/twitter-text/tree/master/js... Sorry, too long to paste in the SO post, here it is: https://gist.github.com/stribizhev/715ee1ee2dc1439ffd464d81d22f80d1.
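To see how the three extraction patterns differ, here is a small comparison on one made-up sample string. The sample text and the keys of the patterns object are assumptions chosen for illustration.

```javascript
// Sample text: one '#' glued to a word, one plain tag,
// one digits-only tag, and one mixed tag.
const text = "mail#tag #ok #123 #a1";

const patterns = {
  plain: /#\w{1,30}\b/g,
  boundary: /\B#\w{1,30}\b/g,
  noDigitsOnly: /\B#(?![\d_]+\b)(\w{1,30})\b/g,
};

const results = {};
for (const [name, re] of Object.entries(patterns)) {
  // With the g flag, match() returns all full matches (or null if none).
  results[name] = text.match(re) || [];
  console.log(name, results[name]);
}
```

The first pattern should pick up all four tags, the second should drop the one glued to "mail", and the third should additionally drop "#123".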
You could try this: /#[a-zA-Z0-9_]+/
This will only include letters, numbers & underscores.
A regex code that matches any hashtag.
In this approach any character is accepted in hashtags except main signs !##$%^&*()
(?<=(\s|^))#[^\s\!\#\#\$\%\^\&\*\(\)]+(?=(\s|$))
Usage Notes
Turn on "g" and "m" flags when using!
It is tested for Java and JavaScript languages via https://regex101.com and VSCode tools.
It is available on this repo.
Unicode general categories can help with that task:
/^#[\p{L}\p{Nd}_]+$/gu
I use \p{L} and \p{Nd} unicode categories to match any letter or decimal digit number. You can add any necessary category for your regex. The complete list of categories can be found here: https://unicode.org/reports/tr18/#General_Category_Property
Regex live demo:
https://regexr.com/5tvmo
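A quick sketch of the same character class used for validation in JavaScript. One caveat worth showing: a regex with the g flag keeps lastIndex state between test() calls, so this sketch drops the g flag for the validator. The constant names are made up.

```javascript
// Same pattern as above, without the g flag, for repeated test() calls.
const UNICODE_HASHTAG = /^#[\p{L}\p{Nd}_]+$/u;

console.log(UNICODE_HASHTAG.test("#h\u00e9llo"));
console.log(UNICODE_HASHTAG.test("#привет123"));
console.log(UNICODE_HASHTAG.test("#hash#tag"));

// With the g flag, a successful test() leaves lastIndex at the end of the
// string, so an immediate second call on the same input reports false.
const withGlobal = /^#[\p{L}\p{Nd}_]+$/gu;
const firstCall = withGlobal.test("#abc");
const secondCall = withGlobal.test("#abc");
console.log(firstCall, secondCall);
```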
A useful and tested regex for detecting hashtags in text:
/(^|\s)(#[a-zA-Z\d_]+)/ig
examples of valid matching hashtag:
#abc
#ab_c
#ABC
#aBC
/\B(?:#|#)((?![\p{N}_]+(?:$|\b|\s))(?:[\p{L}\p{M}\p{N}_]{1,60}))/ug
It allows characters from any language, optionally mixed with numbers or _.
Numbers alone, or numbers with _, are not allowed.
It's a Unicode regex, so if you are using Python, you may need to install the regex module.
to test it https://regex101.com/r/NLHUQh/1
I am trying to figure out how to require both letters and numbers, without any other characters. So literally [a-z] and (\d or [0-9]), depending on which is the better way of matching numbers.
So if I had a string that requires validation:
$toValidate = 'Q23AS9D0APQQ2'; // It may start with letter or number, both cases possible.
And then if I had validation for it:
return /([a-z].*[0-9])|([0-9].*[a-z])/i.test($toValidate);
I used an i flag here because it could be that user enters it lowercase or uppercase, it's user preference... So that regex fails... It accepts special characters also, so that is not desired effect.
With the validation above, this passes as well:
$toValidate = 'asdas12312...1231#asda___213-1';
Then I tried something crazy and I don't even know what I have done, so if anyone could explain it to me alongside the correct answer, I'd truly appreciate it.
return /([a-z].*\d)+$|(\d.*[a-z])+$/i.test($toValidate);
This seemed to work great. But then when I tried to continue typing letters or numbers after a special character, it still validated as true.
Example:
$toValidate = 'q2IasK231#!#!#_+123';
So please help me understand regular expressions better and tell me how to validate the string at the beginning of my question. Letters and numbers are expected in the string.
To allow only letters and digits with at least one letter and at least one digit use:
/^(?=.*?\d)(?=.*?[a-zA-Z])[a-zA-Z\d]+$/
Regex breakdown:
^ # start of input
(?=.*?\d) # lookahead to make sure at least one digit is there
(?=.*?[a-zA-Z]) # lookahead to make sure at least one letter is there
[a-zA-Z\d]+ # regex to match 1 or more of digit or letters
$ # end of input
You should not use .* in the consuming part of your regex, otherwise it will allow any character in the input.
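Here is a short sketch that runs the lookahead pattern against the strings from the question, plus a letters-only and a digits-only input added for illustration. The constant name is made up.

```javascript
// Two lookaheads require a digit and a letter somewhere in the input;
// the consuming part then allows only letters and digits up to the end.
const LETTERS_AND_DIGITS = /^(?=.*?\d)(?=.*?[a-zA-Z])[a-zA-Z\d]+$/;

const inputs = [
  "Q23AS9D0APQQ2",
  "asdas12312...1231#asda___213-1",
  "q2IasK231#!#!#_+123",
  "abcdef",
  "123456",
];

for (const s of inputs) {
  console.log(s, LETTERS_AND_DIGITS.test(s));
}
```

Only the first input should pass. The next two contain other characters, and the last two lack a digit and a letter respectively.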
What you are looking for, in pseudocode, is:
START-ANCHOR [a-zA-Z0-9] 0->Inf times, ([a-zA-Z][0-9] OR [0-9][a-zA-Z]), [a-zA-Z0-9] 0->Inf times END-ANCHOR
In words: start with anything from your language, end with anything from your language, and contain a seam from a letter to a digit or the other way around.
It should be like this:
/^([a-z0-9]*(([a-z][0-9])|([0-9][a-z]))[a-z0-9]*)$/i
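The "seam" idea can be sketched in JavaScript as below. The pattern is written with no whitespace inside it, since a space in a regex literal matches a literal space, and with non-capturing groups. The constant name SEAM is made up.

```javascript
// Any alphanumerics, then a letter/digit or digit/letter pair, then any
// alphanumerics. The pair guarantees at least one letter and one digit.
const SEAM = /^[a-z0-9]*(?:[a-z][0-9]|[0-9][a-z])[a-z0-9]*$/i;

for (const s of ["Q23AS9D0APQQ2", "a1", "1a", "abc", "123", "a#1"]) {
  console.log(s, SEAM.test(s));
}
```

The first three inputs should pass. The others fail because they lack a digit, lack a letter, or contain a character outside the class.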
You can combine more than one type of character within the brackets. So, the following regex should work:
/^([a-z0-9]+)$/i
The ^ and $ match the start and end of the string, so the whole string will be tested for the conditions. The [a-z0-9] makes it match only letters and numbers. The + makes it match one or more characters. And the "i" at the end, as you know, makes it case insensitive.
I wrote a regular expression for the cases below:
only numbers(length:4)
only alphabets(should contain vowel)
([0-9]{1,4})|((?=[a-z]*[aeiou])[a-z]*)
eg: 9987, tyde
How do I add the condition below?
Ignore the first two cases if the string contains alphanumeric characters.
eg: 9ty87
If I've deciphered your question correctly, I think you are looking for this:
a string with only digits and between one and four characters
a string with only letters with at least a vowel
a string with only letters and digits with at least one letter and one digit.
pattern:
/^(?:[0-9]{1,4}|[bcdfghj-np-tv-z]*[aeiou][a-z]*|[a-z]+[0-9][a-z0-9]*|[0-9]+[a-z][a-z0-9]*)$/i
or more factorized
/^(?:[0-9]{1,4}(?:[0-9]*[a-z][a-z0-9]*)?|[bcdfghj-np-tv-z]*(?:[aeiou][a-z]*|[a-z]+[0-9][a-z0-9]*))$/i
It is a simple alternation (I don't think you need something more complicated). So only one of the branches will succeed.
Note that the anchors ^ and $ are essential for this kind of task to ensure that the whole string is taken into account.
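As a sanity check, here are both patterns run side by side in JavaScript on the examples from the question, plus three extra inputs chosen for illustration ("tyd" has no vowel, "12345" is too long for the digits-only case, "ab#1" has a stray symbol). The constant names are made up.

```javascript
const ALTERNATION =
  /^(?:[0-9]{1,4}|[bcdfghj-np-tv-z]*[aeiou][a-z]*|[a-z]+[0-9][a-z0-9]*|[0-9]+[a-z][a-z0-9]*)$/i;
const FACTORIZED =
  /^(?:[0-9]{1,4}(?:[0-9]*[a-z][a-z0-9]*)?|[bcdfghj-np-tv-z]*(?:[aeiou][a-z]*|[a-z]+[0-9][a-z0-9]*))$/i;

const cases = ["9987", "tyde", "9ty87", "tyd", "12345", "ab#1"];
for (const s of cases) {
  // Both patterns are expected to agree on every input.
  console.log(s, ALTERNATION.test(s), FACTORIZED.test(s));
}
```

The first three inputs should pass with both patterns and the last three should fail.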
I'm learning Javascript via an online tutorial, but nowhere on that website or any other I googled for was the jumble of symbols explained that makes up a regular expression.
Check if all numbers: /^[0-9]+$/
Check if all letters: /^[a-zA-Z]+$/
And the hardest one:
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
What do all the slashes and dollar signs and brackets mean? Please explain.
(By the way, what languages are required to create a flexible website? I know a bit of Javascript and wanna learn jQuery and PHP. Anything else needed?)
Thanks.
There are already a number of good sites that explain regular expressions so I'll just dive a bit into how each of the specific examples you gave translate.
Check if all numbers: ^ anchors the start of the expression (i.e. the match must start at the beginning of the text). Without it a match could be found anywhere. [0-9] matches the characters in that character class (i.e. the digits 0-9). The + after the character class just means "one or more". The ending $ anchors the end of the text (i.e. the match should run to the end of the input). So if you put that together, that regular expression allows only a string of one or more digits. Note that the anchors are important, as without them it might match something like "foo123bar".
Check if all letters: Pretty much the same as above but the character classes are different. In this example the character class [a-zA-Z] represents all lowercase and uppercase letters.
The last one actually isn't any more difficult than the other two, it's just longer. This answer is getting quite long so I'll just explain the new symbols. A \w in a character class will match word characters (which are defined per regex implementation but are generally 0-9a-zA-Z_ at least). The backslash before the # escapes the # so that it isn't seen as a token in the regex. Outside a character class, a period will match any character, so .+ will match one or more of any character (e.g. a, 1, Z, 1a, etc); inside a class such as [\w-.+], the period and plus are literal characters. The last part of the regex ({2,4}) defines an interval expression. This means that it can match a minimum of 2 of the thing that precedes it, and a maximum of 4.
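A small JavaScript sketch of two of the points above: how a period and plus behave inside versus outside a character class, and what the {2,4} interval accepts. The constant names are made up, and the dash is placed last in the class here to keep the example unambiguous.

```javascript
// Inside [...] the period and plus are ordinary characters.
const userChars = /^[\w.+-]+$/;
console.log(userChars.test("a.b+c-d"));
console.log(userChars.test("a%b"));

// Outside a class, .+ means "one or more of any character".
const anything = /^.+$/;
console.log(anything.test("a%b"));

// {2,4} accepts two to four repetitions of the preceding class.
const shortSuffix = /^[a-zA-Z0-9]{2,4}$/;
for (const s of ["c", "com", "info", "museum"]) {
  console.log(s, shortSuffix.test(s));
}
```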
Hope you got something out of the above.
There is an awesome explanation of regular expressions at http://www.regular-expressions.info/ including notes on language and implementation specifics.
Let me explain:
Check if all numbers: /^[0-9]+$/
So, the first thing we see is the "/" at the beginning and the end. This is a delimiter, and only serves to show the beginning and end of the regular expression.
Next, we have a "^", which means the beginning of the string. [0-9] means a number from 0-9. + is a quantifier, which modifies the term in front of it; in this case, it means you can have one or more of something, so you can have one or more numbers from 0-9.
Finally, we end with "$", which is the opposite of "^", and means the end of the string. So put that all together and it basically makes sure that in between the start and end of the string, there are one or more digits from 0-9 and nothing else.
Check if all letters: /^[a-zA-Z]+$/
We notice this is very similar, but instead of checking for numbers 0-9, it checks for letters a-z (lowercase) and A-Z (uppercase).
And the hardest one:
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
"\w" matches a word character, so here we can have any number of letters, numbers, or underscores. Inside the square brackets the dash, period, and plus are literal characters; it is only outside a character class that a period means pretty much any character.
The new thing here is escape characters. Many symbols cannot be used without escaping them by placing a backslash in front, as is the case with "\#". This means it is looking directly for the symbol "#".
Now it looks for letters and symbols, then a period (this one seems incorrect: it should be escaping the period too, though it will still work, since an unescaped period will match any symbol). Numbers inside {} mean that the previous term repeats between that many times, so there should be 2-4 characters from [a-zA-Z0-9] (this part here is the website domain, such as .com, .ca, or .info). Note there's another error in this one: the [a-zA-z0-9] should be [a-zA-Z0-9] (capital Z).
Oh, and check out that site listed above, it is a great set of tutorials too.
Regular expressions are a complex beast and, as already pointed out, there are quite a few guides on Google you can go read.
To answer the OP questions:
Check if all numbers: /^[0-9]+$/
regexps here are all delimited with //, much like strings are quoted with '' or "".
^ means start of string or line (depending on what options you have about multiline matching)
[...] are called character classes. Anything in [] is a list of single matching characters at that position in this case 0-9. The minus sign has a special meaning of "sequence of characters between". So [0-9] means "one of 0123456789".
+ means "1 or more" of the preceding match (in this case [0-9]), so one or more numbers
$ means end of string/line match.
So in summary, find any string that contains only numbers, i.e. '0123a' will not match, as [0-9]+ fails to match the a before $.
Check if all letters: /^[a-zA-Z]+$/
Hopefully [A-Za-z] makes sense now (A-Z = ABCDEF...XYZ and a-z = abcdef...xyz)
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Not all regexp parsers know the \w sequence. JavaScript, Java and Perl, I know, do support it.
I have already covered /^ at the beginning. For this [] match we are looking for \w, -, . and +. I think that regexp is incorrect: either the minus sign should be escaped with \ or it should be at the end of the [] (i.e. [\w+.-]). But that is an aside; they are basically attempting to allow any of abcdefghijklmnopqrstuvwxyz01234567890-.+
so fred.smith-foo+wee#mymail.com will match but fred.smith%foo+wee#mymail.com won't (the % is not matched by [\w.+-]).
\# is the literal at sign (it is escaped because Perl would otherwise expand it as an array variable reference)
[a-zA-Z0-9.-]+ is nearly the same as [\w.-]+ (\w also allows the underscore). Very much like the user part of the match, but does not match +. So this matches foo.com. and google.co. but not my+foo.com or my***domain.co.
. means match any one character. This again is incorrect, as fred#foo%com will match, since . matches %*^%$£! etc. This should have been written as \.
The last character class [a-zA-z0-9]{2,4} looks for 2, 3 or 4 of the a-zA-Z0-9 characters specified in the character class (much like + looks for "1 or more", {2,4} means at least 2 with a maximum of 4 of the preceding match). So 'foo' matches, '11' matches, '11111' does not match and 'information' does not.
The "tweaked" regexp should be:
/^[\w.+-]+\#[a-zA-Z0-9.-]+\.[a-zA-Z0-9]{2,4}$/
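To illustrate the difference, here is a short JavaScript comparison of the pattern from the question and the tweaked one, using the sample addresses mentioned above. In this sketch the last class of the tweaked pattern is spelled [a-zA-Z0-9]; the variable names are made up.

```javascript
// Pattern from the question: the period before the last class is unescaped.
const original = /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/;

// Tweaked pattern: dash moved to the end of the class, period escaped.
const tweaked = /^[\w.+-]+\#[a-zA-Z0-9.-]+\.[a-zA-Z0-9]{2,4}$/;

const addresses = [
  "fred.smith-foo+wee#mymail.com",
  "fred.smith%foo+wee#mymail.com",
  "fred#foo%com",
];

for (const a of addresses) {
  console.log(a, original.test(a), tweaked.test(a));
}
```

The point to look for is the last input: the original pattern accepts it because the unescaped period matches the %, while the tweaked pattern rejects it.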
I'm not doing a tutorial on RegEx's, that's been done really well already, but here are what your expressions mean.
/^<something>$/ String begins, has something in the middle, and then immediately ends.
/^foo$/.test('foo'); // true
/^foo$/.test('fool'); // false
/^foo$/.test('afoo'); // false
+ One or more of something:
/a+/.test('cot');//false
/a+/.test('cat');//true
/a+/.test('caaaaaaaaaaaat');//true
[<something>] Include any characters found between the brackets (includes ranges like 0-9, a-z, and A-Z, as well as special codes like \w for 0-9a-zA-Z_).
/^[0-9]+/.test('f00')//false
/^[0-9]+/.test('000')//true
{x,y} between X and Y occurrences
/^[0-9]{1,2}$/.test('12');// true
/^[0-9]{1,2}$/.test('1');// true
/^[0-9]{1,2}$/.test('d');// false
/^[0-9]{1,2}$/.test('124');// false
So, that should cover everything, but for good measure:
/^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Begins with at least one character from \w, -, +, or . followed by a #, followed by at least one character in the set a-zA-Z0-9.- followed by one character of anything (. means anything, they meant \.), followed by 2-4 characters of a-zA-z0-9.
As a side note, this regular expression to check emails is not only dated, but it is very, very, very incorrect.