JavaScript RegExp to match a (partial) hour - javascript

I want to allow people to enter times into a textbox in various formats. One of the formats would be either:
2h for 2 hours, or
2.5h for 2 and a half hours
I want to use a regex to recognise the pattern but it's not picking it up for some reason:
I have:
var hourRegex = /^\d{1,2}[\.\d+]?[h|H]$/;
which works for 2h, but not for 2.5h.
I thought that this regex would mean - Start at the beginning of the string, have one or two digits, then have none or one decimal points which if present must be followed by one or more digits then have a h or a H and then it must be the end of the string.
I have tried the regex tool here but no luck.

/^\d{1,2}(?:\.\d+)?h$/i; Use parentheses instead of square braces.
Start at the beginning
One or two digits
Optional: a dot followed by at least one digit
End with a h
Case insensitive
RegExp tuturial
[...] - square braces mean: anything which is within the provided range.
[^...] means: Match a character which is not within the provided range
(...) - parentheses mean: Group me. Optionally, the first characters of a group can start with:
?: - Don't reference me (me, I = group)
?= - Don't include me in the match, though I have to be here
?! - I may not show up at this point
{a,b}, {a,} means: At least a, maximum b characters. Omitting b = Infinity
+ means: at least one time, match as much as possible equivalen to {1,}
* means: match as much as possible equivalent to {0,}
+? and *? have the same effect as previously described, with one difference: Match as less as possible
Examples
[a-z] One character, any character between a, b, c, ..., z
(a-z) Match "a-z", and group it
[^0-9] Match any non-number character
See also
MDN: Regular Expressions - A more detailed guide

The trouble is here :
[\.\d+]
you can not use character classes inside brackets.
Use this instead:
(\.[0-9]+)?

You've confused your square brackets with your parenthesis. Square brackets look for a single match of any contained character, whereas parenthesis look for a match of the entire enclosed pattern.
Your issue lies in [\.\d+]? It's looking for . or 0-9 or +.
Instead you should try:
/^\d{1,2}(\.\d+)?(h|H)$/
Although that will still allow users to enter invalid numbers, such as 99.3 which is probably not the expected behavior.

Related

Forcing a Strict Character Order in a Regex Expression

I'm trying to create a regex in Javascript that has a limited order the characters can be placed in, but I'm having trouble getting the validation to be fully correct.
The criteria for the expression is a little complicated. The user must input strings with the following criteria:
The string contains two parts, an initial group, and an end group.
The groups are separated by a colon (:).
Strings are separated by a semi-colon (;).
The initial group can start with one optional forward-slash and end with one optional forward-slash, but these forward-slashes may not appear anywhere else in the group.
Inside forward-slashes, one optional underscore may appear on either end, but they may not appear anywhere else in the group.
Inside these optional elements, the user may enter any number of numbers or letters, uppercase or lowercase, but exactly one of these characters must be surrounded with angular brackets (<>).
If the letter inside the brackets is an uppercase C, it may be followed by one of a lowercase u or v.
The end group may contain one or more of a number or letter, uppercase or lowercase (If it is an uppercase C, it can be followed by a lowercase u or v.) or one asterisk (*), but not both.
A string must be able to validate with multiple groupings.
This probably sounds a little confusing.
For example, the following examples are valid:
<C>:Cu;
<Cu>:Cv;
/_V<C>V:C;
/_VV<Cv>VV_/:Cu;
_<V>:V1;
_<V>_:V1;
_<V>/:V1;
_<V>:*;
_<m>:n;
The following are invalid:
Cu:Cv;
Cu:Cv
CuCv;
<Cu/>:Cv;
<Cu_>:Cv;
<Cu>:Cv/;
_/<Cu>:Cv;
<Cu>/_:Cv;
They should validate when grouped together like so.
<Cu>:Cv;/_V<C>V:C;_<V>:V1;_<V>/:V1;_<V>:*;_<m>:n;
Hopefully, these examples help you understand what I'm trying to match.
I created the following regexp and tested it on Regex101.com, but this is the closest I could come:
\\/{0,1}_{0,1}[A-Za-z0-9]{0,}<{1}[A-Za-z0-9]{1,2}>{1}[A-Za-z0-9]{0,}_{0,1}\\/{0,1}):([A-Za-z0-9]{1,2}|\\*;$
It's mostly correct, but it allows strings that should be invalid such as:
_/<C>:C;
If an underscore comes before the first forward-slash, it should be rejected. Otherwise, my regexp seems to be correct for all other cases.
If anyone has any suggestions on how to fix this, or knows of a way to match all criteria much more efficiently, any help is appreciated.
The following seems to fulfill all the criteria:
(?:^|;)(\/?_?[a-zA-Z0-9]*<(?:[a-zA-Z]|C[uv]?)>[a-zA-Z0-9]*_?\/?):([a-zA-Z0-9]+|\*)(?=;|$)
Regex101 demo.
It puts each of the "groups" in a capturing group so you can access them individually.
Details:
(?:^|;) A non-capturing group to make sure the string is either at the beginning or starts with a semicolon.
( Start of group 1.
\/?_? An optional forward-slash followed by an optional underscore.
[a-zA-Z0-9]* Any letter or number - Matches zero or more.
<(?:[a-zA-Z]|C[uv]?)> Mandatory <> pair containing one letter or the capital letter C followed by a lowercase u or v.
[a-zA-Z0-9]* Any letter or number - Matches zero or more.
_?\/? An optional underscore followed by an optional forward-slash.
) End of group1.
: Matches a colon character literally.
([a-zA-Z0-9]+|\*) Group 2 - containing one or more numbers or letters or a single * character.
(?=;|$) A positive Lookahead to make sure the string is either followed by a semicolon or is at the end.
Did you mean this?
/^(?:(^|\s*;\s*)(?:\/_|_)?[a-z]*<[a-z]+>[a-z]*_?\/?:(?:[a-z0-9]+|\*)(?=;))+;$/i
We start with a case-insensitive expression /.../i to keep it more readable. You have to rewrite it to a case-sensitive expression if you only want to allow uppercase at the beginning of a word.
^ means the begin of the string. $ means the end of the string.
The whole string ends with ';' after multiple repeatitions of the inner expression (?:...)+ where + means 1 or more ocurrences. ;$ at the end includes the last semicolon into the result. It is not necessary for a test only, since the look-ahead already does the job.
(^|\s*;\s*) every part is at the begin of the string or after a semicolon surrounded by arbitrary whitespaces including linefeed. Use \n if you do not want to allow spaces and tabs.
(?:...|...) is a non-captured alternative. ? after a character or group is the quantifier 0/1 - none or once.
So (?:\/_|_)? means '/', '' or nothing. Use \/?_? if you do want to allow strings starting with a single slash as well.
[a-z]*<[a-z]+>[a-z]* 0 or more letters followed by <...> with at least one letter inside and again followed by 0 or more letters.
_?\/?: optional '_', optional '/', mandatory : in this sequence.
(?:[a-z0-9]+|\*) The part after the colon contains letters and numbers or the asterisk.
(?=;) Look-ahead: Every group must be followed by a semicolon. Look-ahead conditions do not move the search position.

Regex for digits and hyphen only

I am trying to understand regex, for digits of length 10 I can simply do
/^[0-9]{10}$/
for hyphen only I can do
/^[-]$/
combining the two using group expression will result in
/^([0-9]{10})|([-])$/
This expression does not work as intended, it somehow will match part of the string instead of not match at all if the string is invalid.
How do I make the regex expression that accepts only "-" or 10 digits?
It would have worked fine to combine your two regexps exactly as you had them. In other words, just use the alternation/pipe operator to combine
/^[0-9]{10}$/
and
/^[-]$/
as is, directly into
/^[0-9]{10}$|^[-]$/
↑↑↑↑↑↑↑↑↑↑↑ ↑↑↑↑↑ YOUR ORIGINAL REGEXPS, COMBINED AS IS WITH |
This can be represented as
and that would have worked fine. As others have pointed out, you don't need to specify the hyphen in a character class, so
/^[0-9]{10}$|^-$/
↑ SIMPLIFY [-] TO JUST -
Now, we notice that each of the two alternatives has a ^ at the beginning and a $ at the end. That is a bit duplicative, and it also makes it little harder to see immediately that the regexp is always matching things from beginning to end. Therefore, we can rewrite this, as explained in other answers, by taking the ^ and $ out of both sub-regexps, and combine their contents using the grouping operator ():
/^([0-9]{10}|-)$/
↑↑↑↑↑↑↑↑↑↑↑↑↑ GROUP REGEXP CONTENTS WITH PARENS, WITH ANCHORS OUTSIDE
The corresponding visualization is
That would also work fine, but you could use \d instead of [0-9], so the final, simplest version is:
/^(\d{10}|-)$/
↑↑ USE \d FOR DIGITS
and this visualizes as
If for some reason you don't want to "capture" the group, use (?:, as in
/^(?:\d{10}|-)$/
↑↑ DON'T CAPTURE THE GROUP
and the visualization now shows that group is not captured:
By the way, in your original attempt to combine the two regexps, I noticed that you parenthesized them as in
/^([0-9]{10})|([-])$/
↑↑↑↑↑↑↑↑↑↑↑ ↑↑↑↑↑ YOU PARENTHESIZED THE SUB-REGEXPS
But actually this is not necessary, because the pipe (alternation, of "or") operator has low precedence already (actually it has the lowest precedence of any regexp operator); "low precedence" means it will apply only after things on both side are already processed, so what you wrote here is identical to
/^[0-9]{10}|[-]$/
which, however, still won't work for the reasons mentioned in other answers, as is clear from its visualization:
How do I make the regex expression that accepts only "-" or 10 digits?
You can use:
/^([0-9]{10}|-)$/
RegEx Demo
Your regex is just asserting presence of hyphen in the end due to misplacements of parentheses.
Here is the effective breakdown of OP's regex:
^([0-9]{10}) # matches 10 digits at start
| # OR
([-])$ # matches hyphen at end
which will cause OP's regex to match any input starting with 10 digits or ending with hyphen making these invalid inputs also a valid match:
1234567890111
1234----
------------------
1234567890--------
To get the regex expression that accepts only "-" or 10 digits - change your regexp as shown below:
^(\d{10}|-)$
DEMO link
The problem with your regex is it's looking for strings either
starting with 10 digits i.e. ^([0-9]{10}) or
ends with "-" - i.e. ([-])$
You needs an addtional wrapping ^( .. )$ to get this work. i.e.
/^(([0-9]{10})|([-]))$/
Better yet /^([0-9]{10}|-)$/ since [-] and - are both the same.

regex allows one character (it should not) why?

Hello I am trying to create a regex that recognizes money and numbers being inputted. I have to allow numbers because I am expecting non-formatted numbers to be inputted programmatically and then I will format them myself. For some reason my regex is allowing a one letter character as a possible input.
[\$]?[0-9,]*\.[0-9][0-9]
I understand that my regex accepts the case where multiple commas are added and also needs two digit after the decimal point. I have had an idea of how to fix that already. I have narrowed it down to possibly the *\. as the problem
EDIT
I found the regex expression that worked [\$]?([0-9,])*[\.][0-9]{2} but I still don't know how or why it was failing in the first place
I am using the .formatCurrency() to format the input into a money format. It can be found here but it still allows me to use alpha characters so i have to further masked it using the $(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" }); where input mask is found here and $(this) is a reference to a input element of type text. My code would look something like this
<input type="text" id="123" data-Money="true">
//in the script
.find("input").each(function () {
if ($(this).attr("data-Money") == "true") {
$(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });
$(this).on("blur", function () {
$(this).formatCurrency();
});
I hope this helps. I try creating a JSfiddle but Idk how to add external libraries/plugin/extension
The "regular expression" you're using in your example script isn't a RegExp:
$(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });
Rather, it's a String which contains a pattern which at some point is being converted into a true RegExp by your library using something along the lines of
var RE=!(value instanceof RegExp) ? new RegExp(value) : value;
Within Strings a backslash \ is used to represent special characters, like \n to represent a new-line. Adding a backslash to the beginning of a period, i.e. \., does nothing as there is no need to "escape" the period.
Thus, the RegExp being created from your String isn't seeing the backslash at all.
Instead of providing a String as your regular expression, use JavaScript's literal regular expression delimiters.
So rather than:
$(this).inputmask('Regex', { regex: "[\$]?([0-9,])*[\.][0-9]{2}" });
use
$(this).inputmask('Regex', { regex: /[\$]?([0-9,])*[\.][0-9]{2}/ });
And I believe your "regular expression" will perform as you expect.
(Note the use of forward slashes / to delimit your pattern, which JavaScript will use to provide a true RegExp.)
Firstly, you can replace '[0-9]' with '\d'. So we can rewrite your first regex a little more cleanly as
\$?[\d,]*\.\d\d
Breaking this down:
\$? - A literal dollar sign, zero or one
[\d,]* - Either a digit or a comma, zero or more
\. - A literal dot, required
\d - A digit, required
\d - A digit, required
From this, we can see that the minimum legal string is \.\d\d, three characters long. The regex you gave will never validate against any one character string.
Looking at your second regex,
[\$]? - A literal dollar sign, zero or one
([0-9,])* - Either a digit or a comma, subexpression for later use, zero or more
[\.] - A literal dot, required
[0-9]{2} - A digit, twice required
This has the exact same minimum matchable string as above - \.\d\d.
edit: As mentioned, depending on the language you may need to escape forward slashes to ensure they aren't misinterpretted by the language when processing the string.
Also, as an aside, the below regex is probably closer to what you need.
[A-Z]{3} ?(\d{0,3}(?:([,. ])\d{3}(?:\2\d{3})*)?)(?!\2)[,.](\d\d)\b
Explanation:
[A-Z]{3} - Three letters; for an ISO currency code
? - A space, zero or more; for readability
( - Capture block; to catch the integer currency amount
\d{0,3} - A digit, between one and three; for the first digit block
(?: - Non capturing block (NC)
([,. ]) - A comma, dot or space; as a thousands delimiter
\d{3} - A digit, three; the first possible whole thousands
(?: - Non capturing block (NC)
\2 - Match 2; the captured thousands delimiter above
\d{3} - A digits, three
)* - The above group, zero or more, i.e. as many thousands as we want
)? - The above (NC) group, zero or one, ie. all whole thousands
) - The above group, i.e everything before the decimal
[.,] - A comma or dot, as a decimal delimiter
(\d{2}) - Capture, A digit, two; ie. the decimal portion
\b - A word boundry; to ensure that we don't catch another
digit in the wrong place.
The negative lookahead was provided by an answer from John Kugelman in this question.
This correctly matches (matches enclosed in square brackets):
[AUD 1.00]
[USD 1,300,000.00]
[YEN 200 000.00]
I need [USD 1,000,000.00], all in non-sequential bills.
But not:
GBP 1.000
YEN 200,000

Please explain some Javascript Regular Expressions

I'm learning Javascript via an online tutorial, but nowhere on that website or any other I googled for was the jumble of symbols explained that makes up a regular expression.
Check if all numbers: /^[0-9]+$/
Check if all letters: /^[a-zA-Z]+$/
And the hardest one:
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
What do all the slashes and dollar signs and brackets mean? Please explain.
(By the way, what languages are required to create a flexible website? I know a bit of Javascript and wanna learn jQuery and PHP. Anything else needed?)
Thanks.
There are already a number of good sites that explain regular expressions so I'll just dive a bit into how each of the specific examples you gave translate.
Check if all numbers: ^ anchors the start of the expression (e.g. start at the beginning of the text). Without it a match could be found anywhere. [0-9] finds the characters in that character class (e.g. the numbers 0-9). The + after the character class just means "one or more". The ending $ anchors the end of the text (e.g. the match should run to the end of the input). So if you put that together, that regular expression would allow for only 1 or more numbers in a string. Note that the anchors are important as without them it might match something like "foo123bar".
Check if all letters: Pretty much the same as above but the character classes are different. In this example the character class [a-zA-Z] represents all lowercase and uppercase characters.
The last one actually isn't any more difficult than the other two it's just longer. This answer is getting quite long so I'll just explain the new symbols. A \w in a character class will match word characters (which are defined per regex implementation but are generally 0-9a-zA-Z_ at least). The backslash before the # escapes the # so that it isn't seen as a token in the regex. A period will match any character so .+ will match one or more of any character (e.g. a, 1, Z, 1a, etc). The last part of the regex ({2,4}) defines an interval expression. This means that it can match a minimum of 2 of the thing that precedes it, and a maximum of 4.
Hope you got something out of the above.
There is an awesome explanation of regular expressions at http://www.regular-expressions.info/ including notes on language and implementation specifics.
Let me explain:
Check if all numbers: /^[0-9]+$/
So, first thing we see is the "/" at the beginning and the end. This is a deliminator, and only serves to show the beginning and end of the regular expression.
Next, we have a "^", this means the beginning of the string. [0-9] means a number from 0-9. + is a modifier, which modifies the term in front of it, in this case, it means you can have one or more of something, so you can have one or more numbers from 0-9.
Finally, we end with "$", which is the opposite of "^", and means the end of the string. So put that all together and it basically makes sure that inbetween the start and end of the string, there can be any number of digits from 0-9.
Check if all letters: /^[a-zA-Z]+$/
We notice this is very similar, but instead of checking for numbers 0-9, it checks for letters a-z (lowercase) and A-Z (uppercase).
And the hardest one:
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
"\w" means that it is a word, in this case we can have any number of letters or numbers, as well as the period means that it can be pretty much any character.
The new thing here is escape characters. Many symbols cannot be used without escaping them by placing a slash in front, as is the case with "\#". This means it is looking directly for the symbol "#".
Now it looks for letters and symbols, a period (this one seems incorrect, it should be escaping the period too, though it will still work, since an unescaped period will make any symbol). Numbers inside {} mean that there is inbetween this many terms in the previous term, so of the [a-zA-Z0-9], there should be 2-4 characters (this part here is the website domain, such as .com, .ca, or .info). Note there's another error in this one here, the [a-zA-z0-9] should be [a-zA-Z0-9] (capital Z).
Oh, and check out that site listed above, it is a great set of tutorials too.
Regular Expressions is a complex beast and, as already pointed out, there are quite a few guides off of google you can go read.
To answer the OP questions:
Check if all numbers: /^[0-9]+$/
regexps here are all delimated with //, much like strings are quoted with '' or "".
^ means start of string or line (depending on what options you have about multiline matching)
[...] are called character classes. Anything in [] is a list of single matching characters at that position in this case 0-9. The minus sign has a special meaning of "sequence of characters between". So [0-9] means "one of 0123456789".
+ means "1 or more" of the preceeding match (in this case [0-9]) so one or more numbers
$ means end of string/line match.
So in summary find any string that contains only numbers, i.e '0123a' will not match as [0-9]+ fails to match a before $).
Check if all letters: /^[a-zA-Z]+$/
Hopefully [A-Za-z] makes sense now (A-Z = ABCDEF...XYZ and a-z abcdef...xyz)
Validate Email: /^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Not all regexp parses know the \w sequence. Javascript, java and perl I know do support it.
I have already have covered '/^ at the beginning, for this [] match we are looking for
\w - . and +. I think that regexp is incorrect. Either the minus sign should be escaped with \ or it should be at the end of the [] (i.e [\w+.-]). But that is an aside they are basically attempting to allow anything of abcdefghijklmnopqrstuvwxyz01234567890-.+
so fred.smith-foo+wee#mymail.com will match but fred.smith%foo+wee#mymail.com wont (the % is not matched by [\w.+-]).
\# is the litteral atsil sign (it is escaped as perl expands # an array variable reference)
[a-zA-Z0-9.-]+ is the same as [\w.-]+. Very much like the user part of the match, but does not match +. So this matches foo.com. and google.co. but not my+foo.com or my***domain.co.
. means match any one character. This again is incorrect as fred#foo%com will match as . matches %*^%$£! etc. This should of been written as \.
The last character class [a-zA-z0-9]{2,4} looks for between 2 3 or 4 of the a-zA-Z0-9 specified in the character class (much like + looks for "1 more more" {2,4} means at least 2 with a maximum of 4 of the preceeding match. So 'foo' matches, '11' matches, '11111' does not match and 'information' does not.
The "tweaked" regexp should be:
/^[\w.+-]+\#[a-zA-Z0-9.-]+\.[a-zA-z0-9]{2,4}$/
I'm not doing a tutorial on RegEx's, that's been done really well already, but here are what your expressions mean.
/^<something>$/ String begins, has something in the middle, and then immediately ends.
/^foo$/.test('foo'); // true
/^foo$/.test('fool'); // false
/^foo$/.test('afoo'); // false
+ One or more of something:
/a+/.test('cot');//false
/a+/.test('cat');//true
/a+/.test('caaaaaaaaaaaat');//true
[<something>] Include any characters found between the brackets. (includes ranges like 0-9, a-z, and A-Z, as well as special codes like \w for 0-9a-zA-Z_-
/^[0-9]+/.test('f00')//false
/^[0-9]+/.test('000')//true
{x,y} between X and Y occurrences
/^[0-9]{1,2}$/.test('12');// true
/^[0-9]{1,2}$/.test('1');// true
/^[0-9]{1,2}$/.test('d');// false
/^[0-9]{1,2}$/.test('124');// false
So, that should cover everything, but for good measure:
/^[\w-.+]+\#[a-zA-Z0-9.-]+.[a-zA-z0-9]{2,4}$/
Begins with at least character from \w, -, +, or .. Followed by an #, followed by at least one in the set a-zA-Z0-9.- followed by one character of anything (. means anything, they meant \.), followed by 2-4 characters of a-zA-z0-9
As a side note, this regular expression to check emails is not only dated, but it is very, very, very incorrect.

Regular Expression for date validation - Explain

I was surfing online for date validation, but didn't exactly understand the regex. Can anyone explain it? I'm confused with ?, {} and $. Why do we need them?
dateReg = /^[0,1]?\d{1}\/(([0-2]?\d{1})|([3][0,1]{1}))\/(([1]{1}[9]{1}[9]{1}\d{1})|([2-9]{1}\d{3}))$/;
? means “zero or one occurences”.
{x} (where x is a number) means “exactly x occurences”
$ means “end of line”
These are very basic regex, I recommand you to read some documentation.
^ = beginning of the string
[0,1]? = optional zero, one or comma (the comma is probably an error)
\d{1} = exactly one digit (the {1} is redundant)
\/ = a forward slash
[0-2]? = optional zero, one or two (range character class) followed by any single digit (\d{1})
OR [3] = three (character class redundant here) followed by exactly one zero, one or comma
\/ = forward slash
[1]{1}[9]{1}[9]{1}\d{1} = 199 followed by any digit
OR 2-9 followed by any 3 digits
Overall, that's a really poorly written expression. I'd suggest finding a better one, or using a real date parser.
In Javascript you could validate date by passing it to Date.Parse() function. Successful conversion to a date object means you have a valid date.
Wouldn't recommend using regex for this. Too many edge cases and code gets hard to maintain.
? means "Zero or one of the aforementioned"
{n} means "exactly n of the aforementioned"
$ is the end of the String (Thanks #Andy E)
To summarize briefly:
`?' will match 0 or 1 times the pattern group you put in front of it. In this case, it's possibly being misused and should be left out, but it all depends on just what you want to match.
`{x}' tells the regex to match the preceding pattern group exactly x times.
`$' means to match the end of the line.
Well:
^ // start of the text
$ // end of the text
X{n} // number n inside these curly parenthesis define how many exact occurrences of X
X{m,n} // between m to n occurrences of X
X? // 0 or 1 occurrence of X
\d // any digits 0-9
For more help about Javascript date validation please see: Regular Expression to only grab date

Categories

Resources