Can somebody explain this RegEx to me? - javascript

I see this line of code and the regular expression just panics me...
quickExpr = /^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)/
Can someone please explain little by little what it does?
Thanks,G

Here's what I can extract:
^ beginning of string.
(?: non-capturing group.
[^#<]* any number of consecutive characters that aren't # or <.
(<[\w\W]+>) a group that matches strings like <anything_goes_here>.
[^>]* any number of characters in sequence that aren't a >.
The part after the | denotes a second regex to try if the first one fails. That one is #([\w\-]*):
# matches the # character. Not that complex.
([\w\-]*) is a group that matches any number of word characters or dashes. Basically Things-of-this-form
$ marks the end of the string.
I'm no regex pro, so please correct me if I am wrong.
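To see the two alternatives in action, here is a minimal sketch that runs the pattern and checks which capture group was filled. The classify helper and its return shape are made up for illustration; only the regex itself comes from the question.

```javascript
// The pattern from the question, unchanged.
const quickExpr = /^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)/;

// classify() is a hypothetical helper, not part of any library.
// Group 1 is filled by the first alternative (something that looks like HTML),
// group 2 by the second alternative (a "#" followed by word characters or dashes).
function classify(input) {
  const m = quickExpr.exec(input);
  if (m === null) return { kind: "other", value: null };
  if (m[1] !== undefined) return { kind: "html", value: m[1] };
  return { kind: "id", value: m[2] };
}

console.log(classify("#main-content"));   // second alternative
console.log(classify("  <div>hi</div>")); // first alternative
console.log(classify("div.item"));        // neither alternative
```

Only one of the two groups can be set for any given match, because they sit in different branches of the alternation.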

^(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)
Assert position at the start of the string «^»
Match the regular expression below «(?:[^#<]*(<[\w\W]+>)[^>]*$|#([\w\-]*)$)»
Match either the regular expression below (attempting the next alternative only if this one fails) «[^#<]*(<[\w\W]+>)[^>]*$»
Match a single character NOT present in the list "#<" «[^#<]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match the regular expression below and capture its match into backreference number 1 «(<[\w\W]+>)»
Match the character "<" literally «<»
Match a single character present in the list below «[\w\W]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a "word character" (letters, digits, etc.) «\w»
Match a single character that is a "non-word character" «\W»
Match the character ">" literally «>»
Match any character that is not a ">" «[^>]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Or match regular expression number 2 below (the entire group fails if this one fails to match) «#([\w\-]*)$»
Match the character "#" literally «#»
Match the regular expression below and capture its match into backreference number 2 «([\w\-]*)»
Match a single character present in the list below «[\w\-]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a "word character" (letters, digits, etc.) «\w»
A - character «\-»
Assert position at the end of the string (or before the line break at the end of the string, if any) «$»
Created with RegexBuddy

Related

RegExp avoid double space and space before characters

I'm trying to write a regular expression in order to not allow double spaces anywhere in a string, and also force a single space before a MO or GO mandatory, with no space allowed at the beginning and at the end of the string.
Example 1 : It is 40 GO right
Example 2 : It is 40GO wrong
Example 3 : It is 40  GO wrong (two spaces before GO)
Here's what I've done so far ^[^ ][a-zA-Z0-9 ,()]*[^;'][^ ]$, which prevents spaces at the beginning and at the end, and also the ";" character. This one works like a charm.
My issue is not allowing double spaces anywhere in the string, and also forcing spaces right before MO or GO characters.
After a few hours of research, I've tried these (starting from the previous RegExp I wrote):
To prevent the double spaces: ^[^ ][a-zA-Z0-9 ,()]*((?!.* {2}).+)[^;'][^ ]$
To force a single space before MO: ^[^ ][a-zA-Z0-9 ,()]*(?=\sMO)*[^;'][^ ]$
But neither of the last two actually work. I'd be thankful to anyone that helps me figure this out
The lookahead (?!.* {2}) can be omitted. Instead, start and end the match with a non-whitespace character, and allow a single space only inside an optionally repeated group.
If the string can not contain a ' or ; then using [^;'][^ ]$ means that the second last character should not be any of those characters.
But you can omit that part, as the character class [a-zA-Z0-9,()] does not match ; and '
Note that character classes like [^ ] and [^;'] each require exactly one character, so the pattern you tried has a minimum length.
Instead, you can rule out the presence of GO or MO preceded by a non whitespace character.
^(?!.*\S[MG]O\b)[a-zA-Z0-9,()]+(?: [a-zA-Z0-9,()]+)*$
The pattern matches:
^ Start of string
(?!.*\S[MG]O\b) Negative lookahead: assert that no non-whitespace character directly followed by MO or GO occurs to the right. The word boundary \b prevents a partial word match
[a-zA-Z0-9,()]+ Start the match with 1+ occurrences of any of the listed characters (Note that there is no space in it)
(?: [a-zA-Z0-9,()]+)* Optionally repeat the same character class with a leading space
$ End of string
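A quick way to check the pattern against the examples from the question is a small test loop. This is only a sketch; the variable names and sample strings are mine, and the "right"/"wrong" labels from the question are left out of the inputs.

```javascript
// Pattern from the answer above, unchanged.
const sizePattern = /^(?!.*\S[MG]O\b)[a-zA-Z0-9,()]+(?: [a-zA-Z0-9,()]+)*$/;

// Sample inputs (assumed for illustration).
const sizeSamples = [
  "It is 40 GO",   // single space before GO
  "It is 40GO",    // no space before GO
  "It is 40  GO",  // double space
  " It is 40 GO",  // leading space
  "It is 40 GO "   // trailing space
];

for (const s of sizeSamples) {
  console.log(JSON.stringify(s), sizePattern.test(s));
}
```

Only the first sample passes. The second is rejected by the lookahead, the others by the body of the pattern, which never allows two spaces in a row or a space at either end.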

Javascript regex negation - Negate period on email regex

I want an existing email regex to fail when entering a period (".") before the #.
This is the regex I have right now:
^[a-zA-Z]+[a-zA-Z0-9.]+#domain.com$
These should pass:
test.a#domain.com
a.test#domain.com
But these shouldn't:
.test#domain.com
test.#domain.com
The first case starting with period is handled but second case is not.
This should work without requiring two or more characters before the # sign.
^[a-zA-Z][a-zA-Z0-9]*(?:\.+[a-zA-Z0-9]+)*#domain\.com$
Here's how it breaks down:
^ Make sure we start at the beginning of the string
[a-zA-Z] First character needs to be a letter
[a-zA-Z0-9]* ...possibly followed by any number of letters or numbers.
(?: Start a non-capturing group
\.+ Match any periods...
[a-zA-Z0-9]+ ...followed by at least one letter or number
)* The whole group can appear zero or more times, to
offset the + quantifiers inside. Otherwise the
period would be required
#domain\.com$ Match the rest of the string. At this point, the
only periods we've allowed are followed by at
least one number or letter
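Here is a small sketch that runs the pattern against the four addresses from the question plus a one-letter local part. The helper name is hypothetical, and the # separator is kept exactly as it is written in the question.

```javascript
// Pattern from the answer above, unchanged.
const localPartPattern = /^[a-zA-Z][a-zA-Z0-9]*(?:\.+[a-zA-Z0-9]+)*#domain\.com$/;

// isAllowedAddress() is a made-up helper name for illustration.
function isAllowedAddress(address) {
  return localPartPattern.test(address);
}

const addresses = [
  "test.a#domain.com",
  "a.test#domain.com",
  ".test#domain.com",
  "test.#domain.com",
  "a#domain.com"
];

for (const address of addresses) {
  console.log(address, isAllowedAddress(address));
}
```

The trailing-period case fails because every run of periods inside the group must be followed by at least one letter or digit before the # is reached.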
I would try:
^[a-zA-Z]+[a-zA-Z0-9.]*[a-zA-Z0-9]+#domain.com$
Try this regex: ^[\w.+-]*[^\W.]#domain\.com$.
[\w.+-]* matches any number of alphanumerical characters, +, - and .
[^\W.] matches any character that is not a non-alphanumerical character or a . (which means any accepted character but .)
#domain\.com matches the rest of the email; change the domain as you wish, or use #\w+\.\w+ to match most domains. (Matching all domains is more complex.)

Explain this regex js

I'm using this regex to match some strings:
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
I'm confused about why it allows two dots, like ..
My understanding is that it allows only one dash or none (-)?
Any number of digits or none (\d+)?
One dot or none (\.)?
Why does it allow .. or even .4.6?
Testing done in http://www.regextester.com/
[^\s] means anything that is not a whitespace. This includes dots. Trying to match .. will get you:
[^\s] matches .
(-)? doesn't match
(\d+)? doesn't match
(\.)? matches .
(\d+)? doesn't match
I'll assume you wanted to match numbers (possibly negative/floating):
^-?\d+(\.\d+)?$
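The walk-through above can be confirmed by looking at the capture groups directly. This is a sketch; the variable names are mine, and the original pattern is written without the stray trailing slash so that it is a valid JavaScript literal.

```javascript
// Original pattern from the question (trailing "/" dropped) and the suggested replacement.
const originalPattern = /^([^\s](-)?(\d+)?(\.)?(\d+)?)$/;
const numberPattern = /^-?\d+(\.\d+)?$/;

// For "..", the first dot is taken by [^\s] and the second by (\.)?,
// so group 4 holds "." while groups 2, 3 and 5 stay undefined.
const dotsMatch = originalPattern.exec("..");
console.log(dotsMatch[4], dotsMatch[2], dotsMatch[3], dotsMatch[5]);

for (const s of ["..", ".4.6", "-4.6", "42"]) {
  console.log(s, originalPattern.test(s), numberPattern.test(s));
}
```

The replacement pattern requires at least one digit before an optional fractional part, so it rejects both .. and .4.6.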
^([^\s](-)?(\d+)?(\.)?(\d+)?)$/
Assert position at the beginning of the string ^
Match the regex below and capture its match into backreference number 1 ([^\s](-)?(\d+)?(\.)?(\d+)?)
Match any single character that is NOT present in the list below and that is NOT a line break character (line feed) [^\s]
A single character from the list “\s” (case sensitive) \s
Match the regex below and capture its match into backreference number 2 (-)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “-” literally -
Match the regex below and capture its match into backreference number 3 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single character that is a digit \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Match the regex below and capture its match into backreference number 4 (\.)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match the character “.” literally \.
Match the regex below and capture its match into backreference number 5 (\d+)?
Between zero and one times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives ?
Match a single character that is a digit \d+
Between one and unlimited times, as few or as many times as needed to find the longest match in combination with the other quantifiers or alternatives +
Assert position at the very end of the string $
Match the character “/” literally /
Created with RegexBuddy
As I mentioned in my comment, [^\s] is a negated character class that matches a dot, and since there is also a (\.)? pattern, the regex can match 2 consecutive dots (all of the parts except for [^\s] are optional).
In order not to match strings like .4.5 or .. you just need to add the . to the [^\s] negated character class:
^([^\s.](-)?(\d+)?(\.)?(\d+)?)$
This keeps any . out of the first character position.
You can use a lookahead to only disallow the first character as a dot:
^(?!\.)([^\s](-)?(\d+)?(\.)?(\d+)?)$
All explanation is available at the online regex testers:
In order to match the numbers in the format you expect, use:
^(?:[-]?\d+\.?\d*|-)$
Human-readable explanation:
^ - start of string and then there are 2 alternatives...
[-]? - optional hyphen
\d+ - 1 or more digits
\.? - optional dot
\d* - 0 or more digits
| -OR-
- - a hyphen
$ - end of string
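A short sketch of how this pattern behaves on a few inputs. The sample strings are my own, chosen to hit each part of the explanation above.

```javascript
// Pattern from the answer above, unchanged.
const looseNumberPattern = /^(?:[-]?\d+\.?\d*|-)$/;

// Sample inputs (assumed for illustration).
const numberSamples = ["12", "-12.5", "12.", "-", ".5", "..", "1.2.3"];

for (const s of numberSamples) {
  console.log(JSON.stringify(s), looseNumberPattern.test(s));
}
```

Note that, besides ordinary numbers, it also accepts a lone hyphen and a trailing dot such as 12., while anything starting with a dot or containing two dots is rejected.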

regular expression to avoid only special chars

I'm validating an input text box and I'm new to regexp. I want an expression that throws a validation error if all the characters of the input are special characters, but it should still allow special characters in the string.
-(**&^&)_) ----> invalid.
abcd-as jasd12 ----> valid.
Currently I'm validating for numbers and letters with /^[a-zA-Z0-9-]+[a-z A-Z 0-9 -]*$/
/[A-Za-z0-9]/ will match positive if the string contains at least 1 letter or number, which should be the same as what you're asking. If there are NO letters or numbers, that regex will evaluate as false.
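As a sketch, that check can be wrapped in a small function and tried on the two examples from the question. The function name is made up for illustration.

```javascript
// hasLetterOrDigit() is a hypothetical helper name.
// The regex is unanchored on purpose: one letter or digit anywhere is enough.
function hasLetterOrDigit(input) {
  return /[A-Za-z0-9]/.test(input);
}

console.log(hasLetterOrDigit("-(**&^&)_)"));     // only special characters
console.log(hasLetterOrDigit("abcd-as jasd12")); // letters, digits, a dash and a space
```

Because there are no ^ and $ anchors, special characters elsewhere in the string do not affect the result.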
According to your comment, special characters are !##$%^&*()_-, so you could use:
var regex = /^[!##$%^&*()_-]+$/;
if (regex.test(string))
// all char are special
If you have more special char, add them in the character class.
Use negative Lookahead:
if (/^(?![\s\S]*[^\w -]+)[\s\S]*?$/im.test(subject)) {
// Successful match
} else {
// Match attempt failed
}
EXPLANATION:
^(?!.*[^\w -]+).*?$
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Assert that it is impossible to match the regex below starting at this position (negative lookahead) «(?!.*[^\w -]+)»
Match any single character «.*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character NOT present in the list below «[^\w -]+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
A word character (letters, digits, and underscores) «\w»
The character “ ” « »
The character “-” «-»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
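For comparison, here is a sketch of how the lookahead version behaves on single-line input. The sample strings are mine. Be aware that this pattern rejects any string containing a character other than a word character, a space or a hyphen, which is stricter than "not made up only of special characters".

```javascript
// Pattern and flags from the answer above, unchanged.
const onlyPlainChars = /^(?![\s\S]*[^\w -]+)[\s\S]*?$/im;

// Sample single-line inputs (assumed for illustration).
const plainSamples = ["abcd-as jasd12", "-(**&^&)_)", "abc!", "under_score ok"];

for (const s of plainSamples) {
  console.log(JSON.stringify(s), onlyPlainChars.test(s));
}
```

So "abc!" fails here even though it contains letters, which may or may not be what you want.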
~^[^a-zA-Z0-9 ]+$~ will match if the string does not contain any letters, digits or spaces.

For comma-separated

I'm new to RegEx and JavaScript and I was wondering if anyone knew what the RegEx would be for detecting whether or not an input field contained the following type of format:
At least one alphanumeric tag which can contain spaces (e.g. "Test Tag" but not "Test#Tag")
Each tag separated by a single comma (e.g. "Cars, Vehicle, Large Dog, Bed" but not "Cars, Vehicle, Tiger)
An example of what I mean is this, these would be valid tags:
boy, man,girl, woman,tyrannosaurus rex, lion
And these would be invalid tags:
hat, cat, rat, c3po, #gmail
Because there are invalid characters in "#gmail".
It should also be able to accept just a single tag, as long as the characters are alphanumeric.
Assuming you want to allow _ and not allow whitespace at the beginning or the end this would be the shortest solution:
/^\w(\s*,?\s*\w)*$/
Introducing whitespace at the ends:
/^\s*\w(\s*,?\s*\w)*\s*$/
Removing _ from the allowed characters:
/^\s*[a-z0-9](\s*,?\s*[a-z0-9])*\s*$/
This is the brute-force regex I initially posted. It translates your requirements to regex syntax. I would like to leave it here for reference.
/^\s*([a-z0-9]+(\s[a-z0-9]+)*)(\s*,\s*([a-z0-9]+(\s[a-z0-9]+)*))*\s*$/
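Here is a minimal sketch of using the brute-force pattern as a validator. The i flag is my addition (an assumption, so that capitalised tags like "Large Dog" pass); the function name is also made up.

```javascript
// Brute-force pattern from above, with an added "i" flag (assumption) for upper-case tags.
const tagListPattern =
  /^\s*([a-z0-9]+(\s[a-z0-9]+)*)(\s*,\s*([a-z0-9]+(\s[a-z0-9]+)*))*\s*$/i;

// isValidTagList() is a hypothetical helper name.
function isValidTagList(input) {
  return tagListPattern.test(input);
}

console.log(isValidTagList("boy, man,girl, woman,tyrannosaurus rex, lion"));
console.log(isValidTagList("hat, cat, rat, c3po, #gmail"));
console.log(isValidTagList("Cars, Vehicle, Large Dog, Bed"));
console.log(isValidTagList("hat,, cat"));
```

Each tag is one or more alphanumeric words separated by single whitespace characters, and every comma must be followed by another tag, so doubled commas are rejected.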
Try something like this:
var re = /^(\w+,? ?)+$/;
var str1 = "boy, man,girl, woman,tyrannosaurus rex, lion";
var str2 = "hat, cat, rat, c3po, #gmail";
alert(re.test(str1)); // true
alert(re.test(str2)); // false
Breaking it down... \w matches word characters, \w+ matches 1 or more word characters. ,? ? matches optional comma and space. (Two commas would be rejected.) The ()+ around everything says one or more times. Lastly ^ and $ anchors it to the beginning and end of the string to make sure everything is matched.
Assuming that underscores (_) are not invalid:
/^(\w+\s?[\w\s]*)(,\s*\w+\s?[\w\s]*)*$/
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match the regular expression below and capture its match into backreference number 1 «(\w+\s?[\w\s]*)»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character present in the list below «[\w\s]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A word character (letters, digits, and underscores) «\w»
A whitespace character (spaces, tabs, and line breaks) «\s»
Match the regular expression below and capture its match into backreference number 2 «(,\s*\w+\s?[\w\s]*)*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Note: You repeated the capturing group itself.
The group will capture only the last iteration.
Put a capturing group around the repeated group to capture all iterations. «*»
Match the character “,” literally «,»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
Match a single character that is a “word character” (letters, digits, and underscores) «\w+»
Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match a single character that is a “whitespace character” (spaces, tabs, and line breaks) «\s?»
Between zero and one times, as many times as possible, giving back as needed (greedy) «?»
Match a single character present in the list below «[\w\s]*»
Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»
A word character (letters, digits, and underscores) «\w»
A whitespace character (spaces, tabs, and line breaks) «\s»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
Created with RegexBuddy
A separate question is being answered here. How to do the same thing but allow tags with at most two words?
/^\s*[a-z0-9]+(\s+[a-z0-9]+)?(\s*,\s*[a-z0-9]+(\s+[a-z0-9]+)?)*\s*$/
Tested.
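A quick sketch to check the two-word limit. The sample strings are mine, and since the pattern has no i flag, they are all lower case.

```javascript
// Pattern from the answer above, unchanged (lower-case letters and digits only).
const twoWordTags =
  /^\s*[a-z0-9]+(\s+[a-z0-9]+)?(\s*,\s*[a-z0-9]+(\s+[a-z0-9]+)?)*\s*$/;

// Sample inputs (assumed for illustration).
const tagSamples = [
  "large dog, bed",
  "tyrannosaurus rex, lion",
  "very large dog, bed",
  "hat, #gmail"
];

for (const s of tagSamples) {
  console.log(JSON.stringify(s), twoWordTags.test(s));
}
```

The optional (\s+[a-z0-9]+)? group allows at most one extra word per tag, which is why the three-word tag fails.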
