Javascript Pattern Validation - javascript

Please can someone assist me with a regular expression to validate a string of this pattern
aaa [bbb]
I want to validate an input in the format expressed above. aaa & bbb can be any combination of one or more words, which could also contain special characters. bbb must be contained inside square brackets ([...]). The full string can have leading or trailing spaces.
I have tried this:
var re=/\w{0,} \[*\w{0,}\]/
But it returns false on a test string like:
re.test("onc>*!llklk[dd<dfd]")

Your regular expression explicitly requires a space to be present. We can visualise this with Regexper:
This returns false on "onc>*!llklk[dd<dfd]" because there is no space character.
To fix your problem, either use a test string which has a space character, or change your regular expression to not require this character:
var re = /\w{0,}\[*\w{0,}\]/;
re.test("aaa [bbb]");
> true
re.test("onc>*!llklk[dd<dfd]");
> true
You may want to rethink your regular expression though, because as it stands, a single "]" character will pass the test:
re.test("]");
> true

\S+\s*\[\S+?\]
You can try this if you want to allow special characters as well.
http://regex101.com/r/yA1jY6/3

First, to make it easier to read, you could replace {0,} by * as it's the same thing.
Next, \w would not match some symbols like > or *, you can use a . to match any symbol.
Then, as another answer says, you're expecting a space between the two groups (aaa and [bbb]), so your example won't match.
I think this regex is a good starting point (depending on your other requirements).
/.+\[.+\]/
Try it here
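One way to compare the patterns in this thread is to run them against the test strings from the question. The sketch below does that; the `strict` pattern is an assumption of mine about what "aaa [bbb]" should mean (exactly one bracketed part, no nested brackets, optional surrounding whitespace) and does not come from any of the answers.

```javascript
const original = /\w{0,} \[*\w{0,}\]/; // from the question: requires a literal space
const loose = /.+\[.+\]/;              // from the answer above
// Assumption: one bracketed group, no nested brackets, optional outer whitespace.
const strict = /^\s*[^\[\]]+\[[^\[\]]+\]\s*$/;

const samples = [
  "aaa [bbb]",
  "onc>*!llklk[dd<dfd]",
  "  aaa [bbb]  ",
  "]",
  "aaa [bbb] ccc",
];

// Collect one row per sample so the three patterns can be compared side by side.
const results = samples.map((s) => ({
  input: s,
  original: original.test(s),
  loose: loose.test(s),
  strict: strict.test(s),
}));

console.table(results);
```

The unanchored patterns accept any string that merely contains a match, which is why only `strict` rejects "aaa [bbb] ccc".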

Related

Replace a phrase in a string that is being broken up into 2 separate lines [duplicate]

Is there a simple way to ignore the white space in a target string when searching for matches using a regular expression pattern? For example, if my search is for "cats", I would want "c ats" or "ca ts" to match. I can't strip out the whitespace beforehand because I need to find the begin and end index of the match (including any whitespace) in order to highlight that match and any whitespace needs to be there for formatting purposes.
You can stick optional whitespace characters \s* in between every other character in your regex. Although granted, it will get a bit lengthy.
/cats/ -> /c\s*a\s*t\s*s/
While the accepted answer is technically correct, a more practical approach, if possible, is to just strip whitespace out of both the regular expression and the search string.
If you want to search for "my cats", instead of:
myString.match(/m\s*y\s*c\s*a\s*t\s*s\s*/g)
Just do:
myString.replace(/\s*/g,"").match(/mycats/g)
Warning: You can't automate this on the regular expression by just replacing all spaces with empty strings because they may occur in a negation or otherwise make your regular expression invalid.
Addressing Steven's comment to Sam Dufel's answer
Thanks, sounds like that's the way to go. But I just realized that I only want the optional whitespace characters if they follow a newline. So for example, "c\n ats" or "ca\n ts" should match. But wouldn't want "c ats" to match if there is no newline. Any ideas on how that might be done?
This should do the trick:
/c(?:\n\s*)?a(?:\n\s*)?t(?:\n\s*)?s/
See this page for all the different variations of 'cats' that this matches.
You can also solve this using conditionals, but they are not supported in the javascript flavor of regex.
You could put \s* in between every character in your search string, so if you were looking for cats you would use c\s*a\s*t\s*s
It's long but you could build the string dynamically of course.
You can see it working here: http://www.rubular.com/r/zzWwvppSpE
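Building the pattern dynamically could look roughly like the sketch below. `buildSpacedRegex` is a hypothetical helper name, and the optional `gap` parameter is my own addition so that the newline-only variant from this thread can reuse the same code. Each character of the search term is escaped first so that regex metacharacters in the term are treated literally.

```javascript
// Hypothetical helper: join the characters of `term` with an optional gap pattern.
// By default the gap is \s*, i.e. any amount of whitespace between characters.
function buildSpacedRegex(term, gap = "\\s*") {
  const escaped = Array.from(term.replace(/\s+/g, ""), (ch) =>
    // Escape regex metacharacters so the term is matched literally.
    ch.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")
  );
  return new RegExp(escaped.join(gap), "g");
}

const text = "two c ats and a ca ts";

// Record start/end offsets in the ORIGINAL text, whitespace included,
// which is what highlighting needs.
const hits = [...text.matchAll(buildSpacedRegex("cats"))].map((m) => ({
  start: m.index,
  end: m.index + m[0].length,
  text: m[0],
}));

console.log(hits);

// Variant: only tolerate whitespace that follows a newline.
const newlineOnly = "(?:\\n\\s*)?";
console.log(buildSpacedRegex("cats", newlineOnly).test("c\n ats"));
console.log(buildSpacedRegex("cats", newlineOnly).test("c ats"));
```

Because the match runs on the original string, the offsets can be used directly for highlighting without any position mapping.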
If you only want to allow spaces, then
\bc *a *t *s\b
should do it. To also allow tabs, use
\bc[ \t]*a[ \t]*t[ \t]*s\b
Remove the \b anchors if you also want to find cats within words like bobcats or catsup.
This approach can be automated
(the following example is in Python, although it can be ported to any language):
you can strip the whitespace beforehand AND save the positions of the non-whitespace characters, so you can use them later to find the boundaries of the match in the original string, like this:
import re

def regex_search_ignore_space(regex, string):
    no_spaces = ''
    char_positions = []
    for pos, char in enumerate(string):
        if re.match(r'\S', char):  # upper \S matches non-whitespace chars
            no_spaces += char
            char_positions.append(pos)
    match = re.search(regex, no_spaces)
    if not match:
        return match
    if match.start() == match.end():
        return ''  # empty match: nothing to map back
    # match.start() and match.end() are indices of start and end
    # of the found string in the spaceless string
    # (as we have searched in it); match.end() is exclusive.
    start = char_positions[match.start()]  # in the original string
    # map the last matched character, then step one past it
    end = char_positions[match.end() - 1] + 1  # in the original string
    matched_string = string[start:end]
    # the match WITH spaces is returned.
    return matched_string

with_spaces = 'a li on and a cat'
print(regex_search_ignore_space('lion', with_spaces))
# prints 'li on'
If you want to go further you can construct the match object and return it instead, so the use of this helper will be more handy.
And the performance of this function can of course also be optimized, this example is just to show the path to a solution.
The accepted answer will not work if and when you are passing a dynamic value (such as "current value" in an array loop) as the regex test value. You would not be able to input the optional white spaces without getting some really ugly regex.
Konrad Hoffner's solution is therefore better in such cases, as it strips whitespace from both the regex and the test string. The test is then conducted as though neither contains whitespace.

Combining 2 regexes, one with exact match using OR operator

I am trying to combine:
^[a-zA-Z.][a-zA-Z'\\- .]*$
with
(\W|^)first\sname(\W|$)
which should check for the exact phrase, first name, if that is correct. It should match either the first regex OR the second exact match. I tried this, but appears invalid:
^(([a-zA-Z.][a-zA-Z'\\- .]*$)|((\W|^)first\sname(\W|$))
This is in javascript btw.
Combining regular expressions generally can be done simply in the following way:
Regex1 + Regex2 = (Regex1|Regex2)
^[a-zA-Z.][a-zA-Z'\\- .]*$
+ (\W|^)first\sname(\W|$) =
(^[a-zA-Z.][a-zA-Z'\\- .]*$|(\W|^)first\sname(\W|$))
Because some SO users have a hard time understanding the math analogy, here's a full word explanation.
If you have a regex with content REGEX1 and a second regex with content REGEX2 and you want to combine them in the way that was described by OP in his question, a simple way to do this without optimization is the following.
(REGEX1|REGEX2)
Here you surround both regular expressions with parentheses and separate the two with |.
Your regex would be the following:
(^[a-zA-Z.][a-zA-Z'\\- .]*$|(\W|^)first\sname(\W|$))
Your first regex has an error in it, though, that makes it invalid. Try this instead.
(^[a-zA-Z.][a-zA-Z'\- .]*$|(\W|^)first\sname(\W|$))
You had \\ in the second character class where you wanted \
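The problem is that the first regex is messed up. You don't need to double-escape characters in a regex literal. Therefore
\\-
followed by the space is read as a range from a literal \ (ASCII 92) to the space character (ASCII 32). Because 92 is greater than 32, the range is out of order and the regex is invalid. Remove one of the backslashes.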
Reference
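The (REGEX1|REGEX2) recipe can be sketched in JavaScript as follows. `combineAlternatives` is a hypothetical helper, and it uses a non-capturing group instead of a plain one, which is my own choice. The last part shows that the doubled backslash really does make the character class invalid.

```javascript
// Hypothetical helper: wrap the pattern sources in one group, separated by |.
function combineAlternatives(...sources) {
  return new RegExp("(?:" + sources.join("|") + ")");
}

const namePattern = /^[a-zA-Z.][a-zA-Z'\- .]*$/;   // single backslash before -
const phrasePattern = /(\W|^)first\sname(\W|$)/;

const combined = combineAlternatives(namePattern.source, phrasePattern.source);

console.log(combined.test("O'Neil-Smith Jr."));  // matched by the first alternative
console.log(combined.test("1: first name?"));    // matched by the second alternative
console.log(combined.test("1234"));              // matched by neither

// With the doubled backslash, the class contains the range "\" to " ",
// which is out of order, so constructing the regex throws.
let doubledBackslashError = null;
try {
  new RegExp("[a-zA-Z'\\\\- .]");
} catch (err) {
  doubledBackslashError = err;
}
console.log(doubledBackslashError instanceof SyntaxError);
```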

Reg: Javascript regex

How can I match the pattern abc_[someArbitaryStringHere]_xyz?
To clarify, I would want the regex to match strings of the nature:
abc_xyz, abc_asdfsdf_xyz, abc_32rwrd_xyz etc.
I tried with /abc_*_xyz/ but this seems to be an incorrect expression.
Use
/^abc(?:_.*_|_)xyz$/
Be sure to include the ^ and $, they guard the beginning and end of the string. Otherwise strings like "123abc_foo_xyz" will match.
(?:_.*_|_) Is a non-capture group that matches either _[someArbitaryStringHere]_ or a single _
Your regex would be,
abc(?:(?:_[^_]+)+)?_xyz
DEMO
Assuming abc_xyz is indeed a string you want to match, and isn't just a typo, then your regex is:
/abc(?:_[^_]+)?_xyz/
This will match abc, then optionally match a _ followed by greedily matching anything but _s. After this optional part, it will match the ending _xyz.
If this is to match an entire string (as opposed to just extracting substrings from a bigger string), then you can just put ^ at the start and $ at the end, like so:
/^abc(?:_[^_]+)?_xyz$/
EDIT: Just noticed that JavaScript doesn't support possessive matching, only greedy. Changed ++ to +.
EDIT2: The above regexes also assume that your "arbitrary string" does not contain more underscores. They can be expanded to allow more rules.
For example, to allow just anything, a truly arbitrary string, try:
/abc(?:_.*)?_xyz/ or /^abc(?:_.*)?_xyz$/
But if you want to be really clever, and disallow consecutive underscores, you can do:
/abc(?:_[^_]+)*_xyz/ or /^abc(?:_[^_]+)*_xyz$/
And lastly, if you want to "only allow letters or numbers" in your arbitrary strings, just replace [^_] with [a-zA-Z0-9].
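To see how these variants differ in practice, the sketch below runs the anchored forms against the strings from the question plus two extra inputs that I added as edge cases ("abc__xyz" and "123abc_foo_xyz"). The object keys are just labels of my own.

```javascript
const candidates = {
  anything: /^abc(?:_.*)?_xyz$/,
  noConsecutiveUnderscores: /^abc(?:_[^_]+)*_xyz$/,
  alternation: /^abc(?:_.*_|_)xyz$/,
};

const inputs = [
  "abc_xyz",
  "abc_asdfsdf_xyz",
  "abc_32rwrd_xyz",
  "abc__xyz",        // consecutive underscores
  "123abc_foo_xyz",  // extra prefix, rejected by the ^ anchor
];

// Returns one boolean per input, in the same order as `inputs`.
function check(pattern) {
  return inputs.map((s) => pattern.test(s));
}

for (const [label, pattern] of Object.entries(candidates)) {
  console.log(label, check(pattern));
}
```

Only the `[^_]+` variant rejects "abc__xyz"; the two `.*` variants accept it because `.*` can match the empty string.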
The '*' means "match 0 or more", but of what? In /abc_*_xyz/ it applies to the preceding underscore, not to an arbitrary string.
/abc_[a-z0-9]*_xyz/im
The dot (.) matches any character except a line terminator.
/abc_(.*)_xyz/im
You need to check for at least one underscore as well if you want to match abc_xyz:
abc_+.*xyz

Regex to match a pattern or empty value

I need to match a pattern where I can have an alphanumeric value of size 4 or an empty value.
My current regex
"[0-9a-z]{0|4}");
does not work for empty values.
I have tried the following two patterns but none of them works for me:
"(?:[0-9a-z]{4} )?");
"[0-9a-z]{0|4}");
I use http://xenon.stanford.edu/~xusch/regexp/ to validate my regex, but sometimes I get stuck. Is there a tool I can use so that I only have to come here for very complex issues?
Examples I want to match: we12, 3444, de1q, {empty}
Examples I do not want to match: #$12, #12q, 1, qwe, qqqqq
No UpperCase is matching.
In general you can use the pattern expression|$: it tries to match the expression or (|) the empty string, and the anchor $ makes sure nothing follows.
Furthermore, we can enclose it in a capture group (...) and anchor the start as well, so it finally looks like this:
^(expression|)$
So applying it to your need, it would end up to be like:
^([0-9a-z]{4}|)$
here is an example
EDIT:
If you want to match also uppercases, add A-Z to the pattern:
^([0-9a-zA-Z]{4}|)$
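As a quick check, the lowercase pattern above can be run against the examples from the question. This sketch uses JavaScript as the test harness, which is an assumption since the question does not name its language; "WE12" is an extra input I added to show that uppercase is rejected.

```javascript
const fourOrEmpty = /^([0-9a-z]{4}|)$/;

const shouldMatch = ["we12", "3444", "de1q", ""];
const shouldNotMatch = ["#$12", "#12q", "1", "qwe", "qqqqq", "WE12"];

// Every entry in the first list must match, none in the second.
const allMatch = shouldMatch.every((s) => fourOrEmpty.test(s));
const noneMatch = shouldNotMatch.every((s) => !fourOrEmpty.test(s));

console.log({ allMatch, noneMatch });
```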
I suppose "empty value" means an empty line in the question above. If that's the case you can use this expression:
^\s*$|^[a-z0-9]{4}$
which will match alphanumeric patterns of size 4 or empty lines, as explained here

Can it be done with regex?

Having the following regex: ([a-zA-Z0-9//._-]{3,12}[^//._-]) used like pattern="([a-zA-Z0-9/._-]{3,12}[^/._-])" to validate an HTML text input for a username, I wonder if there is any way of telling it to check that the string has only one of the following: ., -, _
By that I mean, that I'm in need of regex that would accomplish the following (if possible)
alex-how => Valid
alex-how. => Not valid, because finishing in .
alex.how => Valid
alex.how-ha => Not valid, contains already a .
alex-how_da => Not valid, contains already a -
The problem with my current regex is that, for some reason, it accepts any character at the end of the string that is not ., _ or -, and I can't figure out why.
The other problem is that it doesn't check that the string contains only one of the allowed special characters.
Any ideas?
Try this one out:
^(?!(.*[._-].*){2})(?!.*[._-]$)[a-zA-Z0-9/._-]{3,12}$
Regexpal link. The regex above allows at most one of ., _ or -.
What you want is one or more upper-case, lower-case or digit characters,
followed by at most one of the characters "-", "." or "_", followed by at least one more alphanumeric character:
^[a-zA-Z0-9]+[-_.]?[a-zA-Z0-9]+$
Hope this will work for you.
It says: start with word characters, followed by any mix of -, ., _ and word characters, and end with a word character.
^[\w\d]*[-_\.\w\d]*[\w\d]$
Seems to me you want:
^[A-Za-z0-9]+(?:[\._-][A-Za-z0-9]+)?$
Breaking it down:
^: beginning of line
[A-Za-z0-9]+: one or more alphanumeric characters
(?:[\._-][A-Za-z0-9]+)?: (optional, non-captured) one of your allowed special characters followed by one or more alphanumeric characters
$: end of line
It's unclear from your question if you wanted one of your special characters (., -, and _) to be optional or required (e.g., zero-or-one versus exactly-one). If you actually wanted to require one such special character, you would just get rid of the ? at the very end.
Here's a demonstration of this regular expression on your example inputs:
http://rubular.com/r/SQ4aKTIEF6
As for the length requirement (between 3 and 12 characters): This might be a cop-out, but personally I would argue that it would make more sense to validate this by just checking the length property directly in JavaScript, rather than over-complicating the regular expression.
^(?=[a-zA-Z0-9/._-]{3,12}$)[a-zA-Z0-9]+(?:[/._-][a-zA-Z0-9]+)?$
or, as a JavaScript regex literal:
/^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/
The lookahead, (?=[a-zA-Z0-9/._-]{3,12}$), does the overall-length validation.
Then [a-zA-Z0-9]+ ensures that the name starts with at least one non-separator character.
If there is a separator, (?:[/._-][a-zA-Z0-9]+)? ensures that there's at least one non-separator following it.
Note that / has no special meaning in a regex. You only have to escape it if you're using a regex literal (because / is the regex delimiter), and you escape it by prefixing with a backslash, not another forward-slash. And inside a character class, you don't need to escape the dot (.) to make it match a literal dot.
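The lookahead-based pattern can be checked against the examples from the question with a short script. `isValidUsername` is a hypothetical wrapper name, and the last three inputs are edge cases I added for the length limits and the / separator.

```javascript
// Lookahead enforces 3 to 12 allowed characters overall; the rest enforces
// "alphanumerics, then at most one separator followed by alphanumerics".
const username = /^(?=[a-zA-Z0-9\/._-]{3,12}$)[a-zA-Z0-9]+(?:[\/._-][a-zA-Z0-9]+)?$/;

function isValidUsername(name) {
  return username.test(name);
}

const cases = [
  "alex-how",       // one separator in the middle
  "alex-how.",      // ends with a separator
  "alex.how",       // one separator in the middle
  "alex.how-ha",    // two separators
  "alex-how_da",    // two separators
  "ab",             // too short
  "abcdefghijklm",  // 13 characters, too long
  "a/b",            // shortest valid form with a separator
];

for (const name of cases) {
  console.log(name, isValidUsername(name));
}
```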
The dot in regex has a special meaning: "any character here".
If you mean a literal dot, you should escape it to tell the regex parser so.
Escape dot in a regex range
