I'm trying to achieve these tasks through RegEx:
1. The string must start with a letter.
2. String can have maximum length of 30 characters.
3. String may contain digits, letters and spaces ( ).
4. Matching should be case-insensitive.
5. String should not have more than one space sequentially.
6. String cannot end with Space.
After going through the RegEx wiki and other RegEx questions, I have this expression:
/^([A-Z])([A-Z0-9 ]){0,29}$/i
Although this successfully achieves tasks 1-4, I'm unable to find anything on tasks 5 and 6.
Note: I'm using JavaScript for RegEx.
String should not have more than one space sequentially.
When matching a space, use a negative lookahead to reject a following space.
String cannot end with Space.
Also use the negative lookahead to reject the end of the string when matching a space:
/^([A-Z])([A-Z0-9]| (?! |$)){0,29}$/i
                  ^^^^^^^^^
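A quick way to check the combined pattern is to run it against a few inputs. The sample strings and the helper name isValid below are made up for illustration and are not part of the original answer.

```javascript
// Pattern from the answer: a letter first, then up to 29 letters, digits,
// or single spaces that are followed by neither a space nor the end.
const re = /^([A-Z])([A-Z0-9]| (?! |$)){0,29}$/i;

// Hypothetical helper name, used only for this sketch.
function isValid(s) {
  return re.test(s);
}

console.log(isValid("The days of wine and 007")); // true
console.log(isValid("two  spaces"));              // false (consecutive spaces)
console.log(isValid("trailing "));                // false (ends with a space)
console.log(isValid("1st"));                      // false (starts with a digit)
console.log(isValid("a".repeat(31)));             // false (31 characters)
```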
This regular expression works with Ruby. It should carry over to JavaScript with small changes: there, \p{...} escapes require the u flag and the lookbehind needs an ES2018 engine.
r = /^(?!.{31})\p{Alpha}(?:\p{Alnum}| (?! ))*(?<! )$/
"The days of wine and 007" =~ r #=> 0 (a match)
"The days of wine and roses and 007" =~ r #=> nil (too long)
"The days of  wine and 007" =~ r #=> nil (two consecutive spaces)
"The days of wine and 007!" =~ r #=> nil ('!' illegal)
The \p{} constructs match Unicode characters.
The regular expression can be expressed as follows in free-spacing mode (in order to document its component parts).
/
^ # beginning of string anchor
(?!.{31}) # 31 characters do not follow (neg lookahead)
\p{Alpha} # match a letter at beg of string
(?: # begin a non-capture group
\p{Alnum} # match an alphanumeric character
| # or
[ ] # match a space
(?![ ]) # a space does not follow (neg lookahead)
)* # end non-capture group and execute >= 0 times
(?<![ ]) # a space cannot precede end of string (neg lookbehind)
$ # end of string anchor
/x # free-spacing regex definition mode
Note that spaces are stripped from regexes defined in free-spacing mode, so spaces that are to be retained must be protected. I've put each in a character class ([ ]), but \s can be used as well (though that matches spaces, tabs, newlines and a few other characters, which should not be a problem).
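For JavaScript, a rough translation might look like the sketch below. It assumes an ES2018 engine (lookbehind and \p{...} with the u flag), and it uses \p{Alphabetic} and \p{Nd} as approximate stand-ins for Ruby's \p{Alpha} and \p{Alnum}; the two engines may not classify every character identically.

```javascript
// Assumed JavaScript translation of the Ruby pattern above.
// The u flag is required for \p{...} escapes.
const r = /^(?!.{31})\p{Alphabetic}(?:[\p{Alphabetic}\p{Nd}]| (?! ))*(?<! )$/u;

console.log(r.test("The days of wine and 007"));           // true
console.log(r.test("The days of wine and roses and 007")); // false (too long)
console.log(r.test("The days of  wine and 007"));          // false (two consecutive spaces)
console.log(r.test("The days of wine and 007!"));          // false ('!' illegal)
console.log(r.test("The days of wine "));                  // false (trailing space)
```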
var string = 'Our Prices are $355.00 and $550, down from $999.00';
How can I get those 3 prices into an array?
The RegEx
string.match(/\$((?:\d|\,)*\.?\d+)/g) || []
That || [] is for no matches: it gives an empty array rather than null.
Matches
$99
$.99
$9.99
$9,999
$9,999.99
Explanation
/ # Start RegEx
\$ # $ (dollar sign)
( # Capturing group (this is what you’re looking for)
(?: # Non-capturing group (these numbers or commas aren’t the only thing you’re looking for)
\d # Number
| # OR
\, # , (comma)
)* # Repeat any number of times, as many times as possible
\.? # . (dot), repeated at most once, as many times as possible
\d+ # Number, repeated at least once, as many times as possible
)
/ # End RegEx
g # Match all occurrences (global)
To match numbers like .99 more easily, I made the second number mandatory (\d+) while making the first number (along with commas) optional ((?:\d|\,)*). This means that, technically, a string like $999 is matched by the second number (after the optional decimal point), which doesn't matter for the result; it's just a technicality.
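Applied to the sentence from the question, the expression can be exercised as in this sketch. The conversion to numbers at the end is an extra step added for illustration, not something the question asked for.

```javascript
const string = 'Our Prices are $355.00 and $550, down from $999.00';
const priceRe = /\$((?:\d|\,)*\.?\d+)/g;

// With the g flag, match() returns the full matches (or null if there are none).
const prices = string.match(priceRe) || [];
console.log(prices); // [ '$355.00', '$550', '$999.00' ]

// No prices in the input: the || [] fallback yields an empty array.
const none = 'no prices here'.match(priceRe) || [];
console.log(none.length); // 0

// Illustrative extra: strip "$" and commas to get plain numbers.
const amounts = prices.map(p => Number(p.slice(1).replace(/,/g, '')));
console.log(amounts); // [ 355, 550, 999 ]
```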
A non-regex approach: split the string and filter the contents:
var arr = string.split(' ').filter(function(val) {return val.startsWith('$');});
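One thing to watch with the split approach: punctuation attached to a price stays attached. The sketch below shows this on the sentence from the question; the cleanup step (dropping trailing non-digits) is an assumption about what is wanted, not part of the original answer.

```javascript
const string = 'Our Prices are $355.00 and $550, down from $999.00';

// Split on spaces and keep the tokens that start with "$".
const raw = string.split(' ').filter(val => val.startsWith('$'));
console.log(raw); // [ '$355.00', '$550,', '$999.00' ]  (note the trailing comma)

// Assumed cleanup: remove any trailing non-digit characters.
const cleaned = raw.map(val => val.replace(/\D+$/, ''));
console.log(cleaned); // [ '$355.00', '$550', '$999.00' ]
```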
Use match with a regex as follows:
string.match(/\$\d+(\.\d+)?/g)
Regex Explanation
/ : Delimiters of the regex
\$: Matches a literal $
\d+: Matches one or more digits
()?: Makes the enclosed group optional (zero or one occurrence)
\.: Matches a literal .
g : Global flag; returns all matches instead of only the first
This also matches an optional decimal part following the digits after the '$'.
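A short sketch of what this simpler pattern does and does not pick up. The second input is a made-up example with a thousands separator and a leading decimal point, which this pattern is not written to handle.

```javascript
const simple = /\$\d+(\.\d+)?/g;

console.log('$355.00 and $550'.match(simple));   // [ '$355.00', '$550' ]

// Commas and amounts like "$.99" are outside what this pattern covers:
console.log('$9,999.99 and $.99'.match(simple)); // [ '$9' ]
```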
I have this regular expression
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
It matches values like $55.5 very well, but some of my test data has values like $ 55.5 (that is, with a space after the $ sign).
The answers on this link are not working for me.
Currency / Percent Regular Expression
So, how can I change it to accept the spaces as well?
Try the following RegEx:
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
Let me know if it worked!
TLDR:
/^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/
The science bit
Ok, I'm guessing that you didn't construct the original regular expression, so here are the pieces of it, with the addition marked:
^ # match from the beginning of the string
[',",\+,<,>,\(,\*,\-,%]? # optionally one of these symbols
( # start a group
[£,$,€]? # optionally one of these symbols
\s* # <--- NEW ADDITION: zero or more whitespace characters
\d+ # then one or more decimal digits
( # start group
[\,,\.] # comma or a dot
\d+ # then one or more decimal digits
)? # group optional (comma/dot and digits or neither)
[£,$,€]? # optionally one of these symbols
\s* # optionally whitespace
[\-,\/,\,,\.,\+]? # optionally one of these symbols
[\/]? # optionally a /
\s* # optionally whitespace
)+ # this whole group one or more times
[',",\+, <,>,\),\*,\-,%]? # optionally one of these symbols
$ # match to the end of the string
Much of this is poking about matching stuff around the currency amount, so you could reduce that.
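To see the effect of the added \s*, the two expressions can be compared side by side. This is only a sketch; the variable names original and withSpace are made up for illustration.

```javascript
// The expression from the question.
const original = /^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/;
// The same expression with \s* inserted after the leading currency symbol.
const withSpace = /^[',",\+,<,>,\(,\*,\-,%]?([£,$,€]?\s*\d+([\,,\.]\d+)?[£,$,€]?\s*[\-,\/,\,,\.,\+]?[\/]?\s*)+[',",\+, <,>,\),\*,\-,%]?$/;

console.log(original.test("$55.5"));   // true
console.log(original.test("$ 55.5"));  // false
console.log(withSpace.test("$55.5"));  // true
console.log(withSpace.test("$ 55.5")); // true
console.log(withSpace.test("abc"));    // false
```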
For form validation I have to check input with JavaScript for valid names.
The string has to fit the following pattern.
1. It may not start or end with a space
2. It may contain spaces
3. It may contain capital and lowercase letters, including ê, è and such
4. It may contain symbols like - ' "
5. It must contain at least 1 character
This RegExp almost does the job:
[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.'-]
But this RegExp doesn't check for spaces at the start or end.
Which JS RegExp satisfies the requirements mentioned above?
Thanks in advance
Here is my take on the topic:
if (subject.match(/^(?=\S+)(?=[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.'-]*$).*(?=\S).$/)) {
// Successful match
}
It basically says: start with at least one character that isn't a space. That covers conditions 1 and 5.
Then make sure that the whole thing consists of only allowed characters. That covers all your other conditions.
Then make sure that the last character is not a space: match it, and then match the end of the string.
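The three steps can be checked against a few names, as in the sketch below. The sample names and the variable name nameRe are made up for illustration.

```javascript
// The expression from this answer, unchanged.
const nameRe = /^(?=\S+)(?=[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.'-]*$).*(?=\S).$/;

console.log(nameRe.test("Jean-Luc O'Neil")); // true
console.log(nameRe.test("Ren\u00e9"));       // true  ("René", written with an escape)
console.log(nameRe.test(" Anna"));           // false (leading space)
console.log(nameRe.test("Anna "));           // false (trailing space)
console.log(nameRe.test(""));                // false (empty)
console.log(nameRe.test("R2D2"));            // false (digits are not in the class)
```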
More details:
"
^ # Assert position at the beginning of the string
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
\S # Match a single character that is a “non-whitespace character”
+ # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.'-] # Match a single character present in the list below
# A character in the range between “a” and “z”
# A character in the range between “A” and “Z”
# One of the characters “àáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.”
# The character “'”
# The character “-”
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
$ # Assert position at the end of the string (or before the line break at the end of the string, if any)
)
. # Match any single character that is not a line break character
* # Between zero and unlimited times, as many times as possible, giving back as needed (greedy)
(?= # Assert that the regex below can be matched, starting at this position (positive lookahead)
\S # Match a single character that is a “non-whitespace character”
)
. # Match any single character that is not a line break character
$ # Assert position at the end of the string (or before the line break at the end of the string, if any)
"
You need to use the RegExp anchors ^ and $, which match the start and the end of the string respectively.
See more documentation about this.
Hope this helps!
Try this
^(?! )[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð ,.'-]*[a-zA-ZàáâäãåèéêëìíîïòóôöõøùúûüÿýñçčšžÀÁÂÄÃÅÈÉÊËÌÍÎÏÒÓÔÖÕØÙÚÛÜŸÝÑßÇŒÆČŠŽ∂ð,.'-]$
See it here on Regexr
^ anchors the pattern to the start of the string
$ anchors the pattern to the end of the string
(?! ) is a negative lookahead that ensures that it's not starting with a space
Then your character class follows with a * quantifier, meaning 0 or more times. Finally there is your class once more, but without the space, to ensure that the string does not end with a space.
It's a pity that older JavaScript regex engines have no Unicode property support and do not allow \p{L} for all kinds of letters; engines that implement ES2018 accept it with the u flag.
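On engines that support Unicode property escapes (ES2018, with the u flag), the long letter list could be replaced by \p{L}. The sketch below is one possible rewrite under that assumption; note that \p{L} accepts every Unicode letter, which is broader than the hand-written list, and the sample names are made up.

```javascript
// Same shape as the pattern above, with \p{L} standing in for the letter list.
const nameRe = /^(?! )[\p{L} ,.'-]*[\p{L},.'-]$/u;

console.log(nameRe.test("Zo\u00eb de Vries")); // true  ("Zoë de Vries")
console.log(nameRe.test(" Zoe"));              // false (leading space)
console.log(nameRe.test("Zoe "));              // false (trailing space)
console.log(nameRe.test(""));                  // false (empty)
console.log(nameRe.test("Zoe3"));              // false (digit)
```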
I want to match everything except quoted strings.
I can match all quoted strings with this: /(("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))/
So I tried to match everything except quoted strings with this: /[^(("([^"\\]|\\.)*")|('([^'\\]|\\.)*'))]/ but it doesn't work.
I would like to use only regex because I want to do a replacement and still get the quoted text back afterwards.
string.replace(regex, function(a, b, c) {
// return after a lot of operations
});
For me, a quoted string is something like this "bad string" or this 'cool string'
So if I input:
he\'re is "watever o\"k" efre 'dder\'4rdr'?
It should output these matches:
["he\'re is ", " efre ", "?"]
And then I want to replace them.
I know my question is very difficult but it is not impossible! Nothing is impossible.
Thanks
EDIT: Rewritten to cover more edge cases.
This can be done, but it's a bit complicated.
result = subject.match(/(?:(?=(?:(?:\\.|"(?:\\.|[^"\\])*"|[^\\'"])*'(?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*')*(?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*$)(?=(?:(?:\\.|'(?:\\.|[^'\\])*'|[^\\'"])*"(?:\\.|'(?:\\.|[^'"\\])*'|[^\\"])*")*(?:\\.|'(?:\\.|[^'\\])*'|[^\\"])*$)(?:\\.|[^\\'"]))+/g);
will return
, he said.
, she replied.
, he reminded her.
,
from this string (line breaks added and enclosing quotes removed for clarity):
"Hello", he said. "What's up, \"doc\"?", she replied.
'I need a 12" crash cymbal', he reminded her.
"2\" by 4 inches", 'Back\"\'slashes \\ are OK!'
Explanation: (sort of, it's a bit mindboggling)
Breaking up the regex:
(?:
(?= # Assert even number of (relevant) single quotes, looking ahead:
(?:
(?:\\.|"(?:\\.|[^"\\])*"|[^\\'"])*
'
(?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*
'
)*
(?:\\.|"(?:\\.|[^"\\])*"|[^\\'])*
$
)
(?= # Assert even number of (relevant) double quotes, looking ahead:
(?:
(?:\\.|'(?:\\.|[^'\\])*'|[^\\'"])*
"
(?:\\.|'(?:\\.|[^'"\\])*'|[^\\"])*
"
)*
(?:\\.|'(?:\\.|[^'\\])*'|[^\\"])*
$
)
(?:\\.|[^\\'"]) # Match text between quoted sections
)+
First, you can see that there are two similar parts. Both these lookahead assertions ensure that there is an even number of single/double quotes in the string ahead, disregarding escaped quotes and quotes of the opposite kind. I'll show it with the single quotes part:
(?= # Assert that the following can be matched:
(?: # Match this group:
(?: # Match either:
\\. # an escaped character
| # or
"(?:\\.|[^"\\])*" # a double-quoted string
| # or
[^\\'"] # any character except backslashes or quotes
)* # any number of times.
' # Then match a single quote
(?:\\.|"(?:\\.|[^"'\\])*"|[^\\'])*' # Repeat once to ensure even number,
# (but don't allow single quotes within nested double-quoted strings)
)* # Repeat any number of times including zero
(?:\\.|"(?:\\.|[^"\\])*"|[^\\'])* # Then match the same until...
$ # ... end of string.
) # End of lookahead assertion.
The double quotes part works the same.
Then, at each position in the string where these two assertions succeed, the next part of the regex actually tries to match something:
(?: # Match either
\\. # an escaped character
| # or
[^\\'"] # any character except backslash, single or double quote
) # End of non-capturing group
The whole thing is repeated once or more, as many times as possible. The /g modifier makes sure we get all matches in the string.
See it in action here on RegExr.
Here is a tested function that does the trick:
function getArrayOfNonQuotedSubstrings(text) {
/* Regex with three global alternatives to section the string:
('[^'\\]*(?:\\[\S\s][^'\\]*)*') # $1: Single quoted string.
| ("[^"\\]*(?:\\[\S\s][^"\\]*)*") # $2: Double quoted string.
| ([^'"\\]*(?:\\[\S\s][^'"\\]*)*) # $3: Un-quoted string.
*/
var re = /('[^'\\]*(?:\\[\S\s][^'\\]*)*')|("[^"\\]*(?:\\[\S\s][^"\\]*)*")|([^'"\\]*(?:\\[\S\s][^'"\\]*)*)/g;
var a = []; // Empty array to receive the goods;
text = text.replace(re, // "Walk" the text chunk-by-chunk.
function(m0, m1, m2, m3) {
if (m3) a.push(m3); // Push non-quoted stuff into array.
return m0; // Return this chunk unchanged.
});
return a;
}
This solution uses the String.replace() method with a replacement callback function to "walk" the string section by section. The regex has three global alternatives, one for each section: $1: single quoted, $2: double quoted, and $3: non-quoted substrings. Each non-quoted chunk is pushed onto the return array. It correctly handles all escaped characters, including escaped quotes, both inside and outside quoted strings. Single quoted substrings may contain any number of double quotes and vice versa. Illegal orphan quotes are removed and serve to divide a non-quoted section into two chunks. Note that this solution requires no lookaround and requires only one pass. It also implements Friedl's "Unrolling-the-Loop" efficiency technique and is quite efficient.
Additional: Here is some code to test the function with the original test string:
// The original test string (with necessary escapes):
var s = "he\\'re is \"watever o\\\"k\" efre 'dder\\'4rdr'?";
alert(s); // Show the test string without the extra backslashes.
console.log(getArrayOfNonQuotedSubstrings(s).toString());
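The same walk can also be sketched with String.prototype.matchAll (ES2020) instead of a replace callback. This is a variation, not the function above: the name nonQuotedChunks is made up, and only the non-quoted alternative is captured.

```javascript
// Variation on the function above: same three alternatives, but only the
// un-quoted alternative is captured (group 1).
function nonQuotedChunks(text) {
  const re = /'[^'\\]*(?:\\[\S\s][^'\\]*)*'|"[^"\\]*(?:\\[\S\s][^"\\]*)*"|([^'"\\]*(?:\\[\S\s][^'"\\]*)*)/g;
  const chunks = [];
  for (const m of text.matchAll(re)) {
    if (m[1]) chunks.push(m[1]); // skip quoted sections and empty matches
  }
  return chunks;
}

// The original test string (with necessary escapes):
const s = "he\\'re is \"watever o\\\"k\" efre 'dder\\'4rdr'?";
console.log(nonQuotedChunks(s)); // [ "he\\'re is ", ' efre ', '?' ]
```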
You can't invert a regex. What you tried was to make a character class out of it and invert that, but for that to work you would also have to escape all closing brackets ("\]").
EDIT: I would have started with
/(^|" |' ).+?($| "| ')/
This matches anything between the beginning or the end of a quoted string (very simple: a quotation mark plus a blank) and the end of the string or the start of a quoted string (a blank plus a quotation mark). Of course this doesn't handle any escape sequences or quotations which don't follow the scheme / ['"].*['"] /. See above answers for more detailed expressions :-)
I need to validate an input string client side.
Here is an example of the string:
1:30-1:34, 1:20-1:22, 1:30-1:37,
It's basically time codes for a video.
Can this be done with regex?
Banging my head against the wall...
^(?:\b\d+:\d+-\d+:\d+\b(?:, )?)+$
would probably work; at least it matches your example. But you might need to add a few edge cases to make the rules for matching/not matching clearer.
^ # Start of string
(?: # Try to match...
\b # start of a "word" (in this case, number)
\d+ # one or more digits
: # a :
\d+ # one or more digits
- # a dash
\d+ # one or more digits
: # a :
\d+ # one or more digits
\b # end of a "word"
(?:, )? # optional comma and space
)+ # repeat one or more times
$ # until the end of the string
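A few checks show where the edges are, in particular around a trailing comma. The lenient variant at the end is a hypothetical alternative, assuming a trailing comma and flexible spacing should be accepted; whether that is wanted depends on the real input rules.

```javascript
const re = /^(?:\b\d+:\d+-\d+:\d+\b(?:, )?)+$/;

console.log(re.test("1:30-1:34, 1:20-1:22, 1:30-1:37"));  // true
console.log(re.test("1:30-1:34, 1:20-1:22, 1:30-1:37,")); // false (comma with no following space)
console.log(re.test("1:30-1:34; 1:20-1:22"));             // false (wrong separator)

// Hypothetical variant: optional trailing comma, any whitespace after commas.
const lenient = /^\d+:\d+-\d+:\d+(?:,\s*\d+:\d+-\d+:\d+)*,?\s*$/;

console.log(lenient.test("1:30-1:34, 1:20-1:22, 1:30-1:37,")); // true
console.log(lenient.test("1:30-1:34,,"));                      // false
```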
The following is a simple representation. I have assumed that the string has the exact same form as you have shown. This may be a good starting point for you. I'll improve the regex if you provide more specific requirements.
([0-9]+:[0-9]{1,2}-[0-9]+:[0-9]{1,2},\s*)+
Explanation (inspired by Tim above)
[0-9]+ #One or more digits
: # A colon
[0-9]{1,2} #A single digit or a pair of digits
- #A dash
, #A comma
\s* #Optional whitespace