I have a requirement where I need a regex which
should not repeat alphabet
should only contain alphabet and comma
should not start or end with comma
can contain more than 2 alphabets
example :-
A,B --- correct
A,B,C,D,E,F --- correct
D,D,A --- wrong
,B,C --- wrong
B,C, --- wrong
A,,B,C --- wrong
Can anyone help ?
Another idea with capturing and checking by use of a lookahead:
^(?:([A-Z])(?!.*?\1),?\b)+$
You can test here at regex101 if it meets your requirements.
If you don't want to match single characters, e.g. A, change the + quantifier to {2,}.
The statement of the question is incomplete in several respects. I have made the following assumptions:
Considering that D,D,A is incorrect I assume that a letter cannot be followed by a comma followed by the same letter.
The string may contain the same letter more than once as long as #1 is satisfied.
Considering that A,,B,C is incorrect I assume a comma cannot follow a comma.
Since the examples contain only capital letters I will assume that lower-case letters are not permitted (though one need only set the case-indifferent flag (i) to permit either case).
We observe that the requirements are satisfied if and only if the string begins with a capital letter and is followed by a sequence of comma-capital letter pairs, provided that no capital letter is followed by a comma followed by the same letter. We therefore can attempt to match the following regular expression.
^(?:([A-Z]),(?!\1))*[A-Z]$
Demo
The elements of the expression are as follows.
^ # match beginning of string
(?: # begin a non-capture group
([A-Z]) # match a capital letter and save to capture group 1
, # match a comma
(?!\1) # use negative lookahead to assert next character is not equal
# to the content of capture group 1
)* # end non-capture group and execute it zero or more times
[A-Z] # match a capital letter
$ # match end of string
Here is a big ugly regex solution:
var inputs = ['A,B', 'D,D,D', ',B,C', 'B,C,', 'A,,B'];
for (var i=0; i < inputs.length; ++i) {
if (/^(?!.*?([^,]+).*,\1(?:,|$))[^,]+(?:,[^,]+)*$/.test(inputs[i])) {
console.log(inputs[i] + " => VALID");
}
else {
console.log(inputs[i] + " => INVALID");
}
}
The regex has two parts to it. It uses a negative lookahead to assert that no two CSV entries ever repeat in the input. Then, it uses a straightforward pattern to match any proper CSV delimited input. Here is an explanation:
^ from the start of the input
(?!.*?([^,]+).*,\1(?:,|$)) assert that no CSV element ever repeats
[^,]+ then match a CSV element
(?:,[^,]+)* followed by comma and another element, 0 or more times
$ end of the input
This one could suit your needs:
^(?!,)(?!.*,,)(?!.*(\b[A-Z]+\b).*\1)[A-Z,]+(?<!,)$
^: the start of the string
(?!,): should not be directly followed by a comma
(?!.*,,): should not be followed by two commas
(?!.*(\b[A-Z]+\b).*\1): should not be followed by a value found twice
[A-Z,]+: should contain letters and commas only
$: the end of the string
(?<!,): should not be directly preceded by a comma
See https://regex101.com/r/1kGVSB/1
Related
I have got this string {bgRed Please run a task, {red a list has been provided below}, I need to do a string replace to remove the braces and also the first word.
So below I would want to remove {bgRed and {red and then the trailing brace which I can do separate.
I have managed to create this regex, but it is only matching {bgRed and not {red, can someone lend a hand?
/^\{.+?(?=\s)/gm
Note you are using ^ anchor at the start and that makes your pattern only match at the start of a line (mind also the m modifier). .+?(?=\s|$) is too cumbersome, you want to match any 1+ chars up to the first whitespace or end of string, use {\S+ (or {\S* if you plan to match { without any non-whitespace chars after it).
You may use
s = s.replace(/{\S*|}/g, '')
You may trim the outcome to get rid of resulting leading/trailing spaces:
s = s.replace(/{\S*|}/g, '').trim()
See the regex demo and the regex graph:
Details
{\S* - { char followed with 0 or more non-whitespace characters
| - or
} - a } char.
If the goal is go to from
"{bgRed Please run a task, {red a list has been provided below}"
to
"Please run a task, a list has been provided below"
a regex with two capture groups seems simplest:
const original = "{bgRed Please run a task, {red a list has been provided below}";
const rex = /\{\w+ ([^{]+)\{\w+ ([^}]+)}/g;
const result = original.replace(rex, "$1$2");
console.log(result);
\{\w+ ([^{]+)\{\w+ ([^}]+)} is:
\{ - a literal {
\w+ - one or more word characters ("bgRed")
a literal space
([^{]+) one or more characters that aren't {, captured to group 1
\{ - another literal {
\w+ - one or more word characters ("red")
([^}]+) - one or more characters that aren't }, captured to group 2
} - a literal }
The replacement uses $1 and $2 to swap in the capture group contents.
I'am looking to exclude matches that contain a specific word or phrase. For example, how could I match only lines 1 and 3? the \b word boundary does not work intuitively like I expected.
foo.js # match
foo_test.js # do not match
foo.ts # match
fun_tset.js # match
fun_tset_test.ts # do not match
UPDATE
What I want to exclude is strings ending explicitly with _test before the extension. At first I had something like [^_test], but that also excludes any combination of those characters (like line 3).
Regex: ^(?!.*_test\.).*$
Working examples: https://regex101.com/r/HdGom7/1
Why it works: uses negative lookahead to check if _test. exists somewhere in the string, and if so doesn't match it.
Adding to #pretzelhammer's answer, it looks like you want to grab strings that are file names ending in ts or js:
^(?!.*_test)(.*\.[jt]s)
The expression in the first parentheses is a negative lookahead that excludes any strings with _test, the second parentheses matches any strings that end in a period, followed by [jt] (j or t), followed by s.
I want to be able to match these types of strings (comma separated, and no beginning or trailing spaces):
LaunchTool[0].Label,LaunchTool[0].URI,LaunchTool[1].Label,LaunchTool[1].URI,LaunchItg[0].Label,LaunchItg[0].URI,csr_description
The rules, in English, are:
1) Zero or more instances of [] where the brackets must contain only one number 0-9
2) Zero or more instances of ., where . must be followed by a letter
3) Zero or more instances of _, where _ must be followed by a letter
I currently have this regex:
/^([a-z]){1,}(\[[0-9]\]){0,}(\.){0,}[a-z]{1,}$/i
I cannot figure out why
"aaaa" doesn't match
furthemore,
"aaaa[0].a" matches, but "aaaa[0]" does not...
anyone know what's wrong? I believe I might need a lookahead to make sure . and _ characters are followed by a letter? Perhaps I can avoid it.
this regex can match "aaaa", try getting value of
(/^([a-z]){1,}(\[[0-9]\]){0,}(\.){0,}[a-z]{1,}$/i).test("aaaa")
"aaaa[0]" does not match, because there is [a-z]{1,} in the end of expression. once "[0]" is matched by (\[[0-9]\]){0,}, trailing [a-z]{1,} must be shown at the end of string
Use optional capture groups. Example: ([a-z])?.
Here is what i ended up with:
/^((\w+)(\[\d+\])?\.?(\w+)?,?)+$/
Shorthands explanation:
* = {0,}
+ = {1,}
\w = [A-Za-z0-9_]
\d = [0-9]
I just finished all the "Easy" CoderByte challenges and am now going back to see if there are more efficient ways to answer the questions. I am trying to come up with a regular expression for "SimpleSymbol".
(Have the function SimpleSymbols(str) take the str parameter being passed and determine if it is an acceptable sequence by either
returning the string true or false. The str parameter will be composed
of + and = symbols with several letters between them (ie.
++d+===+c++==a) and for the string to be true each letter must be surrounded by a + symbol. So the string to the left would be false.
The string will not be empty and will have at least one letter.)
I originally answered the question by traversing through the whole string, when a letter is found, testing on either side to see if a "+" exists. I thought it would be easier if I could just test the string with a regular expression like,
str.match(/\+[a-zA-Z]\+/g)
This doesn't quite work. I am trying to see if the match will ONLY return true if the condition is met on ALL of the characters in the string. For instance the method will return true on the string, "++d+===+c++==a", due to '+d+' and '+c+'. However, based on the original question it should return false because of the 'a' and no surrounding '+'s.
Any ideas?
EDIT: #Marc brought up a very good point. The easiest way to do this is to search for violations using
[^+][a-zA-Z]|[a-zA-Z][^+]
or something like it. This will find all violations of the rule -- times when a letter appears next to something other than a +. If this matches, then you can return false, knowing that there exist violations. Else, return true.
Original answer:
Here's a regex -- I explain it below. Remember that you have to escape +, because it is a special character!
^([^a-zA-Z+]|(\+[a-zA-Z]\+)|[+])*$
^ // start of the string
[^a-zA-Z+] // any character except a letter or a + (1)
| // or
(\+[a-zA-Z]\+) // + (letter) + (2)
| //or
[+] // plus (3)
)*$ // repeat that pattern 0 or more times, end
The logic behind this is: skip all characters that aren't relevant in your string. (1)
If we have a + (letter) +, that's fine. capture that. (2)
if we have a + all by itself, that's fine too. (3)
A letter without surrounding + will fail.
The problem is that + is a special character in regular expressions. It's a quantifier that means 'one or more of the previous item'. You can represent a literal + character by escaping it, like this:
str.match(/\+[a-zA-Z]\+/g)
However, this will return true if any set of characters is found in the string matching that pattern. If you want to ensure that there are no other characters in the string which do not match that pattern you could do something like this:
str.match(/^([=+]*\+[a-zA-Z](?=\+))+[=+]*$/)
This will match, any number of = or + characters, followed by a literal + followed by a Latin letter, followed by a literal +, all of which may be repeated one or more times, followed by any number of = or + characters. The ^ at the beginning and the $ at the end matches the start and end of the input string, respectively. This ensures that no other characters are allowed. The (?=\+) is a look-ahead assertion, meaning that the next character must be a literal +, but is not considered part of the group, this means it can be rematched as the leading + in the next match (e.g. +a+b+).
Interesting problem!
String requirements:
String must be composed of only +, = and [A-Za-z] alpha chars.
Each and every alpha char must be preceded by a + and followed by a +.
There must be at least one alpha char.
Valid strings:
"+A+"
"++A++"
"=+A+="
"+A+A+"
"+A++A+"
Invalid strings:
+=+= # Must have at least one alpha.
+A+&+A+ # Non-valid char.
"+A" # Alpha not followed by +.
"A+" # Alpha not preceded by +.
Solution:
^[+=]*(?:\+[A-Z](?=\+)[+=]*)+$ (with i ignorecase option set)
Here's how I'd do it: (First in a tested python script with fully commented regex)
import re
def isValidSpecial(text):
if re.search(r"""
# Validate special exercise problem string format.
^ # Anchor to start of string.
[+=]* # Zero or more non-alphas {normal*).
(?: # Begin {(special normal*)+} construct.
\+[A-Z] # Alpha but only if preceded by +
(?=\+) # and followed by + {special} .
[+=]* # More non-alphas {normal*).
)+ # At least one alpha required.
$ # Anchor to end of string.
""", text, re.IGNORECASE | re.VERBOSE):
return "true"
else:
return "false"
print(isValidSpecial("+A+A+"))
Now here is the same solution in JavaScript syntax:
function isValidSpecial(text) {
var re = /^[+=]*(?:\+[A-Z](?=\+)[+=]*)+$/i;
return (re.test(text)) ? "true" : "false";
}
/^[=+]*\+[a-z](?=\+)(?:\+[a-z](?=\+)|[+=]+)*$/i.test(str)
pattern details:
^ # anchor, start of the string
[=+]* # 0 or more = and +
\+ [a-z] (?=\+) # imposed surrounded letter (at least one letter condition)
(?: # non capturing group with:
\+[a-z](?=\+) # surrounded letter
| # OR
[=+]+ # + or = characters (one or more)
)* # repeat the group 0 or more times
$ # anchor, end of the string
To allow consecutive letters like +a+a+a+a+a+, I use a lookahead assertion to check there is a + symbol after the letter without match it. (Thanks to ridgrunner for his comment)
Example:
var str= Array('+==+u+==+a', '++=++a+', '+=+=', '+a+-', '+a+a+');
for (var i=0; i<5; i++) {
console.log(/^[=+]*\+[a-z](?=\+)(?:\+[a-z](?=\+)|[+=]+)*$/i.test(str[i]));
}
I'm trying to match a pattern:
show_clipping.php?CLIP_id=*
from:
a href="javascript:void(0);" onclick="MM_openBrWindow('show_clipping.php?CLIP_id=575','news','scrollbars=yes,resizable=yes,width=500,height=400,left=100,top=60')">some text</a>
where
*
can be only numeric values(eg: 0, 1 , 1234)
the result has to return the whole thing(show_clipping.php?CLIP_id=575)
what I've tried:
show_clipping.php\?CLIP_id=([1-9]|[1-9][0-9]|[1-9][0-9][0-9])
but my attempt would truncate the rest of the digits from 575, leaving the results like:
show_clipping.php?CLIP_id=5
How do I match numeric part properly?
Another issue is that the value 575 can contain any numeric value, my regex will not work after 3 digits, how do i make it work with infinit amount of digits
You didn't specify what language your are using so here is just the regex:
'([^']+)'
Explanation
' # Match a single quote
([^`])+ # Capture anything not a single quote
' # Match the closing single quote
So basically it capture everything in single quotes, show_clipping.php?CLIP_id=5 is in the first capture group.
See it action here.
To only capture show_clipping.php?CLIP_id=5 I would do '(.*CLIP_id=[0-9]+)'
' # Match a single quote
(.* # Start capture group, match anyting
CLIP_id= # Match the literal string
[0-9]+) # Match one of more digit and close capture group
' # Match the closing single quote
Answer:
^(0|[1-9][0-9]*)$
answered before:
Regex pattern for numeric values
(answer number 6)
What about this?
onclick.match(/show_clipping\.php\?CLIP_id=\d+/)
["show_clipping.php?CLIP_id=575"]
(From the tags of your question I assume you're using JavaScript)
show_clipping.php\?CLIP_id=(\d+)
\d matches a digit, and + means 1 or more of them.
How about:
/(show_clipping.php\?CLIP_id=[1-9]\d*)/