I'm looking to exclude matches that contain a specific word or phrase. For example, how could I match only lines 1 and 3? The \b word boundary does not work the way I expected.
foo.js # match
foo_test.js # do not match
foo.ts # match
fun_tset.js # match
fun_tset_test.ts # do not match
UPDATE
What I want to exclude is strings ending explicitly with _test before the extension. At first I had something like [^_test], but that also excludes any combination of those characters (like line 3).
Regex: ^(?!.*_test\.).*$
Working examples: https://regex101.com/r/HdGom7/1
Why it works: the negative lookahead checks whether _test. occurs anywhere in the string, and if it does, the match fails.
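As a quick check, here is a minimal sketch that runs the pattern over the file names from the question. The names `notTest`, `names`, and `kept` are only illustrative.

```javascript
// Pattern from the answer above: reject any string that contains "_test."
const notTest = /^(?!.*_test\.).*$/;

// File names taken from the question.
const names = ['foo.js', 'foo_test.js', 'foo.ts', 'fun_tset.js', 'fun_tset_test.ts'];

// Keep only the names the pattern accepts.
const kept = names.filter((name) => notTest.test(name));
console.log(kept); // [ 'foo.js', 'foo.ts', 'fun_tset.js' ]
```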
Adding to @pretzelhammer's answer, it looks like you want to grab strings that are file names ending in ts or js:
^(?!.*_test)(.*\.[jt]s)
The expression in the first parentheses is a negative lookahead that excludes any string containing _test. The second set of parentheses captures text ending in a period, followed by [jt] (j or t), followed by s.
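A small sketch of how the capture group behaves. Note that the pattern as written has no end anchor, so it can stop in the middle of a longer extension. The `$` in `jsOrTsAnchored` is my own assumption about what is wanted, not part of the answer above.

```javascript
// Pattern from the answer, and a variant with an end anchor (assumption).
const jsOrTs = /^(?!.*_test)(.*\.[jt]s)/;
const jsOrTsAnchored = /^(?!.*_test)(.*\.[jt]s)$/;

// Return capture group 1, or null when the pattern does not match.
const results = ['foo.ts', 'foo_test.js', 'notes.txt', 'foo.json'].map((name) => {
  const m = name.match(jsOrTs);
  return m ? m[1] : null;
});

console.log(results); // [ 'foo.ts', null, null, 'foo.js' ]
console.log(jsOrTsAnchored.test('foo.json')); // false
```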
I have a few strings:
some-text-123123#####abcdefg/
some-STRING-413123#####qwer123t/
some-STRING-413123#####456zxcv/
I would like to receive:
abcdefg
qwer123t
456zxcv
I have tried regexp:
/[^#####]*[^\/]/
But this is not working...
To get whatever comes after five #s and before the last /, you can use
/#####(.*)\//
and pick up the first group.
Demo:
const regex = /#####(.*)\//;
console.log('some-text-123123#####abcdefg/'.match(regex)[1]);
console.log('some-STRING-413123#####qwer123t/'.match(regex)[1]);
console.log('some-STRING-413123#####456zxcv/'.match(regex)[1]);
assumptions:
the desired part of the string sample will always:
start after 5 #'s
end before a single /
suggestion: /(?<=#{5})\w*(?=\/)/
So (?<=#{5}) is a lookbehind assertion which will check to see if any matching string has the provided assertion immediately behind it (in this case, 5 #'s).
(?=\/) is a lookahead assertion, which will check ahead of a matching string segment to see if it matches the provided assertion (in this case, a single /).
The actual text the regex will return as a match is \w*, consisting of a character class and a quantifier. The character class \w matches any word character: a letter, a digit, or an underscore ([A-Za-z0-9_]). The * quantifier matches the preceding item 0 or more times.
successful matches:
'some-text-123123#####abcdefg/'
'some-STRING-413123#####qwer123t/'
'some-STRING-413123#####456zxcv/'
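A minimal sketch of the lookaround version in use. Because the lookbehind and lookahead consume nothing, the wanted text is the whole match (index 0) and no capture group is needed. Lookbehind needs an engine that supports it (ES2018 or later); the variable names are illustrative.

```javascript
// Lookbehind for five #s, lookahead for the closing slash.
const afterHashes = /(?<=#{5})\w*(?=\/)/;

const samples = [
  'some-text-123123#####abcdefg/',
  'some-STRING-413123#####qwer123t/',
  'some-STRING-413123#####456zxcv/',
];

// The full match is already the wanted substring.
const extracted = samples.map((s) => s.match(afterHashes)[0]);
console.log(extracted); // [ 'abcdefg', 'qwer123t', '456zxcv' ]
```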
I would highly recommend learning Regular Expressions in-depth, as it's a very powerful tool when fully utilised.
MDN, as with most things web-dev, is a fantastic resource for regex. Everything from my answer here can be learned on MDN's Regular expression syntax cheatsheet.
Also, an interactive tool can be very helpful when putting together a complex regular expression. Regex 101 is typically what I use, but there are many similar web-tools online that can be found from a google search.
Your pattern does not work because you are using negated character classes [^
The pattern [^#####]*[^\/] can be written as [^#]*[^\/] and matches optional chars other than # and then a single char other than /.
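To see what the original pattern actually matches, here is a short sketch (variable names are illustrative). Repeating # inside the class changes nothing, and the match stops one character after the run of non-# characters.

```javascript
// The pattern from the question and its simplified equivalent.
const original = /[^#####]*[^\/]/;
const simplified = /[^#]*[^\/]/;

const sample = 'some-text-123123#####abcdefg/';

// [^#]* consumes everything up to the first #, then [^\/] takes that # itself.
const originalMatch = sample.match(original)[0];
const simplifiedMatch = sample.match(simplified)[0];

console.log(originalMatch);   // 'some-text-123123#'
console.log(simplifiedMatch); // 'some-text-123123#'
```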
Here are some examples of other patterns that can give the same match.
At least 5 leading # chars and then matching 1+ word chars in a group and the / at the end of the string using an anchor $, or omit the anchor if that is not the case:
#####(\w+)\/$
Regex demo
If there should be a preceding character other than #
[^#]#####(\w+)\/$
(?<!#)#####(\w+)\/$
Regex demo
Matching at least 5 # chars and no # or / in between using a negated character class in this case:
#####([^#\/]+)\/
Or with lookarounds:
(?<=(?<!#)#####)[^#\/]+(?=\/)
Regex demo
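A sketch comparing the capture-group pattern with the lookaround pattern. The six-# input is my own test string, chosen to show where the two differ: only the lookaround version insists on exactly five #s.

```javascript
// Capture-group version and lookaround version from the answer above.
const withGroup = /#####([^#\/]+)\//;
const withLookarounds = /(?<=(?<!#)#####)[^#\/]+(?=\/)/;

const sample = 'some-STRING-413123#####456zxcv/';
const groupResult = sample.match(withGroup)[1];       // value is in group 1
const lookResult = sample.match(withLookarounds)[0];  // value is the full match

console.log(groupResult, lookResult); // 456zxcv 456zxcv

// Hypothetical input with six #s: the (?<!#) check rejects it.
const sixHashes = 'x######abc/';
console.log(withGroup.test(sixHashes));       // true
console.log(withLookarounds.test(sixHashes)); // false
```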
I have a requirement where I need a regex in which the input
should not repeat a letter
should only contain letters and commas
should not start or end with a comma
can contain more than 2 letters
example :-
A,B --- correct
A,B,C,D,E,F --- correct
D,D,A --- wrong
,B,C --- wrong
B,C, --- wrong
A,,B,C --- wrong
Can anyone help ?
Another idea with capturing and checking by use of a lookahead:
^(?:([A-Z])(?!.*?\1),?\b)+$
You can test here at regex101 if it meets your requirements.
If you don't want to match single characters, e.g. A, change the + quantifier to {2,}.
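A minimal sketch that runs this pattern over the examples from the question (names are illustrative). The lookahead (?!.*?\1) rejects a letter that occurs anywhere later in the string, so non-adjacent repeats such as A,B,A fail too.

```javascript
// Each letter must not reappear later; \b forces letter/comma alternation.
const noRepeats = /^(?:([A-Z])(?!.*?\1),?\b)+$/;

const cases = ['A,B', 'A,B,C,D,E,F', 'D,D,A', ',B,C', 'B,C,', 'A,,B,C', 'A,B,A'];
const verdicts = cases.map((s) => noRepeats.test(s));

console.log(verdicts); // [ true, true, false, false, false, false, false ]
```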
The statement of the question is incomplete in several respects. I have made the following assumptions:
1. Considering that D,D,A is incorrect I assume that a letter cannot be followed by a comma followed by the same letter.
2. The string may contain the same letter more than once as long as #1 is satisfied.
3. Considering that A,,B,C is incorrect I assume a comma cannot follow a comma.
4. Since the examples contain only capital letters I will assume that lower-case letters are not permitted (though one need only set the case-indifferent flag (i) to permit either case).
We observe that the requirements are satisfied if and only if the string begins with a capital letter and is followed by a sequence of comma-capital letter pairs, provided that no capital letter is followed by a comma followed by the same letter. We therefore can attempt to match the following regular expression.
^(?:([A-Z]),(?!\1))*[A-Z]$
Demo
The elements of the expression are as follows.
^ # match beginning of string
(?: # begin a non-capture group
([A-Z]) # match a capital letter and save to capture group 1
, # match a comma
(?!\1) # use negative lookahead to assert next character is not equal
# to the content of capture group 1
)* # end non-capture group and execute it zero or more times
[A-Z] # match a capital letter
$ # match end of string
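A short sketch of this expression on the question's examples (names are illustrative). Under the assumptions stated above, only adjacent repeats are rejected, so A,B,A is accepted here.

```javascript
// Rejects a letter only when the same letter follows the next comma.
const noAdjacentRepeats = /^(?:([A-Z]),(?!\1))*[A-Z]$/;

const cases = ['A,B', 'A,B,C,D,E,F', 'D,D,A', ',B,C', 'B,C,', 'A,,B,C', 'A,B,A'];
const verdicts = cases.map((s) => noAdjacentRepeats.test(s));

console.log(verdicts); // [ true, true, false, false, false, false, true ]
```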
Here is a big ugly regex solution:
var inputs = ['A,B', 'D,D,D', ',B,C', 'B,C,', 'A,,B'];
for (var i = 0; i < inputs.length; ++i) {
    if (/^(?!.*?([^,]+).*,\1(?:,|$))[^,]+(?:,[^,]+)*$/.test(inputs[i])) {
        console.log(inputs[i] + " => VALID");
    }
    else {
        console.log(inputs[i] + " => INVALID");
    }
}
The regex has two parts to it. It uses a negative lookahead to assert that no two CSV entries ever repeat in the input. Then, it uses a straightforward pattern to match any proper CSV delimited input. Here is an explanation:
^ from the start of the input
(?!.*?([^,]+).*,\1(?:,|$)) assert that no CSV element ever repeats
[^,]+ then match a CSV element
(?:,[^,]+)* followed by comma and another element, 0 or more times
$ end of the input
This one could suit your needs:
^(?!,)(?!.*,,)(?!.*(\b[A-Z]+\b).*\1)[A-Z,]+(?<!,)$
^: the start of the string
(?!,): should not be directly followed by a comma
(?!.*,,): should not be followed by two commas
(?!.*(\b[A-Z]+\b).*\1): should not be followed by a value found twice
[A-Z,]+: should contain letters and commas only
$: the end of the string
(?<!,): should not be directly preceded by a comma
See https://regex101.com/r/1kGVSB/1
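For a quick local check, here is a sketch that applies this pattern to the examples from the question (variable names are illustrative; the lookbehind needs ES2018 or later).

```javascript
// Three lookaheads up front, one lookbehind before the end anchor.
const listPattern = /^(?!,)(?!.*,,)(?!.*(\b[A-Z]+\b).*\1)[A-Z,]+(?<!,)$/;

const cases = ['A,B', 'A,B,C,D,E,F', 'D,D,A', ',B,C', 'B,C,', 'A,,B,C'];
const verdicts = cases.map((s) => listPattern.test(s));

console.log(verdicts); // [ true, true, false, false, false, false ]
```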
I have regex which works fine in my application, but it matches an empty string too, i.e. no error occurs when the input is empty. How do I modify this regex so that it will not match an empty string ? Note that I DON'T want to change any other functionality of this regex.
This is the regex which I'm using: ^([0-9\(\)\/\+ \-]*)$
I don't know a lot about regex formulation myself, which is why I'm asking. I have searched for an answer, but couldn't find a direct one. Closest I got to was this: regular expression for anything but an empty string in c#, but that doesn't really work for me ..
Replace "*" with "+", as "*" means "0 or more occurrences", while "+" means "at least one occurrence".
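A minimal sketch of the difference, using the regex from the question. The phone-like sample string is made up for illustration.

```javascript
// Original pattern (zero or more) and the fixed pattern (one or more).
const allowsEmpty = /^([0-9\(\)\/\+ \-]*)$/;
const requiresOne = /^([0-9\(\)\/\+ \-]+)$/;

console.log(allowsEmpty.test(''));  // true  (the unwanted behaviour)
console.log(requiresOne.test(''));  // false

// Non-empty input is treated the same by both patterns.
const sample = '+1 (555) 010-9999';
console.log(allowsEmpty.test(sample), requiresOne.test(sample)); // true true
```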
There are a lot of pattern types that can match empty strings. The OP regex belongs to an ^.*$ type, and it is easy to modify it to prevent empty string matching by replacing * (= {0,}) quantifier (meaning zero or more) with the + (= {1,}) quantifier (meaning one or more), as has already been mentioned in the posts here.
There are other pattern types matching empty strings, and it is not always obvious how to prevent them from matching empty strings.
Here are a few of those patterns with solutions:
[^"\\]*(?:\\.[^"\\]*)* ⇒ (?:[^"\\]|\\.)+
abc||def ⇒ abc|def (remove the extra | alternation operator)
^a*$ ⇒ ^a+$ (+ matches 1 or more chars)
^(a)?(b)?(c)?$ ⇒ ^(?!$)(a)?(b)?(c)?$ (the (?!$) negative lookahead fails the match if end of string is at the start of the string)
or ⇒ ^(?=.)(a)?(b)?(c)?$ (the (?=.) positive lookahead requires at least a single char; . may or may not match line break chars depending on modifiers/regex flavor)
^$|^abc$ ⇒ ^abc$ (remove the ^$ alternative that enables a regex to match an empty string)
^(?:abc|def)?$ ⇒ ^(?:abc|def)$ (remove the ? quantifier that made the (?:abc|def) group optional)
To keep \b(?:north|south)?(?:east|west)?\b (which matches north, south, east, west, northeast, northwest, southeast, southwest) from matching an empty string, the word boundaries must be made more precise: make the initial word boundary match only at the start of a word by adding (?<!\w) after it, and make the trailing word boundary match only at the end of a word by adding (?!\w) after it.
\b(?:north|south)?(?:east|west)?\b ⇒ \b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)
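Two of the fixes above, sketched in JavaScript. The sample sentence and variable names are my own.

```javascript
// Both groups are optional, so the loose pattern also matches "nothing"
// at every word boundary.
const loose = /\b(?:north|south)?(?:east|west)?\b/g;
const strict = /\b(?<!\w)(?:north|south)?(?:east|west)?\b(?!\w)/g;

const text = 'go northwest now';
const looseMatches = text.match(loose);
const strictMatches = text.match(strict);

console.log(looseMatches.includes('')); // true: empty matches are present
console.log(strictMatches);             // [ 'northwest' ]

// The (?!$) fix for the all-optional pattern.
const nonEmptyAbc = /^(?!$)(a)?(b)?(c)?$/;
console.log(nonEmptyAbc.test(''), nonEmptyAbc.test('ac')); // false true
```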
You can either use + or the {min,max} syntax:
^[0-9\(\)\/\+ \-]{1,}$
or
^[0-9\(\)\/\+ \-]+$
By the way: this is a great source for learning regular expressions (and it's fun): http://regexone.com/
Obviously you need to replace * with +, as + matches 1 or more characters. However, inside a character class you don't need all the escaping you're doing. Your regex can be simplified to:
^([0-9()\/+ -]+)$
filter: function(t){ return /^#\w+/.test(t.tweet_raw_text); },
If this JS returns tweets that start with a # symbol, how do I return tweets with a specific hashtag or word in them?
Everything I try just breaks it! It originates from this JS:
http://tweet.seaofclouds.com/
First let's break down the regular expression you have to see how it works.
/^#\w+/ - The slashes (/) at the beginning and end are just delimiters that tell JavaScript that this is a regular expression.
^ - matches the beginning of a string.
# - matches the literal # symbol.
\w - matches any alphanumeric character including underscore (short for [a-zA-Z0-9_]).
+ - is short for {1,}. Matches the previous character or expression (\w) one or more times.
That's how you match a tweet that starts with the # symbol. To match a tweet that contains a specific hashtag, you can replace the regular expression above with the specific hashtag you're trying to match.
For example, /#StackOverflow/.test(t.tweet_raw_text); will match a tweet that contains the exact hashtag #StackOverflow. That's a case-sensitive pattern though, so it wouldn't match the hashtag #stackoverflow. To make a JavaScript regular expression case insensitive, just add the i modifier after the closing delimiter like so: /#StackOverflow/i.
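A small sketch of these filters on a made-up list of tweets. The objects only mimic the tweet_raw_text property from the question. The trailing \b in `hasExactTag` is my own addition, an assumption that longer tags such as #StackOverflowing should not count.

```javascript
// Hypothetical sample data shaped like the objects in the question.
const tweets = [
  { tweet_raw_text: '#StackOverflow rocks' },
  { tweet_raw_text: 'Loving #stackoverflow today' },
  { tweet_raw_text: '#StackOverflowing is not the same tag' },
  { tweet_raw_text: 'No hashtag here' },
];

// Original filter: the tweet starts with any hashtag.
const startsWithHash = (t) => /^#\w+/.test(t.tweet_raw_text);
// Contains the hashtag anywhere, ignoring case.
const hasTag = (t) => /#StackOverflow/i.test(t.tweet_raw_text);
// Same, but the tag must end at a word boundary (assumption).
const hasExactTag = (t) => /#StackOverflow\b/i.test(t.tweet_raw_text);

console.log(tweets.filter(startsWithHash).length); // 2
console.log(tweets.filter(hasTag).length);         // 3
console.log(tweets.filter(hasExactTag).length);    // 2
```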
Thank you for the above reply - a good lesson in regular expressions.
I also got round the problem with this code:
filter: function(t){ return t.tweet_raw_text.indexOf("#EF1") !== -1; },
I have a small requirement: I want to search a string for an exact match.
Suppose I want to search for None_1. I am searching for 'None_1' using /None_1/, but it matches even "xxxNone", whereas my requirement is that it should match only None_[any digit].
Here is my code
/^None_+[0-9]{?}/
So it should match only None_1 , None_2
You should also anchor the expression at the end of the line. But that alone will not make it work. Your expression is wrong. I think it should be:
/^None_[0-9]+$/
^ matches the beginning of a line
None_ matches None_
[0-9]+ matches one or more digits
$ matches the end of a line
If you only want to match one digit, remove the +.
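A minimal sketch of the corrected expression and its single-digit variant. The test strings are made up for illustration.

```javascript
// Anchored at both ends: "None_" followed by one or more digits.
const noneWithDigits = /^None_[0-9]+$/;
// Variant without "+": exactly one digit.
const noneWithOneDigit = /^None_[0-9]$/;

const cases = ['None_1', 'None_22', 'xxxNone_1', 'None_', 'None_1x'];
const verdicts = cases.map((s) => noneWithDigits.test(s));

console.log(verdicts); // [ true, true, false, false, false ]
console.log(noneWithOneDigit.test('None_2'), noneWithOneDigit.test('None_22')); // true false
```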
Your original expression /^None_+[0-9]{?}/ worked like this:
^ matches the beginning of a line
None matches None
_+ matches one or more underscores
[0-9] matches one digit
{? matches an optional opening bracket {
} matches }
Try this:
/^None_+[0-9]{?}$/