Regex to not allow special character without prefix or suffix - javascript

I was writing a regex to validate a string, and came up with the following:
^[^\s]+[a-z]{0,}(?!.* {2})[ a-zA-z]{0,}$
It validates that there is:
No space at the beginning.
No two consecutive spaces.
The problem is that it allows a single special character. It should not allow a special character unless it is prefixed or suffixed with an alphanumeric character.
Examples:
# -> not allowed.
#A or A# or A2 or 3A is allowed.

One option is to assert that the string does not contain a single "special" char or 2 special chars next to each other using a negative lookahead.
^(?!.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s])(?!.*(?:^| )[^a-zA-Z0-9\s](?!\S))\S+(?: \S+)*$
Explanation
^ Start of string
(?! Negative lookahead, assert that what is at the right does not contain
.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s] match 2 chars other than a-zA-Z0-9 or a whitespace char next to each other
) Close lookahead
(?! Negative lookahead, assert that what is at the right does not contain
.*(?:^| )[^a-zA-Z0-9\s](?!\S) Match a single char other than a-zA-Z0-9 or a whitespace char that is at the start of the string or after a space, and is not followed by a non-whitespace char
) Close lookahead
\S+(?: \S+)* Match 1+ non whitespace chars and optionally repeat a space and 1+ non whitespace chars
$ End of string
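As a quick check of the pattern above, here is a minimal sketch in JavaScript. The helper name `isValid` and the sample inputs are illustrative assumptions, not part of the original answer.

```javascript
// Pattern from the answer above, written as a JavaScript regex literal.
const pattern = /^(?!.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s])(?!.*(?:^| )[^a-zA-Z0-9\s](?!\S))\S+(?: \S+)*$/;

// Hypothetical helper name, used only for this illustration.
function isValid(input) {
  return pattern.test(input);
}

// A lone special char, two adjacent special chars, a leading space and
// a double space should be rejected; the other samples should pass.
const samples = ["#", "#A", "A#", "A2", "3A", "A # B", "##A", " A", "A  B"];
for (const s of samples) {
  console.log(JSON.stringify(s), isValid(s));
}
```

The two negative lookaheads run from the start of the string, so they reject the input before `\S+(?: \S+)*` consumes anything.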

Please omit the '$' symbol from the regex, because it asserts the end of the string.
^[^\s]+[a-z]{0,}(?!.* {2})[ a-zA-z]{0,}
So when applying the above regex to the following, it finds only '# '.
#A A# A2 3A

Related

Using lookahead, how to ensure at least 4 alphanumeric chars are included + underscores

I'm trying to make sure that at least 4 alphanumeric characters are included in the input, and that underscores are also allowed.
The regular-expressions tutorial is a bit over my head because it talks about assertions and success/failure if there is a match.
^\w*(?=[a-zA-Z0-9]{4})$
my understanding:
\w --> alphanumeric + underscore
* --> matches the previous token between zero and unlimited times ( so, this means it can be any character that is alphanumeric/underscore, correct?)
(?=[a-zA-Z0-9]{4}) --> looks ahead of the previous characters, and if they include at least 4 alphanumeric characters, then I'm good.
Obviously I'm wrong on this, because regex101 is showing me no matches.
You want 4 or more alphanumeric characters, surrounded by any number of underscores (use ^ and $ to ensure it matches the whole input):
^(_*[a-zA-Z0-9]_*){4,}$
Your pattern ^\w*(?=[a-zA-Z0-9]{4})$ does not match because:
^\w* Matches optional word characters from the start of the string, and if there are only word chars it will match until the end of the string
(?=[a-zA-Z0-9]{4}) The positive lookahead is true if it can assert 4 consecutive alphanumeric chars to the right of the current position. The \w* allows backtracking, and can backtrack 4 positions so that the assertion is true.
But the $ asserts the end of the string, which cannot match because the position moved 4 steps to the left to fulfill the preceding positive lookahead assertion.
Using the lookahead, what you can do is assert 4 alphanumeric chars preceded by optional underscores.
If the assertion is true, match 1 or more word characters.
^(?=(?:_*[a-zA-Z0-9]){4})\w+$
The pattern matches:
^ Start of string
(?= Positive lookahead, assert that what is to the right is
(?:_*[a-zA-Z0-9]){4} Repeat 4 times matching optional _ followed by an alphanumeric char
) Close the lookahead
\w+ Match 1+ word characters (which includes the _)
$ End of string
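To make the difference concrete, the following sketch runs the original attempt and the lookahead-first fix side by side in JavaScript. The variable names and the sample strings are assumptions made for this illustration.

```javascript
// The original attempt: once \w* backtracks so the lookahead can see four
// alphanumerics, the $ anchor is no longer at the end of the string.
const original = /^\w*(?=[a-zA-Z0-9]{4})$/;

// The suggested fix: assert from the start, then consume the whole string.
const fixed = /^(?=(?:_*[a-zA-Z0-9]){4})\w+$/;

const samples = ["abcd", "a_b_c_d", "__a1b2__", "ab_c", "____", "abcd!"];
for (const s of samples) {
  console.log(JSON.stringify(s), "original:", original.test(s), "fixed:", fixed.test(s));
}
```

Because a lookahead consumes nothing, placing it right after `^` lets `\w+$` still match the full input.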
I suggest using atomic groups (?>...), please see regex tutorial for details
^(?>_*[a-zA-Z0-9]_*){4,}$
to ensure 4 or more fragments each of them containing letter or digit.
Edit: If the regex engine doesn't support atomic groups, try a plain non-capturing group instead:
^(?:_*[A-Za-z0-9]_*){4,}$
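As far as I know, JavaScript's built-in RegExp is one of the engines that does not accept the atomic-group syntax, so the non-capturing form is the one to use there. The sketch below is hedged accordingly: `supportsAtomicGroups` is a hypothetical helper that probes the engine rather than assuming the answer.

```javascript
// Non-capturing version from the answer above.
const nonCapturing = /^(?:_*[A-Za-z0-9]_*){4,}$/;

// Probe whether the engine accepts the atomic-group syntax (?>...).
// Building the pattern from a string lets us catch the SyntaxError
// instead of failing when the script is parsed.
function supportsAtomicGroups() {
  try {
    new RegExp("(?>a)");
    return true;
  } catch (err) {
    return false;
  }
}

console.log("atomic groups supported:", supportsAtomicGroups());
for (const s of ["ab_cd", "_a_b_c_d_", "abc", "a-bcd"]) {
  console.log(JSON.stringify(s), nonCapturing.test(s));
}
```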

JavaScript regex with below rules

Need to create a regex for a string with below criteria
Allowable characters:
uppercase A to Z A-Z
lowercase a to z a-z
hyphen `
apostrophe '
single quote '
space
full stop .
numerals 0 to 9 0-9
Validations:
Must start with an alphabetic character a-zA-Z or apostrophe
Cannot have consecutive non-alpha characters except for a full stop followed by a space.
The regex I have is from a previous question on this forum. The business came back and wants to allow strings starting with an apostrophe as well as [a-zA-Z]. This breaks some of the previous validations.
eg: a1rte is valid
'tyer4 is valid
'4rt is invalid
^(?!.*[0-9'`\.\s-]{2})[a-zA-Z][a-zA-Z0-9-`'.\s]+$
Please advise.
You might use
^(?=[a-zA-Z0-9`'. -]+$)(?!.*[0-9'` -]{2})[a-zA-Z'][^\r\n.]*(?:\.[ a-z][^\r\n.]*)*$
Explanation
^ Start of string
(?=[a-zA-Z0-9`'. -]+$) Assert only allowed characters
(?!.*[0-9'` -]{2}) Assert not 2 consecutive listed characters
[a-zA-Z'] Match either a char a-zA-Z or apostrophe
[^\r\n.]* Optionally match any char except a newline or a dot
(?:\.[ a-z][^\r\n.]*)* Optionally repeat matching a dot only followed by a space or char a-z
$ End of string
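The three examples from the question can be checked against this pattern with a short JavaScript sketch. The name `namePattern` and the extra samples beyond the question's own three are assumptions for illustration.

```javascript
// Pattern from the answer above as a regex literal.
const namePattern = /^(?=[a-zA-Z0-9`'. -]+$)(?!.*[0-9'` -]{2})[a-zA-Z'][^\r\n.]*(?:\.[ a-z][^\r\n.]*)*$/;

const samples = [
  "a1rte",     // from the question: valid
  "'tyer4",    // from the question: valid
  "'4rt",      // from the question: invalid (apostrophe followed by a digit)
  "Mr. Smith", // full stop followed by a space
  "a--b",      // two consecutive hyphens
  "1abc",      // starts with a digit
];
for (const s of samples) {
  console.log(JSON.stringify(s), namePattern.test(s));
}
```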

How to combine a few regex expressions in JavaScript?

I have a few requirements to validate email address and have a few regex expressions for that purpose.
But how I can combine them to one regex to make it work correctly?
(?=^\\S+#\\S+\\.\\S+$)
Dotted email address
(?=^[A-Za-z0-9#$\\._-]+$)
allows alphanumeric, '#', '.', '_', '-'
(?=^[A-Za-z0-9].*$)
cannot start with special character
(?=^.{5,100}$)
between 5 and 100 characters
(?=^((?!([0-9]{9,}\\1)).)*$)
Not nine or more numbers
(^(?!.*[#].*[#]).*$)
One at mark
Try concatenating your expressions:
const emailRegex = /(?=^\S+#\S+\.\S+$)(?=^[A-Za-z0-9#$\._-]+$)(?=^[A-Za-z0-9].*$)(?=^.{5,100}$)(?=^((?!([0-9]{9,}\1)).)*$)(^(?!.*[#].*[#]).*$)/;
There are a lot of custom rules using lookaheads, from which you can omit a few by matching instead of asserting.
^(?=\S{5,100}$)(?!\S*\d{9})[A-Za-z0-9][A-Za-z0-9$\\._-]*#[A-Za-z0-9$\\._-]+\.[A-Za-z0-9]+$
^ Start of string
(?=\S{5,100}$) Assert 5-100 non-whitespace chars
(?!\S*\d{9}) Assert not 9 consecutive digits in the string
[A-Za-z0-9] Match a single char A-Z a-z or a digit
[A-Za-z0-9$\\._-]* Optionally repeat what is listed in the character class
# Match an # char (Note that you can omit the square brackets [#])
[A-Za-z0-9$\\._-]+ Match 1+ times any of the listed in the character class
\.[A-Za-z0-9]+ Match a . and 1+ times any of the listed in the character class
$ End of string
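Here is a minimal JavaScript sketch of the combined pattern. It assumes that the doubled backslash in `\\.` is string escaping for a literal dot, so the character class is written as `[A-Za-z0-9$._-]` in the regex literal; `emailPattern` and the sample addresses are likewise illustrative.

```javascript
// Assumption: "\\." in the pattern above is string escaping for a literal
// dot, so a regex literal only needs "." inside the character class.
const emailPattern = /^(?=\S{5,100}$)(?!\S*\d{9})[A-Za-z0-9][A-Za-z0-9$._-]*#[A-Za-z0-9$._-]+\.[A-Za-z0-9]+$/;

const samples = [
  "user#example.com",          // ordinary address
  "a#b.c",                     // exactly 5 characters
  "a#b",                       // too short and no dot
  "_user#example.com",         // starts with a special character
  "user123456789#example.com", // nine consecutive digits
  "user#exam#ple.com",         // two # marks
];
for (const s of samples) {
  console.log(JSON.stringify(s), emailPattern.test(s));
}
```

Since `#` is matched exactly once and is not part of either character class, the separate "one at mark" lookahead is no longer needed.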

regex to match question sentences in long text

I have a long text in form of a string.
This text includes a lot of questions that are at the same time the headers of sections.
These headers always start with a number+dot+whitespace character combination and end with a question mark, I am trying to extract these strings.
This is what I've got so far: longString.match(/\d\.\s+[a-zA-Z]+\s\\?/g).
Sure enough this doesn't work.
In your example you use [a-zA-Z]+, but you might extend that to matching 1 or more word characters using \w+
This part at the end of the pattern \s\\? matches an expected whitespace char followed by an optional backslash.
To match multiple words, you can optionally repeat the pattern to match a word preceded by 1 or more whitespace characters.
One option is to use
\d\.\s+\w+(?:\s+\w+)*\s*\?
Explanation
\d\. Match a single digit (for 1 or more digits use \d+) and a .
\s+\w+ Match 1+ whitespace chars and 1+ word chars
(?:\s+\w+)* Optionally repeat 1+ whitespace chars and 1+ word chars
\s*\? Match 0+ whitespace chars and a question mark.
A broader match might be matching at least a single time any char except a question mark or whitespace char after the digit, dot and whitespace:
\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?
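Both patterns can be tried with `String.prototype.match` and the `g` flag, as in the question. The sample texts and variable names in this sketch are made up for illustration.

```javascript
// Made-up sample text: numbered question headers mixed with ordinary prose.
const longString =
  "Intro text. 1. What is a regex? It is a pattern. 2. How does it work? Read on.";

// Word-based pattern and the broader variant from the answer above.
const wordPattern = /\d\.\s+\w+(?:\s+\w+)*\s*\?/g;
const broadPattern = /\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?/g;

// match() returns null when nothing is found, so fall back to an empty array.
const headers = longString.match(wordPattern) || [];
console.log(headers);

// Punctuation inside the question defeats \w+ but not the broader variant.
const tricky = "3. What's new, really?";
const narrowResult = tricky.match(wordPattern) || [];
const broadResult = tricky.match(broadPattern) || [];
console.log(narrowResult, broadResult);
```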

How to match any string that contains no consecutively repeating letter

My regular expression should match if there aren't any consecutive letters that are the same.
for example :
"ploplir" should match
"ploppir" should not match
so I use this regular expression:
/([.])\1{1,}/
But it does the exact opposite of what I want. How can I make the match work correctly?
Code
\b(?!\w*(\w)\1)\w+\b
var r = /\b(?!\w*(\w)\1)\w+\b/g
var s = "ploplir ploppir"
console.log(s.match(r))
Explanation
\b Assert position as a word boundary
(?!\w*(\w)\1) Negative lookahead ensuring what follows doesn't match
\w* Match any number of word characters
(\w) Capture a word character into capture group 1
\1 Match the same text as most recently matched by the 1st capture group
\w+ Match one or more word characters
\b Assert position as a word boundary
Maybe you could use lookarounds to check if there are no consecutive letters in the string:
^(?!.*(.)(?=\1)).*$
Explanation
From the beginning of the string ^
A negative look ahead (?!
Which asserts that following .* a character (.) is not followed by the same character (?=\1) using the group reference \1
Close the negative lookahead
Match zero or more characters .*
The end of the string $
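A short JavaScript sketch of this whole-string check, using the two words from the question. The helper name `hasNoRepeats` and the extra samples are assumptions for illustration.

```javascript
// Whole-string pattern from the answer above.
const noRepeatPattern = /^(?!.*(.)(?=\1)).*$/;

// Hypothetical helper: true when no character is immediately repeated.
function hasNoRepeats(input) {
  return noRepeatPattern.test(input);
}

for (const s of ["ploplir", "ploppir", "abab", "aa"]) {
  console.log(JSON.stringify(s), hasNoRepeats(s));
}
```

Note that `.` matches any character except a line terminator here, so this checks repeated digits and punctuation as well as letters.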
