Meaning of given regular expression in javascript [duplicate] - javascript

Please explain the meaning of the following regular expression in JavaScript, with a proper explanation:
/^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/

This is the meaning.
/^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/
1st Alternative: ^\b_((?:__|[\s\S])+?)_\b
^ assert position at start of the string
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
_ matches the character _ literally
1st Capturing group ((?:__|[\s\S])+?)
(?:__|[\s\S])+? Non-capturing group
Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
1st Alternative: __
__ matches the characters __ literally
2nd Alternative: [\s\S]
[\s\S] match a single character present in the list below
\s match any white space character [\r\n\t\f ]
\S match any non-white space character [^\r\n\t\f ]
_ matches the character _ literally
\b assert position at a word boundary (^\w|\w$|\W\w|\w\W)
2nd Alternative: ^\*((?:\*\*|[\s\S])+?)\*(?!\*)
^ assert position at start of the string
\* matches the character * literally
2nd Capturing group ((?:\*\*|[\s\S])+?)
(?:\*\*|[\s\S])+? Non-capturing group
Quantifier: +? Between one and unlimited times, as few times as possible, expanding as needed [lazy]
1st Alternative: \*\*
\* matches the character * literally
\* matches the character * literally
2nd Alternative: [\s\S]
[\s\S] match a single character present in the list below
\s match any white space character [\r\n\t\f ]
\S match any non-white space character [^\r\n\t\f ]
\* matches the character * literally
(?!\*) Negative Lookahead - Assert that it is impossible to match the regex below
\* matches the character * literally
You can explore this breakdown interactively at Regex 101.
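To see the two alternatives in action, here is a minimal sketch. The helper name `emphasizedText` and the sample inputs are made up for illustration; only the pattern comes from the question. It shows that capture group 1 is filled by the underscore alternative and group 2 by the asterisk alternative, and that `^` stops the pattern from matching later in the string.

```javascript
// The pattern from the question, unchanged.
const emphasis = /^\b_((?:__|[\s\S])+?)_\b|^\*((?:\*\*|[\s\S])+?)\*(?!\*)/;

// Hypothetical helper: returns the text between the delimiters, or null.
function emphasizedText(input) {
  const m = emphasis.exec(input);
  if (m === null) return null;
  // Group 1 belongs to the _..._ alternative, group 2 to the *...* alternative.
  return m[1] !== undefined ? m[1] : m[2];
}

console.log(emphasizedText("_hello_ world")); // "hello" (1st alternative)
console.log(emphasizedText("*hello* world")); // "hello" (2nd alternative)
console.log(emphasizedText("plain text"));    // null
console.log(emphasizedText("say *hi*"));      // null, because of the ^ anchor
```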

Related

regex to match question sentences in long text

I have a long text in form of a string.
This text includes a lot of questions that are at the same time the headers of sections.
These headers always start with a number+dot+whitespace character combination and end with a question mark, I am trying to extract these strings.
This is what I've got so far: longString.match(/\d\.\s+[a-zA-Z]+\s\\?/g).
Sure enough this doesn't work.
In your example you use [a-zA-Z]+, but you might extend that to matching 1 or more word characters using \w+
This part at the end of the pattern \s\\? matches an expected whitespace char followed by an optional backslash.
To match multiple words, you can optionally repeat the pattern to match a word preceded by 1 or more whitespace characters.
One option is to use
\d\.\s+\w+(?:\s+\w+)*\s*\?
Explanation
\d\. Match a single digit and a dot (for 1 or more digits use \d+)
\s+\w+ Match 1+ whitespace chars and 1+ word chars
(?:\s+\w+)* Optionally repeat 1+ whitespace chars and 1+ word chars
\s*\? Match 0+ whitespace chars and a question mark
Regex demo
A broader match might be matching at least a single time any char except a question mark or whitespace char after the digit, dot and whitespace:
\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?
Regex demo
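As a rough illustration of how the two patterns differ, the sketch below runs both against a made-up `longString` (the sample text is an assumption, not the asker's data). The second header contains an apostrophe, which `\w` does not match, so only the broader pattern picks it up.

```javascript
// Hypothetical sample text; only the two patterns come from the answer.
const longString = [
  "1. What is a regex?",
  "A pattern.",
  "2. Why doesn't it match?",
  "See below.",
].join("\n");

const wordsOnly = /\d\.\s+\w+(?:\s+\w+)*\s*\?/g;
const broader = /\d\.\s+[^\s?]+(?:\s+[^\s?]+)*\?/g;

const wordsOnlyHeaders = longString.match(wordsOnly);
const broaderHeaders = longString.match(broader);

console.log(wordsOnlyHeaders); // ["1. What is a regex?"]
console.log(broaderHeaders);   // ["1. What is a regex?", "2. Why doesn't it match?"]
```

Note that `\s+` also matches line breaks, so with either pattern a numbered line that does not end in a question mark could be joined to a following line that does.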

Regex to not allow special character without prefix or suffix

I am writing a regex to validate a string. I wrote the following regex.
^[^\s]+[a-z]{0,}(?!.* {2})[ a-zA-z]{0,}$
It validates that:
There is no space at the beginning.
No two consecutive spaces are allowed.
The problem is that it allows a single special character. It should not allow a special character unless it is suffixed or prefixed with an alphanumeric character.
Examples:
# -> not allowed.
#A or A# or A2 or 3A is allowed.
One option is to assert that the string does not contain a single "special" char or 2 special chars next to each other using a negative lookahead.
^(?!.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s])(?!.*(?:^| )[^a-zA-Z0-9\s](?!\S))\S+(?: \S+)*$
Explanation
^ Start of string
(?! Negative lookahead, assert that what is at the right does not contain
.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s] match 2 chars other than a-zA-Z0-9 or a whitespace char next to each other
) Close lookahead
(?! Negative lookahead, assert that what is at the right does not contain
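.*(?:^| )[^a-zA-Z0-9\s](?!\S) Match a single char other than a-zA-Z0-9 or a whitespace char that is at the start of the string or preceded by a space, and is not followed by a non-whitespace char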
) Close lookahead
\S+(?: \S+)* Match 1+ non whitespace chars and optionally repeat a space and 1+ non whitespace chars
$ End of string
Regex demo
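A quick way to check the pattern against the examples from the question is sketched below. The `samples` list adds two extra inputs (a double space and a leading space) that are assumptions based on the stated rules.

```javascript
// The pattern from the answer, unchanged (no g flag, so test() is stateless).
const valid = /^(?!.*[^a-zA-Z0-9\s][^a-zA-Z0-9\s])(?!.*(?:^| )[^a-zA-Z0-9\s](?!\S))\S+(?: \S+)*$/;

// Inputs from the question, plus a double space and a leading space.
const samples = ["#", "#A", "A#", "A2", "3A", "A  B", " A"];

const results = samples.map((s) => valid.test(s));
samples.forEach((s, i) => console.log(JSON.stringify(s), results[i]));
// "#" false
// "#A" true
// "A#" true
// "A2" true
// "3A" true
// "A  B" false
// " A" false
```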
Please omit the '$' symbol from the regex because it represents the end of the string.
^[^\s]+[a-z]{0,}(?!.* {2})[ a-zA-z]{0,}
So when applying the above regex to the following, it finds only '# '.
#A A# A2 3A

Regex string without double letter [duplicate]

What would be a valid regex for strings that don't contain a double letter?
My solution is: b(ab)*a + a(ba)*b.
But I don't think it is correct, because it doesn't include the strings a or b.
Can someone help me?
You can achieve this with a negative lookahead:
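// The g flag is dropped here: test() does not need it, and it makes repeated calls stateful.
const re = /^(?!.*?(.).*?\1)[a-z]*$/;
let s1 = "abcdefgh", s2 = "abcdefga";
console.log(re.test(s1)); // true: no character repeats
console.log(re.test(s2)); // false: "a" appears twice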
How it works:
/^(?!.*?(.).*?\1)[a-z]*$/g
^ asserts position at start of the string
Negative Lookahead (?!.*?(.).*?\1): Assert that the Regex below does not match
.*? matches any character (except for line terminators)
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed
1st Capturing Group (.)
. matches any character (except for line terminators)
.*? matches any character (except for line terminators)
*? Quantifier — Matches between zero and unlimited times, as few times as possible, expanding as needed
\1 matches the same text as most recently matched by the 1st capturing group
Match a single character present in the list below [a-z]*
* Quantifier — Matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
a-z a single character in the range between a (index 97) and z (index 122) (case sensitive)
$ asserts position at the end of the string, or before the line terminator right at the end of the string (if any)
Global pattern flags
g modifier: global. All matches (don't return after first match). It is unnecessary when the pattern is anchored with ^ and $ and only used with test().
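One practical caveat with the g flag shown on the pattern above: a global regex keeps a `lastIndex` between `test()` calls, so reusing the same object on several strings can give surprising results. The sketch below (variable names are illustrative) contrasts the flagged and unflagged versions.

```javascript
const stateful = /^(?!.*?(.).*?\1)[a-z]*$/g;
const first = stateful.test("abcdefgh"); // true; lastIndex is now 8
const second = stateful.test("abc");     // false: the search starts at index 8, past the end of "abc"
console.log(first, second);

// Without g, every call starts at index 0.
const stateless = /^(?!.*?(.).*?\1)[a-z]*$/;
console.log(stateless.test("abcdefgh")); // true
console.log(stateless.test("abc"));      // true
console.log(stateless.test("abcdefga")); // false: "a" appears twice
```

Note that this pattern rejects any character that appears twice anywhere in the string, not only adjacent doubles.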

How to match any string that contains no consecutively repeating letter

My regular expression should match if there aren't any consecutive letters that are the same.
for example :
"ploplir" should match
"ploppir" should not match
so I use this regular expression:
/([.])\1{1,}/
But it does the exact opposite of what I want. How can I make the match work correctly?
Code
See regex in use here
\b(?!\w*(\w)\1)\w+\b
var r = /\b(?!\w*(\w)\1)\w+\b/g
var s = "ploplir ploppir"
console.log(s.match(r))
Explanation
\b Assert position as a word boundary
(?!\w*(\w)\1) Negative lookahead ensuring what follows doesn't match
\w* Match any number of word characters
(\w) Capture a word character into capture group 1
\1 Match the same text as most recently matched by the 1st capture group
\w+ Match one or more word characters
\b Assert position as a word boundary
Maybe you could use lookarounds to check if there are no consecutive letters in the string:
^(?!.*(.)(?=\1)).*$
Explanation
From the beginning of the string ^
A negative look ahead (?!
Which asserts that following .* a character (.) is not followed by the same character (?=\1) using the group reference \1
Close the negative lookahead
Match zero or more characters .*
The end of the string
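The whole-string lookaround pattern can be checked against the two words from the question as sketched below. The second function, `hasNoDouble`, is a hypothetical alternative that is not part of either answer: it matches a doubled character with `(.)\1` and negates the result in code instead of in the regex.

```javascript
// Pattern from the answer: succeeds only if no character is followed by itself.
const noAdjacentRepeat = /^(?!.*(.)(?=\1)).*$/;

console.log(noAdjacentRepeat.test("ploplir")); // true
console.log(noAdjacentRepeat.test("ploppir")); // false ("pp")

// Hypothetical alternative: find a doubled character, then negate in code.
function hasNoDouble(s) {
  return !/(.)\1/.test(s);
}

console.log(hasNoDouble("ploplir")); // true
console.log(hasNoDouble("ploppir")); // false
```

The asker's original `/([.])\1{1,}/` fails because `[.]` inside a character class matches a literal dot rather than any character.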

Javascript REGEX - EXACT TEXT AND DIGIT

I am looking for a regex that exactly matches the text 'PDR' or 'pdr' followed by 8 digits, so 11 characters altogether (3 letters + 8 digits).
pdr16120008 - TRUE
PDR16120009 -TRUE
rdp16120001- FALSE
Regex101
^(pdr|PDR)\d{8}$
Debuggex Demo
Description
^ asserts position at start of a line
1st Capturing Group (pdr|PDR)
1st Alternative pdr
pdr matches the characters pdr literally (case sensitive)
2nd Alternative PDR
PDR matches the characters PDR literally (case sensitive)
\d{8} matches a digit (equal to [0-9])
{8} Quantifier — Matches exactly 8 times
$ asserts position at the end of a line
The regex isn't difficult:
match pdr literally at the start using the ^ anchor
match exactly 8 digits using a character class and a quantifier
use the i modifier in order to match case-insensitively.
const REGEX = /^pdr[0-9]{8}$/i;
let valids = ['pdr16120008', 'PDR16120009', 'rdp16120001']
  .filter(input => REGEX.test(input));
console.log({valids});
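The two answers are not quite equivalent, which the sketch below tries to make visible. The mixed-case input `Pdr16120008` and the 7-digit input are made-up examples. The alternation `(pdr|PDR)` accepts only all-lowercase or all-uppercase prefixes, while the `i` flag also accepts mixed case.

```javascript
const strictCase = /^(pdr|PDR)\d{8}$/;
const anyCase = /^pdr[0-9]{8}$/i;

// Inputs from the question plus two hypothetical ones.
const inputs = ["pdr16120008", "PDR16120009", "rdp16120001", "Pdr16120008", "pdr1612000"];

const comparison = inputs.map((s) => ({
  input: s,
  strictCase: strictCase.test(s),
  anyCase: anyCase.test(s),
}));
console.log(comparison);
// Only "Pdr16120008" differs: strictCase is false, anyCase is true.
```

Which behaviour is wanted depends on whether mixed-case prefixes should count as valid, which the question does not say.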
