Regex to capture all vars between delimiters - javascript

How do I capture all of 1, 2 and 3 in the string not |1|2|3|?
My regex \|(.*?)\| skips 2.
const re = /\|(.*?)\|/gi;
const text = 'not |1|2|3|'
console.log(text.match(re).map(m => m[1]));

You can use
const re = /\|([^|]*)(?=\|)/g;
const text = 'not |123|2456|37890|'
console.log(Array.from(text.matchAll(re), m => m[1]));
Details:
\| - a | char
([^|]*) - Group 1: zero or more chars other than |
(?=\|) - a positive lookahead that matches a location that is immediately followed with |.
If you do not care about matching the | on the right, you can remove the lookahead.
If you also need to match till the end of string when the trailing | is missing, you can use /\|([^|]*)(?=\||$)/g.
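To see why the original pattern skips 2, here is a minimal sketch comparing it with the lookahead versions (the variable names are illustrative only). The pattern from the question consumes the closing |, so that character can no longer serve as the opening delimiter of the next item.

```javascript
const text = 'not |1|2|3|';

// Pattern from the question: the closing | is consumed by each match.
const consuming = Array.from(text.matchAll(/\|(.*?)\|/g), m => m[1]);

// Lookahead version: the closing | is only checked, not consumed.
const withLookahead = Array.from(text.matchAll(/\|([^|]*)(?=\|)/g), m => m[1]);

// Variant that also accepts end of string when the trailing | is missing.
const toEnd = Array.from('not |1|2|3'.matchAll(/\|([^|]*)(?=\||$)/g), m => m[1]);

console.log(consuming);     // => [ '1', '3' ]
console.log(withLookahead); // => [ '1', '2', '3' ]
console.log(toEnd);         // => [ '1', '2', '3' ]
```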

Related

How do I replace the last character of the selected regex?

I want this string {Rotation:[45f,90f],lvl:10s} to turn into {Rotation:[45,90],lvl:10}.
I've tried this:
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d)\w+/g
console.log(bar.replace(regex, '$&'.substring(0, -1)))
I've also tried to just select the letter at the end using $ but I can't seem to get it right.
You can use
bar.replace(/(\d+)[a-z]\b/gi, '$1')
Here,
(\d+) - captures one or more digits into Group 1
[a-z] - matches any letter
\b - at the word boundary, i.e. at the end of the word
gi - all occurrences, case insensitive
The replacement is Group 1 value, $1.
See the JavaScript demo:
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d+)[a-z]\b/gi
console.log(bar.replace(regex, '$1'))
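As a side note on why the attempt in the question fails: '$&'.substring(0, -1) is evaluated before replace ever runs, and substring treats a negative index as 0, so the replacement is just an empty string. If you prefer trimming the match itself, a callback works too. This is only a sketch; the variable names are made up for illustration.

```javascript
const bar = '{Rotation:[45f,90f],lvl:10s}';

// Evaluated up front: substring(0, -1) is the same as substring(0, 0).
const replacement = '$&'.substring(0, -1);
console.log(JSON.stringify(replacement)); // => ""

// So the original call removes every digit run together with its suffix.
const attempt = bar.replace(/(\d)\w+/g, replacement);
console.log(attempt); // => {Rotation:[,],lvl:}

// Callback alternative: drop the last character of each match at replace time.
const fixed = bar.replace(/\d+[a-z]\b/gi, m => m.slice(0, -1));
console.log(fixed); // => {Rotation:[45,90],lvl:10}
```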
Check this out:
const str = `{Rotation:[45f,90f],lvl:10s}`.split('');
const x = str.splice(str.length - 2, 1)
console.log(str.join(''));
You can use positive lookahead to match the closing brace, but not capture it. Then the single character can be replaced with a blank string.
const bar= '{Rotation:[45f,90f],lvl:10s}'
const regex = /.(?=})/g
console.log(bar.replace(regex, ''))
// => {Rotation:[45f,90f],lvl:10}
The following regex matches each group of one or more digits followed by f or s.
$1 represents the contents captured by the capture group (\d+).
const bar = `{Rotation:[45f,90f],lvl:10s}`
const regex = /(\d+)[fs]/g
console.log(bar.replace(regex, '$1'))

What will be the best regex Expression for censoring email?

Hello, I am stuck on censoring an email address in a specific format and can't work out how to do it.
Email : exampleEmail#example.com
Required : e***********#e******.com
I need to get this in JavaScript.
The code I am currently using to censor:
const email = 'exampleEmail#example.com';
const regex = /(?<!^).(?!$)/g;
const censoredEmail = email.replace(regex, '*');
Output: e**********************m
Please help me get e***********#e******.com
You can use
const email = 'exampleEmail#example.com';
const regex = /(^.|#[^#](?=[^#]*$)|\.[^.]+$)|./g;
const censoredEmail = email.replace(regex, (x, y) => y || '*');
console.log(censoredEmail );
// => e***********#e******.com
Details:
( - start of Group 1:
^.| - start of string and any one char, or
#[^#](?=[^#]*$)| - a # and any one char other than # that are followed with any chars other than # till end of string, or
\.[^.]+$ - a . and then any one or more chars other than . till end of string
) - end of group
| - or
. - any one char.
The (x, y) => y || '*' replacement means the matches are replaced with Group 1 value if it matched (participated in the match) or with *.
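For reuse, the same idea can be wrapped in a small helper. This is a sketch only: censorEmail is a hypothetical name, and it keeps the # separator from the question rather than assuming a real @ address.

```javascript
// Hypothetical helper around the regex above.
// Group 1 holds the parts to keep; any other single char becomes '*'.
function censorEmail(email) {
  const regex = /(^.|#[^#](?=[^#]*$)|\.[^.]+$)|./g;
  // `keep` is undefined when Group 1 did not participate in the match.
  return email.replace(regex, (match, keep) => keep || '*');
}

console.log(censorEmail('exampleEmail#example.com')); // => e***********#e******.com
console.log(censorEmail('ab#cd.io'));                 // => a*#c*.io
```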
If there should be a single # present in the string, you can capture all the parts of the string and do the replacement on the specific groups.
^ Start of string
([^\s#]) Capture the first char other than a whitespace char or # that should be unmodified
([^\s#]*) Capture optional repetitions of the same
# Match literally
([^\s#]) Capture the first char other than a whitespace char or # after it that should be unmodified
([^\s#]*) Capture optional repetitions of the same
(\.[^\s.#]+) Capture a dot and 1+ other chars than a dot, # or whitespace char that should be unmodified
$ End of string
In the replacement use all 5 capture groups, where you replace group 2 and 4 with *.
const regex = /^([^\s#])([^\s#]*)#([^\s#])([^\s#]*)(\.[^\s.#]+)$/;
[
"exampleEmail#example.com",
"test"
].forEach(email =>
console.log(
email.replace(regex, (_, g1, g2, g3, g4, g5) =>
`${g1}${"*".repeat(g2.length)}#${g3}${"*".repeat(g4.length)}${g5}`)
)
);

Regex matches numbers in date but shouldn't

Why does my regex pattern match the date part of the string? It seems like I'm not accounting for the / (slash) correctly with [^\/] to keep the pattern from matching date strings.
const reg = new RegExp(
/(USD|\$|EUR|€|USDC|USDT)?\s?(\d+[^\/]|\d{1,3}(,\d{3})*)(\.\d+)?(k|K|m|M)?\b/,
"i"
);
const str = "02/22/2021 $50k";
console.log(reg.exec(str));
// result: ['02', undefined, '02', undefined, undefined, undefined, undefined, index: 0, input: '02/22/2021 $50k', groups: undefined]
// was expecting: [$50k,...]
You get those matches for the date part, and the undefined ones, because your pattern contains optional parts and alternations (|).
In your pattern there is this part: (\d+[^\/]|\d{1,3}(,\d{3})*). The first alternative, \d+[^\/], matches 1+ digits followed by any char except a / (which can itself be a digit), so it needs at least 2 characters. That alternative will match 02, 22 and 2021 in the date part.
If there is 1 digit, the second part of the alternation will match it.
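A quick way to check this is to run the first alternative on its own against the date. This is just an illustrative sketch: \d+ backtracks by one digit so that [^\/] can take the last digit of each number.

```javascript
const dateOnly = '02/22/2021';

// First alternative in isolation: 1+ digits, then one char that is not a slash.
const firstAlternative = dateOnly.match(/\d+[^\/]/g);
console.log(firstAlternative); // => [ '02', '22', '2021' ]

// A single digit cannot satisfy it, because two characters are required.
console.log('7'.match(/\d+[^\/]/g)); // => null
```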
If you want to match plain numbers as well, you can assert that there is no / to the left and to the right, and put the currency alternatives (USD and so on) together with the optional whitespace char into a single optional group, so that a lone whitespace char is not matched before bare digits.
The last alternation can be shortened to a character class, [km]?, when the case insensitive flag is used.
Note that lookbehind, (?<!...), is not available in every JavaScript environment, so check support for your targets.
(?:(?:USD|\$|EUR|€|USDC|USDT)\s?)?(?<!\/)\b(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)(?!\/)[KkMm]?\b
const reg = /(?:(?:USD|\$|EUR|€|USDC|USDT)\s?)?(?<!\/)\b(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)(?!\/)[KkMm]?\b/gi;
const str = "02/22/2021 $50k 1,213.3 11111111 $50,000 $50000"
const res = Array.from(str.matchAll(reg), m => m[0]);
console.log(res)
If the currency is not optional:
(?:USD|\$|EUR|€|USDC|USDT)\s?(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)[KkMm]?\b
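A short sketch of how that second pattern behaves on the same sample string as above (the variable names are illustrative). Since the currency is required here, the date and the bare numbers are skipped without any lookarounds.

```javascript
const requiredCurrency =
  /(?:USD|\$|EUR|€|USDC|USDT)\s?(?:\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+)[KkMm]?\b/g;
const sample = '02/22/2021 $50k 1,213.3 11111111 $50,000 $50000';

// Only amounts that carry a currency prefix are returned.
const amounts = Array.from(sample.matchAll(requiredCurrency), m => m[0]);
console.log(amounts); // => [ '$50k', '$50,000', '$50000' ]
```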
I don't fully understand your regex, so I tried to work out what result you expect. Check this: each part of your string ends up in its own group.
const regex = /(\d{2})*\/?(\d{2})\/(\d{2,4})?\s*(USD|\$|EUR|€|USDC|USDT)?(\d*)(k|K|m|M)?\b/i
const regexNamed = /(?<day>\d{2})*\/?(?<month>\d{2})\/(?<year>\d{2,4})?\s*(?<currency>USD|\$|EUR|€|USDC|USDT)?(?<value>\d*)(?<unit>k|K|m|M)?\b/i
const str1 = '02/22/2021 $50k'
const str2 = '02/2021 €50m'
const m1 = str1.match(regex)
const m2 = str2.match(regexNamed)
console.log(m1)
console.log(m2.groups)

Setting the end of the match

I have the following string:
[TITLE|prefix=a] [STORENAME|prefix=b|suffix=c] [DYNAMIC|limit=10|random=0|reverse=0]
And I would like to get the value of the prefix of TITLE, which is a.
I have tried it with (?<=TITLE|)(?<=prefix=).*?(?=]|\|) and that seems to work but that gives me also the prefix of STORENAME (b). So if [TITLE|prefix=a] will be missing in the string, I'll have the wrong value.
So I need to set the end of the match with ] that belongs to [TITLE. Please notice that this string is dynamic. So it could be [TITLE|suffix=x|prefix=y] as well.
const regex = "[TITLE|prefix=a] [STORENAME|prefix=b|suffix=c] [DYNAMIC|limit=10|random=0|reverse=0]".match(/(?<=TITLE|)(?<=prefix=).*?(?=]|\|)/);
console.log(regex);
You can use
(?<=TITLE(?:\|suffix=[^\]|]+)?\|prefix=)[^\]|]+
Details:
(?<=TITLE(?:\|suffix=[^\]|]+)?\|prefix=) - a location in the string immediately preceded with TITLE|prefix= or TITLE|suffix=...|prefix=
[^\]|]+ - one or more chars other than ] and |.
See JavaScript demo:
const texts = ['[TITLE|prefix=a] [STORENAME|prefix=b|suffix=c] [DYNAMIC|limit=10|random=0|reverse=0]', '[TITLE|suffix=s|prefix=a]'];
for (let s of texts) {
console.log(s, '=>', s.match(/(?<=TITLE(?:\|suffix=[^\]|]+)?\|prefix=)[^\]|]+/)[0]);
}
You could also use a capturing group
\[TITLE\|(?:\w+=\w+\|)*prefix=(\w+)[^\]]*]
Explanation
\[TITLE\| Match [TITLE|
(?:\w+=\w+\|)* Repeat 0+ occurrences wordchars = wordchars and |
prefix= Match literally
(\w+) Capture group 1, match 1+ word chars
[^\]]* Match 0+ chars other than ]
] Match the closing ]
const regex = /\[TITLE\|(?:\w+=\w+\|)*prefix=(\w+)[^\]]*\]/g;
const str = `[TITLE|prefix=a] [STORENAME|prefix=b|suffix=c] [DYNAMIC|limit=10|random=0|reverse=0]
[TITLE|suffix=x|prefix=y]`;
let m;
while ((m = regex.exec(str)) !== null) {
console.log(m[1]);
}
Or with a negated character class instead of \w
\[TITLE\|(?:[^|=\]]*=[^|=\]]*\|)*prefix=([^|=\]]*)[^\]]*]
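A minimal sketch of using the negated character class version with a null check, so that a string without a TITLE prefix does not throw. The titlePrefix helper is a hypothetical name, not something from the question.

```javascript
const titleRe = /\[TITLE\|(?:[^|=\]]*=[^|=\]]*\|)*prefix=([^|=\]]*)[^\]]*]/;

// Hypothetical helper: returns the TITLE prefix, or null when there is none.
function titlePrefix(s) {
  const m = s.match(titleRe);
  return m ? m[1] : null;
}

console.log(titlePrefix('[TITLE|prefix=a] [STORENAME|prefix=b|suffix=c]')); // => a
console.log(titlePrefix('[TITLE|suffix=x|prefix=y]'));                      // => y
console.log(titlePrefix('[TITLE|suffix=x] [STORENAME|prefix=b]'));          // => null
// Unlike \w+, the negated class also keeps non-word chars such as a hyphen.
console.log(titlePrefix('[TITLE|prefix=my-shop]'));                         // => my-shop
```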

Regex to match all of symbols but except a word

How do I write a regex that matches any characters except a given word?
I need to match all characters except a word.
(.*) - matches all characters.
[^v] - matches all characters except the letter v
But how do I match all characters except a word?
Solution (written below):
((?:(?!here any word for block)[\s\S])*?)
or
((?:(?!here any word for block).)*?)
((?:(?!video)[\s\S])*?)
I want to find everything except |end| and replace everything except |end|.
I tried:
I need everything except |end|
var str = '|video| |end| |water| |sun| |cloud|';
// May be:
//var str = '|end| |video| |water| |sun| |cloud|';
//var str = '|cloud| |video| |water| |sun| |end|';
str.replace(/\|((?!end|end$).*?)\|/gm, test_fun2);
function test_fun2(match, p1, offset, str_full) {
console.log("--------------");
p1 = "["+p1+"]";
console.log(p1);
console.log("--------------");
return p1;
}
Output console log:
--------------
[video]
--------------
--------------
[ ]
--------------
--------------
[ ]
--------------
--------------
[ ]
--------------
Example of what I need:
Any symbols except [video](
input - '[video](text-1 *******any symbols except: "[video](" ******* [video](text-2 any symbols) [video](text-3 any symbols) [video](text-4 any symbols) [video](text-5 any symbols)'
output - <div>text-1 *******any symbols except: "[video](" *******</div> <div>text-2 any symbols</div><div>text-3 any symbols</div><div>text-4 any symbols</div><div>text-5 any symbols</div>
Scenario 1
Use the best trick ever:
One key to this technique, a key to which I'll return several times, is that we completely disregard the overall matches returned by the regex engine: that's the trash bin. Instead, we inspect the Group 1 matches, which, when set, contain what we are looking for.
Solution:
s = s.replace(/\|end\||\|([^|]*)\|/g, function ($0, $1) {
return $1 ? "[" + $1 + "]" : $0;
});
Details
\|end\| - |end| is matched
| - or
\|([^|]*)\| - | is matched, any 0+ chars other than | are captured into Group 1, and then | is matched.
If Group 1 matched ($1 ?) the replacement occurs, else, $0, the whole match, is returned back to the result.
JS test:
console.log(
"|video| |end| |water| |sun| |cloud|".replace(/\|end\||\|([^|]*)\|/g, function ($0, $1) {
return $1 ? "[" + $1 + "]" : $0;
})
)
Scenario 2
Use
.replace(/\[(?!end])[^\]]*]\(((?:(?!\[video]\()[\s\S])*?)\)/g, '<div>$1</div>')
Details
\[ - a [ char
(?!end]) - no end] allowed right after the current position
[^\]]* - 0+ chars other than ]
] - a ] char
\( - a ( char
((?:(?!\[video]\()[\s\S])*?) - Group 1 that captures any char ([\s\S]), 0 or more occurrences, but as few as possible (*?), that does not start a [video]( char sequence
\) - a ) char.
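Here is a small sketch of the Scenario 2 pattern in action, on shortened inputs that I made up for illustration. The videoToDiv name is hypothetical. The last call shows the effect of the (?!\[video]\() check: a match is not allowed to run across the start of another [video]( sequence.

```javascript
// Hypothetical wrapper around the Scenario 2 replacement.
function videoToDiv(s) {
  return s.replace(
    /\[(?!end])[^\]]*]\(((?:(?!\[video]\()[\s\S])*?)\)/g,
    '<div>$1</div>'
  );
}

console.log(videoToDiv('[video](text-1) [video](text-2 any symbols)'));
// => <div>text-1</div> <div>text-2 any symbols</div>

console.log(videoToDiv('[end](skip me) [video](keep)'));
// => [end](skip me) <div>keep</div>

console.log(videoToDiv('[video](a [video](b)'));
// => [video](a <div>b</div>
```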
Something like this is better done in multiple steps. Also, if you're matching stuff, you should use match.
var str = '|video| |end| |water| |sun| |cloud|';
var matches = str.match(/\|.*?\|/g);
// strip pipe characters...
matches = matches.map(m=>m.slice(1,-1));
// filter out unwanted words
matches = matches.filter(m=>!['end'].includes(m));
// this allows you to add more filter words easily
// if you'll only ever need "end", just do (m=>m!='end')
console.log(matches); // ["video","water","sun","cloud"]
Notice how this is a lot easier to understand what's going on, and also much easier to maintain and change in future as needed.
You are on the right track. Here is what you need to do with regex:
var str = '|video| |end| |water| |sun| |cloud|';
console.log(str.replace(/(?!\|end\|)\|(\S*?)\|/gm, test_fun2));
function test_fun2(match, p1, offset, str_full) {
return "["+p1+"]";
}
And an explanation of what was wrong - you had your negative-lookahead placed after the | character. That means that the matching engine would do the following:
Match |video| because the pattern works with it
Grab the next |
Find that the next text is end which is in the negative lookahead and drop it.
Grab the | immediately after end
Grab the space and the next | character, since this passes the negative lookahead and also works with .*?
Continue grabbing the intermediate | | sequences, because the | at the beginning of each word was consumed by the previous match.
So you end up matching the following things
var str = '|video| |end| |water| |sun| |cloud|';
           ^^^^^^^     ^^^     ^^^   ^^^
|video| ______|         |       |     |
| | ____________________|       |     |
| | ____________________________|     |
| | __________________________________|
All because the |end match was dropped.
You can see this if you print out the matches
var str = '|video| |end| |water| |sun| |cloud|';
str.replace(/\|((?!end|end$).*?)\|/gm, test_fun2);
function test_fun2(match, p1, offset, str_full) {
console.log(match, p1, offset);
}
You will see that the second, third, and fourth matches are | |, the captured item p1 is a blank space (not very well displayed, but there), and the offsets where they were found are 12, 20, and 26.
|video| |end| |water| |sun| |cloud|
01234567890123456789012345678901234
            ^       ^     ^
12 _________|       |     |
20 _________________|     |
26 _______________________|
The change I made was to instead look for explicitly the |end| pattern in a negative lookahead and also to only match non-whitespace characters, so you don't grab | | again.
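To compare the two patterns side by side, you can list each match with its offset. This is only a sketch; listMatches is a helper name I made up for the illustration.

```javascript
const str = '|video| |end| |water| |sun| |cloud|';

// Hypothetical helper: collect [matched text, offset] pairs for a global regex.
const listMatches = re => Array.from(str.matchAll(re), m => [m[0], m.index]);

// Pattern from the question: lookahead placed after the opening |.
const original = listMatches(/\|((?!end|end$).*?)\|/g);
console.log(original);
// => [ [ '|video|', 0 ], [ '| |', 12 ], [ '| |', 20 ], [ '| |', 26 ] ]

// Changed pattern: lookahead before the opening |, non-whitespace chars only.
const changed = listMatches(/(?!\|end\|)\|(\S*?)\|/g);
console.log(changed);
// => [ [ '|video|', 0 ], [ '|water|', 14 ], [ '|sun|', 22 ], [ '|cloud|', 28 ] ]
```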
It is also worth noting that you can move the filtering logic into the replacement callback instead of the regex. This simplifies the regex but makes the replacement more complex. Still, it's a fair tradeoff, as code is usually easier to maintain than a regex once the conditions get more complex:
var str = '|video| |end| |water| |sun| |cloud|';
//capturing word characters - an alternative to "non-whitespace"
console.log(str.replace(/\|(\w*)\|/gm, test_fun2));
function test_fun2(match, p1, offset, str_full) {
if (p1 === 'end') {
return match;
} else {
return "[" + p1 + "]"
}
}
