Splitting a String with Emoji Regex Respecting Variation Selector 15

I'm trying to split a string into emoji and non-emoji chunks. I got a regex from here and altered it to take the textual variation selector into account:
This works with .match such as:
'🇦🇨'.match(regex) // (["0x1F1E6", "0x1F1E8"]) => ['🇦🇨']
'🇦🇨\uFE0E'.match(regex) // (["0x1F1E6", "0x1F1E8", "0xFE0E"]) => null
But split isn't giving me the expected results:
'🇦🇨'.split(regex) // (["", undefined, "🇨", ""]) => ['🇦🇨']
I need split to return the entire emoji in one element. What am I doing wrong?
I have a working regex now, except for the edge case exhibited here: https://regex101.com/r/Vki2ZS/2.
I don't want the second emoji to be matched, since it is followed by the textual variation selector. I think this happens because I'm using a lookahead, as the reversed string is matched as expected, but I can't use a negative lookbehind since it's not supported by all browsers.

Your pattern does not work because the second emoji got partly matched with the + quantified (?:\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])+: \uD83E\uDD20\uFE0F\uD83E\uDD20 was matched in \uD83E\uDD20\uFE0F\uD83E\uDD20\uFE0E with two iterations, first \uD83E\uDD20\uFE0F, then \uD83E\uDD20.
The pattern you may use with .split is
The main goal is to fail all matches where (?:\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])+ is followed by \uFE0E, so I added a negative lookahead, (?!\ufe0e), right after it.
JS demo:
var regex = /((?:(?:\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])+(?!\ufe0e)(?:\ufe0f)?(?:\u200d)?)+)/;
// If you need to wrap the match with some tags:
console.log('🤠️🤠︎'.replace(/(?:(?:\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])+(?!\ufe0e)(?:\ufe0f)?(?:\u200d)?)+/g, '<span class="special">$&</span>'))
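For the original goal of splitting, here is a minimal sketch of how the same pattern might be used with .split. It assumes the capturing-group regex from the demo above; the helper name splitEmoji and the filtering of empty strings are illustrative additions, not part of the answer.

```javascript
// Same pattern as in the demo above, wrapped in a capturing group so that
// split() keeps the matched emoji runs in the result.
var emojiRegex = /((?:(?:\u00a9|\u00ae|[\u2000-\u3300]|\ud83c[\ud000-\udfff]|\ud83d[\ud000-\udfff]|\ud83e[\ud000-\udfff])+(?!\ufe0e)(?:\ufe0f)?(?:\u200d)?)+)/;

// Hypothetical helper: split into emoji / non-emoji chunks and drop the
// empty strings that split() produces at the edges of a match.
function splitEmoji(text) {
  return text.split(emojiRegex).filter(function (chunk) {
    return chunk !== '';
  });
}

// Emoji surrounded by plain text: three chunks.
console.log(splitEmoji('hi \ud83e\udd20 there'));

// Emoji + VS16 followed by emoji + VS15: only the first one is treated as an
// emoji chunk; the second stays in the non-emoji remainder.
console.log(splitEmoji('\ud83e\udd20\ufe0f\ud83e\udd20\ufe0e'));
```

The strings are written with escapes so that the otherwise invisible variation selectors are explicit.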


How to get content using filter and match?

I want to check whether the array contains the string that I'm looking for. To do that, I'm using match:
const search_notes = array_notes.filter(notes => notes.real_content.toUpperCase().match(note.toUpperCase()));
As you can see, search_notes gives me an array of all the strings that contain at least one character from the input or match it completely. But there's a problem: when I type ), [], + or any other regex symbol in the input, it gives me this error:
How can I solve this?
If you look at documentation for the match method (for instance, MDN's), you'll see that it accepts a RegExp object or a string, and if you give it a string, it passes that string into new RegExp. So naturally, characters that have special meaning in a regular expression need special treatment.
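To illustrate that behaviour, here is a small sketch. The sample strings are made up, and escapeRegExp is a hypothetical helper (not a built-in) that backslash-escapes regex metacharacters in case you do want to keep using match.

```javascript
// Hypothetical helper: escape regex metacharacters so that the input is
// matched literally when it is turned into a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

var content = 'TODO (URGENT) + MORE'; // made-up note content
var input = ')';                      // made-up user input

// Passing the raw string to match() compiles it as a regex, and a lone ")"
// is not a valid pattern, so a SyntaxError is thrown.
var threw = false;
try {
  content.match(input);
} catch (e) {
  threw = e instanceof SyntaxError;
}
console.log(threw);

// With the metacharacters escaped, the same input is matched literally.
var safe = content.match(new RegExp(escapeRegExp(input)));
console.log(safe !== null);
```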
You don't need match, just includes, which doesn't do that:
const search_notes = array_notes.filter(
  notes => notes.real_content.toUpperCase().includes(note.toUpperCase())
);

How to extract specific substring from a string using a one liner

Input: parent/123/child/grand-child
Expected output: child
Attempt 1: (?<=\/parent\/\d*)(.*)(?=\/.*)
Error: A quantifier inside a lookbehind makes it non-fixed width, look behind does not accept * but I don't know the width of the number hence must use it
Attempt 2: (works but 2 liners):
const currentRoute='/parent/123/child/grand-child'
let extract = currentRoute.replace(/\/parent\/\d*/g, '');
extract = extract.substring(1, extract.lastIndexOf('/'));
console.log('Result', extract)
How do I get the extract with a one liner, preferably using regex?
Your current pattern will match 123/child instead of child only, as there is a forward slash missing after \d* (note that * means 0 or more times).
It will also over match (See demo) due to the .* if there are more forward slashes present.
Instead, you could make use of a capturing group and use match.
Regex demo
The value is in capturing group 1.
let res = "parent/123/child/grand-child".match(/parent\/\d+\/(\w+)\//);
if (res) console.log(res[1])
A pattern with a lookbehind to get the value child could be
Regex demo
Note that this is not yet widely supported.
let res = "parent/123/child/grand-child".match(/(?<=parent\/\d*\/)([^\/]+)(?=\/)/);
if (res) console.log(res[0])
How about
If the format is fixed, then use .split("/")[2] to get 3rd element
To match the parent part of the string use .match(/^parent\/[^\/]+\/([^\/]+)/)[1]
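As a rough sketch of the capture-group approach wrapped in a reusable function: the name segmentAfterId is hypothetical, and the optional leading slash is an assumption added so that both the input from the question and the currentRoute variant are handled. It returns null instead of throwing when the route does not match.

```javascript
// Hypothetical helper: return the path segment that follows "parent/<digits>/",
// or null when the route does not have that shape.
function segmentAfterId(route) {
  // ^\/?        optional leading slash (assumption, to cover "/parent/...")
  // parent\/    literal "parent/"
  // \d+\/       one or more digits, then a slash
  // ([^\/]+)    capture everything up to the next slash
  var match = route.match(/^\/?parent\/\d+\/([^\/]+)/);
  return match ? match[1] : null;
}

console.log(segmentAfterId('parent/123/child/grand-child'));
console.log(segmentAfterId('/parent/123/child/grand-child'));
console.log(segmentAfterId('other/123/child'));
```

Note that .split("/")[2] gives a different element when the string starts with a slash, because the first element is then an empty string.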

How to get the 1st character after a pattern using regex?

I'm trying to get the first character after the pattern.
I want to select:
How do you just get the first character after a selected pattern?
Example Here! :)
To get the t after border-, you usually match with this kind of regex:
You can then extract the submatch:
var characterAfter = str.match(/border-(.)/)[1];
match returns an array with the whole match as first element, and the submatches in the following positions.
To get an array of all the characters following a dash, use
var charactersAfter = str.match(/-(.)/g).map(function(s){ return s.slice(1) })
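Both snippets throw a TypeError when nothing matches, because match then returns null. A minimal sketch of null-safe variants follows; the function names are made up for illustration.

```javascript
// Hypothetical helper: first character after "border-", or null if absent.
function charAfterBorder(str) {
  var m = str.match(/border-(.)/);
  return m ? m[1] : null; // m[0] is the whole match, m[1] the submatch
}

// Hypothetical helper: every character that directly follows a dash.
function charsAfterDashes(str) {
  // With the g flag, match() returns the full matches ("-t", "-c"),
  // so the leading dash is sliced off each one.
  return (str.match(/-(.)/g) || []).map(function (s) {
    return s.slice(1);
  });
}

console.log(charAfterBorder('border-top-color'));
console.log(charsAfterDashes('border-top-color'));
console.log(charAfterBorder('margin'));
console.log(charsAfterDashes('margin'));
```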
Just use a capturing group:
"border-top-color".replace(/-([a-z])/g, "-[$1]")
You can use submatching like dystroy said or simply use lookbehind to match it:

Regular expression to match a string which is NOT matched by a given regexp

I've been browsing some answers here, and I can't find a solution to my problem:
I have this regexp which matches everything inside an HTML span tag, including contents:
and I want to find a way to make a search in all the text, except for what is matched with that regexp.
For example, if my text is:
var text = '...for there is a class of <span class="highlight">guinea</span> pigs which...'
... then the regexp would match:
<span class="highlight">guinea</span>
and I want to be able to make a regexp such that if I search for "class", regexp will match "...for there is a class of..."
and will not match inside the tag, like in
"... class="highlight"..."
The word to be matched ("class") might be anywhere within the text. I've tried
but it keeps searching inside tags as well.
I want to find a solution using only regexp, not dealing with DOM nor JQuery. Thanks in advance :).
Although I wouldn't recommend this, I would do something like below
You can see this in action here
Rubular Link for this regex
You can capture your matches from the groups and work with them as needed. If you can, use a HTML parser and then find matches from the text element.
It's not pretty, but if I understand you correctly, this should do what you want. It's done with a single RegEx, but JS can't (to my knowledge) extract the result without joining the results in a loop.
The RegEx: /(?:<span\b[^>]*>.*?<\/span>)|(.)/g
Example js code:
var str = '...for there is a class of <span class="highlight">guinea</span> pigs which...',
    pattern = /(?:<span\b[^>]*>.*?<\/span>)|(.)/g,
    res = '',
    match = pattern.exec(str);
while (match != null) {
    // match[1] is undefined when the span alternative matched, so skip it
    if (match[1] !== undefined) res += match[1];
    match = pattern.exec(str);
}
document.writeln('Result:' + res);
In English: Do a non capturing test against your tag-expression or capture any character. Do this globally to get the entire string. The result is a capture group for each character in your string, except the tag. As pointed out, this is ugly - can result in a serious number of capture groups - but gets the job done.
If you need to send it in and retrieve the result in one call, I'd have to agree with previous contributors - It can't be done!
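A variation on the same idea, sketched under the assumption that the search word contains no regex metacharacters: let the first alternative consume each span, capture the word in the second alternative, and keep only the matches where the capture group is set. The function name findOutsideSpans is made up for illustration.

```javascript
// Hypothetical helper: return the start indexes of `word` in `text`,
// ignoring anything inside <span ...>...</span>.
// Assumes `word` is plain text without regex metacharacters.
function findOutsideSpans(text, word) {
  var pattern = new RegExp('<span\\b[^>]*>.*?<\\/span>|(' + word + ')', 'g');
  var positions = [];
  var match;
  while ((match = pattern.exec(text)) !== null) {
    // Group 1 is only set when the word matched outside a span.
    if (match[1] !== undefined) {
      positions.push(match.index);
    }
  }
  return positions;
}

var text = '...for there is a class of <span class="highlight">guinea</span> pigs which...';
console.log(findOutsideSpans(text, 'class'));  // only the "class" before the span
console.log(findOutsideSpans(text, 'guinea')); // nothing: it is inside the span
```

Like the pattern above, this does not handle nested spans or spans that contain line breaks, which is one reason an HTML parser is usually the safer choice.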

Extract specific chars from a string using a regex

I need to split an email address and take out the first character and the first character after the '#'
I can do this as follows:
'bar#foo'.split('#').map(function(a){ return a.charAt(0); }).join('')
--> bf
Now I was wondering if it can be done using a regex match, something like this
--> bar#fbf
Not really what I want, but I'm sure I miss something here! Any suggestions ?
Why use a regex for this? just use indexOf to get the char at any given position:
var addr = 'foo#bar';
console.log(addr[0], addr[addr.indexOf('#')+1])
To ensure your code works on all browsers, you might want to use charAt instead of []:
console.log(addr.charAt(0), addr.charAt(addr.indexOf('#')+1));
Either way, it'll work just fine, and this is likely the fastest approach.
If you are going to persist, and choose a regex, then you should realize that the match method returns an array containing 3 strings, in your case:
["the whole match", // start of string + first char + .*?# + first char after #
 "group 1 \w",      // first char
 "group 2 \w"]      // first char after #
So addr.match(/^(\w).*?#(\w)/).slice(1).join('') is probably what you want.
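As a small sketch, the same expression can be wrapped in a function that also copes with input that has no '#'. The name initials and the empty-string fallback are assumptions made for illustration.

```javascript
// Hypothetical helper: first character of the address plus the first
// character after '#', or an empty string when the pattern does not match.
function initials(addr) {
  var m = addr.match(/^(\w).*?#(\w)/);
  // m[0] is the whole match; slice(1) keeps only the two capture groups.
  return m ? m.slice(1).join('') : '';
}

console.log(initials('bar#foo'));
console.log(initials('nohash'));
```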
If I understand correctly, you are quite close. Just don't join everything returned by match because the first element is the entire matched string.
--> bf
Using regex:
var matched = '';
'abc#xyz'.replace(/(?:^|#)(\w)/g, function($0, $1) { matched += $1; return $0; });
// matched is now "ax"
The regex match function returns an array of all matches, where the first one is the 'full text' of the match, followed by every sub-group. In your case, it returns this:
To get rid of the first item (the full match), use slice:
Use String.prototype.replace with regular expression:
'bar#foo'.replace(/^(\w).*#(\w).*$/, '$1$2'); // "bf"
Or using RegEx

