Emoji typing regex messed up - javascript

I'm building a simple chat with some basic emojis. So when the user for example types :), a smiley emoji is being displayed inside the input field.
However my regex fails when it comes to the emoji :/ and the reason is that it messes up URLs. (Keep in mind the emoji detection is triggered on keyup).
Example test text:
Hello could you please visit my website at http://www.example.com or https://www.example.com :/ ://
So to sum up the regex must not replace :/ when it starts with http or https.
Currently in my mapping I have this: "[^http?s]:/{1}(?!/)": "1F615" but for some strange reason it keeps 'eating the previous character upon replace'.

Since the issue is that :/ is messing up http://, and lookbehinds ("check if this did/didn't occur immediately before") are not available in every JavaScript engine (they only arrived with ES2018), I think the simplest solution would be to check that the :/ is not followed by another /:
":/(?!/)" : "1F615"
Notably, the lookahead (which is supported by JavaScript) does not actually match the subsequent /, it just ensures that it isn't there.
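As a rough sketch of how that mapping entry could be applied (the emojiMap and replaceEmoji names and the sample text are illustrative assumptions, not the asker's actual code), each key is compiled into a global regex and each value is treated as a hexadecimal code point:

```javascript
// Hypothetical mapping in the same shape as the question's: regex source -> hex code point.
const emojiMap = { ":/(?!/)": "1F615" };

// Replace every emoticon pattern in the text with its emoji character.
function replaceEmoji(text) {
  let out = text;
  for (const [pattern, codePoint] of Object.entries(emojiMap)) {
    const emoji = String.fromCodePoint(parseInt(codePoint, 16));
    out = out.replace(new RegExp(pattern, "g"), emoji);
  }
  return out;
}

const sampleText = "see http://www.example.com :/ ://";
const replaced = replaceEmoji(sampleText);
console.log(replaced); // only the standalone ":/" is replaced
```

Because the lookahead consumes nothing, the replacement no longer eats a neighbouring character, and both "http://" and the bare "://" are left alone.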

This should work
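// Reverse by code point so emoji already in the text (surrogate pairs) survive the round trip.
function reverse(s) {
  return Array.from(s).reverse().join("");
}
function getMatch(str) {
  // On the reversed string, ":/" reads "/:" and "http"/"https" read "ptth"/"sptth",
  // so the negative lookahead plays the role of a negative lookbehind.
  const regex = /\/+:(?!s?ptth)/g;
  const subst = ""; // replacement text; empty here, so matches are simply removed
  const result = reverse(reverse(str).replace(regex, subst));
  console.log(result);
}
getMatch(`Hello could you please visit my website at http://www.example.com or https://www.example.com :/ ://`);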
This targets JavaScript engines without negative lookbehind support: negative lookaheads are supported, so we use them to our advantage by reversing the string, replacing, and then reversing again.
demo for reversed string

Related

Safari Regex error "invalid regular expression invalid group specifier name" [duplicate]

In my Javascript code, this regex /(?<=\/)([^#]+)(?=#*)/ works fine in Chrome, but in safari, I get:
Invalid regular expression: invalid group specifier name
Any ideas?
Looks like Safari doesn't support lookbehind yet (that is, your (?<=\/)). One alternative would be to put the / that comes before in a non-captured group, and then extract only the first group (the content after the / and before the #).
/(?:\/)([^#]+)(?=#*)/
Also, (?=#*) is odd - you probably want to lookahead for something (such as # or the end of the string), rather than a * quantifier (zero or more occurrences of #). It might be better to use something like
/(?:\/)([^#]+)(?=#|$)/
or just omit the lookahead entirely (because the ([^#]+) is greedy), depending on your circumstances.
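A minimal sketch of "extract only the first group" with the pattern above; the sample string and variable names are assumptions chosen for illustration:

```javascript
const sampleInput = "Get from Slash/to Next hashtag #GMK"; // assumed sample text

// No g flag: match() then returns the full match at [0] and capture group 1 at [1].
const groupMatch = sampleInput.match(/(?:\/)([^#]+)(?=#|$)/);

// groupMatch[0] would include the leading "/", so read group 1 instead.
const afterSlash = groupMatch ? groupMatch[1] : null;
console.log(JSON.stringify(afterSlash)); // "to Next hashtag "
```

Note that with the g flag, match() returns only the full matches, so the capture group would not be available that way.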
Support for RegExp lookbehind assertions has been taken up by WebKit:
Check link: https://github.com/WebKit/WebKit/pull/7109
Lookbehind (?<=) is not supported in Safari on iOS; a non-capturing group (?:) can be used instead.
Note: put the / (or whichever leading character you anchor on) in the non-capturing group.
See details: https://caniuse.com/js-regexp-lookbehind
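let str = "Get from Slash/to Next hashtag #GMK"
// Lookbehind: a SyntaxError in engines without lookbehind support
let workFineOnChromeOnly = str?.match(/(?<=\/)([^#]+)(?=#*)/g)
console.log("❌ Work Fine On Chrome Only", workFineOnChromeOnly )
// Without lookbehind the full match includes the leading "/", so read capture group 1 (no g flag)
let workFineSafariToo = str?.match(/(?:\/)([^#]+)(?=#*)/)?.[1]
console.log("✔️ Work Fine Safari too", workFineSafariToo )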
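If both behaviours need to live in one code base, one possible approach is to feature-detect lookbehind at runtime. This is only a sketch: supportsLookbehind and textAfterSlash are hypothetical helper names, and the pattern is built with the RegExp constructor so that an unsupported engine throws a catchable error at runtime instead of failing while parsing the script.

```javascript
// Returns true when the engine can compile a lookbehind assertion.
function supportsLookbehind() {
  try {
    new RegExp("(?<=a)b");
    return true;
  } catch (err) {
    return false;
  }
}

// Returns the text between the first "/" and the next "#" (or end of string).
function textAfterSlash(input) {
  if (supportsLookbehind()) {
    const m = input.match(new RegExp("(?<=/)[^#]+"));
    return m ? m[0] : null;
  }
  // Fallback: consume the "/" and read capture group 1 instead.
  const m = input.match(/\/([^#]+)/);
  return m ? m[1] : null;
}

const extracted = textAfterSlash("Get from Slash/to Next hashtag #GMK");
console.log(JSON.stringify(extracted)); // "to Next hashtag "
```

In practice the fallback branch alone is enough; the detection is mainly useful when a lookbehind cannot be rewritten as a capture group.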
Just wanted to put this out there for anyone who stumbles across this issue and can't find anything...
I had the same issue, and it turned out to be a RegEx expression in one of my dependencies, namely Discord.js .
Luckily I no longer needed that package but if you do, consider putting an issue out there or something (maybe you shouldn't even be running discord.js in your frontend react app).

search match between three conditions [duplicate]

I am using the following regex for validating YouTube video share URLs.
var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;
I want the regex to support the following URL formats:
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrX1w5luM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
youtu.be/cCnrX1w5luM
I tried different regex but I am not getting a suitable one for share links. Can anyone help me to solve this.
Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:
^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]+)(\S+)?$
Works with the following URLs:
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
The captured groups are:
protocol
subdomain
domain
path
video code
query string
https://regex101.com/r/vHEc61/1
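A sketch of reading those six groups in JavaScript. It uses the pattern above with -nocookie written as a non-capturing group and the dot in youtu.be escaped, so that the group numbers line up with the list; parseYoutubeUrl and the property names are hypothetical, chosen for illustration.

```javascript
const youtubeUrl = /^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(?:-nocookie)?\.com|youtu\.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]+)(\S+)?$/;

// Returns the six captured parts, or null when the URL does not match.
function parseYoutubeUrl(url) {
  const m = youtubeUrl.exec(url);
  if (!m) return null;
  const [, protocol, subdomain, domain, path, videoCode, queryString] = m;
  return { protocol, subdomain, domain, path, videoCode, queryString };
}

const parsed = parseYoutubeUrl("https://youtu.be/DFYRQ_zQ-gk?t=120");
console.log(parsed.domain, parsed.videoCode, parsed.queryString);
// youtu.be DFYRQ_zQ-gk ?t=120
```

Optional groups that did not participate in the match (here, the subdomain) come back as undefined.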
You're missing www in your regex
The second \. should be optional if you want to match both youtu.be and youtube (but I didn't change this since just youtube isn't actually a valid domain - see note below)
+ in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wild-cards.
You need to use a . to indicate a wild-card, and + to indicate you want one or more of them.
Try:
^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$
Live demo.
If you want it to match URLs with or without the www., just make it optional:
^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$
Live demo.
Invalid alternatives:
If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:
^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$
youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly mentions that the regex should support that. To include this, replace youtu\.be with youtu\.?be in any regex above. Live demo.
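To check the most permissive variant (optional www. plus youtu\.?be) against the five formats from the question, something like the following could be used; the variable names are illustrative only.

```javascript
// Variant with optional "www." and "youtu\.?be" so the bare "youtube/..." form also matches.
const shareUrl = /^(https?:\/\/)?((www\.)?youtube\.com|youtu\.?be)\/.+$/;

const questionUrls = [
  "http://youtu.be/cCnrX1w5luM",
  "http://youtube/cCnrX1w5luM",
  "www.youtube.com/cCnrX1w5luM",
  "youtube/cCnrX1w5luM",
  "youtu.be/cCnrX1w5luM",
];

const shareResults = questionUrls.map((u) => shareUrl.test(u));
console.log(shareResults); // [true, true, true, true, true]
console.log(shareUrl.test("http://example.com/cCnrX1w5luM")); // false
```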
I know I'm about 2 years late to the party, but I needed to write something up anyway, and this seems to fit every test case that I can throw at it. You should be able to reference the first capture group ($1) to get the ID. It matches http, https, www and non-www, youtube.com, youtu.be, /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it still matches when there are other variables in the URL string (?t= for time, ?list= for playlists, etc).
(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
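A small sketch of pulling the ID out via the first capture group. extractVideoId is a hypothetical helper name, and the pattern is the one above with the redundant backslash before _ dropped:

```javascript
const videoIdPattern = /(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9_-]+)/;

// Returns the video ID (capture group 1), or null when nothing matches.
function extractVideoId(url) {
  const m = videoIdPattern.exec(url);
  return m ? m[1] : null;
}

const idFromWatch = extractVideoId("https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured");
const idFromShort = extractVideoId("http://youtu.be/cCnrX1w5luM");
console.log(idFromWatch, idFromShort); // DFYRQ_zQ-gk cCnrX1w5luM
```

Since \?.*v= is greedy, a URL with a later parameter whose name ends in v would capture that parameter's value instead, so this is best treated as a starting point.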
The format for YouTube video URLs has changed. This regex covers the watch?v= and youtu.be link formats:
^(http(s)?\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu\.be\/))([a-zA-Z0-9\-_]+)
Tests here.
Based on so many other regex; this is the best I have got:
((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu.be\/))[\S]+
Test:
http://regexr.com/3bga2
Try this:
((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+
http://regexr.com?36o7a
I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid url.
^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$
I tried this one and it works fine for me.
(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)
You can check here https://regex101.com/r/Kvk0nB/1
https://regexr.com/62kgd
^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$
https://www.youtube.com/watch?v=YPz9zqakRbk
https://www.youtube.com/watch?v=YPz9zqakRbk&t=11
http://youtu.be/cCnrX1w5luM&y=12
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrXswsluM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
Check this pattern instead:
r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'

regex to match all keywords in a string

I'm a beginner with regex and would appreciate some help from the community.
Let's say I have this string str:
1. www.anysite.com hello demo try this link
2. anysite.com indeed demo link
3. http://www.anysite.com another one
4. www.anysite.com
5. http://anysite.com
Consider lines 1-5 together as the whole string str.
I want to convert all 'anysite.com' into clickable html links, for which I am using:
str = str.replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '$1');
This converts all space separated words starting with http/https/ftp into links as
url
So, lines 3 and 5 have been converted correctly. Now, to convert all www.anysite.com into links, I again used:
str = str.replace(/(\b^(http|https|ftp)?(www\.)[-A-Z0-9+&##\/%?=~_|!:,.;]*[-A-Z0-9+&##\/%=~_|])/ig, '$1');
However, it only converts www.anysite.com into a link if it is found at the very beginning of str. So it converts line 1 but not line 4.
Note that I used ^(http|https|ftp)?(www.) to find all www not starting with http/https/ftp, since the http ones have already been converted.
Also, for the link on line 2, which starts with neither http nor www but ends with .com, what would the regex be?
For reference, you can try posting this whole string to your Facebook timeline; it converts all five lines into links.
Thanks for help, the final RegEx that helped me is:
//remove all http:// and https://
str = str.replace(/(http|https):\/\//ig, "");
//replace all string ending with .com or .in only into link
str = str.replace( /((www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.(com|in))/ig, '$1');
I used .com and .in for my specific requirement; otherwise the solution at http://regexr.com/39i0i will work.
There is still an issue, though: it doesn't convert shortened URLs into links correctly, e.g. http://s.ly/qhdfTyuiOP only gives a link up to s.ly.
Any further suggestions?
^(http|https|ftp)?(www\.) does not mean "all www not starting with http/https/ftp" but rather "a string that starts with an optional http/https/ftp followed by www.".
Indeed, ^ in this context isn't a negation but rather an anchor representing the start of the string. I suppose you used it this way because of its meaning when used in a character class ([^...]); it is rather tricky, since its meaning changes depending on the context it is found in.
You could just remove it and you should be fine, as I see no point of making sure the string does not start with http/https/ftp (you transformed those occurrences just before, there should be none left).
Edit : I mentioned lookbehind but forgot it's not available in JS...
If you wanted to make some kind of negation, the easiest way would be to use a negative lookbehind :
(?<!http|https|ftp)www\.
This matches "www." only when it's not preceded by http, https nor ftp.
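For engines that do implement lookbehind (it was standardised in ES2018), a sketch along these lines shows the idea. Note one assumption that goes beyond the pattern above: in a real URL the text immediately before www. is "://", so this variant includes the "://" in the lookbehind. The variable names and sample text are illustrative.

```javascript
// Requires lookbehind support; older engines reject this pattern with a SyntaxError.
const bareWww = /(?<!https?:\/\/|ftp:\/\/)\bwww\.[^\s]+/g;

const wwwSample = "www.anysite.com and http://www.anysite.com";
const bareMatches = wwwSample.match(bareWww);
console.log(bareMatches); // ["www.anysite.com"]
```

Only the first address is returned; the second one is skipped because "http://" sits directly in front of its "www.".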

Regex to find web addresses in short copy

Given a short piece of copy, I need to match all occurrences of links to websites. To keep things simple, I need to find addresses in this format:
www.aaaaaa.bbbbbb
http://aaaaaa.bbbb
https://aa.bbbb
but also I need to take care of longer www/http/https versions:
www.aaaaa.bbbb.ccc.ddd.eeee
etc. So basically number of subdomains is not known. Now I came up with this regex:
(www\.([a-zA-Z0-9-_]|\.(?!\s))+)[\s|,|$]|(http(s)?:\/\/(?!\.)([a-zA-Z0-9-_]|\.(?!\s))+)[\s|,|$]
If you test on:
this is some tex with www.somewIebsite.dfd.jhh.hjh inside of it or maybe http://www.ssss.com or maybe https://evenore.com hahaah blah
It works fine with exception of when address is at the very end. $ seems to work only when there is \n in the end and it fails for:
this is some tex with www.somewIebsite.dfd.jhh.hjh
I'm guessing the fix is simple and I'm missing something obvious, so how would I fix it? BTW I posted the regex here if you want to quickly play around: https://regex101.com/r/eL1bI4/3
The problem is that you placed the end anchor $ inside the character group []
[\s|,|$]
It is then interpreted literally as a dollar sign, and not as the anchor (the pipe character | is also interpreted literally, it's not needed there). The solution is to move the $ anchor outside:
(?:[\s,]|$)
However, in this case it makes more sense to use a positive lookahead instead of the noncapturing group (you don't want trailing spaces, or commas):
(?=[\s,]|$)
As a result, you end up with the following regex pattern:
(www\.([a-zA-Z0-9-_]|\.(?!\s))+)(?=[\s,]|$)|(http(s)?:\/\/(?!\.)([a-zA-Z0-9-_]|\.(?!\s))+)(?=[\s,]|$)
See the working demo.
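The difference can be reproduced in JavaScript with just the www branch of each pattern, using the failing input from the question; the variable names here are illustrative:

```javascript
// Original: "$" inside the character class is a literal dollar sign, not an anchor.
const brokenPattern = /(www\.([a-zA-Z0-9-_]|\.(?!\s))+)[\s|,|$]/g;
// Fixed: a lookahead for whitespace, a comma, or the real end of the string.
const fixedPattern = /(www\.([a-zA-Z0-9-_]|\.(?!\s))+)(?=[\s,]|$)/g;

const endText = "this is some tex with www.somewIebsite.dfd.jhh.hjh";

const brokenResult = endText.match(brokenPattern);
const fixedResult = endText.match(fixedPattern);
console.log(brokenResult); // null
console.log(fixedResult); // ["www.somewIebsite.dfd.jhh.hjh"]
```

Because the lookahead consumes nothing, the match also no longer includes a trailing space or comma when the address appears mid-sentence.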
The updated version that handles trailing full stops:
(www\.([a-zA-Z0-9-_]|\.(?!\s|\.|$))+)(?=[\s,.]|$)|(http(s)?:\/\/(?!\.)([a-zA-Z0-9-_]|\.(?!\s|\.|$))+)(?=[\s,.]|$)
See the working demo.
