Detect protocol optional url in text with Javascript - javascript

I found a lot of answers to detect url in text. I tried them but failed to detect protocol optional urls. Most of existing solutions could find urls in "Hello, open http://somesite.com/etc to see my photo" and replace the http part to tags. But they could not work for cases like "Hey, open somesite.com or somesite.com/etc to take a look."
How can I detect both cases (better with one regex), and the default protocol is "http://" for none-protocol urls.
I note that SO also failed to detect the latter case...
Edit:
Maybe it is error-prone to change somesite.com into urls (in English), so this requirement is not ideal?
I used regex /(https?://[^\s]+)/g. This one is simple and support cases like http://abcd.com/etc/?v=3&type=xyz. But I don't know how to change it to support protocol absent case.

You can use this:
^(https?:\/\/)?\w+\.\w+([\/\w\?\-\+\!\&\$\=\.\:]+)*$
Matches:
www.domain.com
http://www.domain.com
http://www.somesite.com/etc
http://www.somesite.com/etc/etc/etc
somesite.com
somesite.com/etc
http://google.com/q=regex+for+this&utm-source=stackoverflow
DEMO

This isn't bullet proof, but it works with the example you've provided:
var text = "Hey, open some-site.com or somesite.com/etc?test=1 to take a look."
var myregexp = /([\-A-Z0-9+&##\/%=~_|]+\.[a-z]{2,7})(\/[\-A-Z0-9+&##\/%=~_|?]+)?/ig;
var result = text.replace(myregexp, '$1$2');
alert(result);
//Hey, open some-site.com or somesite.com/etc?test=1 to take a look.
LIVE DEMO

Related

How to grab URLs in JavaScript without harming embedded objects and inline URL

I wrote a RegExp to grab and encode URLs in JavaScript.
This works fine but, it introduced a bug into my app.
I have a span Element which is used to display Emojis like this:
<span style="background:url(http://localhost/res/emo/face/E004.png)"></span>
Now, I'm using this Regular Expression to grab and convert anything URL into actual HTML clickable links:
/((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)/ig
This ended up encoding the emoji URL into a clickable link.
Can anyone adjust that Code to Ignore URLs inside Elements or embedded Objects???
Please I need help!
This is the code:
var urlRegex = /((https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?)/ig;
return txt.replace(urlRegex, function (url) {
var hyperlink = url;
if(!hyperlink.match('^https?:\/\/')) {
hyperlink = 'http://' + hyperlink;
}
return `${url}`;
});
I don't that the URLS inside
<span style="background:url(http://localhost/res/emo/face/E004.png)"></span>
were touched.
You would need to use negative look behind, which has limited support in JavaScript. (see here https://stackoverflow.com/a/50434875/6853740)
Simply adding negative look behind to your existing regex still doesn't work as expected:
((?<!url\()(https?:\/\/)?[\w-]+(\.[\w-]+)+\.?(:\d+)?(\/\S*)?) still matches "E004.png" in your example. Even other URL regexs from this post (What is the best regular expression to check if a string is a valid URL?) also match that. You may need to consider only looking for links that start with http:// or https:// which may help you recraft a regex that will only match full URLs.

search match beetwen three conditions [duplicate]

I am using the following regex for validating youtube video share url's.
var valid = /^(http\:\/\/)?(youtube\.com|youtu\.be)+$/;
alert(valid.test(url));
return false;
I want the regex to support the following URL formats:
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrX1w5luM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
youtu.be/cCnrX1w5luM
I tried different regex but I am not getting a suitable one for share links. Can anyone help me to solve this.
Here's a regex I use to match and capture the important bits of YouTube URLs with video codes:
^((?:https?:)?\/\/)?((?:www|m)\.)?((?:youtube(-nocookie)?\.com|youtu.be))(\/(?:[\w\-]+\?v=|embed\/|v\/)?)([\w\-]+)(\S+)?$
Works with the following URLs:
https://www.youtube.com/watch?v=DFYRQ_zQ-gk&feature=featured
https://www.youtube.com/watch?v=DFYRQ_zQ-gk
http://www.youtube.com/watch?v=DFYRQ_zQ-gk
//www.youtube.com/watch?v=DFYRQ_zQ-gk
www.youtube.com/watch?v=DFYRQ_zQ-gk
https://youtube.com/watch?v=DFYRQ_zQ-gk
http://youtube.com/watch?v=DFYRQ_zQ-gk
//youtube.com/watch?v=DFYRQ_zQ-gk
youtube.com/watch?v=DFYRQ_zQ-gk
https://m.youtube.com/watch?v=DFYRQ_zQ-gk
http://m.youtube.com/watch?v=DFYRQ_zQ-gk
//m.youtube.com/watch?v=DFYRQ_zQ-gk
m.youtube.com/watch?v=DFYRQ_zQ-gk
https://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
http://www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
//www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
www.youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
youtube.com/v/DFYRQ_zQ-gk?fs=1&hl=en_US
https://www.youtube.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube.com/embed/DFYRQ_zQ-gk
http://www.youtube.com/embed/DFYRQ_zQ-gk
//www.youtube.com/embed/DFYRQ_zQ-gk
www.youtube.com/embed/DFYRQ_zQ-gk
https://youtube.com/embed/DFYRQ_zQ-gk
http://youtube.com/embed/DFYRQ_zQ-gk
//youtube.com/embed/DFYRQ_zQ-gk
youtube.com/embed/DFYRQ_zQ-gk
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk?autoplay=1
https://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
//www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
www.youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtube-nocookie.com/embed/DFYRQ_zQ-gk
http://youtube-nocookie.com/embed/DFYRQ_zQ-gk
//youtube-nocookie.com/embed/DFYRQ_zQ-gk
youtube-nocookie.com/embed/DFYRQ_zQ-gk
https://youtu.be/DFYRQ_zQ-gk?t=120
https://youtu.be/DFYRQ_zQ-gk
http://youtu.be/DFYRQ_zQ-gk
//youtu.be/DFYRQ_zQ-gk
youtu.be/DFYRQ_zQ-gk
https://www.youtube.com/HamdiKickProduction?v=DFYRQ_zQ-gk
The captured groups are:
protocol
subdomain
domain
path
video code
query string
https://regex101.com/r/vHEc61/1
You're missing www in your regex
The second \. should optional if you want to match both youtu.be and youtube (but I didn't change this since just youtube isn't actually a valid domain - see note below)
+ in your regex allows for one or more of (youtube\.com|youtu\.be), not one or more wild-cards.
You need to use a . to indicate a wild-card, and + to indicate you want one or more of them.
Try:
^(https?\:\/\/)?(www\.youtube\.com|youtu\.be)\/.+$
Live demo.
If you want it to match URLs with or without the www., just make it optional:
^(https?\:\/\/)?((www\.)?youtube\.com|youtu\.be)\/.+$
Live demo.
Invalid alternatives:
If you want www.youtu.be/... to also match (at the time of writing, this doesn't appear to be a valid URL format), put the optional www. outside the brackets:
^(https?\:\/\/)?(www\.)?(youtube\.com|youtu\.be)\/.+$
youtube/cCnrX1w5luM (with or without http://) isn't a valid URL, but the question explicitly mentions that the regex should support that. To include this, replace youtu\.be with youtu\.?be in any regex above. Live demo.
I know I'm like 2 years late to the party, but I was needing to write something up anyway, and seems to fit every test case that I can throw at it. Should be able to reference the first match ($1) to get the ID. Matches the http, https, www and non-www, youtube.com, youtu.be, /watch? and /watch.php? on youtube.com (youtu.be does not use these), and it supports matching even when there are other variables in the URL string (?t= for time, ?list= for playlists, etc).
(?:https?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]+)
Format for YouTube videos has changed. This regex works for all cases:
^(http(s)??\:\/\/)?(www\.)?((youtube\.com\/watch\?v=)|(youtu.be\/))([a-zA-Z0-9\-_])+
Tests here.
Based on so many other regex; this is the best I have got:
((http(s)?:\/\/)?)(www\.)?((youtube\.com\/)|(youtu.be\/))[\S]+
Test:
http://regexr.com/3bga2
Try this:
((http://)?)(www\.)?((youtube\.com/)|(youtu\.be)|(youtube)).+
http://regexr.com?36o7a
I took one of the answers from here and added support for a few edge cases that I noticed in my dataset. This should work for pretty much any valid url.
^(?:https?:)?(?:\/\/)?(?:youtu\.be\/|(?:www\.|m\.)?youtube\.com\/(?:watch|v|embed)(?:\.php)?(?:\?.*v=|\/))([a-zA-Z0-9\_-]{7,15})(?:[\?&][a-zA-Z0-9\_-]+=[a-zA-Z0-9\_-]+)*(?:[&\/\#].*)?$
I tried this one and it works fine for me.
(?:http(?:s)?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:(?:watch)?\?(?:.*&)?v(?:i)?=|(?:embed|v|vi|user)\/))([^\?&\"'<> #]+)
You can check here https://regex101.com/r/Kvk0nB/1
https://regexr.com/62kgd
^((http|https)\:\/\/)?(www\.youtube\.com|youtu\.?be)\/((watch\?v=)?([a-zA-Z0-9]{11}))(&.*)*$
https://www.youtube.com/watch?v=YPz9zqakRbk
https://www.youtube.com/watch?v=YPz9zqakRbk&t=11
http://youtu.be/cCnrX1w5luM&y=12
http://youtu.be/cCnrX1w5luM
http://youtube/cCnrXswsluM
www.youtube.com/cCnrX1w5luM
youtube/cCnrX1w5luM
Check this pattern instead:
r'(?i)(http.//|https.//)*[A-Za-z0-9._%+-]+\.\w+'

javascript Reg Exp to match specific domain name

I have been trying to make a Reg Exp to match the URL with specific domain name.
So if i want to check if this url is from example.com
what reg exp should be the best?
This reg exp should match following type of URLs:
http://api.example.com/...
http://preview.example.com/...
http://www.example.com/...
http://purhcase.example.com/...
Just simple rule, like http://{something}.example.com/{something} then should pass.
Thank you.
I think this is what you're looking for: (https?:\/\/(.+?\.)?example\.com(\/[A-Za-z0-9\-\._~:\/\?#\[\]#!$&'\(\)\*\+,;\=]*)?).
It breaks down as follows:
https?:\/\/ to match http:// or https:// (you didn't mention https, but it seemed like a good idea).
(.+?\.)? to match anything before the first dot (I made it optional so that, for example, http://example.com/ would be found
example\.com (example.com, of course);
(\/[A-Za-z0-9\-\._~:\/\?#\[\]#!$&'\(\)\*\+,;\=]*)?): a slash followed by every acceptable character in a URL; I made this optional so that http://example.com (without the final slash) would be found.
Example: https://regex101.com/r/kT8lP2/1
Use indexOf javascript API. :)
var url = 'http://api.example.com/api/url';
var testUrl = 'example.com';
if(url.indexOf(testUrl) !== -1) {
console.log('URL passed the test');
} else{
console.log('URL failed the test');
}
EDIT:
Why use indexOf instead of Regular Expression.
You see, what you have here for matching is a simple string (example.com) not a pattern. If you have a fixed string, then no need to introduce semantic complexity by checking for patterns.
Regular expressions are best suited for deciding if patterns are matched.
For example, if your requirement was something like the domain name should start with ex end with le and between start and end, it should contain alphanumeric characters out of which 4 characters must be upper case. This is the usecase where regular expression would prove beneficial.
You have simple problem so it's unnecessary to employ army of 1000 angels to convince someone who loves you. ;)
Use this:
/^[a-zA-Z0-9_.+-]+#(?:(?:[a-zA-Z0-9-]+\.)?[a-zA-Z]+\.)?
(domain|domain2)\.com$/g
To match the specific domain of your choice.
If you want to match only one domain then remove |domain2 from (domain|domain2) portion.
It will help you. https://www.regextester.com/94044
Not sure if this would work for your case, but it would probably be better to rely on the built in URL parser vs. using a regex.
var url = document.createElement('a');
url.href = "http://www.example.com/thing";
You can then call those values using the given to you by the API
url.protocol // (http:)
url.host // (www.example.com)
url.pathname // (/thing)
If that doesn't help you, something like this could work, but is likely too brittle:
var url = "http://www.example.com/thing";
var matches = url.match(/:\/\/(.[^\/]+)(.*)/);
// matches would return something like
// ["://example.com/thing", "example.com", "/thing"]
These posts could also help:
https://stackoverflow.com/a/3213643/4954530
https://stackoverflow.com/a/6168370
Good luck out there!
There are cases where the domain you're looking for could actually be found in the query section but not in the domain section: https://www.google.com/q=www.example.com
This answer would treat that case better.
See this example on regex101.
As you you pointed you only need example.com (write domain then escaped period then com), so use it in regex.
Example
UPDATED
See the answer below

Javascript Regex patterns to pickup URLs

To start off I know this is bad practice. I know there are libraries out there that are supposed to help with this; however, this is the task to which I was assigned and changing this whole thing to work with a library will be much more work than we can take on right now (since we are on a tight time frame).
In our web app we have fields that people usually type URLs into. We have been assigned a task to 'linkify' anything that looks like a URL. Currently the people who wrote our app seemed to have used a regex to determine if a string of text is a URL. I am basing my regex off that (I am no regex guru, not even a novice).
The 'search' regex looks like so
function DoesTextContainLinks(linktText) {
//replace all urls with links!
var linkifyValue = /((ftp|https?):\/\/)?(www\.)?([a-zA-Z0-9\-]{1,}\.){1,}[a-zA-Z0-9]{1,4}(:[0-9]{1,5})?(\/[a-zA-Z0-9\-\_\.\?\&\#]{1,})*(\/)?$/.test(linktText);
return linkifyValue;
}
Using this regex and https://regex101.com/ I have come up with two regexes that work most of the time.
function WrapLinkTextInAnchorTag(linkText) {
//capture links that only have www and add http to the begining of them (regex ignores entries that have http, https, and ftp in them. They are handled by the next regexes)
linkText = linkText.replace(/(^(?:(?!http).)*^(?:(?!ftp).)(www\.)?([a-zA-Z0-9\-]{1,}\.){1,}[a-zA-Z0-9]{1,4}(:[0-9]{1,5})?(\/[a-zA-Z0-9\-\_\.\?\&\#]{1,})*(\/)?$)/gim, "<a href='http://$1'>$1</a>");
//capture links that have https and http on them and fix those too. No need to prepend http here
linkText = linkText.replace(/(((https|http|ftp?):\/\/)?(www\.)?([a-zA-Z0-9\-]{1,}\.){1,}[a-zA-Z0-9]{1,4}(:[0-9]{1,5})?(\/[a-zA-Z0-9\-\_\.\?\&\#]{1,})*(\/)?$)/gim, "<a href='$1'>$1</a>");
return linkText;
}
The problem here is that some complex URLs seem to not work. I can't understand exactly why they don't work. regex101 is pretty bad ass in that it tells you what each part is doing; however, my trouble is combining these keywords in the regex to get them to do what I want. I have two scenarios to account for : when a user types www.something.com | ftp.something.com and when a user actually types http://www.something.com.
I am looking for some help in pointing out exactly what is wrong with my 2 regexes that prevents them from capturing complicated URLs like the one below
https://pw.something.com/AAPS/default.aspx?guid=a5741c35-6fe1-31a1-b555-4028e931642b
I use this one ...
^(http|https|ftp)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?\/?([a-zA-Z0-9\-\._\?\,\'\/\\\+&%\$#\=~])*$
Look here ... Regex Tester
URL RegExp that requires (http, https, ftp)://, A nice domain, and a decent file/folder string. Allows : after domain name, and these characters in the file/folder string (letter, numbers, - . _ ? , ' / \ + & % $ # = ~). It blocks all other special characters and id good for protecting against user input!
If you look closely you will notice that nowhere in your regexps do you match an = character. That's what's breaking on the example you give.
Changing the second regexp by adding a \= to the characters supported in the path:
linkText.replace(/(((https|http|ftp?):\/\/)?(www\.)?([a-zA-Z0-9\-]{1,}\.){1,}[a-zA-Z0-9]{1,4}(:[0-9]{1,5})?(\/[a-zA-Z0-9\-\_\.\?\&\#\=]{1,})*(\/)?$)/gim, "<a href='$1'>$1</a>");
Causes your example URL to match. That said it may be worth slogging through the RFC on urls (http://www.ietf.org/rfc/rfc3986.txt) to find other characters that might be allowed in URLs (even if they have special meanings) because you're probably missing some others.

Capture every URL in text [duplicate]

I have to find the first url in the text with a regular expression:
for example:
I love this website:http://www.youtube.com/music it's fantastic
or
[ es. http://www.youtube.com/music] text
I looked into this issue last year and developed a solution that you may want to look at - See: URL Linkification (HTTP/FTP) This link is a test page for the Javascript solution with many examples of difficult-to-linkify URLs.
My regex solution, written for both PHP and Javascript - is not simple (but neither is the problem as it turns out.) For more information I would recommend also reading:
The Problem With URLs by Jeff Atwood, and
An Improved Liberal, Accurate Regex Pattern for Matching URLs by John Gruber
The comments following Jeff's blog post are a must read if you want to do this right...
Note that this question gets asked a lot. Maybe do a search next time :)
You can't do this perfectly with a regular expression. You may be interested in this blog post. There is a bit more information on Regex Guru, but even those look very fragile. You will need to have additional checks outside of your regular expression to catch the edge cases.
Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript: https://github.com/ljosa/urlize.js
It actually would not pick up the URL in your example because there is a colon right before the URL. But if we modify the example a little:
urlize("I love this website: http://www.youtube.com/music it's fantastic", true, true)
=> 'I love this website: http://www.youtube.com/music it's fantastic"'
Note the second argument which, if true, inserts rel="nofollow" and the third argument which, if true, quotes characters that have special meaning in HTML.
This might work->
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
Found it somewhere
Will find links ->
http://foo.com/blah_blah/
(Something like http://foo.com/blah_blah)
http://foo.com/blah_blah_(wikipedia)
Hope this works....
i am using this regex : :) ( its translated ABNF )
[a-zA-Z]([a-zA-Z]|[0-9]|\+|\-|\.)*:\/\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:)*#)?(\[((([0-9A-Fa-f]{1,4}:){6}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|::([0-9A-Fa-f]{1,4}:){5}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|([0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){4}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,1}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){3}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,2}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){2}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,3}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}:([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,4}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,5}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}|(([0-9A-Fa-f]{1,4}:){0,6}[0-9A-Fa-f]{1,4})?::)|v[0-9A-Fa-f]\.(([a-zA-Z]|[0-9]|-|\.|_|~)|[!$&'\(\)\*\+,;=]|:))\]|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=])*)(:[0-9]*)?(((\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*)?|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*))?\/?(\?((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?(\#((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?
You can use the following regex expression for extracting any type of url coming in message.
String regex = "(http(s)?:\/\/.)?(www\.)?[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&/=]*)";
Typescript/Angular
This works for me:
const regExpressionUrl = new RegExp(/(https?:\/\/[^\s]+)/g); //detect URL
Ref: https://www.regextester.com/96249%7CRegular

Categories

Resources