Regex expression to match certain url behavior in my website - javascript

I have the following url
https://myurl/blogs/<blog-category>/<blog-article>
I'm trying to create a regex so I can trigger a script only when I'm on an article page.
I tried this, among other tests, but it didn't work, and I'm not really the best at building regexes.
window.location.pathname.match(/\/blogs\/^[a-zA-Z0-9_.-]*$\/^[a-zA-Z0-9_.-]*$/
As I understand it, the first part of this regex (\/blogs\/) just matches a fixed string.
The next parts each try to match any combination of letters, digits and the characters _ . - (which covers the strings I can have there).
However, this is not working at all.
My piece of script is looking like this
if(window.location.pathname.match(/\/blogs\/^[a-zA-Z0-9_.-]*$\/^[a-zA-Z0-9_.-]*$/){
// A code implementation here
}
Note: One thing I noticed while writing this is that if I remove everything else and just try
window.location.pathname.match(/\/blogs\/)
It doesn't work either.
Can someone help me solve this? I will also appreciate any guide that can help me improve my RegEx skills.
Thanks!
Update: to get this working properly I had to split my condition into two parts.
It ended up looking like this:
var path = window.location.pathname;
const regEx = /\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*/i;
if(path.match(regEx)){
// My code here
}

This should work:
\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*
The "^" symbol asserts the start of the string and "$" asserts the end, so placing them in the middle of the pattern means it can never match the path in question.
I would suggest testing your regex at https://regexr.com/ to rule out problems caused by other code.
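To make the anchor placement concrete, here is a minimal sketch, assuming article paths always have the shape /blogs/<category>/<article> with an optional trailing slash. The helper name isBlogArticlePath and the sample paths are made up for illustration. "^" and "$" appear once each, at the two ends of the pattern, and "+" is used instead of "*" so that empty segments are rejected.

```javascript
// Hypothetical helper, not part of the original post.
// Matches only /blogs/<category>/<article>, optionally followed by one "/".
function isBlogArticlePath(pathname) {
  const articlePattern = /^\/blogs\/[a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+\/?$/;
  return articlePattern.test(pathname);
}

console.log(isBlogArticlePath("/blogs/news/my-first-post")); // true
console.log(isBlogArticlePath("/blogs/news"));               // false: category page only
console.log(isBlogArticlePath("/shop/blogs/news/post"));     // false: does not start with /blogs/
```

In the browser this would presumably be called as isBlogArticlePath(window.location.pathname).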

var patt = /\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*/i;
window.location.pathname.match(patt);
You can try using this

Related

What RegEx to use to include all subroutes except for specific urls

I'm terrible at regex and I could use some help building a regular expression that targets all subroutes on a specific domain while excluding a couple of specific subroutes.
The regex is to be used in JavaScript (as page targeting within the Optimizely software).
Should allow:
www.mydomain.com/**/*
www.mydomain.com/foo/**/*
Should not allow
www.mydomain.com/foo/bar/**/*
www.mydomain.com/baz/**/*
The part I am most struggling with is allowing everything, also allowing everything ending with /foo/... except when it is ending with /foo/bar/..., while also excluding anything ending with /baz/....
Any help is much appreciated, thank you in advance!
You can use a negative lookahead assertion to exclude specific patterns:
^www\.mydomain\.com\/(?!(?:foo\/bar|baz)\/).*\/.*
Demo: https://regex101.com/r/w6MQA0/1
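As a quick check of the lookahead approach, the sketch below runs that pattern against a few made-up URLs (the paths news/today and items/1 are assumptions standing in for the **/* placeholders). The negative lookahead sits directly after the domain, so it rejects the URL as soon as the path begins with foo/bar/ or baz/.

```javascript
// Pattern from the answer above; the sample URLs are invented for illustration.
const allowedPage = /^www\.mydomain\.com\/(?!(?:foo\/bar|baz)\/).*\/.*/;

const samples = [
  "www.mydomain.com/news/today",      // allowed
  "www.mydomain.com/foo/items/1",     // allowed
  "www.mydomain.com/foo/bar/items/1", // excluded by the lookahead
  "www.mydomain.com/baz/items/1",     // excluded by the lookahead
];

for (const url of samples) {
  console.log(url, allowedPage.test(url));
}
```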
Use this (www.mydomain.com\/)(([a-z]+\/)*(foo\/))?\*\*\/\*. It should work.
It's working in this scenario:
`www.mydomain.com/**/*`
or
`www.mydomain.com/<any params may or may not be>/foo/**/*`
Code:
var regx = /(www.mydomain.com\/)(([a-z]+\/)*(foo\/))?\*\*\/\*/; // no "g" flag: a global regex keeps lastIndex between test() calls
var ar = ['www.mydomain.com/**/*', 'www.mydomain.com/foo/**/*', 'www.mydomain.com/foo/bar/**/*', 'www.mydomain.com/baz/**/*'];
regx.test(ar[0]) // true
regx.test(ar[1]) // true
regx.test(ar[2]) // false
regx.test(ar[3]) // false
Demo: https://regex101.com/r/05vUz8/1
Other regexes for reference:
https://regex101.com/r/NoDI87/1
https://regex101.com/r/HFaQo0/1
Thanks for the replies, they helped me in finding the solution myself:
domain\.com((?=\/foo)|(?!\/foo\/bar\/|\/baz\/)).*

Regex to exclude specific websites in javascript?

I have a regex which matches all websites, but I want to exclude 2 specific websites from it.
Regex is
[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)
Websites I want to exclude are
www.gfycat.com
www.imgur.com
imgur.com/*
gfycat.com/*
Is it possible to write a regex which excludes these specific websites? Any suggestions on how to solve this problem?
/[-a-zA-Z0-9#:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)/
I have attached the screenshot for match patterns.
Using RES and regex need to be implemented here.
Try this
^(?:(?!(?:www\.)?(?:google|gfycat|imgur))[-a-zA-Z0-9#:%._\+~#=]{2,256})\.[a-z]{2,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*)
I'm not sure why you need a regex for this. Unless I've misunderstood completely, can't you do something simple like the following?
const url = new URL('https://www.google.co.uk/?gfe_rd=cr&ei=bN5IWaP7CYyDtAHKv4CIBg#q=hello');
if (url.hostname === 'www.google.com') {
  // ignore
} else {
  // process
}
The answer is not relevant to the specific question as OP is using a different tool
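Where plain JavaScript is available (which, per the comment above, may not be the case inside RES), the hostname comparison could be sketched as follows. The names excludedHosts and isExcluded are hypothetical, and the sketch assumes that links without a scheme should be treated as https and that a leading "www." should be ignored. Other subdomains (for example i.imgur.com) are not covered.

```javascript
// Hypothetical helper: decides whether a link points at an excluded site.
const excludedHosts = new Set(["gfycat.com", "imgur.com"]);

function isExcluded(link) {
  // new URL() needs a scheme, so add one when the link has none (assumption).
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(link);
  const url = new URL(hasScheme ? link : "https://" + link);
  // Treat www.imgur.com and imgur.com as the same host.
  const host = url.hostname.replace(/^www\./, "");
  return excludedHosts.has(host);
}

console.log(isExcluded("www.imgur.com/gallery/abc"));         // true
console.log(isExcluded("https://gfycat.com/some-clip"));      // true
console.log(isExcluded("https://www.google.co.uk/?q=hello")); // false
```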

JavaScript RegEx match unless wrapped with [nocode][/nocode] tags

My current code is:
var user_pattern = this.settings.tag;
user_pattern = user_pattern.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&"); // escape regex
var pattern = new RegExp(user_pattern.replace(/%USERNAME%/i, "(\\S+)"), "ig");
Where this.settings.tag is a string such as "[user=%USERNAME%]" or "#%USERNAME%". The code uses pattern.exec(str) to find any username in the corresponding tag and works perfectly fine. For example, if str = "Hello, [user=test]" then pattern.exec(str) will find test.
This works fine, but I want to be able to stop it from matching if the string is wrapped in [nocode][/nocode] tags. For example, if str = "[nocode]Hello, [user=test], how are you?[/nocode]" then pattern.exec(str) should not match anything.
I'm not quite sure where to start. I tried using a (?![nocode]) before and after the pattern, but to no avail. Any help would be great.
I would just test if the string starts with [nocode] first:
/^\[nocode\]/.test('[nocode]');
Then simply do not process it.
Maybe filter out [nocode] before trying to find the username(s)?
pattern.exec(str.replace(/\[nocode\]([\s\S]*?)\[\/nocode\]/g,''));
I know this isn't exactly what you asked for because now you have to use two separate regular expressions, however code readability is important too and doing it this way is definitely better in that aspect. Hope this helps 😉
JSFiddle: http://jsfiddle.net/1f485Lda/1/
It's based on this: Regular Expression to get a string between two strings in Javascript
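Putting the two steps together, a minimal sketch might look like this. The function names escapeRegExp and findUsernames are made up; the escaping and the %USERNAME% substitution follow the question's code, and the [nocode] removal uses a lazy [\s\S]*? so that several separate [nocode] blocks are each removed on their own.

```javascript
// Hypothetical helpers; only the regexes come from the thread above.
function escapeRegExp(text) {
  return text.replace(/[\-\[\]\/\{\}\(\)\*\+\?\.\\\^\$\|]/g, "\\$&");
}

function findUsernames(str, tag) {
  // Build the tag pattern exactly as in the question.
  const pattern = new RegExp(escapeRegExp(tag).replace(/%USERNAME%/i, "(\\S+)"), "ig");
  // Step 1: drop every [nocode]...[/nocode] block (lazy match, spans newlines).
  const visible = str.replace(/\[nocode\][\s\S]*?\[\/nocode\]/gi, "");
  // Step 2: collect usernames from what is left.
  const names = [];
  let match;
  while ((match = pattern.exec(visible)) !== null) {
    names.push(match[1]);
  }
  return names;
}

console.log(findUsernames("Hello, [user=test]", "[user=%USERNAME%]")); // [ 'test' ]
console.log(findUsernames("[nocode]Hello, [user=test], how are you?[/nocode]", "[user=%USERNAME%]")); // []
```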

regex exclude certain tags

I just need some quick help solving this problem.
I want to strip all HTML tags out of a string except the tags in a whitelist (a variable).
My code so far:
whitelist = 'p|br|ul|li|strike|em|strong|a',
reqExp = new RegExp('<\/?[^>|' + whitelist + ']+\/?>');
It works more or less, but it fails to remove, for example, b, because the character class also contains the b from the br in the whitelist.
I tried different approaches but couldn't find the right solution.
How can I tell the regex to do something like /.WITHOUT(smth)/ (that is: match everything except what follows)?
Use this regex:-
<(?!/?(p|br|ul|li|strike|em|strong|a)(>|\s))[^<]+?>
LIVE DEMO
For more information, refer to my earlier answer, which fulfills your requirement.
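Since the question keeps the whitelist in a variable, the same negative-lookahead idea can be sketched with the RegExp constructor. This is a variation on the regex above rather than the answerer's exact pattern: it assumes a tag name ends at whitespace, "/" or ">" (so self-closing tags such as <br/> are kept too), and the name stripTags is made up. Regex-based stripping like this is best treated as a formatting aid rather than a security filter.

```javascript
// Whitelist from the question; the pattern is a sketch built around it.
const whitelist = "p|br|ul|li|strike|em|strong|a";

// Match any tag whose name is NOT a complete whitelisted name.
const disallowedTag = new RegExp(
  "<(?!\\/?(?:" + whitelist + ")(?=[\\s/>]))[^<>]+>",
  "gi"
);

function stripTags(html) {
  return html.replace(disallowedTag, "");
}

const input = '<p>Hi <b>bold</b><br/><span class="x">text</span> <a href="#">link</a></p>';
console.log(stripTags(input));
// <p>Hi bold<br/>text <a href="#">link</a></p>
```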

Capture every URL in text [duplicate]

I have to find the first url in the text with a regular expression:
for example:
I love this website:http://www.youtube.com/music it's fantastic
or
[ es. http://www.youtube.com/music] text
I looked into this issue last year and developed a solution that you may want to look at - See: URL Linkification (HTTP/FTP) This link is a test page for the Javascript solution with many examples of difficult-to-linkify URLs.
My regex solution, written for both PHP and Javascript - is not simple (but neither is the problem as it turns out.) For more information I would recommend also reading:
The Problem With URLs by Jeff Atwood, and
An Improved Liberal, Accurate Regex Pattern for Matching URLs by John Gruber
The comments following Jeff's blog post are a must read if you want to do this right...
Note that this question gets asked a lot. Maybe do a search next time :)
You can't do this perfectly with a regular expression. You may be interested in this blog post. There is a bit more information on Regex Guru, but even those look very fragile. You will need to have additional checks outside of your regular expression to catch the edge cases.
Identifying URLs is tricky because they are often surrounded by punctuation marks and because users frequently do not use the full form of the URL. Many JavaScript functions exist for replacing URLs with hyperlinks, but I was unable to find one that works as well as the urlize filter in the Python-based web framework Django. I therefore ported Django's urlize function to JavaScript: https://github.com/ljosa/urlize.js
It actually would not pick up the URL in your example because there is a colon right before the URL. But if we modify the example a little:
urlize("I love this website: http://www.youtube.com/music it's fantastic", true, true)
=> 'I love this website: http://www.youtube.com/music it's fantastic"'
Note the second argument which, if true, inserts rel="nofollow" and the third argument which, if true, quotes characters that have special meaning in HTML.
This might work->
\b(([\w-]+://?|www[.])[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/)))
Found it somewhere
Will find links ->
http://foo.com/blah_blah/
(Something like http://foo.com/blah_blah)
http://foo.com/blah_blah_(wikipedia)
Hope this works....
I am using this regex (it is translated from the ABNF grammar):
[a-zA-Z]([a-zA-Z]|[0-9]|\+|\-|\.)*:\/\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:)*#)?(\[((([0-9A-Fa-f]{1,4}:){6}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|::([0-9A-Fa-f]{1,4}:){5}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|([0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){4}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,1}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){3}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,2}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:){2}([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,3}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}:([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,4}[0-9A-Fa-f]{1,4})?::([0-9A-Fa-f]{1,4}:[0-9A-Fa-f]{1,4}|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0
-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9]))|(([0-9A-Fa-f]{1,4}:){0,5}[0-9A-Fa-f]{1,4})?::[0-9A-Fa-f]{1,4}|(([0-9A-Fa-f]{1,4}:){0,6}[0-9A-Fa-f]{1,4})?::)|v[0-9A-Fa-f]\.(([a-zA-Z]|[0-9]|-|\.|_|~)|[!$&'\(\)\*\+,;=]|:))\]|(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])\.(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=])*)(:[0-9]*)?(((\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|\/((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*)?|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*|(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|#){1}(\/(([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)*)*))?\/?(\?((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?(\#((([a-zA-Z]|[0-9]|-|\.|_|~)|%[0-9A-Fa-f][0-9A-Fa-f]|[!$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?
You can use the following regular expression to extract any type of URL from a message (written as a Java string, so backslashes are doubled):
String regex = "(http(s)?://.)?(www\\.)?[-a-zA-Z0-9#:%._\\+~#=]{2,256}\\.[a-z]{2,6}\\b([-a-zA-Z0-9#:%_\\+.~#?&/=]*)";
Typescript/Angular
This works for me:
const regExpressionUrl = new RegExp(/(https?:\/\/[^\s]+)/g); //detect URL
Ref: https://www.regextester.com/96249
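For the original question (finding only the first URL), a deliberately simple sketch along the lines of the https?:// pattern above could look like this. The function name firstUrl is made up, and the sketch assumes URLs start with http:// or https://, stop at whitespace, brackets or parentheses, and never end in punctuation. As the other answers point out, those assumptions break on harder cases such as URLs containing parentheses.

```javascript
// Hypothetical helper: returns the first http(s) URL in the text, or null.
function firstUrl(text) {
  // No "g" flag, so match() returns only the first occurrence.
  const match = text.match(/https?:\/\/[^\s<>\[\]()]+/i);
  if (!match) {
    return null;
  }
  // Trim punctuation that usually belongs to the sentence, not the URL.
  return match[0].replace(/[.,;:!?'"]+$/, "");
}

console.log(firstUrl("I love this website:http://www.youtube.com/music it's fantastic"));
// http://www.youtube.com/music
console.log(firstUrl("[ es. http://www.youtube.com/music] text"));
// http://www.youtube.com/music
console.log(firstUrl("no links here")); // null
```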
