Make regex betterer - javascript

I need an expression that covers two eventualities:
www.example.com
knowledge.example.com
There are many other possible subdomains so it needs to be specifically either the root domain or the knowledge domain.
I did have a go and this appears to work. But it looks long and unsightly and I wondered if there was a more elegant regex:
(www\.)?(knowledge\.)?(example\.com)
It's not that long and ugly, I suppose. I'm just curious if I'm approaching it right or if there's a shorter way of writing it.

This is slightly less ugly, in my opinion:
(www|knowledge)\.(example\.com)
Sometimes I prefer this:
(www|knowledge)[.](example[.]com)
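None of the patterns above are anchored, so they would also match inside a longer hostname. A minimal sketch of an anchored check, assuming the input is a bare hostname (for example `location.hostname`); the helper name `isAllowedHost` is made up for illustration:

```javascript
// Assumption: the input is a bare hostname, not a full URL.
const allowedHost = /^(?:www|knowledge)\.example\.com$/;

// Hypothetical helper: true only for the two allowed subdomains.
function isAllowedHost(hostname) {
  return allowedHost.test(hostname);
}

console.log(isAllowedHost('www.example.com'));          // true
console.log(isAllowedHost('knowledge.example.com'));    // true
console.log(isAllowedHost('shop.example.com'));         // false
console.log(isAllowedHost('www.example.com.evil.org')); // false
```

Without `^` and `$`, the last hostname would pass because the pattern only needs to match somewhere in the string.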

Full equivalent of your regexp:
((?:www|knowledge)\.)?(example\.com)

Bonus answer:
You can use conditionals within your regex pattern (in engines that support them, such as PCRE),
e.g. (?(?!www\.)knowledge|www)(?:\.example\.com)
Working demo @ regex101
Edit I
For regex engines that do not support conditionals, the workaround below mimics the if-else flow with a pair of lookaheads:
((?=positive-regex-statement)then|(?!negative-regex-statement)else)
((?=www\.)www|(?!www\.)knowledge)(?:\.example\.com)
Working demo @ regex101-javascript-conditionals

Regex is not the most readable thing in the world. If you want something clearer in its meaning, try this:
var domain = 'www.example.com';
// Pull out whatever precedes ".example.com"
var subdomain = domain.replace(/(\w+)\.example\.com/, '$1');
var validSubdomains = ['www', 'knowledge'];
validSubdomains.indexOf(subdomain) != -1; // true when the subdomain is allowed

Related

Regex expression to match certain url behavior in my website

I have the following url
https://myurl/blogs/<blog-category>/<blog-article>
I'm trying to create a regex so I can trigger a script only when I'm on an article page.
I tried this, among other tests, but it didn't work, and I'm not really the best at building regexes.
window.location.pathname.match(/\/blogs\/^[a-zA-Z0-9_.-]*$\/^[a-zA-Z0-9_.-]*$/
So in my understanding the first part of this regex (\/blogs\/) just tries to match a fixed string.
The next parts just try to match any combination of letters, digits, and _.- (which is basically the set of strings that I can have there).
However, this is not working at all.
My script looks like this:
if(window.location.pathname.match(/\/blogs\/^[a-zA-Z0-9_.-]*$\/^[a-zA-Z0-9_.-]*$/){
// A code implementation here
}
Note: One thing that I noticed when writing this is that if I remove everything else and just try
window.location.pathname.match(/\/blogs\/)
It doesn't work either.
Can someone help me solve this? I will also appreciate any guide that can help me improve my RegEx skills.
Thanks!
Update: to get this working I had to split my condition into two steps.
It ended up looking like this:
var path = window.location.pathname;
const regEx = /\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*/i;
if(path.match(regEx)){
// My code here
}
This should work:
\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*
The "^" symbol anchors the match to the start of the string (and "$" anchors it to the end). Neither is true in the middle of the URL path, so the original pattern can never match.
I would suggest using https://regexr.com/ to test your regex, which rules out issues coming from other code.
You can try using this:
var patt = /\/blogs\/[a-zA-Z0-9_.-]*\/[a-zA-Z0-9_.-]*/i;
window.location.pathname.match(patt);
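The patterns in these answers use `*`, so they also match paths with empty segments such as `/blogs//`, and being unanchored they match deeper paths too. A stricter variant could look like the sketch below; the helper name `isArticlePath` and the optional trailing slash are assumptions, not part of the original question:

```javascript
// Hypothetical helper: true only for /blogs/<category>/<article> paths.
function isArticlePath(pathname) {
  // ^ and $ anchor the whole path; + requires each segment to be non-empty;
  // the optional trailing slash tolerates /blogs/a/b/.
  return /^\/blogs\/[a-zA-Z0-9_.-]+\/[a-zA-Z0-9_.-]+\/?$/.test(pathname);
}

console.log(isArticlePath('/blogs/news/my-first-post')); // true
console.log(isArticlePath('/blogs/news'));               // false
console.log(isArticlePath('/blogs//'));                  // false
```

In the browser this would be called as `isArticlePath(window.location.pathname)`.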

What regular expression would I use to grab a certain part of this link?

https://www.twitch.tv/averagepothead/clip/TiredRoughElkSquadGoals
I would like to use a regular expression to specifically grab everything after /clip/, aka the five random words that denote the clip "id". I've been looking up other examples on here, but unfortunately when I write my own expressions based on them I don't get it exactly right... if anyone would be able to point me in the right direction that would be amazing. Thank you!
Regex? Arguably the wrong tool for the job:
const [dontcare, words] = url.split('clip/');
To show what I mean, here's a quick-and-dirty regex version:
const match = url.match(/[a-zA-Z0-9\/\.:]+clip\/(\w+)/);
const words = match && match[1];
That regex is pretty gnarly for such a basic task. You could make it shorter:
/.*clip\/(\w+)/
at the cost of making it even slower than it already is. Regexes are great for stuff that can't be represented simply as a quick string operation, but are more trouble than they're worth for something like this.
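Another string-operation route is to let the standard `URL` class parse the link and then pick the path segment after `clip`. This is only a sketch: the helper name `getClipId` is invented here, and it assumes the link is a well-formed absolute URL (`new URL` throws otherwise).

```javascript
// Hypothetical helper: returns the path segment that follows "clip", or null.
function getClipId(link) {
  // pathname is "/averagepothead/clip/..." so drop the empty pieces.
  const segments = new URL(link).pathname.split('/').filter(Boolean);
  const i = segments.indexOf('clip');
  return i !== -1 && i + 1 < segments.length ? segments[i + 1] : null;
}

console.log(getClipId('https://www.twitch.tv/averagepothead/clip/TiredRoughElkSquadGoals'));
// "TiredRoughElkSquadGoals"
```

Unlike `split('clip/')`, this ignores any query string or fragment that follows the id.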

Regex to exclude specific websites in javascript?

I have a regex which matches all websites, but I want to exclude 2 specific websites from it.
Regex is
[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)
Websites I want to exclude are
www.gfycat.com
www.imgur.com
imgur.com/*
gfycat.com/*
Is it possible to write a regex which excludes these specific websites? Any suggestions on how to solve this problem?
/[-a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)/
I have attached a screenshot of the match patterns.
I am using RES, so this has to be implemented with a regex.
Try this:
^(?:(?!(?:www\.)?(?:google|gfycat|imgur))[-a-zA-Z0-9@:%._\+~#=]{2,256})\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&//=]*)
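To see how the negative lookahead behaves, here is a reduced sketch limited to the two sites from the question. The simplified character class and the helper name `isAllowedSite` are assumptions made for the example; it is not a drop-in replacement for the full pattern.

```javascript
// Assumption: only the bare and www. hosts of gfycat.com and imgur.com are excluded.
const urlLike = /^(?!(?:www\.)?(?:gfycat|imgur)\.com\b)[-a-zA-Z0-9:%._+~#=]{2,256}\.[a-z]{2,6}\b/;

// Hypothetical helper: true when the text looks like a site and is not excluded.
function isAllowedSite(text) {
  return urlLike.test(text);
}

console.log(isAllowedSite('www.imgur.com'));    // false
console.log(isAllowedSite('gfycat.com/abc'));   // false
console.log(isAllowedSite('example.com/page')); // true
console.log(isAllowedSite('imgurfan.com'));     // true
```

Including `\.com\b` inside the lookahead keeps unrelated hosts that merely start with the same letters, such as `imgurfan.com`, from being excluded.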
Not sure why you need a regex for this. Can you not do something simple like the below, unless I have misunderstood completely?
var url = new URL('https://www.google.co.uk/?gfe_rd=cr&ei=bN5IWaP7CYyDtAHKv4CIBg#q=hello');
if (url.hostname == 'www.google.com') {
  // ignore
} else {
  // process
}
The answer is not relevant to the specific question as OP is using a different tool

Javascript form validation/sanitizing do i need regex here?

I have a single form input that is for checking domains. Sometimes people type in www. before the domain or .com after the domain name. The service that I use to check availability automatically checks all top-level domains, so when people add the .com at the end it becomes redundant. For example, the string submitted is domainname.com.com, which is clearly invalid.
I understand you can do this on the server side, but due to some rather weird circumstances I must use JavaScript for this. So is regex the solution here? If so, is there some kind of regex generator I can use for this, or can someone point me in the right direction with a code snippet perhaps?
Appreciate any help, thanks!
This page has an example Regex.
function isUrl(s) {
  // Scheme (http, https, ftp, ftps), host, optional numeric port, optional path/query
  var regexp = /^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:[0-9]+)?(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$/;
  return regexp.test(s);
}
Here is another example.
function isUrl(s) {
  var regexp = /(ftp|http|https):\/\/(\w+:{0,1}\w*@)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%@!\-\/]))?/;
  return regexp.test(s);
}
Well, regex is one possible solution. You can peel off common TLDs like this:
input = input.replace(
/\.(?:com|net|org|biz|edu|in(?:t|fo)|gov|mil|mobi|museum|[a-z][a-z])$/i, "");
Is that the kind of thing you're looking for?
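Putting the pieces of the question together (a leading www. and a trailing TLD), a sanitizer could look roughly like this. The function name `sanitizeDomainInput` and the choice to also drop a leading `http://` or `https://` are assumptions for the sketch, and it strips at most one trailing suffix.

```javascript
// Hypothetical helper: reduces user input to the bare domain label.
function sanitizeDomainInput(input) {
  return input
    .trim()
    .toLowerCase()
    // Drop an optional scheme and an optional leading "www."
    .replace(/^(?:https?:\/\/)?(?:www\.)?/, '')
    // Drop one trailing TLD (common generic ones, or any two-letter code)
    .replace(/\.(?:com|net|org|biz|edu|in(?:t|fo)|gov|mil|mobi|museum|[a-z][a-z])$/, '');
}

console.log(sanitizeDomainInput('www.domainname.com')); // "domainname"
console.log(sanitizeDomainInput(' domainname.net '));   // "domainname"
console.log(sanitizeDomainInput('domainname'));         // "domainname"
```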

Javascript Regex: extracting variables from paths

I'm trying to extract variable names from paths. A variable is preceded by : and optionally enclosed in (), and the number of variables may vary.
"foo/bar/:firstVar/:(secondVar)foo2/:thirdVar"
Expected output should be:
['firstVar', 'secondVar', 'thirdVar']
Tried something like
"foo/bar/:firstVar/:(secondVar)foo2/:thirdVar".match(/\:([^/:]\w+)/g)
but it doesn't work (somehow it captures the colons and doesn't handle the optional parentheses). If there is some regex mage around, please help. Thanks a lot in advance!
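var path = "foo/bar/:firstVar/:(secondVar)foo2/:thirdVar";
var matches = [];
// replace() is used only to visit every match; its return value is discarded
path.replace(/:\(?(\w+)\)?/g, function(a, b){
  matches.push(b); // b is the first capture group (the variable name)
});
matches; // ["firstVar", "secondVar", "thirdVar"]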
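In environments that support `String.prototype.matchAll` (ES2020), the same extraction can be written without relying on `replace` for its side effect. A small sketch, with the helper name `pathVars` chosen only for illustration:

```javascript
// Hypothetical helper: collects the names captured by group 1 of each match.
function pathVars(path) {
  // matchAll requires the g flag and yields one match array per occurrence.
  return Array.from(path.matchAll(/:\(?(\w+)\)?/g), (m) => m[1]);
}

console.log(pathVars('foo/bar/:firstVar/:(secondVar)foo2/:thirdVar'));
// ["firstVar", "secondVar", "thirdVar"]
```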
What about this:
/\:\(?([A-Za-z0-9_\-]+)\)?/
matches:
:firstVar
:(secondVar)
:thirdVar
$1 contains:
firstVar
secondVar
thirdVar
May I recommend that you look into the URI template specification? It does exactly what you're trying to do, but more elegantly. I don't know of any current URI template parsers for JavaScript, since it's usually a server-side operation, but a minimal implementation would be trivial to write.
Essentially, instead of:
foo/bar/:firstVar/:(secondVar)foo2/:thirdVar
You use:
foo/bar/{firstVar}/{secondVar}foo2/{thirdVar}
Hopefully, it's pretty obvious why this format works better in the case of secondVar. Plus it has the added advantage of being a specification, albeit currently still a draft.
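As a rough idea of what a minimal implementation might involve, the sketch below handles only the plain `{name}` form shown above, not the full template syntax. Both helper names are made up for the example.

```javascript
// Hypothetical helper: lists the {name} placeholders in a template string.
function templateVars(template) {
  return Array.from(template.matchAll(/\{(\w+)\}/g), (m) => m[1]);
}

// Hypothetical helper: substitutes values, percent-encoding each one.
// Placeholders with no supplied value are left untouched.
function expandTemplate(template, values) {
  return template.replace(/\{(\w+)\}/g, (whole, name) =>
    Object.prototype.hasOwnProperty.call(values, name)
      ? encodeURIComponent(values[name])
      : whole);
}

const template = 'foo/bar/{firstVar}/{secondVar}foo2/{thirdVar}';
console.log(templateVars(template)); // ["firstVar", "secondVar", "thirdVar"]
console.log(expandTemplate(template, { firstVar: 'a', secondVar: 'b c', thirdVar: 'd' }));
// "foo/bar/a/b%20cfoo2/d"
```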
