Match URL and its subdirectories - javascript

How would I write a regex to match /path/subpath and all /path/subpath/*, but not /path/subpathrandomcharacters?
I am currently trying this: \/path\/subpath.*
But this also matches /path/subpathrandomcharacters

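After /path/subpath, the rest of the string is either empty or a slash followed by anything. The ^ and $ anchors make sure the pattern covers the whole string, so /path/subpathrandomcharacters cannot match partially.
^\/path\/subpath(\/.*)?$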
https://www.regextester.com/?fam=103013
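A minimal sketch of how this behaves in JavaScript, assuming the pattern is anchored with ^ and $ so that it has to cover the whole string. The helper name isUnderSubpath is hypothetical and only there for illustration.

```javascript
// Hypothetical helper: true for /path/subpath and anything below it.
// ^ and $ anchor the pattern so a partial (prefix) match cannot succeed.
const subpathPattern = /^\/path\/subpath(\/.*)?$/;

function isUnderSubpath(pathname) {
  return subpathPattern.test(pathname);
}

console.log(isUnderSubpath("/path/subpath"));                 // true
console.log(isUnderSubpath("/path/subpath/"));                // true
console.log(isUnderSubpath("/path/subpath/a/b"));             // true
console.log(isUnderSubpath("/path/subpathrandomcharacters")); // false
```

Without the anchors, test() would still report a match for the last input, because /path/subpath occurs as a prefix of it.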

Related

there is no char between 2 group with regex

I need a regexp to filter a list of paths.
My sample list is below:
models/user.js
models/adapter.js
models/acquire.js
models/schema/extension.js
models/schema/permission.js
modules/breaktime/models/break.js
modules/breaktime/models/rule.js
modules/breaktime/models/step.js
modules/pbxmanager/models/group.js
modules/pbxmanager/models/member.js
modules/pbxmanager/models/shift.js
modules/breaktime/models/request.js
modules/breaktime/models/state.js
I define the exact start path and want to get only the files directly under that path, not those in subfolders.
For example, if I set models/ as the start string in the regexp, I should only get the first 3 lines, not those under the schema folder.
I tried to make groups like (^start string)(exactly nothing)(end string$) but had no luck.
(^models\/)[\w{0}](\/[\w]+\.js$)
https://regex101.com/r/wNtCni/1
I couldn't find how to set "nothing between two group" in regexp.
A character class such as [\w{0}] always expects exactly one character: a word character, {, 0, or }. Inside the brackets, {0} is not a quantifier.
You don't want that here, so you can omit the character class, which gives (^models\/)(\/[\w]+\.js$). That pattern has one forward slash too many.
If you remove the extra / as well as the unnecessary groups and brackets, you get
^models\/\w+\.js$
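A small sketch of the filter in JavaScript, using a few entries from the sample list. The helpers escapeRegExp and directFilesUnder are hypothetical names; they only show one way the start path could be made configurable, assuming file names consist of word characters and end in .js.

```javascript
// A few entries from the question's sample list.
const paths = [
  "models/user.js",
  "models/adapter.js",
  "models/acquire.js",
  "models/schema/extension.js",
  "modules/breaktime/models/break.js",
];

// Escape regex metacharacters so the start path is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: keep only files directly under startPath.
// \w+ cannot match "/", so anything in a subfolder is rejected.
function directFilesUnder(startPath, list) {
  const pattern = new RegExp("^" + escapeRegExp(startPath) + "\\w+\\.js$");
  return list.filter((p) => pattern.test(p));
}

console.log(directFilesUnder("models/", paths));
// -> models/user.js, models/adapter.js, models/acquire.js
```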

RegEx match only final domain name from any email address

I want to match only parent domain name from an email address, which might or might not have a subdomain.
So far I have tried this:
new RegExp(/.+#(:?.+\..+)/);
The results:
Input: abc#subdomain.maindomain.com
Output: ["abc#subdomain.maindomain.com", "subdomain.maindomain.com"]
Input: abc#maindomain.com
Output: ["abc#maindomain.com", "maindomain.com"]
I am interested in the second match (the group).
My objective is that in both cases, I want the group to match and give me only maindomain.com
Note: before the down vote, please note that neither have I been able to use existing answers, nor the question matches existing ones.
One simple regex you can use to get only the last 2 parts of the domain name is
/[^.]+\.[^.]+$/
It matches a sequence of non-period characters, followed by period and another sequence of non-periods, all at the end of the string. This regex doesn't ensure that this domain name happens after a "#". If you want to make a regex that also does that, you could use lazy matching with "*?":
/#.*?([^.]+\.[^.]+)$/
However, I think that trying to do everything at once tends to make regexes more complicated and harder to read. In this problem I would prefer to do things in two steps: first check that the email has an "#" in it, then take the part after the "#" and pass it to the simple regex, which will extract the domain name.
One advantage of separating things is that some changes are easier. For example, if you want to make sure that your email only has a single "#" in it, that is very easy to do in a separate step but would be tricky to achieve in the "do everything" regex.
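The two-step idea could look roughly like this. It is only a sketch: the function name mainDomain is hypothetical, and "#" is used as the separator exactly as in the question's sample inputs.

```javascript
// Hypothetical helper implementing the two-step approach.
function mainDomain(address) {
  const parts = address.split("#");

  // Step 1: require exactly one separator with text on both sides.
  if (parts.length !== 2 || parts[0] === "" || parts[1] === "") {
    return null;
  }

  // Step 2: take the last two dot-separated labels of the host part.
  const match = parts[1].match(/[^.]+\.[^.]+$/);
  return match ? match[0] : null;
}

console.log(mainDomain("abc#subdomain.maindomain.com")); // maindomain.com
console.log(mainDomain("abc#maindomain.com"));           // maindomain.com
console.log(mainDomain("abc#def#maindomain.com"));       // null
console.log(mainDomain("no-separator.example.com"));     // null
```

Keeping the separator check out of the regex means the "last two labels" rule stays a one-liner. Note that this rule would return co.uk for a host such as example.co.uk.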
You can use this regex:
/#(?:[^.\s]+\.)*([^.\s]+\.[^.\s]+)$/gm
Use captured group #1 for your result.
It matches # followed by 0 or more instances of non-dot text and a dot, i.e. (?:[^.\s]+\.)*.
([^.\s]+\.[^.\s]+)$ then matches and captures the last 2 components, separated by a dot.
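As a rough illustration of why the pattern carries the g and m flags, here is a sketch that runs it over several addresses at once, one per line. The variable names are made up for the example.

```javascript
// With the m flag, $ matches at the end of every line, not only at the
// end of the whole input; the g flag lets matchAll find every address.
const domainPattern = /#(?:[^.\s]+\.)*([^.\s]+\.[^.\s]+)$/gm;

const input = "abc#subdomain.maindomain.com\nabc#maindomain.com";

// Capture group 1 holds the last two components of each host.
const domains = Array.from(input.matchAll(domainPattern), (m) => m[1]);

console.log(domains); // both entries are "maindomain.com"
```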
With the following, maindomain should always hold the maindomain.com part of the string.
var pattern = new RegExp(/(?:[\.#])(\w[\w-]*\w\.\w*)$/);
var str = "abc#subdomain.maindomain.com";
var maindomain = str.match(pattern)[1];
http://codepen.io/anon/pen/RRvWkr
EDIT: tweaked to disallow domains starting with a hyphen, e.g. '-yahoo.com'

Trying to find the regex for pretty URLs

I have a website structure like this:
/docs/one_of_the_resources/one-of-the-resources.html
/docs/a_complete_different_resource/a-complete-different-resource.html
I want to get rid of all sub-folders in the url and get this:
/one-of-the-resources.html
/a-complete-different-resource.html
Sub-folders should not be affected:
/docs/one_of_the_resources/assets/*
The folder name is always the same as the HTML file name, except that dashes are replaced with underscores and, of course, there is no suffix.
I'm using grunt-contrib-rewrite and grunt-connect.
Can't wrap my head around it. Is this even possible?
You can use a negated character class
/\/[^/]+$/
[^/]+ Matches anything other than a /. The quantifier + ensures one or more characters.
$ Anchors the regex at the end of the string.
Example
const string = "/docs/one_of_the_resources/one-of-the-resources.html";
console.log(string.match(/\/[^/]+$/)[0]);
// => /one-of-the-resources.html
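To connect this to the rewrite itself, here is a sketch of the mapping in both directions as plain functions. The names toPrettyUrl and toInternalPath are hypothetical, and nothing here uses the grunt-contrib-rewrite or grunt-connect APIs; the /docs/ prefix, the .html suffix and the dash/underscore rule are taken from the question.

```javascript
// Hypothetical helper: full path -> pretty URL (last segment, .html only).
function toPrettyUrl(fullPath) {
  const match = fullPath.match(/\/[^/]+\.html$/);
  return match ? match[0] : null;
}

// Hypothetical helper: pretty URL -> path on disk.
// The folder name is the file name with dashes replaced by underscores.
function toInternalPath(prettyUrl) {
  const match = prettyUrl.match(/^\/([^/]+)\.html$/);
  if (!match) {
    return null;
  }
  const folder = match[1].replace(/-/g, "_");
  return "/docs/" + folder + "/" + match[1] + ".html";
}

console.log(toPrettyUrl("/docs/one_of_the_resources/one-of-the-resources.html"));
// /one-of-the-resources.html
console.log(toInternalPath("/a-complete-different-resource.html"));
// /docs/a_complete_different_resource/a-complete-different-resource.html
```

Requests that already contain a folder, such as anything under /docs/one_of_the_resources/assets/, do not match the pattern in toInternalPath and would be left alone.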

if pathname starts with, as well as contains . Regex

I am trying to test the pathname of the URL, checking that it starts with privmsg and also contains one of the words in the selection. My quantifier is meant to require that at least one of the words is found.
Here is the new RegExp, based on one of the answers and extended a bit further:
var post = /(^\/privmsg\?).+(post|reply){1}(.*)?/;
My urls will look like
/privmsg?mode=post
/privmsg?mode=reply
/privmsg?mode=reply&p=2 //another way
We also have other modes that I do not want. I just need to match URLs beginning with privmsg and containing at least post or reply. Can someone explain what is wrong with my regex string and whether I used the quantifier incorrectly?
Problem now is that it is still coming out false...
You need to allow for arbitrary characters between ? and (post|reply) (i.e. mode=). E.g.:
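var post = /^\/privmsg\?.+(post|reply)/;
Here .+ matches any sequence of 1 or more characters, which covers mode=. The {1} quantifier in your regex is redundant, and a g flag is best left off when the regex is used with test(), because it makes repeated calls stateful.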
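Your regex is missing something to cover mode=.
With your regex you will only match strings like /privmsg?post.
So alter your regex to allow for the text before the mode value:
^\/privmsg\?.*(post|reply)$
Note that the trailing $ rejects /privmsg?mode=reply&p=2; drop it if other parameters may follow the mode.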
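A quick sketch for checking the sample URLs, assuming the trailing $ is left off so that extra parameters are allowed. The second pattern, strictMode, is not from the answers; it is a hypothetical stricter variant that requires an actual mode parameter whose value is exactly post or reply.

```javascript
// Loose check: starts with /privmsg? and contains post or reply somewhere.
const loose = /^\/privmsg\?.*(post|reply)/;

// Hypothetical stricter variant: mode must be a real parameter and its
// value must be exactly post or reply (followed by & or end of string).
const strictMode = /^\/privmsg\?(?:.*&)?mode=(?:post|reply)(?:&|$)/;

const urls = [
  "/privmsg?mode=post",
  "/privmsg?mode=reply",
  "/privmsg?mode=reply&p=2",
  "/privmsg?mode=edit",
  "/privmsg?mode=postpone",
];

for (const url of urls) {
  console.log(url, loose.test(url), strictMode.test(url));
}
// The first three pass both checks, mode=edit fails both,
// and mode=postpone passes only the loose check.
```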

Stop matching when a certain character is reached

I've the following regex which needs to stop matching when it encounters a hash.
Regex:
/[?&]+([^=&]+)=([^&]*)/gi
URL Sample:
http://website.com/1068?page=4&taco=cat#tasty
The above regex will capture cat#tasty instead of just cat in the last capture group. I attempted the following which works ONLY if a hash is present.
Regex Test:
/[?&]+([^=&]+)=([^&]*)#/gi
If the URL doesn't have a hash, it won't match. Making the hash optional with #? doesn't work either, as the greedy * of the last capture group still grabs cat#tasty.
A little-known way to parse URLs in JavaScript is to simply create an a element and give it the url as the href attribute!
var link = document.createElement('a');
link.href = "http://website.com/1068?page=4&taco=cat#tasty";
alert(link.search); // ?page=4&taco=cat
alert(link.hash);   // #tasty
Just tossing this out there. If you do your regex on just link.search (or perhaps link.search.substr(1)) you won't have to worry about ever matching anything but parameters.
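Alternatively, add # to the negated character class of the last group, so the value stops at the hash:
/[?&]+([^=&]+)=([^&#]*)/gi
Although, as Ray pointed out, there are many URL parsers available.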
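Both routes can be sketched side by side on the sample URL. The variable names are made up for the example; the second half uses the standard URL class available in browsers and Node.js instead of a third-party parser.

```javascript
const sampleUrl = "http://website.com/1068?page=4&taco=cat#tasty";

// Excluding "#" from the value class stops the capture at the fragment.
const paramPattern = /[?&]+([^=&]+)=([^&#]*)/g;

const fromRegex = {};
for (const [, key, value] of sampleUrl.matchAll(paramPattern)) {
  fromRegex[key] = value;
}
console.log(fromRegex.page, fromRegex.taco); // 4 cat

// The URL class separates query string and fragment without any regex.
const parsed = new URL(sampleUrl);
console.log(parsed.searchParams.get("taco")); // cat
console.log(parsed.hash);                     // #tasty
```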
