Javascript routing regex

I need to build a router that routes a REST request to the correct controller and action. Here are some examples:
POST /users
GET /users/:uid
GET /users/search&q=lol
GET /users
GET /users/:uid/pictures
GET /users/:uid/pictures/:pid
It is important to have a single regular expression that is as efficient as possible, since routing is essential and runs on every request.
We first have to replace each : parameter (up to the end of the string or the next forward slash /) in the route with a regex that we can then use to validate the request URL.
How can we replace these dynamic segments with a regex? That is, search for a string that starts with ":" and ends with "/", the end of the string, or "&".
This is what I tried:
var fixedUrl = new RegExp(url.replace(/\\\:[a-zA-Z0-9\_\-]+/g, '([a-zA-Z0-0\-\_]+)'));
For some reason it does not work. How could I implement a regex that replaces :id with a regex, or just ignores such parameters when comparing against the real request URL?
Thanks for help

I'd use :[^\s/]+ for matching parameters starting with colon (match :, then as many characters as possible except / and whitespace).
As replacement, I'm using ([\\w-]+) to match any alphanumeric character, - and _, in a capture group, given you're interested in using the matched parameters as well.
var route = "/users/:uid/pictures";
var routeMatcher = new RegExp(route.replace(/:[^\s/]+/g, '([\\w-]+)'));
var url = "/users/1024/pictures";
console.log(url.match(routeMatcher))
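The matcher above is not anchored, so it would also accept a longer URL that merely contains the route. A minimal sketch of an anchored variant that also returns the parameters by name follows. The helper names compileRoute and matchRoute are hypothetical, and the sketch assumes the static parts of a route contain no regex metacharacters.

```javascript
// Hypothetical helper: turn a route such as "/users/:uid/pictures" into an
// anchored RegExp plus the ordered list of parameter names.
// Assumption: static route parts contain no regex metacharacters.
function compileRoute(route) {
  const names = [];
  const source = route.replace(/:([^\s\/]+)/g, (_, name) => {
    names.push(name);
    return "([\\w-]+)";
  });
  // ^ and $ force the whole URL to match; a trailing slash is tolerated.
  return { regex: new RegExp("^" + source + "/?$"), names };
}

// Hypothetical helper: return an object of named parameters, or null.
function matchRoute(compiled, url) {
  const m = compiled.regex.exec(url);
  if (!m) return null;
  const params = {};
  compiled.names.forEach((name, i) => {
    params[name] = m[i + 1];
  });
  return params;
}

const picturesRoute = compileRoute("/users/:uid/pictures/:pid");
console.log(matchRoute(picturesRoute, "/users/1024/pictures/7"));
```

Compiling each route once at startup and reusing the RegExp keeps the per-request cost to a single exec() call per candidate route.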

Related

Regular expression match the beginning and the end without the middle

How could I merge both those regular expressions? This is interesting because it would allow to match the beginning of a string and the end without touching the content in the middle.
function cleanURL(url) {
  url = url.replace(/^(?:https?:\/\/)?(?:www\.|ww\d\.|w\d\d\.)?/, '');
  url = url.replace(/(\/.*)/, '');
  return url;
}
console.log(cleanURL('https://hello-world.example.tld/yello-blue/green'))
Result: hello-world.example.tld

Rather than trying to use regex to tease out parts of URLs just use the URL class:
new URL("https://hello-world.example.tld/yello-blue/green").hostname
That doesn't mean it can't be done with a regex, you just need to look for whatever is between // and /.
All this said, I'm not sure what your actual intent is, because there's a bunch of shenanigans with www etc. The URL approach won't filter, it'll just return the hostname.
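Both approaches can be sketched side by side. The function names cleanHost and cleanHostRegex are hypothetical, and treating scheme-less input as http is an assumption made only so that the URL constructor accepts it.

```javascript
// Hypothetical helper: hostname via the URL class, then strip a leading
// www-style label using the same prefixes as the question's first regex.
function cleanHost(input) {
  // Assumption: input without a scheme is treated as an http URL.
  const hasScheme = /^[a-z][a-z\d+.-]*:\/\//i.test(input);
  const { hostname } = new URL(hasScheme ? input : "http://" + input);
  return hostname.replace(/^(?:www|ww\d|w\d\d)\./, "");
}

// Hypothetical helper: the two replacements from the question merged into
// one regex. The capture group keeps everything between the optional
// prefix and the first "/".
function cleanHostRegex(url) {
  return url.replace(
    /^(?:https?:\/\/)?(?:www\.|ww\d\.|w\d\d\.)?([^\/]*).*$/,
    "$1"
  );
}

console.log(cleanHost("https://hello-world.example.tld/yello-blue/green"));
console.log(cleanHostRegex("https://hello-world.example.tld/yello-blue/green"));
```

The two differ on edge cases: the URL class drops a port number from the hostname, while the regex version keeps whatever precedes the first slash.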

How to remove URL from a string completely in Javascript?

I have a string that may contain several url links (http or https). I need a script that would remove all those URLs from the string completely and return that same string without them.
I tried so far:
var url = "and I said http://fdsadfs.com/dasfsdadf/afsdasf.html";
var protomatch = /(https?|ftp):\/\//; // NB: not '.*'
var b = url.replace(protomatch, '');
console.log(b);
but this only removes the http part and keeps the link.
How do I write a regex that removes everything that follows http and also handles several links in the string?
Thank you so much!

You can use this regex:
var b = url.replace(/(?:https?|ftp):\/\/[\n\S]+/g, '');
//=> and I said
This regex matches and removes any URL that starts with http:// or https:// or ftp:// and matches up to next space character OR end of input. [\n\S]+ will match across multi lines as well.
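A small sketch wrapping this idea in a reusable function follows. The name stripUrls is hypothetical. It deliberately swaps [\n\S]+ for plain \S+, so a URL ends at any whitespace including a newline, and it tidies the spaces left behind, which the original one-liner does not do.

```javascript
// Hypothetical helper: remove every http, https, or ftp URL from a string.
// Assumption: a URL ends at the first whitespace character.
function stripUrls(text) {
  return text
    .replace(/(?:https?|ftp):\/\/\S+/g, "") // drop each URL
    .replace(/[ \t]{2,}/g, " ")             // collapse the gaps left behind
    .trim();
}

console.log(stripUrls("and I said http://fdsadfs.com/dasfsdadf/afsdasf.html"));
console.log(stripUrls("see http://a.example/x and https://b.example/y now"));
```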

Did you search for a URL parser regex? This question has a few comprehensive answers: "Getting parts of a URL (Regex)".
That said, if you want something much simpler (and maybe not as perfect), you should remember to capture the entire url string and not just the protocol.
Something like
/(https?|ftp):\/\/[.a-zA-Z0-9\/-]+/
should work better. Notice that the added half parses the rest of the URL after the protocol.

if pathname starts with, as well as contains . Regex

I am trying to test the pathname of the url, checking if pathname starts with privmsg as well as contains one of the words in the selection. And my quantifier is selecting that at least one word must be found.
New RegExp thanks to one of the answers and I extended it more.
var post = /(^\/privmsg\?).+(post|reply){1}(.*)?/;
My urls will look like
/privmsg?mode=post
/privmsg?mode=reply
/privmsg?mode=reply&p=2 //another way
Though we have other modes that I do not want. I need to just get the constant url beginning with privmsg and having at least post or reply in it. Can someone explain what is wrong with my regex string and if I used the quantifier incorrectly.
Problem now is that it is still coming out false...

You need to allow for arbitrary characters between ? and (post|reply) (i.e. mode=). E.g.:
var post = /^\/privmsg\?.+(post|reply){1}/g;
Here the .+ after \? matches any sequence of 1 or more characters.

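You forgot to account for mode=.
As written, your regex only matches strings like /privmsg?post.
So alter your regex to allow for mode=:
^\/privmsg\?.*(post|reply)$
Note that the trailing $ requires the URL to end in post or reply, so /privmsg?mode=reply&p=2 will not match unless you drop it.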
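A stricter sketch ties the match to the mode parameter itself, so that an unrelated value containing "post" does not pass. The name isPostOrReply is hypothetical. The regex omits the g flag on purpose, because a global regex keeps its position in lastIndex between test() calls and can return alternating results.

```javascript
// Hypothetical helper: true when the path starts with /privmsg? and its
// query string has mode=post or mode=reply as a complete parameter.
function isPostOrReply(path) {
  // (?:.*&)?  allows other parameters before mode=
  // (?:&|$)   requires the value to end at "&" or at the end of the string
  return /^\/privmsg\?(?:.*&)?mode=(post|reply)(?:&|$)/.test(path);
}

console.log(isPostOrReply("/privmsg?mode=post"));
console.log(isPostOrReply("/privmsg?mode=reply&p=2"));
console.log(isPostOrReply("/privmsg?mode=delete"));
```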

Bookmarklet - Verify URL format and extract substring

I'm trying to build a bookmarklet that performs a service client side, but I'm not really fluent in Javascript. In my code below I want to take the current page url and first verify that it's a url following a specific format after the domain, which is...
/photos/[any alphanumeric string]/[any numeric string]
after that 3rd "/" should always be the numeric string that I need to extract into a var. Also, I can't just start from the end and work backwards because there will be times that there is another "/" after the numeric string followed by other things I don't need.
Is indexOf() the right function to verify if the url is the specific format and how would I write that expression? I've tried several things related to indexOf() and Regex(), but had no success. I seem to always end up with an unexpected character or it just doesn't work.
And of course the second part of my question is once I know the url is the right format, how do I extract the numeric string into a variable?
Thank you for any help!
javascript:(function(){
// Retrieve the url of the current page
var photoUrl = window.location.pathname;
if(photoUrl.indexOf(/photos/[any alphanumeric string]/[any numeric string]) == true) {
// Extract the numeric substring into a var and do something with it
} else {
// Do something else
}
})();

var id = window.location.pathname.match(/\/photos\/(\w+)\/(\d+)/i);
if (id) alert(id[1]); // use 1 or 2 depending on what you want
else alert('url did not fit expected format');
(EDIT: changed first \d* to \w+ and second \d* to \d+ and dig to id.)

To test strings for patterns and extract their parts, you can use regular expressions. An expression for your criteria would look like this:
/^\/photos\/\w+\/(\d+)\/?$/
It will match any string starting with /photos/, followed by one or more alphanumeric characters (or underscores), a /, and one or more digits wrapped in a capture group, with an optional / at the end of the string.
So, if we do this:
"/photos/abc123/123".match(/^\/photos\/\w+\/(\d+)\/?$/)
the result will be ["/photos/abc123/123", "123"]. As you might have noticed, the capture group is the second array element.
Ready to use function:
var extractNumeric = function (string) {
  var exp = /^\/photos\/\w+\/(\d+)\/?$/,
      out = string.match(exp);
  return out ? out[1] : false;
};
So, the answers:
"Is indexOf() the right function to verify if the url is the specific format and how would I write that expression? I've tried several things related to indexOf() and Regex(), but had no success. I seem to always end up with an unexpected character or it just doesn't work."
indexOf() isn't the best choice for the job because it only searches for a fixed substring. You were right to reach for a regular expression.
"And of course the second part of my question is once I know the url is the right format, how do I extract the numeric string into a variable?"
A regular expression together with the match() function lets you test the string for the desired format and get its portions at the same time.
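The question also mentions that the numeric string is sometimes followed by another "/" and further segments, which a pattern ending in \/?$ would reject. A minimal sketch that tolerates such a tail follows. The name extractPhotoId is hypothetical, and returning null on failure is a choice made for this example.

```javascript
// Hypothetical helper: return the numeric segment of
// /photos/<alphanumeric>/<digits>[/anything], or null if the path
// does not follow that format.
function extractPhotoId(pathname) {
  // (?:\/|$) requires the digits to be a complete path segment, followed
  // either by another "/" or by the end of the string.
  const m = /^\/photos\/\w+\/(\d+)(?:\/|$)/.exec(pathname);
  return m ? m[1] : null;
}

console.log(extractPhotoId("/photos/abc123/123"));
console.log(extractPhotoId("/photos/abc123/123/sizes/l"));
console.log(extractPhotoId("/photos/abc123/12x"));
```

In the bookmarklet, the argument would be window.location.pathname.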

Javascript url validation allowing relative and absolute urls

I'm trying to validate a field to allow relative and absolute urls. I'm using the regex from this post but it is allowing spaces in the url.
var urlRegex = new RegExp(/(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?)/gi);
Example:
// this should work
this/will/work.aspx?say=hello
http://www.example.com/this/will/work.aspx?say=hello
// this shouldn't work but does
and/this will also work/even though it shouldn't
and/this-shouldn't/but it does/also
The code below is what I was originally using to validate just absolute urls and it was working perfectly. If I remember properly, I pulled it from the jquery source. If this could be modified to also accept relative urls that would be perfect, but this is out of my league.
var urlRegex = new RegExp(/^(https?|ftp):\/\/(((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:)*#)?(((\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5])\.(\d|[1-9]\d|1\d\d|2[0-4]\d|25[0-5]))|((([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|\d|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.)+(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])*([a-z]|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])))\.?)(:\d*)?)(\/((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)+(\/(([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)*)*)?)?(\?((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|[\uE000-\uF8FF]|\/|\?)*)?(\#((([a-z]|\d|-|\.|_|~|[\u00A0-\uD7FF\uF900-\uFDCF\uFDF0-\uFFEF])|(%[\da-f]{2})|[!\$&'\(\)\*\+,;=]|:|#)|\/|\?)*)?$/i);

I think you just need to anchor the pattern so that it has to match the whole string:
var urlRegex = /^(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?)$/gi;
The leading ^ and trailing $ mean that the pattern has to match the entire string instead of just some part of it.
edit: that said, the pattern has other problems. First, those HTML entities for & (&amp;) need to be just "&". The slashes don't need to be escaped inside [] character classes, and we don't need the "g" flag. That leaves us with:
var urlRegex = /^(?:(\/?[\w-]+)(\/[\w-]+)*\/?|(((http|ftp|https):\/\/)?[\w-]+(\.[\w-]+)*([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?))$/i;
edit again: oops, the whole alternation also needs to be wrapped in a group so that ^ and $ apply to both branches.
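Why the wrapping matters can be shown with a toy pattern. This is only an illustration with made-up words, not the URL regex itself: alternation has lower precedence than the anchors, so without a group each anchor binds to just one branch.

```javascript
// Without a group: "starts with cat" OR "ends with dog".
const loose = /^cat|dog$/;

// With a non-capturing group: the whole string must be "cat" or "dog".
const strict = /^(?:cat|dog)$/;

console.log(loose.test("catalog"));  // true: it starts with "cat"
console.log(strict.test("catalog")); // false: not exactly "cat" or "dog"
console.log(loose.test("hotdog"));   // true: it ends with "dog"
console.log(strict.test("dog"));     // true
```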

I wrote an article about URI validation complete with code snippets for all the various URI components as defined by RFC3986 here:
Regular Expression URI Validation
You may find what you are looking for there. Note however that almost any string represents a valid URI - even an empty string!
