Trying to find the regex for pretty URLs

I have a website structure like this:
/docs/one_of_the_resources/one-of-the-resources.html
/docs/a_complete_different_resource/a-complete-different-resource.html
I want to get rid of all sub-folders in the url and get this:
/one-of-the-resources.html
/a-complete-different-resource.html
Sub-folders should not be affected:
/docs/one_of_the_resources/assets/*
The folder name is always the same as the HTML file name, with dashes replaced by underscores and, of course, no suffix.
I'm using grunt-contrib-rewrite and grunt-connect.
Can't wrap my head around it. Is this even possible?

You can use a negated character class:
/\/[^/]+$/
\/ matches the last slash in the path.
[^/]+ matches anything other than a /. The quantifier + requires one or more characters.
$ anchors the regex at the end of the string.
Regex Demo
Example
const string = "/docs/one_of_the_resources/one-of-the-resources.html";
console.log(string.match(/\/[^/]+$/)[0]);
// => /one-of-the-resources.html
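The pattern above only extracts the file name. The full mapping the question asks for could be sketched roughly as below. This is a sketch under assumptions: toPrettyPath and toDocsPath are hypothetical helper names (not part of grunt-contrib-rewrite or grunt-connect), and the folder name is assumed to always be the file name with dashes replaced by underscores, as the question states.

```javascript
// Hypothetical helpers; the names are illustrative and not from any Grunt plugin.

// "/docs/one_of_the_resources/one-of-the-resources.html" -> "/one-of-the-resources.html"
function toPrettyPath(path) {
  const m = /^\/docs\/([^/]+)\/([^/]+)\.html$/.exec(path);
  if (!m) return path; // deeper paths such as .../assets/* are left alone
  const [, folder, file] = m;
  // Only shorten when the folder really is the file name with "-" replaced by "_".
  return folder === file.replace(/-/g, "_") ? `/${file}.html` : path;
}

// Inverse mapping, e.g. for resolving a pretty URL back to the real file.
function toDocsPath(pretty) {
  const m = /^\/([^/]+)\.html$/.exec(pretty);
  if (!m) return pretty;
  return `/docs/${m[1].replace(/-/g, "_")}/${m[1]}.html`;
}

console.log(toPrettyPath("/docs/one_of_the_resources/one-of-the-resources.html"));
// => /one-of-the-resources.html
console.log(toDocsPath("/a-complete-different-resource.html"));
// => /docs/a_complete_different_resource/a-complete-different-resource.html
```

Because both patterns are anchored and use [^/]+ for each segment, anything below an assets folder never matches and passes through unchanged.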

Related

jQuery replace a subdirectory with prefix and suffix

I want to replace a sub directory that has the prefix /s72 and the suffix /. Example:
https://www.example.com/dsasdsad/iufnasdadaso/s72/first-picture.img
https://www.example.com/efggvdfb/pothgpbmfkoe/s72-c/second-picture.img
https://www.example.com/jyhjgfdf/rokomvcrvkmw/s72-w222-h888/third-picture.img
https://www.example.com/pokmhfds/qprigmvmspej/s72-c-d/fourth-picture.img
I tried to replace /s72/, /s72-c/, /s72-w222-h888/, /s72-c-d/, and any other subdirectory that starts with /s72 with /w100-h100-c/ using jQuery .replace(). If I can't use jQuery .replace(), can I replace it another way?
I have tried:
.replace(/\/s72\S+/g, "/w100-h100-c/")
But the result is:
https://www.example.com/dsasdsad/iufnasdadaso/w100-h100-c/
https://www.example.com/efggvdfb/pothgpbmfkoe/w100-h100-c/
https://www.example.com/jyhjgfdf/rokomvcrvkmw/w100-h100-c/
https://www.example.com/pokmhfds/qprigmvmspej/w100-h100-c/
The filename is lost.

\S+ matches every non-whitespace character up to the end of the string, so it consumes the filename too.
Stop the match at the closing slash of the subdirectory instead:
.replace(/\/s72\S*\//g, "/w100-h100-c/")
https://regex101.com/r/UxZDSa/1
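A slightly stricter variant, shown here only as a sketch, swaps \S* for [^/]*. Since \S* is greedy and can cross slashes, it runs to the last slash in the URL; [^/]* stops at the subdirectory's own closing slash. The resize helper and the nested URL are made up for illustration.

```javascript
const TARGET = "/w100-h100-c/";

// [^/]* cannot cross a slash, so the match ends at the directory's own closing "/".
function resize(url) {
  return url.replace(/\/s72[^/]*\//, TARGET);
}

const urls = [
  "https://www.example.com/dsasdsad/iufnasdadaso/s72/first-picture.img",
  "https://www.example.com/efggvdfb/pothgpbmfkoe/s72-c/second-picture.img",
  "https://www.example.com/jyhjgfdf/rokomvcrvkmw/s72-w222-h888/third-picture.img",
  "https://www.example.com/pokmhfds/qprigmvmspej/s72-c-d/fourth-picture.img",
];
urls.map(resize).forEach((u) => console.log(u));

// Hypothetical URL with one more path segment, where the two patterns differ:
const nested = "https://www.example.com/a/s72-c/sub/pic.img";
console.log(nested.replace(/\/s72\S*\//, TARGET));
// => https://www.example.com/a/w100-h100-c/pic.img  ("sub" is lost)
console.log(resize(nested));
// => https://www.example.com/a/w100-h100-c/sub/pic.img
```

For the four URLs in the question both patterns give the same result, because the filename is the only segment after the s72 directory.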

there is no char between 2 group with regex

I need a regexp to filter a list of paths.
my sample list is below:
models/user.js
models/adapter.js
models/acquire.js
models/schema/extension.js
models/schema/permission.js
modules/breaktime/models/break.js
modules/breaktime/models/rule.js
modules/breaktime/models/step.js
modules/pbxmanager/models/group.js
modules/pbxmanager/models/member.js
modules/pbxmanager/models/shift.js
modules/breaktime/models/request.js
modules/breaktime/models/state.js
I want to define an exact starting path and get only the files directly under it, not those in subfolders.
For example, if I set models/ as the starting string in the regexp, I should get only the first 3 lines, not those under the schema folder.
I tried to make groups like (^start string)(exactly nothing)(end string$), but no luck:
(^models\/)[\w{0}](\/[\w]+\.js$)
https://regex101.com/r/wNtCni/1
I couldn't find out how to express "nothing between two groups" in a regexp.

The character class [\w{0}] matches exactly one character: a word character, {, 0, or }. Inside a character class, {0} is not a quantifier.
In your case you don't want that, so you can omit the character class, which gives (^models\/)(\/[\w]+\.js$). That pattern has one forward slash too many.
If you remove the extra / as well, along with the unnecessary groups and brackets, you get
^models\/\w+\.js$
Regex demo
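To apply the same idea to any starting path, the pattern can be built from a string. The sketch below assumes a hypothetical directChildren helper and escapes the directory name before inserting it into the regex; it keeps \w+ for the file name, as in the answer.

```javascript
// Hypothetical helper: keep only the .js files directly inside `dir`.
function directChildren(paths, dir) {
  // Escape regex metacharacters in the directory name.
  const escaped = dir.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // \w+ cannot match "/", so files in subfolders are rejected.
  const re = new RegExp(`^${escaped}\\w+\\.js$`);
  return paths.filter((p) => re.test(p));
}

const paths = [
  "models/user.js",
  "models/adapter.js",
  "models/acquire.js",
  "models/schema/extension.js",
  "models/schema/permission.js",
  "modules/breaktime/models/break.js",
  "modules/pbxmanager/models/group.js",
];

console.log(directChildren(paths, "models/"));
// => [ 'models/user.js', 'models/adapter.js', 'models/acquire.js' ]
```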

regex to match all keywords in a string

I'm new to regex and would appreciate some help from the community.
Let's say I have this string str:
1. www.anysite.com hello demo try this link
2. anysite.com indeed demo link
3. http://www.anysite.com another one
4. www.anysite.com
5. http://anysite.com
Consider lines 1-5 together as the whole string str.
I want to convert all 'anysite.com' into clickable html links, for which I am using:
str = str.replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '$1');
This converts every space-separated word starting with http/https/ftp into a link of the form
url
So lines 3 and 5 are converted correctly. To convert every www.anysite.com into a link as well, I then used:
str = str.replace(/(\b^(http|https|ftp)?(www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig, '$1');
However, it only converts www.anysite.com into a link when it appears at the very beginning of str, so it converts line 1 but not line 4.
Note that I used ^(http|https|ftp)?(www.) to find every www that does not start with http/https/ftp, since the http ones have already been converted.
Also, the link on line 2 starts with neither http nor www but ends with .com. What would the regex be for that?
For reference, if you post this whole string to your Facebook timeline, it converts all five lines into links.

Thanks for the help. The final regex that worked for me is:
//remove all http:// and https://
str = str.replace(/(http|https):\/\//ig, "");
//replace all string ending with .com or .in only into link
str = str.replace( /((www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.(com|in))/ig, '$1');
I used .com and .in for my specific requirement; otherwise the solution at http://regexr.com/39i0i will work.
There is still an issue, though: it doesn't convert shortened URLs into links correctly. For example, http://s.ly/qhdfTyuiOP is linked only up to s.ly.
Any further suggestions?

^(http|https|ftp)?(www\.) does not mean "all www not starting with http/https/ftp" but rather "a string that starts with an optional http/https/ftp followed by www.".
Indeed, ^ in this context isn't a negation but an anchor representing the start of the string. I suppose you used it this way because of its meaning inside a character class ([^...]); it is tricky, since its meaning changes depending on the context it is found in.
You could just remove it and you should be fine, as I see no point of making sure the string does not start with http/https/ftp (you transformed those occurrences just before, there should be none left).
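A few quick checks make the two meanings of ^ concrete. The sample strings are only illustrations.

```javascript
const text = "visit www.anysite.com";

// Outside a character class, ^ anchors the match to the start of the string.
console.log(/^www\./.test(text));              // false: the string starts with "visit"
console.log(/^www\./.test("www.anysite.com")); // true

// Without the anchor, "www." is found anywhere in the string.
console.log(/www\./.test(text));               // true

// As the first character inside [...], ^ negates the class.
console.log(/[^w]/.test("www"));               // false: every character is "w"
console.log(/[^w]/.test("www."));              // true: "." is not "w"
```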
If you wanted to express some kind of negation, the easiest way would be a negative lookbehind that includes the :// separator:
(?<!(?:https?|ftp):\/\/)www\.
This matches "www." only when it is not immediately preceded by http://, https://, or ftp://.
Edit: I mentioned lookbehind but forgot that it was not available in JS at the time. Engines that implement ES2018 now support it.
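As a sketch of how a negative lookbehind behaves here: the assertion checks the characters immediately before "www.", so the :// separator has to be part of it. The sample text and the [\w.-]+ host pattern are assumptions for illustration, and an engine with ES2018 lookbehind support is assumed.

```javascript
const text = "www.anysite.com and http://www.anysite.com and ftp://www.files.com";

// Lookbehind that includes the "://" separator: only the bare www. host matches.
const bareWww = /(?<!(?:https?|ftp):\/\/)www\.[\w.-]+/g;
console.log(text.match(bareWww));
// => [ 'www.anysite.com' ]

// Without "://" in the lookbehind nothing is excluded, because the characters
// directly before "www." are "//", not "http", "https" or "ftp".
const withoutSeparator = /(?<!http|https|ftp)www\.[\w.-]+/g;
console.log(text.match(withoutSeparator).length);
// => 3
```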

Hashtag linking in AngularJS and JS

I'm new to regular expressions and don't really understand them. I'm getting comments from a PHP script that may or may not include hashtags. I need to turn each hashtag into a link (excluding URLs and hashtags that contain a comma or a space).
So far I've looked online and found this:
string = string.replace(/(^|\s)(#[a-z\d-]+)/ig, "$1$2");
However, the link generated is:
#thenameofhashtag
I need to exclude the # from the tag= value in the link. How can I modify the expression to achieve this, and is there an AngularJS way of doing this? Additionally, would other languages (Chinese, Japanese, etc.) or characters that are not UTF-8 encoded create problems?

You can move the # outside the capturing group so that it is not captured in $2:
/(^|\s)#([a-z\d-]+)/ig
In #([a-z\d-]+), the # sits outside the group, so only [a-z\d-]+ is captured.
Example
string.replace(/(^|\s)#([a-z\d-]+)/ig, "$1#$2");
// => #thenameofhashtag
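Put together, the replacement could look roughly like the sketch below. The link target /search?tag= and the linkHashtags helper are assumptions, since the question does not show its real URL format. The optional Unicode mode is one possible way to handle non-Latin hashtags, using ES2018 property escapes; it is not part of the answer above.

```javascript
// Hypothetical link target; the real URL format is not given in the question.
const TAG_URL = "/search?tag=";

function linkHashtags(text, unicode = false) {
  // ASCII pattern from the answer, or a Unicode-aware variant (\p{L} = letters, \p{N} = digits).
  const re = unicode
    ? /(^|\s)#([-\p{L}\p{N}]+)/gu
    : /(^|\s)#([a-z\d-]+)/gi;
  // `lead` is the start-of-string or whitespace before "#"; `tag` excludes the "#".
  return text.replace(re, (match, lead, tag) =>
    `${lead}<a href="${TAG_URL}${encodeURIComponent(tag)}">#${tag}</a>`);
}

console.log(linkHashtags("new #angular-js post"));
// => new <a href="/search?tag=angular-js">#angular-js</a> post
console.log(linkHashtags("see http://x.com/#top"));
// => see http://x.com/#top   (the # follows "/", not whitespace, so it is not linked)
console.log(linkHashtags("#日本語", true));
```

The ASCII pattern leaves a tag such as #日本語 untouched, which is why the Unicode variant is offered as an option.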

Building a Hashtag in Javascript without matching Anchor Names, BBCode or Escaped Characters

I would like to convert any instances of a hashtag in a String into a linked URL:
#hashtag -> should have "#hashtag" linked.
This is a #hashtag -> should have "#hashtag" linked.
This is a [url=http://www.mysite.com/#name]named anchor[/url] -> should not be linked.
This isn&#39;t a pretty way to use quotes -> should not be linked.
Here is my current code:
String.prototype.parseHashtag = function() {
  return this.replace(/[^&][#]+[A-Za-z0-9-_]+(?!])/, function(t) {
    var tag = t.replace("#","")
    return t.link("http://www.mysite.com/tag/"+tag);
  });
};
Currently, this appears to handle escaped characters (by excluding matches preceded by an ampersand) and named anchors, but it doesn't link the #hashtag when it is the first thing in the message, and it seems to include the 1-2 characters before the "#" in the link.
Halp!

How about the following:
/(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g
This matches the hashtags in your example. Instead of relying on lookbehind (which JavaScript engines only gained with ES2018), it matches either the start of the string or any character except & before the hashtag. It captures that prefix so it can be put back in the replacement. It also captures the name of the hashtag.
So, for example:
subject.replace(/(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g, "$1http://www.mysite.com/tag/$2");
will transform
#hashtag
This is a #hashtag and this one #too.
This is a [url=http://www.mysite.com/#name]named anchor[/url]
This isn&#39;t a pretty way to use quotes
into
http://www.mysite.com/tag/hashtag
This is a http://www.mysite.com/tag/hashtag and this one http://www.mysite.com/tag/too.
This is a [url=http://www.mysite.com/#name]named anchor[/url]
This isn&#39;t a pretty way to use quotes
This probably isn't what t.link() (which I don't know) would have returned, but I hope it's a good starting point.
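To get closer to what String.prototype.link returns (an HTML anchor wrapped around the text), the same pattern can be used with an anchor in the replacement string. This is a sketch: the linkTags name is hypothetical, and the escaped-apostrophe input assumes the entity is written as &#39;.

```javascript
const HASHTAG = /(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g;

// Hypothetical helper: $1 restores the character before "#", $2 is the tag name.
function linkTags(s) {
  return s.replace(HASHTAG, '$1<a href="http://www.mysite.com/tag/$2">#$2</a>');
}

console.log(linkTags("#hashtag"));
// => <a href="http://www.mysite.com/tag/hashtag">#hashtag</a>
console.log(linkTags("This is a #hashtag and this one #too."));
console.log(linkTags("This is a [url=http://www.mysite.com/#name]named anchor[/url]"));
// => unchanged: "#name" is followed by "]", which the lookahead rejects
console.log(linkTags("This isn&#39;t a pretty way to use quotes"));
// => unchanged: the "#" is preceded by "&"
```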

There is an open-source Ruby gem for this sort of thing (hashtags and @usernames) called twitter-text. You might get some ideas and regexes from it, or try its JavaScript port.
Using the JavaScript port, you would just do:
var linked = TwitterText.auto_link_hashtags(text, {hashtag_url_base: "http://www.mysite.com/tag/"});

Tim, your solution was almost perfect. Here's what I ended up using:
subject.replace(/(^| )#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g, "$1#$2");
The only change is the first group, which now matches the beginning of the string or a space character. (I tried \s, but that didn't work at all.)
