Auto-shorten the URL in JS or Rails

In Twitter, if you post a link in a tweet, for example http://stackoverflow.com/questions/8699459/get-title-content-via-link-in-rails,
the URL is automatically changed to a shortened one.
The corresponding HTML is:
<span class="invisible">http://</span>
<span class="js-display-url">stackoverflow.com/questions/8699</span>
<span class="invisible">459/get-title-content-via-link-in-rails</span>
(the http:// and the long part are hidden)
So how can I achieve this? I believe it has something to do with JS? Or how should I approach this with Rails?
Many thanks!

I think this is what you're looking for or at least close to it.
aHref.innerHTML = aHref.href.match(/([\w\d\-_]*\.)+(\w*$|\w*\/[\w\d]*)/)[0];
In short, this ignores the protocol:// and takes every string ending in "." (for subdomains) until it reaches the last item (could be "com", could be "uk" as in "co.uk"), denoted by either "$" (end of the string) or a "/" followed by an alphanumeric string. The resulting string then replaces the innerHTML, i.e. the displayed content, of the link.
So for example: "http://test.couch.com/333/fire" would become "test.couch.com/333"
Edit: I should add that I've only accounted for A-Za-z0-9 and -_ in the URL up to the "/", and only A-Za-z0-9 for the remainder.
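As a minimal sketch of the idea above, the regex can be wrapped in a small function that works on a plain URL string. The name shortenDisplayUrl is hypothetical, and falling back to the original string when nothing matches is an assumption of this sketch, not part of the answer.

```javascript
// Hypothetical helper: returns the shortened display text for a URL string.
// Falls back to the unchanged input when the pattern does not match.
function shortenDisplayUrl(url) {
  var match = url.match(/([\w\d\-_]*\.)+(\w*$|\w*\/[\w\d]*)/);
  return match ? match[0] : url;
}

console.log(shortenDisplayUrl("http://test.couch.com/333/fire")); // test.couch.com/333
```

In a page, the result could be assigned to the link's displayed text while the href keeps the full URL.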
Edit 2: Using it in a Rails controller
The same principle applies, but without innerHTML, since that is a DOM element property:
url = url.match(/([\w\d\-_]*\.)+(\w*$|\w*\/[\w\d]*)/)[0]
The line above does the same thing as the JavaScript, or at least creates the same shortened URL.
If you mean how to use it with the link_to helper:
short_url = url.match(/([\w\d\-_]*\.)+(\w*$|\w*\/[\w\d]*)/)[0]
link_to short_url, url

Related

Javascript | Link/Bookmarklet to remove variables in url

I found this and was able to do what I initially wanted.
Javascript | Link/Bookmarklet to replace current window location
javascript:(function(){var loc=location.href;loc=loc.replace('gp/product','dp'); location.replace(loc)})()
Which was to change an amazon url from the product link to the dude perfect link.
It turns this url: https://www.amazon.com/gp/product/B01NBKTPTS/
into this url: https://www.amazon.com/dp/B01NBKTPTS/
I would like to take this a step further. Is there a way to do the above switch and then also remove the string of variables after the ? essentially cleaning up
https://www.amazon.com/gp/product/B01NBKTPTS/?pf_rd_r=DQV2YXJP8FFKM1Q50KS9&pf_rd_p=eb347dce-a775-4231-8920-ae66bdd987f4&pf_rd_m=ATVPDKIKX0DER&pf_rd_t=Landing&pf_rd_i=16310101&pf_rd_s=merchandised-search-2&linkCode=ilv&tag=onamzbybcreat-20&ascsubtag=At_Home_Cooking_210426210002&pd_rd_i=B01NBKTPTS
to
https://www.amazon.com/dp/B01NBKTPTS/
Thanks!

You've almost done it yourself!
To do the second part, you can call split on the URL string with /? as the separator.
In our case that gives you an array with two elements: the first holds everything BEFORE the /? (index [0], which is the part we want), and the second holds everything AFTER it (index [1], not needed here).
FYI: if the URL contained more than one /?, split would produce an array with more elements.
Note that / is not a special character inside a string literal, so the backslash in '\/?' is harmless but not required; escaping it as \/ only matters inside a regular expression literal.
So here is the final working bookmarklet code, which takes the part of the URL before /? and replaces gp/product with dp:
javascript:(function(){
var loc=location.href;
loc=loc
.split('\/?')[0]
.replace('gp/product','dp')
+'/';
location.replace(loc);
})();
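An alternative sketch, assuming the browser (or Node) provides the standard URL class: parsing the address lets the query string be dropped even when there is no slash directly before the "?". The function name cleanProductUrl is hypothetical.

```javascript
// Hypothetical helper: rewrites /gp/product/ to /dp/ and drops the query string.
function cleanProductUrl(href) {
  var url = new URL(href);
  url.pathname = url.pathname.replace('/gp/product/', '/dp/');
  url.search = ''; // removes the "?" and everything in the query string
  url.hash = '';   // removes any "#fragment" as well
  return url.toString();
}

console.log(cleanProductUrl("https://www.amazon.com/gp/product/B01NBKTPTS/?tag=abc&linkCode=ilv"));
```

Inside a bookmarklet, the same body could be applied to location.href and the result passed to location.replace, as in the code above.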

regex to match all keywords in a string

Being a noob in regex, I need some support from the community.
Let's say I have this string str:
1. www.anysite.com hello demo try this link
2. anysite.com indeed demo link
3. http://www.anysite.com another one
4. www.anysite.com
5. http://anysite.com
Consider lines 1-5 as the whole string str here.
I want to convert all 'anysite.com' occurrences into clickable HTML links, for which I am using:
str = str.replace(/((http|https|ftp):\/\/[\w?=&.\/-;#~%-]+(?![\w\s?&.\/;#~%"=-]*>))/g, '<a href="$1">$1</a>');
This converts all space-separated words starting with http/https/ftp into links of the form
<a href="url">url</a>
So lines 3 and 5 are converted correctly. Now, to convert all www.anysite.com occurrences into links, I then used
str = str.replace(/(\b^(http|https|ftp)?(www\.)[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig, '<a href="http://$1">$1</a>');
However, it only converts www.anysite.com into a link if it is found at the very beginning of str. So it converts line 1 but not line 4.
Note that I used ^(http|https|ftp)?(www\.) to find all www not starting with http/https/ftp, since the http ones have already been converted.
Also, what would the regex be for the link on line 2, which starts with neither http nor www but ends with .com?
For reference, you can try posting this whole string to your Facebook timeline; it converts all five lines into links.

Thanks for the help. The final regex that worked for me is:
//remove all http:// and https://
str = str.replace(/(http|https):\/\//ig, "");
//convert every string ending with .com or .in into a link
str = str.replace( /((www\.)?[-a-zA-Z0-9@:%._\+~#=]{2,256}\.(com|in))/ig, '<a href="http://$1">$1</a>');
I used .com and .in for my specific requirement; otherwise the solution at http://regexr.com/39i0i will work.
There is still an issue, though: it doesn't convert shortened URLs into links correctly, e.g. http://s.ly/qhdfTyuiOP will only be linked up to s.ly.
Any suggestions?

^(http|https|ftp)?(www\.) does not mean "all www not starting with http/https/ftp" but rather "a string that starts with an optional http/https/ftp followed by www.".
Indeed, ^ in this context isn't a negation but an anchor representing the start of the string. I suppose you used it this way because of its meaning inside a character class ([^...]); it is rather tricky, since its meaning changes depending on the context in which it is found.
You could just remove it and you should be fine, as I see no point in making sure the string does not start with http/https/ftp (you transformed those occurrences just before, so there should be none left).
If you wanted to make some kind of negation, the easiest way would be to use a negative lookbehind:
(?<!http|https|ftp)www\.
This matches "www." only when it's not preceded by http, https or ftp.
Edit: I mentioned lookbehind but forgot that it was not available in JS at the time of writing (it was added to the language later, in ES2018).
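The effect of the ^ anchor can be checked with a small sketch. The two patterns below are simplified stand-ins for the ones in the question (the shortened character class and the sample string are assumptions made for readability).

```javascript
// Simplified stand-ins for the patterns discussed above.
var anchored = /^(www\.)[-A-Z0-9.]+/ig;    // "^" pins the match to the start of the string
var unanchored = /\b(www\.)[-A-Z0-9.]+/ig; // no "^": can match after any word boundary

var sample = "www.anysite.com hello demo\nsee www.anysite.com";

var anchoredMatches = sample.match(anchored);
var unanchoredMatches = sample.match(unanchored);

console.log(anchoredMatches.length);   // 1 (only the occurrence at the very start)
console.log(unanchoredMatches.length); // 2 (both occurrences)
```

Without the m flag, ^ only matches at the start of the whole string, which is why the later occurrence is skipped by the anchored pattern.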

replace page number using Regular Expression in Javascript

I have a URL as follows:
http://example.com/category/news/page/2/
I need to replace the number at the end of the URL, which represents the page number.
If possible (and I think it is), I want to use a regular expression so that the code still works if the domain changes.
I am also using PHP...
Could you help me with a proper regex?

Find The Answer:
string.replace(/\/page\/[0-9]+/, '/page/' + pageNum);
pageNum can be any variable holding the new page number.
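A short sketch of the replacement in use, wrapped in a function (the name setPageNumber is hypothetical). Because the pattern only looks at the /page/N segment, the domain does not matter.

```javascript
// Hypothetical helper: swaps the number after /page/ for a new page number.
function setPageNumber(url, pageNum) {
  return url.replace(/\/page\/[0-9]+/, '/page/' + pageNum);
}

console.log(setPageNumber("http://example.com/category/news/page/2/", 5));
// http://example.com/category/news/page/5/
```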

Regular expression for detecting hyperlinks

I got this regex pattern from the WMD showdown.js file:
/<((https?|ftp|dict):[^'">\s]+)>/gi
and the code is:
text = text.replace(/<((https?|ftp|dict):[^'">\s]+)>/gi,"<a href=\"$1\">$1</a>");
But when I set text to http://www.google.com, it does not turn it into an anchor; it returns the original text value as is (http://www.google.com).
P.S.: I've tested it with RegexPal and it does not match.

Your code is searching for a URL wrapped in <>, like <http://www.google.com>.
Just change it to /((https?|ftp|dict):[^'">\s]+)/gi if you don't want it to require the <>.
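The difference between the two patterns can be seen in a small sketch. The autoLink helper and the anchor markup used as the replacement are assumptions for illustration.

```javascript
var bracketed = /<((https?|ftp|dict):[^'">\s]+)>/gi; // requires <...> around the URL
var bare = /((https?|ftp|dict):[^'">\s]+)/gi;        // matches the URL on its own

// Hypothetical helper: wraps every match of the given pattern in an anchor.
function autoLink(text, pattern) {
  return text.replace(pattern, '<a href="$1">$1</a>');
}

console.log(autoLink("http://www.google.com", bracketed));   // unchanged, no <> present
console.log(autoLink("<http://www.google.com>", bracketed)); // linked
console.log(autoLink("http://www.google.com", bare));        // linked
```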

As long as you know your URLs start with http:// or https:// or similar, you can use:
/((https?|s?ftp|dict|www)(:\/\/)?[A-Za-z0-9.\-]+)/gi
The expression matches until it encounters a character that is not in A-Za-z0-9.\- (letters, digits, dots and hyphens). It will not, however, detect anything of the form google.com, or anything that comes after the domain name, such as parameters or subdirectory paths. If you need those, you can use the same terminating condition as in your own regex instead.
I know it seems pointless, but it may be useful if you want the display name to be abbreviated rather than the whole URL, in the case of complex URLs.

You could use:
var re = /(http|https|ftp|dict)(:\/\/\S+?)(\.?\s|\.?$)/gi;
with:
el.innerHTML = el.innerHTML.replace(re, '<a href=\'$1$2\'>$1$2<\/a>$3');
to also match URLs at the end of sentences.
But you need to be very careful with this technique, make sure the content of the element is more or less plain text and not complex markup. Regular expressions are not meant for, nor are they good at, processing or parsing HTML.
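As a sketch of how the trailing-period handling behaves, the same pattern can be applied to a plain string instead of el.innerHTML, which sidesteps the markup concern mentioned above. The function name linkifyText is hypothetical.

```javascript
var re = /(http|https|ftp|dict)(:\/\/\S+?)(\.?\s|\.?$)/gi;

// Hypothetical helper: links URLs in plain text and keeps a sentence-ending
// period (captured in $3) outside of the generated anchor.
function linkifyText(text) {
  return text.replace(re, "<a href='$1$2'>$1$2</a>$3");
}

console.log(linkifyText("See http://www.google.com. Thanks"));
// See <a href='http://www.google.com'>http://www.google.com</a>. Thanks
```

The lazy \S+? stops as soon as the third group can match, so a period followed by whitespace or the end of the string is treated as punctuation rather than as part of the URL.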

Building a Hashtag in Javascript without matching Anchor Names, BBCode or Escaped Characters

I would like to convert any instances of a hashtag in a String into a linked URL:
#hashtag -> should have "#hashtag" linked.
This is a #hashtag -> should have "#hashtag" linked.
This is a [url=http://www.mysite.com/#name]named anchor[/url] -> should not be linked.
This isn&#39;t a pretty way to use quotes -> should not be linked.
Here is my current code:
String.prototype.parseHashtag = function() {
    return this.replace(/[^&][#]+[A-Za-z0-9-_]+(?!])/, function(t) {
        var tag = t.replace("#","")
        return t.link("http://www.mysite.com/tag/"+tag);
    });
};
Currently, this appears to handle escaped characters (by excluding matches preceded by an ampersand) and named anchors, but it doesn't link the #hashtag if it's the first thing in the message, and it seems to include the 1-2 characters before the "#" in the link.
Halp!

How about the following:
/(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g
matches the hashtags in your example. Since JavaScript doesn't support lookbehind, it tries to either match the start of the string or any character except & before the hashtag. It captures the latter so it can later be replaced. It also captures the name of the hashtag.
So, for example:
subject.replace(/(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g, "$1http://www.mysite.com/tag/$2");
will transform
#hashtag
This is a #hashtag and this one #too.
This is a [url=http://www.mysite.com/#name]named anchor[/url]
This isn&#39;t a pretty way to use quotes
into
http://www.mysite.com/tag/hashtag
This is a http://www.mysite.com/tag/hashtag and this one http://www.mysite.com/tag/too.
This is a [url=http://www.mysite.com/#name]named anchor[/url]
This isn&#39;t a pretty way to use quotes
This probably isn't what t.link() (which I don't know) would have returned, but I hope it's a good starting point.
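To get an actual link rather than a bare URL, one option is to pass a function as the replacement. This is only a sketch: the name linkHashtags and the exact anchor markup are assumptions, while the regex is the one given above.

```javascript
// Hypothetical helper: wraps each hashtag in an anchor pointing at /tag/<name>.
function linkHashtags(text) {
  var pattern = /(^|[^&])#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g;
  return text.replace(pattern, function (match, prefix, tag) {
    // prefix is the start of the string or the single character before "#";
    // it is put back unchanged so it does not end up inside the link.
    return prefix + '<a href="http://www.mysite.com/tag/' + tag + '">#' + tag + '</a>';
  });
}

console.log(linkHashtags("This is a #hashtag and this one #too."));
```

Named anchors inside BBCode and escaped entities such as &#39; are left untouched, because the lookahead rejects a tag followed by "]" and the first group rejects a "#" preceded by "&".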

There is an open-source Ruby gem called twitter-text that does this sort of thing (hashtags and @usernames). You might get some ideas and regexes from that, or try out its JavaScript port.
Using the JavaScript port, you'll want to just do:
var linked = TwitterText.auto_link_hashtags(text, {hashtag_url_base: "http://www.mysite.com/tag/"});

Tim, your solution was almost perfect. Here's what I ended up using:
subject.replace(/(^| )#([A-Za-z0-9_-]+)(?![A-Za-z0-9_\]-])/g, "$1<a href=\"http://www.mysite.com/tag/$2\">#$2</a>");
The only change is the first group, which I changed to match the beginning of the string or a space character. (I tried \s, but that didn't work at all.)
