How to remove an apostrophe from a string in a loop? - javascript

I currently have a loop generating a URL to appear on a page. A few of the URLs it returns have an apostrophe in the string, and I am attempting to remove the apostrophe from the URL. I am familiar with the .replace method, which is what I'm attempting to use. I am declaring a variable at the start of the case and then calling .replace on that variable, newUrl.
Code:
} else {
let newUrl = sectorHTML += "<a href=\"" + linkURLPart1 + linkURLPart2 + jsonData.Column1[r].ReportURL + "/\">
<span style=\"color:#000000;\">"
//linkURLPart 1 = new ; linkUrlPart2 = article;
json.DataColumn1[r].ReportUrl = people's-home
sectorHTML += "<div>";
newUrl = newUrl.replace("'", "")
}
Current output:
/new/article/people's-home/
expected output:
/new/article/peoples-home/
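A minimal sketch of one way to get the expected output, assuming the apostrophe only ever appears in the ReportURL part. Two things stand out in the snippet above: .replace with a string pattern removes only the first occurrence, and the result is stored in newUrl while the page is built from sectorHTML. Cleaning the slug before it is concatenated avoids both. The constant names and values below are hypothetical stand-ins for the question's variables.

```javascript
// Hypothetical values standing in for the variables in the question.
const linkURLPart1 = '/new';
const linkURLPart2 = '/article/';
const reportURL = "people's-home";

// A regex with the g flag removes every apostrophe, not just the first.
const cleanSlug = reportURL.replace(/'/g, '');

// Build the href from the cleaned slug, then append it to the HTML string.
const href = linkURLPart1 + linkURLPart2 + cleanSlug + '/';
let sectorHTML = '';
sectorHTML += '<a href="' + href + '"><span style="color:#000000;">';

console.log(href); // /new/article/peoples-home/
```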

Related

Convert String in the Parenthesis in the URL To Query Parameters: Javascript

I want to convert the string in {} in search URL to query parameters which would help users capture search terms in web analytics tools.
Here's what I am trying to do, Let's say
Search URL is:
example.com/search/newyork-gyms?dev=desktop&id=1220391131
User Input will be:
var search_url_format = '/search/{city}-{service}'
Output URL:
example.com/search?city=newyork&service=gyms&dev=desktop&id=1220391131
The problem is that when I use the regex {(.*)}, it captures the whole string {city}-{service}.
But what I want is [{city},{service}].
The URL format can also be like
search/{city}/{service}/
search/{city}_{service}/
What I have tried is for a single variable.
It returns correct output.
Eg: URL:/search/newyork
User Input: /search/{city}
Output: /search/newyork?city=newyork
URL: /search-germany
User Input: /search-{country}
Output: /search-germany?country=germany
var search_url_format = '/search/{city}' //User Enters any variable in brackets
var urloutput = '/search/newyork' //Demo URL
//Log output
console.log(URL2Query(search_url_format)) //Output: '/search/newyork?city=newyork'
function URL2Query(url) {
var variableReg = new RegExp(/{(.*)}/)
var string_url = variableReg.exec(url)
var variable1 = string_url[0]
//Capture the variable
var reg = new RegExp(url.replace(variable1, '([^\?|\/|&]+)'))
var search_string = reg.exec(urloutput)[1]
if (location.search.length === 0) // if no query parameters
{
return urloutput + "?" + string_url[1] + "=" + search_string
} else {
return urloutput + "&" + string_url[1] + "=" + search_string
}
}
You are missing two things:
parentheses to match groups, and you use .*, which is greedy and also matches the "{" sign.
So you can use match instead of exec, like this:
var search_url_format = '/search/{city}-{service}' //User Enters any
var variableReg = new RegExp(/({\w+})/g)
var string_url = search_url_format.match(variableReg); // ['{city}', '{service}']
You can probably assume your "variable" will be alphanumeric instead of any character. With this assumption "{", "-", "_" etc will be punctuation.
so your grouping regexp could be /({\w+})/g.
//example
const r = /({\w+})/g;
let variable;
const url = '/search/{city}-{service}';
while ((variable = r.exec(url)) !== null) {
let msg = 'Found ' + variable[0] + '. ';
msg += 'Next match starts at ' + r.lastIndex;
console.log(msg);
}
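Building on the placeholder regex above, here is a rough sketch of how the captured names could be paired with values from a real URL. The function name extractParams and the choice of separator characters in the capture group are assumptions made for illustration, not part of the original answer.

```javascript
// Hypothetical helper: pairs each {name} in the format with the matching
// piece of the URL. Returns null when the URL does not fit the format.
function extractParams(format, url) {
  const names = [];
  const pattern = format
    // Escape regex metacharacters in the literal parts of the format.
    .replace(/[.*+?^$()|[\]\\]/g, '\\$&')
    // Swap each {name} for a capture group that stops at common separators.
    .replace(/\{(\w+)\}/g, function (match, name) {
      names.push(name);
      return '([^/?&_-]+)';
    });
  const found = new RegExp(pattern).exec(url);
  if (found === null) return null;
  const params = {};
  names.forEach(function (name, i) {
    params[name] = found[i + 1];
  });
  return params;
}

const params = extractParams(
  '/search/{city}-{service}',
  'example.com/search/newyork-gyms?dev=desktop&id=1220391131'
);
console.log(params); // { city: 'newyork', service: 'gyms' }
```

The resulting object can then be serialized into a query string and joined with the existing parameters. Values that themselves contain "-" or "_" would need a different capture group.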

Adding the single string quote for value inside variable using Javascript

I have a variable. Before I use it, I need to wrap the value inside it in single quotes using JavaScript. Any help is appreciated.
//i am getting like this //
var variable=king;
//i want to convert it with single quote//
variable=variable;
But I need the data inside the variable to be wrapped in single quotes.
You can concatenate the variable with your quotes like :
function addQuotes(value){
var quotedVar = "\'" + value + "\'";
return quotedVar;
}
And use it like :
var king = addQuotes('king');
console.log will display :
'king'
Edit : you can try it in the chrome/firefox console, I did it and it works perfectly when copy/paste from here.
var x = 'abc';
var sData = "\'" + x +"\'";
// sData will print "'abc'"
var x = 'pqr';
var sData = "\'" + x +"\'";
// sData will print "'pqr'"
1) You can use double quotes
var variable = "'" + variable + "'";
2) ... or you can escape the single quote character with a backslash
var variable = '\'' + variable + '\'';

Remove all html tags from string by list, except the first one

I have a string of HTML tags and a list of forbidden tags.
Any tag found in forbiddenTags should be removed from str, except the first one.
Maybe it can be done in a single pass over the string.
I tried the following:
var forbiddenTags = ["div", "city"];
var str = '<?xml version="1.0" encoding="UTF-8"?>' +
'<ADDUMP>' +
' <HEADER>' +
' <div></div>' +
' <div>Help Wanted Line</div>' +
' </HEADER>' +
' <ADINFO>' +
' <CUSTOMER>' +
' <CITY></CITY>' +
' <Div></DIV>' +
' <STATE></STATE>' +
' </CUSTOMER>' +
' </ADINFO>' +
'</ADDUMP>' +
'</xml>';
var arrayLength = forbiddenTags.length;
for (var i = 0; i < arrayLength; i++) {
// remove all forbiddenTags (upper and lower case)
var re = new RegExp("</? *" + forbiddenTags[i] + "[^>]*>","gi");
str = str.replace(re, "");
}
console.log(str);
Unfortunately, there are two problems:
1) It removes also the first tag of the string that is found in forbiddenTags.
2) It doesn't remove the content of the tags.
example:
<div>hi</div>
<div>how</div>
<div></div>
should be:
<div>hi</div>
This is my jsfiddle:
http://jsfiddle.net/Ht6Ym/3469/
Any help appreciated!
To match the content of the tag as well as the tag itself, you need to change your regex to look for both the opening and closing tag at the same time. Currently, it only checks for one or the other, which is why the tag content is being left.
This regex looks for an opening tag (and any associated attributes), the matching closing tag, and any intervening text:
new RegExp("<(" + forbiddenTags[i] + ")[^>]*>(.*?)</\\1>", "gi")
Your other issue (not wanting to remove the first match) can be solved by passing an anonymous function as a parameter to str.replace. In that function, make use of a counter variable to determine when to remove a match.
To do that, you'll need to add a counter variable somewhere. If you want to leave the first match of each type of forbidden tag, put it inside your for loop. If you only want to keep the first forbidden tag found overall, initialize it outside your for loop (it's unclear which you want from your question). Then replace str = str.replace(re, ""); with this:
str = str.replace(re, function(matchedText){
if (++counter>1){
return "";
} else {
return matchedText;
}
});
This function runs against every match. If it's the first match, it simply returns that match (in effect, leaving it alone). Otherwise, it removes it.
Now, all together this makes your loop look like this:
for (var i = 0; i < forbiddenTags.length; i++) {
var counter=0
var re = new RegExp("<(" + forbiddenTags[i] + ")[^>]*>(.*?)</\\1>", "gi");
str = str.replace(re, function(matchedText){
if (++counter>1){
return "";
} else {
return matchedText;
}
});
}
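For reference, the loop can be wrapped into a small self-contained function and tried on the example from the question. This is only a sketch: the name keepFirstForbidden is hypothetical, and it uses [\s\S]*? instead of .*? so that tag content spanning several lines is also matched, which is a change from the regex shown above.

```javascript
// Hypothetical wrapper around the loop: keeps the first element of each
// forbidden tag (with its content) and removes every later one.
function keepFirstForbidden(str, forbiddenTags) {
  forbiddenTags.forEach(function (tag) {
    let counter = 0; // reset per tag, so the first match of each tag survives
    const re = new RegExp('<(' + tag + ')[^>]*>([\\s\\S]*?)</\\1>', 'gi');
    str = str.replace(re, function (matchedText) {
      counter += 1;
      return counter > 1 ? '' : matchedText;
    });
  });
  return str;
}

const cleaned = keepFirstForbidden(
  '<div>hi</div><div>how</div><div></div>',
  ['div', 'city']
);
console.log(cleaned); // <div>hi</div>
```

As with any regex-based approach to markup, nested elements of the same tag and tag names that merely start with a forbidden name (for example a tag called divider) are not handled correctly.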
If using jQuery is an option, you can make things look a bit cleaner (namely, removing that obnoxious regex) using the function found in this answer:
var removeElements = function(text, selector) {
var wrapped = $("<div>" + text + "</div>");
wrapped.find(selector+":not(:first)").remove();
return wrapped.html();
}
for (var i = 0; i < forbiddenTags.length; i++) {
str = removeElements(str, forbiddenTags[i]);
}
Use str.match to get all matches and discard all except for the first one.
It seems like the answer by Rob W on this post is what you are looking for.
All you need to change is the first = true to first = {} and check
if (!first[tag]) {
first[tag] = true;
} else {
return '';
}

How do you add a parameter to a URL and reload the page

I'm looking for the simplest way to add parameters to a URL and then reload the page via javascript/jquery. I'm trying to avoid any plugins. Essentially I want:
http://www.mysite.com/about
to become:
http://www.mysite.com/about?qa=newParam
or, if a parameter already exists, then add a second parameter:
http://www.mysite.com/about?qa=oldParam&qa=newParam
Here is a vanilla solution; it should work nicely for all cases (except invalid input, of course).
function replace_search(name, value) {
var str = location.search;
if (new RegExp("[&?]"+name+"([=&].+)?$").test(str)) {
str = str.replace(new RegExp("(?:[&?])"+name+"[^&]*", "g"), "")
}
str += "&";
str += name + "=" + value;
str = "?" + str.slice(1);
// there is an official order for the query and the hash if you didn't know.
location.assign(location.origin + location.pathname + str + location.hash)
};
EDIT: if you want to add stuff and never remove anything, the function is much smaller. I'm not very fond of having the same field repeated with different values, but there is no specification on that.
function replace_search(name, value) {
var str = "";
if (location.search.length == 0) {
str = "?"
} else {
str = "&"
}
str += name + "=" + value;
location.assign(location.origin + location.pathname + location.search + str + location.hash)
};
Have a look at Window.location (MDN) for information on window.location.
A quick and dirty solution is:
location += (location.search ? "&" : "?") + "qa=newParam"
It should work for your example, but misses some edge cases.
location.href will give you the current URL. You can then edit your query string and refresh the page by doing something like this:
if (location.href.indexOf("?") === -1) {
window.location = location.href + "?qa=newParam";
}
else {
window.location = location.href + "&qa=newParam";
}
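As an alternative to the string concatenation used in the answers above, the standard URL and URLSearchParams APIs can do the bookkeeping for the "?" and "&" separators and keep the hash at the end. A minimal sketch, assuming an environment that provides the URL constructor (current browsers and Node.js); the function name withAddedParam is hypothetical.

```javascript
// Hypothetical helper: returns href with name=value appended to the query.
// append() adds a second entry instead of overwriting an existing one.
function withAddedParam(href, name, value) {
  const url = new URL(href);
  url.searchParams.append(name, value);
  return url.toString();
}

console.log(withAddedParam('http://www.mysite.com/about', 'qa', 'newParam'));
// http://www.mysite.com/about?qa=newParam

console.log(withAddedParam('http://www.mysite.com/about?qa=oldParam#top', 'qa', 'newParam'));
// http://www.mysite.com/about?qa=oldParam&qa=newParam#top

// In a browser, the reload would then be:
// location.assign(withAddedParam(location.href, 'qa', 'newParam'));
```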

How to Extract URL from the Text with javascript [duplicate]

Does anyone have suggestions for detecting URLs in a set of strings?
arrayOfStrings.forEach(function(string){
// detect URLs in strings and do something swell,
// like creating elements with links.
});
Update: I wound up using this regex for link detection… Apparently several years later.
kLINK_DETECTION_REGEX = /(([a-z]+:\/\/)?(([a-z0-9\-]+\.)+([a-z]{2}|aero|arpa|biz|com|coop|edu|gov|info|int|jobs|mil|museum|name|nato|net|org|pro|travel|local|internal))(:[0-9]{1,5})?(\/[a-z0-9_\-\.~]+)*(\/([a-z0-9_\-\.]*)(\?[a-z0-9+_\-\.%=&]*)?)?(#[a-zA-Z0-9!$&'()*+.=-_~:#/?]*)?)(\s+|$)/gi
The full helper (with optional Handlebars support) is at gist #1654670.
First you need a good regex that matches urls. This is hard to do. See here, here and here:
...almost anything is a valid URL. There
are some punctuation rules for
splitting it up. Absent any
punctuation, you still have a valid
URL.
Check the RFC carefully and see if you
can construct an "invalid" URL. The
rules are very flexible.
For example ::::: is a valid URL.
The path is ":::::". A pretty
stupid filename, but a valid filename.
Also, ///// is a valid URL. The
netloc ("hostname") is "". The path
is "///". Again, stupid. Also
valid. This URL normalizes to "///"
which is the equivalent.
Something like "bad://///worse/////"
is perfectly valid. Dumb but valid.
Anyway, this answer is not meant to give you the best regex but rather a proof of how to do the string wrapping inside the text, with JavaScript.
OK, so let's just use this one: /(https?:\/\/[^\s]+)/g
Again, this is a bad regex. It will have many false positives. However it's good enough for this example.
function urlify(text) {
var urlRegex = /(https?:\/\/[^\s]+)/g;
return text.replace(urlRegex, function(url) {
return '<a href="' + url + '">' + url + '</a>';
})
// or alternatively
// return text.replace(urlRegex, '<a href="$1">$1</a>')
}
var text = 'Find me at http://www.example.com and also at http://stackoverflow.com';
var html = urlify(text);
console.log(html)
// html now looks like:
// "Find me at <a href="http://www.example.com">http://www.example.com</a> and also at <a href="http://stackoverflow.com">http://stackoverflow.com</a>"
So in sum try:
$$('#pad dl dd').each(function(element) {
element.innerHTML = urlify(element.innerHTML);
});
Here is what I ended up using as my regex:
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
This doesn't include trailing punctuation in the URL. Crescent's function works like a charm :)
so:
function linkify(text) {
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
return text.replace(urlRegex, function(url) {
return '<a href="' + url + '">' + url + '</a>';
});
}
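To illustrate the trailing-punctuation claim, here is a small check using a pattern of the same shape, with both "@" and "#" allowed in the character classes. The sample sentence is made up for the demonstration. The final character class leaves out ",", "." and similar characters, so the regex engine backtracks past them at the end of a URL.

```javascript
// Same shape as the pattern above; the last class excludes , . ; : ! ?
const urlRegex = /(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;

// Made-up sample text with a comma and a full stop right after the URLs.
const sample = 'Docs are at http://www.example.com/guide?page=2, see also https://example.org.';

const found = sample.match(urlRegex);
console.log(found);
// [ 'http://www.example.com/guide?page=2', 'https://example.org' ]
```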
I googled this problem for quite a while, then it occurred to me that there is an Android method, android.text.util.Linkify, that utilizes some pretty robust regexes to accomplish this. Luckily, Android is open source.
They use a few different patterns for matching different types of urls. You can find them all here:
http://grepcode.com/file/repository.grepcode.com/java/ext/com.google.android/android/2.0_r1/android/text/util/Regex.java#Regex.0WEB_URL_PATTERN
If you're just concerned about url's that match the WEB_URL_PATTERN, that is, urls that conform to the RFC 1738 spec, you can use this:
/((?:(http|https|Http|Https|rtsp|Rtsp):\/\/(?:(?:[a-zA-Z0-9\$\-\_\.\+\!\*\'\(\)\,\;\?\&\=]|(?:\%[a-fA-F0-9]{2})){1,64}(?:\:(?:[a-zA-Z0-9\$\-\_\.\+\!\*\'\(\)\,\;\?\&\=]|(?:\%[a-fA-F0-9]{2})){1,25})?\@)?)?((?:(?:[a-zA-Z0-9][a-zA-Z0-9\-]{0,64}\.)+(?:(?:aero|arpa|asia|a[cdefgilmnoqrstuwxz])|(?:biz|b[abdefghijmnorstvwyz])|(?:cat|com|coop|c[acdfghiklmnoruvxyz])|d[ejkmoz]|(?:edu|e[cegrstu])|f[ijkmor]|(?:gov|g[abdefghilmnpqrstuwy])|h[kmnrtu]|(?:info|int|i[delmnoqrst])|(?:jobs|j[emop])|k[eghimnrwyz]|l[abcikrstuvy]|(?:mil|mobi|museum|m[acdghklmnopqrstuvwxyz])|(?:name|net|n[acefgilopruz])|(?:org|om)|(?:pro|p[aefghklmnrstwy])|qa|r[eouw]|s[abcdeghijklmnortuvyz]|(?:tel|travel|t[cdfghjklmnoprtvwz])|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]))|(?:(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9])\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9]|0)\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9]|0)\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}|[1-9][0-9]|[0-9])))(?:\:\d{1,5})?)(\/(?:(?:[a-zA-Z0-9\;\/\?\:\@\&\=\#\~\-\.\+\!\*\'\(\)\,\_])|(?:\%[a-fA-F0-9]{2}))*)?(?:\b|$)/gi;
Here is the full text of the source:
"((?:(http|https|Http|Https|rtsp|Rtsp):\\/\\/(?:(?:[a-zA-Z0-9\\$\\-\\_\\.\\+\\!\\*\\'\\(\\)"
+ "\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,64}(?:\\:(?:[a-zA-Z0-9\\$\\-\\_"
+ "\\.\\+\\!\\*\\'\\(\\)\\,\\;\\?\\&\\=]|(?:\\%[a-fA-F0-9]{2})){1,25})?\\@)?)?"
+ "((?:(?:[a-zA-Z0-9][a-zA-Z0-9\\-]{0,64}\\.)+" // named host
+ "(?:" // plus top level domain
+ "(?:aero|arpa|asia|a[cdefgilmnoqrstuwxz])"
+ "|(?:biz|b[abdefghijmnorstvwyz])"
+ "|(?:cat|com|coop|c[acdfghiklmnoruvxyz])"
+ "|d[ejkmoz]"
+ "|(?:edu|e[cegrstu])"
+ "|f[ijkmor]"
+ "|(?:gov|g[abdefghilmnpqrstuwy])"
+ "|h[kmnrtu]"
+ "|(?:info|int|i[delmnoqrst])"
+ "|(?:jobs|j[emop])"
+ "|k[eghimnrwyz]"
+ "|l[abcikrstuvy]"
+ "|(?:mil|mobi|museum|m[acdghklmnopqrstuvwxyz])"
+ "|(?:name|net|n[acefgilopruz])"
+ "|(?:org|om)"
+ "|(?:pro|p[aefghklmnrstwy])"
+ "|qa"
+ "|r[eouw]"
+ "|s[abcdeghijklmnortuvyz]"
+ "|(?:tel|travel|t[cdfghjklmnoprtvwz])"
+ "|u[agkmsyz]"
+ "|v[aceginu]"
+ "|w[fs]"
+ "|y[etu]"
+ "|z[amw]))"
+ "|(?:(?:25[0-5]|2[0-4]" // or ip address
+ "[0-9]|[0-1][0-9]{2}|[1-9][0-9]|[1-9])\\.(?:25[0-5]|2[0-4][0-9]"
+ "|[0-1][0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1]"
+ "[0-9]{2}|[1-9][0-9]|[1-9]|0)\\.(?:25[0-5]|2[0-4][0-9]|[0-1][0-9]{2}"
+ "|[1-9][0-9]|[0-9])))"
+ "(?:\\:\\d{1,5})?)" // plus option port number
+ "(\\/(?:(?:[a-zA-Z0-9\\;\\/\\?\\:\\@\\&\\=\\#\\~" // plus option query params
+ "\\-\\.\\+\\!\\*\\'\\(\\)\\,\\_])|(?:\\%[a-fA-F0-9]{2}))*)?"
+ "(?:\\b|$)";
If you want to be really fancy, you can test for email addresses as well. The regex for email addresses is:
/[a-zA-Z0-9\+\.\_\%\-]{1,256}\@[a-zA-Z0-9][a-zA-Z0-9\-]{0,64}(\.[a-zA-Z0-9][a-zA-Z0-9\-]{0,25})+/gi
PS: The top level domains supported by above regex are current as of June 2007. For an up to date list you'll need to check https://data.iana.org/TLD/tlds-alpha-by-domain.txt.
Based on Crescent Fresh answer
if you want to detect links with http:// OR without http:// and by www. you can use the following
function urlify(text) {
var urlRegex = /(((https?:\/\/)|(www\.))[^\s]+)/g;
//var urlRegex = /(https?:\/\/[^\s]+)/g;
return text.replace(urlRegex, function(url,b,c) {
var url2 = (c == 'www.') ? 'http://' +url : url;
return '<a href="' + url2 + '">' + url + '</a>';
})
}
This library on NPM looks like it is pretty comprehensive https://www.npmjs.com/package/linkifyjs
Linkify is a small yet comprehensive JavaScript plugin for finding URLs in plain-text and converting them to HTML links. It works with all valid URLs and email addresses.
Function can be further improved to render images as well:
function renderHTML(text) {
var rawText = strip(text)
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
return rawText.replace(urlRegex, function(url) {
if ( ( url.indexOf(".jpg") > 0 ) || ( url.indexOf(".png") > 0 ) || ( url.indexOf(".gif") > 0 ) ) {
return '<img src="' + url + '">' + '<br/>'
} else {
return '<a href="' + url + '">' + url + '</a>' + '<br/>'
}
})
}
or for a thumbnail image that links to the full-size image:
return '<a href="' + url + '"><img style="width: 100px; border: 0px; -moz-border-radius: 5px; border-radius: 5px;" src="' + url + '">' + '</a>' + '<br/>'
And here is the strip() function that pre-processes the text string for uniformity by removing any existing html.
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
return tmp.innerText.replace(urlRegex, function(url) {
return '\n' + url
})
}
There is existing npm package: url-regex, just install it with yarn add url-regex or npm install url-regex and use as following:
const urlRegex = require('url-regex');
const replaced = 'Find me at http://www.example.com and also at http://stackoverflow.com or at google.com'
.replace(urlRegex({strict: false}), function(url) {
return '<a href="' + url + '">' + url + '</a>';
});
let str = 'https://example.com is a great site'
str.replace(/(https?:\/\/[^\s]+)/g,"<a href='$1' target='_blank' >$1</a>")
Short Code Big Work!...
Result:
<a href='https://example.com' target='_blank' >https://example.com</a> is a great site
If you want to detect links with http:// OR without http:// OR ftp OR other possible cases like removing trailing punctuation at the end, take a look at this code.
https://jsfiddle.net/AndrewKang/xtfjn8g3/
A simple way to use that is to use NPM
npm install --save url-knife
Detect URLs in text and make clickable.
const detectURLInText = ( contentElement ) => {
const elem = document.querySelector(contentElement);
elem.innerHTML = elem.innerHTML.replace(/(https?:\/\/[^\s]+)/g, `<a class='link' href="$1">$1</a>`)
return elem
}
detectURLInText( '#myContent');
<div id="myContent">
Hello world!, detect URLs in text and make clickable.
IP: https://123.0.1.890:8080
Web: https://any-domain.com
</div>
try this:
function isUrl(s) {
if (!isUrl.rx_url) {
// taken from https://gist.github.com/dperini/729294
isUrl.rx_url=/^(?:(?:https?|ftp):\/\/)?(?:\S+(?::\S*)?@)?(?:(?!(?:10|127)(?:\.\d{1,3}){3})(?!(?:169\.254|192\.168)(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)(?:\.(?:[a-z\u00a1-\uffff0-9]-*)*[a-z\u00a1-\uffff0-9]+)*(?:\.(?:[a-z\u00a1-\uffff]{2,}))\.?)(?::\d{2,5})?(?:[/?#]\S*)?$/i;
// valid prefixes
isUrl.prefixes=['http:\/\/', 'https:\/\/', 'ftp:\/\/', 'www.'];
// taken from https://w3techs.com/technologies/overview/top_level_domain/all
isUrl.domains=['com','ru','net','org','de','jp','uk','br','pl','in','it','fr','au','info','nl','ir','cn','es','cz','kr','ua','ca','eu','biz','za','gr','co','ro','se','tw','mx','vn','tr','ch','hu','at','be','dk','tv','me','ar','no','us','sk','xyz','fi','id','cl','by','nz','il','ie','pt','kz','io','my','lt','hk','cc','sg','edu','pk','su','bg','th','top','lv','hr','pe','club','rs','ae','az','si','ph','pro','ng','tk','ee','asia','mobi'];
}
if (!isUrl.rx_url.test(s)) return false;
for (let i=0; i<isUrl.prefixes.length; i++) if (s.startsWith(isUrl.prefixes[i])) return true;
for (let i=0; i<isUrl.domains.length; i++) if (s.endsWith('.'+isUrl.domains[i]) || s.includes('.'+isUrl.domains[i]+'\/') ||s.includes('.'+isUrl.domains[i]+'?')) return true;
return false;
}
function isEmail(s) {
if (!isEmail.rx_email) {
// taken from http://stackoverflow.com/a/16016476/460084
var sQtext = '[^\\x0d\\x22\\x5c\\x80-\\xff]';
var sDtext = '[^\\x0d\\x5b-\\x5d\\x80-\\xff]';
var sAtom = '[^\\x00-\\x20\\x22\\x28\\x29\\x2c\\x2e\\x3a-\\x3c\\x3e\\x40\\x5b-\\x5d\\x7f-\\xff]+';
var sQuotedPair = '\\x5c[\\x00-\\x7f]';
var sDomainLiteral = '\\x5b(' + sDtext + '|' + sQuotedPair + ')*\\x5d';
var sQuotedString = '\\x22(' + sQtext + '|' + sQuotedPair + ')*\\x22';
var sDomain_ref = sAtom;
var sSubDomain = '(' + sDomain_ref + '|' + sDomainLiteral + ')';
var sWord = '(' + sAtom + '|' + sQuotedString + ')';
var sDomain = sSubDomain + '(\\x2e' + sSubDomain + ')*';
var sLocalPart = sWord + '(\\x2e' + sWord + ')*';
var sAddrSpec = sLocalPart + '\\x40' + sDomain; // complete RFC822 email address spec
var sValidEmail = '^' + sAddrSpec + '$'; // as whole string
isEmail.rx_email = new RegExp(sValidEmail);
}
return isEmail.rx_email.test(s);
}
will also recognize urls such as google.com , http://www.google.bla , http://google.bla , www.google.bla but not google.bla
Generic Object Oriented Solution
For people like me that use frameworks like angular that don't allow manipulating DOM directly, I created a function that takes a string and returns an array of url/plainText objects that can be used to create any UI representation that you want.
URL regex
For URL matching I used (slightly adapted) h0mayun regex: /(?:(?:https?:\/\/)|(?:www\.))[^\s]+/g
My function also drops punctuation characters from the end of a URL like . and , that I believe more often will be actual punctuation than a legit URL ending (but it could be! This is not rigorous science as other answers explain well) For that I apply the following regex onto matched URLs /^(.+?)([.,?!'"]*)$/.
Typescript code
export function urlMatcherInText(inputString: string): UrlMatcherResult[] {
if (! inputString) return [];
const results: UrlMatcherResult[] = [];
function addText(text: string) {
if (! text) return;
const result = new UrlMatcherResult();
result.type = 'text';
result.value = text;
results.push(result);
}
function addUrl(url: string) {
if (! url) return;
const result = new UrlMatcherResult();
result.type = 'url';
result.value = url;
results.push(result);
}
const findUrlRegex = /(?:(?:https?:\/\/)|(?:www\.))[^\s]+/g;
const cleanUrlRegex = /^(.+?)([.,?!'"]*)$/;
let match: RegExpExecArray;
let indexOfStartOfString = 0;
do {
match = findUrlRegex.exec(inputString);
if (match) {
const text = inputString.substr(indexOfStartOfString, match.index - indexOfStartOfString);
addText(text);
var dirtyUrl = match[0];
var urlDirtyMatch = cleanUrlRegex.exec(dirtyUrl);
addUrl(urlDirtyMatch[1]);
addText(urlDirtyMatch[2]);
indexOfStartOfString = match.index + dirtyUrl.length;
}
}
while (match);
const remainingText = inputString.substr(indexOfStartOfString, inputString.length - indexOfStartOfString);
addText(remainingText);
return results;
}
export class UrlMatcherResult {
public type: 'url' | 'text'
public value: string
}
Here is a small solution for a React app that doesn't use any library. Note that this method only works if the URL is not attached to any other character, that is, it is separated from the surrounding text by spaces.
This component returns a paragraph with link detection.
import React from "react";
interface Props {
paragraph: string,
}
const REGEX = /^(http:\/\/www\.|https:\/\/www\.|http:\/\/|https:\/\/)?[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(:[0-9]{1,5})?(\/.*)?$/gm;
const Paragraph: React.FC<Props> = ({ paragraph }) => {
const paragraphArray = paragraph.split(' ');
return <div>
{
paragraphArray.map((word: any) => {
return word.match(REGEX) ? (
<>
<a href={word}>{word}</a>{' '}
</>
) : word + ' '
})
}
</div>;
};
export default Paragraph;
tmp.innerText is undefined. You should use tmp.innerHTML
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
var urlRegex =/(\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|])/ig;
return tmp.innerHTML.replace(urlRegex, function(url) {
return '\n' + url
})
}
You can use a regex like this to extract normal url patterns.
(https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|www\.[a-zA-Z0-9][a-zA-Z0-9-]+[a-zA-Z0-9]\.[^\s]{2,}|https?:\/\/(?:www\.|(?!www))[a-zA-Z0-9]+\.[^\s]{2,}|www\.[a-zA-Z0-9]+\.[^\s]{2,})
If you need more sophisticated patterns, use a library like this.
https://www.npmjs.com/package/pattern-dreamer
