I am having trouble with the JavaScript split method. I would like some code to split up a list of emails.
Example: test#test.comfish#fish.comnone#none.com
How do you split that up?
Regardless of programming language, you would need to write something close to artificial intelligence to recognize the emails reliably, since the concatenated string has no delimiter pattern.
But since you are asking how to do it, I assume that you need a really simple solution. In that case, split the text based on .com, .net, .org ...
This is easy to do, but it will probably generate a lot of invalid emails.
UPDATE: Here is a code example for the simple solution (please note that this works only for domains whose TLD has 3 letters, such as .com, .net, .org, .biz...):
var emails = "test#test.comfish#fish.comnone#none.com";
var emailsArray = [];
while (emails !== '') {
  // ensures that the dot is searched after the # symbol (so it can find this email as well: test.test#test.com)
  // adding 4 characters makes up for dot + TLD ('.com'.length === 4)
  var endOfEmail = emails.indexOf('.', emails.indexOf('#')) + 4;
  var tmpEmail = emails.substring(0, endOfEmail);
  emails = emails.substring(endOfEmail);
  emailsArray.push(tmpEmail);
}
alert(emailsArray);
This code has downsides of course:
It won't work for TLDs that are not exactly 3 characters long
It won't work if the domain has a subdomain, like test#test.test.com
But I believe it has the best ratio of valid emails to time spent, because it takes very little time to write.
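As a rough sketch of the "split on .com, .net, .org" idea mentioned above, a regular expression can anchor on a list of known TLDs instead of a fixed length. The helper name splitEmails and the TLD list are assumptions made for illustration, and the "#" stands in for the usual email separator exactly as in the question. It still breaks when a domain label itself begins with one of the TLDs.

```javascript
// Hypothetical helper: pull emails out of a concatenated string by
// anchoring on a caller-supplied list of TLDs.
function splitEmails(text, tlds) {
  // ["com", "net", "org"] becomes "com|net|org"
  var alternatives = tlds.join("|");
  // local part, "#", then the shortest domain that ends in one of the TLDs
  var pattern = new RegExp("[^#]+#[^#]+?\\.(?:" + alternatives + ")", "g");
  // match() returns null when nothing is found, so fall back to []
  return text.match(pattern) || [];
}

var mixed = "test#test.comfish#fish.netnone#none.org";
console.log(splitEmails(mixed, ["com", "net", "org"]));
```

Unlike the fixed 4-character offset, this tolerates subdomains such as test#test.test.com, at the cost of having to maintain the TLD list.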
Assuming you have different domains, like .com, .net etc and can't just split on .com, AND assuming your domain names and recipient names are the same like in each of your three examples, you might be able to do something crazy like this:
var emails = "test#test.comfish#fish.comnone#none.com"
// get the string between # and . to get the domain name
var domain = emails.substring(emails.lastIndexOf("#")+1,emails.lastIndexOf("."));
// split the string on the index before "domain#"
var last_email = split_on(emails, emails.indexOf( domain + "#" ) );
function split_on(value, index) {
return value.substring(0, index) + "," + value.substring(index);
}
// this gives the first emails together and splits "none#none.com"
// I'd loop through repeating this sort of process but moving in the
// index of the length of the email, so that you split the inner emails too
alert(last_email);
>>> test#test.comfish#fish.com,none#none.com
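The looping step described in the comments could be sketched roughly as follows. It works from the end of the string backwards and relies on the same assumption as the answer, namely that each recipient name equals its domain name (name#name.tld). The function name splitMirroredEmails is made up for this example.

```javascript
// Sketch only: assumes every address has the form name#name.tld.
function splitMirroredEmails(text) {
  var result = [];
  var rest = text;
  while (rest !== "") {
    var hash = rest.lastIndexOf("#");
    var dot = rest.lastIndexOf(".");
    var start = -1;
    if (hash >= 0 && dot > hash) {
      // the text between the last "#" and the last "." is the domain name
      var domain = rest.substring(hash + 1, dot);
      // the last email starts where "domain#" last occurs
      start = rest.lastIndexOf(domain + "#");
    }
    if (start < 0) {
      // assumption violated: keep the remainder as one entry and stop
      result.unshift(rest);
      break;
    }
    result.unshift(rest.substring(start));
    rest = rest.substring(0, start);
  }
  return result;
}

console.log(splitMirroredEmails("test#test.comfish#fish.comnone#none.com"));
```

Each pass removes the last email from the string, so the loop ends once the string is empty or the assumption no longer holds.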
Related
I'm trying to get the base url from a string (So no window.location).
It needs to remove the trailing slash
It needs to be regex (No New URL)
It needs to work with query parameters and anchor links
In other words all the following should return https://apple.com or https://www.apple.com for the last one.
https://apple.com?query=true&slash=false
https://apple.com#anchor=true&slash=false
http://www.apple.com/#anchor=true&slash=true&whatever=foo
These are just examples, urls can have different subdomains like https://shop.apple.co.uk/?query=foo should return https://shop.apple.co.uk - It could be any url like: https://foo.bar
The closest I got is with:
const baseUrl = url.replace(/^((\w+:)?\/\/[^\/]+\/?).*$/,'$1').replace(/\/$/, ""); // Base Path & Trailing slash
But this doesn't work with anchor links and queries that start right after the host, without a / before them.
Any idea how I can get it to work on all cases?
You could add # and ? to your negated character class. You don't need .* because that will match until the end of the string.
For your example data, you could match:
^https?:\/\/[^#?\/]+
const strings = [
  "https://apple.com?query=true&slash=false",
  "https://apple.com#anchor=true&slash=false",
  "http://www.apple.com/#anchor=true&slash=true&whatever=foo",
  "https://foo.bar/?q=true"
];
strings.forEach(s => {
  console.log(s.match(/^https?:\/\/[^#?\/]+/)[0]);
});
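Indexing the result of match() with [0] throws when the input does not start with http:// or https://. A small wrapper, sketched here under the assumption that returning null is acceptable for such input, avoids that. The name getBaseUrl is hypothetical; the pattern is the one from the answer.

```javascript
// Hypothetical helper around the pattern ^https?:\/\/[^#?\/]+
function getBaseUrl(url) {
  var match = url.match(/^https?:\/\/[^#?\/]+/);
  // match is null when the string is not an http(s) URL
  return match ? match[0] : null;
}

console.log(getBaseUrl("https://shop.apple.co.uk/?query=foo"));
console.log(getBaseUrl("not a url"));
```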
You could use Web API's built-in URL for this. URL will also provide you with other parsed properties that are easy to get to, like the query string params, the protocol, etc.
Regex is a painful way to do something that the browser makes otherwise very simple.
I know that you asked about using regex, but in the event that you (or someone coming here in the future) really just cares about getting the information out and isn't committed to using regex, maybe this answer will help.
let one = "https://apple.com?query=true&slash=false"
let two = "https://apple.com#anchor=true&slash=false"
let three = "http://www.apple.com/#anchor=true&slash=true&whatever=foo"
let urlOne = new URL(one)
console.log(urlOne.origin)
let urlTwo = new URL(two)
console.log(urlTwo.origin)
let urlThree = new URL(three)
console.log(urlThree.origin)
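One thing to keep in mind with this approach is that the URL constructor throws a TypeError when the string is not an absolute URL. A minimal sketch that guards against this, assuming null is a suitable fallback (the helper name getOrigin is made up here):

```javascript
// Hypothetical helper: returns the origin, or null for unparsable input.
function getOrigin(input) {
  try {
    return new URL(input).origin;
  } catch (e) {
    // new URL() throws when the input is not an absolute URL
    return null;
  }
}

console.log(getOrigin("http://www.apple.com/#anchor=true&slash=true&whatever=foo"));
console.log(getOrigin("not a url"));
```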
const baseUrl = url.replace(/^(.*?:\/\/[^?\/#]*).*$/, '$1');
This will get you everything up to the .com part. You will have to append .com once you pull out the first part of the url.
^http.*?(?=\.com)
Or maybe you could do:
myUrl.replace(/\/?[#?].*$/, "")
To remove everything after the host name.
I know this has been asked a thousand times before (apologies), but searching SO/Google etc I am yet to get a conclusive answer.
Basically, I need a JS function which when passed a string, identifies & extracts all URLs based on a regex, returning an array of all found. e.g:
function findUrls(searchText){
var regex=???
result= searchText.match(regex);
if(result){return result;}else{return false;}
}
The function should be able to detect and return any potential URLs. I am aware of the inherent difficulties/issues with this (closing parentheses etc.), so I have a feeling the process needs to be:
Split the string (searchText) into distinct sections that start and end with either nothing, a space, or a carriage return, resulting in distinct content chunks, i.e. do a split.
For each content chunk that results from the split, see whether it fits the logic for a URL of any construction, namely, does it contain a period immediately followed by text (the one constant rule for qualifying a potential URL).
The regex should see whether the period is immediately followed by other text, of the type allowable for a TLD, directory structure and query string, and preceded by text of the allowable type for a URL.
I am aware false positives may result, however any returned values will then be checked with a call to the URL itself, so this can be ignored. The other functions I have found often don't return the URL's query string too, if present.
From a block of text, the function should thus be able to return any type of URL, even if it means identifying will.i.am as a valid one!
eg. http://www.google.com, google.com, www.google.com, http://google.com,
ftp.google.com, https:// etc...and any derivation thereof with a query string
should be returned...
Many thanks, and apologies again if this exists elsewhere on SO, but my searches haven't returned it.
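The chunk-based process described in this question could be sketched roughly like this. It is deliberately permissive, as requested, so will.i.am counts as a candidate. The function name findUrlCandidates and the set of punctuation stripped from each chunk are assumptions for illustration, not a complete URL grammar.

```javascript
// Sketch of the process from the question: split on whitespace, then keep
// every chunk that contains a period with a word character on both sides.
function findUrlCandidates(searchText) {
  var chunks = searchText.split(/\s+/);
  var candidates = [];
  for (var i = 0; i < chunks.length; i++) {
    // strip wrapping punctuation such as "(", ")" or a trailing ","
    var chunk = chunks[i].replace(/^[("'<]+|[)"'>.,;:!?]+$/g, "");
    if (/\w\.\w/.test(chunk)) {
      candidates.push(chunk);
    }
  }
  return candidates;
}

console.log(findUrlCandidates("see http://www.google.com, google.com and will.i.am (ftp.google.com) ok."));
```

Query strings survive because only leading and trailing punctuation is stripped; each candidate would still need the follow-up request mentioned in the question to weed out false positives.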
I just use URI.js -- makes it easy.
var source = "Hello www.example.com,\n"
+ "http://google.com is a search engine, like http://www.bing.com\n"
+ "http://exämple.org/foo.html?baz=la#bumm is an IDN URL,\n"
+ "http://123.123.123.123/foo.html is IPv4 and "
+ "http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html is IPv6.\n"
+ "links can also be in parens (http://example.org) "
+ "or quotes »http://example.org«.";
var result = URI.withinString(source, function(url) {
return "<a>" + url + "</a>";
});
/* result is:
Hello <a>www.example.com</a>,
<a>http://google.com</a> is a search engine, like <a>http://www.bing.com</a>
<a>http://exämple.org/foo.html?baz=la#bumm</a> is an IDN URL,
<a>http://123.123.123.123/foo.html</a> is IPv4 and <a>http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html</a> is IPv6.
links can also be in parens (<a>http://example.org</a>) or quotes »<a>http://example.org</a>«.
*/
https://github.com/medialize/URI.js
http://medialize.github.io/URI.js/
You could use the regex from URI.js:
// gruber revised expression - http://rodneyrehm.de/t/url-regex.html
var uri_pattern = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/ig;
String#match and or String#replace may help…
The following regular expression extracts URLs (including the query string) from a string and returns them as an array:
var url = "asdasdla hakjsdh aaskjdh https://www.google.com/search?q=add+a+element+to+dom+tree&oq=add+a+element+to+dom+tree&aqs=chrome..69i57.7462j1j1&sourceid=chrome&ie=UTF-8 askndajk nakjsdn aksjdnakjsdnkjsn";
var matches = url.match(/\bhttps?::\/\/\S+/gi) || url.match(/\bhttps?:\/\/\S+/gi);
Output:
["https://www.google.com/search?q=format+to+6+digir&…s=chrome..69i57.5983j1j1&sourceid=chrome&ie=UTF-8"]
Note:
This handles both http:// with a single colon and http::// with a double colon, and likewise for https. Because of the ||, the single-colon pattern is only tried when no double-colon URL is found.
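If both spellings can occur in the same text, one pattern with an optional second colon may be simpler than two match() calls. This is a sketch of that variation, not the code from the answer, and the helper name extractHttpUrls is made up:

```javascript
// Hypothetical helper: "::?" accepts one colon or two before the slashes.
function extractHttpUrls(text) {
  // match() returns null when nothing is found, so fall back to []
  return text.match(/\bhttps?::?\/\/\S+/gi) || [];
}

console.log(extractHttpUrls("see http://a.com/x?y=1 and https:://b.org now"));
```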
Try this:
var expression = /[-a-zA-Z0-9#:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9#:%_\+.~#?&//=]*)?/gi;
You could use this website to test the regexp: http://gskinner.com/RegExr/
In UiPath Studio, the following built-in regex rule is defined:
/(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-a-zA-Z0-9+&##\/%=~_|$?!:,.]*\)|[-a-zA-Z0-9+&##\/%=~_|$?!:,.])*(?:\([-a-zA-Z0-9+&##\/%=~_|$?!:,.]*\)|[a-zA-Z0-9+&##\/%=~_|$])/
I'm trying to make a JavaScript regexp like the one Facebook uses for real names:
Names can’t include:
Symbols
numbers
unusual capitalization
repeating characters or punctuation
source: Facebook help center
Here is my regexp:
/^[a-z \,\.\'\-]+$/i
The problem with this regexp is that it doesn't check for repeated characters or punctuation.
Then I found this:
/(.)\1/
So I'm now checking it like this:
$('input[type=text]').keyup(function () {
  var name = $(this).val();
  var myregex = /^[a-z\,\.\'\-]+$/i;
  var duplicate = /(.)\1/;
  // test() returns a boolean; comparing the string to exec() or to a regex does not work
  if (!myregex.test(name) || duplicate.test(name)) {
    // the name entered is wrong
  } else {
    // the name is ok
  }
});
but the problem I'm facing is with inputs like:
Moore
Callie
Maggie
What can I do to solve this problem?
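To make the issue concrete: /(.)\1/ flags any character that appears twice in a row, which is why Moore, Callie and Maggie are rejected. The sketch below contrasts it with a looser rule that only flags three identical characters in a row or doubled punctuation. That looser rule is an assumption made for illustration, not Facebook's actual rule, and the name looksRepetitive is hypothetical.

```javascript
var anyRepeat = /(.)\1/;          // the check from the question
var longRepeat = /(.)\1\1/;       // assumed rule: three identical characters in a row
var punctRepeat = /([ ,.'\-])\1/; // assumed rule: the same punctuation mark twice

// Hypothetical helper combining the two looser rules.
function looksRepetitive(name) {
  return longRepeat.test(name) || punctRepeat.test(name);
}

console.log(anyRepeat.test("Moore"));     // the original check rejects this name
console.log(looksRepetitive("Moore"));
console.log(looksRepetitive("Maaaggie"));
```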
You should stop trying to solve this problem:
It is very complicated
Names are very personal
For instance, your system will never be able to validate names from China or Japan (or names such as Børre Ørevål or 汉/漢).
So just leave the whole idea, and let people freely enter their names with no restrictions.
Related: Regular expression for validating names and surnames?
I got this from another post:
host = ".mylocal.com";
var reg = new RegExp('^https?://([^.]*' + host + ')');
console.log(reg.test('http://www.mylocal.com/'));
But it only matches one level, such as www.mylocal.com or whatever.mylocal.com, and not two levels down, like dev.www.mylocal.com.
I tried writing something else but couldn't get it to check the sub-subdomain, or even 3 or 4 levels down. :( How should I write it?
so, what I would like to achieve is:
.local.com
will match:
www.local.com
dev.www.local.com
abc.dev.abc.local.com
and
www.local.com
will only match
www.local.com
NOT dev.www.local.com
:) that's more clear
The basic problem is that you want to match all kinds of prefixes followed by host if host begins with a period, but match host alone if it does not. This could be done with lookahead assertions, but since you're constructing the regex anyway, it's much simpler to construct it differently depending on the case. I don't know JavaScript well, so treat the code below as a sketch.
For the first case, we want to match zero or more (non-capturing) groups of non-periods followed by one period, then at least one non-period, all before host. For the second case, we just want to match host.
var reg;
if (host.charAt(0) === '.') {
  // the backslash is doubled because the pattern is built from a string
  reg = new RegExp('^https?://((?:[^.]+\\.)*[^.]+' + host + ')');
} else {
  reg = new RegExp('^https?://(' + host + ')');
}
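A runnable sketch of this two-case construction, checked against the examples from the question, might look as follows. It goes slightly beyond the answer in three places, all of which are assumptions: the dots in host are escaped, a slash is excluded from each label, and a lookahead requires the host to end where the match ends. The names escapeRegExp and buildHostRegex are hypothetical.

```javascript
// Escape regex metacharacters so "." in the host matches a literal dot.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: a leading "." means "any subdomain depth".
function buildHostRegex(host) {
  var escaped = escapeRegExp(host);
  // the host must be followed by "/", ":", "?", "#" or the end of the string
  var end = "(?=[/:?#]|$)";
  if (host.charAt(0) === ".") {
    // zero or more "label." groups, then one label, then the host suffix
    return new RegExp("^https?://((?:[^./]+\\.)*[^./]+" + escaped + ")" + end);
  }
  return new RegExp("^https?://(" + escaped + ")" + end);
}

console.log(buildHostRegex(".local.com").test("http://abc.dev.abc.local.com/"));
console.log(buildHostRegex("www.local.com").test("http://dev.www.local.com/"));
```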
I'm not sure what you're trying to do, but this regex will match any number of levels:
r = new RegExp("^https?://([^.]+[.])*([^.])+$");
I am trying to use an HTML form and JavaScript (I mention this because some advanced regex features are not available in JavaScript) to accomplish the following:
Feed the form some text, and use a regex to look into it and "capture" certain parts of it to be used as variables...
i.e. the text is:
"abcde email: asdf#gfds.com email: fake#mail.net sdfsdaf..."
... now, my problem is that I cannot think of an elegant way of capturing both emails as the variables e1 and e2, for example.
The regex I have so far is something like this: /email: (\b\w+\b)/g but for some reason, this is not giving back the 2 matches... it only gives back asdf#gfds.com
Suggestions?
You can use RegExp.exec() to repeatedly apply a regex to a string, returning a new match each time:
var entry = "[...]"; // Whatever your data entry is
var regex = /email: (\b\w+\b)/g;
var emails = [];
var match;
while ((match = regex.exec(entry))) {
  emails[emails.length] = match[1];
}
I stored all the e-mails in an array (so as to make this work for arbitrary input). It looks like your regex might be a little off, too; you'll have to change it if you want to capture the full e-mail.
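One possible adjustment, sketched here on the assumption that an e-mail is simply the run of non-whitespace characters after "email: ", is to capture \S+ instead of \w+. The function name extractEmails is made up, and the sample text keeps the "#" placeholder from the question.

```javascript
// Hypothetical helper: collect every capture of "email: <non-whitespace>".
function extractEmails(entry) {
  var regex = /email: (\S+)/g;
  var emails = [];
  var match;
  // with the g flag, exec() resumes after the previous match on each call
  while ((match = regex.exec(entry)) !== null) {
    emails.push(match[1]);
  }
  return emails;
}

console.log(extractEmails("abcde email: asdf#gfds.com email: fake#mail.net sdfsdaf..."));
```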