Struggle with a Regex to change subdomain, and subsubdomain - javascript

I got this from another post:
host = ".mylocal.com";
var reg = new RegExp('^https?://([^.]*' + host + ')');
console.log(reg.test('http://www.mylocal.com/'));
but it only matches www.mylocal.com or whatever.mylocal.com, not hosts two levels down, like dev.www.mylocal.com.
I tried writing something else but couldn't get it to check the sub-subdomain, or even 3 or 4 levels down. :( How should I write it?
So, what I would like to achieve is:
.local.com
will match:
www.local.com
dev.www.local.com
abc.dev.abc.local.com
and
www.local.com
will only match
www.local.com
NOT dev.www.local.com
:) that's more clear

The basic problem is that you want to match any number of prefixes followed by host if host begins with a period, but host alone if it does not. This could be done with lookahead assertions, but since you're constructing the regex anyway, it's much simpler to construct it differently depending on the case. I don't know JavaScript well, so treat the code below as a sketch.
For the first case, we want to match zero or more (non-capturing) groups of non-periods followed by one period, then at least one non-period, all before host. For the second case, we just want to match host.
var reg;
if (host.charAt(0) === '.') {
  // '\\.' is needed because the pattern is built from a string literal
  reg = new RegExp('^https?://((?:[^.]+\\.)*[^.]+' + host + ')');
} else {
  reg = new RegExp('^https?://(' + host + ')');
}
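One thing the approach above leaves open is that the periods inside host are themselves regex wildcards, and nothing stops the match from ending in the middle of a longer hostname. A minimal sketch that addresses both, assuming two hypothetical helpers (escapeRegExp and buildHostRegex are illustrative names, not from the original posts):

```javascript
// Hypothetical helper: escape regex metacharacters in a literal string.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a matcher for a host such as ".local.com"
// (any subdomain depth) or "www.local.com" (that exact host only).
function buildHostRegex(host) {
  var escaped = escapeRegExp(host);
  var prefix = host.charAt(0) === '.' ? '(?:[^./]+\\.)*[^./]+' : '';
  // The lookahead requires the host to end at a path, port, query,
  // fragment, or the end of the string.
  return new RegExp('^https?://(' + prefix + escaped + ')(?=[/:?#]|$)');
}

var wildcard = buildHostRegex('.local.com');
var exact = buildHostRegex('www.local.com');

console.log(wildcard.test('http://www.local.com/'));         // true
console.log(wildcard.test('http://abc.dev.abc.local.com/')); // true
console.log(exact.test('http://www.local.com/'));            // true
console.log(exact.test('http://dev.www.local.com/'));        // false
```

Like the original, the wildcard form still requires at least one label in front of host, so a bare local.com does not match .local.com.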

Not sure what you're trying to do, but this regex will match any number of levels:
r = new RegExp("^https?://([^.]+[.])*([^.])+$");

how to split list of emails with javascript split

I am having trouble with javascript split method. I would like some code to 'split' up a list of emails.
example: test#test.comfish#fish.comnone#none.com
how do you split that up?
Regardless of programming language, you would need to write something close to artificial intelligence to recognize the emails reliably, since the joined string has no delimiter pattern.
But since you are asking how to do it, I assume that you need a really simple solution. In that case, split the text on .com, .net, .org and so on.
This is easy to do, but it will probably generate a lot of invalid emails.
UPDATE: Here is a code example for the simple solution (please note that this works only for domains that end with a 3-letter TLD such as .com, .net, .org, .biz):
var emails = "test#test.comfish#fish.comnone#none.com";
var emailsArray = [];
while (emails !== '') {
  // ensures that dot is searched after # symbol (so it can find this email as well: test.test#test.com)
  // adding 4 characters makes up for dot + TLD ('.com'.length === 4)
  var endOfEmail = emails.indexOf('.', emails.indexOf('#')) + 4;
  var tmpEmail = emails.substring(0, endOfEmail);
  emails = emails.substring(endOfEmail);
  emailsArray.push(tmpEmail);
}
alert(emailsArray);
This code has downsides, of course:
It won't work for TLDs that are not exactly 3 characters long
It won't work if the domain has a subdomain, like test#test.test.com
But I believe it has the best time_to_do_it/percent_of_valid_emails ratio, because it takes very little time to write.
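If the set of possible TLDs is known up front, a regex can lift both of the downsides listed above. The following is only a sketch under that assumption: KNOWN_TLDS and splitEmails are hypothetical names, and the # separator is kept exactly as it appears in the question.

```javascript
// Assumption: every address ends in one of these TLDs; extend the list as needed.
var KNOWN_TLDS = ['com', 'net', 'org', 'biz', 'info'];

// Hypothetical helper: returns an array of addresses, or [] if nothing matches.
function splitEmails(joined) {
  // Lazily take the local part, the separator, then the shortest domain
  // that ends in a dot followed by a known TLD.
  var pattern = new RegExp('[^#]+?#[^#]+?\\.(?:' + KNOWN_TLDS.join('|') + ')', 'g');
  return joined.match(pattern) || [];
}

console.log(splitEmails('test#test.comfish#fish.comnone#none.com'));
// [ 'test#test.com', 'fish#fish.com', 'none#none.com' ]
console.log(splitEmails('a#b.test.comc#d.info'));
// [ 'a#b.test.com', 'c#d.info' ]
```

It is still a heuristic: a domain label that begins with a listed TLD (for example b.common.org) is cut short at the first candidate.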
Assuming you have different domains, like .com, .net etc and can't just split on .com, AND assuming your domain names and recipient names are the same like in each of your three examples, you might be able to do something crazy like this:
var emails = "test#test.comfish#fish.comnone#none.com";
// get the string between the last # and the last . to get the domain name
var domain = emails.substring(emails.lastIndexOf("#") + 1, emails.lastIndexOf("."));
// split the string at the index where "domain#" starts
var last_email = split_on(emails, emails.indexOf(domain + "#"));
function split_on(value, index) {
  return value.substring(0, index) + "," + value.substring(index);
}
// this leaves the first emails together and splits off "none#none.com"
// I'd loop through repeating this sort of process but moving in the
// index of the length of the email, so that you split the inner emails too
alert(last_email);
// shows: test#test.comfish#fish.com,none#none.com

How to match that using a JavaScript regex?

This is my code:
var name = 'somename';
var pass = '123somen456';
var regex = new RegExp('.*' + pass + '.*', 'i');
alert(name.match(regex));
The regex just won't match, and I don't understand why. What's wrong here? I want a match as soon as any part of name is contained in pass, as long as that part is at least 4 chars long. Example:
som --> no match
some --> match
Thanks!
This regex requires any number of characters, then 123somen456, then any number of characters. name.match(regex) returns null because name does not contain the string 123somen456.
To test regular expressions, I recommend using http://regexpal.com/.
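What the question describes is a shared-substring check rather than a pattern match, so a plain loop may be simpler than a regex. A minimal sketch, assuming a hypothetical helper named sharesSubstring and a case-insensitive comparison (the original code used the 'i' flag):

```javascript
// Hypothetical helper: true if some substring of `name` that is at least
// `minLength` characters long also occurs in `pass` (case-insensitive).
function sharesSubstring(name, pass, minLength) {
  var needle = name.toLowerCase();
  var haystack = pass.toLowerCase();
  // Any longer shared substring contains a shared one of exactly minLength,
  // so checking windows of that length is sufficient.
  for (var i = 0; i + minLength <= needle.length; i++) {
    if (haystack.indexOf(needle.substring(i, i + minLength)) !== -1) {
      return true;
    }
  }
  return false;
}

console.log(sharesSubstring('somename', '123somen456', 4)); // true  ("some")
console.log(sharesSubstring('som', '123somen456', 4));      // false (too short)
```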
It sounds like you may need to apply an algorithm like this or this.
If it is possible using regex in javascript, I'm interested to know.
sorry buddy, I have no option to comment.

Regex to find urls with hashes and exclamation marks #! [duplicate]

I know this has been asked a thousand times before (apologies), but searching SO/Google etc I am yet to get a conclusive answer.
Basically, I need a JS function which when passed a string, identifies & extracts all URLs based on a regex, returning an array of all found. e.g:
function findUrls(searchText){
var regex=???
result= searchText.match(regex);
if(result){return result;}else{return false;}
}
The function should be able to detect and return any potential URLs. I am aware of the inherent difficulties/issues with this (closing parentheses etc.), so I have a feeling the process needs to be:
Split the string (searchText) into distinct sections, each starting and ending with either nothing, a space, or a carriage return, resulting in distinct content chunks, i.e. do a split.
For each content chunk that results from the split, see whether it fits the logic for a URL of any construction, namely, whether it contains a period immediately followed by text (the one constant rule for qualifying a potential URL).
The regex should check whether the period is immediately followed by other text of the type allowable for a TLD, directory structure and query string, and preceded by text of the allowable type for a URL.
I am aware false positives may result; however, any returned values will then be checked with a call to the URL itself, so this can be ignored. The other functions I have found often don't return the URL's query string, if present.
From a block of text, the function should thus be able to return any type of URL, even if it means identifying will.i.am as a valid one!
eg. http://www.google.com, google.com, www.google.com, http://google.com,
ftp.google.com, https:// etc...and any derivation thereof with a query string
should be returned...
Many thanks, and apologies again if this exists elsewhere on SO, but my searches haven't returned it.
I just use URI.js -- makes it easy.
var source = "Hello www.example.com,\n"
+ "http://google.com is a search engine, like http://www.bing.com\n"
+ "http://exämple.org/foo.html?baz=la#bumm is an IDN URL,\n"
+ "http://123.123.123.123/foo.html is IPv4 and "
+ "http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html is IPv6.\n"
+ "links can also be in parens (http://example.org) "
+ "or quotes »http://example.org«.";
var result = URI.withinString(source, function(url) {
return "<a>" + url + "</a>";
});
/* result is:
Hello <a>www.example.com</a>,
<a>http://google.com</a> is a search engine, like <a>http://www.bing.com</a>
<a>http://exämple.org/foo.html?baz=la#bumm</a> is an IDN URL,
<a>http://123.123.123.123/foo.html</a> is IPv4 and <a>http://fe80:0000:0000:0000:0204:61ff:fe9d:f156/foobar.html</a> is IPv6.
links can also be in parens (<a>http://example.org</a>) or quotes »<a>http://example.org</a>«.
*/
https://github.com/medialize/URI.js
http://medialize.github.io/URI.js/
You could use the regex from URI.js:
// gruber revised expression - http://rodneyrehm.de/t/url-regex.html
var uri_pattern = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/ig;
String#match and/or String#replace may help…
The following regular expression extracts URLs (including the query string) from a string and returns them as an array:
var url = "asdasdla hakjsdh aaskjdh https://www.google.com/search?q=add+a+element+to+dom+tree&oq=add+a+element+to+dom+tree&aqs=chrome..69i57.7462j1j1&sourceid=chrome&ie=UTF-8 askndajk nakjsdn aksjdnakjsdnkjsn";
var matches = url.match(/\bhttps?::?\/\/\S+/gi);
Output:
["https://www.google.com/search?q=format+to+6+digir&…s=chrome..69i57.5983j1j1&sourceid=chrome&ie=UTF-8"]
Note:
This handles both http:// with a single colon and the malformed http::// with a double colon, and likewise for https. So it's safe for you to use. :)
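For the looser requirement in the question (scheme-less hosts such as google.com or will.i.am should also be returned, false positives accepted), the findUrls skeleton could be filled in roughly as follows. This is a sketch with a deliberately permissive pattern of my own, not the one from URI.js, and it returns an empty array rather than false when nothing is found:

```javascript
// Sketch of findUrls: returns every URL-like token found in searchText.
function findUrls(searchText) {
  // Optional scheme or "www.", a dotted host, then an optional
  // path / query string / fragment running up to the next whitespace.
  var pattern = /(?:(?:https?|ftp):\/\/|www\.)?[a-z0-9-]+(?:\.[a-z0-9-]+)+(?:[\/?#][^\s<>"]*)?/gi;
  var found = searchText.match(pattern) || [];
  // Trailing punctuation usually belongs to the sentence, not the URL.
  return found.map(function (url) {
    return url.replace(/[.,;:!?)]+$/, '');
  });
}

console.log(findUrls('See http://www.google.com/search?q=a+b, google.com and will.i.am.'));
// [ 'http://www.google.com/search?q=a+b', 'google.com', 'will.i.am' ]
console.log(findUrls('Old style: twitter.com/#!/user'));
// [ 'twitter.com/#!/user' ]
```

Because it accepts any dotted token, abbreviations such as "e.g." are also returned; the follow-up request to each candidate that the question mentions is what filters those out.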
Try this:
var expression = /[-a-zA-Z0-9@:%_\+.~#?&//=]{2,256}\.[a-z]{2,4}\b(\/[-a-zA-Z0-9@:%_\+.~#?&//=]*)?/gi;
You could use this website to test regexps: http://gskinner.com/RegExr/
In UiPath Studio the following built-in regex rule has been defined:
/(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)(?:\([-a-zA-Z0-9+&@#\/%=~_|$?!:,.]*\)|[-a-zA-Z0-9+&@#\/%=~_|$?!:,.])*(?:\([-a-zA-Z0-9+&@#\/%=~_|$?!:,.]*\)|[a-zA-Z0-9+&@#\/%=~_|$])/

Javascript lookbehind stopped at the first match

I have to parse (in JS) an e-mail address that originally, instead of a # has a dot. Obviously I want to show the # instead of the dot.
var mail = "name.domain.xx"
We have two cases:
name contains some dots itself and they are backslashed:
name\.surname.domain.xx
name contains only regular characters (non dots)
In this topic I found a way to implement the negative lookbehind and this is what I did:
mail = mail.replace(/(\\)?\./, function ($0, $1) { return $1?$0:"#"; });
but it's not working, because in case (1) it finds the \., leaves it untouched, and of course stops there.
On the other hand, if I use the g option, it also substitutes the third dot, giving name.surname#domain#xx
Now, is there a way to say:
I want to look through the whole string but stop at the first match that actually gets replaced?
I hope I explained myself.
Cheers
I misunderstood your question when I first answered, so I have changed my answer.
If you leave out the /g flag, only the first match is replaced, which means you can look for the first dot without a \ in front of it. After that, you can replace every \. in the user part of the email with a regular dot.
http://jsfiddle.net/GLVNY/2/
var emailSingle = 'myname.domain.com',
emailDouble = 'my\\.name\\.another\\.name.domain.com',
regAt = /([^\\])\./,
repAt = '$1#',
regPunct = /\\./g,
repPunct = '.';
emailSingle = emailSingle.replace(regAt, repAt).replace(regPunct, repPunct);
emailDouble = emailDouble.replace(regAt, repAt).replace(regPunct, repPunct);
alert('emailSingle: ' + emailSingle + '\nemailDouble: ' + emailDouble);
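On engines that support lookbehind assertions (added to JavaScript in ES2018, so not available when this was asked), the same two steps can be written without the extra capture group. A sketch, with restoreAddress as a hypothetical helper name and # kept as the separator used in the question:

```javascript
// Hypothetical helper; assumes a JavaScript engine with lookbehind support.
function restoreAddress(mail) {
  return mail
    .replace(/(?<!\\)\./, '#') // first dot NOT preceded by a backslash
    .replace(/\\\./g, '.');    // then unescape the remaining "\." pairs
}

console.log(restoreAddress('name.domain.xx'));           // name#domain.xx
console.log(restoreAddress('name\\.surname.domain.xx')); // name.surname#domain.xx
```

Unlike /([^\\])\./, the lookbehind does not consume the character before the dot, so nothing has to be put back with $1.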

Matching invisible characters in JavaScript RegEx

I've got some strings that contain invisible characters, but they are in somewhat predictable places. Typically they surround the piece of text I want to extract, and then after the 2nd occurrence I want to keep the rest of the text.
I can't seem to figure out how to both key off of the invisible characters, and exclude them from my result. To match invisibles I've been using this regex: /\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F/ which does seem to work.
Here's an example: [invisibles]Keep as match 1[invisibles]Keep as match 2
Here's what I've been using so far without success:
/([\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+)(.+)([\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+)/(.+)
I've got the capture groups in there, but it's been a while since I've had to use regexes in this way, so I know I'm missing something important. I was hoping to just make the invisible matches non-capturing groups, but it seems that JavaScript does not support this.
Something like this seems like what you want. The second regex you have pretty much works, but the / is in totally the wrong place. Perhaps you weren't properly reading out the group data.
var s = "\x0EKeep as match 1\x0EKeep as match 2";
var r = /[\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+(.+)[\xA0\x00-\x09\x0B\x0C\x0E-\x1F\x7F]+(.+)/;
var match = s.match(r);
var part1 = match[1];
var part2 = match[2];
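For what it's worth, JavaScript does accept non-capturing groups written as (?:...), so the structure from the question can be kept while only the visible text is captured. A sketch under that reading, where INVISIBLE and extractParts are hypothetical names and the character class is the one from the question:

```javascript
// The character class from the question, kept as a string so it can be reused.
var INVISIBLE = '\\xA0\\x00-\\x09\\x0B\\x0C\\x0E-\\x1F\\x7F';

// Hypothetical helper: returns [part1, rest], or null if the layout differs.
function extractParts(text) {
  var pattern = new RegExp(
    '(?:[' + INVISIBLE + ']+)' + // leading invisibles, not captured
    '([^' + INVISIBLE + ']+)' +  // group 1: text up to the next invisible
    '(?:[' + INVISIBLE + ']+)' + // second run of invisibles, not captured
    '([\\s\\S]*)'                // group 2: everything that follows
  );
  var match = text.match(pattern);
  return match ? [match[1], match[2]] : null;
}

console.log(extractParts('\x0EKeep as match 1\x0EKeep as match 2'));
// [ 'Keep as match 1', 'Keep as match 2' ]
```

Using a negated class for the first group keeps it from running past the second block of invisibles, which the greedy (.+) only avoids through backtracking.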
