Regex to match the first URL once it is followed by a space (JavaScript)

I want to match the first URL followed by a space, using a regular expression, while the user types in an input box.
For example:
if I type www.google.com, it should be matched only once the URL is followed by a space,
i.e. www.google.com<SPACE>
Code
$(".site").keyup(function () {
    var site = $(this).val();
    var exp = /^http(s?):\/\/(\w+:{0,1}\w*)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/;
    var find = site.match(exp);
    var url = find ? find[0] : null;
    if (url === null) {
        exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?/g;
        find = site.match(exp);
        url = find ? 'http://' + find[0] : null;
    }
});
Please help, Thanks in advance

You should use a more thorough regex that correctly matches the query and fragment parts of the URL. See "What is the best regular expression to check if a string is a valid URL?" for a regex built on the IRI/URI structure.
But here's a rudimentary version:
var regex = /[-\w]+(\.[a-z]{2,})+(\/?)([^\s]+)/g;
var text = 'test google.com/?q=foo basdasd www.url.com/test?q=asdasd#cheese something else';
console.log(text.match(regex));
Expected Result:
["google.com/?q=foo", "www.url.com/test?q=asdasd#cheese"]
If you really want to validate URLs, make sure the pattern also handles the scheme, port, username and password.
For what you're trying to achieve, you should add a short delay so that the check doesn't hurt browser performance. Regex tests can be expensive with complex patterns, especially when the same pattern runs every time a character is entered. Consider what you're trying to achieve and whether there is a better way to get there.
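A delay like this is commonly implemented as a debounce: the check runs only once the user has stopped typing for a short while. The following is a minimal sketch, not part of the original answer; the helper name debounce, the 50 ms wait and the checkedValues array are assumptions made for illustration.

```javascript
// Hypothetical helper: returns a wrapper that postpones fn until
// waitMs milliseconds have passed without another call.
function debounce(fn, waitMs) {
    var timer = null;
    return function () {
        var context = this;
        var args = arguments;
        clearTimeout(timer); // cancel the run scheduled by the previous call
        timer = setTimeout(function () {
            fn.apply(context, args);
        }, waitMs);
    };
}

// Stand-in for the keyup handler: records the values that were actually checked.
var checkedValues = [];
var checkLater = debounce(function (value) {
    checkedValues.push(value);
}, 50);

// Three rapid "keystrokes": only the last value is checked, after the wait.
checkLater('www.goo');
checkLater('www.google.com');
checkLater('www.google.com ');
```

With jQuery, the wrapped function could be passed straight to keyup in place of the original handler, with a longer wait such as a few hundred milliseconds.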

With a lookahead:
var exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?(?= )/g;
I only added "(?= )" to your regex. The lookahead requires a space right after the match without including that space in the result.
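The lookahead idea can be wrapped in a small function that is easy to try outside the browser. This is a sketch under a few assumptions: the function name firstUrlBeforeSpace is made up, the pattern is a simplified variant of the one in the question, and \s is used instead of a literal space so that tabs and newlines also count.

```javascript
// Returns the first URL-like token that is already followed by whitespace,
// or null while the user is still typing it.
function firstUrlBeforeSpace(text) {
    // Simplified pattern (assumption): optional scheme, host with a dot,
    // any trailing non-space characters, then a whitespace lookahead.
    var exp = /(?:https?:\/\/)?[-\w]+(?:\.[a-z]{2,})+\S*(?=\s)/i;
    var found = text.match(exp);
    if (found === null) {
        return null;
    }
    // Prefix a scheme when the user left it out, as the question's code does.
    return /^https?:\/\//i.test(found[0]) ? found[0] : 'http://' + found[0];
}

firstUrlBeforeSpace('www.google.com');   // returns null (no space typed yet)
firstUrlBeforeSpace('www.google.com ');  // returns 'http://www.google.com'
```

Because the lookahead does not consume the whitespace, the returned value never contains the trailing space.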

Related

regex email pattern in a negate way

I am working with a regex pattern.
I have a variable:
var url = "last_name=Ahuja&something@test.com";
The url contains an email address. I have a regex pattern to check whether the variable contains an email address:
var filter = /^([a-zA-Z0-9_\.\-])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+$/;
My requirement is:
I want the exact negation (opposite) of the above pattern, so that my condition is false if the url contains an email pattern.
The regex pattern itself should work that way.
Can somebody please help me with this?
Instead of negating your regex, you can test whether it fails to match.
var filter = /^([a-zA-Z0-9_\.\-])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+$/;
var url = "last_name=Ahuja&something@test.com";
if (url.search(filter) === -1) {
    alert("Didn't find it");
} else {
    alert("Found it");
}
Using a negative lookahead, you can check that the string does not have an email at the beginning, which mimics the behavior you want since your regex had start and end anchors.
^(?!([a-zA-Z0-9_\.\-])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+).*$
But if you wanted to make sure there was no valid email anywhere in the line, you could modify this a little.
^((?!([a-zA-Z0-9_\.\-])+\@(([a-zA-Z0-9\-])+\.)+([a-zA-Z0-9]{2,4})+).)*$
But you should just do this in code, as Jonathan suggests.
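Doing the negation in code can look like the sketch below. Note that the ^ and $ anchors in the question's pattern make it match only when the whole string is an address, so a "contains" check needs an unanchored pattern. The names emailAnywhere and hasNoEmail are made up for this example, and the pattern assumes the usual @ separator between local part and domain.

```javascript
// Unanchored variant of the question's pattern (assumption: "@" separates
// the local part from the domain), so it can match inside a longer string.
var emailAnywhere = /[a-zA-Z0-9_.\-]+@(?:[a-zA-Z0-9\-]+\.)+[a-zA-Z0-9]{2,4}/;

// True when the text contains nothing that looks like an email address.
function hasNoEmail(text) {
    return !emailAnywhere.test(text);
}

hasNoEmail('last_name=Ahuja&something@test.com'); // returns false
hasNoEmail('last_name=Ahuja&first_name=A');       // returns true
```

The regex stays simple and readable, and the "opposite" condition is just the ! operator.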

Why is this javascript url validator failing?

I have the following code to validate if a person entered a "valid" url in a textbox:
function validateURL(textval) {
var urlregex = new RegExp(
"^(http:\/\/www.|https:\/\/www.|ftp:\/\/www.|www.){1}([0-9A-Za-z]+\.)");
return urlregex.test(textval);
}
A user is getting an error: this returns false for what seems like a valid URL:
http://a.website.com/issues/i#browse/TEST-111
Can someone confirm why this example wouldn't pass the "valid url" test?
The main problem with the regex is that the www. part is mandatory in the pattern.
If you want to make it optional, put it in a group followed by the ? quantifier ((?:www\.)?):
^(?:(?:(?:ftp|https?):\/\/)?)(?:www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*
This will match the http://a.website.com part. To match the whole string, you can use:
^(?:(?:(?:ftp|https?):\/\/)?)(www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*(?:\/[^\/]*)*$
Demo:
var re = /^(?:(?:(?:ftp|https?):\/\/)?)(www\.)?[0-9A-Za-z]+(?:\.[0-9A-Za-z]+)*(?:\/[^\/]*)*$/;
var str = 'http://a.website.com/issues/i#browse/TEST-111';
var m = re.exec(str);
if (m !== null) {
    document.getElementById("res").innerHTML = m[0];
}
<div id="res"></div>
Your regex requires that the host name portion starts with www. (this is not a requirement for URLs in general). The URL you are testing does not include www..
There are many other reasons why the regex is broken (you don't test past the first character after www., your attempt to do so bans many characters that are allowed in URLs, etc.), but that is why the URL you have isn't passing.
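Given how easy it is to get a hand-written pattern wrong, one alternative (a different technique from the regex answers above, not something they propose) is to let the built-in URL constructor do the parsing. A minimal sketch, assuming an environment that provides the standard URL class; the function name isWebUrl and the list of accepted schemes are choices made for this example.

```javascript
// Returns true when text parses as an absolute URL with an accepted scheme.
function isWebUrl(text) {
    var parsed;
    try {
        parsed = new URL(text); // throws a TypeError when text is not an absolute URL
    } catch (e) {
        return false;
    }
    return ['http:', 'https:', 'ftp:'].indexOf(parsed.protocol) !== -1;
}

isWebUrl('http://a.website.com/issues/i#browse/TEST-111'); // returns true
isWebUrl('www.example.com');                               // returns false (no scheme)
```

Unlike the original validator, this rejects input that starts with www. but has no scheme, so such input would need a scheme prepended first.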

Regex to detect a string that contains a URL or file extension

I'm trying to create a small script that detects whether the string input is either:
1) a URL (which will hold a filename): 'http://ajax.googleapis.com/html5shiv.js'
2) just a filename: 'html5shiv.js'
So far I've found this but I think it just checks the URL and file extension. Is there an easy way to make it so it uses an 'or' check? I'm not very experienced with RegExp.
var myRegExp = /[^\\]*\.(\w+)$/i;
Thank you in advance.
How about this regex?
(\.js)$
It checks whether the line ends with .js.
$ denotes the end of the line.
Basically, to use 'OR' in regex, simply use the 'pipe' delimiter.
(aaa|bbb)
will match
aaa
or
bbb
For regex to match a url, I'd suggest the following:
\w+://[\w\._~:/?#\[\]@!$&'()*+,;=%]*
This is based on the allowed character set for a URL.
For the file, what's your definition of a filename?
If you want to match strings made of one or more characters other than a full stop, followed by a full stop, followed by one or more characters other than a full stop, I'd suggest the following regex:
[^\.]+\.[^\.]+
And altogether:
(\w+://[\w\._~:/?#\[\]@!$&'()*+,;=%]*|[^\.]+\.[^\.]+)
Here's a working example in JavaScript: jsfiddle
You can test the regex online here: http://gskinner.com/RegExr/
If it is for the purpose of flow control you can do the following:
var test = "http://ajax.googleapis.com/html5shiv.js";
// to recognize http & https
var regex = /^https?:\/\/.*/i;
var result = regex.exec(test);
if (result == null){
// no URL found code
} else {
// URL found code
}
For the purpose of capturing the file name you could use:
var test = "http://ajax.googleapis.com/html5shiv.js";
var regex = /(\w+\.\w+)$/i;
var filename = regex.exec(test);
Yes, you can use the alternation operator |. Be careful, though, because its precedence is very low, lower than concatenation. When the alternatives are part of a larger pattern you will need to group them, e.g. /^(cat|dog)$/.
It's very hard to understand what you exactly want with so few use/test cases, but
(http://[a-zA-Z0-9\./]+)|([a-zA-Z0-9\.]+)
should give you a starting point.
If it's a URL, strip it down to the last part and treat it the same way as "just a filename".
function isFile(fileOrUrl) {
// This will return everything after the last '/'; if there's
// no forward slash in the string, the unmodified string is used
var filename = fileOrUrl.split('/').pop();
return (/.+\..+/).test(filename);
}
Try this:
var ajx = 'http://ajax.googleapis.com/html5shiv.js';
function isURL(str){
return /((\/\w+)|(^\w+))\.\w{2,}$/.test(str);
}
console.log(isURL(ajx));
Have a look at this (requires no regex at all):
var filename = string.indexOf('/') == -1
? string
: string.split('/').slice(-1)[0];
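Here is a program that does it without a regex:
<script>
var url = "Home/this/example/file.js";
var condition = 0; // number of separators ("." or "/") seen so far
var result = "";
// Walk backwards from the last character and collect the characters
// between the last "." and the separator that precedes it.
for (var i = url.length - 1; i >= 0 && condition < 2; i--) {
    if (url[i] != "/" && url[i] != ".") {
        if (condition == 1) { result = url[i] + result; }
    } else {
        condition++;
    }
}
document.write(result); // writes "file"
</script>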
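To tie the answers back to the original "or" question, the sketch below uses a single alternation with one capture group per branch, so the code can tell which branch matched. The patterns are deliberately loose and are assumptions for illustration (any scheme followed by ://, and a filename being a slash-free name with an extension); the names urlOrFile and classify are made up.

```javascript
// Group 1: something that looks like a URL. Group 2: a bare filename.
var urlOrFile = /^(?:(\w+:\/\/\S+)|([^\/\\\s]+\.\w+))$/;

// Returns 'url', 'filename' or 'neither' for the given input string.
function classify(input) {
    var m = urlOrFile.exec(input);
    if (m === null) {
        return 'neither';
    }
    // A branch that did not take part in the match leaves its group undefined.
    return m[1] !== undefined ? 'url' : 'filename';
}

classify('http://ajax.googleapis.com/html5shiv.js'); // returns 'url'
classify('html5shiv.js');                            // returns 'filename'
classify('html5shiv');                               // returns 'neither'
```

The outer non-capturing group keeps the ^ and $ anchors applying to both alternatives, which is the precedence issue that alternation usually trips people up on.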

regex test for repeat words for url and filepath

I need to alert the user when the entered value
1) doesn't start with http://, https:// or //, or
2) repeats any of those three prefixes (http://, https:// or //).
I tried the regex below; the first case works, but the second case fails.
var regexp = /^(http:(\/\/)|https:(\/\/)|(\\\\))/;
var enteredvalue="http://facebookhttp://"
if (!regexp.test(enteredvalue.value)) {
alert("not valid url or filepath");
}
Please help me regarding the same.
This seems to work (though there will be more elegant solutions). Hope it helps at all.
var regex = /http[s]{0,1}:\/\/|\/\//;
var x = enteredvalue.split(regex);
if(!(x[0]=='' && x.length==2))
alert("not valid url or filepath");
Cheers.
Try
var regexp = /^(?!(.*\/\/){2})(https?:)?\/\//;
var enteredvalue = "http://facebookhttp://";
if (!regexp.test(enteredvalue)) {
console.log("not valid url or filepath");
}
A negative look-ahead is used to prevent a match if two sets of // appear in the string.
To check for multiple matches you could use String.match together with RegExp and the global ("g") flag. Below is a simplified version of your code:
var enteredvalue = "http://facebookhttp://";
var test_pattern = new RegExp("(https://|http://|//)", "g"); // RegExp(pattern, flags)
enteredvalue.match(test_pattern); // returns ["http://", "http://"]
When match returns more than one instance, the pattern clearly occurs more than once. That should help with identifying incorrect URLs.
Also, it's a lot cleaner than splitting.
Hope this helps.
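Both requirements from the question (a valid prefix at the start, and no repeated prefix) can be combined in one small function. This is a sketch, with the name hasSingleLeadingPrefix made up for the example; it relies on the fact that every one of the three prefixes contains //, so counting // occurrences is enough to detect a repeat.

```javascript
// True when value starts with http://, https:// or // and "//" occurs only once.
function hasSingleLeadingPrefix(value) {
    var startsWithPrefix = /^(?:https?:)?\/\//.test(value);
    var slashPairs = value.match(/\/\//g) || []; // match returns null when nothing is found
    return startsWithPrefix && slashPairs.length === 1;
}

hasSingleLeadingPrefix('http://facebook.com');    // returns true
hasSingleLeadingPrefix('http://facebookhttp://'); // returns false (prefix repeated)
hasSingleLeadingPrefix('facebook.com');           // returns false (no prefix)
```

The caller can then show the alert whenever the function returns false.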

Regex: Getting content from URL

I want to get "the-game" using regex from URLs like
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/
http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/
What parts of the URL could vary and what parts are constant? The following regex will always match whatever is in the slashes following "/en/" - the-game in your example.
(?<=/en/).*?(?=/)
This one will match the contents of the 2nd set of slashes of any URL containing "webdev", assuming the first set of slashes contains a 2 or 3 character language code.
(?<=.*?webdev.*?/.{2,3}/).*?(?=/)
Hopefully you can tweak these examples to accomplish what you're looking for.
var myregexp = /^(?:[^\/]*\/){4}([^\/]+)/;
var match = myregexp.exec(subject);
if (match != null) {
result = match[1];
} else {
result = "";
}
matches whatever lies between the fourth and fifth slash and stores the result in the variable result.
You probably should use some kind of url parsing library rather than resorting to using regex.
In Python:
from urllib.parse import urlparse
url = urlparse('http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/another-one/another-one/')
print(url.path)
Which would yield:
/en/the-game/another-one/another-one/another-one/
From there, you can do simple things like stripping /en/ from the beginning of the path. Otherwise, you're bound to do something wrong with a regular expression. Don't reinvent the wheel!
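The same parse-first approach is available in JavaScript through the standard URL class. A minimal sketch, assuming an environment that provides URL; the function name pathSegment and its zero-based index parameter are made up for this example.

```javascript
// Returns the path segment at the given zero-based index, or null if absent.
function pathSegment(urlText, index) {
    var segments = new URL(urlText).pathname
        .split('/')
        .filter(function (part) { return part !== ''; }); // drop empty leading/trailing parts
    return index < segments.length ? segments[index] : null;
}

var example = 'http://www.somesite.com.domain.webdev.domain.com/en/the-game/another-one/';
pathSegment(example, 1); // returns 'the-game'
```

Index 0 is the language code (en in the examples), so index 1 is the segment the question asks for.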
