JavaScript/RegExp: Lookbehind assertion is causing an "Invalid group" error


I'm doing a simple Lookbehind Assertion to get a segment of the URL (example below) but instead of getting the match I get the following error:
Uncaught SyntaxError: Invalid regular expression: /(?<=\#\!\/)([^\/]+)/: Invalid group
Here is the script I'm running:
var url = window.location.toString();
// url is "http://my.domain.com/index.php/#!/write-stuff/something-else"
// lookbehind to only match the segment after the hash-bang.
var regex = /(?<=\#\!\/)([^\/]+)/i;
console.log('test this url: ', url, 'we found this match: ', url.match( regex ) );
The result should be write-stuff.
Can anyone shed some light on why this regex group is causing this error? Looks like a valid RegEx to me.
I know of alternatives on how to get the segment I need, so this is really just about helping me understand what's going on here rather than getting an alternative solution.
Thanks for reading.
J.

I believe JavaScript does not support positive lookbehind. You will have to do something more like this:
<script>
var regex = /\#\!\/([^\/]+)/;
var url = "http://my.domain.com/index.php/#!/write-stuff/something-else";
var match = regex.exec(url);
alert(match[1]);
</script>

JavaScript engines that predate ES2018 don't support look-behind syntax, so the (?<=) is what's causing the "Invalid group" error. However, you can mimic it with various techniques: http://blog.stevenlevithan.com/archives/mimic-lookbehind-javascript
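Since support depends on the engine, a defensive version can feature-detect lookbehind and fall back to a capture group. This is only a minimal sketch: the helper name firstHashBangSegment is made up for illustration and does not come from the answers above.

```javascript
// Hypothetical helper (illustrative name): returns the first path segment
// after "#!/", or null when the URL has no hash-bang segment.
function firstHashBangSegment(url) {
  var lookbehind;
  try {
    // Build the pattern from a string so that an engine without lookbehind
    // support throws here, at run time, instead of failing to parse the script.
    lookbehind = new RegExp('(?<=#!\\/)[^\\/]+');
  } catch (e) {
    lookbehind = null;
  }
  if (lookbehind) {
    var direct = url.match(lookbehind);
    return direct ? direct[0] : null;
  }
  // Fallback: match the "#!/" prefix as well and read the capture group.
  var viaGroup = /#!\/([^\/]+)/.exec(url);
  return viaGroup ? viaGroup[1] : null;
}

console.log(firstHashBangSegment('http://my.domain.com/index.php/#!/write-stuff/something-else'));
// write-stuff
```

Both branches return the same value, so callers do not need to know which one ran.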

You could also use String.prototype.match() instead of RegExp.prototype.exec(), as long as neither the global (/g) nor the sticky (/y) flag is set.
var regex = /\#\!\/([^\/]+)/;
var url = "http://my.domain.com/index.php/#!/write-stuff/something-else";
var match = url.match(regex); // ["#!/write-stuff", "write-stuff", index: 31, etc.,]
console.log(match[1]); // "write-stuff"
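The flag matters because match() changes shape once /g is set: it returns every full match and drops the capture groups. A short sketch with a made-up sample string that contains two hash-bang segments:

```javascript
// Illustrative input, not from the question.
var sample = 'a/#!/first/x and b/#!/second/y';

// No g flag: same result as exec(), full match plus capture groups.
var noFlag = sample.match(/#!\/([^\/]+)/);
console.log(noFlag[0], noFlag[1]); // #!/first first

// With g: all full matches, capture groups are not included.
var withG = sample.match(/#!\/([^\/]+)/g);
console.log(withG); // [ '#!/first', '#!/second' ]

// To get the groups for every match, loop with exec() on a g-flagged regex.
var re = /#!\/([^\/]+)/g;
var groups = [];
var m;
while ((m = re.exec(sample)) !== null) {
  groups.push(m[1]);
}
console.log(groups); // [ 'first', 'second' ]
```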

Related

Regexp match everything after a word

I am trying to extract the base64 string from a data-url. The string looks like this, so I am trying to extract everything after the word base64
test = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB'
So I want to extract the following from the above string
,/9j/4AAQSkZJRgABAQAAAQAB
Here is my regex
const base64rgx = new RegExp('(?<=base64)(?s)(.*$)');
console.log(test.match(base64rgx))
But this fails with the error:
VM3616:1 Uncaught SyntaxError: Invalid regular expression: /(?<=base64)(?s)(.*$)/: Invalid group

var test = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB';
var regex = /(?<=base64).+/;
var r = test.match(regex);
console.log(r);
Here's the regex: https://regex101.com/r/uzyu0a/1

It appears that lookbehinds are not supported. But the good news is that you don't need a lookaround here; the following pattern should work:
var test = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB';
// Match the literal "base64" and capture everything after it (greedy, so the group is not empty).
var regex = /base64(.*)/;
var match = regex.exec(test);
console.log(match[1]); // ",/9j/4AAQSkZJRgABAQAAAQAB"
I'm not sure precisely how much of the string after base64 you want to capture. If, for example, you don't want the comma, or the 9j portion, then we can easily modify the pattern to handle that.

You may not need a regular expression: just find the base64 marker with indexOf and take the rest with substring
var test = 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB';
var data = test.substring(test.indexOf('base64')+6); // 6 is length of "base64" string
console.log(data);
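One caveat with the indexOf approach: indexOf returns -1 when the marker is missing, so adding 6 blindly would slice from index 5 of an unrelated string. A minimal sketch with a guard, using a hypothetical helper name afterMarker:

```javascript
// Hypothetical helper (illustrative name): returns the text that follows the
// first occurrence of marker, or null when marker does not occur at all.
function afterMarker(text, marker) {
  var at = text.indexOf(marker);
  if (at === -1) {
    return null; // marker not found, nothing to extract
  }
  return text.slice(at + marker.length);
}

console.log(afterMarker('data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQAB', 'base64'));
// ,/9j/4AAQSkZJRgABAQAAAQAB
console.log(afterMarker('data:text/plain,hello', 'base64'));
// null
```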

Regular expression not matching specialist unicode characters

I'm trying to match Unicode letters with a regular expression, but somehow \p{L} won't work.
<script>
var input="teëst";
var re = /^[a-zA-Z-. \pL]{2,32}$/;
var is_valid=input.match(re);
if(is_valid){
document.write('Regularexpression valid');
} else {
document.write('Regularexpression invalid');
}
</script>
Plnkr.co:
https://plnkr.co/edit/3PCMxqCnwsyrueYQbB8q?p=preview
What am I doing wrong?
UPDATE
https://stackoverflow.com/a/280762/989121
Workaround:
var re = /^[a-zA-Z- \u00c0-\u017e]{2,32}$/;

My Google search for an online JavaScript regular expression checker brought me to regex101.com, which validated my regexp, so while writing this question I thought I was doing something wrong elsewhere in the code. It turns out Unicode matching with \p{L} is not supported yet.
https://stackoverflow.com/a/280762/989121
Workaround:
var re = /^[a-zA-Z- \u00c0-\u017e]{2,32}$/;
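On engines that implement ES2018, Unicode property escapes are available when the u flag is set, and \p{L} can then be used directly inside the character class. A sketch under that assumption (the variable name nameRe is illustrative; an older engine would reject the literal with a SyntaxError):

```javascript
// Requires an engine with ES2018 Unicode property escapes.
// \p{L} matches any Unicode letter, but only when the u flag is present.
var nameRe = /^[\p{L}\-. ]{2,32}$/u;

console.log(nameRe.test('te\u00EBst')); // true  (the string "teëst")
console.log(nameRe.test('te1st'));      // false (digits are not letters)
console.log(nameRe.test('t'));          // false (shorter than 2 characters)
```

Without the u flag, \p is treated as a plain escaped "p", which is why the pattern in the question silently matches something else instead of throwing.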

You can use:
var re = /^[a-zA-Z\u00C0-\u017F-. \pL]{2,32}$/;
It relies on the explicit Unicode range \u00C0-\u017F to match the accented letters. See here

Try with this regular expression:
var re = /[^\x00-\x7F]+/;

Regular expression failed in URL address test

I'm trying to build a regular expression to check for a valid URL. So far I have tested different addresses and all was good, but the following (valid) addresses failed:
url = "http://example.com/tr/vvf/index.php/docs/po/trf"
//url = "http://example-a.mydomain.com/test/ny" also not working
var pattern = new RegExp("(https|ftp|http)://[\w-]+(\.[\w-]+)+([\w.,#?^=%&:/~+#-]*[\w#?^=%&/~+#-])?");
pattern.test(url)
I think it's because of the index.php/doc... Any ideas how to fix it?

Just use a regex literal instead of the RegExp constructor:
var pattern = /(https|ftp|http):\/\/[\w-]+(\.[\w-]+)+([\w.,#?^=%&:\/~+#-]*[\w#?^=%&\/~+#-])?/;
The RegExp constructor takes a string, which requires you to do double escaping, so \w becomes \\w in it.
See it working here
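The effect of the missing escapes is easy to see by printing the source of the compiled pattern. A short sketch (variable names are illustrative):

```javascript
// In a string literal, "\w" is an unknown escape and collapses to plain "w".
var single = new RegExp("[\w-]+");
// Doubling the backslash keeps it, so the regex engine really sees \w.
var doubled = new RegExp("[\\w-]+");

console.log(single.source);  // [w-]+
console.log(doubled.source); // [\w-]+

// The pattern from the question with every backslash doubled.
var pattern = new RegExp("(https|ftp|http)://[\\w-]+(\\.[\\w-]+)+([\\w.,#?^=%&:/~+#-]*[\\w#?^=%&/~+#-])?");
var testUrl = "http://example.com/tr/vvf/index.php/docs/po/trf";
console.log(pattern.test(testUrl));                 // true
console.log(testUrl.match(pattern)[0] === testUrl); // true, the whole URL is matched
```

Forward slashes need no escaping in the string form because there is no literal delimiter to clash with.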

Regex expression to match the First url after a space followed

I want to match the first URL that is followed by a space, using a regular expression, while the user types in the input box.
For example:
if I type www.google.com, it should be matched only once a space follows the URL,
i.e. www.google.com<SPACE>
Code
$(".site").keyup(function()
{
var site=$(this).val();
var exp = /^http(s?):\/\/(\w+:{0,1}\w*)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!\-\/]))?/;
var find = site.match(exp);
var url = find? find[0] : null;
if (url === null){
var exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?/g;
var find = site.match(exp);
url = find? 'http://'+find[0] : null;
}
});
Fiddle
Please help, Thanks in advance

You should be using a better regex to correctly match the query and fragment parts of your URL. Have a look here (What is the best regular expression to check if a string is a valid URL?) for a correct IRI/URI structured regex test.
But here's a rudimentary version:
var regex = /[-\w]+(\.[a-z]{2,})+(\/?)([^\s]+)/g;
var text = 'test google.com/?q=foo basdasd www.url.com/test?q=asdasd#cheese something else';
console.log(text.match(regex));
Expected Result:
["google.com/?q=foo", "www.url.com/test?q=asdasd#cheese"]
If you really want to check for URLs, make sure you include scheme, port, username & password checks just to be safe.
In the context of what you're trying to achieve, you should really put in some delay so that you don't impact browser performance. Regex tests can be expensive when you use complex rules especially so when running the same rule every time a new character is entered. Just think about what you're trying to achieve and whether or not there's a better solution to get there.
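One common way to add such a delay is to debounce the handler so the regex only runs once typing pauses. The sketch below is an assumption about how that could look, not code from the answer: debounce and findUrls are hypothetical helper names, and the 300 ms wait is an arbitrary choice.

```javascript
// Hypothetical helper: postpones fn until waitMs have passed without a new call.
function debounce(fn, waitMs) {
  var timer = null;
  return function () {
    var self = this;
    var args = arguments;
    clearTimeout(timer); // drop the pending call, if any
    timer = setTimeout(function () {
      fn.apply(self, args);
    }, waitMs);
  };
}

// Hypothetical helper: wraps the rudimentary pattern from the answer above.
function findUrls(text) {
  return text.match(/[-\w]+(\.[a-z]{2,})+(\/?)([^\s]+)/g) || [];
}

// Possible wiring with jQuery, as in the question:
// $(".site").keyup(debounce(function () {
//   console.log(findUrls($(this).val()));
// }, 300));

console.log(findUrls('test google.com/?q=foo basdasd www.url.com/test?q=asdasd#cheese something else'));
// [ 'google.com/?q=foo', 'www.url.com/test?q=asdasd#cheese' ]
```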

With a lookahead:
var exp = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?(?= )/g;
I only added this "(?= )" to your regex.
Fiddle
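A quick check of what the lookahead changes, using the pattern from this answer (the variable name urlThenSpace is illustrative). The lookahead requires a space right after the match but does not consume it, so the space is not part of the result:

```javascript
var urlThenSpace = /[-\w]+(\.[a-z]{2,})+(\S+)?(\/|\/[\w#!:.?+=&%#!\-\/])?(?= )/g;

// No space typed yet: the lookahead fails everywhere, so there is no match.
console.log('www.google.com'.match(urlThenSpace));  // null

// Once a space follows, the URL matches and the space itself is excluded.
var typed = 'www.google.com '.match(urlThenSpace);
console.log(typed);           // [ 'www.google.com' ]
console.log(typed[0].length); // 14
```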

Regex to detect a string that contains a URL or file extension

I'm trying to create a small script that detects whether the string input is either:
1) a URL (which will hold a filename): 'http://ajax.googleapis.com/html5shiv.js'
2) just a filename: 'html5shiv.js'
So far I've found this but I think it just checks the URL and file extension. Is there an easy way to make it so it uses an 'or' check? I'm not very experienced with RegExp.
var myRegExp = /[^\\]*\.(\w+)$/i;
Thank you in advance.

How about this regex?
(\.js)$
It checks whether the line ends with .js.
$ denotes the end of the line.
Tested here.

Basically, to use 'OR' in regex, simply use the 'pipe' delimiter.
(aaa|bbb)
will match
aaa
or
bbb
For regex to match a url, I'd suggest the following:
\w+://[\w\._~:/?#\[\]#!$&'()*+,;=%]*
This is based on the allowed character set for a url.
For the file, what's your definition of a filename?
If you want to search for strings, that match "(at least) one to many non-fullstop characters, followed by a fullstop, followed by (at least) one to many non-fullstop characters", I'd suggest the following regex:
[^\.]+\.[^\.]+
And altogether:
(\w+://[\w\._~:/?#\[\]#!$&'()*+,;=%]*|[^\.]+\.[^\.]+)
Here's a working example (in JavaScript): jsfiddle
You can test the regex online here: http://gskinner.com/RegExr/

If it is for the purpose of flow control you can do the following:
var test = "http://ajax.googleapis.com/html5shiv.js";
// to recognize http & https
var regex = /^https?:\/\/.*/i;
var result = regex.exec(test);
if (result == null){
// no URL found code
} else {
// URL found code
}
For the purpose of capturing the file name you could use:
var test = "http://ajax.googleapis.com/html5shiv.js";
var regex = /(\w+\.\w+)$/i;
var filename = regex.exec(test);

Yes, you can use the alternation operator |. Be careful, though, because its priority is very low. Lower than sequencing. You will need to write things like /(cat)|(dog)/.
It's very hard to understand what exactly you want with so few use/test cases, but
(http://[a-zA-Z0-9\./]+)|([a-zA-Z0-9\.]+)
should give you a starting point.
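The low priority of | is easiest to see once anchors are involved. A small sketch with made-up inputs:

```javascript
// Parsed as (^cat) OR (dog$): each anchor belongs to only one alternative.
var loose = /^cat|dog$/;
// Grouping first makes both anchors apply to both alternatives.
var grouped = /^(cat|dog)$/;

console.log(loose.test('catalog'));   // true  (starts with "cat")
console.log(loose.test('hotdog'));    // true  (ends with "dog")
console.log(grouped.test('catalog')); // false
console.log(grouped.test('hotdog'));  // false
console.log(grouped.test('dog'));     // true
```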

If it's a URL, strip it down to the last part and treat it the same way as "just a filename".
function isFile(fileOrUrl) {
// This will return everything after the last '/'; if there's
// no forward slash in the string, the unmodified string is used
var filename = fileOrUrl.split('/').pop();
return (/.+\..+/).test(filename);
}

Try this:
var ajx = 'http://ajax.googleapis.com/html5shiv.js';
function isURL(str){
return /((\/\w+)|(^\w+))\.\w{2,}$/.test(str);
}
console.log(isURL(ajx));

Have a look at this (requires no regex at all):
var filename = string.indexOf('/') == -1
? string
: string.split('/').slice(-1)[0];

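Here is the program!
<script>
var url="Home/this/example/file.js";
var condition=0; // counts the "." and "/" separators seen so far, scanning from the end
var result="";
// Start at the last character (length - 1) and stop after the second separator.
for(var i=url.length-1; i>=0 && condition<2; i--)
{
if(url[i]!="/" && url[i]!="."){result= (condition==1)? (url[i]+result):(result);}
else{condition++;}
}
document.write(result); // writes "file"
</script>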
