URL regex does not work in JavaScript

I am trying to use John Gruber's URL regex in Javascript but NetBeans keeps telling me there is a syntax error and illegal errors:
var patt = "/(?i)\b((?:[a-z][\w-]+:(?:/{1,3}|[a-z0-9%])
|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]
{2,4}/)(?:[^\s()<>]+|\(([^\s()<>]+|
(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|
(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:
'".,<>?«»“”‘’]))/";
Anyone know how to solve this?

As others have said, the problem is the unescaped double quote inside the pattern, which ends the string early. Alternatively, you can write the regexp as a literal in JavaScript (but then you need to escape the forward slashes in lines 1 and 3 instead).
var regexp = /\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/i;
I also moved the case-insensitive modifier to the end. Just because. (edit: Well, not just "because" - see Alan Moore's comment below)
Note: Whether you use a literal or a string, it has to be on 1 line.

Put the whole expression on one line and remove the quotes at the start and end, so it looks like var patt = /the-long-pattern/;. NetBeans will still complain, but the browsers won't, and that's what matters.

You should write it like this in NetBeans:
"(?i)\\b((?:[a-z][\\w-]+:(?:\\/{1,3}|[a-z0-9%])|www\\d{0,3}[.]|[a-z0-9.\\-]"
+ "+[.][a-z]{2,4}\\/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))"
+ "+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’]))";
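
If the concatenated string form is handed to JavaScript's RegExp constructor, the inline (?i) prefix has to go, since it is not part of JavaScript's regex syntax; case-insensitivity is passed as the "i" flag instead. A minimal sketch under that assumption (the names urlPattern, sample and found are made up for illustration):

```javascript
// Same pattern as above, minus the leading (?i), split across concatenated strings.
// Every backslash is doubled because the string literal consumes one level of escaping.
const urlPattern = new RegExp(
  "\\b((?:[a-z][\\w-]+:(?:\\/{1,3}|[a-z0-9%])|www\\d{0,3}[.]|[a-z0-9.\\-]" +
  "+[.][a-z]{2,4}\\/)(?:[^\\s()<>]+|\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\))" +
  "+(?:\\(([^\\s()<>]+|(\\([^\\s()<>]+\\)))*\\)|[^\\s`!()\\[\\]{};:'\".,<>?«»“”‘’]))",
  "i" // case-insensitive flag replaces the inline (?i)
);

const sample = "See HTTP://example.com/page for details.";
const found = sample.match(urlPattern);
console.log(found ? found[0] : null); // HTTP://example.com/page
```

The upper-case scheme in the sample only matches because of the "i" flag.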

Related

Regex returns nothing to repeat [duplicate]

I'm new to Regex and I'm trying to work it into one of my new projects to see if I can learn it and add it to my repertoire of skills. However, I'm hitting a roadblock here.
I'm trying to see if the user's input has illegal characters in it by using the .search function as so:
if (name.search("[\[\]\?\*\+\|\{\}\\\(\)\#\.\n\r]") != -1) {
...
}
However, when I try to execute the function this line is contained in, it throws the following error for that specific line:
Uncaught SyntaxError: Invalid regular expression: /[[]?*+|{}\()#.
]/: Nothing to repeat
I can't for the life of me see what's wrong with my code. Can anyone point me in the right direction?
You need to double the backslashes used to escape the regular expression special characters: a single backslash is interpreted by the code that reads the string, rather than passed to the regular expression parser. However, as @Bohemian points out, most of those backslashes aren't needed. Unfortunately, his answer suffers from the same problem as yours. What you actually want is:
"[\\[\\]?*+|{}\\\\()#.\n\r]"
Note the quadrupled backslash. That is definitely needed. The string passed to the regular expression compiler is then identical to @Bohemian's string, and works correctly.
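
The difference between the two strings is easy to inspect directly. The following is only a sketch; the variable names (broken, fixed, brokenError, illegalChars) are invented for illustration:

```javascript
// The string from the question: each single backslash is consumed by the string
// literal, so the regex compiler never sees the escapes.
const broken = "[\[\]\?\*\+\|\{\}\\\(\)\#\.\n\r]";
// The corrected string: backslashes doubled (and quadrupled for a literal backslash).
const fixed = "[\\[\\]?*+|{}\\\\()#.\n\r]";

// Show what the regex compiler actually receives in each case.
console.log(JSON.stringify(broken));
console.log(JSON.stringify(fixed));

let brokenError = null;
try {
  new RegExp(broken);
} catch (err) {
  brokenError = err; // a SyntaxError such as "Nothing to repeat"
}
console.log(brokenError instanceof SyntaxError); // true

const illegalChars = new RegExp(fixed);
console.log(illegalChars.test("plain name")); // false
console.log(illegalChars.test("what?"));      // true
```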
Building off of @Bohemian, I think the easiest approach would be to just use a regex literal, e.g.:
if (name.search(/[\[\]?*+|{}\\()#.\n\r]/) != -1) {
// ... stuff ...
}
Regex literals are nice because you don't have to escape the escape character, and some IDEs will highlight invalid regex (very helpful for me as I constantly screw them up).
For Google travelers: this stupidly unhelpful error message is also presented when you make a typo and double up the + regex operator:
Okay:
\w+
Not okay:
\w++
Firstly, in a character class [...] most characters don't need escaping - they are just literals.
So, your regex should be:
"[\[\]?*+|{}\\()#.\n\r]"
This compiles for me.
In my case I had to validate a phone number with a regex, and I was getting the same error:
Invalid regular expression: /+923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/: Nothing to repeat
The cause was the + operator immediately after the opening / of the regex: a quantifier with nothing before it to repeat. Enclosing the + in square brackets, [+], so that it is treated as a literal character, worked like a charm.
The following will work:
/[+]923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/
This answer may be helpful for those who get the same error for the same reason. Cheers :)
For example, I ran into this in Express (Node.js) when trying to create a route for paths not starting with /internal:
app.get(`\/(?!internal).*`, (req, res)=>{
After a lot of trial and error, it worked when I passed the pattern as a RegExp object using new RegExp():
app.get(new RegExp("\/(?!internal).*"), (req, res)=>{
This may help if you are hitting this error in routing.
This can also happen if you begin a regex with ?.
? may function as a quantifier -- so ? may expect something else to come before it, thus the "nothing to repeat" error. Nothing preceded it in the regex string so it didn't get to quantify anything; there was nothing to repeat / nothing to quantify.
? also has another role -- if the ? is preceded by ( it may indicate the beginning of a lookaround assertion or some other special construct. See example below.
If one forgets to write the () parentheses around the following lookbehind assertion ?<=x, this will cause the OP's error:
Incorrect:
const xThenFive = /?<=x5/;
Correct:
const xThenFive = /(?<=x)5/;
This /(?<=x)5/ is a positive lookbehind: we're looking for a 5 that is preceded by an x e.g. it would match the 5 in x563 but not the 5 in x652.
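
The "nothing to repeat" cases collected in this thread can be checked in one place. This is a sketch only; compileError is a hypothetical helper, and the exact error text differs between engines, so only "compiles" versus "does not compile" is shown:

```javascript
// Hypothetical helper: returns the engine's error message if the pattern
// fails to compile, or null if it compiles.
function compileError(source) {
  try {
    new RegExp(source);
    return null;
  } catch (err) {
    return err.message;
  }
}

console.log(compileError("\\w+"));            // null: one quantifier is fine
console.log(compileError("\\w++") !== null);  // true: doubled + has nothing to repeat
console.log(compileError("+923[0-9]{2}") !== null); // true: leading + quantifies nothing
console.log(compileError("[+]923[0-9]{2}"));  // null: + is a literal inside [...]
console.log(compileError("?<=x5") !== null);  // true: leading ? quantifies nothing
console.log(compileError("(?<=x)5"));         // null: parenthesised lookbehind

const xThenFive = /(?<=x)5/;
console.log(xThenFive.test("x563")); // true
console.log(xThenFive.test("x652")); // false
```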

js: special char in regex

In my regular expression, I am trying to remove all "{" and "}" characters from a string.
Running the script through packer/minimizer scripts breaks it.
That's why I'd like to know about a better and more compatible way of writing:
mystring.replace(/\{/g,"");?
You can just use a string instead of a regex. I'm not sure if this is "better", but it should not break when minified. Note that a string pattern only replaces the first occurrence, so this removes one } and one {. If you provide the minified example, we may be able to help with that.
mystring.replace("}", "").replace("{", "");
Edit:
If the curly bracket is causing the problem, perhaps this would work...
var reg = new RegExp("\\{|\\}", "g");
mystring.replace(reg, "");
Example from the console...
> var mystring = "test{foo}bar{baz}";
> var reg = new RegExp("\\{|\\}", "g");
> mystring.replace(reg, "");
"testfoobarbaz"
Lastly, if a regex really won't work for you, this will remove all {'s and }'s. It is probably a horrible solution in terms of performance, but...
mystring.split("}").join("").split("{").join("");
You could try
mystring.replace(/\u007B/g,"");
This uses the Unicode escape rather than the actual symbol, so your packer won't get confused. If you want to replace more than one character in a single statement, you can use the "or" pipe:
mystring.replace(/\u007B|\u007D/g,"");
{ = \u007B
} = \u007D
For more Unicode code points, see:
http://www.unicode.org/charts/PDF/U0000.pdf
After re-reading the question, it sounds like you've found a bug with the minifier/packer. My first suggestion would be to use a better minimizer that doesn't have these issues, but if you're stuck with what you're using, you could try using the unicode escape sequence in the regular expression:
mystring.replace(/\u007b/g, '');
Alternatively, you could try String.prototype.split and Array.prototype.join:
mystring.split('{').join('');
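
For comparison, here are the approaches from the answers above run against the same sample string. This is only a sketch; the result variable names are made up for illustration:

```javascript
const mystring = "test{foo}bar{baz}";

// String patterns: only the first "}" and the first "{" are removed.
const firstOnly = mystring.replace("}", "").replace("{", "");
console.log(firstOnly); // testfoobar{baz}

// RegExp built from a string, with the global flag.
const viaConstructor = mystring.replace(new RegExp("\\{|\\}", "g"), "");
console.log(viaConstructor); // testfoobarbaz

// Unicode escapes instead of literal braces in the regex literal.
const viaUnicode = mystring.replace(/\u007B|\u007D/g, "");
console.log(viaUnicode); // testfoobarbaz

// No regex at all: split on each brace and join the pieces back together.
const viaSplitJoin = mystring.split("{").join("").split("}").join("");
console.log(viaSplitJoin); // testfoobarbaz
```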

Invalid regular expression in javascript

I'm trying to find out if a string contains css code with this expression:
var pattern = new RegExp('\s(?[a-zA-Z-]+)\s[:]{1}\s*(?[a-zA-Z0-9\s.#]+)[;]{1}');
But I get "invalid regular expression" error on the line above...
What's wrong with it?
found the regex here: http://www.catswhocode.com/blog/10-regular-expressions-for-efficient-web-development
It's for PHP but it should work in javascript too, right?
What are the ? at the start of the two [a-zA-Z-] blocks for? They look wrong to me.
The ? is unfortunately somewhat overloaded in regexp syntax; it can have three different meanings that I know of, and none of them match what I see in your example.
Also, your \s sequences need the backslash escaped because this is a string: they should look like \\s. To avoid escaping, just use the /.../ syntax instead of new RegExp("...").
That said, even that is insufficient: the regexp still produces an Invalid Group error in Chrome, because ( followed directly by ?[ does not start any valid group. The {1} sequences are valid, just redundant.
The ?'s are messing it up. I'm not sure what they are for.
/\s[a-zA-Z\-]+\s*:\s*[a-zA-Z0-9\s.#]+;/
worked for me (as far as compiling goes; I didn't test whether it properly detects a CSS string).
Replace the quotes with / (slashes):
var pattern = /\s([a-zA-Z-]+)\s[:]{1}\s*([a-zA-Z0-9\s.#]+)[;]{1}/;
You don't need the new RegExp() part either, which is why it's been removed: instead of quotes, which denote a string, JavaScript uses slashes / to denote a regular expression literal, which isn't a normal string.
That regular expression is very bad and I would avoid its source in the future. That said, I cleaned it up a bit and got the following result:
var pattern = /\s(?:[a-zA-Z-]+)\s*:\s*(?:[^;\n\r]+);/;
this matches something that looks like css, for example:
background-color: red;
Here's the fiddle to prove it, though I'd recommend finding a different solution to your problem. This is a very simple regex, and it's not safe to say that it is reliable.
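
Two points from the answers above can be checked directly: a single backslash inside a string literal is lost before RegExp sees it, and the pattern compiles once the stray ? characters are dropped. A rough sketch; the names cssDeclaration, sameAsLiteral and parts are invented here, and the pattern is the question's regex with the ? and {1} removed, not a reliable CSS detector:

```javascript
// In a string literal, "\s" is just "s": the backslash is dropped.
console.log("\s" === "s"); // true

// Constructor form needs doubled backslashes; the literal form does not.
const cssDeclaration = new RegExp("\\s([a-zA-Z-]+)\\s*:\\s*([a-zA-Z0-9\\s.#]+);");
const sameAsLiteral = /\s([a-zA-Z-]+)\s*:\s*([a-zA-Z0-9\s.#]+);/;
console.log(cssDeclaration.source === sameAsLiteral.source); // true

// The capture groups hold the property name and the value.
const parts = cssDeclaration.exec("p { color: #fff; }");
console.log(parts[1], parts[2]); // color #fff

console.log(cssDeclaration.test("hello world")); // false
```

Note that the pattern requires whitespace before the property name, so a declaration at the very start of the string is not matched.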

javascript regex invalid quantifier error

I have the following javascript code:
if (url.match(/?rows.*?(?=\&)|.*/g)){
urlset= url.replace(/?rows.*?(?=\&)|.*/g,"rows="+document.getElementById('rowcount').value);
}else{
urlset= url+"&rows="+document.getElementById('rowcount').value;
}
I get the error invalid quantifier at the /?rows.*?.... This same regex works when testing it on http://www.pagecolumn.com/tool/regtest.htm using the test string
?srt=acc_pay&showfileCL=yes&shownotaryCL=yes&showclientCL=no&showborrowerCL=yes&shownotaryStatusCL=yes&showclientStatusCL=yes&showbillCL=yes&showfeeCL=yes&showtotalCL=yes&dir=asc&closingDate=12/01/2011&closingDate2=12/31/2011&sort=notaryname&pageno=0&rows=anything&Start=0','bodytable','xyz')
In this string, the above regex is supposed to match:
rows=anything
I actually don't even need the /? to get it to work, but if I don't put that into my JavaScript, it acts like it's not even regex... I'm terrible with regex, period, so this one has me pretty confused. And that error is the only one I am getting in Firefox's error console.
EDIT
Using that link I posted above, it seems that the leading / tries to match an actual forward slash instead of just marking the code as the beginning of a regex statement. So the ? is in there so that if it doesn't match the / to anything, it continues anyway.
RESOLUTION
Ok, so in the end, I had to change my regex to this:
/rows=.*(?=\&?)/g
This matched the word "rows=" followed by anything until it hit an ampersand or ran out of text.
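
It may be worth double-checking that final pattern against a string where rows is not the last parameter: .* is greedy and the lookahead (?=\&?) also succeeds on an empty match, so the match can run to the end of the string. A small sketch, using a shortened version of the test string and a hypothetical alternative, [^&]*, that stops at the next ampersand:

```javascript
const query = "?sort=notaryname&pageno=0&rows=anything&Start=0";

// The pattern from the resolution: greedy .* swallows the following parameters too.
const greedy = /rows=.*(?=\&?)/g;
console.log(query.match(greedy)); // [ 'rows=anything&Start=0' ]

// Assumed alternative: match everything up to, but not including, the next "&".
const bounded = /rows=[^&]*/g;
console.log(query.match(bounded)); // [ 'rows=anything' ]

const updated = query.replace(bounded, "rows=" + 25);
console.log(updated); // ?sort=notaryname&pageno=0&rows=25&Start=0
```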
You need to escape the first ?, since it has special meaning in a regex.
/\?rows.*?(?=\&)|.*/g
// ^---escaped
regtest.htm produces
new RegExp("?rows.*?(?=\&)|.*", "") returned a SyntaxError: invalid quantifier
The value you put into the web site shouldn't have the / delimiters on the regex, so put in ?rows.*?(?=\&)|.* and it shows the same problem. Your JavaScript code should look like
re = /rows.*?(?=\&)|.*/g;
or similar (but that is a pointless regex as it matches everything). If you can't fix it, please describe what you want to match and show your JavaScript.
You might consider refactoring your code to look something like this:
var url = "sort=notaryname&pageno=0&rows=anything&Start=0";
var rowCount = "foobar";
if (/[\?\&]rows=/.test(url))
{
url = url.replace(/([\?\&]rows=)[^\&]+/g,"$1"+rowCount);
}
console.log(url);
Output
sort=notaryname&pageno=0&rows=foobar&Start=0

What is the correct javascript RegEx formula for removing trailing commas on a string?

I found this in a message forum, I don't know regex so I was hoping you could explain it to me or give me a better solution.
StrippedPrefix_JS_ItemNo = StrippedPrefix_JS_ItemNo.replace(/,$/,'');
What is the opening / for?
$ is end of line, I know that much, and I can see the empty replace ''.
/,$/
/expression goes here/ is how JavaScript defines a regular expression literal. Without the slashes:
var expression = ,$;
That's a syntax error, so the slashes mark it as a regular expression. It can also be written as var expression = new RegExp(",$");.
More Info about JavaScript RegExp
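
To make the equivalence concrete, here is a small sketch showing the literal and constructor forms side by side. The function name stripTrailingComma and the sample item numbers are made up for illustration:

```javascript
// Two ways to write the same regular expression: "," followed by end of input.
const literal = /,$/;
const constructed = new RegExp(",$");
console.log(literal.source === constructed.source); // true

// Hypothetical wrapper around the replace call from the question.
function stripTrailingComma(text) {
  return text.replace(/,$/, "");
}

console.log(stripTrailingComma("A100,B200,")); // A100,B200
console.log(stripTrailingComma("A100,B200"));  // A100,B200 (unchanged)
```

Only a single trailing comma is removed; commas elsewhere in the string are left alone because $ anchors the match to the end.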
