RegEx Syntax Error - nothing to repeat - javascript

Could someone please tell me why this RegEx fails?
http://jsfiddle.net/SrKPG/
^(\+[0-9]+ )[1-9]{2,} [0-9]{2,}(\-[0-9]+|)$
The funny thing is - when I test it at http://jsregex.com/ it works.
But in my code it fails.

Your pattern fails to match because the second group of digits, [1-9]{2,}, does not accept zeroes, while your test string contains one (660):
^([+][0-9]+ )[1-9]{2,} [0-9]{2,}(\-[0-9]+|)$
+43 660 1234556

It fails because you write it as a string, without escaping the \.
You could write
var regex = "^(\\+[0-9]+ )[1-9]{2,} [0-9]{2,}(\\-[0-9]+|)$";
But instead of using a string and the RegExp constructor, you should use a regex literal directly:
text.match(/^(\+[0-9]+ )[1-9]{2,} [0-9]{2,}(\-[0-9]+|)$/g);
You were also refusing 0 in the middle, which doesn't comply with your test string. It seems that what you want is
text.match(/^(\+[0-9]+ )[0-9]{2,} [0-9]{2,}(\-[0-9]+|)$/g);

Yours
"^(\+[0-9]+ )[1-9]{2,} [0-9]{2,}(\-[0-9]+|)$"
Correct
"^(\\+[0-9]+ )[1-9]{2,} [0-9]{2,}(-[0-9]+|)$"
The double escaping is a requirement of JavaScript string literals. It has nothing to do with regex.
Upon parsing your program your string literal becomes "^(+[0-9]+ )[1-9]{2,} [0-9]{2,}(-[0-9]+|)$" in memory, because \+ (as opposed to, let's say, \n) has no meaning in JS strings.
At this time the regex engine complains about the lone + that follows nothing.
Note that the something-or-nothing (something|) is better written as (something)?.
Apart from that: Thou shalt not use regex to validate phone numbers.
EDIT: The proof is in the comments. ;)
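A minimal sketch of the point made in the answers above, assuming a Node.js or browser console. The helper name compiles is made up for illustration. It shows that the string literal loses its backslashes before the regex engine ever sees them, and that a regex literal (with the optional group written as (...)? and zeroes allowed in the middle) accepts the test string.

```javascript
// Hypothetical helper: true if the RegExp constructor accepts the source text.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    return false;
  }
}

// "\+" and "\-" are not recognised string escapes, so the backslashes are dropped.
const fromQuestion = "^(\+[0-9]+ )[1-9]{2,} [0-9]{2,}(\-[0-9]+|)$";
// Doubling the backslash keeps it in the string that reaches the regex engine.
const doubled = "^(\\+[0-9]+ )[0-9]{2,} [0-9]{2,}(-[0-9]+)?$";

console.log(fromQuestion.includes("\\")); // false
console.log(compiles(fromQuestion));      // false ("(+" has nothing to repeat)
console.log(compiles(doubled));           // true

// The same pattern as a regex literal needs no doubling at all.
const phone = /^(\+[0-9]+ )[0-9]{2,} [0-9]{2,}(-[0-9]+)?$/;
console.log(phone.test("+43 660 1234556")); // true
```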

Related

Regex returns nothing to repeat [duplicate]

I'm new to Regex and I'm trying to work it into one of my new projects to see if I can learn it and add it to my repertoire of skills. However, I'm hitting a roadblock here.
I'm trying to see if the user's input has illegal characters in it by using the .search function as so:
if (name.search("[\[\]\?\*\+\|\{\}\\\(\)\#\.\n\r]") != -1) {
...
}
However, when I try to execute the function that contains this line, it throws the following error for that specific line:
Uncaught SyntaxError: Invalid regular expression: /[[]?*+|{}\()#.
]/: Nothing to repeat
I can't for the life of me see what's wrong with my code. Can anyone point me in the right direction?
You need to double the backslashes used to escape the regular expression special characters, because the backslash is interpreted by the code that reads the string literal rather than passed on to the regular expression parser. However, as @Bohemian points out, most of those backslashes aren't needed. Unfortunately, his answer suffers from the same problem as yours. What you actually want is:
"[\\[\\]?*+|{}\\\\()#.\n\r]"
Note the quadrupled backslash. That is definitely needed. The string passed to the regular expression compiler is then identical to @Bohemian's string, and works correctly.
Building off of @Bohemian, I think the easiest approach would be to just use a regex literal, e.g.:
if (name.search(/[\[\]?*+|{}\\()#.\n\r]/) != -1) {
// ... stuff ...
}
Regex literals are nice because you don't have to escape the escape character, and some IDEs will highlight invalid regex (very helpful for me as I constantly screw them up).
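As a rough illustration of the two working forms discussed above, here is the check written once with a regex literal and once with the RegExp constructor and doubled backslashes. The helper name hasIllegalChars and the sample inputs are assumptions made for this sketch.

```javascript
// Regex literal: backslashes reach the regex engine unchanged.
const illegal = /[\[\]?*+|{}\\()#.\n\r]/;

// Hypothetical helper wrapping the check from the question.
function hasIllegalChars(name) {
  return illegal.test(name);
}

console.log(hasIllegalChars("report_2024")); // false
console.log(hasIllegalChars("what?"));       // true
console.log(hasIllegalChars("a\\b"));        // true (the input contains one backslash)

// String form: every backslash meant for the regex engine is doubled,
// so the literal backslash inside the class needs four.
const fromString = new RegExp("[\\[\\]?*+|{}\\\\()#.\n\r]");
console.log(fromString.test("a\\b"));        // true
```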
For Google travelers: this stupidly unhelpful error message is also presented when you make a typo and double up the + regex operator:
Okay:
\w+
Not okay:
\w++
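A small sketch of the typo case above, assuming any modern JavaScript engine. The helper regexError is hypothetical. The exact wording of the message differs between engines, so the sketch only relies on the error being a SyntaxError.

```javascript
// Hypothetical helper: returns the error thrown while compiling, or null.
function regexError(source) {
  try {
    new RegExp(source);
    return null;
  } catch (e) {
    return e;
  }
}

console.log(regexError("\\w+"));          // null, a single quantifier is fine

const err = regexError("\\w++");          // the second + has nothing to repeat
console.log(err instanceof SyntaxError);  // true
console.log(err.message);                 // wording depends on the engine
```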
Firstly, in a character class [...] most characters don't need escaping - they are just literals.
So, your regex should be:
"[\[\]?*+|{}\\()#.\n\r]"
This compiles for me.
Well, in my case I had to test a Phone Number with the help of regex, and I was getting the same error,
Invalid regular expression: /+923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/: Nothing to repeat'
In my case the error was caused by the + operator right after the opening / at the start of the regex. Enclosing the + in square brackets, [+], and sending the request again worked like a charm.
Following will work:
/[+]923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/
This answer may help those who get the same error for the same reason I did. Cheers :)
For example, I ran into this in Express (Node.js) when trying to create a route for paths not starting with /internal:
app.get(`\/(?!internal).*`, (req, res)=>{
After a lot of trial and error, it worked once I passed the pattern as a RegExp object using new RegExp():
app.get(new RegExp("\/(?!internal).*"), (req, res)=>{
This may help if you hit this common issue in routing.
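One caveat worth checking before relying on that route pattern: it is not anchored, so if the router searches for the pattern anywhere in the path (which, as far as I know, is how a RegExp route is tested), a path such as /internal/x still matches at its second slash. The sketch below tests the patterns directly, without Express. The anchored variant is an assumption of mine, not part of the original answer.

```javascript
// Pattern from the answer. Note that "\/" inside a string literal is just "/".
const unanchored = new RegExp("\/(?!internal).*");

// Assumed stricter variant: anchored at the start, and "internal" must be a
// whole path segment (followed by "/" or the end of the path).
const anchored = /^\/(?!internal(\/|$))/;

console.log(unanchored.test("/internal/x")); // true, matches at the second "/"
console.log(anchored.test("/internal/x"));   // false
console.log(anchored.test("/internal"));     // false
console.log(anchored.test("/internals"));    // true, a different segment
console.log(anchored.test("/public"));       // true
```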
This can also happen if you begin a regex with ?.
? may function as a quantifier -- so ? may expect something else to come before it, thus the "nothing to repeat" error. Nothing preceded it in the regex string so it didn't get to quantify anything; there was nothing to repeat / nothing to quantify.
? also has another role -- if the ? is preceded by ( it may indicate the beginning of a lookaround assertion or some other special construct. See example below.
If one forgets to write the () parentheses around the following lookbehind assertion ?<=x, this will cause the OP's error:
Incorrect: const xThenFive = /?<=x5/;
Correct:
const xThenFive = /(?<=x)5/;
This /(?<=x)5/ is a positive lookbehind: we're looking for a 5 that is preceded by an x e.g. it would match the 5 in x563 but not the 5 in x652.
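The lookbehind example can be checked in a console roughly as follows, assuming an engine that supports lookbehind assertions (recent versions of Node.js and the major browsers do). The variable names hit and threw are only for illustration.

```javascript
// Positive lookbehind: a 5 that is immediately preceded by an x.
const xThenFive = /(?<=x)5/;

const hit = xThenFive.exec("x563");
console.log(hit[0], hit.index);          // "5" at index 1
console.log(xThenFive.test("x652"));     // false, the 5 follows a 6

// Without the parentheses the pattern starts with "?", a quantifier
// that has nothing in front of it to quantify.
let threw = false;
try {
  new RegExp("?<=x5");
} catch (e) {
  threw = e instanceof SyntaxError;
}
console.log(threw);                      // true
```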

RegEx that works in Javascript won't do so in PHP

I will try to make my question short yet understandable, I have a simple RegEx I use in javascript to check for characters that aren't alphanumeric (AKA Symbols). It would be "/[$-/:-?{-~!"^_`[]]/"
In javascript, doing
if(/[$-/:-?{-~!"^_`\[\]]/.test( string ))
just works, if any of those characters are in the string, it will give true, else, it will give false. I tried to do the same in PHP, the following way
if(preg_match('/[$-/:-?{-~!"^_`\[\]]/', $string ))
Other regexes work when done this way, but this particular one simply gives false no matter what when run in PHP.
Is there any reason to this? Am I doing something wrong? Does PHP comprehend regexes in a different way? What should I change to make it work?
Thanks for your time.
Since PHP uses PCRE, you will get a pattern error when using / as the delimiter, as seen here http://regex101.com/r/3ILGgE/1
So the delimiter character must be escaped wherever it appears inside the pattern.
Using / as the delimiter, the string is
'/[$-\/:-?{-~!"^_`\[\]]/'
Using ~ as the delimiter, the string is
'~[$-/:-?{-\~!"^_`\[\]]~'
Also, be aware that the class contains a couple of ranges: $-/, :-? and {-~. Each range includes every character between its two endpoints as well. The - inside a range is the range operator, not a literal hyphen (although a hyphen happens to fall inside the $-/ range, so it is matched anyway).
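To see what the ranges actually cover, the sketch below enumerates the printable ASCII characters matched by the class. It is written in JavaScript, where the question says the pattern already works, with the / escaped for safety. The variable names are arbitrary.

```javascript
// Character class from the question, with the slash escaped.
const symbols = /[$-\/:-?{-~!"^_`\[\]]/;

// Collect every printable ASCII character the class matches.
const matched = [];
for (let code = 0x21; code <= 0x7e; code++) {
  const ch = String.fromCharCode(code);
  if (symbols.test(ch)) matched.push(ch);
}
console.log(matched.join(""));

console.log(symbols.test("%"));  // true, % lies between $ and /
console.log(symbols.test("-"));  // true, - also lies between $ and /
console.log(symbols.test("#"));  // false, # sits just before $
console.log(symbols.test("@"));  // false, @ is outside every range
console.log(symbols.test("\\")); // false, backslash is not listed
```

Note that #, @ and the backslash are not covered, which may or may not be intended for a "symbols" check.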

Regex: get string between last character occurence before a comma

I need some help with Regex.
I have this string: \\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam
and want to get the result: ["dolor", "conseteteur", "diam"]. So, in words, the word between the last backslash and a comma or the end.
I've already figured out a working test, but for some reason it won't work in either the Chrome (v44.0.2403.130) or the IE (v11.0.9600.17905) console. There I'm getting the result: ["\loremipsumdolor,", "\sitametconseteteur,", "\sadipscingelitrseddiam"]
Can you please tell me, why the online testers aren't working and how i can achieve the right result?
Thanks in advance.
PS: I've tested a few online regex testers with all the same result. (regex101.com, regexpal.com, debuggex.com, scriptular.com)
The string
'\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam'
is getting escaped, if you try the following in the browser's console you'll see what happens:
var s = '\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam'
console.log(s);
// prints '\loremipsumdolor,\sitametconseteteur,\sadipscingelitrseddiam'
To use your original string you have to add additional backslashes, otherwise it becomes a different one because it tries to escape anything followed by a single backslash.
The reason why it works in regexp testers is because they probably sanitize the input string to make sure it gets evaluated as-is.
Try this (added an extra \ for each of them):
str = '\\\\lorem\\ipsum\\dolor,\\\\sit\\amet\\conseteteur,\\\\sadipscing\\elitr\\sed\\diam'
re = /\\([^\\]*)(?:,|$)/g
str.match(re)
// should output ["\dolor,", "\conseteteur,", "\diam"]
UPDATE
You can't prevent the interpreter from processing backslash escapes in ordinary string literals, but ECMAScript 6 (ES2015) provides String.raw for this:
s = String.raw`\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam`
Remember to use backticks instead of single quotes with String.raw.
It works in the latest Chrome, but I can't speak for all other browsers; moderately old ones probably don't implement it.
Also, if you want to avoid matching the last backslash you need to:
remove the \\ at the start of your regexp
use + instead of * to avoid matching the line end (it will create an extra capture)
use a positive lookahead ?=
like this
s = String.raw`\\lorem\ipsum\dolor,\\sit\amet\conseteteur,\\sadipscing\elitr\sed\diam`;
re = /([^\\]+)(?=,|$)/g;
s.match(re);
// ["dolor", "conseteteur", "diam"]
You may try this,
string.match(/[^\\,]+(?=,|$)/gm);

difference between ruby regex and javascript regex

I made this regular expression: /.net.(\w*)/
I'm trying to capture the qa in a string like this:
https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG
I'm doing .replace on it like so: location.replace(/.net.(\w*)/, data.newName);
But instead of capturing qa, it captures .net when I run the code in JavaScript.
According to this online regex tool made for ruby, it captures qa as intended
http://rubular.com/r/ItrG7BRNRn
What's the difference between Javascript regexes and Ruby regexes, and how can I make my regex work as intended in javascript?
Edit:
I changed my code to this:
var str = `https://xxxxxxxxxx.cloudfront.net/qa/club`;
var re = /\.net\/([^\/]*)\//;
console.log(data2.files[i].location.replace(re,'$1'+ "test"));
And instead of
https://dm7svtk8jb00c.cloudfront.net/test/club
I get this:
https://dm7svtk8jb00c.cloudfrontqatestclub
If I remove the $1 I get https://dm7svtk8jb00c.cloudfronttestclub, which is closer, but I want to keep the slashes.
This would be a better regex:
/\.net\/([^\/]*)\//
Remember that an unescaped . matches any character, not just the period character. To match a literal period you need to escape it with a leading backslash: \.
Also, \w will only match numbers, letters and underscores. You could quite legitimately have a dash in that part of the URL. Therefore you're far better off matching anything that isn't a forward slash.
I am not sure how Ruby works, but JavaScript replace will not just replace the capture group, it replaces the whole matched string. By adding another capture group, you can use $1 to add back in the string you want to keep.
...replace(/(.net.)(\w*)/, "$1" + data.newName);
You have to do that like this:
location.replace(/(\.net.)(\w*)/, '$1' + data.newName)
replace replaces the whole matched substring, not a particular group. Ruby works exactly in the same way:
ruby -e "puts 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'.sub(/.net.(\w*)/, '##')"
https://xxxxxx.cloudfront##/club/Slide1.PNG
ruby -e "puts 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'.sub(/(.net.)(\w*)/, '\\1' + '##')"
https://xxxxxx.cloudfront.net/##/club/Slide1.PNG
There's no difference (at least with the pattern you've provided). In both cases, the expression matches ".net/qa", with qa being the first capture group within the expression. Notice that even in your linked example the entire match is highlighted.
I'd recommend something like this:
location.replace(/(.net.)\w*/, "$1" + data.newName);
Or this, to be a bit safer:
location.replace(/(.net.)\w*/, function(m, a) { return a + data.newName; });
It's not so much a difference between JavaScript and Ruby's implementations of regular expressions, it's your pattern that needs a bit of work. It's not tight enough.
You can use something like /\.net\/([^\/]+)/, which you can see in action at Rubular.
That returns the characters delimited by / following .net.
Regex patterns are very powerful, but they're also fraught with dangerous side-effects that open up big holes easily, causing false-positives, which can ruin results unexpectedly. Until you know them well, start simply, and test them every imaginable way. And, once you think you know them well, keep doing that; Patterns in code we write where I work are a particular hot-button for me, and I'm always finding holes in them in our code-reviews and requiring them to be tightened until they do exactly what the developer meant, not what they thought they meant.
While the pattern above works, I'd probably do it a bit differently in Ruby. Using the tools made for the job:
require 'uri'
URL = 'https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG'
uri = URI.parse(URL)
path = uri.path # => "/qa/club/Slide1.PNG"
path.split('/')[1] # => "qa"
Or, more succinctly:
URI.parse(URL).path.split('/')[1] # => "qa"
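For the JavaScript side of the question, a comparable sketch is shown below. The value of newName stands in for data.newName from the question and is an assumption, as are the variable names. The first part keeps the ".net/" prefix through a capture group while replacing only the first path segment. The second part mirrors the Ruby URI approach with the standard URL class.

```javascript
const url = "https://xxxxxx.cloudfront.net/qa/club/Slide1.PNG";
const newName = "test"; // stand-in for data.newName

// replace() swaps the whole match, so the part to keep is captured and
// put back. A function replacement avoids surprises if newName contains "$".
const replaced = url.replace(/(\.net\/)[^\/]+/, (match, prefix) => prefix + newName);
console.log(replaced); // https://xxxxxx.cloudfront.net/test/club/Slide1.PNG

// Reading the first path segment without a regex.
const firstSegment = new URL(url).pathname.split("/")[1];
console.log(firstSegment); // qa
```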

Invalid regular expression in javascript

I'm trying to find out if a string contains css code with this expression:
var pattern = new RegExp('\s(?[a-zA-Z-]+)\s[:]{1}\s*(?[a-zA-Z0-9\s.#]+)[;]{1}');
But I get "invalid regular expression" error on the line above...
What's wrong with it?
found the regex here: http://www.catswhocode.com/blog/10-regular-expressions-for-efficient-web-development
It's for PHP but it should work in javascript too, right?
What are the ? at the start of the two [a-zA-Z-] blocks for? They look wrong to me.
The ? is unfortunately somewhat overloaded in regexp syntax: it can have three different meanings that I know of, and none of them match what I see in your example.
Also, your \s sequences need the backslash escaped because this is a string, so they should look like \\s. To avoid escaping, just use the /.../ syntax instead of new RegExp("...").
That said, even that is insufficient - the regexp still produces an Invalid Group error in Chrome, probably related to the {1} sequences.
The ?'s are messing it up. I'm not sure what they are for.
/\s[a-zA-Z\-]+\s*:\s*[a-zA-Z0-9\s.#]+;/
worked for me (as far as compiling. I didn't test to see if it properly detected a CSS string).
Replace the quotes with / (slashes):
var pattern = /\s([a-zA-Z-]+)\s[:]{1}\s*([a-zA-Z0-9\s.#]+)[;]{1}/;
You also don't need the new RegExp() part either, which is why it's been removed; instead of using a quote or double quote to denote a string, JavaScript uses a slash / to denote a regular expression, which isn't a normal string.
That regular expression is very bad and I would avoid its source in the future. That said, I cleaned it up a bit and got the following result:
var pattern = /\s(?:[a-zA-Z-]+)\s*:\s*(?:[^;\n\r]+);/;
this matches something that looks like css, for example:
background-color: red;
Here's the fiddle to prove it, though I'd recommend finding a different solution to your problem. This is a very simple regex and it's not safe to say that it is reliable.
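The two separate problems named in this thread can be reproduced in isolation, as in the rough sketch below. The helper isValidPattern and the sample inputs are assumptions for illustration. The last lines exercise the cleaned-up pattern from the answer above.

```javascript
// Problem 1: inside a string literal, "\s" is just the letter "s".
console.log("\s" === "s"); // true

// Hypothetical helper: true if the RegExp constructor accepts the source text.
function isValidPattern(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    return false;
  }
}

// Problem 2: "(?" must be followed by a group type such as ":" or "=",
// so "(?[" is rejected even once the backslashes are doubled.
console.log(isValidPattern("\\s(?[a-zA-Z-]+)"));  // false
console.log(isValidPattern("\\s(?:[a-zA-Z-]+)")); // true

// Cleaned-up pattern from the answer above.
const cssLike = /\s(?:[a-zA-Z-]+)\s*:\s*(?:[^;\n\r]+);/;
console.log(cssLike.test("p { background-color: red; }")); // true
console.log(cssLike.test("just some text"));               // false
```

Keep in mind that the pattern also accepts ordinary prose of the form " note: something;", so it only detects text that looks like a CSS declaration.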
