Range out of order in character class - regex - javascript

I have the following code:
var nameStartChar = /[A-Z_a-z\xC0-\xD6\xD8-\xF6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD]///\u10000-\uEFFFF
var nameChar = new RegExp("[\\-\\.0-9"+nameStartChar.source.slice(1,-1)+"\u00B7\u0300-\u036F\\ux203F-\u2040]");
var tagNamePattern = new RegExp('^'+nameStartChar.source+nameChar.source+'*(?:\:'+nameStartChar.source+nameChar.source+'*)?$');
It throws the following error:
sqmtest I/JS: SyntaxError: Invalid regular expression: /[\-\.0-9A-Z_a-z\xC0-\xD6\xD8-\xF6\u00F8-\u02FF\u0370-\u037D\u037F-\u1FFF\u200C-\u200D\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD????-??\ux203F-???]/: Range out of order in character class
at new RegExp (<anonymous>)
at RegExp (<anonymous>)
at Object.$$_sax (http://com.hashcube.sqtest/modules/devkit-core/src/clientapi/native/dom/sax.js:1:5952)
at I (none:615:5092)
at z (none:615:6690)
at Object.jsio (none:615:7357)
at Object.$$_dom_parser (http://com.hashcube.sqtest/modules/devkit-core/src/clientapi/native/dom/dom_parser.js:1:3511)
at I (none:615:5092)
at z (none:615:6690)
at Object.jsio (none:615:7357)
at Object.$$_dom_DOMParser (http://com.hashcube.sqtest/modules/devkit-core/src/clientapi/native/dom/DOMParser.js:1:66)
at I (none:615:5092)
at z (none:615:6690)
at Object.jsio (none:615:7357)
at Object.$$_common.exports.install (http://com.hashcube.sqtest/modules/devkit-core/src/clientapi
The full code is here -> https://github.com/hashcube/devkit-core/blob/hc/src/clientapi/native/dom/sax.js
Any idea why the regex could be failing?
I have not confirmed this, but it looks like minification could be causing the issue.
Any thoughts or suggestions would help. I can provide more details if needed.
EDIT
I have more information. I looked at the minified code for both of these lines on two machines (one where I get this error and one where I don't). It looks like an encoding issue to me. Any help would be appreciated.
Machine with Error
var nameStartChar=/[A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD]/,
nameChar=RegExp("[\\\\-\\\\.0-9"+nameStartChar.source.slice(1,-1)+"????-??\\\\ux203F-???]"),
tagNamePattern=RegExp("^"+nameStartChar.source+nameChar.source+"*(?::"+nameStartChar.source+nameChar.source+"*)?$"),S_TAG=0,S_ATTR=1,S_ATTR_S=2,S_EQ=3,S_V=4,S_E=5,S_S=6,S_C=7;
Machine without error
var nameStartChar=/[A-Z_a-z\\xC0-\\xD6\\xD8-\\xF6\\u00F8-\\u02FF\\u0370-\\u037D\\u037F-\\u1FFF\\u200C-\\u200D\\u2070-\\u218F\\u2C00-\\u2FEF\\u3001-\\uD7FF\\uF900-\\uFDCF\\uFDF0-\\uFFFD]/,
nameChar=RegExp("[\\\\-\\\\.0-9"+nameStartChar.source.slice(1,-1)+"·�~#-ͯ\\\\ux203F-�~A~#]"),]"),A~#
tagNamePattern=RegExp("^0-9"+nameStartChar.sou+nameChar.source+"*(?::"+nameStartChar.source+nameChar.source+"*)?$"),S_TAG=0,S_ATTR=1,S_ATTR_S=2,S_EEouQ=3,S_V=4,S_E=5,S_S=6,S_C=7;

It looks like the character class contains something that is not a valid range:
????-??\\\\ux203F-??? => ????-??\\ux203F-???
\\ux203F-??? is not a valid range, and that is what causes the problem. The rest looks correct.
The regex engine accepts ·�~#-ͯ\\\\ux203F-�~A~#, so you can use that if it works for you.
See the documentation on character ranges; it may help you restructure your regex.
PS: the "Machine without error" code as pasted fails with "Unexpected token ]".
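A minimal sketch of why the mangled build fails, assuming each non-ASCII byte was replaced by a literal ? during minification (as the "Machine with Error" output suggests). Without the u flag, \ux203F is not a Unicode escape: \u is read as a plain u, so the class ends with the range F-\u2040, which is valid. Once \u2040 turns into ???, that range becomes F-?, and ? (U+003F) sorts below F (U+0046). The helper name compiles is hypothetical, and the last pattern assumes the stray x was a typo for \u203F.

```javascript
// Hypothetical helper: reports whether a pattern string compiles.
function compiles(source) {
  try {
    new RegExp(source);
    return true;
  } catch (e) {
    return false;
  }
}

// Tail of nameChar as written in the source file (including its stray "x").
const intact = "[\u00B7\u0300-\u036F\\ux203F-\u2040]";
// Same tail after every non-ASCII byte was turned into "?".
const mangled = "[????-??\\ux203F-???]";
// Assumption: leaving the escapes for RegExp to decode keeps the file pure ASCII.
const asciiOnly = "[\\u00B7\\u0300-\\u036F\\u203F-\\u2040]";

console.log(compiles(intact));    // true
console.log(compiles(mangled));   // false: "F-?" is out of order
console.log(compiles(asciiOnly)); // true
```

Doubling the backslashes means the minifier never sees a non-ASCII character, so a wrong file encoding has nothing to corrupt.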

Related

Using JavaScript Regex to replace some content I get a Unmatched ')' Regex error

I'm moving a large blog from WordPress to another platform, and I am trying to re-create WordPress shortcodes with Javascript.
I've been able to write the Regex to replace [youtube]'s shortcodes without problems, but I'm having problems with [soundcloud]'s.
The format of the shortcode is the following:
[soundcloud url=”http://api.soundcloud.com/tracks/33660734”]
I created the following regex, which seems to work on Regex 101:
\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]
https://regex101.com/r/JKL45q/2
But when I include it in my script:
{
service: 'soundcloud',
regex: new RegExp('\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]', 'ig'),
},
I get this error, which I can't understand:
main.js:29 Uncaught SyntaxError: Invalid regular expression: /[soundcloud url="http(s?)://api.soundcloud.com/tracks/([a-zA-Z0-9]*)"(s*?)]/: Unmatched ')' (at main.js:29:14)
at new RegExp (<anonymous>)
at main.js:29:14
at main.js:142:3
The source script is here:
https://hotmc.pages.dev/assets/js/main.js
This is a page where the error occurs in the console:
https://hotmc.pages.dev/2011/01/10/esclusivo-anteprima-album-micha-soul-un-brano-in-free-download
Can anyone help? I found other questions about this error, but haven't been able to apply the suggested fixes.
Thank you in advance,
S.
I think you can remove the () of the ([a-zA-Z0-9]*)
new RegExp('[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/[a-zA-Z0-9]*"(\s*?)]', 'ig').test(`[soundcloud url="http://api.soundcloud.com/tracks/33660734"]`)
I was able to get rid of the error by changing how the RegExp object is initialized.
From this:
new RegExp('\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]', 'ig')
To this:
new RegExp(/\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]/, 'ig')
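The reason the second form works can be sketched as follows. In a string literal, JavaScript drops the backslash from sequences such as \[ and \s before RegExp ever sees the text, so the pattern starts with an unescaped [ that opens a character class, and a later ) ends up unmatched. The sample shortcode below uses straight quotes, and the replacement text "track $2" is only an illustration.

```javascript
// String literal from the question: the string parser strips the backslashes.
const asString = '\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]';
console.log(asString.startsWith('[soundcloud')); // true: "\[" became "["

// Doubling each backslash keeps it in the pattern handed to RegExp.
const fromString = new RegExp(
  '\\[soundcloud url="http(s?)://api\\.soundcloud\\.com/tracks/([a-zA-Z0-9]*)"(\\s*?)\\]',
  'ig'
);

// A regex literal needs no doubling, only "\/" because "/" is the delimiter.
const fromLiteral = /\[soundcloud url="http(s?):\/\/api\.soundcloud\.com\/tracks\/([a-zA-Z0-9]*)"(\s*?)\]/ig;

const sample = '[soundcloud url="http://api.soundcloud.com/tracks/33660734"]';
console.log(sample.replace(fromString, 'track $2'));  // track 33660734
console.log(sample.replace(fromLiteral, 'track $2')); // track 33660734
```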

Regex returns nothing to repeat [duplicate]

I'm new to Regex and I'm trying to work it into one of my new projects to see if I can learn it and add it to my repertoire of skills. However, I'm hitting a roadblock here.
I'm trying to see if the user's input has illegal characters in it by using the .search function like so:
if (name.search("[\[\]\?\*\+\|\{\}\\\(\)\#\.\n\r]") != -1) {
...
}
However, when I try to execute the function that contains this line, it throws the following error for that specific line:
Uncaught SyntaxError: Invalid regular expression: /[[]?*+|{}\()#.
]/: Nothing to repeat
I can't for the life of me see what's wrong with my code. Can anyone point me in the right direction?
You need to double the backslashes used to escape the regular expression special characters. However, as @Bohemian points out, most of those backslashes aren't needed. Unfortunately, his answer suffers from the same problem as yours.
The backslash is being interpreted by the code that reads the string, rather than passed to the regular expression parser. You want:
"[\\[\\]?*+|{}\\\\()#.\n\r]"
Note the quadrupled backslash. That is definitely needed. The string passed to the regular expression compiler is then identical to @Bohemian's string, and works correctly.
Building off of @Bohemian, I think the easiest approach would be to just use a regex literal, e.g.:
if (name.search(/[\[\]?*+|{}\\()#.\n\r]/) != -1) {
// ... stuff ...
}
Regex literals are nice because you don't have to escape the escape character, and some IDEs will highlight invalid regex (very helpful for me as I constantly screw them up).
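The difference between the three forms discussed so far can be checked directly. This is a small sketch; the test strings "a*b" and "plain name" are made up for illustration.

```javascript
// The question's string: the string parser consumes most backslashes, so the
// pattern begins with "[[]?*", and the "*" has nothing to repeat.
const broken = "[\[\]\?\*\+\|\{\}\\\(\)\#\.\n\r]";
let message = null;
try {
  new RegExp(broken);
} catch (e) {
  message = e.message;
}
console.log(message !== null); // true: a SyntaxError was thrown

// String form with doubled (and quadrupled) backslashes.
const fixedString = new RegExp("[\\[\\]?*+|{}\\\\()#.\n\r]");
// Equivalent regex literal: no string-level escaping involved.
const fixedLiteral = /[\[\]?*+|{}\\()#.\n\r]/;

console.log("a*b".search(fixedString));         // 1
console.log("a*b".search(fixedLiteral));        // 1
console.log("plain name".search(fixedLiteral)); // -1
```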
For Google travelers: this stupidly unhelpful error message is also presented when you make a typo and double up the + regex operator:
Okay:
\w+
Not okay:
\w++
Firstly, in a character class [...] most characters don't need escaping - they are just literals.
So, your regex should be:
"[\[\]?*+|{}\\()#.\n\r]"
This compiles for me.
In my case, I had to validate a phone number with a regex, and I was getting the same error:
Invalid regular expression: /+923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/: Nothing to repeat
The problem was the + operator right after the opening / of the regex. Enclosing the + in square brackets, [+], fixed it.
The following works:
/[+]923[0-9]{2}-(?!1234567)(?!1111111)(?!7654321)[0-9]{7}/
This answer may help those who hit the same error for the same reason I did. Cheers :)
For example, I hit this in Express (Node.js) when trying to create a route for paths not starting with /internal:
app.get(`\/(?!internal).*`, (req, res)=>{
After a lot of trial and error, it worked once I passed the pattern as a RegExp object created with new RegExp():
app.get(new RegExp("\/(?!internal).*"), (req, res)=>{
This may help if you run into this error in routing.
This can also happen if you begin a regex with ?.
? may function as a quantifier -- so ? may expect something else to come before it, thus the "nothing to repeat" error. Nothing preceded it in the regex string so it didn't get to quantify anything; there was nothing to repeat / nothing to quantify.
? also has another role -- if the ? is preceded by ( it may indicate the beginning of a lookaround assertion or some other special construct. See example below.
If one forgets to write the () parentheses around the following lookbehind assertion ?<=x, this will cause the OP's error:
Incorrect:
const xThenFive = /?<=x5/;
Correct:
const xThenFive = /(?<=x)5/;
This /(?<=x)5/ is a positive lookbehind: we're looking for a 5 that is preceded by an x e.g. it would match the 5 in x563 but not the 5 in x652.
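Both claims can be verified with a short sketch, assuming an engine that supports lookbehind assertions (ES2018 or later). The incorrect pattern is passed through new RegExp here so the error can be caught at run time instead of stopping the script at parse time.

```javascript
// Correct form: a 5 that is immediately preceded by an x.
const xThenFive = /(?<=x)5/;
console.log(xThenFive.exec("x563").index); // 1
console.log(xThenFive.test("x652"));       // false

// Without the parentheses, the leading "?" is a quantifier with nothing before it.
let threw = false;
try {
  new RegExp("?<=x5");
} catch (e) {
  threw = e instanceof SyntaxError;
}
console.log(threw); // true
```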

Javascript string variable unquoted?

I am using the QuickBlox JavaScript API. Looking through their code, I found this line:
var URL_REGEXP = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/gi;
It appears that it has declared a string variable that is a regular expression pattern. Then it goes ahead to use that variable thus:
return str.replace(URL_REGEXP, function(match) {
url = (/^[a-z]+:/i).test(match) ? match : 'http://' + match;
url_text = match;
return '' + escapeHTML(url_text) + '';
});
I am wondering how this is possible. The var declared in the first line should be a string, but it is unquoted. Shouldn't this be a syntax error?
I went ahead and tested this code in my browser, and it works! This means I've got some learning to do here... Can anyone explain how this variable is declared?
Additionally, when I ran the same code on my friend's computer, the Chrome debugger threw a syntax error on the variable declaration line (unexpected token '/'). We are both using Chrome Version 36.0.1985.143 m, but on my computer it all works fine, while on my friend's computer the code stops at the first variable declaration because of the syntax error.
Is there some setting that is different?
Any help would be appreciated.
UPDATE
Thanks for the quick answers. I come from a PHP background, so I thought that all regular expressions had to be initialized as strings :P.
Can anyone reproduce the syntax error I'm getting on my friend's computer? (It still happens after disabling all extensions.) I can't reproduce it on my own machine, and that's what is frustrating me.
UPDATE 2
I have tested on my friend's computer and looked through the source. It appears to be due to an encoding problem (I'm not sure which). The regular expression is shown like this:
var URL_REGEXP = /\b((?:https?:\/\/|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?芦禄鈥溾€濃€樷€橾))/gi;
(The characters at the end of the code appear to be random Chinese characters.)
How can I change the encoding to match his browser/system? (He is running a Simplified Chinese Windows 7 system.)
It is not a String variable. It is a regular expression.
Writing var varname = /pattern/flags;
is equivalent to calling var varname = new RegExp("pattern", "flags");.
You can execute the following in any browser that supports a JavaScript console:
>>> var regex = /(?:[\w-]+\.)+[\w-]+/i
>>> regex.exec("google.com")
... ["google.com"]
>>> regex.exec("www.google.com")
... ["www.google.com"]
>>> regex.exec("ftp://ftp.google.com")
... ["ftp.google.com"]
>>> regex.exec("http://www.google.com")
Can anyone reproduce the syntax error I'm getting on my friend's computer? (It still happens after disabling all extensions.) I can't reproduce it on my own machine, and that's what is frustrating me.
According to RegExp - JavaScript documentation:
Regex literals were present in ECMAScript 1st Edition, implemented in JavaScript 1.1. Use an updated browser.
No, it shouldn't be a syntax error. In Javascript, RegExp objects are not strings, they are a distinct class of objects. /.../modifiers is the syntax for a RegExp literal.
I can't explain the syntax error you got on your friend's computer, it looks fine to me. I pasted it into the Javascript console and it was fine.
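To make the point concrete, here is a small sketch showing that a regex literal produces a RegExp object, not a string, and behaves like the same pattern built with the constructor. It reuses the pattern from the console session above; the variable names are arbitrary.

```javascript
// Literal form: the slashes are delimiters, not quotes.
const literal = /(?:[\w-]+\.)+[\w-]+/i;
// Constructor form: the pattern is a string, so each backslash is doubled.
const constructed = new RegExp("(?:[\\w-]+\\.)+[\\w-]+", "i");

console.log(typeof literal);            // object
console.log(literal instanceof RegExp); // true

console.log(literal.exec("ftp://ftp.google.com")[0]);     // ftp.google.com
console.log(constructed.exec("ftp://ftp.google.com")[0]); // ftp.google.com
```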

javascript regex invalid quantifier error

I have the following javascript code:
if (url.match(/?rows.*?(?=\&)|.*/g)){
urlset= url.replace(/?rows.*?(?=\&)|.*/g,"rows="+document.getElementById('rowcount').value);
}else{
urlset= url+"&rows="+document.getElementById('rowcount').value;
}
I get the error invalid quantifier at the /?rows.*?.... This same regex works when testing it on http://www.pagecolumn.com/tool/regtest.htm using the test string
?srt=acc_pay&showfileCL=yes&shownotaryCL=yes&showclientCL=no&showborrowerCL=yes&shownotaryStatusCL=yes&showclientStatusCL=yes&showbillCL=yes&showfeeCL=yes&showtotalCL=yes&dir=asc&closingDate=12/01/2011&closingDate2=12/31/2011&sort=notaryname&pageno=0&rows=anything&Start=0','bodytable','xyz')
In this string, the above regex is supposed to match:
rows=anything
I actually don't even need the /? to get it to work, but if I don't put that into my javascript, it acts like it's not even regex... I'm terrible with Regex period, so this one has me pretty confused. And that error is the only one I am getting in Firefox's error console.
EDIT
Using that link I posted above, it seems that the leading / tries to match an actual forward slash instead of just marking the code as the beginning of a regex statement. So the ? is in there so that if it doesn't match the / to anything, it continues anyway.
RESOLUTION
Ok, so in the end, I had to change my regex to this:
/rows=.*(?=\&?)/g
This matched the word "rows=" followed by anything until it hit an ampersand or ran out of text.
You need to escape the first ?, since it has special meaning in a regex.
/\?rows.*?(?=\&)|.*/g
// ^---escaped
regtest.htm produces
new RegExp("?rows.*?(?=\&)|.*", "") returned a SyntaxError: invalid quantifier
The value you put into the web site shouldn't have the / delimiters on the regex, so put in ?rows.*?(?=\&)|.* and it shows the same problem. Your JavaScript code should look like
re = /rows.*?(?=\&)|.*/g;
or similar (but that is a pointless regex as it matches everything). If you can't fix it, please describe what you want to match and show your JavaScript.
You might consider refactoring your code to look something like this:
var url = "sort=notaryname&pageno=0&rows=anything&Start=0"
var rowCount = "foobar";
if (/[\?\&]rows=/.test(url))
{
url = url.replace(/([\?\&]rows=)[^\&]+/g,"$1"+rowCount);
}
console.log(url);
Output
sort=notaryname&pageno=0&rows=foobar&Start=0
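Building on that refactoring, the replace-or-append logic from the question could be wrapped in one function. This is only a sketch: the name setRows is hypothetical, and it assumes the URL already has a query string when the parameter has to be appended, as the question's else branch does. A replacer function is used instead of "$1" + rowCount so that a value containing $ is not read as a replacement pattern.

```javascript
// Hypothetical helper: set the rows parameter, replacing it if present.
function setRows(url, rowCount) {
  // Capture the "?rows=" or "&rows=" prefix, then match the old value.
  const pattern = /([?&]rows=)[^&]*/;
  if (pattern.test(url)) {
    // The replacer keeps the captured prefix and swaps in the new value.
    return url.replace(pattern, (match, prefix) => prefix + rowCount);
  }
  // Parameter missing: append it, mirroring the question's else branch.
  return url + "&rows=" + rowCount;
}

console.log(setRows("sort=notaryname&pageno=0&rows=anything&Start=0", "foobar"));
// sort=notaryname&pageno=0&rows=foobar&Start=0
console.log(setRows("?srt=acc_pay&dir=asc", 25));
// ?srt=acc_pay&dir=asc&rows=25
```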

javascript plain text url parsing

I'm trying to search plain old strings for URLs that begin with http, but none of the regexes I find seem to work in JavaScript, nor can I find an example of this in JavaScript.
This is the one I'm trying to use from here and here:
var test = /\b(?:(?:https?|ftp|file)://www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]/;
But when I try to run it, I get "Unexpected token |" errors.
A comment doesn't seem to be enough, and a full answer is hard to find, so here is the whole corrected regexp (tested, it works):
var test = /\b(?:(?:https?|ftp|file):\/\/www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i;
The i on the end means 'ignore case', so it is necessary for this regexp.
You're using / as your regex delimiter, and are also using / within the regex (before www), so the regex actually terminates after the first / before www. Change it to:
var test = /\b(?:(?:https?|ftp|file):\/\/www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]/;
                                     ^^^^ escape here
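As a sketch of the two ways to avoid the early-termination problem: either escape the slashes in a literal, or build the pattern with the RegExp constructor, where / is not a delimiter but every backslash has to be doubled. The sample text is made up, and the i flag is included so lowercase URLs match.

```javascript
// Literal form: every "/" that belongs to the pattern is written as "\/".
const fromLiteral = /\b(?:(?:https?|ftp|file):\/\/www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i;

// Constructor form: "/" needs no escape, but each backslash is doubled.
const fromString = new RegExp(
  "\\b(?:(?:https?|ftp|file)://www\\.|ftp\\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$]",
  "i"
);

// Hypothetical sample text, only for illustration.
const text = "Docs are at http://www.example.com/guide?page=2, see you there.";
console.log(text.match(fromLiteral)[0]); // http://www.example.com/guide?page=2
console.log(text.match(fromString)[0]);  // http://www.example.com/guide?page=2
```

The trailing comma is left out of the match because the final character class excludes punctuation such as , and . at the end of a URL.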
