Regex validation rules - javascript

I'm writing a database backup function as part of my school project.
I need to write a regex rule so the database backup name can only contain legal characters.
By 'legal' I mean a string that doesn't contain ANY symbols or spaces. Only letters from the alphabet and numbers.
An example of a valid string would be '31Jan2012' or '63927jkdfjsdbjk623' or 'hello123backup'.
Here's my JS code so far:
// Check if the input box contains only the characters a-z, A-Z, or 0-9 with a regular expression.
function checkIfContainsNumbersOrCharacters(elem, errorMessage){
var regexRule = new RegExp("^[\w]+$");
if(regexRule.test( $(elem).val() ) ){
return true;
}else{
alert(errorMessage);
return false;
}
}
//call the function
checkIfContainsNumbersOrCharacters("#backup-name", "Input can only contain the characters a-z or 0-9.");
I've never really used regular expressions before, but after a quick bit of googling I found this tool, from which I wrote the following regex rule:
^[\w]+$
^ = start of string
[\w] = a-z/A-Z/0-9
'+' = characters after the string.
When I run my function, whatever string I input seems to return false :( Is my code wrong? Or am I not using regex rules correctly?

The problem is that "\w" inside a string literal is itself an escape sequence: the string parser drops the backslash, so the resulting regular expression is ^[w]+$, which contains w as a literal character. When you pass a string to the RegExp constructor, you need to escape the backslash, like so: new RegExp("^[\\w]+$"), which creates the regex you want.
You can avoid the double escaping altogether with JavaScript's regex literal notation: var regex = /^[\w]+$/; which needs no extra escaping.
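To make the difference visible, here is a small sketch that builds the same pattern three ways and prints what each regex actually contains. The variable names are only illustrative.

```javascript
// "\w" in a string literal collapses to "w", so this pattern loses its backslash.
var broken = new RegExp("^[\w]+$");
// Doubling the backslash keeps it in the string that RegExp receives.
var fixedString = new RegExp("^[\\w]+$");
// A regex literal needs no extra escaping.
var literal = /^[\w]+$/;

console.log(broken.source);      // ^[w]+$
console.log(fixedString.source); // ^[\w]+$

console.log(broken.test("hello123backup"));      // false
console.log(fixedString.test("hello123backup")); // true
console.log(literal.test("hello123backup"));     // true
```

Printing .source is a quick way to check what a constructor-built regex really looks like.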

It can be simpler. This works:
function checkValid(name) {
return /^\w+$/.test(name);
}
/^\w+$/ is the literal notation for new RegExp(). Since the .test function returns a boolean, you only need to return its result. This also reads better than new RegExp("^\\w+$"), and you're less likely to goof up (thanks @x3ro for pointing out the need for two backslashes in strings).

In JavaScript, \w is shorthand for [A-Za-z0-9_], so it matches a single letter, digit, or underscore. Note that the underscore is a symbol you said you don't want. (In some other regex flavors, \w or the POSIX class [[:alnum:]] can also match characters outside ASCII, which may or may not be what you want.) If what you really intend to match is [0-9A-Za-z], then that's what you should use.
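As a quick check of how \w behaves in JavaScript specifically, the sketch below compares it with an explicit character class. The sample names are made up for illustration.

```javascript
var wordChars = /^\w+$/;            // letters, digits, and underscore
var alnumOnly = /^[0-9A-Za-z]+$/;   // letters and digits only

console.log(wordChars.test("backup_2012")); // true  (underscore is a \w character)
console.log(alnumOnly.test("backup_2012")); // false (underscore rejected)
console.log(alnumOnly.test("31Jan2012"));   // true
console.log(alnumOnly.test("my backup"));   // false (space rejected)
```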

When you declare the regex as a string parameter to the RegExp constructor, you need to escape it. Both
var regexRule = new RegExp("^[\\w]+$");
...and...
var regexRule = new RegExp(/^[\w]+$/);
will work.
Keep in mind, though, that client-side validation of database data is never enough: it is easily bypassed by disabling JavaScript in the browser, and invalid or malicious data can then reach your DB. You need to validate the data on the server side as well. Validating on the client is still good practice, because it prevents requests with invalid data from being sent in the first place.
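If the server happens to run JavaScript too (an assumption, since the question does not say what the backend is), the same rule could be re-checked there with something like the sketch below. The function name validateBackupName and the 64-character limit are hypothetical choices, not part of the original question.

```javascript
// Hypothetical server-side check: never trust that the browser ran the validation.
function validateBackupName(name) {
  // Request parameters may be missing or not strings at all.
  if (typeof name !== "string") {
    return false;
  }
  // Assumed upper bound on the name length; adjust to your own requirements.
  if (name.length === 0 || name.length > 64) {
    return false;
  }
  // Letters and digits only, as described in the question.
  return /^[0-9A-Za-z]+$/.test(name);
}

console.log(validateBackupName("31Jan2012"));        // true
console.log(validateBackupName("../etc/passwd"));    // false
console.log(validateBackupName(undefined));          // false
```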

This is the official spec: http://dev.mysql.com/doc/refman/5.0/en/identifiers.html but it's not very easily converted to a regular expression. Just a regular expression won't do it as there are also reserved words.
Why not just put it in the query (don't forget to escape it properly) and let MySQL give you an error? There might for instance be a bug in the MySQL version you're using, and even though your check is correct, MySQL might still refuse.

Related

Jquery validate free emails using validator

I use the jQuery validator for validation. I have 50 free email domains (like gmail.com, yahoo.com) that I need to validate against, so I stored them all in an array. In my code below I use a regular expression and pass a variable into it, but it doesn't work for me. It throws this error: SyntaxError: invalid range in character class
My code
$.validator.addMethod('nofreeemail', function (value) {
var emails = ["gmail.com","yahoo.com","hotmail.com"]
$.each(emails,function(i, val){
console.log("email", val)
var regex = new RegExp("/^([\w-.]+@(?!"+val+")([\w-]+.)+[\w-]{2,4})?$/");
console.log("regex", regex)
return regex.test(value);
});
}, 'Free email addresses are not allowed.');
I will post an answer since it is not evident here what is going on, but the underlying reasons are quite common.
You are using a constructor notation to define the regex. It is a correct approach when you need to build a pattern dynamically using a variable. However, a literal backslash must be written as "\\". All single backslashes are removed. Thus, you get an error since [\w-.] turns into [w-.] and it is an invalid character class. Also, the regex delimiters (those /..../ around the pattern) should never be used in the constructor notation unless you really need to match a string enclosed with /.
Besides, your emails contain non-word chars, and you need to escape them.
Use
var regex = new RegExp("^([\\w-.]+@(?!"+val.replace(/[-\/\\^$*+?.()|[\]{}]/g,'\\$&')+")([\\w-]+\\.)+[\\w-]{2,4})?$");
I also believe the dot in ([\w-]+.)+ must be escaped, since it is supposed to match a literal dot.
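Putting the pieces together, the check could be sketched as a plain function, independent of the validator plugin. The names escapeRegExp and isAllowedEmail are hypothetical, and the sketch assumes the separator in the pattern is meant to be @. Note also that a return inside a $.each callback does not return from the enclosing validator method, so the result has to be collected some other way; Array.prototype.every is used here.

```javascript
// Escape regex metacharacters so a domain like "gmail.com" is matched literally.
function escapeRegExp(text) {
  return text.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
}

// Returns true when the value passes the pattern for every listed free domain.
function isAllowedEmail(value, freeDomains) {
  return freeDomains.every(function (domain) {
    var regex = new RegExp(
      "^([\\w-.]+@(?!" + escapeRegExp(domain) + ")([\\w-]+\\.)+[\\w-]{2,4})?$"
    );
    return regex.test(value);
  });
}

var emails = ["gmail.com", "yahoo.com", "hotmail.com"];
console.log(isAllowedEmail("bob@gmail.com", emails));   // false
console.log(isAllowedEmail("bob@example.org", emails)); // true
```

Inside the plugin, the body of the addMethod callback could then simply return the result of such a function.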

Remove slashes from string using RegEx in JavaScript

I am trying to remove all special characters except punctuation from a customer complaint textarea using this code:
var tmp = complaint;
complaint = new RegExp(tmp.replace(/[^a-zA-Z,.!?\d\s:]/gi, ''));
but it keeps placing "/" in front, and in back of the string after sanitizing.
Example:
Hi, I h#ve a% probl&em wit#h (one) of your products.
Comes out like this
/Hi, I have a problem with one of your products./
I want
Hi, I have a problem with one of your products.
Thanks in advance for any help given.
The variable complaint is converted to a regular expression because you use the RegExp() constructor.
This probably isn't what you want. (I assume you want complaint to be a string).
Strings and regular expressions are two completely different data types.
Your output demonstrates how JavaScript displays regular expressions (surrounded by / characters).
If you want a string, don't create a regular expression (i.e. remove the RegExp constructor).
In other words:
complaint = complaint.replace(/[^a-zA-Z,.!?\d\s:]/gi, '');
You don't need the RegExp constructor:
complaint = tmp.replace(/[^a-zA-Z,.!?\d\s:]/gi, '');
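The difference between the two results can be seen in a short sketch. The sample text is a slightly adapted version of the one in the question, and the variable names are illustrative.

```javascript
var raw = "Hi, I have a% probl&em wit#h (one) of your products.";

// Wrapping the cleaned text in RegExp() produces a RegExp object, not a string.
var asRegExp = new RegExp(raw.replace(/[^a-zA-Z,.!?\d\s:]/g, ''));
// Without the constructor, replace() simply returns the cleaned string.
var asString = raw.replace(/[^a-zA-Z,.!?\d\s:]/g, '');

console.log(typeof asString);   // string
console.log(asString);          // Hi, I have a problem with one of your products.
console.log(asRegExp instanceof RegExp); // true
console.log(String(asRegExp));  // /Hi, I have a problem with one of your products./
```

The slashes appear only because a RegExp is converted to its literal form when it is turned into text.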

How do I replace a double-quote with an escape-char double-quote in a string using JavaScript?

Say I have a string variable (var str) as follows-
Dude, he totally said that "You Rock!"
Now If I'm to make it look like as follows-
Dude, he totally said that \"You Rock!\"
How do I accomplish this using the JavaScript replace() function?
str.replace("\"","\\""); is not working so well. It gives unterminated string literal error.
Now, if the above sentence were to be stored in a SQL database, say in MySQL as a LONGTEXT (or any other VARCHAR-ish) datatype, what other string optimizations do I need to perform?
Quotes and commas are not very friendly with query strings. I'd appreciate a few suggestions on that matter as well.
You need to use a global regular expression for this. Try it this way:
str.replace(/"/g, '\\"');
Check out regex syntax and options for the replace function in Using Regular Expressions with JavaScript.
Try this:
str.replace("\"", "\\\""); // (Escape backslashes and embedded double-quotes)
Or, use single-quotes to quote your search and replace strings:
str.replace('"', '\\"'); // (Still need to escape the backslash)
As pointed out by helmus, if the first parameter passed to .replace() is a string it will only replace the first occurrence. To replace globally, you have to pass a regex with the g (global) flag:
str.replace(/"/g, "\\\"");
// or
str.replace(/"/g, '\\"');
But why are you even doing this in JavaScript? It's OK to use these escape characters if you have a string literal like:
var str = "Dude, he totally said that \"You Rock!\"";
But this is necessary only in a string literal. That is, if your JavaScript variable is set to a value that a user typed in a form field, you don't need to do this escaping.
Regarding your question about storing such a string in an SQL database, again you only need to escape the characters if you're embedding a string literal in your SQL statement - and remember that the escape characters that apply in SQL aren't (usually) the same as for JavaScript. You'd do any SQL-related escaping server-side.
The other answers will work for most strings, but you can end up unescaping an already escaped double quote, which is probably not what you want.
To work correctly, you are going to need to escape all backslashes and then escape all double quotes, like this:
var test_str = '"first \\" middle \\" last "';
var result = test_str.replace(/\\/g, '\\\\').replace(/\"/g, '\\"');
Depending on how you need to use the string, and on the other escaped characters involved, this may still have some issues, but I think it will work in most cases.
var str = 'Dude, he totally said that "You Rock!"';
var var1 = str.replace(/\"/g,"\\\"");
alert(var1);
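The answers above can be compared side by side. This is a minimal sketch; the helper name escapeForLiteral is hypothetical.

```javascript
var str = 'Dude, he totally said that "You Rock!"';

// A string pattern replaces only the first occurrence.
var firstOnly = str.replace('"', '\\"');
// A regex with the g flag replaces every occurrence.
var allQuotes = str.replace(/"/g, '\\"');

// Escape backslashes first, then quotes, so existing escapes are not broken.
function escapeForLiteral(text) {
  return text.replace(/\\/g, '\\\\').replace(/"/g, '\\"');
}

console.log(firstOnly); // Dude, he totally said that \"You Rock!"
console.log(allQuotes); // Dude, he totally said that \"You Rock!\"
console.log(escapeForLiteral('a \\" b')); // a \\\" b
```

The order of the two replace calls in escapeForLiteral matters: escaping quotes first would add backslashes that the second pass would then double.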

Remove a long dash from a string in JavaScript?

I've come across an error in my web app that I'm not sure how to fix.
Text boxes are sending me the long dash as part of their content (you know, the special long dash that MS Word automatically inserts sometimes). However, I can't find a way to replace it; since if I try to copy that character and put it into a JavaScript str.replace statement, it doesn't render right and it breaks the script.
How can I fix this?
The specific character that's killing it is —.
Also, if it helps, I'm passing the value as a GET parameter, and then encoding it in XML and sending it to a server.
This code might help:
text = text.replace(/\u2013|\u2014/g, "-");
It replaces all en dashes (–) and em dashes (—) with simple hyphens (-).
DEMO: http://jsfiddle.net/F953H/
That character is called an em dash. You can remove it like so:
str = str.replace(/\u2014/g, '');
Here is an example Fiddle: http://jsfiddle.net/x67Ph/
The \u2014 is called a Unicode escape sequence. These allow you to specify a Unicode character by its code point. U+2014 happens to be the em dash.
There are three unicode long-ish dashes you need to worry about: http://en.wikipedia.org/wiki/Dash
You can replace unicode characters directly by using the unicode escape:
'—my string'.replace( /[\u2012\u2013\u2014\u2015]/g, '' )
There may be more characters behaving like this, and you may want to reuse them in HTML later. A more generic way to deal with it could be to replace all 'extended characters' with their HTML-encoded equivalents. You could do that like this:
[yourstring].replace(/[\u0080-\uC350]/g,
function(a) {
return '&#'+a.charCodeAt(0)+';';
}
);
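Wrapped in a function, the entity-encoding idea above might look like the sketch below. The function name encodeExtended is hypothetical, and the character range is taken unchanged from the answer.

```javascript
// Replace each character in the given range with a numeric HTML entity.
function encodeExtended(text) {
  return text.replace(/[\u0080-\uC350]/g, function (ch) {
    return '&#' + ch.charCodeAt(0) + ';';
  });
}

console.log(encodeExtended("caf\u00E9 \u2014 ok")); // caf&#233; &#8212; ok
console.log(encodeExtended("plain ASCII"));          // plain ASCII
```

Plain ASCII text passes through untouched, so the function can be applied to every field before the value is placed into XML or HTML.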
With the ECMAScript 2018 standard, JavaScript RegExp now supports Unicode property (or, category) classes. One of them, \p{Dash}, matches any Unicode character points that are dashes:
/\p{Dash}/gu
In ES5, the equivalent expression is:
/[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD/g
See the Unicode Utilities reference.
Here are some JavaScript examples:
const text = "Dashes: \uFF0D\uFE63\u058A\u1400\u1806\u2010-\u2013\uFE32\u2014\uFE58\uFE31\u2015\u2E3A\u2E3B\u2053\u2E17\u2E40\u2E5D\u301C\u30A0\u2E1A\u05BE\u2212\u207B\u208B\u3030𐺭";
const es5_dash_regex = /[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD/g;
console.log(text.replace(es5_dash_regex, '-')); // Normalize each dash to ASCII hyphen
// => Dashes: ----------------------------
To match one or more dashes and replace with a single char (or remove in one go):
/\p{Dash}+/gu
/(?:[-\u058A\u05BE\u1400\u1806\u2010-\u2015\u2053\u207B\u208B\u2212\u2E17\u2E1A\u2E3A\u2E3B\u2E40\u2E5D\u301C\u3030\u30A0\uFE31\uFE32\uFE58\uFE63\uFF0D]|\uD803\uDEAD)+/g
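For environments that support ES2018, a short sketch of the \p{Dash} variants in use follows. The sample strings are made up for illustration.

```javascript
// Each dash-like character becomes an ASCII hyphen.
const input = "2010\u20132012 \u2014 draft\u2012copy";
const normalized = input.replace(/\p{Dash}/gu, "-");

// A run of dashes collapses into a single hyphen.
const collapsed = "a\u2014\u2014\u2013b".replace(/\p{Dash}+/gu, "-");

console.log(normalized); // 2010-2012 - draft-copy
console.log(collapsed);  // a-b
```

The u flag is required: without it, \p is not treated as a Unicode property escape.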

Javascript regexp lets undesirable characters

I'm using a RegExp in my project, but somehow I'm getting some undesirable characters.
My RegExp looks like this:
new RegExp("[א-ת,A-z,',','(',')','.','-',''']");
It is supposed to reject characters like \ or [],
but let me use one or more of (, ), -, letters, etc.
Unfortunately, that doesn't happen.
Which pattern allows the desirable characters and rejects the undesirable ones?
Thanks for your help.
Well your regular expression just says to match one "good" character (and incorrectly at that).
I think something closer to this would be what you want, though I'm not sure about the higher-page Unicode characters:
var regexp = /^[א-תA-Za-z,()\-']*$/;
If the alefbet part doesn't work (it looks backwards to me, but I guess that's kind of a conundrum :-), try:
var regexp = /^[\u05D0-\u05EAA-Za-z,()\-']*$/;
Might be good to tack an "i" (ignore case) modifier on the end too:
var regexp = /^[\u05D0-\u05EAA-Za-z,()\-']*$/i;
This also does not handle the various diacritical marks; I don't know if you need those matched or not.
First of all, you don't need all those single quotes and commas. Second, you want A-Za-z, not A-z. The latter also includes the ASCII characters between "Z" and "a" (such as [, \, and ]).
var re = new RegExp("[א-תA-Za-z,().'\\s-]");
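To check a whole string rather than a single character, the class needs anchors and a quantifier, as the first answer suggests. Here is a minimal sketch, assuming whitespace should be allowed and using \u05D0-\u05EA (alef to tav) for the Hebrew letters; the variable names are illustrative.

```javascript
// Anchored: every character in the string must come from the class.
var allowed = /^[\u05D0-\u05EAA-Za-z,().'\s-]+$/;

console.log(allowed.test("\u05E9\u05DC\u05D5\u05DD (hello)")); // true
console.log(allowed.test("bad\\slash"));                       // false
console.log(allowed.test("[x]"));                              // false

// A-z spans the punctuation between "Z" and "a", so it lets [ and \ through.
var tooWide = /^[A-z]+$/;
console.log(tooWide.test("[\\]")); // true
```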
