Why doesn't this particular regex work in JavaScript? - javascript

I have this regex in JavaScript:
var myString="aaa#aaa.com";
var mailValidator = new RegExp("\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*");
if (!mailValidator.test(myString))
{
alert("incorrect");
}
but it shouldn't alert "incorrect" with aaa#aaa.com.
It should return "incorrect" for aaaaaa.com instead (as example).
Where am I wrong?

When you create a regex from a string, remember that the JavaScript string parser processes backslash escapes before the RegExp() constructor ever sees the text.
Because \w is not a recognized string escape, the backslash is simply dropped, so by the time RegExp() gets to work every \w token has already become a plain "w". You can either double the backslashes so the string parser leaves one behind, or use the native regex literal syntax instead.
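A quick way to see this, as a minimal sketch (the variable names are illustrative and not from the question), is to inspect the string before it reaches RegExp():

```javascript
// "\w" is not a recognized string escape, so the backslash is dropped.
const singleEscaped = "\w+";
const doubleEscaped = "\\w+";

console.log(singleEscaped);      // w+
console.log(doubleEscaped);      // \w+

// The regex built from each string differs accordingly.
const fromSingle = new RegExp(singleEscaped);
const fromDouble = new RegExp(doubleEscaped);
console.log(fromSingle.source);  // w+
console.log(fromDouble.source);  // \w+
console.log(fromDouble.source === /\w+/.source);  // true
```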

It works if you do this:
var mailValidator = /\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*/;
The problem with yours is that the pattern is inside a string, so each backslash has to be escaped with a second one, like "\\w+([-+.]\\w+)*...etc
Here's a link that explains it (in the "How to Use The JavaScript RegExp Object" section).

Try var mailValidator = new RegExp("\\w+([-+.]\\w+)*#\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*");
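To check the fixes against the two inputs from the question, here is a small sketch (it assumes the same pattern; note that the pattern is unanchored, so it only requires a matching substring):

```javascript
// Original attempt: single backslashes inside a string literal.
const broken = new RegExp("\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*");

// The two suggested fixes.
const fromString = new RegExp("\\w+([-+.]\\w+)*#\\w+([-.]\\w+)*\\.\\w+([-.]\\w+)*");
const fromLiteral = /\w+([-+.]\w+)*#\w+([-.]\w+)*\.\w+([-.]\w+)*/;

console.log(fromString.source === fromLiteral.source);  // true

console.log(broken.test("aaa#aaa.com"));      // false (it looks for literal "w" characters)
console.log(fromString.test("aaa#aaa.com"));  // true
console.log(fromString.test("aaaaaa.com"));   // false (no "#")
```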

Related

Regex substitution '$1' - why is it a string?

Silly question, but I'll ask it anyway: Why is the substitution part of a regular expression in JavaScript encompassed in quotes as a string, where it seems to be a variable in its own right? eg '$2'
alert(reg("banana split")) // nana split
function reg(afilename)
{
var rexp = new RegExp(/^(ba)(.+)/gim)
var newName = afilename.replace(rexp, '$2')
return newName
}
Because it's not a [Javascript] variable in its own right.
If you didn't single-quote it, JavaScript would try to pass the value of the variable $2 as an argument (yes, you can give JavaScript variables names starting with $), except you don't have one.
This way, the Regex engine gets the actual, literal string $2, and gives it its own special meaning.
It's a perfect example of abstraction, where you can witness two "layers" of software interacting. Consider also document.write('<p>Text</p>'); — you wouldn't want JavaScript to try to parse that HTML, right? You want to pass it verbatim to the entity that is going to handle it.
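As a small sketch of that distinction (the replacer function is an alternative that does not appear in the question), the replacement argument is an ordinary string that replace() interprets, and a function can be passed instead when real JavaScript values are wanted:

```javascript
const rexp = /^(ba)(.+)/i;

// "$2" is a plain string; replace() itself expands it to the second group.
const viaPattern = "banana split".replace(rexp, "$2");
console.log(viaPattern);   // nana split

// "$$" is the escape for a literal dollar sign in a replacement pattern.
const literalDollar = "banana split".replace(rexp, "$$2");
console.log(literalDollar);  // $2

// A replacer function receives the groups as real JavaScript arguments.
const viaFunction = "banana split".replace(rexp, (match, p1, p2) => p2.toUpperCase());
console.log(viaFunction);  // NANA SPLIT
```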

How do I replace a double-quote with an escape-char double-quote in a string using JavaScript?

Say I have a string variable (var str) as follows-
Dude, he totally said that "You Rock!"
Now, I want to make it look like this:
Dude, he totally said that \"You Rock!\"
How do I accomplish this using the JavaScript replace() function?
str.replace("\"","\\""); is not working so well. It gives unterminated string literal error.
Now, if the above sentence were to be stored in a SQL database, say in MySQL as a LONGTEXT (or any other VARCHAR-ish) datatype, what other string optimizations do I need to perform?
Quotes and commas are not very friendly with query strings. I'd appreciate a few suggestions on that matter as well.
You need to use a global regular expression for this. Try it this way:
str.replace(/"/g, '\\"');
Check out regex syntax and options for the replace function in Using Regular Expressions with JavaScript.
Try this:
str.replace("\"", "\\\""); // (Escape backslashes and embedded double-quotes)
Or, use single-quotes to quote your search and replace strings:
str.replace('"', '\\"'); // (Still need to escape the backslash)
As pointed out by helmus, if the first parameter passed to .replace() is a string it will only replace the first occurrence. To replace globally, you have to pass a regex with the g (global) flag:
str.replace(/"/g, "\\\"");
// or
str.replace(/"/g, '\\"');
But why are you even doing this in JavaScript? It's OK to use these escape characters if you have a string literal like:
var str = "Dude, he totally said that \"You Rock!\"";
But this is necessary only in a string literal. That is, if your JavaScript variable is set to a value that a user typed in a form field, you don't need to do this escaping.
Regarding your question about storing such a string in an SQL database, again you only need to escape the characters if you're embedding a string literal in your SQL statement - and remember that the escape characters that apply in SQL aren't (usually) the same as for JavaScript. You'd do any SQL-related escaping server-side.
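The difference between a string pattern and a global regex in replace() can be seen in a short sketch, using the sample sentence from the question:

```javascript
const str = 'Dude, he totally said that "You Rock!"';

// String pattern: only the first double quote is replaced.
const firstOnly = str.replace('"', '\\"');
console.log(firstOnly);  // Dude, he totally said that \"You Rock!"

// Regex with the g flag: every double quote is replaced.
const all = str.replace(/"/g, '\\"');
console.log(all);        // Dude, he totally said that \"You Rock!\"

// replace() returns a new string; str itself is unchanged.
console.log(str);        // Dude, he totally said that "You Rock!"
```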
The other answers will work for most strings, but you can end up unescaping an already escaped double quote, which is probably not what you want.
To work correctly, you are going to need to escape all backslashes and then escape all double quotes, like this:
var test_str = '"first \\" middle \\" last "';
var result = test_str.replace(/\\/g, '\\\\').replace(/\"/g, '\\"');
Depending on how you need to use the string, and the other escaped characters involved, this may still have some issues, but I think it will probably work in most cases.
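Why the order matters can be sketched with a round trip. The helper name below is hypothetical, and JSON.parse is used only as a convenient way to unescape the result, which is an assumption that holds for strings containing just backslashes and double quotes as special characters:

```javascript
// Hypothetical helper: escapes backslashes first, then double quotes.
function escapeForDoubleQuotes(s) {
  return s.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
}

// The input already contains an escaped quote; its characters are: a \" b "c"
const original = 'a \\" b "c"';

const escaped = escapeForDoubleQuotes(original);
console.log(escaped);  // a \\\" b \"c\"

// Wrapping the result in quotes and parsing it recovers the input.
const roundTrip = JSON.parse('"' + escaped + '"');
console.log(roundTrip === original);  // true

// Escaping only the quotes produces text that no longer parses back.
const naive = original.replace(/"/g, '\\"');
let naiveParses = true;
try {
  JSON.parse('"' + naive + '"');
} catch (e) {
  naiveParses = false;
}
console.log(naiveParses);  // false
```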
var str = 'Dude, he totally said that "You Rock!"';
var var1 = str.replace(/\"/g,"\\\"");
alert(var1);

Regex validation rules

I'm writing a database backup function as part of my school project.
I need to write a regex rule so the database backup name can only contain legal characters.
By 'legal' I mean a string that doesn't contain ANY symbols or spaces. Only letters from the alphabet and numbers.
An example of a valid string would be '31Jan2012' or '63927jkdfjsdbjk623' or 'hello123backup'.
Here's my JS code so far:
// Check if the input box contains only the characters a-z, A-Z, or 0-9 with a regular expression.
function checkIfContainsNumbersOrCharacters(elem, errorMessage){
var regexRule = new RegExp("^[\w]+$");
if(regexRule.test( $(elem).val() ) ){
return true;
}else{
alert(errorMessage);
return false;
}
}
//call the function
checkIfContainsNumbersOrCharacters("#backup-name", "Input can only contain the characters a-z or 0-9.");
I've never really used regular expressions before though, however after a quick bit of googling i found this tool, from which I wrote the following regex rule:
^[\w]+$
^ = start of string
[\w] = a-z/A-Z/0-9
'+' = characters after the string.
When I run my function, whatever string I input seems to return false :( Is my code wrong, or am I not using regex rules correctly?
The problem here is, that when writing \w inside a string, you escape the w, and the resulting regular expression looks like this: ^[w]+$, containing the w as a literal character. When creating a regular expression with a string argument passed to the RegExp constructor, you need to escape the backslash, like so: new RegExp("^[\\w]+$"), which will create the regex you want.
There is a way to avoid that, using the shorthand notation provided by JavaScript: var regex = /^[\w]+$/; which does not need any extra escaping.
It can be simpler. This works:
function checkValid(name) {
return /^\w+$/.test(name);
}
/^\w+$/ is the literal notation for new RegExp(). Since the .test function returns a boolean, you only need to return its result. This also reads better than new RegExp("^\\w+$"), and you're less likely to goof up (thanks @x3ro for pointing out the need for two backslashes in strings).
In JavaScript, \w is equivalent to [A-Za-z0-9_], so it matches a single letter, digit, or underscore. Note that the underscore is included, which may or may not be what you want, and that JavaScript does not support POSIX classes such as [[:alnum:]]. If what you really intend to match is [0-9A-Za-z], then that's what you should use.
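A quick check of what \w accepts in JavaScript, compared with an explicit [0-9A-Za-z] class (a minimal sketch; the variable names and the underscore sample are illustrative):

```javascript
const wordChars = /^\w+$/;
const alnumOnly = /^[0-9A-Za-z]+$/;

console.log(wordChars.test("hello123backup"));  // true
console.log(alnumOnly.test("hello123backup"));  // true

// \w also accepts the underscore, which the question counts as a symbol.
console.log(wordChars.test("my_backup"));       // true
console.log(alnumOnly.test("my_backup"));       // false

// Both reject spaces and punctuation.
console.log(wordChars.test("31 Jan 2012"));     // false
console.log(alnumOnly.test("31-Jan-2012"));     // false
```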
When you declare the regex as a string parameter to the RegExp constructor, you need to escape it. Both
var regexRule = new RegExp("^[\\w]+$");
...and...
var regexRule = new RegExp(/^[\w]+$/);
will work.
Keep in mind, though, that client-side validation for database data will never be enough: it is easily bypassed by disabling JavaScript in the browser, so invalid or malicious data can still reach your DB. You need to validate the data on the server side. Validating on the client side as well is still good practice, because it stops most requests with invalid data from being sent in the first place.
This is the official spec: http://dev.mysql.com/doc/refman/5.0/en/identifiers.html but it's not very easily converted to a regular expression. Just a regular expression won't do it as there are also reserved words.
Why not just put it in the query (don't forget to escape it properly) and let MySQL give you an error? There might for instance be a bug in the MySQL version you're using, and even though your check is correct, MySQL might still refuse.

Javascript .test either returns nothing or false, when regex matches the tested string?

I'm using javascript's inbuilt .test() function to test whether a string or regex matches. I took the regex from RegexLib, so I know that it matches the string it's tested against (in this case joe#aol.com), however it either returns false or nothing at all.
Here's my code:
var string = "joe#aol.com";
var pattern = [\w-]+#([\w-]+\.)+[\w-]+/i;
var match = pattern.test(string);
document.write(match);
When the regex is encased in quotes, the test returns false, when it's not encased in anything, it returns nothing.
What I've tried so far:
Simply using a single line, var match = '[\w-]+#([\w-]+\.)+[\w-]+/i'.test("joe#aol.com");.
Using both ' single quotes and " double quotes for both regex and string.
Appending the regex with /i and /g.
I honestly don't know what's causing this issue, so any help would be great. It could be a complete rookie mistake, a forgotten syntax perhaps.
Here's a link to the jsFiddle I made up for you to play around with if you think you've got some idea of how to fix this up: http://jsfiddle.net/wFhEJ/1/
You missed the opening slash for a regexp. The notation for a regexp is:
/regexp/flags
When you enclosed the regexp in quotes, it became a string. On jsFiddle, MooTools defines a String.prototype.test function, so the call did not fail, but that function is not the same as RegExp.prototype.test.
http://jsfiddle.net/wFhEJ/2/
var string = "joe#aol.com";
var pattern = /[\w-]+#([\w-]+\.)+[\w-]+/i;
var match = pattern.test(string);
document.write(match);
Do note though that document.write is frowned upon. You might rather want document.body.appendChild(document.createTextNode(match)) or something similar.
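In plain JavaScript, assuming no library such as MooTools has extended String.prototype, the difference between the quoted and unquoted forms is easy to observe. A minimal sketch (variable names are illustrative):

```javascript
const quoted = "[\\w-]+#([\\w-]+\\.)+[\\w-]+/i";  // just a string
const literal = /[\w-]+#([\w-]+\.)+[\w-]+/i;      // a RegExp object

// Strings have no built-in test method; regex objects do.
console.log(typeof quoted.test);           // undefined
console.log(typeof literal.test);          // function
console.log(literal.test("joe#aol.com"));  // true
```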

JS - RegExp for detecting ".-" , "-."

I am a bit confused about the RegExp I should be using to detect ".-" and "-.". It does accept these combinations as valid, but at the same time "-_" and "_-" get validated as well. Am I missing something, or not escaping something properly?
var reg=new RegExp("(\.\-)|(\-\.)");
Actually, it seems any combination containing '-' gets passed.
Got it thank you everyone.
You need to use
"(\\.-)|(-\\.)"
Since you're using a string with the RegExp constructor rather than /, you need to escape twice.
>>> "asd_-ads".search("(\.\-)|(\-\.)")
3
>>> "asd_-ads".search(/(\.\-)|(\-\.)/)
-1
>>> "asd_-ads".search(new RegExp('(\\.\-)|(\-\\.)'))
-1
In notation /(\.\-)|(\-\.)/, the expression would be right.
In the notation you chose, you must double all backslashes, because the backslash has a special meaning of its own inside a string literal, as in \\, \n and so on.
Note there is no need to escape the dash here: var reg = new RegExp("(\\.-)|(-\\.)");
If you don't need to differentiate the matches, you can use a single enclosing capture, or none at all if you only want to check the match: "\\.-|-\\." is still valid.
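To see what the constructor actually receives in each case, a short sketch (the variable names and the extra sample strings are illustrative):

```javascript
// In a string literal, "\." and "\-" lose their backslashes.
const broken = new RegExp("(\.\-)|(\-\.)");
console.log(broken.source);            // (.-)|(-.)
console.log(broken.test("asd_-ads"));  // true, because "." matches any character

// Doubling the backslash keeps the dot escaped.
const fixed = new RegExp("\\.-|-\\.");
console.log(fixed.source);             // \.-|-\.
console.log(fixed.test("asd_-ads"));   // false
console.log(fixed.test("asd.-ads"));   // true
console.log(fixed.test("asd-.ads"));   // true
```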
You are using a double-quoted string, so a single backslash does not survive to escape the dot. Use this notation instead:
var reg = /(\.\-)|(\-\.)/;
