Alpha-numeric with whitespace regex - javascript

Just when you think you've got a handle on regex, it all comes undone. I'm hoping to return false if anything other than alphanumeric and whitespace characters is found.
function checkName(fname)
{
var rexp = new RegExp(/[^a-zA-Z0-9]\s/gim)
if (!rexp.test(fname))
{
alert ("'" + fname + "'\nis okay")
}
else
{
alert ("'" + fname + "'\nis NOT okay")
}
return !rexp.test(fname)
}
I would expect the above code to return the following:
"This is ok" - true
"This, is not ok" - false
"Nor is this ok!" - false
"Nor is \"this ok" - false

Although much of the discussion is right, everything seems to be missing the point that you are inverting character classes and then inverting the results in your function. This is logically hard to read. You also do two tests on the regex for no good reason. Much cleaner would be something like this:
function checkName(fname) {
var result = /^[a-z0-9\s]+$/i.test(fname)
if (result) {
alert ("'" + fname + "'\nis okay")
} else {
alert ("'" + fname + "'\nis NOT okay")
}
return result;
}
Update: It looks like Jack's edits captured these points too. (Always a minute late and a nickel short...)
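One concrete reason the double test in the original function is risky: a regex created with the g flag remembers where its last match ended (in its lastIndex property), so a second test call on the same object can give a different answer. A minimal sketch of that behaviour, with illustrative variable names that are not from the question:

```javascript
// A global regex keeps state between calls in its lastIndex property.
var globalRe = /[^a-z0-9\s]/gi;
var name = "Nor is this ok!";

var first = globalRe.test(name);  // finds "!" and moves lastIndex past it
var second = globalRe.test(name); // resumes after "!", finds nothing, resets lastIndex

// Without the g flag every call starts from the beginning of the string.
var plainRe = /[^a-z0-9\s]/i;
var third = plainRe.test(name);
var fourth = plainRe.test(name);

console.log(first, second, third, fourth);
```

Dropping the g flag, or storing the result of a single test call as in the function above, avoids the problem.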

[^a-zA-Z0-9]\s
Your regex matches a single non-alphanumeric character followed by a whitespace character.
To fix it, move the \s inside the brackets.
You still have to do one more thing though. The regex will only match one of these characters. Add a + to match one or more.
Therefore, fixed regex:
[^a-zA-Z0-9\s]+

A few things:
/something/ is the short notation for new RegExp('something'); you shouldn't mix them up.
You need to move the \s inside the character class; otherwise you match a character that's not alphanumeric followed by a space.
I don't think you need all those modifiers:
/m is only useful if you have anchors in your expression,
/i can be used if you remove A-Z or a-z from the character class,
/g is only useful for when you need to match multiple times, but in your case the first match is enough.
var rexp = /[^a-zA-Z0-9\s]/;
The whole function can be written like this:
function checkName(fname)
{
return !/[^a-zA-Z0-9\s]/.test(fname);
}
Instead of using double negatives, it would be better to say "only allow these characters":
function checkName(fname)
{
return /^[a-zA-Z0-9\s]*$/.test(fname);
}
If you need to test for non-empty names as well, you should use /^[a-zA-Z0-9\s]+$/.
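As a quick check, here is a small sketch that runs the non-empty allow-list pattern against the four sample strings from the question. The helper name isPlainName and the samples array are only for illustration.

```javascript
// Allow-list check: only letters, digits and whitespace, at least one character.
// isPlainName is a hypothetical name used for this sketch.
function isPlainName(name) {
  return /^[a-zA-Z0-9\s]+$/.test(name);
}

var samples = ["This is ok", "This, is not ok", "Nor is this ok!", "Nor is \"this ok"];
samples.forEach(function (sample) {
  console.log(JSON.stringify(sample) + " -> " + isPlainName(sample));
});
```

Only the first sample passes; the comma, the exclamation mark and the double quote each cause a rejection.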

Try:
function checkName(fname)
{
// The pattern is an allow-list, so a successful test means the name is okay.
var rexp = /^[a-z0-9\s]+$/i
var ok = rexp.test(fname)
if (ok)
{
alert ("'" + fname + "'\nis okay")
}
else
{
alert ("'" + fname + "'\nis NOT okay")
}
return ok
}

Related

Splitting string to array while ignoring content between apostrophes

I need something that takes a string, and divides it into an array.
I want to split it after every space, so that this -
"Hello everybody!" turns into ---> ["Hello", "everybody!"]
However, I want it to ignore spaces in between apostrophes. So for example -
"How 'are you' today?" turns into ---> ["How", "'are you'", "today?"]
Now I wrote the following code (which works), but something tells me that what I did is pretty much horrible and that it can be done with probably 50% less code.
I'm also pretty new to JS so I guess I still don't adhere to all the idioms of the language.
function getFixedArray(text) {
var textArray = text.split(' '); //Create an array from the string, splitting by spaces.
var finalArray = [];
var bFoundLeadingApostrophe = false;
var bFoundTrailingApostrophe = false;
var leadingRegExp = /^'/;
var trailingRegExp = /'$/;
var concatenatedString = "";
for (var i = 0; i < textArray.length; i++) {
var text = textArray[i];
//Found a leading apostrophe
if(leadingRegExp.test(text) && !bFoundLeadingApostrophe && !trailingRegExp.test(text)) {
concatenatedString =concatenatedString + text;
bFoundLeadingApostrophe = true;
}
//Found the trailing apostrophe
else if(trailingRegExp.test(text ) && !bFoundTrailingApostrophe) {
concatenatedString = concatenatedString + ' ' + text;
finalArray.push(concatenatedString);
concatenatedString = "";
bFoundLeadingApostrophe = false;
bFoundTrailingApostrophe = false;
}
//Found no trailing apostrophe even though the leading flag indicates true, so we want this string.
else if (bFoundLeadingApostrophe && !bFoundTrailingApostrophe) {
concatenatedString = concatenatedString + ' ' + text;
}
//Regular text
else {
finalArray.push(text);
}
}
return finalArray;
}
I would deeply appreciate it if somebody could go through this and teach me how this should be rewritten, in a more correct & efficient way (and perhaps a more "JS" way).
Thanks!
Edit -
Well I just found a few problems, some of which I fixed, and some I'm not sure how to handle without making this code too complex (for example the string "hello 'every body'!" doesn't split properly....)
You could try matching instead of splitting:
string.match(/(?:['"].+?['"])|\S+/g)
The above regex will match anything in between quotes (including the quotes), or anything that's not a space otherwise.
If you want to also match characters after the quotes, like ? and ! you can try:
/(?:['"].+?['"]\W?)|\S+/g
For "hello 'every body'!" it will give you this array:
["hello", "'every body'!"]
Note that \W also matches a space. If you only want to match punctuation, you can be explicit and use a character class in place of \W:
[,.?!]
Or simply trim the strings after matching:
string.match(regex).map(function(x){return x.trim()})
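Putting the pieces together, a minimal sketch of the matching approach might look like this. The function name splitKeepingQuotes is made up for the example, and it uses the explicit punctuation class [,.?!] rather than \W so that no trailing space is captured.

```javascript
// Hypothetical helper: returns quoted runs (with optional trailing punctuation)
// and other non-space runs as separate tokens.
function splitKeepingQuotes(text) {
  var tokenPattern = /['"].+?['"][,.?!]?|\S+/g;
  // match returns null when nothing matches, so fall back to an empty array.
  return text.match(tokenPattern) || [];
}

console.log(splitKeepingQuotes("How 'are you' today?"));
console.log(splitKeepingQuotes("hello 'every body'!"));
```

Like the regex it is based on, this does not check that the opening and closing quote characters are the same kind.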

match word not capitalized a certain way

I want a regular expression that matches all instances of "capitalizedExactlyThisWay" that are not capitalizedExactlyThisWay.
I created a function that finds the indexes of all case insensitive matches and then pushes the values back in like this (JSBIN)
But I would rather just say something like text.replace(regexp,"<highlight>$1</highlight>");
replace has a callback function too.
s = s.replace(reg1, function(m){
if(m===word) return m;
return '<highlight>'+m+'</highlight>';
});
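The snippet above leaves reg1 and word undefined. A self-contained sketch of the same callback idea could look like the following; the function name highlightMiscapitalized and the way the regex is built are assumptions, not part of the original answer.

```javascript
// Hypothetical helper: wraps every case-insensitive match of `word`
// in <highlight> tags unless it is capitalized exactly like `word`.
function highlightMiscapitalized(text, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var anyCase = new RegExp("\\b" + escaped + "\\b", "gi");
  return text.replace(anyCase, function (match) {
    return match === word ? match : "<highlight>" + match + "</highlight>";
  });
}

console.log(highlightMiscapitalized(
  "capitalizedExactlyThisWay and CAPITALIZEDexactlythisway",
  "capitalizedExactlyThisWay"
));
```

The exact spelling is returned untouched by the callback, so only the differently capitalized variants end up wrapped.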
Unfortunately JavaScript regular expressions do not support making only a part of the expression case-insensitive.
You could write a little helper function that does the dirty work:
function capitalizationSensitiveRegex(word) {
var chars = word.split(""), i;
for (i = 0; i < chars.length; i++) {
chars[i] = "[" + chars[i].toLowerCase() + chars[i].toUpperCase() + "]";
}
return new RegExp("(?=\\b" + chars.join("") + "\\b)(?!" + word + ").{" + word.length + "}", "g");
}
Result:
capitalizationSensitiveRegex("capitalizedExactlyThisWay");
=> /(?=\b[cC][aA][pP][iI][tT][aA][lL][iI][zZ][eE][dD][eE][xX][aA][cC][tT][lL][yY][tT][hH][iI][sS][wW][aA][yY]\b)(?!capitalizedExactlyThisWay).{25}/g
Note that this assumes ASCII letters due to limitations of how \b works in JavaScript. It also assumes you're not using any regex meta characters in word (brackets, backslashes, parentheses, stars, dots, etc). An extra step of regex-quoting each char is necessary to make the above stable.
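The regex-quoting step mentioned above could be sketched as follows. Both function names are hypothetical, and the sketch keeps the same ASCII and \b assumptions as the helper it is based on.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Variant of the helper above that quotes non-letter characters.
// Letters become a two-character class such as [aA]; everything else is escaped.
function capitalizationSensitiveRegexSafe(word) {
  var pattern = word.split("").map(function (ch) {
    var lower = ch.toLowerCase(), upper = ch.toUpperCase();
    return lower !== upper ? "[" + lower + upper + "]" : escapeRegExp(ch);
  }).join("");
  return new RegExp(
    "(?=\\b" + pattern + "\\b)(?!" + escapeRegExp(word) + ").{" + word.length + "}",
    "g"
  );
}

console.log("a.b A.B axb".match(capitalizationSensitiveRegexSafe("a.b")));
```

Here the dot in "a.b" is matched literally, so "axb" is not reported and only the differently capitalized "A.B" is returned.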
You can use match and filter with a callback:
var tok = (input.match(/\bcapitalizedexactlythisway\b/ig) || []).filter(function (m) {
return m !== "capitalizedExactlyThisWay"; });
console.log( tok );
["capitalizedEXACTLYTHISWAY", "capitalizedexactlYthisWay", "capitalizedexactlythisway"]
You could try this regex, which matches every case-insensitive variant of exactlythisway after capitalized, except the exact spelling ExactlyThisWay:
\bcapitalized(?!ExactlyThisWay)(?:[Ee][Xx][Aa][Cc][Tt][Ll][Yy][Tt][Hh][Ii][Ss][Ww][Aa][Yy])\b
If you could somehow get JavaScript to work with partial case-insensitive matching, i.e. (?i), you could use the following expression:
capitalized(?!ExactlyThisWay)(?i)exactlythisway
If not, you're probably stuck with something like this:
capitalized(?!ExactlyThisWay)[a-zA-Z]+
The downside is that it will also match other variations such as capitalizedfoobar etc.

JavaScript check if string contains any of the words from regex

I'm trying to check if a string contains any of these words:
AB|AG|AS|Ltd|KB|University
My current code:
var acceptedwords = '/AB|AG|AS|Ltd|KB|University/g'
var str = 'Hello AB';
var matchAccepted = str.match(acceptedwords);
console.log(matchAccepted);
if (matchAccepted !== null) { // Contains the accepted word
console.log("Contains accepted word: " + str);
} else {
console.log("Does not contain accepted word: " + str);
}
But for some strange reason this does not match.
Any ideas what I'm doing wrong?
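That's not the right way to define a regular expression literal in JavaScript. Wrapped in quotes it is just a string, so match treats the slashes and the trailing g as part of the pattern.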
Change
var acceptedwords = '/AB|AG|AS|Ltd|KB|University/g'
to
var acceptedwords = /AB|AG|AS|Ltd|KB|University/;
You might notice I removed the g flag: it's useless here, since you only want to know whether there is a match, not to collect them all. You don't even have to use match; you could use test:
var str = 'Hello AB';
if (/AB|AG|AS|Ltd|KB|University/.test(str)) { // Contains the accepted word
console.log("Contains accepted word: " + str);
} else {
console.log("Does not contain accepted word: " + str);
}
If you want to build a regex with strings, assuming none of them contains any special character, you could do
var words = ['AB','AG', ...
var regex = new RegExp(words.join('|'));
If your names may contain special characters, you'll have to use a function to escape them.
If you want your words to not be parts of other words (meaning you don't want to match "ABC"), then you should check for word boundaries:
regex = new RegExp(words.map(function(w){ return '\\b'+w+'\\b' }).join('|'),'g');
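For completeness, a sketch that combines the escaping and word-boundary suggestions, using the six words from the question. The helper names escapeRegExp and buildWordMatcher are invented for the example, and the g flag is left out because only a yes/no answer is needed.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: builds one regex that matches any of the words as a whole word.
function buildWordMatcher(words) {
  var alternatives = words.map(function (w) { return escapeRegExp(w); }).join("|");
  return new RegExp("\\b(?:" + alternatives + ")\\b");
}

var acceptedWords = ["AB", "AG", "AS", "Ltd", "KB", "University"];
var matcher = buildWordMatcher(acceptedWords);

console.log(matcher.test("Hello AB"));  // whole word, accepted
console.log(matcher.test("Hello ABC")); // "AB" is only part of a longer word
```

Grouping the alternatives inside a single \b(?:...)\b is equivalent to wrapping each word in its own pair of boundaries.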

Regex for empty string or white space

I am trying to detect whether a user entered only whitespace in a textbox:
var regex = "^\s+$" ;
if($("#siren").val().match(regex)) {
echo($("#siren").val());
error+=1;
$("#siren").addClass("error");
$(".div-error").append("- Champ Siren/Siret ne doit pas etre vide<br/>");
}
if($("#siren").val().match(regex)) is supposed to match a whitespace-only string; however, it doesn't seem to work. What am I doing wrong?
Thanks for your help.
The \ (backslash) in the string literal is not escaped, so "\s" reaches match as a plain "s". It would be easier to use a regex literal, though. Either of these will work:
var regex = "^\\s+$";
var regex = /^\s+$/;
Also note that + will require at least one space. You may want to use *.
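To make the escaping issue concrete, here is a small sketch without jQuery. The variable names and the isBlank helper are made up for illustration.

```javascript
// In a string literal, "\s" is not a recognised escape, so the backslash is dropped.
var unescaped = "^\s+$";
console.log(unescaped === "^s+$");

// Doubling the backslash in a string gives the same pattern as the regex literal.
var fromString = new RegExp("^\\s*$");
var fromLiteral = /^\s*$/;
console.log(fromString.source === fromLiteral.source);

// Hypothetical helper: true for an empty or whitespace-only value.
function isBlank(value) {
  return fromLiteral.test(value);
}

console.log(isBlank(""), isBlank("   \t"), isBlank(" a "));
```

With * the empty string counts as blank too; with + it would need at least one whitespace character.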
If you are looking for an empty string in addition to whitespace, you need to use * rather than +:
var regex = /^\s*$/ ;
If you're using jQuery, you have .trim().
if ($("#siren").val().trim() == "") {
// it's empty
}
If one only cares about whitespace at the beginning and end of the string (but not in the middle), then another option is to use String.trim():
" your string contents ".trim();
// => "your string contents"
http://jsfiddle.net/DqGB8/1/
This is my solution
var error=0;
var test = [" ", " "];
if(test[0].match(/^\s*$/g)) {
$("#output").html("MATCH!");
error+=1;
} else {
$("#output").html("no_match");
}
Had similar problem, was looking for white spaces in a string, solution:
To search for 1 space:
var regex = /^.+\s.+$/ ;
example: "user last_name"
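To search for multiple spaces, the same expression already works, because .+ can itself span whitespace; the g flag changes nothing here:
var regex = /^.+\s.+$/ ;
example: "user last name"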

Why is my RegExp ignoring start and end of strings?

I made this helper function to find single words that are not part of bigger expressions.
It works fine on any word that is NOT first or last in a sentence. Why is that?
Is there a way to add "" to a regexp?
String.prototype.findWord = function(word) {
var startsWith = /[\[\]\.,-\/#!$%\^&\*;:{}=\-_~()\s]/ ;
var endsWith = /[^A-Za-z0-9]/ ;
var wordIndex = this.indexOf(word);
if (startsWith.test(this.charAt(wordIndex - 1)) &&
endsWith.test(this.charAt(wordIndex + word.length))) {
return wordIndex;
}
else {return -1;}
}
Also, any improvement suggestions for the function itself are welcome!
UPDATE: example: I want to find the word able in a string. I want it to work in cases like [able] able, #able1 etc., but not in cases where it is part of another word like disable, enable etc.
A different version:
String.prototype.findWord = function(word) {
return this.search(new RegExp("\\b"+word+"\\b"));
}
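A standalone sketch of the same \b idea, written as a plain function so that String.prototype is not modified. The name findWholeWord and the escaping step are additions for the example.

```javascript
// Hypothetical helper: index of the first whole-word occurrence of `word`, or -1.
function findWholeWord(text, word) {
  // Escape regex metacharacters so the word is matched literally.
  var escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return text.search(new RegExp("\\b" + escaped + "\\b"));
}

console.log(findWholeWord("disable able enable", "able"));
console.log(findWholeWord("able", "able"));
console.log(findWholeWord("[able]", "able"));
console.log(findWholeWord("disable", "able"));
```

Because \b treats digits and underscores as word characters, a string such as "#able1" is not reported as containing the word able.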
Your if will only evaluate to true if endsWith matches after the word. But the last word of a sentence ends with a full stop, which won't match your alphanumeric expression.
Did you try word boundary -- \b?
There is also \w, which matches one word character ([A-Za-z0-9_]); this could help you too (depending on your definition of a word).
See RegExp docs for more details.
If you want your endsWith regexp to also match the empty string, you just need to append |^$ to it:
var endsWith = /[^A-Za-z0-9]|^$/ ;
Anyway, you can easily check if it is the beginning of the text with if (wordIndex == 0), and if it is the end with if (wordIndex + word.length == this.length).
It is also possible to eliminate this issue by operating on a copy of the input string, surrounded with non-alphanumerical characters. For example:
var s = "#" + this + "#";
var wordIndex = s.indexOf(word) - 1; // search the padded copy; subtract 1 to get the index in the original string
But I'm afraid there is another problem with your function:
it would never match "able" in a string like "disable able enable", since the call to indexOf would return 3, then startsWith.test on the preceding character ("s") would return false and the function would exit with -1 without searching further.
So you could try:
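String.prototype.findWord = function (word) {
var startsWith = "[\\[\\]\\.,-\\/#!$%\\^&*;:{}=\\-_~()\\s]";
var endsWith = "[^A-Za-z0-9]";
// Pad with "#" so a word at the very start or end still has a neighbour to match.
// The match begins at the delimiter before the word, which (because of the padding)
// is the word's index in the original string; search returns -1 when nothing matches.
return ("#" + this + "#").search(new RegExp(startsWith + word + endsWith));
}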
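If you would rather stay with indexOf, the "search further" problem described above can also be handled by looping over every occurrence. This is only a sketch under the assumption that a word is delimited by anything that is not a letter or digit; findStandaloneWord is a hypothetical name.

```javascript
// Hypothetical helper: scans every occurrence of `word` and returns the index
// of the first one that has no letter or digit directly before or after it.
function findStandaloneWord(text, word) {
  if (!word) { return -1; } // an empty word would never advance the scan
  var isAlnum = /[A-Za-z0-9]/;
  var from = 0, index;
  while ((index = text.indexOf(word, from)) !== -1) {
    var before = index === 0 ? "" : text.charAt(index - 1);
    var after = text.charAt(index + word.length); // "" at the end of the string
    if (!isAlnum.test(before) && !isAlnum.test(after)) {
      return index;
    }
    from = index + 1; // keep looking after a rejected occurrence
  }
  return -1;
}

console.log(findStandaloneWord("disable able enable", "able"));
console.log(findStandaloneWord("able", "able"));
console.log(findStandaloneWord("enable", "able"));
```

The start and end of the string are treated as valid delimiters, so no padding is needed.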
