regex - How do I exclude "%" and "_"? - javascript

I'm allowing numbers, letters, and special characters except for % and _ in my HTML textbox. I have the pattern /[a-zA-Z0-9!##$^&*()-+=]/. I think it's not the best way to do it because I have to list all special characters except the two mentioned. Is there a way to avoid listing every special character while still excluding those two? BTW, I'm using JavaScript regex.
For the demo please see http://jsfiddle.net/ce8Th/
Please help.

There's no need for that complex loop. Just call replace directly on the whole string:
$(this).val(function (i, v) {
return v.replace(/%|_/g, '');
});
Here's your fiddle: http://jsfiddle.net/ce8Th/1/

You could just do the reverse:
/[%_]/
if (pattern.test( ....
It's also nice to not use regex if you don't have to, not that it makes a big difference in this case:
if ("%_".indexOf(text.charAt(i)) > -1) {
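As a minimal sketch of the "reverse" approach above: instead of listing every allowed character, test only for the two forbidden ones, or use a negated character class to require that every character is something else. The function names are hypothetical and do not come from the original fiddle.

```javascript
// Hypothetical helpers; the names are illustrative only.

// True if the text contains "%" or "_" anywhere.
function hasForbiddenChar(text) {
  return /[%_]/.test(text);
}

// True if the text is non-empty and every character is something
// other than "%" and "_" (negated character class).
function isAllowed(text) {
  return /^[^%_]+$/.test(text);
}

console.log(hasForbiddenChar("100%"));  // true
console.log(isAllowed("hello-world!")); // true
console.log(isAllowed("snake_case"));   // false
```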

A whitelist is always best. I would recommend keeping what you have, but adding a quantifier and start and end anchors:
/^[a-zA-Z0-9!##$^&*()-+=]+$/
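One caveat with the whitelist, sketched below under the assumption that a literal hyphen is meant to be allowed: inside a character class, `)-+` is read as a range from `)` to `+`, which covers `)`, `*`, and `+` but not `-` itself. Placing the hyphen last (or escaping it) makes it literal. The exact set of symbols in this sketch is an assumption for illustration.

```javascript
// Assumed whitelist: letters, digits, and a handful of symbols.
// The hyphen is placed last so it is a literal "-" rather than a range operator.
const allowedPattern = /^[a-zA-Z0-9!#$^&*()+=-]+$/;

// Hypothetical helper name.
function isWhitelisted(text) {
  return allowedPattern.test(text);
}

console.log(isWhitelisted("abc-123")); // true
console.log(isWhitelisted("50%"));     // false
console.log(isWhitelisted("a_b"));     // false
console.log(isWhitelisted(""));        // false, because "+" requires at least one character
```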

Would I happen to be correct in guessing that you are using this user input for a MySQL query involving LIKE to search for partial matches?
If so, don't exclude characters. Instead, escape them on the server-side. For instance:
$output = str_replace(Array("%","_"),Array("\\%","\\_"),$input);
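The same escaping idea can be sketched in JavaScript, for instance if the server side happens to run on Node.js (an assumption; the answer above uses PHP). The function name is hypothetical. This sketch also escapes the backslash itself, since backslash is MySQL's default LIKE escape character.

```javascript
// Hypothetical helper: escape LIKE wildcards so they match literally.
function escapeLike(input) {
  // "$&" in the replacement stands for the matched character,
  // so each of \ % _ gets a backslash placed in front of it.
  return input.replace(/[\\%_]/g, "\\$&");
}

console.log(escapeLike("50%_off")); // 50\%\_off
```

The escaped value would still be passed to the query as a bound parameter; this step only stops % and _ from acting as wildcards.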

Related

Match between simple delimiters, but not delimiters themselves

I was looking at JSON data that was just in a text file. I don't want to do anything aside from just use regex to get the values in between quotes. I'm just using this as a way to help practice regex and got to this point that seems like it should be simple, but it turns out it's not (at least to me and a few other people at the office). I've matched complicated urls with ease in regex so I'm not completely new to regex. This just seems like a weird case for me.
I've tried:
/(?:")(.*?)(?:")/
/"(.*?)"/
and several others but these got me the closest.
Basically we can forget that it's JSON and just say I want to match the words value and stuff out of "value" and "stuff". Everything I try includes the quotes, so I'd have to clean the strings afterwards of the delimiters or else the string is literally "value" with the quotes.
Any help would be much appreciated, whether this is simple or complicated, I'd love to know! Thanks
Update: Alright so I think I'll go with (?<=")(.*?)(?=") and read things by line without the global setting on so I just get the first match on each line. In my code I was just plopping in a huge string into a var in the code instead of actually opening a file with ajax/filereader or having a form setup to input data. I think I'll mark this as solved, much appreciated!
You have two choices to solve this problem:
Use capturing groups
You can match the delimiters and use capturing groups to get the text within. In this case your two regexes will work, but you need to access capturing group 1 to get the results (demo). See How do you access the matched groups in a JavaScript regular expression? for how to do that.
Use zero-width assertions
You can use zero-width assertions to match only the text within, requiring delimiters around it without actually matching them (demo):
(?<=")(.*?)(?=")
But since this no longer consumes the quotes, it will find matches between every two adjacent quotes, not just between pairs of quotes: e.g., a"b"c" would find b and c.
As for getting just the first match, I think that'll happen by default in JavaScript. You'd have to ask for repeated matching before you see the subsequent ones. So if you process your file one line at a time, you should get what you want.
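The two options can be compared side by side in a short sketch. The function names are hypothetical, and the lookbehind version assumes a JavaScript engine with ES2018 regex support.

```javascript
// Option 1: match the quotes too, then read capturing group 1.
function extractWithGroups(text) {
  return Array.from(text.matchAll(/"(.*?)"/g), m => m[1]);
}

// Option 2: lookarounds, so the match itself excludes the quotes.
// Because the quotes are not consumed, text between ANY two
// neighbouring quotes is matched.
function extractWithLookarounds(text) {
  return text.match(/(?<=")(.*?)(?=")/g) || [];
}

console.log(extractWithGroups('"value" and "stuff"')); // [ 'value', 'stuff' ]
console.log(extractWithLookarounds('a"b"c"'));         // [ 'b', 'c' ]
```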
get the values in between quotes
One thing to keep in mind is that valid JSON allows escaped quotes inside quoted values. The regex should therefore take this into account when matching, which can be done with the "unrolling-the-loop" pattern.
var pattern = /"[^"\\]*(?:\\.[^"\\]*)*"/g;
var data = {
    "value": "This is \"stuff\".",
    "empty": "",
    "null": null,
    "number": 50
};
var dataString = JSON.stringify(data);
console.log(dataString);
// Each match is a complete JSON string literal, quotes included.
var matched = dataString.match(pattern);
// JSON.parse strips the quotes and unescapes the contents.
matched.forEach(item => console.log(JSON.parse(item)));

Javascript string validation. How to write a character only once in string and only in the start?

I am writing validation for phone numbers. I need to allow users to type the + character only at the beginning of the input field and prevent them from typing it later in the field.
In other words:
+11111111 - right,
111111111 - right,
+111+111+ - false,
1111+111+ - false
The problem is that I need to perform validation while typing. As a result I cannot analyse the whole string after submission, thus it is not possible to fetch the position of the + character because 'keyup' always returns 0.
I have tried many approaches; this is one of them:
$('#signup-form').find('input[name="phone"]').on('keyup', function(e) {
    // prevent from typing letters
    $(this).val($(this).val().replace(/[^\d.+]/g, ''));
    var textVal = $(this).val();
    // check if + character occurs
    if(textVal === '+'){
        // remove + from occurring twice
        // check if + character is not the first
        if(textVal.indexOf('+') > 0){
            var newValRem = textVal.replace(/\+/, '');
            $(this).val(newValRem);
        }
    }
});
When I try to replace the + character with an empty string, it is replaced only once, which is not enough, because the user might type it a couple of times by mistake.
Here is the link to the fiddle: https://jsfiddle.net/johannesMt/rghLowxq/6/
Please give me any hint in this situation. Thanks!
To help you fix the current code (@Thomas Mauduit-Blin is right that there is a lot more to do here than just allowing a plus symbol at the beginning), you may remove any plus symbols that are preceded by any character. Just capture that character and restore it with a backreference in the replacement pattern:
$(this).val($(this).val().replace(/[^\d.+]|(.)\++/g, '$1'));
See the updated fiddle and the regex demo.
The pattern is extended with a (.)\++ alternative. (.) captures into Group 1 any character other than a newline that is followed by one or more plus symbols, and the contents of Group 1 are put back during the replacement via the $1 backreference.
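To see the replacement in isolation, here is a sketch that wraps the same regex in a plain function so it can be tried without jQuery or a form. The function name is hypothetical.

```javascript
// Hypothetical pure-function version of the replace call above.
function sanitizePhone(value) {
  // First alternative: drop anything that is not a digit, "." or "+".
  // Second alternative: a character followed by one or more "+" keeps
  // only that character ($1); $1 is empty when the first alternative matched.
  return value.replace(/[^\d.+]|(.)\++/g, "$1");
}

console.log(sanitizePhone("+11111111")); // +11111111
console.log(sanitizePhone("+111+111+")); // +111111
console.log(sanitizePhone("1111+111+")); // 1111111
```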
For better validation, why not use the jQuery maskedinput library? It handles a lot of additional tasks for you without extra overhead:
$("#phone").mask("+999-999-9999");
$("#phone").mask("+9999-999-9999");
$("#phone").mask("+99999999999");
If you want to do the validation on your own, you must use a regex.
But, as described in another related thread here:
don't use a regular expression to validate complex real-world data like phone numbers or URLs. Use a specialized library.
You must let the user enter an invalid phone number and perform the check later, for example on form submit and/or on the server side. Here you want to take care of the "+" character, but there is a lot of other stuff to do to get reliable validation.
If your textVal has a +, indexOf will only find the first occurrence. You need to ensure that the first character is not checked by indexOf, so use substring to take the first character out of the equation.
Simply replace
if(textVal.indexOf('+') > 0){
with
if(textVal.substring(1).indexOf('+') > -1){
Demo
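A small sketch of that check as a standalone function (the function name is hypothetical). It uses the optional second argument of indexOf, which starts the search at a given position and gives the same result as the substring version for this purpose.

```javascript
// Hypothetical helper: true if a "+" appears anywhere after the first character.
function hasMisplacedPlus(textVal) {
  // Start searching at index 1 so a leading "+" is ignored.
  return textVal.indexOf("+", 1) > -1;
}

console.log(hasMisplacedPlus("+11111111")); // false
console.log(hasMisplacedPlus("+111+111+")); // true
console.log(hasMisplacedPlus("1111+111+")); // true
```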

Basic Regex to remove any space

I'm looking for a basic regex that removes any space. I want to use it for ZIP code.
Some people insert spaces before, after, or inside the ZIP code.
I'm using /^\d{5}$/ now. I want to expand it to include space removal.
How can this be improved?
(I'm assuming you want to remove the spaces from your string, not check whether it is valid even with spaces.)
You can substitute one or more spaces (globally)
/\s+/g
by nothing.
zip.replace(/\s+/g, "");
Example in my browser's JS console:
> " 02 1 3 4".replace(/\s+/g, "");
"02134"
Here's a regex you can use instead of your current one to ignore any and all spaces.
/^(\s*\d){5}\s*$/
If you're sanitizing a form input or something, it's probably easiest to use:
zip = zip.replace(/\D/g,'');
You can then validate without a regex; just use the .length property on String.
if(zip.length != 5) alert('failed!');
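Putting the pieces from these answers together, a minimal sketch might strip whitespace first and then apply the original 5-digit check. The function name and the choice to return null for invalid input are assumptions for illustration.

```javascript
// Hypothetical helper: returns the cleaned 5-digit ZIP, or null if invalid.
function normalizeZip(input) {
  // Remove every run of whitespace, wherever it appears.
  const cleaned = input.replace(/\s+/g, "");
  // Then apply the original strict check to the cleaned value.
  return /^\d{5}$/.test(cleaned) ? cleaned : null;
}

console.log(normalizeZip(" 02 1 3 4")); // 02134
console.log(normalizeZip("1234"));      // null
```

Unlike replacing /\D/g, this keeps letters in place so that an input such as "12a45" is rejected rather than silently shortened.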

removing phpbb tag using regex javascript

I'm trying to remove square-bracket tags (BBCode style) using JavaScript, in order to strip unwanted BBCode.
I tried this:
theString.replace(/\[quote[^\/]+\]*\[\/quote\]/, "")
It works with this sample string:
theString = "[quote=MyName;225]Test 123[/quote]";
But it fails with this sample:
theString = "[quote=MyName;225]Test [quote]inside quotes[/quote]123[/quote]";
If there is a solution that doesn't use regex, that's no problem.
The other 2 solutions simply do not work (see my comments). To solve this problem you first need to craft a regex which matches the innermost matching quote elements (which contain neither [QUOTE..] nor [/QUOTE]). Next, you need to iterate, applying this regex over and over until there are no more QUOTE elements left. This tested function does what you want:
function filterQuotes(text)
{   // Regex matches inner [QUOTE]non-quote-stuff[/quote] tag.
    var re = /\[quote[^\[]+(?:(?!\[\/?quote\b)\[[^\[]*)*\[\/quote\]/ig;
    while (text.search(re) !== -1)
    {   // Need to iterate removing QUOTEs from inside out.
        text = text.replace(re, "");
    }
    return text;
}
Note that this regex employs Jeffrey Friedl's "Unrolling the loop" efficiency technique and is not only accurate, but is quite fast to boot.
See: Mastering Regular Expressions (3rd Edition) (highly recommended).
Try this one:
/\[quote[^\/]+\].*\[\/quote\]$/
The $ sign indicates that only the closing quote element at the end of the string should be used to determine the ending of the quote you're trying to remove.
I also added a "." before the asterisk so that it matches any character in between. I tested this with your two strings and it worked.
edit: I don't exactly know how you are using that. But just as an addition. If you want the pattern also to match to a string where no attributes are added for example:
[quote]Hello[/quote]
You should change the "+" sign into an asterisk as well like this:
/\[quote[^\/]*\].*\[\/quote\]$/
This answer has flaws, see Ridgerunner's answer for a more correct one.
Here's my crack at it.
function filterQuotes(text)
{
return text.replace(/\[(\/)?quote([^\/]*)?\]/g,"");
}

Breaking a String into Chunks based on Pattern

I have one string, that looks like this:
a[abcdefghi,2,3,jklmnopqr]
The beginning "a" is fixed and non-changing; however, the content within the brackets varies and can follow a pattern. It will always be an alphabetical string, possibly followed by numbers separated by commas or more strings and/or numbers.
I'd like to be able to break it into chunks of the string and any numbers that follow it until the "]" or another string is met.
Probably best explained through examples and expected ideal results:
a[abcdefghi] -> "abcdefghi"
a[abcdefghi,2] -> "abcdefghi,2"
a[abcdefghi,2,3,jklmnopqr] -> "abcdefghi,2,3" and "jklmnopqr"
a[abcdefghi,2,3,jklmnopqr,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr" and "stuvwxyz"
a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz] -> "abcdefghi,2,3" and "jklmnopqr,1,9" and "stuvwxyz"
a[abcdefghi,1,jklmnopqr,2,stuvwxyz,3,4] -> "abcdefghi,1" and "jklmnopqr,2" and "stuvwxyz,3,4"
Ideally a malformed string would be partially caught (but this is a nice extra):
a[2,3,jklmnopqr,1,9,stuvwxyz] -> "jklmnopqr,1,9" and "stuvwxyz"
I'm using JavaScript and I realize a regex won't bring me all the way to the solution I'd like, but it could be a big help. The alternative is to do a lot of manual string parsing, which I can do but which doesn't seem like the best answer.
Advice, tips appreciated.
UPDATE: Yes, I did mean alphabetical (A-Za-z) instead of alphanumeric. Edited to reflect that. Thanks for letting me know.
You'd probably want to do this in 2 steps. First, match against:
a\[([^[\]]*)\]
and extract group 1. That'll be the stuff in the square brackets.
Next, repeatedly match against:
[a-z]+(,[0-9]+)*
That'll match things like "abcdefghi,2,3". After the first match you'll need to see if the next character is a comma and if so skip over it. (BTW: if you really meant alphanumeric rather than alphabetic like your examples, use [a-z0-9]*[a-z][a-z0-9]* instead of [a-z]+.)
Alternatively, split the string on commas and reassemble into your word with number groups.
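The two-step approach described above can be sketched as follows. The function name is hypothetical, and the sketch assumes purely alphabetic words as in the question's examples. With the global flag, String.prototype.match steps over the separating commas on its own, so no manual skipping is needed.

```javascript
// Hypothetical helper implementing the two steps.
function splitChunks(input) {
  // Step 1: grab the contents of the square brackets (group 1).
  const outer = /a\[([^[\]]*)\]/.exec(input);
  if (outer === null) return [];
  // Step 2: repeatedly match a word followed by any comma-separated numbers.
  return outer[1].match(/[a-z]+(?:,[0-9]+)*/gi) || [];
}

console.log(splitChunks("a[abcdefghi,2,3,jklmnopqr]"));
// [ 'abcdefghi,2,3', 'jklmnopqr' ]
console.log(splitChunks("a[2,3,jklmnopqr,1,9,stuvwxyz]"));
// [ 'jklmnopqr,1,9', 'stuvwxyz' ]
```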
Why wouldn't a regex bring you all the way to a solution?
The following regex works against the given data, but it makes a few assumptions (at least two alphas followed by comma separated single digits).
([a-z]{2,}(?:,\d)*)
Example:
re = new RegExp('[a-z]{2,}(?:,\\d)*', 'g')
matches = "a[abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz]".match(re)
Assuming you can easily break out the string between the brackets, something like this might be what you're after:
> re = new RegExp('[a-z]+(?:,\\d)*(?:,?)', 'gi')
> while (match = re.exec("abcdefghi,2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
abcdefghi,2,3,
jklmnopqr,1,9,
stuvwxyz
This has the advantage of working partially in your malformed case:
> while (match = re.exec("2,3,jklmnopqr,1,9,stuvwxyz")) { print(match[0]) }
jklmnopqr,1,9,
stuvwxyz
The first character class [a-z] can be modified if you meant for it to be truly alphanumeric.
