Regular Expression: Any character that is not a letter or number - javascript

I need a regular expression that will match any character that is not a letter or a number. Once found I want to replace it with a blank space.

To match anything other than letter or number you could try this:
[^a-zA-Z0-9]
And to replace:
var str = 'dfj,dsf7lfsd .sdklfj';
str = str.replace(/[^A-Za-z0-9]/g, ' ');
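A small sketch of the replacement above wrapped in a function. The function name is made up for illustration; only String.prototype.replace is standard. It also shows that the underscore counts as a "non-alphanumeric" character for this pattern, which is not the case for the \W shorthand.

```javascript
// Hypothetical helper name, not part of any library.
function replaceNonAlphanumeric(str) {
  // Every character that is not an ASCII letter or digit becomes one space.
  return str.replace(/[^A-Za-z0-9]/g, ' ');
}

console.log(replaceNonAlphanumeric('dfj,dsf7lfsd .sdklfj')); // "dfj dsf7lfsd  sdklfj"
console.log(replaceNonAlphanumeric('a_b-c'));                // "a b c"
```

Note the two consecutive spaces in the first result: the original space and the dot are each replaced separately.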

This regular expression matches anything that isn't a letter, digit, or an underscore (_) character.
\W
For example in JavaScript:
"(,,#,£,() asdf 345345".replace(/\W/g, ' '); // Output: "          asdf 345345" (each of the 10 leading non-word characters, the space included, becomes a space)

You are looking for:
var yourVar = '1324567890abc§$)%';
yourVar = yourVar.replace(/[^a-zA-Z0-9]/g, ' ');
This replaces all non-alphanumeric characters with a space.
The "g" flag at the end makes the replacement apply to every occurrence, not just the first.
Instead of specifying a-z (lowercase) and A-Z (uppercase), you can also use the case-insensitive flag: /[^a-z0-9]/gi.
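To make the effect of the two flags concrete, here is a minimal sketch. The function name and the sample string are assumptions chosen for illustration.

```javascript
// Hypothetical demo function: runs the same character class with different flags.
function flagDemo(input) {
  return {
    // No g flag: only the first non-alphanumeric character is replaced.
    firstOnly: input.replace(/[^a-z0-9]/i, ' '),
    // g and i together: every match, letters in either case are kept.
    everyMatch: input.replace(/[^a-z0-9]/gi, ' '),
    // g without i: uppercase letters are outside a-z, so they are replaced too.
    caseSensitive: input.replace(/[^a-z0-9]/g, ' ')
  };
}

console.log(flagDemo('Foo-Bar_baz!'));
// firstOnly:     "Foo Bar_baz!"
// everyMatch:    "Foo Bar baz "
// caseSensitive: " oo  ar baz "
```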

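This is way too late, but since there is no accepted answer I'd like to provide what I think is the simplest one: \D matches all non-digit characters. Note that it also matches letters, so it only fits if digits are all you want to keep.
var x = "123 235-25%";
x = x.replace(/\D/g, ''); // replace returns a new string, so assign it back
Results in x: "12323525"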
See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions

Match letters only: /[A-Z]/ig
Match anything not a letter: /[^A-Z]/ig
Match numbers only: /[0-9]/g or /\d+/g
Match anything not a number: /[^0-9]/g or /\D+/g
Match anything not a number or letter: /[^A-Z0-9]/ig
There are other possible patterns.

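Try str.replace(/[^\w]/g, ' ');
It replaces every character that is not a letter, a digit, or an underscore with a space. Without the g flag only the first match would be replaced, and without the second argument the match would be replaced by the text "undefined".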

Just for others to see (this looks like Java's String.replaceAll, not JavaScript):
someString.replaceAll("([^\\p{L}\\p{N}])", " ");
will replace any character that is not a Unicode letter or number with a space.

To match anything other than a letter, a number, or a letter with diacritics such as é, you could try this:
[^\wÀ-úÀ-ÿ]
And to replace:
var str = 'dfj,dsf7é#lfsd .sdklfàj1';
str = str.replace(/[^\wÀ-úÀ-ÿ]/g, '_');
Inspired by the top post with support for diacritics

Have you tried str = str.replace(/\W|_/g, '')? It returns the string with every non-alphanumeric character removed, and you can add any other special characters after the pipe | to catch them as well.
var str = "1324567890abc§$)% John Doe #$#".replace(/\W|_/g, ''); // str is "1324567890abcJohnDoe"
Or look for digits and letters and replace them with the empty string (""):
var str = "1324567890abc§$)% John Doe #$#".replace(/\w|_/g, ''); // str is "§$)%   #$#" (the three spaces remain)

Working with unicode, best for me:
text.replace(/[^\p{L}\p{N}]+/gu, ' ');
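As a rough illustration of why the Unicode-aware pattern matters, the sketch below compares it with the ASCII-only class on a string containing accented letters. The function name and sample text are assumptions; the \p{...} escapes require the u flag and an engine that implements ES2018 property escapes.

```javascript
// Hypothetical helper name. \p{L} = any Unicode letter, \p{N} = any Unicode number.
function replaceNonLetterOrNumber(text) {
  // The + collapses a run of unwanted characters into a single space.
  return text.replace(/[^\p{L}\p{N}]+/gu, ' ');
}

// Accented letters are written as escapes so the sample is unambiguous.
const sample = 'Cr\u00E8me br\u00FBl\u00E9e, 42\u20AC!'; // "Crème brûlée, 42€!"

console.log(replaceNonLetterOrNumber(sample));      // "Crème brûlée 42 "
console.log(sample.replace(/[^A-Za-z0-9]+/g, ' ')); // "Cr me br l e 42 "
```

The ASCII-only class treats è, û and é as non-letters and splits the words apart, which is the behaviour the Unicode version avoids.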

Related

Validate string in regular expression

I want a regular expression in JavaScript that helps me validate a string which contains only lowercase characters and the character -.
I use this expression:
var regex = /^[a-z][-\s\.]$/
It doesn't work. Any idea?
Just use
/^[a-z-]+$/
Explanation
^ : Match from the beginning of the string.
[a-z-] : Match any character between a and z, or -.
[] : Only characters within the brackets are allowed.
a-z : Match any character between a and z. E.g. p, s, t.
- : Match only the hyphen (-) character.
+ : Shorthand for {1,}. It means match 1 or more.
$ : Match until the end of the string.
Example
const regex= /^[a-z-]+$/
console.log(regex.test("abc")) // true
console.log(regex.test("aBcD")) // false
console.log(regex.test("a-c")) // true
Try this:
var regex = /^[-a-z]+$/;
var strs = [
"a",
"aB",
"abcd",
"abcde-",
"-",
"-----",
"a-b-c",
"a-D-c",
" "
];
strs.forEach(str=>console.log(str, regex.test(str)));
Try this
/^[a-z-]*$/
It should match the letters a-z or - as many times as possible.
What your regex does is try to match a-z one single time, followed by one of -, whitespace, or a dot one single time. Then it expects the string to end.
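The difference between the regex from the question and the suggested one can be checked directly. This is only a sketch; the two function names are invented for the comparison.

```javascript
// Regex from the question: exactly one lowercase letter, then exactly one
// of hyphen, whitespace, or dot, then end of string.
function matchesOriginal(s) {
  return /^[a-z][-\s\.]$/.test(s);
}

// Suggested regex: any number of lowercase letters or hyphens, in any order.
function matchesSuggested(s) {
  return /^[a-z-]*$/.test(s);
}

console.log(matchesOriginal('a-'));     // true  (the only shape it accepts)
console.log(matchesOriginal('abc'));    // false
console.log(matchesSuggested('abc'));   // true
console.log(matchesSuggested('a-b-c')); // true
console.log(matchesSuggested('aB'));    // false
console.log(matchesSuggested(''));      // true  (because of *; use + to reject empty strings)
```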
Use this regular expression:
let regex = /^[a-z\-]+$/;
Then:
regex.test("abcd") // true
regex.test("ab-d") // true
regex.test("ab3d") // false
regex.test("") // false
PS: If you want to allow the empty string "" to pass, use /^[a-z\-]*$/. There's a * instead of a + at the end. See Regex Cheat Sheet: https://www.rexegg.com/regex-quickstart.html
I hope this helps
var str = 'asdadWW--asd';
console.log(str.match(/[a-z]|\-/g));
This will work:
var regex = /^[a-z\-\s]+$/ // No | is needed inside a character class; there it would be a literal | character
str = 'test- ';
str.match(regex); //["test- ", index: 0, input: "test- ", groups: undefined]
str = 'testT- ' // string now contains an Uppercase Letter so it shouldn't match anymore
str.match(regex) //null
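One detail worth checking with character classes: inside [...] the | is not an "or" operator but a literal character, so a class that contains it also accepts strings with a pipe in them. A quick sketch, with invented function names:

```javascript
// Class written with pipes between the alternatives.
function matchesWithPipes(s) {
  return /^[a-z|\-|\s]+$/.test(s);
}

// Same class without the pipes.
function matchesWithoutPipes(s) {
  return /^[a-z\-\s]+$/.test(s);
}

console.log(matchesWithPipes('test- '));    // true
console.log(matchesWithoutPipes('test- ')); // true
console.log(matchesWithPipes('a|b'));       // true  (the pipe is accepted as a character)
console.log(matchesWithoutPipes('a|b'));    // false
```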

JS Regex for a string contains fixed number of letters

Let's say I need a minimum of 5 letters in a string, without requiring that they be consecutive. The regex below checks for consecutive letters:
[A-Za-z]{5,}
So, "aaaaa" -- true, but "aaa1aa" -- false.
What regex drops the consecutiveness condition, so that both of the strings above pass as true?
You could remove all non-letter chars with .replace(/[^A-Za-z]+/g, '') and then run the regex:
var strs = ["aaaaa", "aaa1aa"];
var val_rx = /[a-zA-Z]{5,}/;
for (var s of strs) {
console.log( val_rx.test(s.replace(/[^A-Za-z]+/g, '')) );
}
Else, you may also use a one step solution like
var strs = ["aaaaa", "aaa1aa"];
var val_rx = /(?:[^a-zA-Z]*[a-zA-Z]){5,}/;
for (var s of strs) {
console.log( s, "=>", val_rx.test(s) );
}
See this second regex demo online. (?:[^a-zA-Z]*[a-zA-Z]){5,} matches 5 or more consecutive occurrences of 0 or more non-letter chars ([^a-zA-Z]*) followed with a letter char.
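If a single regex is not a hard requirement, counting the letter matches is another option. The sketch below is a swapped-in technique, not the approach from this answer, and the function name and parameter are assumptions.

```javascript
// Hypothetical helper: true when str contains at least n ASCII letters,
// consecutive or not.
function hasAtLeastNLetters(str, n) {
  // match() with the g flag returns every letter, or null when there is none.
  const letters = str.match(/[A-Za-z]/g) || [];
  return letters.length >= n;
}

console.log(hasAtLeastNLetters('aaaaa', 5));  // true
console.log(hasAtLeastNLetters('aaa1aa', 5)); // true
console.log(hasAtLeastNLetters('aa1aa', 5));  // false
```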
Allow non-letter characters between the letters:
(?:[A-Za-z][^A-Za-z]*){5,}
If you have to use a regular expression only, here's one somewhat ugly option:
const check = str => /^(.*[A-Za-z].*){5}/.test(str);
console.log(check("aaaaa"));
console.log(check("aa1aaa"));
console.log(check("aa1aa"));
\w matches word characters (letters, digits, and the underscore) in regex,
so you could use: \w{5,}
[a-zA-Z0-9]{5,}
Just like this? Or do you mean it needs to be a regex that ignores digits? Because the above would match aaaa1 as well.

How to extract a string between two characters?

These are my strings:
there_is_a_string_here_1_480#1111111
there_is_a_string_here_1_360#1111111
there_is_a_string_here_1_180#1111111
What I want to do is extract 180, 360 and 480 from those strings.
I tried the regex _(.*)# but it didn't work.
You just want the capture group:
var str = 'there_is_a_string_here_1_180#1111111';
var substr = str.match(/_(\d*)#/);
if (substr) {
substr = substr[1];
console.log(substr);
}
//outputs 180
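The same capture-group idea applied to all three strings from the question. The function name is made up for this sketch, and \d+ is used instead of \d* so that an empty match between _ and # is not accepted.

```javascript
// Hypothetical helper: returns the digits between an underscore and '#',
// or null when the string does not have that shape.
function extractNumberBeforeHash(str) {
  const m = str.match(/_(\d+)#/);
  return m ? m[1] : null; // m[1] is the first capture group
}

const inputs = [
  'there_is_a_string_here_1_480#1111111',
  'there_is_a_string_here_1_360#1111111',
  'there_is_a_string_here_1_180#1111111'
];

console.log(inputs.map(extractNumberBeforeHash)); // ["480", "360", "180"]
```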
You almost got it:
_(\d{3})#
You need to match on digits; otherwise the greedy .* starts at the first underscore and most of the string gets selected.
Of course, your match will be in capture group 1.
Try this
var str = "there_is_a_string_here_1_480#1111111";
var matches = str.match(/_\d+#/).map(function(value){return value.substring(1,value.length-1);});
document.body.innerHTML += JSON.stringify(matches,0,4);
Try this:
(?<=_(?!.*_))(.*)(?=#)
Use lookbehind (?<=) and lookahead (?=) so that "_" and "#" are not included in the match.
_(?!.*_) finds the last occurrence of "_".
(.*) matches everything between the last occurrence of "_" and "#".
I hope it helps.
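A sketch of the lookaround approach, assuming the intended pattern anchors on the last underscore, i.e. (?<=_(?!.*_)).*(?=#). The function name is invented, and lookbehind needs an engine with ES2018 support.

```javascript
// Hypothetical helper: returns the text between the last '_' and the '#'.
function lastSegmentBeforeHash(str) {
  // (?<=_(?!.*_))  position right after an underscore that has no later underscore
  // .*(?=#)        everything up to, but not including, '#'
  const m = str.match(/(?<=_(?!.*_)).*(?=#)/);
  return m ? m[0] : null; // the whole match already excludes '_' and '#'
}

console.log(lastSegmentBeforeHash('there_is_a_string_here_1_480#1111111')); // "480"
console.log(lastSegmentBeforeHash('abc#1'));                                // null
```

Unlike the capture-group version, the result here is the full match, so no group index is needed.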

Adding a condition to a regex

Given the Javascript below how can I add a condition to the clause? I would like to add a "space" character after a separator only if a space does not already exist. The current code will result in double-spaces if a space character already exists in spacedText.
var separators = ['.', ',', '?', '!'];
for (var i = 0; i < separators.length; i++) {
var rg = new RegExp("\\" + separators[i], "g");
spacedText = spacedText.replace(rg, separators[i] + " ");
}
'. , ? ! .,?!foo'.replace(/([.,?!])(?! )/g, '$1 ');
//-> ". , ? ! . , ? ! foo"
This means: replace every occurrence of one of .,?! that is not followed by a space with itself plus a space.
I would suggest the following regexp to solve your problem:
"Test!Test! Test.Test 1,2,3,4 test".replace(/([!,.?])(?!\s)/g, "$1 ");
// "Test! Test! Test. Test 1, 2, 3, 4 test"
The regexp matches any character in the character class [!,.?] that is not followed by whitespace (?!\s). The parentheses around the character class mean that the matched separator is captured in the first group, which the replacement string refers to as $1. See this fiddle for a working example.
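The negative-lookahead replacement can be wrapped in a small function. This is a sketch with an invented name; the extra |$ inside the lookahead is an assumption added here so that a separator at the very end of the text does not get a trailing space.

```javascript
// Hypothetical helper: inserts one space after . , ? ! unless whitespace
// (or the end of the string) already follows.
function addSpaceAfterSeparators(text) {
  return text.replace(/([.,?!])(?!\s|$)/g, '$1 ');
}

console.log(addSpaceAfterSeparators('Hi!How are you?Fine.')); // "Hi! How are you? Fine."
console.log(addSpaceAfterSeparators('1,2,3'));                // "1, 2, 3"
console.log(addSpaceAfterSeparators('a, b'));                 // "a, b" (unchanged)
```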
You could do a replace of all the above characters including a space. That way you capture any punctuation together with its trailing space and replace both with a single space. Note that this drops the punctuation itself.
"H1a!. A ?. ".replace(/[.,?! ]+/g, " ")
[.,?! ] is a character class. It will match either ., ,, ?, ! or a space, and + makes it match at least once (and as many times as possible).
spacedText = spacedText.replace(/([\.,!\?])([^\s])/g, "$1 $2")
This means: replace one of these characters ([\.,!\?]) followed by a non-whitespace character ([^\s]) with the first group, a space, and the second group ("$1 $2"). The second group has to be put back, otherwise the character after the separator is lost.
Here is working code:
var nonSpaced= 'Hello World!Which is your favorite number? 10,20,25,30 or other.answer fast.';
var spaced;
var patt = /\b([!\.,\?])+\b/g;
spaced = nonSpaced.replace(patt, '$1 ');
If you console.log the value of spaced, it will be: Hello World! Which is your favorite number? 10, 20, 25, 30 or other. answer fast. Notice that there is only one space after the ? sign, and no extra space after the last full stop. The \b on both sides means a separator is only matched when it sits directly between two word characters.

how to make regular expression match only Cyrillic bulgarian letters

Hello, I want to replace all the letters of the Bulgarian alphabet with an empty string.
I've seen this link:
How to match Cyrillic characters with a regular expression
but it doesn't work for me.
Here is what I've tried
1. var newstr = strInput.replace(/[\p{IsCyrillic}]/gi, '');
doesn't work!
2. var newstr = strInput.replace(/[\p{Letter}]/gi, '');
also nothing
Thanks for the help.
Javascript doesn't support Unicode classes of the form \p{IsCyrillic}.
But, assuming the characters you want to replace are in the Unicode Cyrillic range 0400 - 04FF, you could use:
newstr = strInput.replace( /[\u0400-\u04FF]/gi, '' );
For example:
var strInput = 'уфхцчшщъhelloЁЂЃЄрстыьэю',
newstr = strInput.replace( /[\u0400-\u04FF]/gi, '' );
console.log( newstr ); // 'hello'
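In engines that implement ES2018, Unicode property escapes are available when the u flag is set, so a script-based class is another option there. The \p{IsCyrillic} spelling is still not accepted; the form is \p{Script=Cyrillic}. A minimal sketch, with an invented function name:

```javascript
// Hypothetical helper: removes every character whose Unicode script is Cyrillic.
// Requires the u flag; without it \p is not treated as a property escape.
function removeCyrillic(str) {
  return str.replace(/\p{Script=Cyrillic}/gu, '');
}

console.log(removeCyrillic('уфхцчшщъhelloЁЂЃЄрстыьэю')); // "hello"
```

This covers all Cyrillic letters, not only the ones used in Bulgarian, in the same way the \u0400-\u04FF range does.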
I think that JavaScript RegEx does not support this syntax.
Maybe this will help:
XRegExp
Another way (this one is Java, not JavaScript):
Pattern.compile("[А-я]+", Pattern.UNICODE_CHARACTER_CLASS).matcher(strInput ).replaceAll("") ;
Where [А-я]+ is your alphabet.
