Determine if char is in regex

I'm trying to build a custom mention system using Quill, and I need to figure out how to determine whether a given character is a letter of the alphabet.
Example:
I have this string: Hello my #name is Luis, and this function, which takes the cursor position and scans the word backwards to check whether it contains #:
CheckWord = function(quill, start){
  let at = false, c_char;
  for(var i = start; i > 0; --i){
    c_char = quill.getText(i, 1);
    if (c_char == '#') {
      if (quill.getText(i-1, 1) == ' ' || quill.getText(i-1, 1) == '') {
        at = true;
        break;
      }
    }
    if (c_char == ' ') {
      at = false;
      break;
    }
  }
  return at;
}
Everything is working fine, but in
if (c_char == ' ') {
  at = false;
  break;
}
I need to check whether the character is not a letter of the alphabet (A, B, C, D, etc.).
I know that with a regex like /^[a-zA-Z]+$/ I can achieve what I want, but I don't know how to apply it to check whether a single given character is valid.

You can use RegExp.prototype.test() to test whether the character is a letter (a-z or A-Z); negate the result if you want to match anything other than letters:
if (/[a-z]/i.test(c_char)) {
  at = false;
  break;
}
The i flag in the regex makes the match case-insensitive (think of i as insensitive), i.e. it checks for both lowercase and uppercase letters. Since c_char is a single character, you don't need the ^ (beginning of input) and $ (end of input) anchors in the regex.
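As a minimal sketch of the negated form, assuming the goal is to stop at the first character that is not an ASCII letter (isAsciiLetter and the sample characters are hypothetical, used only for illustration):

```javascript
// Hypothetical helper: true when the (single) character is an ASCII letter.
function isAsciiLetter(ch) {
  return /[a-z]/i.test(ch);
}

// Negating the test detects anything that is NOT a letter,
// e.g. a space, a digit, punctuation, or the empty string.
const samples = ['a', 'Z', ' ', '#', '9', ''];
for (const ch of samples) {
  console.log(JSON.stringify(ch), 'is not a letter:', !isAsciiLetter(ch));
}
```

Inside the loop from the question, the condition would then read if (!/[a-z]/i.test(c_char)), placed after the check for '#'.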

Related

Escaping apostrophes and the like in JavaScript [duplicate]

I want to remove all special characters except space from a string using JavaScript.
For example,
abc's test#s
should output as
abcs tests.

You should use the string replace function, with a single regex.
Assuming that by special characters you mean anything that's not a letter, here is a solution:
const str = "abc's test#s";
console.log(str.replace(/[^a-zA-Z ]/g, ""));

You can do it specifying the characters you want to remove:
string = string.replace(/[&\/\\#,+()$~%.'":*?<>{}]/g, '');
Alternatively, to remove all characters except numbers and letters, try:
string = string.replace(/[^a-zA-Z0-9]/g, '');

The first solution does not work for non-ASCII alphabets (it strips text such as Привіт). I wrote a function that does not use RegExp and instead relies on the JavaScript engine's Unicode case mapping. The idea is simple: if a character is identical in uppercase and lowercase, it is treated as a special character. The only exception is whitespace.
function removeSpecials(str) {
  var lower = str.toLowerCase();
  var upper = str.toUpperCase();
  var res = "";
  for(var i=0; i<lower.length; ++i) {
    if(lower[i] != upper[i] || lower[i].trim() === '')
      res += str[i];
  }
  return res;
}
Update: Please note that this solution works only for scripts that have lowercase and uppercase letters. For languages like Chinese, it won't work.
Update 2: I arrived at the original solution while working on a fuzzy search. If you are also trying to remove special characters to implement search functionality, there is a better approach: use a transliteration library to convert the string to Latin characters only, and then a simple RegExp will do all the work of removing special characters. (This also works for Chinese, and as a side benefit it makes Tromsø == Tromso.)
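In engines that support Unicode property escapes (the u flag with \p{...}), a regex-based variant is also possible. A minimal sketch, assuming "special" means anything that is not a letter, a digit or a space; the name removeSpecialsUnicode is hypothetical:

```javascript
// Hypothetical helper: keep letters (\p{L}) and numbers (\p{N}) of any script,
// plus the plain space character; remove everything else.
function removeSpecialsUnicode(str) {
  return str.replace(/[^\p{L}\p{N} ]/gu, '');
}

console.log(removeSpecialsUnicode("abc's test#s"));   // abcs tests
console.log(removeSpecialsUnicode('Привіт, світе!')); // Привіт світе
console.log(removeSpecialsUnicode('你好, 世界 #1'));    // 你好 世界 1
```

Unlike the case-comparison trick, this keeps digits and caseless scripts such as Chinese. Combining marks (\p{M}) are removed by this pattern, so decomposed accented text may need them added to the class.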

Replace everything that is not a word character or a space (the g flag is needed to replace every match, not just the first):
str.replace(/[^\w ]/g, '')

I don't know JavaScript, but isn't it possible using a regex?
Something like [^\w\d\s] will match anything but letters, digits and whitespace (the \d is redundant, since \w already includes digits). It would just be a question of finding the syntax in JavaScript.

I tried Seagul's very creative solution, but found that it also treated numbers as special characters, which did not suit my needs. So here is my (failsafe) tweak of Seagul's solution:
//return true if char is a number
function isNumber (text) {
  if(text) {
    var reg = new RegExp('[0-9]+$');
    return reg.test(text);
  }
  return false;
}
function removeSpecial (text) {
  if(text) {
    var lower = text.toLowerCase();
    var upper = text.toUpperCase();
    var result = "";
    for(var i=0; i<lower.length; ++i) {
      if(isNumber(text[i]) || (lower[i] != upper[i]) || (lower[i].trim() === '')) {
        result += text[i];
      }
    }
    return result;
  }
  return '';
}

const str = "abc's#thy#^g&test#s";
console.log(str.replace(/[^a-zA-Z ]/g, ""));

Try this one:
var result= stringToReplace.replace(/[^\w\s]/g, '')
[^...] negates the character class, \w matches word characters ([a-zA-Z0-9_]), \s matches whitespace,
and the g flag in /[...]/g makes the replacement global.
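One consequence of \w being [a-zA-Z0-9_] is that underscores survive, and \s keeps tabs and newlines as well as spaces. A small sketch with a made-up input string (the variable names are only for illustration):

```javascript
const stringToReplace = "snake_case\tvalue: 100%!";

// \w keeps letters, digits and the underscore; \s keeps the tab and the space.
const result = stringToReplace.replace(/[^\w\s]/g, '');
console.log(JSON.stringify(result)); // "snake_case\tvalue 100"

// If the underscore should be removed too, spell the class out explicitly.
const noUnderscore = stringToReplace.replace(/[^a-zA-Z0-9\s]/g, '');
console.log(JSON.stringify(noUnderscore)); // "snakecase\tvalue 100"
```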

With a regular expression:
let string = "!#This tool removes $special *characters* /other/ than! digits, characters and spaces!!!$";
var NewString= string.replace(/[^\w\s]/g, '');
console.log(NewString);
// Result: This tool removes special characters other than digits characters and spaces
Live Example : https://helpseotools.com/text-tools/remove-special-characters

A dot (.) may not be considered special. I have added an OR condition to Mozfet's & Seagull's answer:
function isNumber (text) {
  // declared with var so that reg does not leak into the global scope
  var reg = new RegExp('[0-9]+$');
  if(text) {
    return reg.test(text);
  }
  return false;
}
function removeSpecial (text) {
  if(text) {
    var lower = text.toLowerCase();
    var upper = text.toUpperCase();
    var result = "";
    for(var i=0; i<lower.length; ++i) {
      if(isNumber(text[i]) || (lower[i] != upper[i]) || (lower[i].trim() === '') || (lower[i].trim() === '.')) {
        result += text[i];
      }
    }
    return result;
  }
  return '';
}

Try this:
const strippedString = htmlString.replace(/(<([^>]+)>)/gi, "");
console.log(strippedString);

const input = `#if_1 $(PR_CONTRACT_END_DATE) == '23-09-2019' #
Test27919<alerts#imimobile.com> #elseif_1 $(PR_CONTRACT_START_DATE) == '20-09-2019' #
Sender539<rama.sns#gmail.com> #elseif_1 $(PR_ACCOUNT_ID) == '1234' #
AdestraSID<hello#imimobile.co> #else_1#Test27919<alerts#imimobile.com>#endif_1#`;
const replaceString = input.split('$(').join('->').split(')').join('<-');
console.log(replaceString.match(/(?<=->).*?(?=<-)/g));

Prepare a list of the special characters you want to remove from the string, and then use the JavaScript replace function to remove them.
var str = "abc'de#;:sfjkewr47239847duifyh";
alert(str.replace("'","").replace("#","").replace(";","").replace(":",""));
Note that replace with a string pattern removes only the first occurrence of each character. Alternatively, you can loop over the whole string, compare each character against its ASCII code, and build a new string.
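The loop-based alternative, comparing each character's code and rebuilding the string, could be sketched as follows. This assumes only ASCII letters and digits should be kept; keepAsciiAlphanumerics is a hypothetical name:

```javascript
// Hypothetical helper: rebuild the string from characters whose code
// falls in the ASCII ranges for digits, uppercase or lowercase letters.
function keepAsciiAlphanumerics(str) {
  let out = '';
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    const isDigit = code >= 48 && code <= 57;   // '0'..'9'
    const isUpper = code >= 65 && code <= 90;   // 'A'..'Z'
    const isLower = code >= 97 && code <= 122;  // 'a'..'z'
    if (isDigit || isUpper || isLower) {
      out += str[i];
    }
  }
  return out;
}

console.log(keepAsciiAlphanumerics("abc'de#;:sfjkewr47239847duifyh"));
// abcdesfjkewr47239847duifyh
```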

Regex remove duplicate adjacent characters in javascript

I've been struggling to get my regex function to work as intended. My goal is to iterate over a string repeatedly (until no match is found) and remove all duplicate adjacent characters. Besides checking that two adjacent characters are equal, the regex should only remove the match when exactly one of the pair is uppercase.
e.g. the regex should only remove 'Xx' or 'xX'.
My current regex only removes matches where a lowercase character is followed by any uppercase character.
(.)(([a-z]{0})+[A-Z])
How can I implement looking for the same adjacent character and the pattern of looking for an uppercase character followed by an equal lowercase character?

You'd either have to list out all possible combinations, e.g.
aA|Aa|bB|Bb...
Or implement it more programmatically, without regex:
let str = 'fooaBbAfoo';
outer:
while (true) {
  for (let i = 0; i < str.length - 1; i++) {
    const thisChar = str[i];
    const nextChar = str[i + 1];
    if (/[a-z]/i.test(thisChar) && thisChar.toUpperCase() === nextChar.toUpperCase() && thisChar !== nextChar) {
      str = str.slice(0, i) + str.slice(i + 2);
      continue outer;
    }
  }
  break;
}
console.log(str);
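The loop above rescans the string from the start after every removal. A single-pass alternative is to use a stack, a different technique from the one in the answer but applying the same pairing rule. A minimal sketch, with removeCasePairs as a hypothetical name:

```javascript
// Hypothetical helper: remove adjacent pairs like 'xX' or 'Xx' in one pass.
// Each character either cancels the character on top of the stack or is pushed.
function removeCasePairs(str) {
  const stack = [];
  for (const ch of str) {
    const top = stack[stack.length - 1];
    const isPair =
      top !== undefined &&
      /[a-z]/i.test(ch) &&                      // only ASCII letters can pair
      top !== ch &&                             // must differ in case
      top.toLowerCase() === ch.toLowerCase();   // but be the same letter
    if (isPair) {
      stack.pop();   // removing the pair may expose a new pair below it
    } else {
      stack.push(ch);
    }
  }
  return stack.join('');
}

console.log(removeCasePairs('fooaBbAfoo')); // foofoo
```

Because popping exposes the previous character again, newly adjacent pairs (such as aA after Bb is removed) are handled without restarting.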

Looking for the same adjacent character: /(.)\1/
Looking for an uppercase character followed by the same character in lowercase (or vice versa) isn't possible in JavaScript, since it doesn't support inline modifiers. If it did, the regex would be /(.)(?!\1)(?i:\1)/, which matches both 'xX' and 'Xx'.

Clarification of a specific regex

I attempted the CoderByte - Simple Symbols - challenge in JavaScript. From CoderByte:
Using the JavaScript language, have the function SimpleSymbols(str)
take the str parameter being passed and determine if it is an
acceptable sequence by either returning the string true or false. The
str parameter will be composed of + and = symbols with several letters
between them (ie. ++d+===+c++==a) and for the string to be true each
letter must be surrounded by a + symbol. So the string to the left
would be false. The string will not be empty and will have at least
one letter.
My solution:
function simpleSymbols(str) {
  var isSymbol = true;
  var output = " ";
  var symbol = " ";
  if (str.match(/[a-zA-Z]/).length != 0) {
    for (var i = 0; i <= str.length - 1; i++) {
      if ((str.charAt(i) >= 'A' && str.charAt(i) <= 'Z') ||
          (str.charAt(i) >= 'a' && str.charAt(i) <= 'z')) {
        if (i != str.length - 1) {
          symbol = str[--i] + str[++i] + str[++i];
          var rgx = new RegExp(/\+[a-zA-Z]\+/);
          if (!(rgx.test(symbol))) {
            isSymbol = false;
            break;
          }
        }
        else {
          isSymbol = false;
          break;
        }
      }
    }
  }
  else {
    isSymbol = false;
  }
  return isSymbol;
}
This worked fine for all test cases.
On reviewing code of other submissions, I came across a submission which required only a single line of code:
return ('=' + str + '=').match(/([^\+][a-z])|([a-z][^\+])/gi) === null;
I'm having trouble understanding how the RegEx used here works. Theoretically, I understand:
g modifier => checks for all matches
i modifier => case-insensitive checking
a-z => checks the string contains only letters
\+ => refers to the plus sign
| => match either alternative1 OR alternative2
Thus, if referring to the above, I understand that there are two match conditions:
([^\+][a-z])
([a-z][^\+])
So, for a test input such as "+x+y+z+", am I correct in understanding that it checks matches as follows: +x => x+ => +y => y+ => +z => z+
Further clarification on this RegEx would be really helpful.
Thanks.

https://regex101.com/ is your friend!
Technically, you are right about what you have said.
[^+] matches everything BUT the plus sign. So the regex says "if there is a letter that is not preceded by a plus, or a letter that is not followed by a plus, report a match".
But because of the "=== null", the function returns true only if the regex has not found anything.

[^\+] means any character that's not a plus. [] is a character class, and a ^ at the beginning inside a character class means negate/not. So the first alternative asks "does this string contain any character that's not a plus, followed by a letter a-z?", which would mean the string doesn't follow the rules.
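To see the one-liner in action, here is a small sketch that wraps it in a function and runs it on a few inputs. The third input is made up for illustration; the wrapping '=' characters give a letter at either end of the string a non-plus neighbour to be caught by:

```javascript
function simpleSymbols(str) {
  // Any "non-plus then letter" or "letter then non-plus" pair is a violation;
  // match() returns null only when no such pair exists.
  return ('=' + str + '=').match(/([^\+][a-z])|([a-z][^\+])/gi) === null;
}

console.log(simpleSymbols('+x+y+z+'));        // true
console.log(simpleSymbols('++d+===+c++==a')); // false: the trailing "a" is preceded by "="
console.log(simpleSymbols('+d+=3=+s+'));      // true
```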

Detecting if a character is a letter

Given a set of words, I need to put them in a hash keyed on the first letter of the word.
I have words = {}, with keys A..Z and 0 for numbers and symbols.
I was doing something like
var firstLetter = name.charAt(0);
firstLetter = firstLetter.toUpperCase();
if (firstLetter < "A" || firstLetter > "Z") {
  firstLetter = "0";
}
if (words[firstLetter] === undefined) {
  words[firstLetter] = [];
}
words[firstLetter].push(name);
but this fails with dieresis and other chars, like in the word Ärzteversorgung.
That word is put in the "0" array, how could I put it in the "A" array?

You can use this to test if a character is likely to be a letter:
var firstLetter = name.charAt(0).toUpperCase();
if( firstLetter.toLowerCase() != firstLetter) {
  // it's a letter
}
else {
  // it's a symbol
}
This works because JavaScript already has a mapping for lowercase to uppercase letters (and vice versa), so if a character is unchanged by toLowerCase() then it's not in the letter table.

Try converting the character to its uppercase and lowercase and check to see if there's a difference. Only letter characters change when they are converted to their respective upper and lower case (numbers, punctuation marks, etc. don't). Below is a sample function using this concept in mind:
function isALetter(charVal)
{
  if( charVal.toUpperCase() != charVal.toLowerCase() )
    return true;
  else
    return false;
}
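A quick sketch of how this case-comparison test behaves on a few sample characters (the samples are chosen only for illustration). Note the limitation: characters from scripts without letter case, such as Chinese, are reported as non-letters.

```javascript
// Condensed form of the same idea: a character is treated as a letter
// when its uppercase and lowercase forms differ.
function isALetter(charVal) {
  return charVal.toUpperCase() !== charVal.toLowerCase();
}

for (const ch of ['a', 'Ä', 'я', '7', '#', '中']) {
  console.log(ch, isALetter(ch));
}
// a true, Ä true, я true, 7 false, # false, 中 false
```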

You could use a regular expression. Unfortunately, JavaScript does not consider international characters to be "word characters" (\w only matches [A-Za-z0-9_]). But you can list them explicitly in a character class, as in the regular expression below:
var firstLetter = name.charAt(0);
firstLetter = firstLetter.toUpperCase();
// the square brackets make this a character class: exactly one of the listed characters
if (!firstLetter.match(/^[\wÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]$/)) {
  firstLetter = "0";
}
if (words[firstLetter] === undefined) {
  words[firstLetter] = [];
}
words[firstLetter].push(name);

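You can use .charCodeAt(0) to get the character's code and then do some range checks. Note that charCodeAt returns a UTF-16 code unit, which agrees with the ASCII chart only up to 127.
The ranges you are looking for are probably 65-90 and 97-122 for the ASCII letters, plus 192-214, 216-246 and 248-255 for the accented Latin-1 letters (inclusive), but double check this against a Unicode chart.
Something like this:
if((x>64&&x<91)||(x>96&&x<123)||(x>191&&x<215)||(x>215&&x<247)||(x>247&&x<256))
Where x is the char code.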

This is fortunately now possible without external libraries. Straight from the docs:
let story = "It’s the Cheshire Cat: now I shall have somebody to talk to.";
// Most explicit form
story.match(/\p{General_Category=Letter}/gu);
// It is not mandatory to use the property name for General categories
story.match(/\p{Letter}/gu);
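A property escape tells you whether a character is a letter, but not which bucket it belongs in. To file a word like Ärzteversorgung under "A", one option is Unicode normalization, which is a different technique from the answers above. A minimal sketch, assuming buckets A..Z plus "0" as in the question; bucketKey and groupByFirstLetter are hypothetical names:

```javascript
// Hypothetical helper: NFD normalization splits "Ä" into "A" plus a combining
// diaeresis, so the first code unit is the base letter.
function bucketKey(name) {
  const first = name.normalize('NFD').charAt(0).toUpperCase();
  return /^[A-Z]$/.test(first) ? first : '0';
}

// Hypothetical helper: group names into arrays keyed by bucketKey.
function groupByFirstLetter(names) {
  const words = {};
  for (const name of names) {
    const key = bucketKey(name);
    if (words[key] === undefined) {
      words[key] = [];
    }
    words[key].push(name);
  }
  return words;
}

console.log(groupByFirstLetter(['Ärzteversorgung', 'apple', '9lives']));
// { A: [ 'Ärzteversorgung', 'apple' ], '0': [ '9lives' ] }
```

Letters that have no canonical decomposition, such as Ø, still end up in the "0" bucket with this approach and would need an explicit mapping.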
