Detecting if a character is a letter - javascript

Given a set of words, I need to put them in a hash keyed on the first letter of each word.
I have words = {}, with keys A..Z and 0 for numbers and symbols.
I was doing something like
var firstLetter = name.charAt(0);
firstLetter = firstLetter.toUpperCase();
if (firstLetter < "A" || firstLetter > "Z") {
firstLetter = "0";
}
if (words[firstLetter] === undefined) {
words[firstLetter] = [];
}
words[firstLetter].push(name);
but this fails with dieresis and other chars, like in the word Ärzteversorgung.
That word is put in the "0" array, how could I put it in the "A" array?

You can use this to test if a character is likely to be a letter:
var firstLetter = name.charAt(0).toUpperCase();
if( firstLetter.toLowerCase() != firstLetter) {
// it's a letter
}
else {
// it's a symbol
}
This works because JavaScript already has a mapping for lowercase to uppercase letters (and vice versa), so if a character is unchanged by toLowerCase() then it's not in the letter table.

Try converting the character to its uppercase and lowercase and check to see if there's a difference. Only letter characters change when they are converted to their respective upper and lower case (numbers, punctuation marks, etc. don't). Below is a sample function using this concept in mind:
function isALetter(charVal)
{
if( charVal.toUpperCase() != charVal.toLowerCase() )
return true;
else
return false;
}
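A slightly fuller sketch of the case-comparison test, run on a few inputs that show where it holds and where it does not. The name `hasCase` is made up for this example. Characters from scripts without upper and lower case (Chinese, for instance) are reported as non-letters, so treat it as a heuristic.

```javascript
// Hypothetical helper: true when the character has distinct upper and lower case forms.
function hasCase(ch) {
  return ch.toUpperCase() !== ch.toLowerCase();
}

console.log(hasCase("a"));  // true
console.log(hasCase("Ä"));  // true
console.log(hasCase("7"));  // false
console.log(hasCase("#"));  // false
console.log(hasCase("中")); // false, although it is a letter
```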

You could use a regular expression. Unfortunately, JavaScript does not consider international characters to be "word characters". But you can do it with the regular expression below:
var firstLetter = name.charAt(0);
firstLetter = firstLetter.toUpperCase();
if (!firstLetter.match(/^[\wÀÈÌÒÙàèìòùÁÉÍÓÚÝáéíóúýÂÊÎÔÛâêîôûÃÑÕãñõÄËÏÖÜäëïöüçÇßØøÅåÆæÞþÐð]$/)) {
firstLetter = "0";
}
if (words[firstLetter] === undefined) {
words[firstLetter] = [];
}
words[firstLetter].push(name);
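The regular expression above decides whether the first character counts as a letter, but Ä still ends up under its own key rather than under "A". One possible way to fold accented letters onto their base letter is Unicode normalization, which is a different technique from the answers here. The sketch below assumes an engine that supports String.prototype.normalize; `bucketKey` and `groupByFirstLetter` are names invented for the example. Letters that do not decompose into a base letter plus a combining mark (Ø or Æ, for instance) still land in "0".

```javascript
// Hypothetical helper: map a word to its A..Z bucket, or "0" for everything else.
function bucketKey(name) {
  // NFD splits "Ä" into "A" followed by U+0308 (combining diaeresis),
  // so the first code unit is the unaccented base letter.
  var first = name.normalize("NFD").charAt(0).toUpperCase();
  return /^[A-Z]$/.test(first) ? first : "0";
}

// Hypothetical helper: build the hash described in the question.
function groupByFirstLetter(names) {
  var words = {};
  names.forEach(function (name) {
    var key = bucketKey(name);
    if (words[key] === undefined) {
      words[key] = [];
    }
    words[key].push(name);
  });
  return words;
}

var grouped = groupByFirstLetter(["Ärzteversorgung", "apple", "42nd Street", "Zebra"]);
console.log(grouped.A);    // the two words filed under "A"
console.log(grouped["0"]); // the word that starts with a digit
```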

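You can use .charCodeAt(0) to get the character's code and then do some range checks. Note that charCodeAt returns a UTF-16 code unit, not a position in an extended-ASCII code page.
The ranges you are looking for are probably 65-90 and 97-122 for the ASCII letters, plus 192-214, 216-246 and 248-255 (inclusive) for the accented Latin-1 letters, but double check this against a Unicode chart.
Something like this
if((x>64&&x<91)||(x>96&&x<123)||(x>191&&x<215)||(x>215&&x<247)||(x>247&&x<256))
Where x is the char code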

This is fortunately now possible without external libraries. Straight from the docs:
let story = "It’s the Cheshire Cat: now I shall have somebody to talk to.";
// Most explicit form
story.match(/\p{General_Category=Letter}/gu);
// It is not mandatory to use the property name for General categories
story.match(/\p{Letter}/gu);

Related

Regular Expressions in HTML

I hope you are well. As you can see, I have written code that transcribes every common letter (from "a" to "z") into a symbol. How can I transcribe the letters "th" together into a different symbol? I do not want the app to translate "t" and "h" separately, but together. How can I do that? Thank you so much!
var theInput = txtBr.value.toLowerCase();
for (var i = 0; i < theInput.length; i++)
{
var letter = theInput.charAt( i );
if( letter.match(/[a-z\s]/i) ) {
var symbol = map[ letter ];
txtarea.innerHTML += symbol;
}
}
string.search() will return the position of the substring you are searching for.
let string = "nspoiuthpiifs";
let position = string.search("th"); // will return 6
You could then split the string on "th" with string.split("th") and rejoin the pieces with array.join(symbol), which puts the symbol you want wherever "th" appeared.
Note: string.search() will return only the position of the first match it finds, so if you work with positions you may want to repeat the search until there are none left.
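Another option, sketched here as an alternative to search/split/join, is a single regular expression that tries the two-letter sequence before the single letters. The symbol table below is invented for illustration, since the `map` from the question is not shown, and `transcribe` is a made-up name.

```javascript
// Hypothetical symbol table; the real map from the question is not shown.
var map = { t: "τ", h: "η", e: "ε", th: "θ", " ": " " };

function transcribe(input) {
  // "th" comes first in the alternation, so it wins wherever both could match.
  return input.toLowerCase().replace(/th|[a-z\s]/g, function (chunk) {
    // Characters without an entry in the table are passed through unchanged.
    return map[chunk] !== undefined ? map[chunk] : chunk;
  });
}

console.log(transcribe("The teeth")); // "θε τεεθ"
```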

Replacing every instance of a word in any form with JS

I am developing a mini chat program and want to redact profanity. How can I replace the word, for example, "replace" in any form like "rePlAce" or "RepLACe" without using .toLowerCase() or something of that sort?
I'm sorry if this is a poor question. I will try to edit it to become better at your suggestions.
The replace method for strings accepts a regular expression pattern. You can use the case insensitive modifier to ignore case differences in the search text.
let userMessage = "Please rePlAce me.";
let pattern = /replace/i;
userMessage = userMessage.replace(pattern, "******");
console.log(userMessage); // "Please ****** me.";
Note: the replace() method only replaces the first occurrence by default. You can include the global modifier to replace all instances.
let userMessage = "Please rePlAce this instance and repLAce this instance as well.";
let pattern = /replace/ig;
userMessage = userMessage.replace(pattern, "******");
console.log(userMessage); // "Please ****** this instance and ****** this instance as well."
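For a chat filter with several banned words, the pattern could be built from a list. The following is a sketch under a few assumptions: `redact` and `escapeRegExp` are hypothetical helper names, the word list is supplied by the caller, and `\b` word boundaries are used so that only whole words are masked (so "replaced" would be left alone).

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: mask every banned word, ignoring case.
function redact(message, bannedWords) {
  var alternatives = bannedWords.map(escapeRegExp).join("|");
  var pattern = new RegExp("\\b(?:" + alternatives + ")\\b", "gi");
  return message.replace(pattern, function (match) {
    // One asterisk per character of the matched word.
    return "*".repeat(match.length);
  });
}

console.log(redact("Please rePlAce this and REPLACE that.", ["replace"]));
// "Please ******* this and ******* that."
```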
Own toLowerCase() Function
To convert to lowercase you can write your own function.
Check whether the ASCII value of each letter is in the range of the uppercase letters, 65-90.
If it is, then add an offset of 32 to get the corresponding lowercase letter.
ReplaceAll
Then convert your search word and the sentence to lowercase and replace all the occurrences with replaceAll().
let temper = "HoW";
function convertToLowerCase(str) {
let result = '';
for (let i = 0; i < str.length; i++) {
var code = str.charCodeAt(i);
if (code > 64 && code < 91) {
result += String.fromCharCode(code + 32);
} else {
result += str.charAt(i);
}
}
return result;
}
// convert your string to lowercase
let result = convertToLowerCase(temper);
// then use the replace method of string
let sentence = convertToLowerCase("Hello How Are how You HoW");
console.log(sentence.replaceAll(result, "new"));

Escaping apostrophes and the like in JavaScript [duplicate]

I want to remove all special characters except space from a string using JavaScript.
For example,
abc's test#s
should output as
abcs tests.
You should use the string replace function, with a single regex.
Assuming by special characters, you mean anything that's not letter, here is a solution:
const str = "abc's test#s";
console.log(str.replace(/[^a-zA-Z ]/g, ""));
You can do it specifying the characters you want to remove:
string = string.replace(/[&\/\\#,+()$~%.'":*?<>{}]/g, '');
Alternatively, to remove all characters except numbers and letters, try:
string = string.replace(/[^a-zA-Z0-9]/g, '');
The first solution does not work for non-Latin alphabets (it will strip text such as Привіт). I have managed to create a function which does not use RegExp and relies on the good Unicode support in the JavaScript engine. The idea is simple: if a symbol is the same in uppercase and lowercase, it is a special character. The only exception is made for whitespace.
function removeSpecials(str) {
var lower = str.toLowerCase();
var upper = str.toUpperCase();
var res = "";
for(var i=0; i<lower.length; ++i) {
if(lower[i] != upper[i] || lower[i].trim() === '')
res += str[i];
}
return res;
}
Update: Please note that this solution works only for languages that have small and capital letters. In languages like Chinese, this won't work.
Update 2: I came to the original solution when I was working on a fuzzy search. If you are also trying to remove special characters to implement search functionality, there is a better approach. Use any transliteration library that produces a string made only of Latin characters, and then a simple RegExp will do all the magic of removing special characters. (This will work for Chinese too, and you also get the side benefit of making Tromsø == Tromso.)
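Where the engine supports Unicode property escapes, a regular expression can also cover caseless scripts without a transliteration library. This is a sketch of that alternative, not the approach described above; `removeSpecialsUnicode` is a made-up name, and the choice to keep letters, numbers and the plain space is an assumption about what counts as "special". Combining marks are not in the kept set, so decomposed accents would be stripped.

```javascript
// Hypothetical variant: keep Unicode letters (\p{L}), numbers (\p{N}) and spaces.
function removeSpecialsUnicode(str) {
  return str.replace(/[^\p{L}\p{N} ]/gu, "");
}

console.log(removeSpecialsUnicode("abc's test#s"));  // "abcs tests"
console.log(removeSpecialsUnicode("Привіт, світ!")); // "Привіт світ"
console.log(removeSpecialsUnicode("你好, 世界"));     // "你好 世界"
```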
Search for everything that is not a word character or a space (the g flag is needed to replace every match, not just the first):
str.replace(/[^\w ]/g, '')
I don't know JavaScript, but isn't it possible using regex?
Something like [^\w\d\s] will match anything but digits, characters and whitespaces. It would be just a question to find the syntax in JavaScript.
I tried Seagul's very creative solution, but found it treated numbers also as special characters, which did not suit my needs. So here is my (failsafe) tweak of Seagul's solution...
//return true if char is a number
function isNumber (text) {
if(text) {
var reg = new RegExp('[0-9]+$');
return reg.test(text);
}
return false;
}
function removeSpecial (text) {
if(text) {
var lower = text.toLowerCase();
var upper = text.toUpperCase();
var result = "";
for(var i=0; i<lower.length; ++i) {
if(isNumber(text[i]) || (lower[i] != upper[i]) || (lower[i].trim() === '')) {
result += text[i];
}
}
return result;
}
return '';
}
const str = "abc's#thy#^g&test#s";
console.log(str.replace(/[^a-zA-Z ]/g, ""));
Try to use this one
var result= stringToReplace.replace(/[^\w\s]/g, '')
[^...] negates the character class, \w matches the word characters [a-zA-Z0-9_], \s matches whitespace, and the g flag after the closing / makes the replacement global.
With regular expression
let string = "!#This tool removes $special *characters* /other/ than! digits, characters and spaces!!!$";
var NewString= string.replace(/[^\w\s]/gi, '');
console.log(NewString);
Result //This tool removes special characters other than digits characters and spaces
Live Example : https://helpseotools.com/text-tools/remove-special-characters
dot (.) may not be considered special. I have added an OR condition to Mozfet's & Seagull's answer:
function isNumber (text) {
var reg = new RegExp('[0-9]+$');
if(text) {
return reg.test(text);
}
return false;
}
function removeSpecial (text) {
if(text) {
var lower = text.toLowerCase();
var upper = text.toUpperCase();
var result = "";
for(var i=0; i<lower.length; ++i) {
if(isNumber(text[i]) || (lower[i] != upper[i]) || (lower[i].trim() === '') || (lower[i].trim() === '.')) {
result += text[i];
}
}
return result;
}
return '';
}
Try this:
const strippedString = htmlString.replace(/(<([^>]+)>)/gi, "");
console.log(strippedString);
const input = `#if_1 $(PR_CONTRACT_END_DATE) == '23-09-2019' #
Test27919<alerts#imimobile.com> #elseif_1 $(PR_CONTRACT_START_DATE) == '20-09-2019' #
Sender539<rama.sns#gmail.com> #elseif_1 $(PR_ACCOUNT_ID) == '1234' #
AdestraSID<hello#imimobile.co> #else_1#Test27919<alerts#imimobile.com>#endif_1#`;
const replaceString = input.split('$(').join('->').split(')').join('<-');
console.log(replaceString.match(/(?<=->).*?(?=<-)/g));
Prepare a list of the special characters you want to remove from the string, then use the JavaScript replace function to remove them. Note that replace() with a string pattern removes only the first occurrence of each character.
var str = "abc'de#;:sfjkewr47239847duifyh";
alert(str.replace("'","").replace("#","").replace(";","").replace(":",""));
Or you can loop over the whole string, compare each character's code against the ASCII ranges you want to keep, and build a new string.

Trouble with Javascript easy coderbyte challenge

I'm attempting to answer this question:
Using the JavaScript language, have the function SimpleSymbols(str) take the str parameter being passed and determine if it is an acceptable sequence by either returning the string true or false. The str parameter will be composed of + and = symbols with several letters between them (ie. ++d+===+c++==a) and for the string to be true each letter must be surrounded by a + symbol. So the string to the left would be false. The string will not be empty and will have at least one letter.
Here's my solution:
function SimpleSymbols(str) {
var test;
for (var i =0; i<str.length; i++){
if ((str.charAt(i)!== '+' && str.charAt(i+1) === str.match(/[a-z]/))
||(str.charAt(i+1) === str.match(/[a-z]/) && str.charAt(i+2) !== '+')){
test = false;
break;
}
else if (str.charAt(0) === str.match(/[a-z]/)){
test = false;
break;}
else {
test= true;}
}
return test;
};
I think you can just use two regexes and then compare the lengths of the arrays they return:
function SimpleSymbols(str){
var letters = str.match(/[a-z]/g) || [];
var surrounded = str.match(/\+[a-z](?=\+)/g) || [];
return letters.length === surrounded.length;
}
The first regex, /[a-z]/g, matches all the letters, and /\+[a-z](?=\+)/g matches all the letters which are preceded and followed by a literal +. The lookahead (?=\+) leaves the trailing + unconsumed, so it can also serve as the leading + of the next letter, as in +a+b+. The || [] guards against match() returning null when nothing matches.
Then we just compare the two lengths and return the Boolean result. As simple as that.
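For comparison, the loop from the question can be made to work by testing each character with a regex instead of comparing it to the result of str.match(), which is an array and never === a single character. A minimal sketch, using the invented name `simpleSymbolsLoop` and returning a Boolean rather than the string the challenge text mentions:

```javascript
function simpleSymbolsLoop(str) {
  for (var i = 0; i < str.length; i++) {
    if (/[a-z]/i.test(str.charAt(i))) {
      // charAt returns "" for out-of-range indexes, so a letter at either
      // end of the string fails this check without special handling.
      if (str.charAt(i - 1) !== "+" || str.charAt(i + 1) !== "+") {
        return false;
      }
    }
  }
  return true;
}

console.log(simpleSymbolsLoop("++d+===+c++==a")); // false
console.log(simpleSymbolsLoop("+d+=3=+s+"));      // true
```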
