Convert camel case to sentence case in javascript - javascript

I found myself needing to do camel case to sentence case string conversion with sane acronym support, a google search for ideas led me to the following SO post:
Convert camelCaseText to Sentence Case Text
Which is actually asking about title case not sentence case so I came up with the following solution which maybe others will find helpful or can offer improvements to, it is using ES6 which is acceptable for me and can easily be polyfilled if there's some horrible IE requirement.

The below uses capitalised notation for acronyms; I don't agree with Microsoft's recommendation of capitalising when more than two characters so this expects the whole acronym to be capitalised even if it's at the start of the string (which technically means it's not camel case but it gives sane controllable output), multiple consecutive acronyms can be escaped with _ (e.g. parseDBM_MXL -> Parse DBM XML).
function camelToSentenceCase(str) {
return str.split(/([A-Z]|\d)/).map((v, i, arr) => {
// If first block then capitalise 1st letter regardless
if (!i) return v.charAt(0).toUpperCase() + v.slice(1);
// Skip empty blocks
if (!v) return v;
// Underscore substitution
if (v === '_') return " ";
// We have a capital or number
if (v.length === 1 && v === v.toUpperCase()) {
const previousCapital = !arr[i-1] || arr[i-1] === '_';
const nextWord = i+1 < arr.length && arr[i+1] && arr[i+1] !== '_';
const nextTwoCapitalsOrEndOfString = i+3 > arr.length || !arr[i+1] && !arr[i+3];
// Insert space
if (!previousCapital || nextWord) v = " " + v;
// Start of word or single letter word
if (nextWord || (!previousCapital && !nextTwoCapitalsOrEndOfString)) v = v.toLowerCase();
}
return v;
}).join("");
}
// ----------------------------------------------------- //
var testSet = [
'camelCase',
'camelTOPCase',
'aP2PConnection',
'JSONIsGreat',
'thisIsALoadOfJSON',
'parseDBM_XML',
'superSimpleExample',
'aGoodIPAddress'
];
testSet.forEach(function(item) {
console.log(item, '->', camelToSentenceCase(item));
});

Related

Parse custom string to array Javascript

So I'm trying to parse specific sequence of string in JavaScript. Basically i need to parse (123 abc)(()) where () is like openning array and closing, and values in the array would be split by spaces.
For example:
(133 abs)(()) will give me [["123", "abs"],[[]]]
No JQuery or ragex,only vanilla JavaScript
i tried doing but it errors with recursion error, can't really send the try due it was on another computer
Does this do what you need? No RegExp. No JQuery.
This splits the input string into codepoints and recursively enumerates that array.
The base recursive step is to check if we have reached the end of the codepoint array, at which point we exit the recursion.
For each codepoint we use a switch statement to choose the value to insert into the result depending on the current codepoint, the previous codepoint, and whether we are inside an "item".
Codepoint by codepoint a result string is constructed. This code will probably break for Unicode strings containing grapheme clusters.
const transform = (str) => {
let result = []
let withinItem = false
const arr = [...str] // split into codepoints
const next = (arr, pos, prev = null) => {
if(pos >= arr.length) return
let curr = arr[pos]
switch(true) {
case curr === '(':
prev === ')' && result.push(',')
result.push('[')
break
case curr === ' ':
result.push('", ')
withinItem = false
break
case curr === ')':
withinItem && result.push('"')
result.push(']')
withinItem = false
break
default:
!withinItem && result.push('"')
result.push(curr)
withinItem = true
break
}
next(arr, ++pos, curr)
}
next(arr, 0)
return `[${result.join('')}]`
}
console.log(transform('(133 abs)(())'))

Splitting word into syllables in javascript

My intention is to build a simple process with which I can split the word into syllables. The approach is to split the word whenever the vowel occurs. However, the trouble is when a consonant is not followed by a vowel, in such a case the split occurs at that consonant.
My test cases are as follows:
hair = ["hair"]
hairai = ["hai", "rai"]
hatred = ["hat", "red"]
In the first example hair is one syllable, as the final consonant is not followed by a vowel, similarly, in the final example, the "t" is followed by an r and so should considered along "ha" as one syllable.
In the second example, ai is considered as one vowel sound and so hai will become one syllable.
More examples include
father = ["fat", "her"]
kid = ["kid"]
lady = ["la","dy"]
Please note that, I am using simplistic examples as the ENglish language is quite complex when it comes to sound
My code is as follows
function syllabify(input) {
var arrs = [];
for (var i in input) {
var st = '';
var curr = input[i];
var nxt = input[i + 1];
if ((curr == 'a') || (curr == 'e') || (curr == 'i') || (curr == 'o') || (curr == 'u')) {
st += curr;
} else {
if ((nxt == 'a') || (nxt == 'e') || (nxt == 'i') || (nxt == 'o') || (nxt == 'u')) {
st += nxt;
} else {
arrs.push(st);
st = '';
}
}
}
console.log(arrs);
}
syllabify('hatred')
However, my code does not even return the strings. What am I doing wrong?
Problems with your current approach
There are a number of problems with your code:
First thing in the loop, you set st to an empty string. This means that you never accumulate any letters. You probably want that line above, outside the loop.
You are trying to loop over the indexes of letters by using i in input. In JavaScript, the in keyword gives you the keys of an object as strings. So you get strings, not numbers, plus the names of some methods defined on strings. Try var i = 0; i < input.length; i++ instead.
Maybe not the direct cause of the problems, but still - your code is messy. How about these?
Use clearer names. currentSyllable instead of st, syllables instead of arrs and so on.
Instead of a nested if - else, use one if - else if - else.
You repeat the same code that checks for vowels twice. Separate it into a function isVowel(letter) instead.
A new approach
Use regular expressions! Here is your definition of a syllable expressed in regex:
First, zero or more consonants: [^aeiouy]*
Then, one or more vowels: [aeiouy]+
After that, zero or one of the following:
Consonants, followed by the end of the word: [^aeiouy]*$
A consonant (if it is followed by another consonant): [^aeiouy](?=[^aeiouy])
Taken together you get this:
/[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi
You can see it in action here. To run it in JavaScript, use the match function:
const syllableRegex = /[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi;
function syllabify(words) {
return words.match(syllableRegex);
}
console.log(['away', 'hair', 'halter', 'hairspray', 'father', 'lady', 'kid'].map(syllabify))
Note that this does not work for words without vowels. You would either have to modify the regex to accomodate for that case, or do some other workaround.
I am weak in the ways of RegEx and while Anders example is right most of the time, I did find a few exceptions. Here is what I have found to work so far (but I am sure there are other exceptions I have not found yet). I am sure it can be RegEx'ified by masters of the art. This function returns an array of syllables.
function getSyllables(word){
var response = [];
var isSpecialCase = false;
var nums = (word.match(/[aeiou]/gi) || []).length;
//debugger;
if (isSpecialCase == false && (word.match(/[0123456789]/gi) || []).length == word.length ){
// has digits
response.push(word);
isSpecialCase = true;
}
if (isSpecialCase == false && word.length < 4){
// three letters or less
response.push(word);
isSpecialCase = true;
}
if (isSpecialCase == false && word.charAt(word.length-1) == "e"){
if (isVowel(word.charAt(word.length-2)) == false){
var cnt = (word.match(/[aeiou]/gi) || []).length;
if (cnt == 3){
if (hasDoubleVowels(word)){
// words like "piece, fleece, grease"
response.push(word);
isSpecialCase = true;
}
}
if (cnt == 2){
// words like "phase, phrase, blaze, name",
if (hasRecurringConsonant(word) == false) {
// but not like "syllable"
response.push(word);
isSpecialCase = true;
}
}
}
}
if (isSpecialCase == false){
const syllableRegex = /[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi;
response = word.match(syllableRegex);
}
return response;
}

JS str replace Unicode aware

I am sure there is probably a dupe of this here somewhere, but if so I cannot seem to find it, nor can I glue the pieces together correctly from what I could find to get what I need. I am using JavaScript and need the following:
1) Replace the first character of a string with it's Unicode aware capitalization UNLESS the next (second) character is a - OR ` or ' (minus/dash, caret, or single-quote).
I have come close with what I could find except for getting the caret and single quote included (assuming they need to be escaped somehow) and what I believe to be a scope issue with the following because first returns undefined. I am also not positive which JS/String functions are Unicode aware:
autoCorrect = (str) => {
return str.replace(/^./, function(first) {
// if next char is not - OR ` OR ' <- not sure how to handle caret and quote
if(str.charAt(1) != '-' ) {
return first.toUpperCase(); // first is undefined here - scope??
}
});
}
Any help is appreciated!
Internally, JavaScript uses UCS-2, not UTF-8.
Handling Unicode in JavaScript isn't particularly beautiful, but possible. It becomes particularly ugly with surrogate pairs such as "🐱", but the for..of loop can handle that. Do never try to use indices on Unicode strings, as you might get only one half of a surrogate pair (which breaks Unicode).
This should handle Unicode well and do what you want:
function autoCorrect(string) {
let i = 0, firstSymbol;
const blacklist = ["-", "`", "'"];
for (const symbol of string) {
if (i === 0) {
firstSymbol = symbol;
}
else if (i === 1 && blacklist.some(char => char === symbol)) {
return string;
}
else {
const rest = string.substring(firstSymbol.length);
return firstSymbol.toUpperCase() + rest;
}
++i;
}
return string.toUpperCase();
}
Tests
console.assert(autoCorrect("δα") === "Δα");
console.assert(autoCorrect("🐱") === "🐱");
console.assert(autoCorrect("d") === "D");
console.assert(autoCorrect("t-minus-one") === "t-minus-one");
console.assert(autoCorrect("t`minus`one") === "t`minus`one");
console.assert(autoCorrect("t'minus'one") === "t'minus'one");
console.assert(autoCorrect("t^minus^one") === "T^minus^one");
console.assert(autoCorrect("t_minus_one") === "T_minus_one");

Letter Count I JavaScript Challenge on Coderbyte

I've been on this problem for several hours now and have done all I can to the best of my current newbie javaScript ability to solve this challenge but I just can't figure out exactly what's wrong. I keep getting "UNEXPECTED TOKEN ILLEGAL on here: http://jsfiddle.net/6n8apjze/14/
and "TypeError: Cannot read property 'length' of null": http://goo.gl/LIz89F
I think the problem is the howManyRepeat variable. I don't understand why I'm getting it can't read the length of null when clearly word is a word from str...
I got the idea for:
word.toLowerCase().split("").sort().join("").match(/([.])\1+/g).length
...here: Get duplicate characters count in a string
The Challenge:
Using the JavaScript language, have the function LetterCountI(str) take the str
parameter being passed and return the first word with the greatest number of
repeated letters. For example: "Today, is the greatest day ever!" should return
greatest because it has 2 e's (and 2 t's) and it comes before ever which also
has 2 e's. If there are no words with repeating letters return -1. Words will
be separated by spaces.
function LetterCountI(str){
var wordsAndAmount={};
var mostRepeatLetters="-1";
var words=str.split(" ");
words.forEach(function(word){
// returns value of how many repeated letters in word.
var howManyRepeat=word.toLowerCase().split("").sort().join("").match(/([.])\1+/g).length;
// if there are repeats(at least one value).
if(howManyRepeat !== null || howManyRepeat !== 0){
wordsAndAmount[word] = howManyRepeat;
}else{
// if no words have repeats will return -1 after for in loop.
wordsAndAmount[word] = -1;
}
});
// word is the key, wordsAndAmount[word] is the value of word.
for(var word in wordsAndAmount){
// if two words have same # of repeats pick the one before it.
if(wordsAndAmount[word]===mostRepeatLetters){
mostRepeatLetters=mostRepeatLetters;
}else if(wordsAndAmount[word]<mostRepeatLetters){
mostRepeatLetters=mostRepeatLetters;
}else if(wordsAndAmount[word]>mostRepeatLetters){
mostRepeatLetters=word;
}
}
return mostRepeatLetters;
}
// TESTS
console.log("-----");
console.log(LetterCountI("Today, is the greatest day ever!"));
console.log(LetterCountI("Hello apple pie"));
console.log(LetterCountI("No words"));
Any guidance is much appreciated. Thank you!! ^____^
Here is the working code snippet:
/*
Using the JavaScript language, have the function LetterCountI(str) take the str
parameter being passed and return the first word with the greatest number of
repeated letters. For example: "Today, is the greatest day ever!" should return
greatest because it has 2 e's (and 2 t's) and it comes before ever which also
has 2 e's. If there are no words with repeating letters return -1. Words will
be separated by spaces.
console.log(LetterCountI("Today, is the greatest day ever!") === "greatest");
console.log(LetterCountI("Hello apple pie") === "Hello");
console.log(LetterCountI("No words") === -1);
Tips:
This is an interesting problem. What we can do is turn the string to lower case using String.toLowerCase, and then split on "", so we get an array of characters.
We will then sort it with Array.sort. After it has been sorted, we will join it using Array.join. We can then make use of the regex /(.)\1+/g which essentially means match a letter and subsequent letters if it's the same.
When we use String.match with the stated regex, we will get an Array, whose length is the answer. Also used some try...catch to return 0 in case match returns null and results in TypeError.
/(.)\1+/g with the match method will return a value of letters that appear one after the other. Without sort(), this wouldn't work.
*/
function LetterCountI(str){
var wordsAndAmount={};
var mostRepeatLetters="";
var words=str.split(" ");
words.forEach(function(word){
var howManyRepeat=word.toLowerCase().split("").sort().join("").match(/(.)\1+/g);
if(howManyRepeat !== null && howManyRepeat !== 0){ // if there are repeats(at least one value)..
wordsAndAmount[word] = howManyRepeat;
} else{
wordsAndAmount[word] = -1; // if no words have repeats will return -1 after for in loop.
}
});
// console.log(wordsAndAmount);
for(var word in wordsAndAmount){ // word is the key, wordsAndAmount[word] is the value of word.
// console.log("Key = " + word);
// console.log("val = " + wordsAndAmount[word]);
if(wordsAndAmount[word].length>mostRepeatLetters.length){ //if two words have same # of repeats pick the one before it.
mostRepeatLetters=word;
}
}
return mostRepeatLetters ? mostRepeatLetters : -1;
}
// TESTS
console.log("-----");
console.log(LetterCountI("Today, is the greatest day ever!"));
console.log(LetterCountI("Hello apple pie"));
console.log(LetterCountI("No words"));
/*
split into words
var wordsAndAmount={};
var mostRepeatLetters=0;
loop through words
Check if words has repeated letters, if so
Push amount into object
Like wordsAndAmount[word[i]]= a number
If no repeated letters...no else.
Loop through objects
Compare new words amount of repeated letters with mostRepeatLetters replacing whoever has more.
In the end return the result of the word having most repeated letters
If all words have no repeated letters return -1, ie.
*/
The changes made:
[.] turned into . as [.] matches a literal period symbol, not any character but a newline
added closing */ at the end of the code (the last comment block was not closed resulting in UNEXPECTED TOKEN ILLEGAL)
if(howManyRepeat !== null || howManyRepeat !== 0) should be replaced with if(howManyRepeat !== null && howManyRepeat !== 0) since otherwise the null was testing for equality with 0 and led to the TypeError: Cannot read property 'length' of null" issue. Note that .match(/(.)\1+/g).length cannot be used since the result of matching can be null, and this will also cause the TypeError to appear.
The algorithm for getting the first entry with the greatest number of repetitions was wrong since the first if block allowed subsequent entry to be output as a correct result (not the first, but the last entry with the same repetitions was output actually)
-1 can be returned if mostRepeatLetters is empty.
Hope you dont mind if I rewrite this code. My code may not be that efficient.
Here is a snippet
function findGreatest() {
// ipField is input field
var getString = document.getElementById('ipField').value.toLowerCase();
var finalArray = [];
var strArray = [];
var tempArray = [];
strArray = (getString.split(" "));
// Take only those words which has repeated letter
for (var i = 0, j = strArray.length; i < j; i++) {
if ((/([a-zA-Z]).*?\1/).test(strArray[i])) {
tempArray.push(strArray[i]);
}
}
if (tempArray.length == 0) { // If no word with repeated Character
console.log('No such Word');
return -1;
} else { // If array has words with repeated character
for (var x = 0, y = tempArray.length; x < y; x++) {
var m = findRepWord(tempArray[x]); // Find number of repeated character in it
finalArray.push({
name: tempArray[x],
repeat: m
})
}
// Sort this array to get word with largest repeated chars
finalArray.sort(function(z, a) {
return a.repeat - z.repeat
})
document.getElementById('repWord').textContent=finalArray[0].name;
}
}
// Function to find the word which as highest repeated character(s)
function findRepWord(str) {
try {
return str.match(/(.)\1+/g).length;
} catch (e) {
return 0;
} // if TypeError
}
Here is DEMO
function LetterCountI(str) {
var word_arr = str.split(" ");
var x = word_arr.slice();
for(var i = 0; i < x.length; i ++){
var sum = 0;
for(var y = 0; y < x[i].length; y++){
var amount = x[i].split("").filter(function(a){return a == x[i][y]}).length;
if (amount > 1){
sum += amount
}
}
x[i] = sum;
}
var max = Math.max.apply(Math,x);
if(max == 0)
return -1;
var index = x.indexOf(max);
return(word_arr[index]);
};
Here is another version as well.
You could use new Set in the following manner:
const letterCount = s => {
const res = s.split(' ')
.map(s => [s, (s.length - new Set([...s]).size)])
.reduce((p, c) => (!p.length) ? c
: (c[1] > p[1]) ? c : p, []);
return !res[1] ? -1 : res.slice(0,1).toString()
}
Note: I have not tested this solution (other than the phrases presented here), but the idea is to subtract unique characters from the total characters in each word of the phrase.

Count number of matches of a regex in Javascript

I wanted to write a regex to count the number of spaces/tabs/newline in a chunk of text. So I naively wrote the following:-
numSpaces : function(text) {
return text.match(/\s/).length;
}
For some unknown reasons it always returns 1. What is the problem with the above statement? I have since solved the problem with the following:-
numSpaces : function(text) {
return (text.split(/\s/).length -1);
}
tl;dr: Generic Pattern Counter
// THIS IS WHAT YOU NEED
const count = (str) => {
const re = /YOUR_PATTERN_HERE/g
return ((str || '').match(re) || []).length
}
For those that arrived here looking for a generic way to count the number of occurrences of a regex pattern in a string, and don't want it to fail if there are zero occurrences, this code is what you need. Here's a demonstration:
/*
* Example
*/
const count = (str) => {
const re = /[a-z]{3}/g
return ((str || '').match(re) || []).length
}
const str1 = 'abc, def, ghi'
const str2 = 'ABC, DEF, GHI'
console.log(`'${str1}' has ${count(str1)} occurrences of pattern '/[a-z]{3}/g'`)
console.log(`'${str2}' has ${count(str2)} occurrences of pattern '/[a-z]{3}/g'`)
Original Answer
The problem with your initial code is that you are missing the global identifier:
>>> 'hi there how are you'.match(/\s/g).length;
4
Without the g part of the regex it will only match the first occurrence and stop there.
Also note that your regex will count successive spaces twice:
>>> 'hi there'.match(/\s/g).length;
2
If that is not desirable, you could do this:
>>> 'hi there'.match(/\s+/g).length;
1
As mentioned in my earlier answer, you can use RegExp.exec() to iterate over all matches and count each occurrence; the advantage is limited to memory only, because on the whole it's about 20% slower than using String.match().
var re = /\s/g,
count = 0;
while (re.exec(text) !== null) {
++count;
}
return count;
(('a a a').match(/b/g) || []).length; // 0
(('a a a').match(/a/g) || []).length; // 3
Based on https://stackoverflow.com/a/48195124/16777 but fixed to actually work in zero-results case.
Here is a similar solution to #Paolo Bergantino's answer, but with modern operators. I'll explain below.
const matchCount = (str, re) => {
return str?.match(re)?.length ?? 0;
};
// usage
let numSpaces = matchCount(undefined, /\s/g);
console.log(numSpaces); // 0
numSpaces = matchCount("foobarbaz", /\s/g);
console.log(numSpaces); // 0
numSpaces = matchCount("foo bar baz", /\s/g);
console.log(numSpaces); // 2
?. is the optional chaining operator. It allows you to chain calls as deep as you want without having to worry about whether there is an undefined/null along the way. Think of str?.match(re) as
if (str !== undefined && str !== null) {
return str.match(re);
} else {
return undefined;
}
This is slightly different from #Paolo Bergantino's. Theirs is written like this: (str || ''). That means if str is falsy, return ''. 0 is falsy. document.all is falsy. In my opinion, if someone were to pass those into this function as a string, it would probably be because of programmer error. Therefore, I'd rather be informed I'm doing something non-sensible than troubleshoot why I keep on getting a length of 0.
?? is the nullish coalescing operator. Think of it as || but more specific. If the left hand side of || evaluates to falsy, it executes the right-hand side. But ?? only executes if the left-hand side is undefined or null.
Keep in mind, the nullish coalescing operator in ?.length ?? 0 will return the same thing as using ?.length || 0. The difference is, if length returns 0, it won't execute the right-hand side... but the result is going to be 0 whether you use || or ??.
Honestly, in this situation I would probably change it to || because more JavaScript developers are familiar with that operator. Maybe someone could enlighten me on benefits of ?? vs || in this situation, if any exist.
Lastly, I changed the signature so the function can be used for any regex.
Oh, and here is a typescript version:
const matchCount = (str: string, re: RegExp) => {
return str?.match(re)?.length ?? 0;
};
('my string'.match(/\s/g) || []).length;
This is certainly something that has a lot of traps. I was working with Paolo Bergantino's answer, and realising that even that has some limitations. I found working with string representations of dates a good place to quickly find some of the main problems. Start with an input string like this:
'12-2-2019 5:1:48.670'
and set up Paolo's function like this:
function count(re, str) {
if (typeof re !== "string") {
return 0;
}
re = (re === '.') ? ('\\' + re) : re;
var cre = new RegExp(re, 'g');
return ((str || '').match(cre) || []).length;
}
I wanted the regular expression to be passed in, so that the function is more reusable, secondly, I wanted the parameter to be a string, so that the client doesn't have to make the regex, but simply match on the string, like a standard string utility class method.
Now, here you can see that I'm dealing with issues with the input. With the following:
if (typeof re !== "string") {
return 0;
}
I am ensuring that the input isn't anything like the literal 0, false, undefined, or null, none of which are strings. Since these literals are not in the input string, there should be no matches, but it should match '0', which is a string.
With the following:
re = (re === '.') ? ('\\' + re) : re;
I am dealing with the fact that the RegExp constructor will (I think, wrongly) interpret the string '.' as the all character matcher \.\
Finally, because I am using the RegExp constructor, I need to give it the global 'g' flag so that it counts all matches, not just the first one, similar to the suggestions in other posts.
I realise that this is an extremely late answer, but it might be helpful to someone stumbling along here. BTW here's the TypeScript version:
function count(re: string, str: string): number {
if (typeof re !== 'string') {
return 0;
}
re = (re === '.') ? ('\\' + re) : re;
const cre = new RegExp(re, 'g');
return ((str || '').match(cre) || []).length;
}
Using modern syntax avoids the need to create a dummy array to count length 0
const countMatches = (exp, str) => str.match(exp)?.length ?? 0;
Must pass exp as RegExp and str as String.
how about like this
function isint(str){
if(str.match(/\d/g).length==str.length){
return true;
}
else {
return false
}
}

Categories

Resources