Splitting word into syllables in javascript

My intention is to build a simple process that splits a word into syllables. The approach is to split the word after each vowel sound. The trouble comes when a consonant is not followed by a vowel: in that case the split should occur after that consonant instead.
My test cases are as follows:
hair = ["hair"]
hairai = ["hai", "rai"]
hatred = ["hat", "red"]
In the first example hair is one syllable, as the final consonant is not followed by a vowel. Similarly, in the final example, the "t" is followed by an "r" and so should be considered along with "ha" as one syllable.
In the second example, ai is considered as one vowel sound and so hai will become one syllable.
More examples include
father = ["fat", "her"]
kid = ["kid"]
lady = ["la","dy"]
Please note that I am using simplistic examples, as the English language is quite complex when it comes to sound.
My code is as follows
function syllabify(input) {
var arrs = [];
for (var i in input) {
var st = '';
var curr = input[i];
var nxt = input[i + 1];
if ((curr == 'a') || (curr == 'e') || (curr == 'i') || (curr == 'o') || (curr == 'u')) {
st += curr;
} else {
if ((nxt == 'a') || (nxt == 'e') || (nxt == 'i') || (nxt == 'o') || (nxt == 'u')) {
st += nxt;
} else {
arrs.push(st);
st = '';
}
}
}
console.log(arrs);
}
syllabify('hatred')
However, my code does not even return the strings. What am I doing wrong?

Problems with your current approach
There are a number of problems with your code:
First thing in the loop, you set st to an empty string. This means that you never accumulate any letters. You probably want that line above, outside the loop.
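You are trying to loop over the indexes of letters by using i in input. In JavaScript, the in keyword gives you the keys of an object as strings. So i is a string such as "0", not a number, and i + 1 concatenates to "01" instead of adding, which means input[i + 1] is undefined. A for...in loop would also pick up any enumerable properties that have been added to String.prototype. Try var i = 0; i < input.length; i++ instead.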
Maybe not the direct cause of the problems, but still - your code is messy. How about these?
Use clearer names. currentSyllable instead of st, syllables instead of arrs and so on.
Instead of a nested if - else, use one if - else if - else.
You repeat the same code that checks for vowels twice. Separate it into a function isVowel(letter) instead.
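Putting the fixes above together, a loop-based version might look like the sketch below. It is an illustration rather than the only way to do it: the names isVowel, syllabifyLoop and currentSyllable are chosen here, and y is assumed to count as a vowel so that lady splits as in the question.

```javascript
// Assumption: y is treated as a vowel so that "lady" becomes ["la", "dy"].
const VOWELS = 'aeiouy';

function isVowel(letter) {
  return letter !== undefined && VOWELS.includes(letter.toLowerCase());
}

function syllabifyLoop(input) {
  const syllables = [];
  let currentSyllable = ''; // declared outside the loop so letters accumulate
  let seenVowel = false;    // has the current syllable got its vowel yet?

  for (let i = 0; i < input.length; i++) {
    const current = input[i];
    const next = input[i + 1];      // numeric index, so i + 1 is real addition
    const afterNext = input[i + 2];

    currentSyllable += current;
    if (isVowel(current)) seenVowel = true;

    // Leading consonants, or the last letter: nothing to decide yet.
    if (!seenVowel || next === undefined) continue;

    let closeHere;
    if (isVowel(current)) {
      // Split after a vowel only when a single consonant separates it
      // from the next vowel; that consonant then starts the next syllable.
      closeHere = !isVowel(next) && isVowel(afterNext);
    } else {
      // A consonant kept after the vowel: split here unless only
      // consonants remain, in which case they all stay in this syllable.
      closeHere = [...input.slice(i + 1)].some(isVowel);
    }

    if (closeHere) {
      syllables.push(currentSyllable);
      currentSyllable = '';
      seenVowel = false;
    }
  }

  if (currentSyllable) syllables.push(currentSyllable);
  return syllables;
}

console.log(['hair', 'hairai', 'hatred', 'father', 'kid', 'lady'].map(syllabifyLoop));
```

For the six test words from the question this produces the splits listed there, for example ["hat", "red"] for hatred.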
A new approach
Use regular expressions! Here is your definition of a syllable expressed in regex:
First, zero or more consonants: [^aeiouy]*
Then, one or more vowels: [aeiouy]+
After that, zero or one of the following:
Consonants, followed by the end of the word: [^aeiouy]*$
A consonant (if it is followed by another consonant): [^aeiouy](?=[^aeiouy])
Taken together you get this:
/[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi
To run it in JavaScript, use the match function:
const syllableRegex = /[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi;
function syllabify(words) {
return words.match(syllableRegex);
}
console.log(['away', 'hair', 'halter', 'hairspray', 'father', 'lady', 'kid'].map(syllabify))
Note that this does not work for words without vowels: match returns null for them. You would either have to modify the regex to accommodate that case, or use some other workaround.
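One possible workaround for the no-vowel case is to fall back to the whole word when nothing matches. This is only a sketch: the name syllabifySafe and the choice to return the word as a single syllable are assumptions, not part of the answer above.

```javascript
const syllableRegex = /[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi;

function syllabifySafe(word) {
  // With the g flag, match() returns an array of all matches, or null if there are none.
  const matches = word.match(syllableRegex);
  if (matches) return matches;
  // Assumed fallback: treat a vowel-less word as one syllable, and "" as no syllables.
  return word ? [word] : [];
}

console.log(syllabifySafe('hatred')); // ['hat', 'red']
console.log(syllabifySafe('hmm'));    // ['hmm']
```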

I am weak in the ways of RegEx, and while Anders' example is right most of the time, I did find a few exceptions. Here is what I have found to work so far (but I am sure there are other exceptions I have not found yet). I am sure it can be RegEx'ified by masters of the art. This function returns an array of syllables.
function getSyllables(word){
var response = [];
var isSpecialCase = false;
if (isSpecialCase == false && (word.match(/[0123456789]/gi) || []).length == word.length ){
// consists only of digits
response.push(word);
isSpecialCase = true;
}
if (isSpecialCase == false && word.length < 4){
// three letters or less
response.push(word);
isSpecialCase = true;
}
if (isSpecialCase == false && word.charAt(word.length-1) == "e"){
if (isVowel(word.charAt(word.length-2)) == false){
var cnt = (word.match(/[aeiou]/gi) || []).length;
if (cnt == 3){
if (hasDoubleVowels(word)){
// words like "piece, fleece, grease"
response.push(word);
isSpecialCase = true;
}
}
if (cnt == 2){
// words like "phase, phrase, blaze, name",
if (hasRecurringConsonant(word) == false) {
// but not like "syllable"
response.push(word);
isSpecialCase = true;
}
}
}
}
if (isSpecialCase == false){
const syllableRegex = /[^aeiouy]*[aeiouy]+(?:[^aeiouy]*$|[^aeiouy](?=[^aeiouy]))?/gi;
response = word.match(syllableRegex);
}
return response;
}
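The function above calls three helpers, isVowel, hasDoubleVowels and hasRecurringConsonant, whose definitions are not shown in the answer. The versions below are guesses based on the comments in the code ("piece, fleece, grease" and "syllable"), so treat them as a hypothetical sketch rather than the author's implementation.

```javascript
// Hypothetical helper: true for a single a, e, i, o or u (y is not counted,
// matching the /[aeiou]/ vowel count used in getSyllables).
function isVowel(ch) {
  return /^[aeiou]$/i.test(ch);
}

// Hypothetical helper: true when two vowels sit next to each other,
// as in "piece", "fleece" or "grease".
function hasDoubleVowels(word) {
  return /[aeiou]{2}/i.test(word);
}

// Hypothetical helper: true when the same consonant appears twice in a row,
// as the "ll" in "syllable".
function hasRecurringConsonant(word) {
  return /([b-df-hj-np-tv-z])\1/i.test(word);
}

console.log(hasDoubleVowels('piece'), hasRecurringConsonant('syllable'));
```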

Related

Find if the sentence has 3 consecutive words

You are given a string with words and numbers separated by single spaces. The words contain only letters. You should check if the string contains three words in succession. For example, the string "start 5 one two three 7 end" contains three words in succession.
Input : String
Output : Boolean
This is what I'm trying to do, please point out my mistake. Thanks.
function threeWords(text){
let lst = text.split(' ');
for (let i=0; i< 3; i++) {
if (typeof lst[i] === 'string' && Number(lst[i]) === NaN) {return true}
else {return false;}}
}
If you'd rather continue with your code than use regex, here are your issues:
You only loop over 3 of the elements in lst. Loop over the entire length of the list.
You try to check if Number('somestring') === NaN. In JavaScript, NaN === NaN is false. Use isNaN() instead.
Once you find a list element that is not a number, you return true. You should have a variable that keeps track of how many words there are in succession (resetting to 0 when you find a number), and return true when this variable is equal to 3.
Here is the fixed code:
function threeWords(text) {
let lst = text.split(' ');
let num_words = 0;
for (let i = 0; i < lst.length; i++) {
if (isNaN(Number(lst[i])))
num_words++
else
num_words = 0
if (num_words === 3)
return true
}
return false;
}
Might be easier with a regular expression:
const result = /([A-Za-z]+( |$)){3}/.test('start 5 one two three 7 end');
console.log(result);

Convert camel case to sentence case in javascript

I found myself needing to convert camel case strings to sentence case with sane acronym support. A Google search for ideas led me to the following SO post:
Convert camelCaseText to Sentence Case Text
That question is actually about title case, not sentence case, so I came up with the following solution, which others may find helpful or be able to improve. It uses ES6, which is acceptable for me and can easily be polyfilled if there's some horrible IE requirement.
The code below uses capitalised notation for acronyms. I don't agree with Microsoft's recommendation for capitalising acronyms of more than two characters, so this expects the whole acronym to be capitalised even if it's at the start of the string (which technically means the input is not camel case, but it gives sane, controllable output). Multiple consecutive acronyms can be separated with _ (e.g. parseDBM_XML -> Parse DBM XML).
function camelToSentenceCase(str) {
return str.split(/([A-Z]|\d)/).map((v, i, arr) => {
// If first block then capitalise 1st letter regardless
if (!i) return v.charAt(0).toUpperCase() + v.slice(1);
// Skip empty blocks
if (!v) return v;
// Underscore substitution
if (v === '_') return " ";
// We have a capital or number
if (v.length === 1 && v === v.toUpperCase()) {
const previousCapital = !arr[i-1] || arr[i-1] === '_';
const nextWord = i+1 < arr.length && arr[i+1] && arr[i+1] !== '_';
const nextTwoCapitalsOrEndOfString = i+3 > arr.length || !arr[i+1] && !arr[i+3];
// Insert space
if (!previousCapital || nextWord) v = " " + v;
// Start of word or single letter word
if (nextWord || (!previousCapital && !nextTwoCapitalsOrEndOfString)) v = v.toLowerCase();
}
return v;
}).join("");
}
// ----------------------------------------------------- //
var testSet = [
'camelCase',
'camelTOPCase',
'aP2PConnection',
'JSONIsGreat',
'thisIsALoadOfJSON',
'parseDBM_XML',
'superSimpleExample',
'aGoodIPAddress'
];
testSet.forEach(function(item) {
console.log(item, '->', camelToSentenceCase(item));
});
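The function relies on a detail of String.prototype.split: when the separator regex contains a capturing group, the captured text is kept in the result array. The small sketch below shows the array that the map callback walks over for one of the test inputs, next to what split would return without the group.

```javascript
// With a capturing group, the matched capitals and digits are kept as
// their own elements, with empty strings between adjacent separators.
const withGroup = 'aP2PConnection'.split(/([A-Z]|\d)/);
console.log(withGroup);
// ['a', 'P', '', '2', '', 'P', '', 'C', 'onnection']

// Without the group, the separators themselves are dropped.
const withoutGroup = 'aP2PConnection'.split(/[A-Z]|\d/);
console.log(withoutGroup);
// ['a', '', '', '', 'onnection']
```

This is why the callback checks for empty blocks and looks at arr[i-1], arr[i+1] and arr[i+3]: the capitals sit at the odd indexes and the text between them at the even ones.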

Fetching function name and body code from JavaScript file using C#

I need to fetch a particular function and its body as text from a JavaScript file and print that function as output using C#. The function name and the js file are given as input parameters. I tried using regex but couldn't achieve the desired result. Here is the regex code.
public void getFunction(string jstext, string functionname)
{
Regex regex = new Regex(@"function\s+" + functionname + @"\s*\(.*\)\s*\{");
Match match = regex.Match(jstext);
}
Is there any other way I can do this?
This answer is based on the assumption which you provide in comments, that the C# function needs only to find function declarations, and not any form of function expressions.
As I point out in comments, javascript is too complex to be efficiently expressed in a regular expression. The only way to know you've reached the end of the function is when the brackets all match up, and given that, you still need to take escape characters, comments, and strings into account.
The only way I can think of to achieve this, is to actually iterate through every single character, from the start of your function body, until the brackets match up, and keep track of anything odd that comes along.
Such a solution is never going to be very pretty. I've pieced together an example of how it might work, but knowing how javascript is riddled with little quirks and pitfalls, I am convinced there are many corner cases not considered here. I'm also sure it could be made a bit tidier.
From my first experiments, the following should handle escape characters, multi- and single line comments, strings that are delimited by ", ' or `, and regular expressions (i.e. delimited by /).
This should get you pretty far, although I'm intrigued to see what exceptions people can come up with in comments:
private static string GetFunction(string jstext, string functionname) {
var start = Regex.Match(jstext, @"function\s+" + functionname + @"\s*\([^)]*\)\s*{");
if(!start.Success) {
throw new Exception("Function not found: " + functionname);
}
StringBuilder sb = new StringBuilder(start.Value);
jstext = jstext.Substring(start.Index + start.Value.Length);
var brackets = 1;
var i = 0;
var delimiters = "`/'\"";
string currentDelimiter = null;
var isEscape = false;
var isComment = false;
var isMultilineComment = false;
while(brackets > 0 && i < jstext.Length) {
var c = jstext[i].ToString();
var wasEscape = isEscape;
if(isComment || !isEscape)
{
if(c == @"\") {
// Found escape symbol.
isEscape = true;
} else if(i > 0 && !isComment && (c == "*" || c == "/") && jstext[i-1] == '/') {
// Found start of a comment block
isComment = true;
isMultilineComment = c == "*";
} else if(c == "\n" && isComment && !isMultilineComment) {
// Found termination of single-line comment
isComment = false;
} else if(isMultilineComment && c == "/" && jstext[i-1] == '*') {
// Found termination of multiline comment
isComment = false;
isMultilineComment = false;
} else if(delimiters.Contains(c)) {
// Found a string or regex delimiter
currentDelimiter = (currentDelimiter == c) ? null : currentDelimiter ?? c;
}
// The current symbol doesn't appear to be commented out, escaped or in a string
// If it is a bracket, we should treat it as one
if(currentDelimiter == null && !isComment) {
if(c == "{") {
brackets++;
}
if(c == "}") {
brackets--;
}
}
}
sb.Append(c);
i++;
if(wasEscape) isEscape = false;
}
return sb.ToString();
}

Letter Count I JavaScript Challenge on Coderbyte

I've been on this problem for several hours now and have done all I can, to the best of my current newbie JavaScript ability, to solve this challenge, but I just can't figure out exactly what's wrong. I keep getting "UNEXPECTED TOKEN ILLEGAL" here: http://jsfiddle.net/6n8apjze/14/
and "TypeError: Cannot read property 'length' of null": http://goo.gl/LIz89F
I think the problem is the howManyRepeat variable. I don't understand why I'm getting it can't read the length of null when clearly word is a word from str...
I got the idea for:
word.toLowerCase().split("").sort().join("").match(/([.])\1+/g).length
...here: Get duplicate characters count in a string
The Challenge:
Using the JavaScript language, have the function LetterCountI(str) take the str
parameter being passed and return the first word with the greatest number of
repeated letters. For example: "Today, is the greatest day ever!" should return
greatest because it has 2 e's (and 2 t's) and it comes before ever which also
has 2 e's. If there are no words with repeating letters return -1. Words will
be separated by spaces.
function LetterCountI(str){
var wordsAndAmount={};
var mostRepeatLetters="-1";
var words=str.split(" ");
words.forEach(function(word){
// returns value of how many repeated letters in word.
var howManyRepeat=word.toLowerCase().split("").sort().join("").match(/([.])\1+/g).length;
// if there are repeats(at least one value).
if(howManyRepeat !== null || howManyRepeat !== 0){
wordsAndAmount[word] = howManyRepeat;
}else{
// if no words have repeats will return -1 after for in loop.
wordsAndAmount[word] = -1;
}
});
// word is the key, wordsAndAmount[word] is the value of word.
for(var word in wordsAndAmount){
// if two words have same # of repeats pick the one before it.
if(wordsAndAmount[word]===mostRepeatLetters){
mostRepeatLetters=mostRepeatLetters;
}else if(wordsAndAmount[word]<mostRepeatLetters){
mostRepeatLetters=mostRepeatLetters;
}else if(wordsAndAmount[word]>mostRepeatLetters){
mostRepeatLetters=word;
}
}
return mostRepeatLetters;
}
// TESTS
console.log("-----");
console.log(LetterCountI("Today, is the greatest day ever!"));
console.log(LetterCountI("Hello apple pie"));
console.log(LetterCountI("No words"));
Any guidance is much appreciated. Thank you!! ^____^
Here is the working code snippet:
/*
Using the JavaScript language, have the function LetterCountI(str) take the str
parameter being passed and return the first word with the greatest number of
repeated letters. For example: "Today, is the greatest day ever!" should return
greatest because it has 2 e's (and 2 t's) and it comes before ever which also
has 2 e's. If there are no words with repeating letters return -1. Words will
be separated by spaces.
console.log(LetterCountI("Today, is the greatest day ever!") === "greatest");
console.log(LetterCountI("Hello apple pie") === "Hello");
console.log(LetterCountI("No words") === -1);
Tips:
This is an interesting problem. What we can do is turn the string to lower case using String.toLowerCase, and then split on "", so we get an array of characters.
We will then sort it with Array.sort. After it has been sorted, we will join it using Array.join. We can then make use of the regex /(.)\1+/g which essentially means match a letter and subsequent letters if it's the same.
When we use String.match with the stated regex, we will get an Array, whose length is the answer. The code checks the result for null first, because match returns null when nothing matches and reading .length of null throws a TypeError.
/(.)\1+/g with the match method will return a value of letters that appear one after the other. Without sort(), this wouldn't work.
*/
function LetterCountI(str){
var wordsAndAmount={};
var mostRepeatLetters="";
var words=str.split(" ");
words.forEach(function(word){
var howManyRepeat=word.toLowerCase().split("").sort().join("").match(/(.)\1+/g);
if(howManyRepeat !== null && howManyRepeat !== 0){ // if there are repeats(at least one value)..
wordsAndAmount[word] = howManyRepeat;
} else{
wordsAndAmount[word] = -1; // if no words have repeats will return -1 after for in loop.
}
});
var mostRepeats=0; // repeat count of the best word found so far
for(var word in wordsAndAmount){ // word is the key, wordsAndAmount[word] is the value of word.
var repeats=wordsAndAmount[word]===-1 ? 0 : wordsAndAmount[word].length;
if(repeats>mostRepeats){ // strictly greater, so on a tie the earlier word is kept.
mostRepeats=repeats;
mostRepeatLetters=word;
}
}
return mostRepeatLetters ? mostRepeatLetters : -1;
}
// TESTS
console.log("-----");
console.log(LetterCountI("Today, is the greatest day ever!"));
console.log(LetterCountI("Hello apple pie"));
console.log(LetterCountI("No words"));
/*
split into words
var wordsAndAmount={};
var mostRepeatLetters=0;
loop through words
Check if words has repeated letters, if so
Push amount into object
Like wordsAndAmount[word[i]]= a number
If no repeated letters...no else.
Loop through objects
Compare new words amount of repeated letters with mostRepeatLetters replacing whoever has more.
In the end return the result of the word having most repeated letters
If all words have no repeated letters return -1, ie.
*/
The changes made:
[.] was changed to . because [.] matches a literal period, whereas . matches any character except a newline.
A closing */ was added at the end of the code (the last comment block was not closed, resulting in UNEXPECTED TOKEN ILLEGAL).
if(howManyRepeat !== null || howManyRepeat !== 0) should be replaced with if(howManyRepeat !== null && howManyRepeat !== 0), since the || version is always true: no value is equal to both null and 0. Note that .match(/(.)\1+/g).length cannot be used directly, since the result of matching can be null, and reading .length of null is what causes the "TypeError: Cannot read property 'length' of null".
The algorithm for getting the first entry with the greatest number of repetitions was wrong since the first if block allowed subsequent entry to be output as a correct result (not the first, but the last entry with the same repetitions was output actually)
-1 can be returned if mostRepeatLetters is empty.
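The first and third changes can be seen in isolation with a small sketch. The helper name repeatedGroups is made up for this illustration; it wraps the sort-then-match step and substitutes an empty array when match returns null.

```javascript
function repeatedGroups(word) {
  // Sorting puts equal letters next to each other: "greatest" -> "aeegrstt".
  const sorted = word.toLowerCase().split('').sort().join('');
  // match() returns null when nothing matches, so fall back to an empty array.
  return sorted.match(/(.)\1+/g) || [];
}

console.log(repeatedGroups('greatest')); // ['ee', 'tt']
console.log(repeatedGroups('No'));       // []

// [.] is a character class containing only a literal period,
// so it never matches repeated letters:
console.log('aeegrstt'.match(/([.])\1+/g)); // null
console.log('a..b'.match(/([.])\1+/g));     // ['..']
```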
Hope you don't mind if I rewrite this code. My code may not be that efficient.
Here is a snippet
function findGreatest() {
// ipField is input field
var getString = document.getElementById('ipField').value.toLowerCase();
var finalArray = [];
var strArray = [];
var tempArray = [];
strArray = (getString.split(" "));
// Take only those words which has repeated letter
for (var i = 0, j = strArray.length; i < j; i++) {
if ((/([a-zA-Z]).*?\1/).test(strArray[i])) {
tempArray.push(strArray[i]);
}
}
if (tempArray.length == 0) { // If no word with repeated Character
console.log('No such Word');
return -1;
} else { // If array has words with repeated character
for (var x = 0, y = tempArray.length; x < y; x++) {
var m = findRepWord(tempArray[x]); // Find number of repeated character in it
finalArray.push({
name: tempArray[x],
repeat: m
})
}
// Sort this array to get word with largest repeated chars
finalArray.sort(function(z, a) {
return a.repeat - z.repeat
})
document.getElementById('repWord').textContent=finalArray[0].name;
}
}
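// Function to count how many different characters are repeated in a word
function findRepWord(str) {
// Sort the characters first so that repeated ones become adjacent
var repeats = str.split("").sort().join("").match(/(.)\1+/g);
// match() returns null when nothing repeats
return repeats === null ? 0 : repeats.length;
}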
function LetterCountI(str) {
var word_arr = str.split(" ");
var x = word_arr.slice();
for(var i = 0; i < x.length; i ++){
var sum = 0;
for(var y = 0; y < x[i].length; y++){
var amount = x[i].split("").filter(function(a){return a == x[i][y]}).length;
if (amount > 1){
sum += amount
}
}
x[i] = sum;
}
var max = Math.max.apply(Math,x);
if(max == 0)
return -1;
var index = x.indexOf(max);
return(word_arr[index]);
};
Here is another version as well.
You could use new Set in the following manner:
const letterCount = s => {
const res = s.split(' ')
.map(s => [s, (s.length - new Set([...s]).size)])
.reduce((p, c) => (!p.length) ? c
: (c[1] > p[1]) ? c : p, []);
return !res[1] ? -1 : res.slice(0,1).toString()
}
Note: I have not tested this solution (other than the phrases presented here), but the idea is to subtract unique characters from the total characters in each word of the phrase.

How to check if the elements of an array are identical in javascript (more than 2 elements)

I am trying to make sure that a phone# is not all identical characters, example 1111111111
The code I am using works but there has to be a cleaner way. I've tried loops but that only compares two consecutive characters at a time. This is what I am using now:
if (MainPhone.value != "")
{
if ((MainPhone.value == 1111111111) || (MainPhone.value == 2222222222) || (MainPhone.value == 3333333333) || (MainPhone.value == 4444444444) || (MainPhone.value == 5555555555) || (MainPhone.value == 6666666666) || (MainPhone.value == 7777777777) || (MainPhone.value == 8888888888) || (MainPhone.value == 9999999999) || (MainPhone.value == 0000000000))
{
window.alert("Phone Number is Invalid");
MainPhone.focus();
return false;
}
}
I found this recommendation for someone else's question but could not get it to work.
var dup = MainPhone.value.split('');
if all(dup == dup(1))
I would try something like this:
var phone = '11111211';
var digits = phone.split('').sort();
var test = digits[0] == digits[digits.length - 1];
Simply sort the array and compare the first and last elements.
You can use a regular expression like this to check if all characters are the same:
^(.)\1*$
Example:
var phone = '11111111';
if (/^(.)\1*$/.test(phone)) {
alert('All the same.');
}
Demo: http://jsfiddle.net/Guffa/3V5en/
Explanation of the regular expression:
^ = matches start of the string
(.) = captures one character
\1 = matches the first capture
* = zero or more times
$ = matches end of the string
So, it captures the first character, and matches the rest of the characters if they are the same.
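Wrapped in a function, the check might be used as in the sketch below. The function names are chosen here for illustration, and the second version is an alternative without a regular expression, included only for comparison.

```javascript
// Regex version: one captured character followed only by repeats of it.
function allSameCharacters(value) {
  return /^(.)\1*$/.test(value);
}

// Loop version: every character equals the first one.
// The length check makes the empty string return false, like the regex does.
function allSameCharactersLoop(value) {
  return value.length > 0 && value.split('').every(ch => ch === value[0]);
}

console.log(allSameCharacters('1111111111'));   // true
console.log(allSameCharacters('11111211'));     // false
console.log(allSameCharactersLoop('11111211')); // false
```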
