Find duplicate two and three keywords in string text - javascript

I am basically going to build a "keyword density" function.
The function is going to take text input from a textarea, so a basic string.
Let's say we have the string "My name is not Kevin this is not fun".
What I want to do is write a function that finds two-word groups that are duplicates.
In this example, the result would be "is not" x2, as the phrase "is not" appears twice, in the same order.
So instead of finding duplicate single words, we want to find two words together that are duplicates.
Another example: text string = "What is your name? What is your hobby?". The repeated two-word group is "What is", so that would count as x2.
So the function is going to find these duplicates inside the text (there could be multiple groups) and store them in a way that lets me access and display them in the UI.
What I have tried:
I am capable of doing this for a single word duplicate in the text, but I am unsure on how to start/go on when it comes to multiple words next to each other.
Any help is greatly appreciated!

Here's a solution for finding 2-word phrases that uses map(), reduce(), Object.entries() and Object.keys(). It's easily extendable to find duplicates of any number of word groups.
let str = "What is your name? What is your hobby?";
let arr = str.split(" ");
let list = Object.entries(arr.reduce((acc, a, index) => {
if (index != 0) acc.push(arr[index - 1] + " " + arr[index])
return acc;
}, []).reduce((acc, a) => {
if (Object.keys(acc).indexOf(a) !== -1) acc[a]++;
else acc[a] = 1;
return acc;
}, {})).map(el => ({
phrase: el[0],
count: el[1]
}));
console.log(list)
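As a rough sketch of that extension, the helper below counts every run of `size` consecutive words and keeps only the phrases that occur more than once. The name findRepeatedPhrases is hypothetical (it is not part of the answer above), and the sketch assumes words are separated by whitespace and leaves punctuation attached to the words, just like the original snippet.

```javascript
// Hypothetical helper: counts n-word phrases and returns only the repeated ones.
function findRepeatedPhrases(text, size) {
  // Split on any run of whitespace; filter(Boolean) drops empty strings.
  const words = text.split(/\s+/).filter(Boolean);
  const counts = new Map();
  // Slide a window of `size` words across the text and tally each phrase.
  for (let i = 0; i + size <= words.length; i++) {
    const phrase = words.slice(i, i + size).join(" ");
    counts.set(phrase, (counts.get(phrase) || 0) + 1);
  }
  // Keep only phrases seen more than once, in the same shape as the answer above.
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([phrase, count]) => ({ phrase, count }));
}

console.log(findRepeatedPhrases("My name is not Kevin this is not fun", 2));
console.log(findRepeatedPhrases("What is your name? What is your hobby?", 3));
```

Using a Map keeps each lookup cheap compared with scanning Object.keys() on every word, which may matter for long textarea input.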

That can be done pretty easily in JavaScript. First, let's make the string.
let testString = "the dog jumps and the dog barks"; // just some dummy text
Now, if we want to get all the words, we can use testString.split(" "); this splits the string and returns an array. Then just count how many times each word occurs.
Here's the code in action:
let testString = "the dog jumps and the dog barks";
let testArray = testString.split(" ")
testArray.sort() //this will sort the array
function count() {
let current = null;
let cnt = 0;
for (var i = 0; i < testArray.length; i++) {
if (testArray[i] != current) {
if (cnt > 0) {
document.write(current + ' - ' + cnt + '<br>');
}
current = testArray[i];
cnt = 1;
} else {
cnt++;
}
}
if (cnt > 0) {
document.write(current + ' - ' + cnt);
}
}
count();
If this solution is what you were looking for, please mark this answer as accepted by clicking the tick on the left.
Thank you!

Related

Is there a Typescript function to set the max char limit to a string and make it cut from a specific character?

So I had code like this:
const array = ['• foo', '• bar', '• third']
with many more elements in the array.
(The array came from a docs API, so its contents were unpredictable, but every element always started with a bullet.)
Then I joined the array using .join('\n').
I wanted the resulting string to have a maximum of 1000 characters, so I used .substring(0, 1000) on it.
The problem was that it sometimes cut lines in half, like this:
const string = array.join('\n').substring(0, 1000)
string = "• foo\n• bar\n• th"
So the rest of the word "third" was cut off. Is there any way I can keep a max limit of 1000 characters on the string and make it cut at the bullet closest to 1000?
So that it would only produce:
string = "• foo\n• bar"
const stringifyArrayToCharactersNumber = (
arr: string[],
charactersNum: number,
) => {
return arr.reduce((resultString, arrayItem) => {
const str = resultString + arrayItem + '\n';
if (str.length <= charactersNum) {
return str;
}
return resultString;
}, '');
};
Use the function above to truncate to any number of characters: const lessThan1000 = stringifyArrayToCharactersNumber(array, 1000)
You can iterate backwards through the characters in the string, starting at the end, and grab the index of the last bullet. Then use that index in substring. Something like:
const array = ['• foo', '• bar', '• third'];
const initialString = array.join('\n');
let bulletIndex = initialString.length - 1;
for (let i = initialString.length - 1; i > 0; i--) {
if (initialString[i] === "•") {
bulletIndex = i;
break;
}
}
const finalString = initialString.substring(0, bulletIndex);
Note that that code doesn't quite do what you're saying: currently it cuts off everything after the last bullet, whether you want it to or not, so you'll need to add a check that the string is actually too long in the first place. But this basic approach should work.
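One way to add that length check is to cut at the character limit first and then back up to the last complete line. The sketch below assumes each item sits on its own line (as produced by join('\n')); truncateAtLine is a hypothetical name, not something from the answers above.

```javascript
// Hypothetical helper: keeps only whole lines that fit within maxLength characters.
function truncateAtLine(text, maxLength) {
  // Already short enough: nothing to cut.
  if (text.length <= maxLength) return text;
  // If the character right after the limit is a newline, the cut falls exactly on a line boundary.
  if (text[maxLength] === "\n") return text.slice(0, maxLength);
  const cut = text.slice(0, maxLength);
  const lastBreak = cut.lastIndexOf("\n");
  // No newline inside the limit means not even the first line fits.
  return lastBreak === -1 ? "" : cut.slice(0, lastBreak);
}

const items = ["• foo", "• bar", "• third"];
// With a limit of 14 the third line would be cut in half, so only the first two lines are kept.
console.log(truncateAtLine(items.join("\n"), 14));
```

Unlike the backwards loop, this leaves the string untouched when it is already within the limit and does not keep a trailing newline.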

How to add commas in between words in string in Javascript

I'm trying to add commas in between the first and last names in parentheses.
//Input:
s = "Fred:Corwill;Wilfred:Corwill;Barney:Tornbull;Betty:Tornbull;Bjon:Tornbull;Raphael:Corwill;Alfred:Corwill";
//Expected output: "(CORWILL, ALFRED)(CORWILL, FRED)(CORWILL, RAPHAEL)(CORWILL, WILFRED)(TORNBULL, BARNEY)(TORNBULL, BETTY)(TORNBULL, BJON)"
What my code is currently outputting:
(CORWILL ALFRED) (CORWILL FRED) (CORWILL RAPHAEL) (CORWILL WILFRED) (TORNBULL BARNEY) (TORNBULL BETTY) (TORNBULL BJON)
I've tried a number of approaches like changing how the characters are replaced in the beginning when I reassign s (the string) so that I am not removing the commas in the first place, to then have to replace them...but when I did that, the regex I have was no longer working, and I am not sure why that is. So I tried to find another regex to use so I could work around that problem, but that has equally been a pain, so I decided to just stick to solving it this way: trying to find a way to find commas in between the first, and last names in the parentheses.
Full problem & code:
/*Could you make a program that
• makes this string uppercase
• gives it sorted in alphabetical order by last name.
When the last names are the same, sort them by first name. Last name and first name of a guest come in the result between parentheses separated by a comma.
*/
function meeting(s) {
s = s.replace(/:/g, ", ").toUpperCase();
//order alphabetically based on Last, then first name
const semicolon = ';'
let testArr = s.split(semicolon)
testArr.sort(function compare(a, b) {
var splitA = a.split(",");
var splitB = b.split(",");
var firstA = splitA[0]
var firstB = splitB[0]
var lastA = splitA[splitA.length - 1];
var lastB = splitB[splitB.length - 1];
if (lastA < lastB) return -1;
if (lastA > lastB) return 1;
if (firstA < firstB) return -1; //sort first names alphabetically
if (firstA > firstB) return 1;
return 0; //if they are equal
})
//print last names before first names with regex
let newArr = [];
for (let i = 0; i < testArr.length; i++) {
let variable = (testArr[i].replace(/([\w ]+), ([\w ]+)/g, "$2 $1"))
let comma = ","
newArr.push(`(${variable})`)
}
let finalStr;
finalStr = newArr.toString().replace(/[ ,.]/g, " ").toUpperCase();
// finalStr = finalStr.replace(/" "/g, ", ")
return finalStr
}
s = "Fred:Corwill;Wilfred:Corwill;Barney:Tornbull;Betty:Tornbull;Bjon:Tornbull;Raphael:Corwill;Alfred:Corwill";
console.log(meeting(s))
// expected result: "(CORWILL, ALFRED)(CORWILL, FRED)(CORWILL, RAPHAEL)(CORWILL, WILFRED)(TORNBULL, BARNEY)(TORNBULL, BETTY)(TORNBULL, BJON)"
Any help would be appreciated; I've spent about 5 hours on this problem. The regex I am using switches the last name's position with the first name's position: (Fred Corwill) --> (Corwill Fred). If there is a different regex for this that you could suggest, maybe I could work around the problem that way too; so far nothing I have tried has worked other than the one I am using here.
That looks much more complicated than it needs to be. After splitting by ;s, map each individual element to its words in reverse order, then join:
const s = "Fred:Corwill;Wilfred:Corwill;Barney:Tornbull;Betty:Tornbull;Bjon:Tornbull;Raphael:Corwill;Alfred:Corwill";
const output = s
.toUpperCase()
.split(';')
.sort((a, b) => {
const [aFirst, aLast] = a.split(':');
const [bFirst, bLast] = b.split(':');
return aLast.localeCompare(bLast) || aFirst.localeCompare(bFirst);
})
.map((name) => {
const [first, last] = name.split(':');
return `(${last}, ${first})`;
})
.join('');
console.log(output);
That's what you need:
const str = 'Fred:Corwill;Wilfred:Corwill;Barney:Tornbull;Betty:Tornbull;Bjon:Tornbull;Raphael:Corwill;Alfred:Corwill';
function formatString(string) {
const modifiedString = string.toUpperCase().replace(/(\w+):(\w+)/g, '($2, $1)');
const sortedString = modifiedString.split(';').sort().join('');
return sortedString;
}
console.log(formatString(str))
Using splits with maps and sorts
var s = "Fred:Corwill;Wilfred:Corwill;Barney:Tornbull;Betty:Tornbull;Bjon:Tornbull;Raphael:Corwill;Alfred:Corwill";
var res = s.toUpperCase().split(/;/) // uppercase, then split into people
.map(x => x.split(/:/).reverse()) // split names, put last first
.sort((a, b) => a[0] === b[0] ? a[1].localeCompare(b[1]) : a[0].localeCompare(b[0])) // sort by last name, first name
.map(x => `(${x.join(', ')})`) // create the new format
.join('') // join the array back into a string (no separator, matching the expected output)
console.log(res);

Find letters in random string exactly, using RegEx

The emphasis here is on the word exactly. This needs to work for any number of permutations, so hopefully my example is clear enough.
Given a string of random letters, is it possible (using RegEx) to match an exact number of letters within the given string?
So if I have a string (str1) containing letters ABZBABJDCDAZ and I wanted to match the letters JDBBAA (str2), my function should return true because str1 contains all the right letters enough times. If however str1 were to be changed to ABAJDCDA, then the function would return false as str2 requires that str1 have at least 2 instances of the letter B.
This is what I have so far using a range:
const findLetters = (str1, str2) => {
const regex = new RegExp(`[${str2}]`, 'g')
const result = (str1.match(regex))
console.log(result)
}
findLetters('ABZBABJDCDAZ', 'JDBBAA')
As you can see it matches the right letters, but it matches all instances of them. Is there any way to do what I'm trying to do using RegEx? The reason I'm focusing on RegEx here is because I need this code to be highly optimised, and so far my other functions using Array.every() and indexOf() are just too slow.
Note: My function only requires to return a true/false value.
Try (here we sort letters of both strings and then create regexp like A.*A.*B.*B.*D.*J)
const findLetters = (str1, str2) => {
const regex = new RegExp([...str2].sort().join`.*`)
return regex.test([...str1].sort().join``)
}
console.log( findLetters('ABZBABJDCDAZ', 'JDBBAA') );
console.log( findLetters('ABAJDCDA', 'JDBBAA') );
I don't know if regex is the right way to do this, as it can also get very expensive. Regex is fast, but not always the fastest.
const findLetters2 = (strSearchIn, strSearchFor) => {
var strSearchInSorted = strSearchIn.split('').sort(function(a, b) {
return a.localeCompare(b);
});
var strSearchForSorted = strSearchFor.split('').sort(function(a, b) {
return a.localeCompare(b);
});
return hasAllChars(strSearchInSorted, strSearchForSorted);
}
const hasAllChars = (searchInCharList, searchCharList) => {
var counter = 0;
for (let i = 0; i < searchCharList.length; i++) {
var found = false;
for (counter; counter < searchInCharList.length;) {
counter++;
if (searchCharList[i] == searchInCharList[counter - 1]) {
found = true;
break;
}
}
if (found == false) return false;
}
return true;
}
// No-Regex solution
console.log('true: ' + findLetters2('abcABC', 'abcABC'));
console.log('true: ' + findLetters2('abcABC', 'acbACB'));
console.log('true: ' + findLetters2('abcABCx', 'acbACB'));
console.log('false: ' + findLetters2('abcABC', 'acbACBx'));
console.log('true: ' + findLetters2('ahfffmbbbertwcAtzrBCasdf', 'acbACB'));
console.log('false: ' + findLetters2('abcABC', 'acbAACB'));
Feel free to test its speed and to optimize it, as I'm no JS expert. This solution should iterate over each string once after sorting. The sorting approach is thanks to https://stackoverflow.com/a/51169/9338645.
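For comparison, a counting approach avoids sorting altogether: tally the letters of the first string, then use up one tally for each letter of the second. This is only a sketch, assuming case-sensitive matching; hasLetters is a hypothetical name, and no timing claims are made here, so it would need to be measured against the versions above.

```javascript
// Hypothetical helper: true if `haystack` contains every letter of `needed`
// at least as many times as `needed` does.
function hasLetters(haystack, needed) {
  const counts = new Map();
  // Tally how often each character occurs in the string being searched.
  for (const ch of haystack) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }
  // Consume one occurrence per required character; fail as soon as one is missing.
  for (const ch of needed) {
    const left = counts.get(ch) || 0;
    if (left === 0) return false;
    counts.set(ch, left - 1);
  }
  return true;
}

console.log(hasLetters("ABZBABJDCDAZ", "JDBBAA"));
console.log(hasLetters("ABAJDCDA", "JDBBAA"));
```

Each string is walked once, so the work grows linearly with the combined length of the two inputs.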

Formatting user input to match stored values

I have an H3 element that I'm currently using to initialize a game
<h3 id="startGame">Start</h3>
The following is the JavaScript that I've written
In the function formatText() I take what the user has inputted and make the entire string lower case. I then capitalize the first letter of the word so that it matches how the strings are written in the array. In the case where there are actually two words, I then grab the first letter of the second word and capitalize that to match the way the two-word strings are written in the array.
In the end when a user inputs what I've asked them to input, it shouldn't matter how they wrote it (in regards to capitalization). All that should matter is that they spelled it right.
However, my problem is that it works for the one word strings but does not capitalize the second word as I intended. Meaning I can enter the one word strings in any manner I wish (spelled correctly of course) and it will resolve to correct. I can even input the two word strings and mess with the capitalization of the first word and it will resolve to correct. However when I do not capitalize the second word it always resolves to incorrect.
The code that I've written to resolve this issue doesn't seem to work and I don't know why.
var nutrients = [
"Vitamin B6",
"Manganese",
"Vitamin C",
"Fiber",
"Potassium",
"Biotin",
"Copper"
];
function memoNutri() {
var pleaseCopy;
var spaceMarker = " ";
var capitalizeSecondWord;
var firstWord;
var secondWord;
var twoWords;
function ask() {
pleaseCopy = prompt("Enter the following into the text field: " + nutrients[i] + ".");
}
function formatText() {
pleaseCopy.toLowerCase();
pleaseCopy = pleaseCopy[0].toUpperCase() + pleaseCopy.substring(1, pleaseCopy.length);
capitalizeSecondWord = pleaseCopy.substring(spaceMarker + 1, spaceMarker + 2).toUpperCase();
firstWord = pleaseCopy.substring(0, spaceMarker);
secondWord = capitalizeSecondWord + pleaseCopy.substring(spaceMarker + 2, pleaseCopy.length);
twoWords = firstWord + spaceMarker + secondWord;
}
for (i = 0; i < nutrients.length; i++) {
ask();
formatText();
if (pleaseCopy === nutrients[i] || twoWords === nutrients[i]) {
alert("You are correct! " + nutrients[i]);
} else {
alert("That is incorrect");
}
}
}
var startGame = document.getElementById('startGame');
startGame.onclick = memoNutri;
Try using this:
var textArr = pleaseCopy.split(" ");
for (var i = 0; i < textArr.length; i++) {
textArr[i] = textArr[i][0].toUpperCase() + textArr[i].substring(1).toLowerCase();
}
var twoWords = textArr.join(" ");
Split the content into an array. Format the text at each index, and then re-join the text.
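Wrapped into a function, the same idea might look like the sketch below. The name toTitleCase is hypothetical, and the whitespace handling (trimming and dropping empty words) is an added assumption so that repeated spaces do not cause an error when reading word[0].

```javascript
// Hypothetical helper: capitalizes the first letter of every word and lowercases the rest.
function toTitleCase(input) {
  return input
    .trim()
    .split(/\s+/)      // split on any run of whitespace
    .filter(Boolean)   // drop the empty string produced by blank input
    .map(word => word[0].toUpperCase() + word.substring(1).toLowerCase())
    .join(" ");
}

// The comparison against the stored value no longer depends on how the user typed it.
console.log(toTitleCase("vITAMIN   b6") === "Vitamin B6");
console.log(toTitleCase("manganese") === "Manganese");
```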

Find words around search term (snippet) with javascript

I'm returning search results from some json using jlinq and I'd like to show the user a snippet of the result text which contains the search term, say three words before the search term and three words after.
var searchTerm = 'rain'
var text = "I'm singing in the rain, just singing in the rain";
Result would be something like "singing in the rain, just singing in"
How could I do this in javascript? I've seen some suggestions using php, but nothing specifically for javascript.
Here is a slightly better approximation:
function getMatch(string, term)
{
var index = string.indexOf(term)
if(index >= 0)
{
var _ws = [" ","\t"]
var whitespace = 0
var rightLimit = 0
var leftLimit = 0
// right trim index
for(rightLimit = index + term.length; whitespace < 4; rightLimit++)
{
if(rightLimit >= string.length){break}
if(_ws.indexOf(string.charAt(rightLimit)) >= 0){whitespace += 1}
}
whitespace = 0
// left trim index
for(leftLimit = index; whitespace < 4; leftLimit--)
{
if(leftLimit < 0){break}
if(_ws.indexOf(string.charAt(leftLimit)) >= 0){whitespace += 1}
}
return string.substring(leftLimit + 1, rightLimit).trim() // return match (substring takes an end index; substr would take a length)
}
return // return nothing
}
This is a little bit of "greedy" hehe but it should do the trick. Note the _ws array. You could include all the white space you like or modify to use regex to check for whitespace.
This has been slightly modified to handle phrases. It only finds the first occurrence of the term. Dealing with multiple occurrences would require a slightly different strategy.
It occurred to me that what you want is also possible (in varying degrees) with the following:
function snippet(stringToSearch, phrase)
{
var regExp = new RegExp("(\\S+\\s){0,3}\\S*" + phrase + "\\S*(\\s\\S+){0,3}", "g") // same pattern, built without eval
// returns an array containing all matches
return stringToSearch.match(regExp)
}
The only possible problem with this is that, when it grabs the first occurrence of your pattern, it slices off the matched part and then searches again. You also need to be careful that the "phrase" variable doesn't have any RegExp special characters in it (or convert them to a hex or octal representation).
At any rate, I hope this helps man! :)
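To address the caveat about special characters, the phrase can be escaped before it is placed in the pattern. The sketch below uses the commonly seen replace-based escape; escapeRegExp and safeSnippet are hypothetical names, and the pattern itself is the one from the snippet function above.

```javascript
// Hypothetical helper: backslash-escapes characters that have a special meaning in a RegExp.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Same idea as snippet(), but the phrase is matched literally.
function safeSnippet(stringToSearch, phrase) {
  const pattern =
    "(\\S+\\s){0,3}\\S*" + escapeRegExp(phrase) + "\\S*(\\s\\S+){0,3}";
  // Returns an array of all matches, or null when there is none.
  return stringToSearch.match(new RegExp(pattern, "g"));
}

console.log(safeSnippet("I'm singing in the rain, just singing in the rain", "rain"));
console.log(safeSnippet("cost is $5.00 today", "$5.00"));
```

Without the escape, a phrase such as "$5.00" would be read as a pattern ("$" as an end anchor, "." as any character) rather than as literal text.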
First, we need to find the first occurrence of the term in the string.
However, we are dealing with an array of words, so it is better to find the first occurrence of the term in that array.
I decided to attach this method to Array's prototype.
We could use the array's indexOf, but if we split the string by " ", we end up with words like "rain," and indexOf wouldn't match them.
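Array.prototype.firstOccurance = function(term) {
// plain index loop: for...in would also visit firstOccurance itself, since it is an enumerable prototype property
for (var i = 0; i < this.length; i++) {
if (this[i].indexOf(term) != -1) { // we can still use indexOf on each word string
return i;
}
}
return -1; // term not found in any word
}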
Then I split the string into words by splitting on " ":
function getExcerpt(text, searchTerm, precision) {
var words = text.split(" "),
index = words.firstOccurance(searchTerm),
result = [], // resulting array that we will join back
startIndex, stopIndex;
// now we need first <precision> words before and after searchTerm
// we can use slice for this matter
// but we need to know what is our startIndex and stopIndex
// since simple subtraction from index could lead us to
// a negative value
// and adding to an index could get us to exceeding words array length
startIndex = index - precision;
if (startIndex < 0) {
startIndex = 0;
}
stopIndex = index + precision + 1;
if (stopIndex > words.length) {
stopIndex = words.length;
}
result = result.concat( words.slice(startIndex, index) );
result = result.concat( words.slice(index, stopIndex) );
return result.join(' '); // join back
}
Results:
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 3)
'singing in the rain, just singing in'
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 2)
'in the rain, just singing'
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 10)
'I\'m singing in the rain, just singing in the rain'
