Find words around search term (snippet) with javascript

I'm returning search results from some json using jlinq and I'd like to show the user a snippet of the result text which contains the search term, say three words before the search term and three words after.
var searchTerm = 'rain'
var text = "I'm singing in the rain, just singing in the rain";
The result would be something like "singing in the rain, just singing in".
How could I do this in JavaScript? I've seen some suggestions using PHP, but nothing specifically for JavaScript.

Here is a slightly better approximation:
function getMatch(string, term)
{
var index = string.indexOf(term) // declared with var so it does not leak as a global
if(index >= 0)
{
var _ws = [" ","\t"]
var whitespace = 0
var rightLimit = 0
var leftLimit = 0
// right trim index
for(rightLimit = index + term.length; whitespace < 4; rightLimit++)
{
if(rightLimit >= string.length){break}
if(_ws.indexOf(string.charAt(rightLimit)) >= 0){whitespace += 1}
}
whitespace = 0
// left trim index
for(leftLimit = index; whitespace < 4; leftLimit--)
{
if(leftLimit < 0){break}
if(_ws.indexOf(string.charAt(leftLimit)) >= 0){whitespace += 1}
}
return string.substring(leftLimit + 1, rightLimit).trim() // substring takes an end index (substr takes a length); trim drops the boundary whitespace
}
return // return nothing
}
This is a little "greedy", but it should do the trick. Note the _ws array: you can add any other whitespace characters you like, or switch to a regex to test for whitespace.
This has been slightly modified to handle phrases. It only finds the first occurrence of the term. Dealing with multiple occurrences would require a slightly different strategy.
It occurred to me that what you want is also possible (in varying degrees) with the following:
function snippet(stringToSearch, phrase)
{
var regExp = new RegExp("(\\S+\\s){0,3}\\S*" + phrase + "\\S*(\\s\\S+){0,3}", "g") // built with the RegExp constructor instead of eval
// returns an array containing all matches
return stringToSearch.match(regExp)
}
The only possible problem with this is that the matches do not overlap: after each match, the search resumes where that match ended, so words already consumed as context are not reused for the next occurrence. You also need to be careful that the "phrase" variable doesn't contain any regExp special characters (or escape them first, for example as hex or octal escapes).
At any rate, I hope this helps man! :)
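To make the escaping caveat above concrete, here is a minimal sketch. The names escapeRegExp and safeSnippet and the optional words parameter are made up for this illustration and are not part of the answer above; the character class is the commonly used list of regex metacharacters.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of snippet(): up to `words` words on each side
// of the phrase (3 by default, as in the question).
function safeSnippet(stringToSearch, phrase, words) {
  var n = words === undefined ? 3 : words;
  var pattern = "(\\S+\\s){0," + n + "}\\S*" + escapeRegExp(phrase) +
                "\\S*(\\s\\S+){0," + n + "}";
  // match() with the g flag returns all non-overlapping matches, or null.
  return stringToSearch.match(new RegExp(pattern, "g")) || [];
}

var text = "I'm singing in the rain, just singing in the rain";
console.log(safeSnippet(text, "rain")[0]);
// "singing in the rain, just singing in"
```

With the escaping in place, a phrase such as "a.b" only matches a literal dot, so safeSnippet("axb", "a.b") returns an empty array.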

First, we need to find the first occurrence of the term in the string.
Since we are dealing with an array of words, it is better to find the first occurrence of the term in that array.
I decided to attach this method to Array's prototype.
We could use the array's indexOf, but if we split the string by " ", we end up with words like "rain," and indexOf wouldn't match them.
Array.prototype.firstOccurance = function(term) {
for (var i = 0; i < this.length; i++) { // plain loop: for...in would also visit the firstOccurance property itself
if (this[i].indexOf(term) != -1 ) { // we can still use indexOf on each word, since each word is a string
return i;
}
}
return -1; // term not found
}
Then I split the string into words by splitting on " ":
function getExcerpt(text, searchTerm, precision) {
var words = text.split(" "),
index = words.firstOccurance(searchTerm),
result = [], // resulting array that we will join back
startIndex, stopIndex;
// now we need the first <precision> words before and after searchTerm
// we can use slice for this
// but we need to know our startIndex and stopIndex,
// since simply subtracting from index could give us
// a negative value,
// and adding to index could take us past the end of the words array
startIndex = index - precision;
if (startIndex < 0) {
startIndex = 0;
}
stopIndex = index + precision + 1;
if (stopIndex > words.length) {
stopIndex = words.length;
}
result = result.concat( words.slice(startIndex, index) );
result = result.concat( words.slice(index, stopIndex) );
return result.join(' '); // join back
}
Results:
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 3)
'singing in the rain, just singing in'
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 2)
'in the rain, just singing'
> getExcerpt("I'm singing in the rain, just singing in the rain", 'rain', 10)
'I\'m singing in the rain, just singing in the rain'
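For comparison, the same idea can be sketched without extending Array.prototype and with an explicit result for the case where the term is missing. The name excerptAround and the choice to return null are assumptions made for this illustration; findIndex requires ES2015.

```javascript
// Hypothetical helper: returns the words around the first word that
// contains searchTerm, or null when no word contains it.
function excerptAround(text, searchTerm, precision) {
  var words = text.split(" ");
  var index = words.findIndex(function (word) {
    return word.indexOf(searchTerm) !== -1; // also matches "rain,"
  });
  if (index === -1) {
    return null;
  }
  // Clamp both ends so slice never receives an out-of-range index.
  var start = Math.max(0, index - precision);
  var stop = Math.min(words.length, index + precision + 1);
  return words.slice(start, stop).join(" ");
}

console.log(excerptAround("I'm singing in the rain, just singing in the rain", "rain", 2));
// "in the rain, just singing"
```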

Related

Find duplicate two and three keywords in string text

I am basically going to build a "keyword density" function.
The function is going to take text input from a textarea, so a basic string.
Let's say we have the string of My name is not Kevin this is not fun.
What I want to do, is write a function that finds two words groups that are duplicates.
In this example, the result would be is not x2, as the phrase is not occurs twice, with the words in the same order.
So instead of finding duplicate single words, for example, we want to find two words together that are duplicates.
Another example: Text string = What is your name? What is your hobby?. One repeated two-word group is What is, so that would count as x2.
So the function is going to find these duplicates inside this text(could be multiple groups), and store it in a way so that I can access it and display it in the UI.
What I have tried:
I am capable of doing this for a single word duplicate in the text, but I am unsure on how to start/go on when it comes to multiple words next to each other.
Any help is greatly appreciated!

Here's a solution for finding 2-word phrases that uses map(), reduce(), Object.entries() and Object.keys(). It's easily extendable to find duplicates of any number of word groups.
let str = "What is your name? What is your hobby?";
let arr = str.split(" ");
let list = Object.entries(arr.reduce((acc, a, index) => {
if (index != 0) acc.push(arr[index - 1] + " " + arr[index])
return acc;
}, []).reduce((acc, a) => {
if (Object.keys(acc).indexOf(a) !== -1) acc[a]++;
else acc[a] = 1;
return acc;
}, {})).map(el => ({
phrase: el[0],
count: el[1]
}));
console.log(list)
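As a rough sketch of how this extends to groups of any size, the version below takes the group size as a parameter and keeps only the phrases that occur more than once. The name repeatedPhrases and the use of a Map are choices made for this illustration, not part of the answer above.

```javascript
// Hypothetical helper: counts every run of `size` consecutive words and
// returns only the phrases that occur more than once.
function repeatedPhrases(text, size) {
  const words = text.split(/\s+/).filter(Boolean); // drop empty strings
  const counts = new Map();
  for (let i = 0; i + size <= words.length; i++) {
    const phrase = words.slice(i, i + size).join(" ");
    counts.set(phrase, (counts.get(phrase) || 0) + 1);
  }
  return [...counts]
    .filter(([, count]) => count > 1)
    .map(([phrase, count]) => ({ phrase, count }));
}

console.log(repeatedPhrases("My name is not Kevin this is not fun", 2));
// [ { phrase: 'is not', count: 2 } ]
```

Punctuation stays attached to the words here ("name?" is not the same word as "name"), just as in the split(" ") version above.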

That can be done pretty easily in JavaScript. First, let's make the string.
let testString = "The dog jumps and the dog barks"; // just some dummy text
Now, if we want to get all the words, we can use testString.split(" "), which splits the string and returns an array. Then we just count how many times each word occurs.
Here's the code in action
let testString = "the dog jumps and the dog barks";
let testArray = testString.split(" ")
testArray.sort() //this will sort the array
function count() {
let current = null;
let cnt = 0;
for (var i = 0; i < testArray.length; i++) {
if (testArray[i] != current) {
if (cnt > 0) {
document.write(current + ' - ' + cnt + '<br>');
}
current = testArray[i];
cnt = 1;
} else {
cnt++;
}
}
if (cnt > 0) {
document.write(current + '-' + cnt);
}
}
count();

Is there a Typescript function to set the max char limit to a string and make it cut from a specific character?

I had code like this:
const array = ['• foo', '• bar', '• third']
and many more elements in this array.
(The array came from a docs API, so its contents were unpredictable, but every element always started with a bullet.)
Then I joined the array using .join('\n').
I wanted the resulting string to have a maximum of 1000 characters, so I used .substring(0, 1000) on it.
The problem was that sometimes it cut a line in half, like this:
const string = array.join('\n').substring(0, 1000)
// string is now "• foo\n• bar\n• th"
So the rest of the word "third" was cut off. Is there any way I can keep a max limit of 1000 characters on the string and make it cut at the bullet closest to 1000?
So it would only produce:
// string is "• foo\n• bar"

const stringifyArrayToCharactersNumber = (
arr: string[],
charactersNum: number,
) => {
return arr.reduce((resultString, arrayItem) => {
const str = resultString + arrayItem + '\n';
if (str.length <= charactersNum) {
return str;
}
return resultString;
}, '');
};
Use the function above to truncate to any number of characters: const lessThan1000 = stringifyArrayToCharactersNumber(array, 1000). Note that every item is followed by '\n', so the result ends with a line break.
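One thing to be aware of with the reduce version: it keeps going after the first item that does not fit, so a later, shorter item can still be appended and the result may skip lines. A minimal sketch of a variant that stops at the first overflow and adds no trailing line break could look like this (joinUpTo is a made-up name, written in plain JavaScript):

```javascript
// Hypothetical helper: joins items with "\n" and stops at the first item
// that would push the result past maxLength.
function joinUpTo(items, maxLength) {
  let result = "";
  for (const item of items) {
    const candidate = result === "" ? item : result + "\n" + item;
    if (candidate.length > maxLength) {
      break; // stop at the first item that does not fit
    }
    result = candidate;
  }
  return result;
}

console.log(JSON.stringify(joinUpTo(["• foo", "• bar", "• third"], 14)));
// "• foo\n• bar"
```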

You can iterate backwards through the characters in the string, starting at the end, and grab the index of the last bullet. Then use that index in substring. Something like:
const array = ['• foo', '• bar', '• third'];
const initialString = array.join('\n');
let bulletIndex = initialString.length - 1;
for (let i = initialString.length - 1; i > 0; i--) {
if (initialString[i] === "•") {
bulletIndex = i;
break;
}
}
const finalString = initialString.substring(0, bulletIndex);
Note that that code doesn't quite do what you're saying: currently it cuts off everything after the last bullet, whether you want it to or not, so you'll need to add a check that the string is actually too long in the first place. But this basic approach should work.
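A minimal sketch of the same idea with the length check included, using lastIndexOf instead of a manual loop. The name truncateAtBullet is made up, and the sketch assumes every line starts with "•" as in the question.

```javascript
// Hypothetical helper: cuts text at the last bullet line that still fits
// within maxLength characters.
function truncateAtBullet(text, maxLength) {
  if (text.length <= maxLength) {
    return text; // nothing to cut
  }
  // Last line break that starts a bullet at or before position maxLength.
  const cut = text.lastIndexOf("\n•", maxLength);
  // No such break means even the first line is too long.
  return cut === -1 ? "" : text.slice(0, cut);
}

const joined = ["• foo", "• bar", "• third"].join("\n");
console.log(JSON.stringify(truncateAtBullet(joined, 16)));
// "• foo\n• bar"
```

Cutting at the line break rather than at the bullet itself also avoids leaving a trailing "\n" in the result.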

Reshape String, inserting "\n" at every N characters

Using JavaScript functions, I was trying to insert a breakline on a string at every N characters provided by the user.
Just like this: function("blabla", 3) would output "bla\nbla\n".
I searched a lot of answers and ended up with a regex to do that. The only problem is that I need the user's input, so I need to put a variable into this regex.
Here's the code:
function reshapeString(string, num) {
var regex = new RegExp("/(.{" + num + "})/g");
return string.replace(regex,"$1\n");
}
reshapeString("blablabla", 3);
This is currently not working. I tried to escape the '/' characters, but I'm screwing up at some point and I don't know where.
What am I missing? Is there any other way to solve the problem of reshaping this string?

You need a string for the regexp constructor, without /, and you can omit the group by using $& for the found string.
function reshapeString(string, num) {
var regex = new RegExp(".{" + num + "}", "g");
return string.replace(regex,"$&\n");
}
console.log(reshapeString("blablabla", 3));

How about a one-liner?
const reshapeString = (str,N) => str.split('').reduce((o,c,i) => o+(!i || i%N?'':'\n')+c, '')
Explanation:
So first thing we do is split the string into a character array
Now we use a reduce() statement to go through each element and reduce to a single value (ie. the final string you're looking for!)
Now i%N gives a non-zero (i.e. truthy) value when the index is not a multiple of N, so we just add the current character to our accumulator variable o.
If i%N is in fact 0 (a falsy value), we append:
o (the string so far) +
\n (the line break inserted at every Nth position) +
c (the current character)
Note: We also have a !i check, which skips the insertion at index 0, since a line break before the very first character would probably be unintended.
Benchmarking
Regex construction and replace also require re-constructing the string and building an FSA to follow, which for strings shorter than 1000 characters should be slower.
Test:
(_ => {
const reshapeString_AP = (str,N) => str.split('').reduce((o,c,i) => o+(!i || i%N?'':'\n')+c, '')
function reshapeString_Nina(string, num) {
var regex = new RegExp(".{" + num + "}", "g");
return string.replace(regex,"$&\n");
}
const payload = 'a'.repeat(100)
console.time('AP');
reshapeString_AP(payload, 4)
console.timeEnd('AP');
console.time('Nina');
reshapeString_Nina(payload, 4)
console.timeEnd('Nina');
})()
Results (3 runs):
AP: 0.080078125ms
Nina: 0.13916015625ms
---
AP: 0.057861328125ms
Nina: 0.119140625ms
---
AP: 0.070068359375ms
Nina: 0.116943359375ms

public static String reshape(int n, String str){
StringBuilder sb = new StringBuilder();
char[] c = str.replaceAll(" " , "").toCharArray();
int count =0;
for (int i = 0; i < c.length; i++) {
if(count != n){
sb.append(c[i]);
count++;
}else {
count = 1;
sb.append("\n").append(c[i]);
}
}
return sb.toString();
}

Strings are immutable so whatever you do you have to create a new string. It's best to start creating it in the first place.
var newLineForEach = (n,s) => s && `${s.slice(0,n)}\n${newLineForEach(n,s.slice(n))}`,
result = newLineForEach(3,"blablabla");
console.log(result);
My tests show that this is by far the fastest: 100K iterations took 1260 msec for Nina's, 103 msec for AP's and 33 msec for Redu's. The accepted answer is very inefficient.
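Since the recursive version makes one nested call per chunk, a very long string may run into the engine's call stack limit. A loop-based sketch of the same slicing idea avoids that; reshapeByChunks and its argument check are made up for this illustration, and no timing claims are made for it.

```javascript
// Hypothetical helper: slices text into pieces of `size` characters and
// puts a line break after every piece, including the last one.
function reshapeByChunks(text, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const chunks = [];
  for (let i = 0; i < text.length; i += size) {
    chunks.push(text.slice(i, i + size));
  }
  return chunks.map(chunk => chunk + "\n").join("");
}

console.log(JSON.stringify(reshapeByChunks("blabla", 3)));
// "bla\nbla\n"
```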

search() string for multiple occurrences

Say you have the string Black cat jack black cat jack black cat jack.
How would you use search() to find the 2nd occurrence of the word jack?
I'm guessing the code would look something like:
var str = "Black cat jack black cat jack black cat jack";
var jack = str.search('jack');
But that will only return the location of the first occurrence of jack in the string.

You can use the indexOf method in a loop (foo is the string being searched):
var pos = foo.indexOf("jack");
while(pos > -1) {
pos = foo.indexOf("jack", pos+1);
}
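To get the position of one specific occurrence, such as the 2nd jack from the question, the loop can be wrapped in a small function. This is a minimal sketch; nthIndexOf is a made-up name, and n is counted from 1.

```javascript
// Hypothetical helper: index of the n-th occurrence of term in text
// (n = 1 is the first occurrence), or -1 if there are fewer than n.
function nthIndexOf(text, term, n) {
  let pos = -1;
  for (let found = 0; found < n; found++) {
    pos = text.indexOf(term, pos + 1); // resume just after the previous hit
    if (pos === -1) {
      return -1;
    }
  }
  return pos;
}

var str = "Black cat jack black cat jack black cat jack";
console.log(nthIndexOf(str, "jack", 2)); // 25
```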

Usage recommendation
Note that the String.search method works with RegExp: if you supply a string, it will implicitly be converted into a RegExp. It has more or less the same purpose as RegExp.test, where you only want to know whether there is a match for the RegExp in the string.
If you want to search for a fixed string, then I recommend that you stick with String.indexOf. If you really want to work with a pattern, then you should use RegExp.exec instead to get the indices of all the matches.
String.indexOf
If you are searching for a fixed string, then you can supply the position to resume searching to String.indexOf:
str.indexOf(searchStr, lastMatch + searchStr.length);
I add searchStr.length to prevent overlapping matches, e.g. searching for abab in abababacccc, there will be only 1 match found if I add searchStr.length. Change it to + 1 if you want to find all matches, regardless of overlapping.
Full example:
var lastMatch;
var result = [];
if ((lastMatch = str.indexOf(searchStr)) >= 0) {
result.push(lastMatch);
while ((lastMatch = str.indexOf(searchStr, lastMatch + searchStr.length)) >= 0) {
result.push(lastMatch);
}
}
RegExp.exec
This is to demonstrate the usage. For a fixed string, use String.indexOf instead: you don't need the extra overhead of RegExp in the fixed-string case.
As an example for RegExp.exec:
// Need g flag to search for all occurrences
var re = /jack/g;
var arr;
var result = [];
while ((arr = re.exec(str)) !== null) {
result.push(arr.index);
}
Note that the example above will give you non-overlapping matches. You need to set re.lastIndex if you want to find overlapping matches (no such thing for "jack" as search string, though).
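A minimal sketch of the overlapping case, resetting lastIndex after every match as described above. The function name overlappingMatchIndexes is made up for this illustration.

```javascript
// Hypothetical helper: start indexes of all matches, including overlapping ones.
function overlappingMatchIndexes(text, re) {
  // Work on a copy with the g flag so the caller's lastIndex is untouched.
  const flags = re.flags.includes("g") ? re.flags : re.flags + "g";
  const scanner = new RegExp(re.source, flags);
  const indexes = [];
  let match;
  while ((match = scanner.exec(text)) !== null) {
    indexes.push(match.index);
    // Resume one character after the start of this match, not after its end.
    scanner.lastIndex = match.index + 1;
  }
  return indexes;
}

console.log(overlappingMatchIndexes("abababacccc", /abab/)); // [ 0, 2 ]
```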

I've figured out this solution: call the function that searches and replaces the original string recursively, until no more occurrences of the word are found:
function ReplaceUnicodeChars(myString) {
var pos = myString.search("&#");
if (pos != -1) {
// alert("Found unicode char in string " + myString + ", position " + pos);
var unicodeChars = myString.substr(pos, 6); // e.g. "&#233;" (assumes a three-digit code)
var decimalChars = unicodeChars.substr(2, 3); // the three digits
myString = myString.replace(unicodeChars, String.fromCharCode(decimalChars));
}
if (myString.search("&#") != -1)
// Keep calling the function until there are no more unicode chars
myString = ReplaceUnicodeChars(myString);
return myString;
}
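For entities that are not exactly three digits long, a single replace with a global regex and a callback may be simpler than recursion. This is only a sketch, and decodeNumericEntities is a made-up name; it handles decimal entities of the form &#NNN; and leaves everything else alone.

```javascript
// Hypothetical helper: replaces decimal entities such as "&#233;" with the
// character they stand for.
function decodeNumericEntities(text) {
  return text.replace(/&#(\d+);/g, function (entity, digits) {
    var code = Number(digits);
    // Leave values outside the Unicode range untouched.
    return code > 0x10FFFF ? entity : String.fromCodePoint(code);
  });
}

console.log(decodeNumericEntities("caf&#233; for &#8364;5")); // "café for €5"
```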

New to javascript, how to write reverse iteration?

I'm currently taking an introduction CIS class at my university and one of the projects is javascript. It is split into two unrelated parts and I was able to do the second part but I'm stuck on the first one. My professor wants me to write an iteration that will display in a reverse order whatever name I write in the prompt screen. So if I write "John Smith" it will display "htims nhoj". The issue is that I have no idea how to write it.
<html>
<body>
<script>
var namIn = window.prompt("Enter name:" );
var namAr = namIn.split("");
var namArLen = namAr.length;
document.write(namAr + "<br /> Length: " + namArLen);
</script>
</body>
</html>

Strings in JavaScript have a function called split() which turns them into Arrays. Arrays in JavaScript have a function called reverse() which reverses their order, and a function called join() which turns them back into Strings. You can combine these into:
"John Smith".split("").reverse().join("")
This returns:
"htimS nhoJ"
Also, and I don't know if this is a typo, but you can throw a toLowerCase() to get 100% of what your question is after:
"John Smith".split("").reverse().join("").toLowerCase()
returns:
"htims nhoj"
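As for the question in your title, you can control the direction of a for loop through its initial value, condition and final expression, like so:
var reversed = [];
var chars = "John Smith".split(""); // not called "name": a global "name" in a browser is window.name, which only holds strings
for(var i = chars.length-1; i >= 0; i--) {
reversed.push(chars[i]);
}
console.log(reversed.join(""));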
Which will output:
"htimS nhoJ"

There's no need to split this string into an array. Just use the charAt() function and a simple for loop.
var name = window.prompt("Enter name:");
var reverse = "";
for (var i = name.length - 1; i >=0; i--) {
reverse += name.charAt(i);
}
console.log(reverse)
Instead of converting the string to an array first, you're just reading the characters out of the string directly.

You can accomplish this by iterating only half the number of characters.
DEMO: http://jsfiddle.net/vgG2P/
CODE:
var chars = "Bob Dylan".split(""); // not called "name": a global "name" in a browser is window.name, which only holds strings
// The counters will meet in the middle:
// i starts at the first char and is incremented,
// j starts at the last char and is decremented.
for(var i = 0, j = chars.length-1; i < j; i++, j--) {
var temp = chars[i]; // Store the `i` char
chars[i] = chars[j]; // Replace the `i` char with the `j` char
chars[j] = temp; // Replace the `j` char with the `i` char we stored
}
console.log(chars.join("")); // "nalyD boB"
EXPLANATION:
What we did was split the characters into an Array, and maintain two counters, one that increments from the first character at 0, and the other that decrements from the last character at .length - 1. Then simply swap the characters.
The iteration continues while your incrementing counter is less than your decrementing counter. Since they will meet in the middle, you'll end up incrementing only half the total length.
We can also build the halves of the result without using an Array:
DEMO: http://jsfiddle.net/vgG2P/1/
var name = "Bob Dylan";
var start = "", end = ""
for(var i = 0, j = name.length-1; i < j; i++, j--) {
end = name.charAt(i) + end
start += name.charAt(j)
}
if (i === j)
start += name.charAt(i)
console.log(start + end); // "nalyD boB"

I'm assuming that your professor would not be asking you how to reverse a string if he hasn't yet introduced you to the concept of arrays and loops. Basically, a string like John Smith is just an array of characters like this:
0123456789
John Smith
Again, thinking of a string as just an array of characters, you have 10 characters that need to be reversed. So how do you go about doing this? Well, you basically need to take the last character h from the "array" you're given and make it the first character in a new "array" you're going to create. Here's an example:
var known = 'John Smith';
var reversed = ''; // I'm making an empty string aka character array
var last = known.length - 1 // This is the index of the last character
for (var i = 0; i < known.length; i++)
{
reversed += known[last - i]; // append to reversed (the original used an undeclared temp)
}
So what's happening?
We're looping over known starting at 0 and ending at 9 (from the first character to the last)
During each iteration, i is incrementing from 0 - 9
last always has a value of 9
last - i will give us indexes in reverse order (9, 8, 7, ..., 0)
So, when i is 0, last - i is 9 and known[9] is "h"; repeat this process and you get the reversed string
Hopefully this helps explain a little better what's happening when you call reverse() on an array.

(1) A more straight forward way without built-in functions:
function reverse(str) {
let reversed_string = "";
for (let i = str.length - 1; i >= 0; i--) {
reversed_string += str[i];
}
return reversed_string;
}
(2) Using the ES2015 for...of loop:
function reverse(str) {
let reversed_string = "";
for (let character of str) {
reversed_string = character + reversed_string;
}
return reversed_string;
}
(3) Using ES6 syntax and ES5.1 reduce():
function reverse(str) {
return str.split('').reduce((reversed, char) => char + reversed, '');
}
// reduce takes in 2 arguments. (1) arrow function, and (2) empty string.
Chances are that in an interview you will not be able to use the built-in functions, especially reverse().
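One difference between these approaches that may be worth knowing: indexing with str[i] (and split('')) works on UTF-16 code units, while for...of walks the string by code point. For plain names the results are identical, but characters outside the Basic Multilingual Plane, such as many emoji, only survive the for...of version. A small sketch, with function names made up for the comparison:

```javascript
// Index-based reversal, as in approach (1): works on UTF-16 code units.
function reverseByUnit(str) {
  let reversed = "";
  for (let i = str.length - 1; i >= 0; i--) {
    reversed += str[i];
  }
  return reversed;
}

// for...of reversal, as in approach (2): works on whole code points.
function reverseByCodePoint(str) {
  let reversed = "";
  for (const character of str) {
    reversed = character + reversed;
  }
  return reversed;
}

console.log(reverseByUnit("John Smith"));      // "htimS nhoJ"
console.log(reverseByCodePoint("John Smith")); // "htimS nhoJ"
// With "a\u{1F600}b", only reverseByCodePoint keeps the emoji intact.
```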
