Is it possible to substring to the next whitespace in a string? - javascript

I am trying to highlight the remainder of a word if the text value from an input includes the start of the word. For example, given the string "Nutrition is great!", if the user types in "Nutr" then I would like "Nutrition" to be highlighted. I am having quite a lot of difficulty with this and was wondering if it's possible to substring to the next available whitespace. Or could anyone give me pointers for a different/better approach?
I have taken some inspiration from This post but unfortunately, it doesn't match the full word as I'd like.
I have created a sandbox to demonstrate my example. You'll notice that if you type in "Nutr", only 2 of the options get highlighted when all 3 should be.
https://codesandbox.io/s/dry-wind-17mko

You have a few issues, and others have already pointed out the one about upper/lower case. You are also including the period at the end of the sentence, which is unlikely to be your intent.
This solution doesn't require changing the case of the input or anything.
I would like to offer a simpler approach using Regular expressions and the .match() method from the String Object.
This will allow a case insensitive match while only highlighting the word. In the code below, you'll see the matched searched word placed into the HTML with a simple call to .replace() adding the <span> tags.
The match is done based upon finding the typed characters and then finding the word boundary (the space you mentioned in your question).
if (titleRef.current) {
  let resultsText = titleRef.current.innerHTML;
  const regex = new RegExp("(?=" + searchTerm + ")\\w*", "i");
  const found = resultsText.match(regex);
  if (found) {
    titleRef.current.innerHTML = resultsText.replace(found[0], `<span>${found[0]}</span>`);
  }
}
There is one remaining problem. If the search no longer matches (say you type 'nutrdddd'), the original match remains highlighted. To overcome this, you'll need to remove the span tags, and I'll leave that up to you.
Hope this is helpful. Here is the CodeSandBox
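One possible way to deal with the stale highlight is to rebuild the markup from the plain text on every change instead of patching the existing innerHTML. This is only a sketch: highlightWord and escapeRegExp are hypothetical helper names, and it assumes the text contains no HTML of its own and that words are made of ASCII word characters (\w).

```javascript
// Hypothetical helper: escape characters that are special in a RegExp,
// so input such as "c++" or "(" cannot break the pattern.
function escapeRegExp(input) {
  return input.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: returns `text` with the first word that starts with
// `searchTerm` wrapped in <span>, or `text` unchanged if nothing matches.
function highlightWord(text, searchTerm) {
  if (!searchTerm) return text;
  // \b anchors the match to the start of a word; \w* extends it to the word's end.
  const regex = new RegExp("\\b" + escapeRegExp(searchTerm) + "\\w*", "i");
  return text.replace(regex, (word) => "<span>" + word + "</span>");
}

console.log(highlightWord("Nutrition is great!", "nutr"));
console.log(highlightWord("Nutrition is great!", "nutrdddd"));
```

Because the function always starts from the unhighlighted text, a search that no longer matches simply returns the text as it was, so there are no old span tags to clean up.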

The comparison has to be done with both strings in the same case, either upper or lower, so that the text is matched irrespective of its case.
UPDATE:
Also, we need to select only the matched word, not the full length of the text in the span.
UPDATED CODESANDBOX: https://codesandbox.io/s/highlighting-z216x
Logic to get the specific text updated to work dynamically.
Code Changes in useEffect() in Result.js
useEffect(() => {
  console.log(searchTerm);
  if (titleRef.current) {
    // Keep the original text so its casing survives; upper-case only for the comparison.
    const originalText = titleRef.current.textContent;
    const index = originalText.toUpperCase().indexOf(searchTerm);
    if (index >= 0) {
      // Take everything from the match up to the next space.
      const text = originalText.substring(index).split(" ")[0];
      titleRef.current.innerHTML =
        originalText.substring(0, index) +
        "<span>" +
        text +
        "</span>" +
        originalText.substring(index + text.length);
    }
  }
}, []);
And in App.js we need to make sure to pass searchTerm as an uppercase prop
<Result
  text="This is nutritional value"
  searchTerm={searchTerm.toUpperCase()}
/>
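For the literal question of taking a substring up to the next whitespace, a minimal sketch outside React might look like this. rangeToNextWhitespace is a hypothetical name, and the code assumes that lower-casing does not change the length of the text, which holds for plain ASCII.

```javascript
// Hypothetical helper: finds `term` in `text` (ignoring case) and returns the
// start and end offsets of the run from the match up to the next whitespace,
// or null when the term is empty or does not occur.
function rangeToNextWhitespace(text, term) {
  if (term === "") return null;
  const start = text.toLowerCase().indexOf(term.toLowerCase());
  if (start < 0) return null;
  // Look for the first whitespace character at or after the match.
  const offset = text.slice(start).search(/\s/);
  const end = offset < 0 ? text.length : start + offset;
  return { start, end };
}

const sampleText = "This is nutritional value";
const sampleRange = rangeToNextWhitespace(sampleText, "NUTR");
console.log(sampleText.slice(sampleRange.start, sampleRange.end));
```

Note that stopping at whitespace keeps trailing punctuation: searching "great" in "Nutrition is great!" would select "great!".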

Well, first you need to convert both your string and the input to either lowercase or uppercase so that they are compared on the same basis. The rest is fine: you can check for a match using the .includes() method, and you can trim your string using the .trim() method so that only the desired word is highlighted.

Related

JavaScript: Replace certain occurrence of string depending on selection index

I've created my own autocomplete feature and I've come across a bug I'd like to fix. Here's an example of an incomplete sentence I might want to autocomplete the final word for:
let text = 'Hello there, I am her'
In my functionality the user presses ctrl + enter and it autocompletes the word with a suggestion displayed on the page. In this case let's say the suggestion is 'here'. Also my controller knows where the user is based on the insertion cursor (so I have the index).
If I use replace like so:
text.replace(word, suggestion);
(Where word is 'her' and suggestion is 'here') it will replace the first occurrence. Obviously there are endless combinations of where this word might be in the text, how do I replace one at a certain index in text string? I know I can do it through some messy if conditions, but is there an elegant way to do this?
(If it is relevant I am using angular keydown/keyup for this)
EDIT>>>>>
This is not a duplicate on the question linked as in that case they are always replacing the last occurrence. If I did that then my program wouldn't support a user going back in their sentence and attempting to autocomplete a new word there
So, you have a position in a string and a number of characters to replace (=length of an incomplete word). In this case, would this work?
let text = 'appl and appl and appl'
function replaceAt(str, pos, len, replace) {
  return str.slice(0, pos) + replace + str.slice(pos + len);
}
console.log(replaceAt(text, 0, 4, 'apple'))
console.log(replaceAt(text, 9, 4, 'apple'))
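If only the cursor index is known, the start and length of the incomplete word could be derived from it. A rough sketch, assuming the cursor sits right after the partial word and that a "word" is any run of non-whitespace characters (completeAtCursor is a hypothetical name):

```javascript
// Same idea as the replaceAt helper: splice `replacement` over `len` characters at `pos`.
function replaceAt(str, pos, len, replacement) {
  return str.slice(0, pos) + replacement + str.slice(pos + len);
}

// Hypothetical helper: replaces the partial word that ends at `cursor`
// with `suggestion`.
function completeAtCursor(text, cursor, suggestion) {
  const before = text.slice(0, cursor);
  // The run of non-whitespace characters immediately before the cursor
  // (an empty string if the cursor follows whitespace).
  const partial = before.match(/\S*$/)[0];
  return replaceAt(text, cursor - partial.length, partial.length, suggestion);
}

const sentence = "I am her and her dog is her";
// Cursor right after the first "her" (index 8).
console.log(completeAtCursor(sentence, 8, "here"));
```

Only the occurrence touching the cursor is changed, so the user can go back into the sentence and complete a word there.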
Gonna point you in a direction that should get you started.
let sentence = 'Hello lets replace all words like hello with Hi';
let fragments = sentence.split(' ');
for (let i = 0; i < fragments.length; i++) {
  if (fragments[i].toLowerCase() == 'hello')
    fragments[i] = 'Hi'
}
let formattedsentence = fragments.join(' ');
console.log(formattedsentence); //"Hi lets replace all words like Hi with Hi"

Uppercase for each new word swedish characters and html markup

I was pointed to this post, which does not seem to meet the criteria I have:
Replace a Regex capture group with uppercase in Javascript
I am trying to make a regex that will:
format a string by adding uppercase for the first letter of each word and lower case for the rest of the characters
ignore HTML markup
Accept swedish characters (åäöÅÄÖ)
Say I've got this string:
<b>app</b>le store östersund
Then I want it to be (changes marked by uppercase characters)
<b>App</b>le Store Östersund
I've been playing around with it and the closest I've got is the following:
(?!([^<])*?>)[åäöÅÄÖ]|\s\b\w
Resulted in
<b>app</b>le Store Östersund
Or this
/(?!([^<])*?>)[åäöÅÄÖ]|\S\b\w/g
Resulted in
<B>App</B>Le store Östersund
Here's a fiddle:
http://refiddle.com/refiddles/598aabef75622d4a531b0000
Any help or advice is much appreciated.
It is not possible to do this with regexp alone, since regexp doesn't understand HTML structure. [*] Instead, we need to process each text node, and carry through our logic for what is the beginning of the word in case a word continues across different text nodes. A character is at start of the word if it is preceded by a whitespace, or if it is at the start of the string and it is either the first text node, or the previous text node ended in whitespace.
function htmlToTitlecase(html, letters) {
  let div = document.createElement('div');
  let re = new RegExp("(^|\\s)([" + letters + "])", "gi");
  div.innerHTML = html;
  let treeWalker = document.createTreeWalker(div, NodeFilter.SHOW_TEXT);
  let startOfWord = true;
  while (treeWalker.nextNode()) {
    let node = treeWalker.currentNode;
    node.data = node.data.replace(re, function(match, space, letter) {
      if (space || startOfWord) {
        return space + letter.toUpperCase();
      } else {
        return match;
      }
    });
    startOfWord = node.data.match(/\s$/);
  }
  return div.innerHTML;
}
console.log(htmlToTitlecase("<b>app</b>le store östersund", "a-zåäö"));
// <b>App</b>le Store Östersund
[*] Maybe possible, but even if so, it would be horribly ugly, since it would need to cover an awful amount of corner cases. Also might need a stronger RegExp engine than JavaScript's, like Ruby's or Perl's.
EDIT:
Even if just specifying really simple html tags? The only ones I am actually in need of covering is <b> and </b> at the moment.
This was not specified in the question. The solution is general enough to work for any markup (including simple tags). But...
function simpleHtmlToTitlecaseSwedish(html) {
  return html.replace(/(^|\s)(<\/?b>|)([a-zåäö])/gi, function(match, space, tag, letter) {
    return space + tag + letter.toUpperCase();
  });
}
console.log(simpleHtmlToTitlecaseSwedish("<b>app</b>le store östersund"));
I have a solution which uses almost only regex. It may not be the most intuitive way to do it, but it should be effective and I find it funny :)
You have to append at the end of your string every lowercase character followed by their uppercase counterpart, like this (it must also be preceded by a space for my regex) :
aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZåÅäÄöÖ
(I don't know which letters are missing, I know nothing about swedish alphabet, sorry... I'm counting on you to correct that !)
Then you can use the following regex :
(?![^<]*>)(\s<[^/]*?>|\s|^)([\wåäö])(?=.*\2(.)\S*$)|[\wåÅäÄöÖ]+$
Replace by :
$1$3
Test it here
Here is a working javascript code :
// Initialization
var regex = /(?![^<]*>)(\s<[^/]*?>|\s|^)([\wåäö])(?=.*\2(.)\S*$)|[\wåÅäÄöÖ]+$/g;
var string = "test <b when=\"2>1\">ap<i>p</i></b>le store östersund";
// Processing
result = string + " aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZåÅäÄöÖ";
result = result.replace(regex, "$1$3");
// Display result
console.log(result);
Edit : I forgot to handle first word of the string, it's corrected :)

Regular expression to match a string which is NOT matched by a given regexp

I've been looking through some answers here, and I can't find a solution to my problem:
I have this regexp which matches everything inside an HTML span tag, including contents:
<span\b[^>]*>(.*?)</span>
and I want to find a way to make a search in all the text, except for what is matched with that regexp.
For example, if my text is:
var text = '...for there is a class of <span class="highlight">guinea</span> pigs which...'
... then the regexp would match:
<span class="highlight">guinea</span>
and I want to be able to make a regexp such that if I search for "class", regexp will match "...for there is a class of..."
and will not match inside the tag, like in
"... class="highlight"..."
The word to be matched ("class") might be anywhere within the text. I've tried
(?!<span\b[^>]*>(.*?)</span>)class
but it keeps searching inside tags as well.
I want to find a solution using only regexp, not dealing with DOM nor JQuery. Thanks in advance :).
Although I wouldn't recommend this, I would do something like below
(class)(?:(?=.*<span\b[^>]*>))|(?:(?<=<\/span>).*)(class)
You can see this in action here
Rubular Link for this regex
You can capture your matches from the groups and work with them as needed. If you can, use a HTML parser and then find matches from the text element.
It's not pretty, but if I get you right, this should do what you want. It's done with a single RegEx, but js can't (to my knowledge) extract the result without joining the results in a loop.
The RegEx: /(?:<span\b[^>]*>.*?<\/span>)|(.)/g
Example js code:
var str = '...for there is a class of <span class="highlight">guinea</span> pigs which...',
    pattern = /(?:<span\b[^>]*>.*?<\/span>)|(.)/g,
    match,
    res = '';
match = pattern.exec(str);
while (match != null) {
  // match[1] is undefined when the span alternative matched, so skip it.
  if (match[1] !== undefined) {
    res += match[1];
  }
  match = pattern.exec(str);
}
document.writeln('Result:' + res);
In English: Do a non capturing test against your tag-expression or capture any character. Do this globally to get the entire string. The result is a capture group for each character in your string, except the tag. As pointed out, this is ugly - can result in a serious number of capture groups - but gets the job done.
If you need to send it in and retrieve the result in one call, I'd have to agree with previous contributors - It can't be done!
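The same "match the tag or capture what you want" trick can also be pointed at the search word itself rather than at every character. The following is only a sketch under a few assumptions: findOutsideSpans is a hypothetical name, the word contains no regex metacharacters, spans are not nested, and a span does not contain line breaks.

```javascript
// Hypothetical helper: returns the start offsets of `word` in `html`,
// skipping anything inside <span ...>...</span>.
function findOutsideSpans(html, word) {
  // First alternative swallows a whole span; second captures the word.
  const pattern = new RegExp("<span\\b[^>]*>.*?</span>|(" + word + ")", "g");
  const offsets = [];
  for (const match of html.matchAll(pattern)) {
    // Group 1 is only defined when the second alternative matched.
    if (match[1] !== undefined) offsets.push(match.index);
  }
  return offsets;
}

const spanText = '...for there is a class of <span class="highlight">guinea</span> pigs which...';
console.log(findOutsideSpans(spanText, "class"));
```

This still needs a loop over the matches, so it does not change the conclusion that a single pattern on its own won't return the result in one call.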

Trying to use Javascript regex to grab a section of "&" delimited text whether or not it's the last value

My text looks similar to this:
action=addItem&siteId=4&lang_locale=en_US&country=US&catalogId=1&productId=417689&displaySize=7&skuSize=2194171&qty=1&pil=7&psh=had+AIRJRnjbp7+rGivIKg00
and I want to replace the value of 'psh'. It may sometimes not be the last value (it may be followed by &something=else).
I've tried doing these lines of code:
var text = text.replace(/&psh=.*(?=&|$)/, "&psh=" + data.psh);
var text = text.replace(/&psh=.*(?=[&|$]+)/, "&psh=" + data.psh);
var text = text.replace(/(?:&psh=)(.*)(?=[&|$]+)/, data.psh);
None of them work for both situations. Use this site to check regexes.
This should work:
var text = text.replace(/&psh=[^&]*/, "&psh=" + data.psh);
[^&]* matches a string of any length that consists of any characters except &, therefore the match will continue until the end of the string or until (but not including) the next &, whichever comes first.
Tim's answer may work, but I fear it is not the best possible answer. The string you are giving as an example looks a lot like a url. If it is, that means there can sometimes be a pound sign in it as well (#). To compensate for that you actually need to modify your code to look like this:
var text = text.replace(/&psh=[^&#]*/, "&psh=" + data.psh);
Notice the # which was added in order to not get tripped up by anchor tags in the url.
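Two further edge cases may be worth covering: psh appearing as the very first key (no leading &), and a new value that contains a $, which replace() would otherwise interpret as a replacement pattern. A small sketch, with replacePsh as a hypothetical wrapper name:

```javascript
// Hypothetical wrapper around the regex from the answers above.
function replacePsh(query, newValue) {
  // (^|&) lets the key sit at the start of the string or after another pair.
  // Using a function as the replacement keeps "$" sequences in newValue literal.
  return query.replace(/(^|&)psh=[^&#]*/, (match, prefix) => prefix + "psh=" + newValue);
}

console.log(replacePsh("qty=1&pil=7&psh=had+AIRJRnjbp7+rGivIKg00", "new"));
console.log(replacePsh("qty=1&psh=old&something=else", "new"));
```

The newValue is inserted as-is, so it is assumed to be already URL-encoded.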

javascript regex to extract the first character after the last specified character

I am trying to extract the first character after the last underscore in a string with an unknown number of '_' in the string but in my case there will always be one, because I added it in another step of the process.
What I tried is this. I also tried the regex by itself to extract from the name, but my result was empty.
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var string = match(/[^_]*$/)[1]
string.charAt(0)
So the final desired result is 'D'. If the RegEx can only get me what is behind the last '_' that is fine because I know I can use the charAt like currently shown. However, if the regex can do the whole thing, even better.
If you know there will always be at least one underscore you can do this:
var s = "XXXX-XXXX_XX_DigitalF.pdf"
var firstCharAfterUnderscore = s.charAt(s.lastIndexOf("_") + 1);
// OR, with regex
var firstCharAfterUnderscore = s.match(/_([^_])[^_]*$/)[1]
With the regex, you can extract just the one letter by using parentheses to capture that part of the match. But I think the .lastIndexOf() version is easier to read.
Either way if there's a possibility of no underscores in the input you'd need to add some additional logic.
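The additional logic could be as small as the following sketch (firstCharAfterLastUnderscore is a hypothetical name, and returning null for "not found" is just one possible convention):

```javascript
// Hypothetical helper: returns the first character after the last "_",
// or null when there is no underscore or nothing follows it.
function firstCharAfterLastUnderscore(name) {
  const index = name.lastIndexOf("_");
  // Without this check, lastIndexOf's -1 would become charAt(0) and
  // silently return the first character of the whole string.
  if (index < 0 || index === name.length - 1) return null;
  return name.charAt(index + 1);
}

console.log(firstCharAfterLastUnderscore("XXXX-XXXX_XX_DigitalF.pdf"));
console.log(firstCharAfterLastUnderscore("NoUnderscore.pdf"));
```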
