Find first word in a string after a . in javascript - javascript

I am working on a school assignment where we have to highlight the first word after a "." in a text at the press of a button.
So far I have created an HTML page that contains a button; on click it is supposed to highlight the first word after a ".". I have discovered that I probably have to use split and/or slice.
function highligtWord(){
var tekst = document.getElementById("tekst").innerHTML;
for (var i = 0 ; i < tekst.length; i++) {
var arr = tekst.split(". ")[i].split(" ")[0]
console.log(arr)
var res = tekst.replace(`${arr}`, "<span style=background-color:yellow>" + `${arr}` + "</span>" );
document.getElementById("tekst").innerHTML += res;
}
}
So far this does not work as intended, as it highlights words that aren't after a ".". So my question is: what am I doing wrong?
And how can I get it to highlight all words after a "." instead?
Thanks in advance.

It'd probably be easier to use a regular expression: match \. (a literal dot) followed by any spaces, then match \S+ (one or more non-space characters). Then replace with the non-space characters surrounded by the highlight span:
const elm = document.getElementById("tekst");
elm.innerHTML = elm.innerHTML.replace(
/(\. *)(\S+)/,
'$1<span style=background-color:yellow>$2</span>'
);
<div id="tekst">foo bar. baz should be highlighted.</div>
If you want to highlight every word that follows a ., add the global flag to the regular expression (/g).
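As a rough sketch of the global-flag version, the same replacement can be wrapped in a small function that works on a plain string. The function name highlightAfterDots is made up for this illustration, and the style attribute is quoted here, which the snippet above did not do:

```javascript
// Hypothetical helper: wraps every word that follows a "." in a highlight span.
function highlightAfterDots(html) {
  return html.replace(
    // (\. *) keeps the dot and any spaces; (\S+) is the word to highlight
    /(\. *)(\S+)/g,
    '$1<span style="background-color:yellow">$2</span>'
  );
}

console.log(highlightAfterDots("foo bar. baz should. qux too"));
```

In the page this would be applied as elm.innerHTML = highlightAfterDots(elm.innerHTML). Note that \S+ also picks up any punctuation attached to the word.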

Related

Replace a specific character from a string with HTML tags

Given a text input, I need to convert a specific character into a tag. For example, if the special character is *, the text between two special characters must appear in italics.
For example:
This is *my* wonderful *text*
must be converted to:
This is <i>my</i> wonderful <i>text</i>
So I've tried like:
const arr = "This is *my* wonderful *text*";
if (arr.includes('*')) {
arr[index] = arr.replace('*', '<i>');
}
It replaces the star character with <i>, but it doesn't work if there are more special characters.
Any ideas?
You can use a regular expression to detect any text that is surrounded by * and replace it with a tag (<i> in your example). See the following:
Example
let str = "This is *my* wonderful *text*";
let regex = /(?<=\*)(.*?)(?=\*)/;
while (str.includes('*')) {
let matched = regex.exec(str);
let wrap = "<i>" + matched[1] + "</i>";
str = str.replace(`*${matched[1]}*`, wrap);
}
console.log(str);
Here you go, my friend:
var arr = "This is *my* wonderful *text*";
const matched = arr.match(/\*(?:.*?)\*/g);
for (let i = 0; i < matched.length; i++) {
arr = arr.replace(matched[i], `<i>${matched[i].replaceAll("*", "")}</i>`);
}
console.log(arr);
An explanation: first, we match the regex globally by setting the /g flag. Note that match with the global flag returns an array of all matches.
Second, we look for any characters that lie between two asterisks; the asterisks are escaped because * is a metacharacter.
.*? matches lazily (as few characters as possible), so each match stops at the next asterisk instead of running on to the last one.
?: makes the group non-capturing. Then we replace every matched element with itself, minus the asterisks, wrapped in <i> tags.
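For comparison, a shorter variant (a sketch, not taken from either answer above) lets replace do the looping by using a capture group and the $1 placeholder. The function name starsToItalics is an assumption made for this example:

```javascript
// Hypothetical helper: turns *text* into <i>text</i> in a single pass.
function starsToItalics(text) {
  // \* is a literal asterisk; (.*?) lazily captures the text up to the next asterisk
  return text.replace(/\*(.*?)\*/g, "<i>$1</i>");
}

console.log(starsToItalics("This is *my* wonderful *text*"));
```

An asterisk without a partner is simply left in place, because the pattern needs both delimiters to match.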

Splitting a string by white space and a period when not surrounded by quotes

I know that similar questions have been asked many times, but my regular expression knowledge is pretty bad and I can't get it to work for my case.
So here is what I am trying to do:
I have a text and I want to separate the sentences. Each sentence ends with some white space and a period (there can be one or many spaces before the period, but there is always at least one).
At the beginning I used /\s+\./ and it worked great for separating the sentences, but then I noticed that there are cases such as this one:
"some text . some text".
Now, I don't want to separate the text in quotes. I searched and found a lot of solutions that work great for spaces (for example: /(".*?"|[^"\s]+)+(?=\s*|\s*$)/), but I was not able to modify them to separate by white space and a period.
Here is the code that I am using at the moment.
var regex = /\s+\./;
var result = regex.exec(fullText);
if(result == null) {
break;
}
var length = result[0].length;
var startingPoint = result.index;
var currentSentence = fullText.substring(0,startingPoint).trim();
fullText = fullText.substring(startingPoint+length);
I am separating the sentences one by one and removing them from the full text.
The length variable holds the size of the portion that needs to be removed, and startingPoint is the position at which that portion starts. The code is part of a larger while loop.
Instead of splitting you may try and match all sentences between delimiters. In this case it will be easier to skip delimiters in quotes. The respective regex is:
(.*?(?:".*?".*?)?|.*?)(?: \.|$)
Demo: https://regex101.com/r/iS9fN6/1
The sentences then may be retrieved in this loop:
while (match = regex.exec(input)) {
console.log(match[1]); // each next sentence is in match[1]
}
BUT! This particular expression makes the loop run forever: with the g flag, once the end of the input is reached the pattern keeps matching an empty string at $, so regex.exec(input) never returns null. (Looks like a candidate for one more SO question.)
So I can only suggest a workaround: remove the $ from the expression. The regex will then miss the last part, which can be extracted afterwards as a trailer not matched by the regex:
var input = "some text . some text . \"some text . some text\" some text . some text";
//var regex = /(.*?(?:".*?".*?)?|.*?)(?: \.|$)/g;
var regex = /(.*?(?:".*?".*?)?|.*?) \./g;
var trailerPos = 0;
while (match = regex.exec(input)) {
console.log(match[1]); // each next sentence is in match[1]
trailerPos = match.index + match[0].length;
}
if (trailerPos < input.length) {
console.log(input.substring(trailerPos)); // the last sentence is the unmatched trailer
}
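Another possible way around the endless loop, sketched here as an alternative rather than as the answer's method, is to keep the $ in the pattern and stop as soon as exec returns a zero-length match. The function name splitSentences and the trim() call are additions made for this example:

```javascript
// Hypothetical helper: collects sentences delimited by " ." while ignoring
// a " ." that sits inside double quotes.
function splitSentences(input) {
  var regex = /(.*?(?:".*?".*?)?|.*?)(?: \.|$)/g;
  var sentences = [];
  var match;
  while ((match = regex.exec(input)) !== null) {
    if (match[0] === "") {
      // A zero-length match does not advance lastIndex, so looping on would never end
      break;
    }
    sentences.push(match[1].trim());
  }
  return sentences;
}

console.log(splitSentences('a . b . "c . d" e . f'));
```

With this guard the last sentence is captured by the regex itself, so no separate trailer handling is needed.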
Update:
If sentences span multiple lines, the regex won't work, since the . pattern does not match the newline character. In this case just use [\s\S] instead of .:
var input = "some \ntext . some text . \"some\n text . some text\" some text . so\nm\ne text";
var regex = /([\s\S]*?(?:"[\s\S]*?"[\s\S]*?)?|[\s\S]*?) \./g;
var trailerPos = 0;
var sentences = [];
while (match = regex.exec(input)) {
sentences.push(match[1]);
trailerPos = match.index + match[0].length;
}
if (trailerPos < input.length) {
sentences.push(input.substring(trailerPos));
}
sentences.forEach(function(s) {
console.log("Sentence: -->%s<--", s);
});
Use the encode and decode of javascript while sending and receiving.

Add a string to the thing being replaced in a case-insensitive RegExp

Sorry for the badly formulated question; I hope my text and code will make it clearer what I want to accomplish.
I am writing some JavaScript for an Android app and have a problem with the JavaScript RegExp for my WebView. Could someone please help me?
Basic pseudocode for what I want to do:
/*
* Replace all instances of a letter (case insensitive) with itself plus some string.
* Example: search for all 'a' (case insensitive) and replace each with 'a someString'
* if it was lowercase. If it was a capital, replace it with 'A someString'.
*/
This is my code (sorry it's all in a string; it has to be for the WebView).
"var alpha = 'abcdefghijklmnopqrstuvwxyz'.split('');" +
"for (i = 0; i < 5; i++) " +
"{if(window.HtmlViewer.isActive(i))" +
"{var re = new RegExp( \"(\" + alpha[i] + \"(?![^<>]*>))\", 'gi' );" +
"document.body.innerHTML = document.body.innerHTML.replace(re, " +
"'<font color=\"'+colorsArray[i]+'\">'+alpha[i]+'</font>');}" +
"else{break;}" +
"};" +
In the first loop it replaces all 'a' and 'A' with 'a' and gives it a color. What I want is to replace 'a' with only 'a' and 'A' with only 'A', i.e. the only thing that changes is the color (font color="'+colorsArray[i]). Any idea how I would accomplish this? Can I somehow use the var re to find out whether the match was a capital or a lowercase letter, and then do something like:
"'<font color=\"'+colorsArray[i]+'\">'+re.getString()+'</font>');}" +
The solution I have now is to make two for loops and remove the 'i' (case insensitive) modifier. In the first loop I handle lowercase and in the second loop I handle uppercase. But this seems like double work, since the color for both 'a' and 'A' is the same. Is there a better way to do this?
First off, you don't have to use a loop to match each letter: you can use the regex pattern [a-z] with the 'i' flag instead of alpha[i]. I have set up an example that should work for your case here:
http://jsfiddle.net/yy6we65a/3/
var re = /([a-z](?![^<>]*>))/gi;
var colors = ["red","blue","green","yellow","orange"];
var i =0;
function encapsulate(src){
var ret = '<font color="'+colors[i]+'">'+src+'</font>';
i++;
if( i == 5) i = 0;
return ret;
}
var orig = document.getElementById("container").innerHTML;
var target = document.getElementById("target");
target.innerHTML = orig.replace(re,encapsulate);
I didn't have your color array, so I used one of my own but of course you can just use yours. There are comments in the code to explain each section.
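If the per-letter loop from the question is kept, one way to preserve the case of each match is to refer to the captured text with $1 in the replacement string. The following is only a sketch: the helper name colorLetter and its parameters are invented for the example, and it assumes letter is a plain letter that needs no regex escaping:

```javascript
// Hypothetical helper: colors one letter in both cases while keeping the original case.
function colorLetter(html, letter, color) {
  // (?![^<>]*>) skips letters that sit inside a tag, as in the question's pattern
  var re = new RegExp("(" + letter + "(?![^<>]*>))", "gi");
  // $1 is the text that actually matched, so 'a' stays 'a' and 'A' stays 'A'
  return html.replace(re, '<font color="' + color + '">$1</font>');
}

console.log(colorLetter('An apple <img alt="a">', "a", "red"));
```

Inside the question's loop this would be called as document.body.innerHTML = colorLetter(document.body.innerHTML, alpha[i], colorsArray[i]).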

Javascript - How to join two capitalize first letter of word scripts

I have an Acrobat form with some text fields with multiline on. My goal is to convert to uppercase the first letter of any sentence (look for dots) and also the first letter of any new line (after return has been pressed).
I can run each transformation separately, but do not know how to run them together.
To capitalize sentences I use the following code as a custom validation script:
// make an array split at dot
var aInput = event.value.split(". ");
var sCharacter = '';
var sWord='';
// for each element of word array, capitalize the first letter
for(i = 0; i <aInput.length; i++)
{
aInput[i] = aInput[i].substr(0, 1).toUpperCase() + aInput[i].substr(1) .toLowerCase();
}
// rebuild input string with modified words with dots
event.value = aInput.join('. ');
To capitalize new lines I replace ". " with "\r".
Thanks in advance for any help.
You can capitalize the first character of each sentence with a RegExp:
event.value = event.value.replace(/.+?[\.\?\!](\s|$)/g, function (txt) {
return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase();
});
Demo : http://jsfiddle.net/00kzc370/
Regular expression explained:
/.+?[\.\?\!](\s|$)/g is a regular expression.
.+?[\.\?\!](\s|$) is a pattern (to be used in a search) that matches sentences ending in ., ? or ! and followed by a whitespace character or the end of the string.
g is a modifier that performs a global match (finds all matches rather than stopping after the first one).
Source : http://www.w3schools.com/jsref/jsref_obj_regexp.asp
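To run both transformations in one pass, one option is a single replace whose pattern accepts either a sentence end or a line break in front of the letter. This is a sketch under assumptions: the function name capitalizeStarts is invented, only ASCII lowercase letters are handled, and unlike the answer above it does not lowercase the rest of the sentence:

```javascript
// Hypothetical helper: capitalizes the first letter of the text, of every
// sentence (after ., ? or ! plus whitespace) and of every new line.
function capitalizeStarts(text) {
  return text.replace(/(^|[.?!]\s+|[\r\n]+)([a-z])/g, function (match, lead, letter) {
    // lead is whatever precedes the letter (nothing, punctuation plus space, or a line break)
    return lead + letter.toUpperCase();
  });
}

console.log(capitalizeStarts("hello world. this is\rnew line? yes"));
```

In an Acrobat validation script the result would be assigned back with event.value = capitalizeStarts(event.value).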

Select Random Words From Tag, Wrap In Italic

I have a bunch of dynamically generated H1 tags.
I want to randomly select 1 word within the tag, and wrap it in italic tags.
This is what I have so far. The problem is that it takes the first h1's dynamically generated content and duplicates it to every h1 on the page.
Other than that, it works.
Any ideas?
var words = $('h1').text().split(' ');
// with help from http://stackoverflow.com/questions/5915096/get-random-item-from-array-with-jquery
var randomWord = words[Math.floor(Math.random()*words.length)];
// with more help from http://stackoverflow.com/questions/2214794/wrap-some-specified-words-with-span-in-jquery
$('h1').html($('h1').html().replace(new RegExp( randomWord, 'g' ),'<i>'+randomWord+'</i>'));
My ultimate goal
<h1>This is a <i>title</i></h1>
<h1><i>This</i> is another one</h1>
<h1>This <i>is</i> the last one</h1>
All of the titles will be dynamically generated.
http://codepen.io/anon/pen/uskfl
The problem is $('h1') creates a collection of all of the h1 tags in the page.
You can pass a callback function to the html() method, which will loop over every h1 and treat each one as a separate instance:
$('h1').html(function(index, existingHtml) {
var words = existingHtml.split(' ');
var randomWord = words[Math.floor(Math.random() * words.length)];
return existingHtml.replace(new RegExp(randomWord, 'g'), '<i>' + randomWord + '</i>');
});
See the html() docs (scroll halfway down the page; the function argument was not available in earlier versions).
You can use jQuery's .each() to iterate through the h1s.
$('h1').each(function(){
var words = $(this).text().split(' ');
var randomWord = words[Math.floor(Math.random()*words.length)];
$(this).html(
$(this).html().replace(new RegExp( randomWord, 'g'),'<i>'+randomWord+'</i>')
);
});
Demo: http://jsfiddle.net/RT25S/1/
Edit: I just noticed a bug in my answer that is also in your question and probably in the other answers.
In titles like "this is another one", "is" is italicised both in "is" and inside "this". scrowler commented that when the selected word is in the title multiple times, all occurrences will be italicised, but I doubt you intended for partial words to be italicised.
The fixes are relatively simple. Just check for spaces before and after the word. You also have to allow for words at the beginning and end of the title using the ^ and $ metacharacters.
Ideally we could use \b, which is a "word boundary", instead but it doesn't seem to work when words end with non-alphanum characters.
You should also probably escape the randomly selected word before including it in a regex, in case it contains any special characters. I added the escaping regex from "Is there a RegExp.escape function in Javascript?".
The updated code:
$('h1').each(function(){
var words = $(this).text().split(' ');
var randomWord = words[Math.floor(Math.random()*words.length)];
// Escape the word before including it in a regex in case it has any special chars
var randomWordEscaped = randomWord.replace(/[-\/\\^$*+?.()|[\]{}]/g, '\\$&');
$(this).html(
$(this).html().replace(
//new RegExp( '\\b' + randomWordEscaped + '\\b', 'g' ),
new RegExp( '(^| )' + randomWordEscaped + '( |$)', 'g' ),
'<i> ' + randomWord + ' </i>'
)
);
});
And the updated JSFiddle: http://jsfiddle.net/RT25S/3/
Note that I added spaces after and before the <i> tags because the regex now captures them. (This still works for words at the beginning/ends of titles because HTML ignores that whitespace.)
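A possible refinement, sketched under the assumption that words are separated by single spaces: capture the leading space and check the trailing one with a lookahead, so the spaces stay outside the <i> tags and two adjacent occurrences of the word are both matched. The helper names escapeRegExp and italicizeWord are invented for this example:

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[-\/\\^$*+?.()|[\]{}]/g, "\\$&");
}

// Hypothetical helper: italicizes whole-word occurrences of `word` in `title`.
function italicizeWord(title, word) {
  // (^| ) captures the leading boundary; (?= |$) checks the trailing one without consuming it
  var re = new RegExp("(^| )" + escapeRegExp(word) + "(?= |$)", "g");
  return title.replace(re, function (match, lead) {
    return lead + "<i>" + word + "</i>";
  });
}

console.log(italicizeWord("this is another one", "is"));
```

Inside the .each() callback this could be used as $(this).html(italicizeWord($(this).html(), randomWord)), assuming the titles contain plain text only.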
