Add HTML Around Regex Matches From Element Inner Text - javascript

I have a span element with a piece of inner text. I need to target specific pieces of that text through regex and turn my matches into span elements with a class attribute class="highlight-yellow" and a custom attribute called my-custom-attribute="hello".
Here is my input and expected output...
Input
<span class="message"> This is some text $HIGH:LOW $LOW:HIGH </span>
Output
<span> This is some text <span class="highlight-yellow" my-custom-attribute="hello">$HIGH:LOW</span> <span class="highlight-yellow" my-custom-attribute="hello">$LOW:HIGH</span></span>
How can I do the replace? Here is my current code collecting the matches:
function handle()
{
let element = document.getElementsByClassName('message')
let text = element[0].innerText;
let regex = /([^=])\$([A-Za-z:]{1,})/g;
let matches = text.match(regex);
if(matches != null)
{
for(let i = 0; i < matches.length; ++i)
{
// TODO: take the matches and replace them with spans with attributes.
}
}
}

You can simply use your regex in String.prototype.replace() and replace it with the desired markup wrapping your capturing group.
I have removed the first capturing group, since it matches the character before the $ sign (here a space), which is not what you want inside the span. On top of that, I'd suggest using querySelectorAll and iterating through the node list to future-proof your setup.
If you really only want to select the first element that matches the selector .message, that's also possible.
NOTE: Your output also seems to remove the message class from your original <span> element, but I'm not sure if that is a typo or intentional.
See proof-of-concept below:
function handle() {
  // Capture the whole token, including the leading "$", so that $1 keeps it in the output.
  const regex = /(\$[A-Za-z:]{1,})/g;
  document.querySelectorAll('.message').forEach(el => {
    el.innerHTML = el.innerText.replace(regex, '<span class="highlight-yellow" my-custom-attribute="hello">$1</span>');
  });
}
handle();
span.highlight-yellow {
background-color: yellow;
}
<span class="message">This is some text $HIGH:LOW $LOW:HIGH</span>
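The snippet above drops the [^=] guard from the question's regex and writes the element's text back through innerHTML without escaping it. Here is a minimal sketch that keeps the guard and escapes the text first. The helper names escapeHtml and highlightTokens are made up for this example, and it assumes the input is plain text such as the value of innerText:

```javascript
// Escape characters that are significant in HTML so plain text stays plain text.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Wrap every $TOKEN that is not directly preceded by "=" in a highlight span.
// Group 1 is the character before the token (or the start of the string),
// group 2 is the token itself, including the "$".
function highlightTokens(text) {
  const regex = /(^|[^=])(\$[A-Za-z:]+)/g;
  return escapeHtml(text).replace(
    regex,
    (match, before, token) =>
      `${before}<span class="highlight-yellow" my-custom-attribute="hello">${token}</span>`
  );
}
```

In the browser the result would be assigned with something like el.innerHTML = highlightTokens(el.innerText). A callback is used for the replacement so that the "$" in the token is never interpreted as a replacement pattern.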

Related

Wrap all individual words in a span tag based on their first letter

I am trying to wrap each individual word on a webpage in a tag so I can style them individually based on their starting letter.
I have found this method of wrapping each word in a span tag individually, but I can't figure out how to vary the class based on the first letter of the word.
let e = document.getElementById('words');
e.innerHTML = e.innerHTML.replace(/(^|<\/?[^>]+>|\s+)([^\s<]+)/g, '$1<span class="word">$2</span>');
What you're trying to achieve can't be done with regex if you want to reference the individual words.
I've written a little snippet that uses document.querySelector() instead.
The outerText property of the selected element returns a plain-text string, which is then converted to an array with split().
It then loops over the array and appends a span for each word; to get the first letter I've used substring().
const words = document.querySelector("#words").outerText.split(" ");
const wordsDiv = document.querySelector("#words")
wordsDiv.innerHTML = ""
words.map((el) => {
wordsDiv.innerHTML += `<span class="${el.substring(0, 1)}">${el}</span> `
})
<div id="words">red green blue orange</div>
Try this:
let e = document.getElementById('words');
// Create array with words
let words = e.innerHTML.split(' ');
// Object with CSS classes and corresponding letter
const classes = {
a: 'class-one',
b: 'class-two',
// and so on
}
// Return array with the new strings
words = words.map(word => {
const firstLetter = word.substring(0,1);
return `<span class="${classes[firstLetter]}">${word}</span>`;
});
// Join the array and update the DOM
e.innerHTML = words.join(' ');
Here is what is happening:
The text is split on spaces, creating an array of words;
The classes constant must map each letter to its class, as in the example;
In the callback passed to map, the first letter of the word is extracted and used to look up the class in the classes object;
Finally, join concatenates all the strings, separating them with a space.
Keep in mind that if there is more than one space between words, inconsistencies will occur, because the split produces empty strings that have no first letter.
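One way around the multiple-spaces caveat is to split on runs of whitespace and drop empty entries. This is only a sketch: wrapWords and the fallbackClass parameter are names invented here, and it assumes the input is plain text without markup:

```javascript
// Wrap each word in a span whose class depends on the word's first letter.
// `classes` maps a lowercase letter to a class name; letters without an
// entry get `fallbackClass` (a hypothetical default chosen for this sketch).
function wrapWords(text, classes, fallbackClass = 'word') {
  return text
    .split(/\s+/)                       // split on any run of whitespace
    .filter(word => word.length > 0)    // drop empty strings from leading/trailing whitespace
    .map(word => {
      const firstLetter = word[0].toLowerCase();
      const cls = classes[firstLetter] || fallbackClass;
      return `<span class="${cls}">${word}</span>`;
    })
    .join(' ');
}
```

It could be used as e.innerHTML = wrapWords(e.textContent, classes).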
From the String.prototype.replace reference, you can also pass a replacer function to the method instead of a string. The replacer function is called with the full match followed by the value of each capturing group.
So, if I didn't misinterpret your problem, you can do something similar to this:
let e = document.getElementById('words');
e.innerHTML = e.innerHTML.replace(/(^|<\/?[^>]+>|\s+)([^\s<]+)/g, function(match, p1, p2) {
  // Map each first letter to its class
  const myClasses = {
    a: "aword",
    b: "bword",
    // ...
  };
  return `${p1}<span class="${myClasses[p2[0]]}">${p2}</span>`;
});
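With the lookup above, a word whose first letter has no entry ends up with class="undefined". A small variation, written as a pure function so it can be tried outside the browser, might look like this. The name classifyWords and the fallback class "word" are assumptions of this sketch, not part of the answer:

```javascript
// Same regex as above: group 1 is what precedes the word (start of string,
// a tag, or whitespace) and group 2 is the word itself.
const WORD_PATTERN = /(^|<\/?[^>]+>|\s+)([^\s<]+)/g;

function classifyWords(html, classes) {
  return html.replace(WORD_PATTERN, (match, p1, p2) => {
    // Look up the class by lowercase first letter, with a fallback
    // so unmapped letters do not produce class="undefined".
    const cls = classes[p2[0].toLowerCase()] || 'word';
    return `${p1}<span class="${cls}">${p2}</span>`;
  });
}
```

Usage would be e.innerHTML = classifyWords(e.innerHTML, myClasses). Like the original, it relies on a regex to skip tags, so unusual markup can still confuse it.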
You could do something like this:
Select all the children
map them with their text content
Split the text into single words
map again with some styled html
Join, render and enjoy.
words.innerHTML = [...words.children].flatMap(el => el.innerText.replace(/\n/ig, " ").split(" ")).map(el => `<div class="someClass">${el}</div>`).join("")
.someClass {
color: red;
background: orange;
margin: 10px;
}
<div id="words">
<div>
Some dummy content
<span>Some other nested dummy content</span>
</div>
<p>Some sibling content</p>
</div>

How can I remove all HTML elements from a string excluding a special class?

I've a problem. I'm currently looking for a way to remove any HTML elements from a string. But there are two conditions:
The content of the elements should be kept
Special elements with a defined class should not be removed
I've already tried lots of things and looked at plenty of questions/answers on SO, but unfortunately I can't really figure out any of the answers. Unfortunately, this exceeds my abilities by far. But I would like to know how something like this works.
Question/Answers I've tried:
How to strip HTML tags from string in JavaScript?,
Strip HTML from Text JavaScript
So when I have for example a string like this:
You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span>
It should look like this after stripping:
You have to pay <div class="keep-this">$200</div> per month for your car
I've actually tried following things:
jQuery(document).ready(function ($) {
let string = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span>';
console.log(string);
function removeHTMLfromString(string) {
let tmp = document.createElement("DIV");
tmp.innerHTML = string;
return tmp.textContent || tmp.innerText || "";
}
console.log(removeHTMLfromString(string));
console.log(string.replace(/<[^>]*>?/gm, ''));
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
And I've also tried out a regex tool to see what gets removed, but unfortunately, I'm not making much progress here either:
https://www.regexr.com/50qar
I would love if someone can help me with this task. Thanks a lot!
Update
Maybe there is a way doing it with just a regex? If yes, how can I exclude my elements with a special class when using this regex: /<\/?[^>]+(>|$)/g?
The code may be a bit long, but I think it may help you.
let str = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> <div class="keep-this">$500</div> also';
const el = document.createElement("div");
el.innerHTML = str;
// Get all the elements to keep
const keep = el.querySelectorAll(".keep-this");
// Replace the keeping element from the original string
// With special pattern and index so that we can replace
// the pattern with original keeping element
keep.forEach((v, i) => {
const keepStr = v.outerHTML;
str = str.replace(keepStr, `_k${i}_`);
});
// Replace created element's innerHTML by patternised string.
el.innerHTML = str;
// Get the text only
let stringify = el.innerText;
// Replace patterns from the text string by keeping element
keep.forEach((v,i) => {
const keepStr = v.outerHTML;
stringify = stringify.replace(`_k${i}_`, keepStr);
});
console.log(stringify);
Leave a comment if anything is unclear.
Update: Regular Expression approach
The same task can be done with a regular expression. The approach is:
Find all the elements to keep with a regex and store them.
Replace each of those elements in the input string with a unique placeholder.
Remove all the remaining HTML tags from the string.
Replace the placeholders with the stored elements.
let htmlString = 'You have to pay <div class="keep-this">$200</div> per <span class="date">month</span> for your <span class="vehicle">car</span> Another <div class="keep-this">$400</div> here';
// RegExp for keep elements
const keepRegex = /<([a-z1-6]+)\s+(class=[\'\"](keep-this\s*.*?)[\'\"])[^>]*>.*?<\/\1>/ig;
// RegExp for opening tag
const openRegex = /<([a-z1-6]+)\b[^>]*>/ig;
// RegExp for closing tag
const closeRegex = /<\/[a-z1-6]+>/ig;
// Find all the matches for the keeping elements
const matches = [...htmlString.matchAll(keepRegex)];
// Replace the input string with any pattern so that it could be replaced later
matches.forEach((match, i) => {
htmlString = htmlString.replace(match[0], `_k${i}_`);
});
// Remove opening tags from the input string
htmlString = htmlString.replace(openRegex, '');
// Remove closing tags from the input string
htmlString = htmlString.replace(closeRegex, '');
// Replace the previously created pattern by keeping element
matches.forEach((match, index) => {
htmlString = htmlString.replace(`_k${index}_`, match[0]);
})
console.log(htmlString);
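The same placeholder idea can be packaged as a reusable function. This is a sketch under a few assumptions: the names stripTagsExcept and escapeRegExp are made up here, kept elements are not nested inside elements with the same tag name, and the input contains no NUL characters (they are used as placeholder delimiters):

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Remove all tags from `html` except elements whose class list contains `keepClass`.
function stripTagsExcept(html, keepClass) {
  const kept = [];
  // Matches <tag ... class="... keepClass ..." ...> ... </tag>.
  // Group 1 is the tag name, group 2 is the quote character around the class value.
  const keepRegex = new RegExp(
    '<([a-z][a-z0-9]*)\\b[^>]*\\sclass=(["\'])(?:[^"\']*\\s)?' +
      escapeRegExp(keepClass) +
      '(?:\\s[^"\']*)?\\2[^>]*>[\\s\\S]*?<\\/\\1>',
    'gi'
  );
  // 1. Stash each kept element and leave a numbered placeholder behind.
  const withPlaceholders = html.replace(keepRegex, match => {
    kept.push(match);
    return `\u0000${kept.length - 1}\u0000`;
  });
  // 2. Strip every remaining opening and closing tag.
  const stripped = withPlaceholders.replace(/<\/?[a-z][^>]*>/gi, '');
  // 3. Put the kept elements back in place of their placeholders.
  return stripped.replace(/\u0000(\d+)\u0000/g, (_, i) => kept[Number(i)]);
}
```

Using callbacks for the replacements means a "$" in the kept markup (as in $200) is never treated as a replacement pattern.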
If the date and vehicle spans and their classes come from another function, you should just remove them there.

Why does RegEx output escaped text instead of HTML

I am writing a Chrome extension which adds a <span> ... </span> around every string that matches a certain regular expression. The RegEx match works perfectly, but I cannot seem to find a way to correctly add the span tag around the text.
My code thus far is:
// main.js
var regex_pattern = new RegExp('(apple)', 'g'); // Let's pretend I want to match every instance of 'apple'
var textNodes = getTextNodes(); // A function that returns a list of every text node from the DOM
for (var i = 0; i < textNodes.length; i++) {
if (textNodes[i].nodeValue.match(regex_pattern)) {
textNodes[i].nodeValue = textNodes[i].nodeValue.replace(regex_pattern, "<span class='highlight'>$&</span>");
}
}
This will correctly identify every match of my RegEx pattern (in this case 'apple') and output <span class="highlight">apple</span>. The only problem is that this is not treated as HTML by Chrome, it's treated as text - so instead of seeing the word 'apple' styled according to the highlight class, one would see the literal output: <span class="highlight">apple</span>
Why does this happen, and how can I fix it so that the style is correctly applied? Realizing that this was less than desirable, I tried using the insertBefore() method to wrap the matched text in a span, but this didn't do anything, it would either error or fail to add the span node, depending on how I tweaked the code. Thanks for any insight you can provide!
You can't use nodeValue to replace a text node with arbitrary HTML.
You must do it manually:
function replaceNodeWithHTML(node, html) {
var parent = node.parentNode;
if(!parent) return;
var next = node.nextSibling;
var parser = document.createElement('div');
parser.innerHTML = html;
while(parser.firstChild)
parent.insertBefore(parser.firstChild, next);
parent.removeChild(node);
}
var regex_pattern = /(apple)/g;
var textNodes = [document.querySelector('div').firstChild];
for (var i = 0; i < textNodes.length; i++)
if (textNodes[i].nodeValue.match(regex_pattern))
replaceNodeWithHTML(
textNodes[i],
textNodes[i].nodeValue.replace(regex_pattern, "<span class='highlight'>$&</span>")
);
.highlight {
background: yellow;
}
<div>I have an (apple). You have an (apple) too.</div>
It would be easier if nodes had insertAdjacentHTML method, but only elements do.
Set .innerHTML on the element. Setting textNode.nodeValue directly sets the text.
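If you would rather not go through an HTML string at all, one option is to split the text into matched and unmatched segments and build the nodes yourself. The following is a rough sketch: splitByMatches and wrapMatchesInTextNode are names invented for this example, and the regex is assumed to carry the g flag (matchAll throws without it):

```javascript
// Split `text` into consecutive segments, flagging the ones that matched `regex`.
// `regex` must be global (have the "g" flag).
function splitByMatches(text, regex) {
  const segments = [];
  let lastIndex = 0;
  for (const m of text.matchAll(regex)) {
    if (m[0].length === 0) continue; // ignore empty matches
    if (m.index > lastIndex) {
      segments.push({ text: text.slice(lastIndex, m.index), isMatch: false });
    }
    segments.push({ text: m[0], isMatch: true });
    lastIndex = m.index + m[0].length;
  }
  if (lastIndex < text.length) {
    segments.push({ text: text.slice(lastIndex), isMatch: false });
  }
  return segments;
}

// Browser-only part: replace a text node with text nodes and highlighted spans.
function wrapMatchesInTextNode(textNode, regex, className) {
  const fragment = document.createDocumentFragment();
  for (const segment of splitByMatches(textNode.nodeValue, regex)) {
    if (segment.isMatch) {
      const span = document.createElement('span');
      span.className = className;
      span.textContent = segment.text; // inserted as text, never parsed as HTML
      fragment.appendChild(span);
    } else {
      fragment.appendChild(document.createTextNode(segment.text));
    }
  }
  textNode.parentNode.replaceChild(fragment, textNode);
}
```

For the question's loop, this would be called as wrapMatchesInTextNode(textNodes[i], /apple/g, 'highlight'). Because the matched text is inserted via textContent, characters such as < in the page text cannot turn into markup.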

Count Specific Words Using Jquery

I have the following HTML code:
<ul>
<li>apples <span id="apples-density">1</span></li>
<li>pears <span id="pears-density">0</span></li>
<li>oranges <span id="oranges-density">2</span></li>
</ul>
<textarea>This is where I love to eat apples and oranges</textarea>
<textarea>I also like oranges on Sundays!</textarea>
What I would like to achieve is that whenever the textarea is updated (on key up), the density of the 3 specific words is counted and then the value inside the SPAN is updated.
However, the page can contain up to 10 words that will need to be counted and also an unlimited number of TEXTAREA elements. And... the actual 3 words that are being counted are different each time, so the code has to allow for some sort of automation.
I can sort of see how it should work in my head, but not quite how to implement...
perform a jquery .each on the textarea values.
perform a jquery to grab each of the <li> values
do some sort of regex to match the content of the textareas and count the words.
update the .text of the correct <span> to show the value.
My own suggestion would be:
function wordsUsed(){
// get all the text from all the textarea elements together in one string:
var text = $('textarea').map(function(){
return $(this).val();
}).get().join(' '),
reg;
// iterate over the span elements found in each li element:
$('li > span').each(function(i,el){
// cache the variable (in case you want to use it more than once):
var that = $(el);
// create a RegExp (regular expression) object:
// the '\\b' is a word boundary double-escaped because it's a string;
// we shorten the id of the current element by removing '-frequency'
// (I didn't think 'density' was an appropriate description) to find
// the word we're looking for;
// 'gi' we look through the whole of the string (the 'g') and
// ignore the case of the text (the 'i'):
reg = new RegExp('\\b' + el.id.replace('-frequency','') + '\\b', 'gi');
// we look for the regular-expression ('reg') matches in the string ('text'):
var matched = text.match(reg),
// if matched does not exist (there were no matched words) we set
// the count to 0, otherwise we use the number of matches (length):
matches = !matched ? 0 : matched.length;
// setting the text of the current element:
that.text(matches);
});
}
$('textarea')
// binding the keyup event-handler, using 'on()':
.on('keyup', wordsUsed)
// triggering the keyup event, so the count is accurate on page-load:
.keyup();
The above works on the (pedantically) modified HTML:
<ul>
<li>apples <span id="apples-frequency"></span>
</li>
<li>pears <span id="pears-frequency"></span>
</li>
<li>oranges <span id="oranges-frequency"></span>
</li>
</ul>
<textarea>actual text removed for brevity</textarea>
<textarea>actual text removed for brevity</textarea>
References:
'Plain' JavaScript:
JavaScript regular expressions.
RegExp().
String.prototype.match().
String.prototype.replace().
jQuery:
each().
get().
map().
on().
text().
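// Get all the textarea values
var all_text = '';
$('textarea').each(function() {
  // .val() returns the textarea's current value; .text() would only see its initial content
  all_text += ' ' + $(this).val();
});
// Process each of the LIs
$('li span').each(function() {
  var word = this.id.split('-')[0]; // Get the search word
  // Convert it to a regular expression. Use 'g' option to match all occurrences.
  // The backslash is doubled because '\b' inside a string literal is a backspace character.
  var regex = new RegExp('\\b' + word + '\\b', 'g');
  // match() returns null when nothing matches, so fall back to an empty array
  var count = (all_text.match(regex) || []).length;
  $(this).text(count);
});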
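The counting itself does not depend on jQuery, so it can be pulled out into a plain function and tested on its own. A minimal sketch, where countWords and escapeRegExp are hypothetical helper names and matching is case-insensitive on whole words:

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Count whole-word, case-insensitive occurrences of each word in `text`.
// Returns an object such as { apples: 1, pears: 0 }.
function countWords(text, words) {
  const counts = {};
  for (const word of words) {
    const regex = new RegExp('\\b' + escapeRegExp(word) + '\\b', 'gi');
    // match() returns null when nothing matches
    counts[word] = (text.match(regex) || []).length;
  }
  return counts;
}
```

In the keyup handler you would join the textarea values into one string, call countWords with the words taken from the span ids, and write each count into the matching span.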

How to get numbers in elements' inner text by javascript's regex

I want to get numbers in the inner text of an html by javascript regex to replace them.
for example in the below code I want to get 1,2,3,4,5,6,1,2,3,1,2,3, but not the 444 inside of the div tag.
<body>
aaaa123aaa456
<div style="background: #444">aaaa123aaaa</div>
aaaa123aaa
</body>
What could be the regular expression?
Your best bet is to use innerText or textContent to get at the text without the tags and then just use the regex /\d/g to get the numbers.
function digitsInText(rootDomNode) {
var text = rootDomNode.textContent || rootDomNode.innerText;
return text.match(/\d/g) || [];
}
For example,
alert(digitsInText(document.body));
If your HTML is not in the DOM, you can try to strip the tags yourself : JavaScript: How to strip HTML tags from string?
Since you need to do a replacement, I would still try to walk the DOM and operate on text nodes individually, but if that is out of the question, try
var HTML_TOKEN = /(?:[^<\d]|<(?!\/?[a-z]|!--))+|<!--[\s\S]*?-->|<\/?[a-z](?:[^">']|"[^"]*"|'[^']*')*>|(\d+)/gi;
function incrementAllNumbersInHtmlTextNodes(html) {
return html.replace(HTML_TOKEN, function (all, digits) {
if ("string" === typeof digits) {
return "" + (+digits + 1);
}
return all;
});
}
then
incrementAllNumbersInHtmlTextNodes(
'<b>123</b>Hello, World!<p>I <3 Ponies</p><div id=123>245</div>')
produces
'<b>124</b>Hello, World!<p>I <4 Ponies</p><div id=123>246</div>'
It will get confused around where special elements like <script> end and won't recognize digits that are entity encoded, but should work otherwise.
You don't necessarily need RegExp to get the text contents of an element excluding its descendant elements. In fact, I'd advise against it, as RegExp matching for HTML is notoriously difficult. There are DOM solutions:
function getImmediateText(element){
var text = '';
// Text and elements are all DOM nodes. We can grab the lot of immediate descendants and cycle through them.
for(var i = 0, l = element.childNodes.length, node; i < l, node = element.childNodes[i]; ++i){
// nodeType 3 is text
if(node.nodeType === 3){
text += node.nodeValue;
}
}
return text;
}
var bodyText = getImmediateText(document.getElementsByTagName('body')[0]);
So here there's a function that will return only the immediate text content as a string. Of course, you could then strip that for numbers with the RegExp using something like this:
// match() returns null when there are no digits, so fall back to an empty array
var numberString = (bodyText.match(/\d+/g) || []).join('');
Just to answer my old question:
It is possible to achieve this with a lookahead:
/\d(?=[^<>]*(<|$))/g
To replace the numbers:
html.replace(/\d(?=[^<>]*(<|$))/g, function($0) {
return map[$0]
});
the source of the answer https://www.drupal.org/node/619198#comment-5710052
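To see the lookahead at work on the HTML from the question, here is a small self-contained sketch. The function name replaceDigitsInText and the mapDigit callback are made up for this example. It assumes attribute values and text contain no stray < or > characters, since the pattern only checks whether the next angle bracket after a digit opens a tag:

```javascript
// A digit counts as "in text" when the next angle bracket after it is "<"
// (or there is none), i.e. the digit is not inside a tag.
const DIGIT_IN_TEXT = /\d(?=[^<>]*(?:<|$))/g;

// Replace every digit found in text content with whatever mapDigit returns for it.
function replaceDigitsInText(html, mapDigit) {
  return html.replace(DIGIT_IN_TEXT, digit => mapDigit(digit));
}
```

For example, replaceDigitsInText(html, d => map[d]) reproduces the snippet above for a lookup table called map.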
