Combine two HTML data values with regex

I am trying to merge the contents of several HTML elements into a single HTML tag, and I am trying to do it with regex.
The HTML looks like this:
var temp = document.getElementById('data').innerHTML;
console.log(temp);
<div id="data">
<strong>20 </strong>
<strong>0 </strong>
<strong>0 /-</strong>
</div>
My expected output is <strong>2000/-</strong>

// you should get the textContent instead of the innerHTML
var temp = document.getElementById('data').textContent;
// removes all spaces and
// surrounds the output with the <strong> element
var newhtml = `<strong>${ temp.replace(/\s/g,'') }</strong>`;
// replaces the innerHTML of the data with newhtml
document.getElementById('data').innerHTML = newhtml
<div id="data">
<strong>20 </strong>
<strong>0 </strong>
<strong>0 /-</strong>
</div>

You do not need regex for this. You could give each strong tag its own ID, read the text of each one, and concatenate the strings with
str.concat(string2, string3, string4, etc)
However, if the content is dynamic and not hard-coded, you could loop through the child nodes of document.getElementById('data'), get the textContent of each child node, and concatenate them that way.
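The loop over child nodes described above could look roughly like this. It is a minimal sketch: joinChildText is a hypothetical helper name, and it assumes every child element of the container holds one piece of the value.

```javascript
// Concatenate the text of every child element, dropping all whitespace.
// `parent` only needs a `children` collection whose items expose `textContent`,
// so any DOM element works.
function joinChildText(parent) {
  return Array.from(parent.children, (child) =>
    child.textContent.replace(/\s/g, '')
  ).join('');
}

// Browser usage (assumes the <div id="data"> markup from the question):
//   const data = document.getElementById('data');
//   const strong = document.createElement('strong');
//   strong.textContent = joinChildText(data);
//   data.replaceChildren(strong);
```

Setting textContent on a newly created strong element avoids building an HTML string by hand.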

Related

innerText is concatenating words

So HTML.outerHTML outputs <td><p><b>Ranges of</b><br><b>Exercise</b> <b>Prices</b></p></td>
HTML.innerText outputs "Ranges ofExercise Prices"
HTML is an HTMLElement
The result I'm trying to get is "Ranges of Exercise Prices". innerText and textContent both give the same incorrect result. How can I prevent the strings from being concatenated when those line break tags are present? (I can't change the HTML.)

This is a whack-a-mole approach, but you can replace each br element with a space in a parsed copy of the markup, so that it shows up as a space when you read the text content.
const temp = document.querySelector('.test').innerHTML;
const cloned = (new DOMParser()).parseFromString(temp, 'text/html');
cloned.documentElement.querySelectorAll('br').forEach(function (br) {
br.replaceWith(' ');
});
console.log(cloned.body.textContent);
<div class="test"><p><b>Ranges of</b><br><b>Exercise</b> <b>Prices</b></p></div>

function getTextContent() {
alert(document.getElementById("demo").textContent)
}
<p id="demo"><b>Ranges of </b><br><b>Exercise</b> <b>Prices</b></p>
<button onclick="getTextContent()">Get textContent</button>
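You can use textContent to get only the text inside the p tag. Note that the demo markup above has a trailing space after "Ranges of", which is what keeps the words apart here.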
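If the markup cannot be changed, another option is to walk the node tree and emit a space for each br element. The following is only a sketch: textWithBreaks is a hypothetical helper, and it relies on nothing more than the standard nodeType, nodeName, nodeValue and childNodes properties.

```javascript
// Collect the text of a node tree, emitting a space wherever a <br> occurs.
function textWithBreaks(node) {
  if (node.nodeType === 3) {
    return node.nodeValue; // text node: return its text unchanged
  }
  if (node.nodeName === 'BR') {
    return ' '; // a line break becomes a single space
  }
  // Any other node: concatenate the text of its children, in order.
  return Array.from(node.childNodes || [], (child) => textWithBreaks(child)).join('');
}

// Browser usage (assumes the element from the question is stored in `cell`):
//   console.log(textWithBreaks(cell));
```

Unlike the DOMParser approach, this does not need to clone or reparse the markup.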

How to selectively replace text with HTML?

I have an application where users can write comments containing HTML code, which is escaped before displaying:
<div class="card-body">
<p class="card-text">
<h1>HOLA</h1> Cita:#2<h1>HOLA</h1> <h1>HOLA</h1> Cita:#6<h1>HOLA</h1> <h1>HOLA</h1>
</p>
</div>
But when a user writes a specific word like "Cita:#1", I want to transform it into a link with jQuery, so that later I can load an Ajax popup there. I use this code:
$('.card-text').each(function() {
$(this).html($(this).text().replace(/Cita:#(\d+)/ig, 'Cita:#$1'));
});
My problem is that it works, but it also turns all the escaped HTML tags inside the comment back into real tags.
Is there a way to ignore any tags inside the comment, replace only the word "Cita:#1" with a link, and still have it work?

Since you control the server-side here, this would be much easier to do in PHP:
$string = '<h1>HOLA</h1> Cita:#2<h1>HOLA</h1> <h1>HOLA</h1> Cita:#6<h1>HOLA</h1> <h1>HOLA</h1>';
$string = htmlspecialchars($string);
$string = preg_replace(
'/Cita:#(\\d+)/i',
'Cita:#$1',
$string
);
echo $string;
Output:
&lt;h1&gt;HOLA&lt;/h1&gt; Cita:#2&lt;h1&gt;HOLA&lt;/h1&gt; &lt;h1&gt;HOLA&lt;/h1&gt; Cita:#6&lt;h1&gt;HOLA&lt;/h1&gt; &lt;h1&gt;HOLA&lt;/h1&gt;

What you need to do is separate the matching parts of the text and process them as HTML strings, but escape* the rest of the text. Since regular expression matches provide the index of the match, this is easy enough to do.
$('.card-text').each(function() {
var before = "";
var after = $(this).text();
var match; // holds the current regex match; declared here to avoid an implicit global
// look for each match in the string
while (match = after.match(/\b(Cita:#(\d+))\b/i)) {
// isolate the portion before the match and escape it, this will become the output
before += $("<div>").text(after.substring(0, match.index)).html();
// isolate the portion after the match, for use in the next loop
after = after.substring(match.index + match[1].length);
// build a link and append it to our eventual output
before += match[1].replace(/Cita:#(\d+)/i, 'Cita:#$1');
}
// deal with the final bit of the string, which needs to be escaped
this.innerHTML = before + $("<div>").text(after).html();
});
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div class="card-body">
<p class="card-text">
<span class="foo">Here is some text. <b>Cita:#1</b> and <b>Cita:#2</b></span>
</p>
</div>
*We use jQuery to build an element, pass the string as the text content of that element, and finally get the HTML output:
htmlString = "<div>foo</div>";
console.log(
$("<div>").text(htmlString).html()
);
Output:
&lt;div&gt;foo&lt;/div&gt;
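The same escape-first idea can also be written without the loop: escape the whole string once, then replace the matches. This is a sketch under assumptions: escapeHtml and linkifyCitations are hypothetical helpers, and the link markup (class "cita", href "#cita-N") is invented for illustration, since the question does not show the real link format.

```javascript
// Escape the characters that are significant in HTML.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Escape first, then turn every "Cita:#N" into a link.
// Escaping does not alter "Cita:#N" itself, so the pattern still matches.
// The anchor markup is an assumption, not taken from the question.
function linkifyCitations(text) {
  return escapeHtml(text).replace(
    /Cita:#(\d+)/gi,
    '<a class="cita" href="#cita-$1">Cita:#$1</a>'
  );
}

// Browser usage with jQuery (assumes the .card-text markup from the question):
//   $('.card-text').each(function () {
//     $(this).html(linkifyCitations($(this).text()));
//   });
```

Because the user's text is escaped before any link markup is added, the only real tags in the result are the generated links.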

Compare string with HTML text

I have a string of text that I want to compare with another string that has HTML code. The problem is that the text I need to compare it to in the HTML code is within different tags. Also, if the string exists in the HTML code then I want to wrap it inside a <mark> tag.
This is the example I am using:
var html = '<h1>This is a heading</h1><div class="subtitle">and this is the subheading</div><p class="small">this is some example text</p>';
var lookup = "is a heading and this is the subheading this is some";
var finalHtml = ""; //will contain new html
//Need to do some comparison and then add a <mark> tag around found string.
console.log(finalHtml);
//This should print "<h1>This <mark>is a heading</h1><div class="subtitle">and this is the subheading</div><p class="small">this is some</mark> example text</p>"
I am using Javascript/Jquery to do this.

This only helps you search for your lookup string within the HTML (i.e., no marking). I removed the tags and the whitespace and then checked for a match.
var html = '<h1>This is a heading</h1><div class="subtitle">and this is the subheading</div><p class="small">this is some example text</p>';
//remove html tags & spaces.
var cleanText = html.replace(/<\/?[^>]+(>|$)/g, "").replace(/\s/g,"");
var lookup = "is a heading and this is the subheading this is some";
lookup = lookup.replace(/\s/g,'');
if(cleanText.includes(lookup)){
//match found
}
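To also add the mark tag, one possible extension is to remember where each visible character sits in the original HTML string, then insert the tags at the mapped positions. This is only a sketch: markAcrossTags is a hypothetical helper, it assumes the text contains no literal < or > characters, and the resulting mark crosses element boundaries exactly as in the question's expected output, which is not well-nested HTML.

```javascript
// Find `lookup` in the visible text of `html`, ignoring tags and whitespace,
// and wrap the matching stretch of the original string in <mark>...</mark>.
function markAcrossTags(html, lookup) {
  const chars = [];     // non-whitespace characters found outside tags
  const positions = []; // index of each of those characters in `html`
  let inTag = false;
  for (let i = 0; i < html.length; i++) {
    const ch = html[i];
    if (ch === '<') { inTag = true; continue; }
    if (ch === '>') { inTag = false; continue; }
    if (inTag || /\s/.test(ch)) continue;
    chars.push(ch);
    positions.push(i);
  }
  const needle = lookup.replace(/\s/g, '');
  const at = needle.length === 0 ? -1 : chars.join('').indexOf(needle);
  if (at === -1) return html; // no match: return the input unchanged
  const start = positions[at];
  const end = positions[at + needle.length - 1] + 1;
  return html.slice(0, start) + '<mark>' + html.slice(start, end) + '</mark>' + html.slice(end);
}
```

For valid markup, each text node inside the matched range would need its own mark element instead.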

Using regex with javascript on nodejs find html attribute and prepend something to its value

I have some markup in JS as follows:
<div class="col-sm-4">
<span id="some-media" class="media">Text</span>
</div>
I would like to select the class attribute of the span and prepend some characters, let's say "::", to its value. So after the regex replace I would end up with:
<div class="col-sm-4">
<span id="some-media" class="::media">Text</span>
</div>
EDIT: Note that the order of the attributes in the HTML element is variable so my span attributes could very well have different order like so:
<div class="col-sm-4">
<span class="::media" id="some-media" >Text</span>
</div>

You got a regex solution, this is a DOMmy one:
var html = `<div class="col-sm-4">
<span id="some-media" class="media">Text</span>
</div>`
var doc = (new DOMParser()).parseFromString(html, "text/html");
var el = doc.getElementsByTagName('span')[0];
el.setAttribute('class', '::' + el.className);
console.log(
doc.getElementsByClassName('::media').length > 0 // check if modification's done
);
If regular expressions are your only option, this can serve as a workaround:
(<span[^>]*class=.)([^'"]+)
JS:
var html = `<div class="col-sm-4">
<span id="some-media" class="media">Text</span>
</div>
<span class="media" id="some-media">Text</span>
`;
console.log(
html.replace(/(<span[^>]*class=.)([^'"]+)/g, `$1::$2`)
);

This isn't using regex, but you can do it like this in vanilla JavaScript:
const el = document.getElementsByClassName('media')[0];
el.className = '::' + el.className;
Or in jQuery:
const $el = $('div span.media');
$el.attr('class', '::' + $el.attr('class'));
Hope this helps.

Don't parse html with regex, use DocumentFragment (or DOMParser) object instead:
var html_str = '<div class="col-sm-4"><span class="media">Text</span></div>',
df = document.createRange().createContextualFragment(html_str),
span = df.querySelector('span');
span.setAttribute('class', '::' + span.getAttribute('class'));
console.log(df.querySelector('div').outerHTML);

I think this is what you're after:
var test = $("#some-media")[0].outerHTML; // outerHTML is a property, not a method
var test2 = '<div id="some-media" class="media">Text</div>';
if(/span/.test(test)) //Valid as it contains 'span'
alert(test.replace(/(class=")/g, "$1::"));
if(/span/.test(test2)) //Not valid, so nothing is alerted
alert(test2.replace(/(class=")/g, "$1::"));

Since the order differs, writing a regex that captures all possible combinations of syntax might be rather difficult.
So we'd need a full list of rules the span follows so we can identify that span?
Got some more info about if the span occurs in a longer HTML string? Or is the string this span and this span only?
An alternative would be to use one of the several node DOM modules available, so you can work with HTML nodes and be able to use any of the above solutions to make the problem simpler.
But since you're using node:
1) Are you using any templating engines? If so, why not rerender the entire template?
2) Why does the class name have to change on the server side? Isn't there a workaround on the clientside where you do have access to the DOM natively? Or if it's just to add styling, why not add another css file that overwrites the styling of spans with className 'media'?
3) If none of the above applies and it's a trivial problem like you say, what error did you get using a simple replace?
strHTML.replace( 'class="media"', 'class="::media"' )
or if it has to be regex:
strHTML.replace( /class="([^"]*)"/, 'class="::$1"' );
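For a Node.js setting without a DOM, the varying attribute order can be handled in one pattern by allowing any other attributes between the tag name and class. This is a sketch only: prefixSpanClass is a hypothetical helper, and it assumes double-quoted attribute values and no > character inside any attribute value.

```javascript
// Prepend `prefix` to the class value of every <span>, wherever the class
// attribute appears among the tag's attributes.
function prefixSpanClass(html, prefix) {
  // Group 1: everything from "<span" up to and including class="
  // Group 2: the current class value
  return html.replace(
    /(<span\b[^>]*?\sclass=")([^"]*)"/g,
    (match, head, cls) => `${head}${prefix}${cls}"`
  );
}

// Usage:
//   prefixSpanClass('<span id="some-media" class="media">Text</span>', '::');
```

Using a replacement function instead of a replacement string means a prefix that contains $ is inserted literally.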

Insert span in a dom element without overwrite child nodes?

I have an HTML article with some annotations that I retrieve with SPARQL queries. These annotations refer to some text in the document, and I have to highlight this text (wrapping it in a span).
I had already asked how to wrap text in a span, but now I have a more specific problem that I do not know how to solve.
The code I wrote was:
var currentText = $("#"+v[4]["element"]+"").text();
var newText = currentText.substring(0, v[5]["start"]) + "<span class=' annotation' >" + currentText.substring(v[5]["start"], v[6]["end"]) + "</span>" + currentText.substring(v[6]["end"], currentText.length);
$("#"+v[4]["element"]+"").html(newText);
Where:
v[4]["element"] is the id of the parent element of the annotation
v[5]["start"] is the position of the first character of the annotation
v[6]["end"] is the position of the last character of the annotation
Note that start and end don't consider html tags.
My mistake is that I extract the data from the node with the text() method (so that I can find the correct position of the annotation) and put it back with the html() method. As a result, if the parent node has child nodes, they are lost and overwritten by plain text.
Example:
having an annotation on '2003'
<p class="metadata-entry" id="k673f4141ea127b">
<span class="generated" id="bcf5791f3bcca26">Publication date (<span class="data" id="caa7b9266191929">collection</span>): </span>
2003
</p>
It becomes:
<p class="metadata-entry" id="k673f4141ea127b">
Publication date (collection):
<span class="annotation">2003</span>
</p>
I think I should work with nodes instead of simply extract and rewrite the content, but I don't know how to identify the exact point where to insert the annotation without considering html tags and without eliminating child elements.
I read something about the jQuery .contents() method, but I didn't figure out how to use it in my code.
Can anyone help me with this issue? Thank you
EDIT: Added php code to extract body of the page.
function get_doc_body(){
if (isset ($_GET ["doc_url"])) {
$doc_url = $_GET ["doc_url"];
$doc_name = $_GET ["doc_name"];
$doc = new DOMDocument;
$mock_doc = new DOMDocument;
$doc->loadHTML(file_get_contents($doc_url.'/'.$doc_name));
$doc_body = $doc->getElementsByTagName('body')->item(0);
foreach ($doc_body->childNodes as $child){
$mock_doc->appendChild($mock_doc->importNode($child, true));
}
$doc_html = $mock_doc->saveHTML();
$doc_html = str_replace ('src="images','src="'.$doc_url.'/images',$doc_html);
echo($doc_html);
}
}

Instead of doing all this, you can use either $(el).append() or $(el).prepend() to insert the <span> tag:
$("#k673f4141ea127b").append('<span class="annotation">2003</span>');
Or, if I understand correctly, you want to wrap the final 2003 in a span.annotation, right? If that's the case, you can do:
$("#k673f4141ea127b").contents().eq(1).wrap('<span class="annotation" />');
Fiddle:
$(document).ready(function() {
$("#k673f4141ea127b").contents().eq(1).wrap('<span class="annotation" />');
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<p class="metadata-entry" id="k673f4141ea127b">
<span class="generated" id="bcf5791f3bcca26">Publication date (<span class="data" id="caa7b9266191929">collection</span>): </span>
2003
</p>

In the end, my solution is in this Fiddle.
Generalizing:
var element = document.getElementById(id);
var totalText = element.textContent;
var toFindText = totalText.substring(start,end);
var toReplaceText = "<span class='annotation'>"+toFindText+"</span>";
element.innerHTML = element.innerHTML.replace(toFindText, toReplaceText);
I hope it helps someone else.
Note: This doesn't check whether two or more annotations refer to the same node; I'm working on that right now.
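An alternative that works on nodes instead of HTML strings is to map the start and end offsets onto the text node that contains them and wrap only that range. This is a sketch, not the asker's final code: locateRange and wrapRange are hypothetical helpers, end is treated as exclusive, and the annotation is assumed to lie inside a single text node.

```javascript
// Given an element's text nodes in document order, find the single node that
// contains the character range [start, end) of the element's textContent.
function locateRange(textNodes, start, end) {
  let offset = 0; // number of characters in the nodes visited so far
  for (const node of textNodes) {
    const length = node.nodeValue.length;
    if (start >= offset && end <= offset + length) {
      return { node, start: start - offset, end: end - offset };
    }
    offset += length;
  }
  return null; // the range spans several text nodes or is out of bounds
}

// Browser-only part: wrap the located range in <span class="annotation">.
function wrapRange(element, start, end) {
  const walker = document.createTreeWalker(element, NodeFilter.SHOW_TEXT);
  const textNodes = [];
  while (walker.nextNode()) textNodes.push(walker.currentNode);
  const hit = locateRange(textNodes, start, end);
  if (!hit) return false;
  const range = document.createRange();
  range.setStart(hit.node, hit.start);
  range.setEnd(hit.node, hit.end);
  const span = document.createElement('span');
  span.className = 'annotation';
  range.surroundContents(span); // child elements elsewhere stay untouched
  return true;
}
```

Because only one text node is split, sibling elements such as the generated span keep their ids and structure.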
