Check if a text is HTML or not

I am using Meteor and I am trying to check whether a text is HTML, but the usual ways do not work. This is my code:
post: function() {
var postId = Session.get("postId");
var post = Posts.findOne({
_id: postId
});
var object = new Object();
if (post) {
object.title = post.title;
if ($(post.content).has("p")) { //$(post.content).has("p") / post.content instanceof HTMLElement
object.text = $(post.content).text();
if (post.content.match(/<img src="(.*?)"/)) {
object.image = post.content.match(/<img src="(.*?)"/)[1];
}
} else {
console.log("it is not an html------------------------");
object.text = post.content;
}
}
return object;
}
Actually, this is the most "working" solution I have found so far. I also noted the two most common approaches I tried in the comment next to the if statement. Is it possible to do this without regex?

You can use the jQuery approach you already started with, but append the content to a new <div> and check whether that element has children. If jQuery finds children, the content is HTML.
If it is HTML, you can then search that div for any type of element using find().
// create element and inject content into it
var $div=$('<div>').html(post.content);
// if there are any children it is html
if($div.children().length){
console.log('this is html');
var $img = $div.find('img');
console.log('There are ' + $img.length +' image(s)');
}else{
console.log('this is not html');
}

Use the jQuery $.parseHTML function to parse the string into an array of DOM nodes and check whether it contains any HTMLElement.
var htmlText = "----<b>abc</b>----<h3>GOOD</h3>----";
htmlText = prompt("Please enter something:", "----<b>abc</b>----");
var htmlArray = $.parseHTML(htmlText);
var isHtml = htmlArray.filter(function(e){ return e instanceof HTMLElement;}).length;
console.log(htmlText);
//console.log(htmlArray);
if (isHtml)
console.log(isHtml + " HTML Element(s) found.");
else
console.log("No HTML Elements found!");
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
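The node-counting idea from the answers above can be separated from the parser itself. This is only a minimal sketch: countElements and isHtml are hypothetical names, and the parse argument is an assumption standing in for whatever parser you have available (for example $.parseHTML in the browser). It relies on the DOM convention that element nodes have nodeType 1.

```javascript
// DOM node type for elements (Node.ELEMENT_NODE in browsers).
const ELEMENT_NODE = 1;

// Counts the element nodes in an array-like list of DOM nodes.
// A null/undefined list (some parsers return that for empty input) counts as 0.
function countElements(nodes) {
  return Array.from(nodes || []).filter(function (node) {
    return node && node.nodeType === ELEMENT_NODE;
  }).length;
}

// `parse` is a hypothetical injected parser: it takes a string and
// returns an array-like list of nodes. The text is treated as HTML
// when at least one parsed node is an element.
function isHtml(text, parse) {
  return countElements(parse(text)) > 0;
}
```

In a page with jQuery loaded, this could be called as isHtml(post.content, function (s) { return $.parseHTML(s); }). Text nodes and comments are ignored, so a plain string without tags is reported as not HTML.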

Related

Adding ID based on innerHTML to all elements in class

I need to write a code to add IDs to all element in one class. The IDs have to be based on innerText.
Elements look like that:
<lable class="sf-label-radio">Name1<span>Some Other Text that I do not need</span><label>
<lable class="sf-label-radio">Name2<span>Some Other Text that I do not need</span><label>
etc.
Here is my code:
<script>
addIDtoGI();
function addIDtoGI() {
let searchButtons = document.getElementsByClassName('sf-label-radio');
for(i = 0; i < searchButtons.length; i++) {
x = searchButtons[i].innerHTML;
x = x.substr(0, x.search("<")).replace(/\s+/g, '-').toLowerCase();
x = onlyEngLetters(x);
searchButtons[i].setAttribute('id',x);
}
}
function onlyEngLetters(text) {
text=text.replace("ę","e");
text=text.replace("ó","o");
text=text.replace("ą","a");
text=text.replace("ś","s");
text=text.replace("ł","l");
text=text.replace("ż","z");
text=text.replace("ź","z");
text=text.replace("ć","c");
text=text.replace("ń","n");
return text;
}
</script>
Thank You for help!

Iterate the child nodes until you reach the first text node that isn't empty to get the text you want. Note also that replace() with a string pattern only replaces the first instance found, and you probably want to convert to lower case so the text matches your replacements.
addIDtoGI()
function addIDtoGI(){
document.querySelectorAll('.sf-label-radio').forEach(el=>{
let txtNode = el.childNodes[0];
while(!txtNode.textContent.trim()){
txtNode = txtNode.nextSibling
}
el.id = onlyEngLetters(txtNode.textContent);
console.log(el.id)
});
}
function onlyEngLetters(text) {
return text.toLowerCase()
.replaceAll("ę","e")
.replaceAll("ó","o")
.replaceAll("ą","a")
.replaceAll("ś","s")
.replaceAll("ł","l")
.replaceAll("ż","z")
.replaceAll("ź","z")
.replaceAll("ć","c")
.replaceAll("ń","n")
}
<label class="sf-label-radio">Name1<span>Some Other Text that I do not need</span></label>
<label class="sf-label-radio">Name2<span>Some Other Text that I do not need</span></label>
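The string handling can also be tried on its own, without the DOM. The following is a rough sketch, assuming the IDs should be lower case, ASCII-only for the listed Polish letters, and hyphen-separated as in the question; toIdSlug and POLISH_TO_ASCII are hypothetical names. A single regex with the g flag replaces every occurrence, so it does not depend on replaceAll being available.

```javascript
// Lower-case Polish diacritics and their ASCII replacements
// (the input is lower-cased before the lookup).
const POLISH_TO_ASCII = {
  "ę": "e", "ó": "o", "ą": "a", "ś": "s", "ł": "l",
  "ż": "z", "ź": "z", "ć": "c", "ń": "n"
};

// Hypothetical helper: turns label text into an id-friendly slug.
function toIdSlug(text) {
  return text
    .normalize("NFC")   // make sure accented letters are single code points
    .trim()
    .toLowerCase()
    // the g flag replaces every occurrence, not just the first
    .replace(/[ęóąśłżźćń]/g, function (ch) { return POLISH_TO_ASCII[ch]; })
    // collapse each run of whitespace into one hyphen
    .replace(/\s+/g, "-");
}
```

With the answer above, this could be used as el.id = toIdSlug(txtNode.textContent). Characters outside the map are left untouched, so it is not a general transliteration.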

First, you define your function but you never call it.
In your script, add the "()" to "addIDtoGI;": addIDtoGI();
There's also a typo in searchButtons.lenght; it should be length.
That should resolve your errors.
EDIT: Also, as someone mentioned in the comments, <lable> should be <label>.

First, you must call the function with (). Also, declare the variables you use in the for loop:
for(let i = 0; i < searchButtons.length; i++)
The correct property name is length, not lenght. And, of course:
let x = searchButtons[i].innerHTML;
and ...

const addIdToClassByInnerHTML = cls => {
const elements = document.querySelectorAll('.' + cls);
elements.forEach(el=>{
el.setAttribute('id', el.innerHTML);
});
}
Note: do not use this function if the element contains child elements.
If your element contains one or more children, use this:
const addIdToClassByInnerHTML = cls => {
const elements = document.querySelectorAll('.' + cls);
elements.forEach(el=>{
let html = el.innerHTML;
el.querySelectorAll('*').forEach(sub=>{
html = html.replace(sub.outerHTML, '');
});
el.setAttribute('id', html);
});
}
addIdToClassByInnerHTML('sf-label-radio');
console.log(document.querySelectorAll('.sf-label-radio')[0]);
label{
display: block;
}
<label class="sf-label-radio">Name1<span>Some Other Text that I do not need</span></label>
<label class="sf-label-radio">Name2<span>Some Other Text that I do not need</span></label>

Change liferay-ui:input-localized XML with javascript

I have the following tag in my view.jsp:
<liferay-ui:input-localized id="message" name="message" xml="" />
I know that I can set an XML string to give my localized input a default value. My problem is that I want to change this attribute with JavaScript. I listen for some changes and call the function update() to update my information:
function update(index) {
var localizedInput= document.getElementById('message');
localizedInput.value = 'myXMLString';
}
Changing the value only updates the input for the currently selected language (with the whole XML string). The XML string is correct, but I am not sure how to update the XML for the input with JavaScript.
Is this possible?
PS: I have posted this in the Liferay Dev forum to try and reach more people.

After a week of studying the case and some tests, I think that I found a workaround for this. Not sure if this is the correct approach, but it is working for me so I will post my current solution for future reference.
After inspecting the HTML, I noticed that the liferay-ui:input-localized tag creates one input tag by default, and then one more input tag for each language, each time you select a new language. Knowing that, I wrote some JavaScript functions to help me update the inputs created by my liferay-ui:input-localized. Here is the relevant code:
function updateAnnouncementInformation(index) {
var announcement = announcements[index];
// the announcement['message'] is a XML String
updateInputLocalized('message', announcement['message']);
}
function updateInputLocalized(input, message) {
var inputId = '<portlet:namespace/>' + input;
var xml = $.parseXML(message);
var inputCurrent = document.getElementById(inputId);
var selectedLanguage = getSelectedLanguage(inputId);
var inputPT = document.getElementById(inputId + '_pt_PT');
inputPT.value = $(xml).find("Title[language-id='pt_PT']").text();
var inputEN = document.getElementById(inputId + '_en_US');
if (inputEN !== null) inputEN.value = $(xml).find("Title[language-id='en_US']").text();
else waitForElement(inputId + '_en_US', inputCurrent, inputId, xml);
var inputLabel = getInputLabel(inputId);
if (selectedLanguage == 'pt-PT') inputLabel.innerHTML = '';
else inputLabel.innerHTML = inputPT.value;
if (selectedLanguage == 'pt-PT') inputCurrent.value = inputPT.value;
else if (inputEN !== null) inputCurrent.value = inputEN.value;
else waitForElement(inputId + '_en_US', inputCurrent, inputId, xml);
}
function getSelectedLanguage(inputId) {
var languageContainer = document.getElementById('<portlet:namespace/>' + inputId + 'Menu');
return languageContainer.getElementsByClassName('btn-section')[0].innerHTML;
}
function getInputLabel(inputId) {
var boundingBoxContainer = document.getElementById(inputId + 'BoundingBox').parentElement;
return boundingBoxContainer.getElementsByClassName('form-text')[0];
}
function waitForElement(elementId, inputCurrent, inputId, xml) {
window.setTimeout(function() {
var element = document.getElementById(elementId);
if (element) elementCreated(element, inputCurrent, inputId, xml);
else waitForElement(elementId, inputCurrent, inputId, xml);
}, 500);
}
function elementCreated(inputEN, inputCurrent, inputId, xml) {
inputEN.value = $(xml).find("Title[language-id='en_US']").text();
var selectedLanguage = getSelectedLanguage(inputId);
if (selectedLanguage == 'en-US') inputCurrent.value = inputEN.value;
}
With this I am able to update the liferay-ui:input-localized inputs according to a pre-built XML String. I hope that someone finds this useful and if you have anything to add, please let me know!

To change the text value of an element, you must change the value of the element's text node.
Example -
xmlDoc.getElementsByTagName("title")[0].childNodes[0].nodeValue = "new content"
Suppose "books.xml" is loaded into xmlDoc
Get the first child node of the element
Change the node value to "new content"

Function to remove <span></span> from string in an json object array in JavaScript

I know there are many similar questions posted, and have tried a couple solutions, but would really appreciate some guidance with my specific issue.
I would like to remove the following HTML markup from my string for each item in my array:
<SPAN CLASS="KEYWORDSEARCHTERM"> </SPAN>
I have an array of json objects (printArray) with a printArray.header that might contain the HTML markup.
The header text is not always the same.
Below are 2 examples of what the printArray.header might look like:
<SPAN CLASS="KEYWORDSEARCHTERM">MOST EMPOWERED</SPAN> COMPANIES 2016
RECORD WINE PRICES AT <SPAN CLASS="KEYWORDSEARCHTERM">NEDBANK</SPAN> AUCTION
I would like to strip the HTML markup, leaving me with the following results:
MOST EMPOWERED COMPANIES 2016
RECORD WINE PRICES AT NEDBANK AUCTION
Here is my function:
var newHeaderString;
var printArrayWithExtract;
var summaryText;
this.setPrintItems = function(printArray) {
angular.forEach(printArray, function(printItem){
if (printItem.ArticleText === null) {
summaryText = '';
}
else {
summaryText = '... ' + printItem.ArticleText.substring(50, 210) + '...';
}
// Code to replace the HTML markup in printItem.header
// and return newHeaderString
printArrayWithExtract.push(
{
ArticleText: printItem.ArticleText,
Summary: summaryText,
Circulation: printItem.Circulation,
Headline: newHeaderString,
}
);
});
return printArrayWithExtract;
};

Try this function. It will remove all markup tags...
function strip(html)
{
var tmp = document.createElement("DIV");
tmp.innerHTML = html;
return tmp.textContent || tmp.innerText || "";
}
Call this function sending the html as a string. For example,
var str = '<SPAN CLASS="KEYWORDSEARCHTERM">MOST EMPOWERED</SPAN> COMPANIES 2016';
var expectedText = strip(str);
Here you find your expected text.

It can be done using regular expressions, see below:
var s1 = '<SPAN CLASS="KEYWORDSEARCHTERM">MOST EMPOWERED</SPAN> COMPANIES 2016';
var s2 = 'RECORD WINE PRICES AT <SPAN CLASS="KEYWORDSEARCHTERM">NEDBANK</SPAN> AUCTION';
function removeSpanInText(s) {
return s.replace(/<\/?SPAN[^>]*>/gi, "");
}
$("#x1").text(removeSpanInText(s1));
$("#x2").text(removeSpanInText(s2));
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
1 ->
<span id="x1"></span>
<br/>2 ->
<span id="x2"></span>
For more info, see e.g. Javascript Regex Replace HTML Tags.
And jQuery is not needed, just used here to show the output.

I used this little replace function:
if (printItem.Headline === null) {
newHeaderString = '';
}
else {
var str = printItem.Headline;
var rem1 = str.replace('<SPAN CLASS="KEYWORDSEARCHTERM">', '');
var rem2 = rem1.replace('</SPAN>', '');
var newHeaderString = rem2;
}
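One caveat with the snippet above: replace() with a string pattern only removes the first match, so a header containing two highlighted terms would keep the second pair of tags. Below is a small sketch of a variant that removes every span tag and maps over the whole array. stripSpanTags and withCleanHeadlines are hypothetical names, and the sketch assumes the markup lives in the header property as described in the question.

```javascript
// Removes every opening and closing <span> tag (any letter case, any
// attributes) and keeps the text between them. null/undefined become "".
function stripSpanTags(header) {
  if (header === null || header === undefined) {
    return "";
  }
  return String(header).replace(/<\/?span\b[^>]*>/gi, "");
}

// Returns a new array; each item is a shallow copy of the original
// with a Headline property holding the cleaned header text.
function withCleanHeadlines(printArray) {
  return printArray.map(function (item) {
    return Object.assign({}, item, { Headline: stripSpanTags(item.header) });
  });
}
```

Like the regex answer above, this only targets span tags; for arbitrary markup, the DOM-based strip() function is the more general option.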

Restore exact innerHTML to DOM

I'd like to save the html string of the DOM, and later restore it to be exactly the same. The code looks something like this:
var stringified = document.documentElement.innerHTML
// later, after serializing and deserializing
document.documentElement.innerHTML = stringified
This works when everything is perfect, but when the DOM is not W3C-compliant, there's a problem. The first line works fine: stringified matches the DOM exactly. But when I restore from the (non-W3C-compliant) stringified, the browser does some magic and the resulting DOM is not the same as it was originally.
For example, if my original DOM looks like
<p><div></div></p>
then the final DOM will look like
<p></p><div></div><p></p>
since div elements are not allowed to be inside p elements. Is there some way I can get the browser to use the same html parsing that it does on page load and accept broken html as-is?
Why is the html broken in the first place? The DOM is not controlled by me.
Here's a jsfiddle to show the behavior http://jsfiddle.net/b2x7rnfm/5/. Open your console.
<body>
<div id="asdf"><p id="outer"></p></div>
<script type="text/javascript">
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
document.getElementById('outer').appendChild(insert);
var e = document.getElementById('asdf')
console.log(e.innerHTML);
e.innerHTML = e.innerHTML;
console.log(e.innerHTML); // This is different than 2 lines above!!
</script>
</body>

If you need to be able to save and restore an invalid HTML structure, you could do it by way of XML. The code which follows comes from this fiddle.
To save, you create a new XML document to which you add the nodes you want to serialize:
var asdf = document.getElementById("asdf");
var outer = document.getElementById("outer");
var add = document.getElementById("add");
var save = document.getElementById("save");
var restore = document.getElementById("restore");
var saved = undefined;
save.addEventListener("click", function () {
if (saved !== undefined)
return; // Do not overwrite
// Create a fake document with a single top-level element, as
// required by XML.
var parser = new DOMParser();
var doc = parser.parseFromString("<top/>", "text/xml");
// We could skip the cloning and just move the nodes to the XML
// document. This would have the effect of saving and removing
// at the same time but I wanted to show what saving while
// preserving the data would look like
var clone = asdf.cloneNode(true);
var top = doc.firstChild;
// Move the children of the clone (not of the original) into the XML document.
var child = clone.firstChild;
while (child) {
top.appendChild(child);
child = clone.firstChild;
}
saved = top.innerHTML;
console.log("saved as: ", saved);
// Perform the removal here.
asdf.innerHTML = "";
});
To restore, you create an XML document to deserialize what you saved and then add the nodes to your document:
restore.addEventListener("click", function () {
if (saved === undefined)
return; // Don't restore undefined data!
// We parse the XML we saved.
var parser = new DOMParser();
var doc = parser.parseFromString("<top>" + saved + "</top>", "text/xml");
var top = doc.firstChild;
var child = top.firstChild;
while (child) {
asdf.appendChild(child);
// Remove the extra junk added by the XML parser.
child.removeAttribute("xmlns");
child = top.firstChild;
}
saved = undefined;
console.log("inner html after restore", asdf.innerHTML);
});
Using the fiddle, you can:
Press the "Add LadyGaga..." button to create the invalid HTML.
Press "Save and Remove from Document" to save the structure in asdf and clear its contents. This prints to the console what was saved.
Press "Restore" to restore the structure that was saved.
The code above aims to be general. It would be possible to simplify the code if some assumptions can be made about the HTML structure to be saved. For instance blah is not a well-formed XML document because you need a single top element in XML. So the code above takes pains to add a top-level element (top) to prevent this problem. It is also generally not possible to just parse an HTML serialization as XML so the save operation serializes to XML.
This is a proof-of-concept more than anything. There could be side-effects from moving nodes created in an HTML document to an XML document or the other way around that I have not anticipated. I've run the code above on Chrome and FF. I don't have IE at hand to run it there.

This won't work for your most recent clarification, that you must have a string copy. Leaving it, though, for others who may have more flexibility.
Since using the DOM seems to allow you to preserve, to some degree, the invalid structure, and using innerHTML involves reparsing with (as you've observed) side-effects, we have to look at not using innerHTML:
You can clone the original, and then swap in the clone:
var e = document.getElementById('asdf')
snippet.log("1: " + e.innerHTML);
var clone = e.cloneNode(true);
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
document.getElementById('outer').appendChild(insert);
snippet.log("2: " + e.innerHTML);
e.parentNode.replaceChild(clone, e);
e = clone;
snippet.log("3: " + e.innerHTML);
<div id="asdf">
<p id="outer">
<div>ladygaga</div>
</p>
</div>
<!-- Script provides the `snippet` object, see http://meta.stackexchange.com/a/242144/134069 -->
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
Note that just like the innerHTML solution, this will wipe out event handlers on the elements in question. You could preserve handlers on the outermost element by creating a document fragment and cloning its children into it, but that would still lose handlers on the children.
This earlier solution won't apply to you, but may apply to others in the future:
My earlier solution was to track what you changed, and undo the changes one-by-one. So in your example, that means removing the insert element:
var e = document.getElementById('asdf')
console.log("1: " + e.innerHTML);
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
var outer = document.getElementById('outer');
outer.appendChild(insert);
console.log("2: " + e.innerHTML);
outer.removeChild(insert);
console.log("3: " + e.innerHTML);
var e = document.getElementById('asdf')
snippet.log("1: " + e.innerHTML);
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
var outer = document.getElementById('outer');
outer.appendChild(insert);
snippet.log("2: " + e.innerHTML);
outer.removeChild(insert);
snippet.log("3: " + e.innerHTML);
<div id="asdf">
<p id="outer">
<div>ladygaga</div>
</p>
</div>
<!-- Script provides the `snippet` object, see http://meta.stackexchange.com/a/242144/134069 -->
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>

Try using Blob and URL.createObjectURL to export the HTML. Include a script tag in the exported HTML that removes the extra <div></div><p></p> elements from the rendered document.
html
<body>
<div id="asdf">
<p id="outer"></p>
</div>
<script>
var insert = document.createElement("div");
var text = document.createTextNode("ladygaga");
insert.appendChild(text);
document.getElementById("outer").appendChild(insert);
var elem = document.getElementById("asdf");
var r = document.querySelectorAll("[id=outer] ~ *");
// remove last `div` , `p` elements from `#asdf`
for (var i = 0; i < r.length; ++i) {
elem.removeChild(r[i])
}
</script>
</body>
js
var e = document.getElementById("asdf");
var html = e.outerHTML;
console.log(document.body.outerHTML);
var blob = new Blob([document.body.outerHTML], {
type: "text/html"
});
var objUrl = window.URL.createObjectURL(blob);
var popup = window.open(objUrl, "popup", "width=300, height=200");
jsfiddle http://jsfiddle.net/b2x7rnfm/11/

see this example: http://jsfiddle.net/kevalbhatt18/1Lcgaprc/
MDN cloneNode
var e = document.getElementById('asdf')
console.log(e.innerHTML);
backupElem = e.cloneNode(true);
// Your tinkering with the original
e.parentNode.replaceChild(backupElem, e);
console.log(e.innerHTML);

You cannot expect the browser to parse non-compliant HTML as-is. But since the structure the parser produces from non-compliant HTML is very predictable, you can write a function that makes the HTML non-compliant again, like this:
function ruinTheHtml() {
var allElements = document.body.getElementsByTagName( "*" ),
next,
afterNext;
Array.prototype.map.call( allElements,function( el,i ){
if( el.tagName !== 'SCRIPT' && el.tagName !== 'STYLE' ) {
if(el.textContent === '') {
next = el.nextSibling;
afterNext = next.nextSibling;
if( afterNext.textContent === '' ) {
el.parentNode.removeChild( afterNext );
el.appendChild( next );
}
}
}
});
}
See the fiddle:
http://jsfiddle.net/pqah8e25/3/

You have to clone the node instead of copying html. Parsing rules will force the browser to close p when seeing div.
If you really need to get HTML from that string and it is valid XML, then you can use the following code ($ is jQuery):
var html = "<p><div></div></p>";
var div = document.createElement("div");
var xml = $.parseXML(html);
div.appendChild(xml.documentElement);
div.innerHTML === html // true

You can use outerHTML; it preserves the original structure:
(based on your original sample)
<div id="asdf"><p id="outer"></p></div>
<script type="text/javascript">
var insert = document.createElement('div');
var text = document.createTextNode('ladygaga');
insert.appendChild(text);
document.getElementById('outer').appendChild(insert);
var e = document.getElementById('asdf')
console.log(e.outerHTML);
e.outerHTML = e.outerHTML;
console.log(e.outerHTML);
</script>
Demo: http://jsfiddle.net/b2x7rnfm/7

Remove node function on parent element

I'm new to JS. I'm trying to delete the parent node with all the children by clicking a button. But the console tells me that undefined is not a function. What am I missing?
Fiddle:
http://jsfiddle.net/vy0d8bqt/
HTML:
<button type="button" id="output">Get contacts</button>
<button type="button" id="clear_contacts">clear contact</button>
<div id="output_here"></div>
JS:
// contact book, getting data from JSON and outputting via a button
// define a JSON structure
var contacts = {
"friends" :
[
{
"name" : "name1",
"surname" : "surname1"
},
{
"name" : "name2",
"surname" : "surname2"
}
]
};
//get button ID and id of div where content will be shown
var get_contacts_btn = document.getElementById("output");
var output = document.getElementById("output_here");
var clear = document.getElementById("clear_contacts");
var i;
// get length of JSON
var contacts_length = contacts.friends.length;
get_contacts_btn.addEventListener('click', function(){
//console.log("clicked");
for(i = 0; i < contacts_length; i++){
var data = contacts.friends[i];
var name = data.name;
var surname = data.surname;
output.style.display = 'block';
output.innerHTML += "<p> name: " + name + "| surname: " + surname + "</p>";
}
});
//get Children of output div to remove them on clear button
//get output to clear
output_to_clear = document.getElementById("output_here");
clear.addEventListener('click', function(){
output_to_clear.removeNode(true);
});

You should use remove() instead of removeNode()
http://jsfiddle.net/vy0d8bqt/1/
However, this also removes the output_to_clear node itself. If you just want to delete the content of the node without removing the node itself, you can use output_to_clear.innerHTML = '' (so you can click the 'Get contacts' button again after clearing it).
http://jsfiddle.net/vy0d8bqt/3/

You want this for broad support:
output_to_clear.parentNode.removeChild(output_to_clear);
Or this in modern browsers only:
output_to_clear.remove();
But either way, make sure you don't try to remove it after it has already been removed. Since you're caching the reference, that could be an issue, so this may be safer:
if (output_to_clear.parentNode != null) {
output_to_clear.remove();
}
If you were hoping to empty its content, then do this:
while (output_to_clear.firstChild) {
output_to_clear.removeChild(output_to_clear.firstChild);
}

I think using jQuery's .remove() is probably the best choice here. If you can't or don't want to use jQuery, the Mozilla docs for Node provide a function to remove all child nodes.
Element.prototype.removeAll = function () {
while (this.firstChild) { this.removeChild(this.firstChild); }
return this;
};
Which you would use like:
output_to_clear.removeAll();
For a one-off given the example provided:
while (output_to_clear.firstChild) { output_to_clear.removeChild(output_to_clear.firstChild); }
