Does jQuery strip some html elements from a string when using .html()?

I have a var that contains a full html page, including the head, html, body, etc. When I pass that string into the .html() function, jQuery strips out all those elements, such as body, html, head, etc, which I don't want.
My data var contains:
<html>
<head>
<title>Untitled Document</title>
</head>
<body>
</body>
</html>
Then my jQuery is:
// data is a full html document string
data = $('<div/>').html(data);
// jQuery strips my document string!
alert(data.find('head').html());
I need to manipulate a full html page string, so that I can return what is in the element. I would like to do this with jQuery, but it seems that append(), prepend() and html() all try to convert the string to DOM elements, which removes all the other parts of a full html page.
Is there another way that I could do this? I would be fine using another method. My final goal is to find certain elements inside my string, so I figured jQuery would be best, since I am so used to it. But, if it is going to trim and remove parts of my string, I am going to have to look for another method.
Ideas?

After a few quick tests it seems to me that this behavior isn't caused by jQuery but by the browser.
As you can easily verify yourself (DEMO http://jsbin.com/ocupa3)
var data = "<html><head><title>Untitled Document</title></head><body><p>test</p></body></html>";
data = $('<div/>').html(data);
alert(data.html());
yields different results in different browsers
Opera 10.10
<HEAD><TITLE>Untitled Document</TITLE></HEAD><P>test</P>
FF 3.6
<title>Untitled Document</title><p>test</p>
IE6
<P>test</P>
So this has nothing to do with jQuery. It's the browsers that strip some tags when you insert a whole html string inside a div. But you would need to step through the whole jQuery code for html() to be sure, and you would need to do that for all browsers, as there are several different ways jQuery tries to do its job.
For a solution I advise you to investigate using an iframe (possibly hidden) and to set that iframe content to the html-string you have. But be aware that fiddling with iframes and changing their content programmatically isn't an easy task either. There are also different browser related quirks and timing issues involved.

Here is a solution, which will include the body, head and other attributes:
mydoc = document.getElementById('NAME_OF_PREVIEW_FRAME').contentWindow.document;
mydoc.write(HTML_CODE);
mydoc.close();

Nope, the jQuery html function is just sending the string through to the element's innerHTML property, which is a function of the browser that tells it to parse the HTML into DOM elements and add them to the page.
Your browser doesn't work with a page as HTML data, it works with it as DOM and imports/exports HTML.
JavaScript has very good Regular Expression support. Depending on the complexity of your task, you may find this is the best way to process your data.
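As a rough sketch of the regular expression route, the snippet below pulls the inner markup of one element out of the document string without ever handing it to the browser. The helper name extractElement is made up for this example, and it assumes simple, well-formed markup: it does not handle nested elements of the same name, or a ">" inside an attribute value.

```javascript
// Hypothetical helper: return the inner markup of the first <tagName> element
// found in an HTML string, or null if there is none.
function extractElement(html, tagName) {
  var pattern = new RegExp(
    '<' + tagName + '\\b[^>]*>([\\s\\S]*?)<\\/' + tagName + '\\s*>',
    'i'
  );
  var match = pattern.exec(html);
  return match ? match[1] : null;
}

var data = "<html><head><title>Untitled Document</title></head><body><p>test</p></body></html>";
var head = extractElement(data, 'head');   // inner markup of <head>
var title = extractElement(data, 'title'); // text of <title>
console.log(head, title);
```

Because the string is never parsed into DOM elements, nothing gets stripped, but anything beyond this kind of simple lookup is probably better left to a real parser.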

There is no need for the container div.
Have you tried this?:
var foo = $(data); // data is your full html document string
Then you can search inside of it like so:
$('.someClass', foo); // foo is the document you created earlier
Update:
As another answer mentioned, how this will act comes down to the browser.
I looked at the jQuery docs a bit and found this:
When the HTML is more complex than a single tag without attributes, as it is in the above example, the actual creation of the elements is handled by the browser's innerHTML mechanism. Specifically, jQuery creates a new <div> element and sets the innerHTML property of the element to the HTML snippet that was passed in.
So it seems that when you are using a whole html doc as a string, it's no different than setting the innerHTML property of a div you make using createElement.
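One option that none of the answers here mention is the browser's DOMParser, which parses a string as a complete document instead of as the contents of a div. The sketch below is only that, a sketch: parseFullDocument is a made-up helper name, the optional ParserCtor argument is my own addition so a different parser implementation can be passed in, and older browsers may not support the 'text/html' type.

```javascript
// Sketch: parse a complete HTML document string without inserting it into a <div>.
// parseFullDocument and ParserCtor are hypothetical names used for illustration.
function parseFullDocument(html, ParserCtor) {
  var Parser = ParserCtor || globalThis.DOMParser;
  if (typeof Parser !== 'function') {
    throw new Error('No DOMParser available in this environment');
  }
  // 'text/html' asks for a full HTML document, so <html>, <head> and <body> survive.
  return new Parser().parseFromString(html, 'text/html');
}

// Usage in a browser (skipped where DOMParser does not exist):
if (typeof DOMParser !== 'undefined') {
  var doc = parseFullDocument(
    '<html><head><title>Untitled Document</title></head><body></body></html>'
  );
  console.log(doc.head.innerHTML);
}
```

The returned document can then be searched with jQuery by passing it as the context, in the same way as $('.someClass', foo) above.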

Related

How to convert from mixed HTML-string/DOM-elements to DOM-elements in Javascript?

I wish to implement the following Javascript function:
function AllToDom(partsArray)
{
// Magic!
}
// Called like:
var rowObject = AllToDom(['<tr>', tdElem1, tdElem2, '<td>XXX</td>', '<td>',
divContents,'</td></tr>']);
// Where tdElem1, tdElem2, divContents are DOM node objects.
The thing is I want it to work on any kinds of combinations of DOM nodes and HTML fragments. As long as it produces a valid HTML of course. Unclosed HTML tags and disallowed element combinations (like <table><div></div>) are allowed to have undefined behavior.
My first idea was to concatenate it all in a HTML string, except in place of DOM elements add a placeholder comment <!--SNOOPY-->. So the above would result in the following string:
<tr><!--SNOOPY--><!--SNOOPY--><td>XXX</td><td><!--SNOOPY--></td></tr>
This is already a valid piece of HTML, so next I create a <div>, assign this to innerHTML, gather the produced DOM nodes, and iterate through them and replace all <!--SNOOPY--> with the respective DOM element.
There are two flaws with this approach however:
Adding a <tr> as a child element to a <div> is invalid. I don't know if it might not break on some condition.
Internet Explorer 8 (the oldest version that I need to support) strips all comments when assigning to innerHTML.
Are there any workarounds? Is this possible at all?
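For reference, the string-building half of the placeholder idea described above could look roughly like this. The function name buildPlaceholderHtml and the shape of its return value are assumptions made for this sketch; it only produces the HTML string and remembers the nodes in order, and it does not attempt the innerHTML step or the comment replacement.

```javascript
// Sketch of the first step only: join the parts into one HTML string,
// putting a placeholder comment wherever a non-string part (a DOM node) appears.
var PLACEHOLDER = '<!--SNOOPY-->';

function buildPlaceholderHtml(partsArray) {
  var nodes = []; // nodes in the order their placeholders appear in the string
  var html = partsArray.map(function (part) {
    if (typeof part === 'string') {
      return part;
    }
    nodes.push(part);
    return PLACEHOLDER;
  }).join('');
  return { html: html, nodes: nodes };
}

// Plain objects stand in for real DOM nodes so the snippet runs anywhere.
var tdElem1 = { name: 'tdElem1' };
var tdElem2 = { name: 'tdElem2' };
var divContents = { name: 'divContents' };
var built = buildPlaceholderHtml(['<tr>', tdElem1, tdElem2, '<td>XXX</td>', '<td>',
  divContents, '</td></tr>']);
console.log(built.html);
```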

Finally found an answer: jQuery has already done all the dirty work in their parseHTML() method. And I just happen to be using jQuery anyway, so good for me! :)
I checked what the magic was behind the scenes, and it's really pretty gruesome. First, they inspect the HTML (with regexes...) to see what parent tag they need to use, and then they have a workaround for IE8, which apparently DOES preserve comment nodes, but only if they come after a text node. All comments before the first text node are lost. And some tags are affected this way too, which I had no idea about. And then there's half a dozen other workarounds for IE & Webkit problems that I've never even heard of.
So, I'm just going to leave it to them to do the right thing, because trying to reproduce that stuff would be madness.

Can I take HTML, loop through it to change elements, and display the results as plain text?

I'm trying to develop a script that will take user submitted HTML, loop through it to identify matching tags, make adjustments to those matched tags, and then spit out the resulting HTML as plain text that the user can copy. The end goal here is to replace all hrefs in a submission with different URLs.
So for example, this:
Link A
<a data-track="false" href="http://example.com/">Link B</a>
Link C
Becomes this:
Link A
<a data-track="false" href="http://example.com/">Link B</a>
Link C
My first thought was to take the submitted HTML from the <textarea> field and put it in a variable. At this point the HTML becomes a string and I was going to loop through it with a regex to find matching tags. My issue was that I needed to find all <a> tags that did NOT include the attribute data-track="false". And as far as I can tell that's impossible with regex since each link isn't going to be on its own line.
My second thought was to loop through it using jQuery where I could use something like this:
$("a:not([data-track='false'])");
But I can't use jQuery like this on a string, right? It needs to be in the DOM.
I'm unsure of the best way to go about doing this. Maybe another language would prove helpful, but other than HTML and CSS, javascript and jQuery are the only ones I'm experienced with.
Any and all help would be greatly appreciated.

I think your question is similar to
Convert String to XML Document in JavaScript
The answer is that you can wrap it in a jQuery object. Then use jQuery's normal DOM manipulation methods on it.
var myhtml = $($('#main-input').val());
myhtml.find('a').each(function () {
alert($(this).text());
});
If it's a top-level element, you need to use filter instead of find.

You can create a jQuery object from html strings outside of the DOM and manipulate it just the same as if it was in the DOM.
Simple example:
var html='<div><p>ABC</p></div>';
alert( $(html).find('p').text() ); // alerts "ABC"
Or
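var $div = $('<div>').append(html); // wrapper, so .html() includes the original outer <div>
$div.find('p').after('<p>DEF</p>'); // insert the new paragraph after the existing one
var newHtml = $div.html();
newHtml will then contain (whitespace added here for readability):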
<div>
<p>ABC</p>
<p>DEF</p>
</div>
In conclusion, I would loop through a jQuery object created from your html and do what you need using jQuery methods.
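If the markup has to stay a plain string, the regex idea from the question can still be made to work by matching each opening <a> tag and deciding inside a replacer callback, so it does not matter whether each link sits on its own line. This is only a sketch: rewriteLinks and mapUrl are names invented for the example, and it assumes quoted attribute values with no ">" inside them.

```javascript
// Sketch: rewrite the href of every <a> tag except those carrying data-track="false".
// mapUrl is a caller-supplied function (hypothetical) that returns the new URL.
function rewriteLinks(html, mapUrl) {
  return html.replace(/<a\b[^>]*>/gi, function (tag) {
    // Leave tags that opt out of tracking untouched.
    if (/\sdata-track\s*=\s*["']false["']/i.test(tag)) {
      return tag;
    }
    // Replace only the value of the href attribute, keeping the original quote style.
    return tag.replace(/(\shref\s*=\s*)(["'])(.*?)\2/i, function (m, prefix, quote, url) {
      return prefix + quote + mapUrl(url) + quote;
    });
  });
}

var submitted = '<a href="http://a.com/">Link A</a>\n' +
  '<a data-track="false" href="http://example.com/">Link B</a>';
var rewritten = rewriteLinks(submitted, function (url) {
  return url + '?tracked=1';
});
console.log(rewritten);
```

The result is already a string, so it can be written straight back into the textarea for the user to copy.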

How to return complete html of a page using jQuery?

How do I return complete html of a page using jQuery?
I cannot do
return $('html').html()
or
return "<html>" + $("html").html() + "</html>";
because the page might not have an html tag in it at all. I cannot use a class or id because this is not something that I control and can change. So the use case is: if I pass a url, I need to return the complete html of the page, and the page may or may not have an html tag in it.

Did you try:
$(document).html();
Edit: This doesn't work for me in Firebug, so I am not sure if it will work for you.

document.documentElement.outerHTML;
For older Firefox, you'll need to use .innerHTML instead, and then concatenate the <html> tags.
"the page might not have html tag in it at all"
If you don't have <html> tags, the browser will insert them automatically. If you want an exact representation of the original document, make an AJAX request, as suggested by @Rab Nawaz

For what it's worth, $("html") will return an element in all sane situations, even if the source document didn't contain an <html> element. This is because the browser will automatically insert certain required elements (including <html>, <head>, and <body>) into the DOM tree, even if they were not specified in the source document.
The only exception I can think of (where you wouldn't get anything from $("html")) would be if you were running jQuery against a DOM tree that isn't HTML -- for instance, if you load jQuery into an SVG document. You'd be crazy to do that, though. :)

Replace part of innerHTML without reloading embedded videos

I have a div with id #test that contains lots of html, including some youtube-embeds etc.
Somewhere in this div there is this text: "[test]"
I need to replace that text with "(works!)".
The normal way of doing this would of course be:
document.getElementById("test").innerHTML = document.getElementById("test").innerHTML.replace("[test]","(works!)");
But the problem is that if I do that the youtube-embeds will reload, which is not acceptable.
Is there a way to do this?

You will have to target the specific elements rather than the parent block. Since the DOM is changing the videos are repainted to the DOM.

Maybe a text node (textContent) will help you; see the MSDN documentation for IE9. Other browsers should support it as well.

Change your page so that
[test]
becomes
<span id="replace-me">[test]</span>
now use the following js to find and change it
document.getElementById('replace-me').textContent = '(works!)';
If you need to change more than one place, then use a class instead of an id and use document.getElementsByClassName and iterate over the returned elements and change them one by one.
Alternatively, you can use jQuery and do it even simpler like this:
$('#replace-me').text('(works!)');
Now for this single replacement using jQuery is probably overkill, but if you need to change multiple places (by class name), jQuery would definitely come in handy :)
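If the markup cannot be changed to add a wrapper span, the text node idea from the earlier answer can be sketched as a small recursive walk that only touches text nodes, so the embeds themselves are never re-created. The function name replaceInTextNodes is made up for this example, and it relies only on the standard nodeType, nodeValue and childNodes properties.

```javascript
// Sketch: replace text inside text nodes only, leaving all elements
// (including embedded iframes) in place.
function replaceInTextNodes(node, search, replacement) {
  if (node.nodeType === 3) { // 3 is Node.TEXT_NODE
    if (node.nodeValue.indexOf(search) !== -1) {
      // split/join replaces every occurrence without regex escaping issues
      node.nodeValue = node.nodeValue.split(search).join(replacement);
    }
    return;
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    replaceInTextNodes(children[i], search, replacement);
  }
}

// Usage in a browser (skipped where there is no document):
if (typeof document !== 'undefined') {
  var container = document.getElementById('test');
  if (container) {
    replaceInTextNodes(container, '[test]', '(works!)');
  }
}
```

Note that this walks every text node under the container, including any inside inline script or style elements, so it is best pointed at the smallest container that holds the text.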

How to read all child elements with tag names and their value from a xml file?

I've a xml file in which I'm storing some HTML content in an element tag called <body>. Now I'm trying to read all the HTML content of body tag using XML DOM in JavaScript.
I tried this code:
var xmlDoc=loadXMLDoc('QID_627.xml');
var bodytag = xmlDoc.getElementsByTagName("body");
document.write(bodytag);
but it is showing [object HTMLCollection] message on the browser screen.

Try this:
var xmlDoc=loadXMLDoc('QID_627.xml');
var bodytags = xmlDoc.getElementsByTagName("body");
document.write(bodytags[0]);
getElementsByTagName returns an array-like collection of elements (even if just one is found), so you need to index into it to retrieve your element.

Andrew Hare pointed out that getElementsByTagName() always returns an array, so you have to use bodytag[0] to get the element you want. This is correct, but not complete since even when you do that you'll still get an equally useless "[object ElementName]" message.
If you're set on using document.write() you can try to serialize out the content of the body tag with
document.write(bodytag[0].innerHTML);
Better yet would be directly attaching the source DOM nodes into your destination DOM.
You'd use something like
document.getElementById("destinationNodeId").appendChild(bodytag[0]);
There may be some issues with attaching DOM nodes from another document that may require you to copy the nodes, or jump through some other hoops to have it work.

You need to use document.write(bodytag.toXMLString());
EDIT: Andrew Hare also points out you need to subscript first. I think you may still need to use the toXMLString call as well.
