Multiple HTML DOMs - Parse and Transfer Data


I am requesting full HTML5 documents via Ajax using jQuery. I want to be able to parse them and transfer elements to my main page DOM, ideally with all major browsers, including mobile. I don't want to create an iframe as I want the process to be as quick as possible. With Chrome & Firefox I can do the following:
var contents = $(document.createElement('html'));
contents[0].innerHTML = data; // data : HTML document string
This will create a proper document, somewhat surprisingly, just without a doctype. In IE9, however, one may not use the innerHTML to set the contents of the html element. I tried to do the following, without any luck:
Create a DOM, open it, write to it and close it. Issue: on doc.open, IE9 throws an exception called Unspecified error.
var doc = document.implementation.createHTMLDocument('');
doc.open();
doc.write(data);
doc.close();
Create an ActiveX DOM. This time the result is better, but IE9 crashes when transferring / copying elements between documents. It is also a poor fit because IE8 does not support adoptNode / importNode.
var doc = new ActiveXObject('htmlfile');
doc.open();
doc.write(data);
doc.close();
contents = $(doc.documentElement);
document.adoptNode(contents);
I was thinking about recursively recreating the elements instead of transferring them between my documents, but that seems like an expensive task, given that I can have a lot of nodes to transfer. I like my last ActiveX example, as that will most likely work in IE8 and earlier (for parsing, at least).
Any ideas on this? Again, not only do I need to be able to parse the head and body, but I also need to be able to append these new elements to my main DOM.
Thanks much!

Answering my own question... To solve my issue I used all solutions mentioned in my post, with try/catch blocks if a browser throws an error (oh, how we love thee IE!). The following works in IE8, IE9, Chrome 23, Firefox 17, iOS 4 and 5, Android 3 & 4. I have not tested Android 2.1-2.3 and IE7.
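var contents = $('');
try {
    // First attempt: let innerHTML parse the whole document string
    // (the approach that worked in Chrome and Firefox in the question)
    contents = $(document.createElement('html'));
    contents[0].innerHTML = data;
}
catch(e) {
    try {
        // Second attempt: write the string into a new HTML document
        var doc = document.implementation.createHTMLDocument('');
        doc.open();
        doc.write(data);
        doc.close();
        contents = $(doc.documentElement);
    }
    catch(e) {
        // Last resort: ActiveX 'htmlfile' document
        var doc = new ActiveXObject('htmlfile');
        doc.open();
        doc.write(data);
        doc.close();
        contents = $(doc.documentElement);
    }
}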
At this point we can find elements using jQuery. Transferring them to a different DOM creates a bit of a problem. There are a couple of methods that do this (importNode & adoptNode), but they are not widely supported yet and/or are buggy. Given that our selector string is called 'selector', the code below re-creates the found elements and appends them to '.someDiv'.
var fnd = contents.find(selector);
if(fnd.length) {
    var newSelection = $('');
    fnd.each(function() {
        // Re-create the element in the main document instead of moving it
        var n = document.createElement(this.tagName);
        var attr = $(this).prop('attributes');
        n.innerHTML = this.innerHTML;
        $.each(attr,function() { $(n).attr(this.name, this.value); });
        newSelection.push(n);
    });
    $('.someDiv').append(newSelection);
}
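The question mentions recursively recreating elements as an alternative to innerHTML. A minimal sketch of that idea follows, assuming only element and text nodes matter. The helper name recreateNode is hypothetical; it uses nothing beyond createElement, createTextNode, setAttribute and appendChild, so it avoids importNode and adoptNode entirely.

```javascript
// Hypothetical helper: rebuild `source` (a node from another document)
// using the factory methods of `targetDoc`, so no node crosses documents.
function recreateNode(targetDoc, source) {
  var ELEMENT_NODE = 1, TEXT_NODE = 3;
  if (source.nodeType === TEXT_NODE) {
    return targetDoc.createTextNode(source.nodeValue);
  }
  if (source.nodeType !== ELEMENT_NODE) {
    return null; // assumption: comments and other node types are skipped
  }
  var copy = targetDoc.createElement(source.tagName);
  var attrs = source.attributes;
  for (var i = 0; i < attrs.length; i++) {
    copy.setAttribute(attrs[i].name, attrs[i].value);
  }
  // Walk the children and rebuild each one in the target document
  for (var child = source.firstChild; child; child = child.nextSibling) {
    var childCopy = recreateNode(targetDoc, child);
    if (childCopy) {
      copy.appendChild(childCopy);
    }
  }
  return copy;
}
```

Inside the fnd.each callback, recreateNode(document, this) could stand in for the createElement / innerHTML / attr steps. Whether it is faster than re-parsing with innerHTML depends on the browser and the size of the subtree, so it is worth measuring.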

Related

importNode and Microsoft Edge

I have a dynamic page where, with a bar button, I can change the main div content.
Most of the pages are static except one, which contains JavaScript (RGraph charts).
That's why, to make it work, I use the following code:
var data = new FormData();
data.append( 'action', 'charts' );
// clean the content
var myNode = document.getElementById("contentView");
while (myNode.firstChild)
{
myNode.removeChild(myNode.firstChild);
}
// set the new content
var div = document.createElement("div");
var t = document.createElement('template');
t.innerHTML = _connectToServer( data );
for (var i=0; i < t.content.childNodes.length; i++)
{
var node = document.importNode(t.content.childNodes[i], true);
div.appendChild(node);
}
document.getElementById("contentView").appendChild(div);
The problem is that, as far as I can see (and from what I have read), this code is not compatible with Microsoft Edge, and I would like to make it work with Edge as well.
What's the best way to succeed?

Ok, so I took this over to our DOM team to better understand this interop issue.
This is actually a Chrome bug per spec. What happens is when you create a template element and place innerHTML inside of it, it is treated as a DocumentFragment. And a template element's contents can't have executable script.
Here are the relevant spec links that cover this:
InnerHTML
Template Element
To work around this, as I pointed to earlier you'll need to create a <script> tag not within a <template> element and utilize textContent. Here's an example of this: http://jsbin.com/gizayuyape/edit?html,js,output
Or here is the code:
// Using textContent
var body = document.getElementsByTagName('body')[0];
var s = document.createElement('script');
s.textContent = 'document.write("Script textContent");';
body.appendChild(s);
The good news, at least on the interop front, is that Chrome has fixed this issue starting in version 67.
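The same workaround can be applied to a whole block of imported markup: replace each script element that came from innerHTML with one created by document.createElement, which the browser treats as new and runs once it is in the document. The following is a minimal sketch under that assumption; the helper name recreateScripts is hypothetical.

```javascript
// Hypothetical helper: swap every <script> under `container` for a freshly
// created one, copying its attributes and inline code.
function recreateScripts(doc, container) {
  var stale = container.querySelectorAll('script'); // static list, safe to mutate
  for (var i = 0; i < stale.length; i++) {
    var oldScript = stale[i];
    var fresh = doc.createElement('script');
    for (var j = 0; j < oldScript.attributes.length; j++) {
      var attr = oldScript.attributes[j];
      fresh.setAttribute(attr.name, attr.value);
    }
    fresh.textContent = oldScript.textContent;
    oldScript.parentNode.replaceChild(fresh, oldScript);
  }
  return stale.length; // number of scripts that were replaced
}
```

With the code from the question, this could be called as recreateScripts(document, div) around the point where div is appended to contentView.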

How to prevent resource loading of unattached elements in Chrome

I'm working on a Chrome extension and I have the following problem:
var myDiv = document.createElement('div');
myDiv.innerHTML = '<img src="a.png">';
What happens now is that Chrome tries to load the "a.png" resource, even if I don't attach the "div" element to the document. Is there a way to prevent it?
In the extension I need to get data from a site that doesn't provide any API, so I have to parse the whole HTML to get the necessary data. Writing my own simple HTML parser could be tricky, so I would rather use the native HTML parser. However, in Chrome, when I put the whole source code into some temporary non-attached element (so it would get parsed and I could filter the necessary data), all the images (and possibly other resources) start to load as well, causing higher traffic or (in the case of relative paths) lots of errors in the console.

To prevent the resources from being loaded, you'll need to create your Node in an entirely new #document. You can use document.implementation.createHTMLDocument for this.
var dom = document.implementation.createHTMLDocument(); // make new #document
// now use this to..
var myDiv = dom.createElement('div'); // ..create a <div>
myDiv.innerHTML = '<img src="a.png">'; // ..parse HTML

You can delay parsing/loading HTML by storing it in a non-standard attribute, then assigning it to innerHTML "when the time comes":
myDiv.setAttribute('deferredHtml', '<img src="http://upload.wikimedia.org/wikipedia/commons/4/4e/Single_apple.png">');
// Attached to window so that the inline onclick handler below can find it
window.loadDeferredImage = function() {
if(myDiv.hasAttribute('deferredHtml')) {
myDiv.innerHTML = myDiv.getAttribute('deferredHtml');
myDiv.removeAttribute('deferredHtml');
}
};
... onclick="loadDeferredImage()"
I created a jsfiddle illustrating this idea:
http://jsfiddle.net/akhikhl/CbCst/3/

How to add additional xmlns namespace attributes to XML in IE via javascript

I'm a little bit stuck trying to attach multiple namespaces to an XML element via javascript across browsers; I've tried about a dozen different ways to no avail.
I usually use plain old javascript but for the sake of keeping this example short, this is how what I'm doing would be done via jQuery:
var soapEnvelope = '<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"></soapenv:Envelope>';
var jXML = jQuery.parseXML(soapEnvelope);
$(jXML.documentElement).attr("xmlns:xsd", "http://www.w3.org/2001/XMLSchema");
In both Chrome and FF, this works as expected giving a result like this:
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema" />
But in IE9, I get a result like this:
<soapenv:Envelope
xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
xmlns:NS1="" NS1:xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>
And I cannot find a way to add this namespace attribute without IE9 adding this NS1 prefix to my namespaces. Also if I try passing this result back into $.parseXML(result) I get a malformed XML exception.
Am I misunderstanding something to do with the way namespaces are declared in IE or can anyone suggest a way I can get a consistent result across browsers?
Thanks in advance

In case anyone else runs into a similar problem to this, I ended up finding out that it can be fixed by initialising the IE XML DOM object differently to the way jQuery does it. I used something similar to the following and now the xml namespaces seem to be working fine across all major browsers and the jQuery attr method will now work again also.
var getIEXMLDOM = function() {
var progIDs = [ 'Msxml2.DOMDocument.6.0', 'Msxml2.DOMDocument.3.0' ];
for (var i = 0; i < progIDs.length; i++) {
try {
var xmlDOM = new ActiveXObject(progIDs[i]);
return xmlDOM;
} catch (ex) { }
}
return null;
}
var xmlDOM;
// Note: $.browser was removed in jQuery 1.9 (the jQuery Migrate plugin restores it)
if ( $.browser.msie ) {
xmlDOM = getIEXMLDOM();
xmlDOM.loadXML(soapEnvelope);
} else {
xmlDOM = jQuery.parseXML(soapEnvelope);
}
$(xmlDOM.documentElement).attr("xmlns:xsd", "http://www.w3.org/2001/XMLSchema");
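For DOM implementations that follow the standard, namespace declarations can also be added with setAttributeNS and the reserved xmlns namespace URI rather than a plain attribute set. The sketch below is untested in IE9 and is not a replacement for the fix above; the helper name declareNamespace is hypothetical.

```javascript
// Namespace URI reserved for xmlns declarations
var XMLNS_NS = 'http://www.w3.org/2000/xmlns/';

// Hypothetical helper: declare `prefix` as bound to `uri` on `element`.
function declareNamespace(element, prefix, uri) {
  element.setAttributeNS(XMLNS_NS, 'xmlns:' + prefix, uri);
  return element;
}
```

For the envelope in the question, the call would be declareNamespace(xmlDOM.documentElement, 'xsd', 'http://www.w3.org/2001/XMLSchema').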

Collect DOM elements from external HTML documents

I am trying to write a report-generator to collect user-comments from a list of external HTML files. User-comments are wrapped in <span> elements.
Can this be done using JavaScript?
Here's my attempt:
function generateCommentReport()
{
var files = document.querySelectorAll('td a'); //Files to scan are links in an HTML table
var outputWindow = window.open(); //Output browser window for report
for(var i = 0; i<files.length; i++){
//Open each file in a browser window
win = window.open();
win.location.href = files[i].href;
//Scan opened window for 'comment's
comments = win.document.querySelectorAll('.comment');
for(var j=0;j<comments.length;j++){
//Add to output report
outputWindow.document.write(comment[i].innerHTML);
}
}
}

You will need to wait for onload on the target window before you can read content from its document.
Also what type of element is comment? In general you can't put a name on just any element. Whilst unknown attributes like a misplaced name may be ignored, you can't guarantee that browsers will take account of them for getElementsByName. (In reality, most browsers do, but IE doesn't.) A class might be a better bet?

Each web browser runs in a defined, controlled sandbox on the user's computer, where certain things, such as the file system, are off limits to code. These are safety standards to ensure that no malicious code from the internet runs on your system to phish sensitive information stored on it. A web browser is allowed such access only if the user grants it explicitly.
For an internet application, I can suggest the following:
- If the list of commands is static, cache it as XML, JSON or cookies (it is stored on the user's system until it expires)
- If it is dynamic, use Ajax to retrieve it

I think I have the solution to this.
var windows = [];
var report = null;
function handlerFunctionFactory(i,max){
return function (evt){
//Scan opened window for 'comment's
var comments = windows[i].document.querySelectorAll('.comment');
for(var j=0;j<comments.length;j++){
//Add to output report
report.document.write(comments[j].innerHTML);
}
if((i+1)==max){
report.document.write("</body></html>");
report.document.close();
}
windows[i].close();
}
}
function generateReport()
{
var files = document.querySelectorAll('td a'); //The list of files to scan is stored as links in an HTML table
report = window.open(); //Output browser window for report
report.title = 'Comment Report';
report.document.open();
report.document.write('<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
+ '<html><head><title>Comment Report</title>'
+ '</head><body>');
for(var i = 0; i<files.length; i++){
//Open each file in a browser window
var win = window.open();
windows.push(win);
win.location.href = files[i].href;
win.onload = handlerFunctionFactory(i,files.length);
}
}
Any refactoring tips are welcome. I am not entirely convinced that factory is the best way to bind the onload handlers to an instance for example.
This works only on Firefox :(
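One refactoring to consider: the handler closes the report when the window with the highest index finishes loading, which can happen before slower windows are done. A minimal sketch of counting completed loads instead follows; the helper name createCountdown is hypothetical, and it does not address the Firefox-only behaviour.

```javascript
// Hypothetical helper: returns a function to call once per finished load.
// `onDone` runs when it has been called `total` times, in any order.
function createCountdown(total, onDone) {
  var remaining = total;
  return function () {
    remaining -= 1;
    if (remaining === 0) {
      onDone();
    }
  };
}
```

In generateReport, a single function made with createCountdown(files.length, ...) could be shared by all handlers and called at the end of each one, replacing the (i+1)==max check. Note that onDone never runs when total is 0, so an empty file list would need its own handling.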

XML in html div?

I have put some XML fragments in a div and retrieve them with getElementsByTagName. It works fine in Firefox but Internet Explorer ain't so nice... What should I do to fix this?
var thumbnails = content.getElementsByTagName("thumbnails");
for (var i = 0; i < thumbnails.length; i++) {
thumbnails[i].innerHTML

You can't put arbitrary XML in an HTML document, in general. It's invalid HTML, and browser parsers may try to ‘fix’ the broken HTML, mangling your data.
You can embed XML inside HTML using <xml> data islands in IE, or using native-XHTML with custom namespaces in other browsers. But apart from the compatibility issue of the two different methods, it's just not really a very good idea.
Further, even if it worked, plain XML Element nodes don't have an innerHTML property in any case.
You could embed XML inside JavaScript:
<script type="text/javascript">
var xml= '<nails><thumb id="foo">bar</thumb><thumb id="bof">zot</thumb></nails>';
var doc= parseXML(xml);
var nails= doc.getElementsByTagName('thumb');
for (var i = 0; i<nails.length; i++) {
alert(nails[i].getAttribute('id'));
}
function parseXML(s) {
if ('DOMParser' in window) {
return new DOMParser().parseFromString(s, 'text/xml');
} else if ('ActiveXObject' in window) {
var doc= new ActiveXObject('MSXML2.DOMDocument');
doc.async= false;
doc.loadXML(s);
return doc;
} else {
alert('Browser cannot parse XML');
}
}
</script>
But this means you have to encode the XML as a JavaScript string literal (eg. using a JSON encoder if you are doing it dynamically). Alternatively you could use an XMLHttpRequest to fetch a standalone XML document from the server: this is more widely supported than the DOMParser/ActiveX approach.
If you are just using XML to pass data to your script, you will find it a lot easier to write JavaScript literals to do it instead of mucking about with parsing XML.
<script type="text/javascript">
var nails= [
{"id": "foo", "text": "bar"},
{"id": "bof", "text": "zot"}
];
for (var i = 0; i<nails.length; i++) {
// do something
}
</script>
Again, you can produce this kind of data structure easily using a JSON encoder if you need to do it dynamically.
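As a rough illustration of producing such a literal dynamically, the sketch below uses JSON.stringify and escapes the characters that could break an inline script block. The function name toScriptLiteral is hypothetical, and it assumes varName is a trusted, valid identifier.

```javascript
// Hypothetical helper: build a "var name= ...;" statement that is safe to
// place inside a <script> element.
function toScriptLiteral(varName, data) {
  var json = JSON.stringify(data)
    .replace(/</g, '\\u003c')        // keeps "</script>" from ending the block
    .replace(/\u2028/g, '\\u2028')   // line separators are not allowed in
    .replace(/\u2029/g, '\\u2029');  // string literals in older engines
  return 'var ' + varName + '= ' + json + ';';
}

var statement = toScriptLiteral('nails', [
  { id: 'foo', text: 'bar' },
  { id: 'bof', text: 'zot' }
]);
```

The escapes only change how the characters are written; once the script runs, the strings hold their original values.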

IE 7 has a security issue with the innerHTML property of a DOM element. This security check silently blocks some code. It appears this may be your problem. I do not know if this is an issue with IE 8.
The fix is to add the dynamically created element to the DOM tree before accessing any of its properties, not after.
However, as a matter of best practice, it is wise to change the way you are doing this. Perhaps you should edit your question to ask for a better way to do this.

What I've found to be the best way of doing this is to put your XML into a textarea. This is also ext-js's suggestion. That way, the browser doesn't try to create HTML out of your XML. When you retrieve its value, you just retrieve the textarea's value.
However, as other people have mentioned, I would suggest you retrieve the xml from the server for better separation between html and data, unless you really need to keep your http requests to a minimum.
