We're looking for ways to create a DOM document in JavaScript from a string, but without using jQuery.
Is there a way to do so? [I would assume so, since jQuery can do it!]
For those curious, we can't use jQuery because we're doing this in the context of a Chrome application's content script, and using jQuery would just make our content script too heavy.
https://developer.mozilla.org/en-US/docs/Web/API/DOMParser
var parser = new DOMParser();
var doc = parser.parseFromString("<html_string>", "text/html");
(the resulting doc variable is a Document object, not a DocumentFragment).
In case you're still looking for an answer, and for anyone else coming across this, I have just been trying to do the same thing myself. It seems you want to be looking at JavaScript's DOMImplementation:
http://reference.sitepoint.com/javascript/DOMImplementation
There are a few references to compatibility there as well, but it's fairly well supported.
In essence, to create a new document to manipulate, you want to create a new Doctype object (if you're going to output some standards based stuff) and then create the new Document using the newly created Doctype variable.
There are multiple options to be put into both the doctype and the document, but if you're creating an HTML5 document, it seems you want to leave most of them as blank strings.
Example (New HTML5 DOM Document):
var doctype = document.implementation.createDocumentType( 'html', '', '');
var dom = document.implementation.createDocument('', 'html', doctype);
The new Document now looks like this:
<!DOCTYPE html>
<html>
</html>
Example (New XHTML DOM Document):
var doctype = document.implementation.createDocumentType(
'html',
'-//W3C//DTD XHTML 1.0 Strict//EN',
'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'
);
var dom = document.implementation.createDocument(
'http://www.w3.org/1999/xhtml',
'html',
doctype
);
So it's up to you to populate the rest of it. You could do this as simply as changing
dom.documentElement.innerHTML = '<head></head><body></body>';
Or go with the more rigorous:
var head = dom.createElement( 'head' );
var body = dom.createElement( 'body' );
dom.documentElement.appendChild(head);
dom.documentElement.appendChild(body);
All yours.
createDocumentFragment may help you.
https://developer.mozilla.org/En/DOM/DocumentFragment
Browsers always create the document themselves, starting from an empty page (about:blank).
Maybe there are some functions available in a Chrome application (like XUL in FF), but there's no such function in ordinary JavaScript.
Solution - works in all modern browsers (parsing "text/html" with DOMParser needs IE 10 or later)
var doc = (new DOMParser).parseFromString(htmlString, "text/html");
Or
var doc = document.implementation.createHTMLDocument();
1)Example: new DOMParser()
var htmlString = `<body><header class="text-1">Hello World</header><div id="table"><!--TABLE HERE--></div></body>`;
var insertTableString = `<table class="table"><thead><tr><th>th cell</th></tr></thead><tbody><tr><td>td cell</td></tr></tbody></table>`;
var doc = (new DOMParser).parseFromString(htmlString, "text/html");
doc.getElementById('table').insertAdjacentHTML('beforeend', insertTableString);
console.log(doc);
2)Example: createHTMLDocument()
var htmlString = `<body><header class="text-1">Hello World</header><div id="table"><!--TABLE HERE--></div></body>`;
var insertTableString = `<table class="table"><thead><tr><th>th cell</th></tr></thead><tbody><tr><td>td cell</td></tr></tbody></table>`;
var doc = document.implementation.createHTMLDocument();
doc.open();
doc.write(htmlString);
doc.getElementById('table').insertAdjacentHTML('beforeend', insertTableString);
doc.close();
console.log(doc);
I tried some of the other ways here, but there were issues when creating script elements, such as Chrome refusing to load the actual .js file pointed to by the src attribute. Below is what works best for me.
It's up to 3x faster than jQuery and 3.5x faster than using DOMParser, but 2x slower than programmatically creating the element.
https://www.measurethat.net/Benchmarks/Show/2149/0
Object.defineProperty(HTMLElement, 'From', {
enumerable: false,
value: (function (document) {
//https://www.measurethat.net/Benchmarks/Show/2149/0/element-creation-speed
var rgx = /(\S+)=(["'])(.*?)(?:\2)|(\w+)/g;
return function CreateElementFromHTML(html) {
html = html.trim();
var bodystart = html.indexOf('>') + 1, bodyend = html.lastIndexOf('<');
var elemStart = html.substr(0, bodystart);
var innerHTML = html.substr(bodystart, bodyend - bodystart);
rgx.lastIndex = 0;
var elem = document.createElement(rgx.exec(elemStart)[4]);
var match; while ((match = rgx.exec(elemStart))) {
if (match[1] === undefined) {
elem.setAttribute(match[4], "");
} else {
elem.setAttribute(match[1], match[3]);
}
}
elem.innerHTML = innerHTML;
return elem;
};
}(window.document))
});
Usage Examples:
HTMLElement.From(`<div id='elem with quotes' title='Here is "double quotes" in single quotes' data-alt="Here is 'single quotes' in double quotes"><span /></div>`);
HTMLElement.From(`<link id="reddit_css" type="text/css" rel="stylesheet" async href="https://localhost/.js/sites/reddit/zinject.reddit.css">`);
HTMLElement.From(`<script id="reddit_js" type="text/javascript" async defer src="https://localhost/.js/sites/reddit/zinject.reddit.js"></script>`);
HTMLElement.From(`<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">`);
HTMLElement.From(`<div id='Sidebar' class='sidebar' display=""><div class='sb-handle'></div><div class='sb-track'></div></div>`);
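To make the regular expression in the snippet above easier to follow, here is a minimal sketch that runs only the attribute-matching step and returns plain data instead of building an element. The helper name parseStartTag is hypothetical (it is not part of the answer), and it assumes the input is a single, well-formed start tag.

```javascript
// Same pattern as above: either name=quoted-value, or a bare word
// (the tag name, or a valueless attribute such as "async").
var rgx = /(\S+)=(["'])(.*?)(?:\2)|(\w+)/g;

// Hypothetical helper: returns { tag, attrs } for one start tag string.
function parseStartTag(startTag) {
  rgx.lastIndex = 0;
  var first = rgx.exec(startTag);
  if (!first || first[4] === undefined) {
    throw new Error('No tag name found in: ' + startTag);
  }
  var result = { tag: first[4], attrs: {} };
  var match;
  while ((match = rgx.exec(startTag))) {
    if (match[1] === undefined) {
      // Bare word: attribute without a value
      result.attrs[match[4]] = '';
    } else {
      // name="value" or name='value'
      result.attrs[match[1]] = match[3];
    }
  }
  return result;
}

var parsed = parseStartTag('<link id="reddit_css" rel="stylesheet" async href="zinject.reddit.css">');
console.log(parsed.tag);
console.log(JSON.stringify(parsed.attrs));
```

Because the pattern is a plain regular expression and not an HTML parser, unquoted attribute values and other unusual markup are outside what this sketch handles.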
I was able to do this by writing the HTML string into an iframe:
const html = `<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<meta http-equiv="X-UA-Compatible" content="ie=edge">
<title>Document</title>
</head>
<body>
</body>
</html>`
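const iframe = document.createElement('iframe')
// contentDocument is null until the iframe is attached to the page
document.body.appendChild(iframe)
let myDocumentObject
// Register the listener before writing, so the load that follows close() is not missed
iframe.addEventListener('load', () => {
myDocumentObject = iframe.contentDocument
})
iframe.contentDocument.open()
iframe.contentDocument.write(html)
iframe.contentDocument.close()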
fetch("index.html", { // or any valid URL
method: "get"
}).then(function(e) {
return e.text().then(e => {
var t = document.implementation.createHTMLDocument("");
t.open();
t.write(e);
t.close();
return t;
});
}).then(e => {
// e will contain the document fetched and parsed.
console.log(e);
});
A DOM element has the innerHTML property, which lets you replace its contents completely. So you can create a container and fill it with new HTML content.
function createElementFromStr(htmlContent) {
var wrapperElm = document.createElement("div");
wrapperElm.innerHTML = htmlContent; // Ex: "<p id='example'>HTML string</p>"
console.assert(wrapperElm.children.length == 1); //Only one child at first level.
return wrapperElm.children[0];
}
I know it is an old question, but I hope this helps someone else.
var dom = '<html><head>....</head><body>...</body></html>';
document.write(dom);
document.close();
HTML would look like this:
<html>
<head></head>
<body>
<div id="toolbar_wrapper"></div>
</body>
</html>
JS would look like this:
var data = '<div class="toolbar">'+
'<button type="button" class="new">New</button>'+
'<button type="button" class="upload">Upload</button>'+
'<button type="button" class="undo disabled">Undo</button>'+
'<button type="button" class="redo disabled">Redo</button>'+
'<button type="button" class="save disabled">Save</button>'+
'</div>';
document.getElementById("toolbar_wrapper").innerHTML = data;
Related
Updated/Simplified based on Mathias's comment:
I'm trying to dynamically create an HTML Document and then find elements within the DOM via XPath.
What's odd is that the created Document looks to be properly constructed and querying it with document.querySelector('<some el>') for example works as expected.
However, document.evaluate is always returning null for every XPath.
Update #2: This is true for Chrome + Safari. Everything works as expected in Firefox.
function createDocumentFromHTMLContent(htmlContent) {
const htmlEl = document.createElement('HTML');
htmlEl.innerHTML = htmlContent;
const doctype = document.implementation.createDocumentType('html', '', '');
const doc = document.implementation.createDocument('', 'html', doctype);
doc.replaceChild(htmlEl, doc.firstElementChild);
return doc;
}
function getElementByXpath(path, doc) {
doc = doc || document;
return doc.evaluate(path, doc, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
}
const pageContent = `
<!DOCTYPE html>
<html>
<head>
<title>Yup</title>
</head>
<body>
<h1>Title</h1>
</body>
</html>
`;
const doc = createDocumentFromHTMLContent(pageContent);
const xpath = '/html[1]/body[1]/h1';
const onDoc = {
viaXPath: getElementByXpath(xpath, doc),
viaSelector: doc.querySelector('h1'),
};
const onDocument = {
viaXPath: getElementByXpath(xpath, document),
viaSelector: document.querySelector('h1'),
};
const summarize = (obj) => `XPath El: ${!!obj.viaXPath}, Selector El: ${!!obj.viaSelector}`;
const summaryEl = document.createElement('p');
summaryEl.innerHTML = `Via Document: ${summarize(onDocument)}<br />Via Doc: ${summarize(onDoc)}`;
document.body.appendChild(summaryEl);
Here's the above in a JSFiddle: https://jsfiddle.net/two2hg0z/
I can't figure out why XPath selection works on one document object, but not the other.
Any help is appreciated! Very stumped.
I'm not entirely sure what happens here in WebKit browsers. Probably they don't like using Document.replaceChild on the documentElement, or maybe it's because you are setting markup that is actually invalid inside an <html> element (for instance, the doctype should be set outside of it, and an <html> element can't contain another <html> node). In any case, the correct way to parse a string as a Document is to use a DOMParser:
function createDocumentFromHTMLContent(htmlContent) {
return new DOMParser().parseFromString(htmlContent, 'text/html');
}
function getElementByXpath(path, doc) {
doc = doc || document;
return doc.evaluate(path, doc, null, XPathResult.FIRST_ORDERED_NODE_TYPE, null).singleNodeValue;
}
const pageContent = `
<!DOCTYPE html>
<html>
<head>
<title>Yup</title>
</head>
<body>
<h1>Title</h1>
</body>
</html>
`;
const doc = createDocumentFromHTMLContent(pageContent);
const xpath = '/html[1]/body[1]/h1';
const onDoc = {
viaXPath: getElementByXpath(xpath, doc),
viaSelector: doc.querySelector('h1'),
};
const onDocument = {
viaXPath: getElementByXpath(xpath, document),
viaSelector: document.querySelector('h1'),
};
const summarize = (obj) => `XPath El: ${!!obj.viaXPath}, Selector El: ${!!obj.viaSelector}`;
const summaryEl = document.createElement('p');
summaryEl.innerHTML = `Via Document: ${summarize(onDocument)}<br />Via Doc: ${summarize(onDoc)}`;
document.body.appendChild(summaryEl);
<h1>Title</h1>
Note that if, instead of replacing the documentElement, you had set its innerHTML to that of your generated HTMLElement, it would also have worked in Chrome, but no longer in Firefox ;-)
And I can't access div tag on body's child at [9] , [11]
I use bd.firstChild, bd.childNodes[n] before but null always appears
<html>
<head>
<meta charset="EUC-KR">
<title>Insert title here</title>
<script>
var rt = document.getRootNode();
document.write(rt.nodeName + " "); //document
var ht = rt.firstChild;
document.write(ht.nodeName + " "); // html
var hd = ht.firstChild;
document.write(hd.nodeName + " "); // head
var bd = hd.nextSibling;
document.write(bd.nodeName + " "); // body
</script>
</head>
<body>
<br/><br/><br/><br/><br/><br/>
<h1>1</h1>
<h2>2</h2>
</body>
</html>
The browsers' web-page parser (and document-builder) is paused whenever a <script> element is encountered (assuming it doesn't have async or defer attributes).
document.write causes the argument text to be added to the document stream immediately when invoked.
The argument text is appended to the currently parsed document text then fed into the document builder synchronously.
This synchronous behaviour, and the fact parsing text and creating document nodes from it is expensive, is why document.write is deprecated and its use discouraged.
You are passing human-readable text into document.write so new #text nodes will be added immediately where the <script> element is, which is inside the <head> element.
But all of that is irrelevant because in your case your <html>'s firstChild is actually a #text node consisting of the whitespace between <html> and <head>; the only way to eliminate this is by having <html><head> instead of <html>(#text "\r\n\r\n")<head>.
var bd is not the <body> element. Use your debugger.
Also, avoid using the node-based DOM API because it includes things like whitespace #text nodes and comments. It's usually better to use the Element-based DOM API instead.
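The difference between the node-based and the element-based view can be sketched without a browser. The helper below, elementChildren (a hypothetical name), keeps only child nodes whose nodeType is 1, which is what the DOM's own children property does. The sample tree is a hand-written stand-in, assuming a parsed page that contains one whitespace #text node between two elements.

```javascript
// Node type constants as defined by the DOM (Node.ELEMENT_NODE, Node.TEXT_NODE).
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: element children only, like the DOM's `children` property.
function elementChildren(node) {
  return Array.prototype.filter.call(node.childNodes, function (child) {
    return child.nodeType === ELEMENT_NODE;
  });
}

// Hand-written stand-in for a parsed tree (an assumption, for illustration only).
var html = {
  nodeName: 'HTML', nodeType: ELEMENT_NODE,
  childNodes: [
    { nodeName: 'HEAD', nodeType: ELEMENT_NODE, childNodes: [] },
    { nodeName: '#text', nodeType: TEXT_NODE, data: '\n', childNodes: [] },
    { nodeName: 'BODY', nodeType: ELEMENT_NODE, childNodes: [] }
  ]
};

// Node-based view: the node right after HEAD is the whitespace text node.
console.log(html.childNodes[1].nodeName);
// Element-based view: the second element child is BODY.
console.log(elementChildren(html)[1].nodeName);
```

In a browser the same distinction is available directly through firstElementChild, nextElementSibling and children, or simply through document.head and document.body.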
Before you edited the snippet to call document.write before the document was completed, I created the code below. You may NOT use document.write after the document is completed, and you CANNOT show the tags/nodes before they are rendered:
There seem to be some strange goings-on with just getting the nodes: Chrome adds a newline after the head, before the body.
This shows the issue:
You want to check out https://developer.mozilla.org/en-US/docs/Web/API/Node/nodeType
<!doctype html>
<html>
<head>
<title></title>
</head>
<body>
<p></p>
<script>
// https://developer.mozilla.org/en-US/docs/Web/API/NodeFilter/acceptNode
// https://developer.mozilla.org/en-US/docs/Web/API/Document/createTreeWalker
var nodeIterator = document.createNodeIterator(
// Node to use as root
document.querySelector('html'),
// Only consider element nodes (nodeType 1); whitespace #text nodes are skipped
NodeFilter.SHOW_ELEMENT,
// Object containing the function to use for the acceptNode method
// of the NodeFilter
{
acceptNode: function(node) {
// Every element is accepted: SHOW_ELEMENT has already
// filtered out text and comment nodes
return NodeFilter.FILTER_ACCEPT;
}
}
);
// Log the tag name of every element under root
var node;
while ((node = nodeIterator.nextNode())) {
console.log(node.tagName);
}
/* ----------- Older code ------------- */
/*
var rt = document.getRootNode();
console.log("Root", rt.nodeName + " "); //document
var ht = rt.firstChild;
console.log("Root's firstChild", ht.nodeName + " "); // html
var HTML = ht.nextSibling;
console.log("html's nextSibling", HTML.nodeName + " "); // HTML
var HEAD = HTML.firstChild;
console.log("Html's firstChild", HEAD.nodeName + " "); // HEAD
var newLine = HEAD.nextSibling;
var BODY = newLine.nextSibling;
console.log("newLine's nextSibling", BODY.nodeName + " "); // BODY
*/
</script>
</body>
</html>
I'm working on a form builder website. After a form is built it must be saved in database. When the user clicks on a form name from the list of saved forms the form information is restored from database. One of the variables I will restore is the structure of the form. In javascript I wrote these lines of code:
var prefix_content='<!DOCTYPE HTML>\n<html lang="en-US">\n<head>\n<meta charset="UTF-8">\n<title> </title>\n </head>\n<body>\n ';
var sufex_content=' \n</body></html>';
var dynamic_content=String(text_content);
document.write(prefix_content + dynamic_content + sufex_content );
The variable dynamic_content contains the dynamic structure.
The problem is that prefix_content and sufex_content are displayed as HTML, but dynamic_content is written to the page as text. Does anyone know why that is, or how to solve this problem?
Note: when I write the text in dynamic content statically between single quotes it is displayed as html not text.
If you're seeing the content retrieved from your database as plaintext instead of HTML, its HTML entities are probably getting escaped somewhere along the way. Check the contents of your text_content variable (e.g. use console.log(text_content)), and if you're seeing stuff like &lt;div&gt; instead of <div>, go on and find out where the escaping happens and either remove it or manually unescape.
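If the escaping cannot be removed at its source, a minimal sketch of manual unescaping could look like the following. The function name unescapeHtml is hypothetical, and the sketch only handles the five entities that typical server-side escaping produces, which is an assumption about your data.

```javascript
// Hypothetical helper: reverses basic HTML entity escaping in a single pass.
function unescapeHtml(escaped) {
  var entities = {
    '&lt;': '<',
    '&gt;': '>',
    '&quot;': '"',
    '&#39;': "'",
    '&amp;': '&'
  };
  // One pass over the string, so "&amp;lt;" becomes "&lt;" and is not unescaped twice.
  return String(escaped).replace(/&(?:lt|gt|quot|#39|amp);/g, function (entity) {
    return entities[entity];
  });
}

console.log(unescapeHtml('&lt;div class=&quot;row&quot;&gt;TEST&lt;/div&gt;'));
```

The result can then be used as dynamic_content in the original document.write call.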
TRY THIS:
var prefix_content='<!DOCTYPE HTML>\n<html lang="en-US">\n<head>\n<meta charset="UTF-8">\n<title> </title>\n </head>\n<body>\n ';
var sufex_content=' \n</body></html>';
var dynamic_content=String(text_content);
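var parser = new DOMParser();
var el = parser.parseFromString(dynamic_content, "text/html");
// el is a Document; concatenating it directly would stringify the object, not its markup
document.write(prefix_content + el.body.innerHTML + sufex_content );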
Or you can try this too, using jQuery:
var dynamic_content=String(text_content);
var nodes = $.parseHTML( dynamic_content ); // array of DOM nodes, not a string
document.write(prefix_content + $('<div>').append(nodes).html() + sufex_content );
var content = "<div style='color:red;'>TEST</div>";
var prefix ='<!DOCTYPE HTML>\n<html lang="en-US">\n<head>\n<meta charset="UTF-8">\n<title>TEST</title>\n</head>\n<body>\n';
var suffix ='\n</body></html>';
var all = prefix + content + suffix;
var parser = new DOMParser();
var doc = parser.parseFromString(all, "text/html");
console.log(doc.children[0].outerHTML);
Instead of children[0] you can also go for:
doc.documentElement.outerHTML
Results in:
<html lang="en-US"><head>
<meta charset="UTF-8">
<title>TEST</title>
</head>
<body>
<div style="color:red;">TEST</div>
</body></html>
I have the following tag in HTML:
<div data-dojo-type="dojox.data.XmlStore"
data-dojo-props="url:'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/6204/2', label:'text'"
data-dojo-id="bookStore3"></div>
I have the values 6204 and 2 in a couple of global variables in the script section:
<html>
<head>
<script>
...
var newNeId = gup('neId');
var newNeGroupId = gup('neGroupId');
...
</script>
</head>
</html>
Is it possible to have these variables in the div tag in the HTML body? If so, how?
To clarify this a bit more, I need to have the URL in the tag something like this:
url: 'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/'+newNeGroupId+'/'+newNeId
I changed it according to your requirement:
<html>
<head>
<script type="text/javascript">
// example data
var newNeId = 10;
var newNeGroupId = 500;
window.onload = function(e){
var myDiv = document.getElementById("myDiv");
myDiv.setAttribute("data-dojo-props", "url:'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/" + newNeGroupId + "/" + newNeId + "', label:'text'");
}
</script>
</head>
<body>
<div id="myDiv" data-dojo-type="dojox.data.XmlStore"
data-dojo-props="url:'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/6204/2', label:'text'"
data-dojo-id="bookStore3"></div>
</body>
</html>
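The string concatenation above is easy to get wrong, so here is a small sketch that builds the data-dojo-props value in one place. buildStoreProps is a hypothetical helper, and the segment order (group id first, then ne id) is taken from the URL given in the question.

```javascript
// Base URL copied from the question.
var BASE_URL = 'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/';

// Hypothetical helper: returns the text for the data-dojo-props attribute.
function buildStoreProps(neGroupId, neId) {
  var url = BASE_URL + encodeURIComponent(neGroupId) + '/' + encodeURIComponent(neId);
  return "url:'" + url + "', label:'text'";
}

console.log(buildStoreProps(6204, 2));
```

The returned string can be passed straight to myDiv.setAttribute("data-dojo-props", ...).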
You could add them to the <div> using the same data-* attribute pattern (see the MDN documentation on dataset) as Dojo. Use hyphenated names, because HTML lowercases attribute names:
<div id="savebox" data-new-ne-id="6204" data-new-ne-group-id="2"></div>
These attributes are then accessible as element.dataset.itemName, with the hyphenated name converted to camelCase.
var div = document.querySelector( '#savebox' );
// access
console.log( div.dataset.newNeId );
console.log( div.dataset.newNeGroupId );
As @EricFortis pointed out, the question remains why you want to do this. It only makes sense if you pass those values on from the server side.
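Because HTML lowercases attribute names, the dataset key for an attribute written with capital letters ends up all lowercase; it is the hyphens that produce camelCase keys. A minimal sketch of that mapping in plain JavaScript follows (the function name dataAttrToDatasetKey is hypothetical, and the function only mimics what the browser does for you).

```javascript
// Hypothetical helper: mimics how a data-* attribute name maps to a dataset key.
function dataAttrToDatasetKey(attrName) {
  // The HTML parser stores attribute names in lowercase.
  var lower = String(attrName).toLowerCase();
  if (lower.indexOf('data-') !== 0) {
    throw new Error('Not a data-* attribute: ' + attrName);
  }
  // Drop the "data-" prefix, then turn "-x" into "X".
  return lower.slice(5).replace(/-([a-z])/g, function (whole, letter) {
    return letter.toUpperCase();
  });
}

console.log(dataAttrToDatasetKey('data-newNeId'));
console.log(dataAttrToDatasetKey('data-new-ne-id'));
```

The first call logs newneid and the second logs newNeId, which is why the hyphenated spelling is the one to use in markup if you want to read div.dataset.newNeId.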
Take a parent div, give it an id, and then you can rewrite the whole div tag, attributes included, using innerHTML.
document.getElementById('id of parent div').innerHTML = "<div data-dojo-type=\"dojox.data.XmlStore\" " +
"data-dojo-props=\"url:'http://135.250.70.162:8081/eqmWS/services/eq/Equipment/All/6204/2', label:'text'\" " +
"data-dojo-id=\"bookStore3\"></div>";
You can now append whatever values you want to the innerHTML string.
Here's some simple native JS code to do it:
var body = document.getElementsByTagName('body')[0];
var myDiv = document.createElement('div');
myDiv.setAttribute('id', 'myDiv');
var text = 'newNeId: ' + newNeId +
'<br/> newNeGroupId: ' + newNeGroupId;
body.appendChild(myDiv);
document.getElementById('myDiv').innerHTML = text;
I have this piece of HTML code.
<div class="tagWrapper">
<i style="background-image: url(https://fbcdn-photos-a.akamaihd.net/hphotos-ak-ash4/390945_10150419199065735_543370734_8636909_2105028019_a.jpg);"></i>
</div>
I need to get that url within the brackets. I tried using the getElementsByClassName() method but it didn't work. Since url is not a HTML element, I have no idea on how to take out the value. I can't use getElementById(), because I can't add an id to the HTML (it's not mine). It needs to work in Chrome and Firefox. Any suggestions?
You didn't add a jQuery tag, so here's a native solution (note that this likely won't work on older versions of IE, but you said it only has to work on Chrome and FF):
var origUrl = document.getElementsByClassName("tagWrapper")[0]
.children[0].style.backgroundImage;
var url = origUrl.substr(4, origUrl.length - 5);
Or
var url = origUrl.replace("url(", "").replace(")", "");
Here's a fiddle
EDIT
Answering your comment
document.getElementsByClassName("tagWrapper")
gets all elements with the class name tagWrapper. So to get the first one, you grab the zero index
document.getElementsByClassName("tagWrapper")[0]
Then you want the first child under there, and the backgroundImage property on this first child.
document.getElementsByClassName("tagWrapper")[0]
.children[0].style.backgroundImage;
From there it's a simple matter stripping the url( and ) from it
var url = origUrl.substr(4, origUrl.length - 5);
or
var url = origUrl.replace("url(", "").replace(")", "");
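Some browsers return the style value with quotes, as url("..."), in which case the fixed offsets above leave the quote characters in the result. A slightly more tolerant sketch is shown below; extractCssUrl is a hypothetical helper name, and it assumes the value holds a single url(...) and nothing else.

```javascript
// Hypothetical helper: pulls the address out of a CSS url(...) value,
// with or without surrounding single or double quotes.
function extractCssUrl(backgroundImage) {
  var match = /^url\(\s*(["']?)(.*?)\1\s*\)$/.exec(String(backgroundImage).trim());
  return match ? match[2] : null;
}

console.log(extractCssUrl('url("http://example.com/image.jpg")'));
console.log(extractCssUrl('url(http://example.com/image.jpg)'));
```

Both calls log http://example.com/image.jpg. In the page you would pass in the backgroundImage value read from the element's style.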
You can use querySelector():
Demo: http://jsfiddle.net/ThinkingStiff/gFy6R/
Script:
var url = document.querySelector( '.tagWrapper i' ).style.backgroundImage;
url = url.substr(4, url.length - 5);
If you were using jQuery you could do something like this:
$(".tagWrapper i").css("background-image")
I think it will be easier if you use jQuery.
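// Wrapped in a function (the name is only illustrative) so that `return` is valid
function getBackgroundImage() {
var w = document.getElementsByClassName('tagWrapper')[0];
for (var i=0; i<w.childNodes.length; i++)
if (w.childNodes[i].tagName && w.childNodes[i].tagName.toLowerCase() == 'i')
return w.childNodes[i].style.backgroundImage;
return null; // no <i> child found
}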
<div class="tagWrapper">
<i id="something" style="background-image: url(https://fbcdn-photos-a.akamaihd.net/hphotos-ak-ash4/390945_10150419199065735_543370734_8636909_2105028019_a.jpg);"></i>
</div>
// script / without jQuery
var url = document.getElementById('something').style.backgroundImage.match(/\((.*?)\)/)[1];
Use jQuery!!!
$("div.tagWrapper i").css("background-image").substr(4, $("div.tagWrapper i").css("background-image").length-5)
Example
If you don't have to care about Microsoft browsers, the raw JavaScript is quite easy. You can use getElementsByClassName and getElementsByTagName, but querySelectorAll is easier. I've included both. Using a regular expression preserves relative links.
<!DOCTYPE html>
<html>
<head>
<title>Test</title>
<script type='text/javascript'>
var do_find_a = function() {
var tmp = document.getElementsByClassName('tagWrapper')[0];
var tst = tmp.getElementsByTagName('i')[0].getAttribute('style');
return do_alert(tst);
}
var do_find_b = function() {
var tst = document.querySelectorAll('.tagWrapper i')[0].getAttribute('style');
return do_alert(tst);
}
var do_alert = function(tst) {
var reg = /background-image:\s*url\(["']?([^'"]*)["']?\);?/
var ret = reg.exec(tst);
alert (ret[1]);
return;
}
document.addEventListener('DOMContentLoaded',do_find_a,false);
document.addEventListener('DOMContentLoaded',do_find_b,false);
</script>
</head>
<body>
<div class='tagWrapper'>
<i style='background-image: url("http://example.com/image.jpg");'></i>
</div>
Text to ignore.
</body>
</html>
And jsFiddle version:
http://jsfiddle.net/hpgmr/