Create XML DOM Element while keeping case sensitivity - javascript

I'm trying to create the following element nodetree:
<v:custProps>
<v:cp v:nameU="Cost"/>
</v:custProps>
with:
newCustprop = document.createElement("v:custProps");
newcp = document.createElement("v:cp");
newcp.setAttribute("v:nameU", "Cost");
newCustprop.appendChild(newcp);
However, document.createElement("v:custProps") generates <v:custprops> as opposed to <v:custProps>. Is there any way to prevent this lowercasing?
Edit 1:
I'm currently reading this article on nodeName case sensitivity. It's not really relevant to my problem, though, because my code is kept unparsed inside <![CDATA[ ]]> and I'd rather not use .innerHTML.

You need to use createElementNS()/setAttributeNS() and provide the namespace URI, not only the alias/prefix. The example uses urn:v as the namespace.
var xmlns_v = "urn:v";
var newCustprop = document.createElementNS(xmlns_v, "v:custProps");
var newcp = document.createElementNS(xmlns_v, "v:cp");
newcp.setAttributeNS(xmlns_v, "v:nameU", "Cost");
newCustprop.appendChild(newcp);
var xml = (new XMLSerializer).serializeToString(newCustprop);
xml:
<v:custProps xmlns:v="urn:v"><v:cp v:nameU="Cost"/></v:custProps>

It's not recommended to use document.createElement for qualified names. See if the document.createElementNS can better serve your purposes.

I still had issues where createElementNS would attach an "xmlns" attribute to my string when using new XMLSerializer().serializeToString(xmlDoc).
I ended up using the following function to create elements with case sensitive tag names:
function createElement(tagName) {
const doc = new DOMParser().parseFromString(`<${tagName}></${tagName}>`, 'text/xml')
return doc.children[0]
}

Related

javascript: save characters/digits without libraries

I use a lot of the following expressions in my code:
document.
.getElementsBy...
.querySelector...
I need to save characters without using any libraries. That can be done by
var d = document;
Then, instead of document. I can write d. now.
I am wondering if there is a simple way to do the same thing for methods
.getElementsBy... and .querySelector....
Since these have a variable term, I cannot put the entire thing into a
variable, like var q = .querySelector(".class"), because the .class
changes almost every time.
If you don't want to add shortcut properties to the document object, you can create small wrapper functions instead.
function gEBI(d,id)
{
return d.getElementById(id);
}
function qS(d,s)
{
return d.querySelector(s);
}
var d = document;
var ele1 = gEBI(d,"yourID");
var ele2 = qS(d,".class");
You can also create your own shortcut references manually.
document.gEBI = document.getElementById;
document.gEBI(id);
But it's not a good practice to make such shortcuts.
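If short names are still wanted without touching the document object, binding the methods is another option. Assigning document.querySelector to a plain variable and calling it loses the this value and fails, so the sketch below binds each method first. The helper name makeShortcuts and the alias names byId, qs and qsa are made up for illustration.

```javascript
// Hypothetical helper: returns short, pre-bound aliases for a document-like object.
// bind() keeps `this` pointing at the document, which the DOM methods require.
function makeShortcuts(doc) {
  return {
    byId: doc.getElementById.bind(doc),
    qs: doc.querySelector.bind(doc),
    qsa: doc.querySelectorAll.bind(doc),
  };
}

// Browser usage (not executed here):
// const { byId, qs, qsa } = makeShortcuts(document);
// const ele1 = byId("yourID");
// const ele2 = qs(".class");
```

Because the selector stays a normal argument, the changing ".class" part is no obstacle.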

Parsing Html from string into document

I'm trying to parse HTML code from a string into a document and then append its nodes one at a time to the real DOM.
After some research I came across this API:
DOMImplementation.createHTMLDocument()
which works great, but if a node has descendants, I first need to append the node itself, and only once it is in the real DOM should I start appending its descendants. I think I can use
document.importNode(externalNode, deep);
with the deep value set to false in order to copy only the parent node.
Is my approach good for this case, and how should I keep track of the order of appended nodes so that I don't append the same node twice?
One more problem: in some cases I need to add more HTML code at a specific location (for example, after some div node) and then continue appending. Any idea how to do that correctly?
You can use the DOMParser for that:
const parser = new DOMParser();
const doc = parser.parseFromString('<h1>Hello World</h1>', 'text/html');
document.body.append(...doc.body.childNodes); // move the parsed nodes, not the whole <html> element
But if you want to append the same thing multiple times, you will get better performance using a template:
const template = document.createElement('template');
template.innerHTML = '<h1>Hello World</h1>';
// Clone the template's content fragment; the template itself stays reusable.
const instance = template.content.cloneNode(true);
document.body.appendChild(instance);
const instance2 = template.content.cloneNode(true);
document.body.appendChild(instance2);
Hope this helps

How to remove XML namespaces using Javascript?

I am finding that, for my purposes, XML namespaces are simply causing much headache and are completely unnecessary. (For example, how they complicate xpath.)
Is there a simple way to remove namespaces entirely from an XML document?
(There is a related question, but it deals with removing namespace prefixes on tags, rather than namespace declarations from the document root: "Easy way to drop XML namespaces with javascript".)
Edit: Samples and more detail below:
XML:
<?xml version="1.0" ?>
<main xmlns="example.com">
<primary>
<enabled>true</enabled>
</primary>
<secondary>
<enabled>false</enabled>
</secondary>
</main>
JavaScript:
function useHttpResponse()
{
if (http.readyState == 4)
{
if(http.status == 200)
{
var xml = http.responseXML;
var evalue = getXMLValueByPath('/main/secondary/enabled', xml);
alert(evalue);
}
}
}
function getXMLValueByPath(nodepath, xml)
{
var result = xml.evaluate(nodepath, xml, null, XPathResult.STRING_TYPE, null).stringValue;
return result;
}
The sample XML is just like the actual one I am working with, albeit much shorter. Notice that there are no prefixes on the tags for the namespace. I assume this is the null or default namespace.
The JavaScript is a snippet from my ajax functions. If I remove the xmlns="example.com" portion from the main tag, I am able to successfully get the value. As long as any namespace is present, the value becomes undefined.
Edit 2:
It may be worth mentioning that none of the declared namespaces are actually used in the XML tags (like the sample above). In the actual XML file I am working with, three namespaces are declared, but no tags are prefixed with a namespace reference. Thus, perhaps the question should be re-titled, "How to remove unused XML namespaces using Javascript?" I do not see the reason to retain a namespace if it is 1) never used and 2) complicating an otherwise simple path to a node using xpath.
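This should remove a default namespace declaration from each tag. Note that responseXML is a Document, not a string, so the replacement has to run on responseText and the result has to be parsed again:
var text = http.responseText.replace(/<([a-zA-Z0-9 ]+)(?:xml)ns=\"[^\"]*\"(.*?)>/g, "<$1$2>");
var xml = new DOMParser().parseFromString(text, "text/xml");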
To remove all the default xmlns attributes from an XML string in JavaScript,
you can try the following regex:
xmlns=\"(.*?)\"
NB: The same pattern can be adapted to remove any other attribute by changing the attribute name.
var str = `<?xml version="1.0" ?>
<main xmlns="example.com">
<primary>
<enabled>true</enabled>
</primary>
<secondary>
<enabled>false</enabled>
</secondary>
</main>`;
str = str.replace(/xmlns=\"(.*?)\"/g, '');
console.log(str)
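The pattern above only covers the unprefixed form with double quotes. A slightly broader sketch, assuming the declarations are ordinary attributes, also handles xmlns:prefix="..." and single-quoted values and removes the leading whitespace. The function name stripNamespaceDeclarations is made up for illustration.

```javascript
// Hypothetical helper: removes xmlns="..." and xmlns:prefix="..." declarations
// from XML text. It works on the raw string, so it would also touch matching
// text inside comments or CDATA sections.
function stripNamespaceDeclarations(xmlText) {
  return xmlText.replace(/\s+xmlns(?::[\w.-]+)?\s*=\s*(?:"[^"]*"|'[^']*')/g, "");
}

const cleaned = stripNamespaceDeclarations(
  `<main xmlns="example.com" xmlns:a='urn:a'><enabled>true</enabled></main>`
);
console.log(cleaned); // <main><enabled>true</enabled></main>
```

This only drops the declarations. If any element or attribute still uses a prefix, the stripped text will no longer parse as namespace-aware XML, so it fits the "declared but unused" case described in the question.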
Approach without using regex (this removes attributes also). The copy is built in a plain XML document, because createElement on an HTML page's document would lowercase the tag names and put them in the XHTML namespace:
let xml = ''; // input XML string
let doc = new DOMParser().parseFromString(xml, "text/xml");
let newdoc = new Document(); // XML document: createElement keeps case and uses no namespace
newdoc.appendChild(removeNameSpace(doc.documentElement, newdoc));
// Recursively copies elements by local name, plus their text nodes.
function removeNameSpace(root, target) {
  let parentElement = target.createElement(root.localName);
  for (let node of root.childNodes) {
    if (node.nodeType === 1) { // element
      parentElement.append(removeNameSpace(node, target));
    } else if (node.nodeType === 3) { // text
      parentElement.append(target.createTextNode(node.nodeValue));
    }
  }
  return parentElement;
}
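The XPath problem from the question can also be sidestepped without changing the document: XPath 1.0 lets each step match on local-name(), which ignores the namespace. A minimal sketch, assuming the path is a simple absolute path made of plain element names (no predicates, wildcards or attributes); toLocalNamePath is a made-up helper name.

```javascript
// Hypothetical helper: rewrites "/main/secondary/enabled" so that every step
// matches by local name only, regardless of the element's namespace.
function toLocalNamePath(path) {
  const steps = path
    .split("/")
    .filter(Boolean) // drop the empty piece before the leading slash
    .map((name) => `*[local-name()='${name}']`);
  return "/" + steps.join("/");
}

console.log(toLocalNamePath("/main/secondary/enabled"));
// /*[local-name()='main']/*[local-name()='secondary']/*[local-name()='enabled']

// Browser usage with the question's function (not executed here):
// xml.evaluate(toLocalNamePath(nodepath), xml, null, XPathResult.STRING_TYPE, null).stringValue;
```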

construct a DOM tree from a string without loading resources (specifically images)

So I am grabbing RSS feeds via AJAX. After processing them, I have a html string that I want to manipulate using various jQuery functionality. In order to do this, I need a tree of DOM nodes.
I can pass an HTML string into the jQuery() function.
I can add it as innerHTML to some hidden node and use that.
I have even tried using mozilla's nonstandard range.createContextualFragment().
The problem with all of these solutions is that when my HTML snippet has an <img> tag, firefox dutifully fetches whatever image is referenced. Since this processing is background stuff that isn't being displayed to the user, I'd like to just get a DOM tree without the browser loading all the images contained in it.
Is this possible with javascript? I don't mind if it's mozilla-only, as I'm already using javascript 1.7 features (which seem to be mozilla-only for now)
The answer is this:
var parser = new DOMParser();
var htmlDoc = parser.parseFromString(htmlString, "text/html");
var jdoc = $(htmlDoc);
console.log(jdoc.find('img'));
If you pay attention to your web requests you'll notice that none are made even though the html string is parsed and wrapped by jquery.
The obvious answer is to parse the string and remove the src attributes from img tags (and similar for other external resources you don't want to load). But you'll have already thought of that and I'm sure you're looking for something less troublesome. I'm also assuming you've already tried removing the src attribute after having jquery parse the string but before appending it to the document, and found that the images are still being requested.
I'm not coming up with anything else, but you may not need to do full parsing; this replacement should do it in Firefox with some caveats:
thestring = thestring.replace(/<img /g, "<img src='' "); // the g flag is needed to cover every image, not just the first
The caveats:
This appears to work in the current Firefox. That doesn't mean that subsequent versions won't choose to handle duplicated src attributes differently.
This assumes the literal string "<img " only occurs where an image tag actually starts. As a general-purpose assumption that is unsafe: the string could appear in an attribute value on a sufficiently...interesting...page, especially in an inline onclick handler like this: <a href='#' onclick='$("frog").html("<img src=\"spinner.gif\">")'> (Although in that example, the false positive replacement is harmless.)
This is obviously a hack, but in a limited environment with reasonably well-known data...
You can use the DOM parser to manipulate the nodes.
Just replace the src attributes, store their original values and add them back later on.
Sample:
(function () {
var s = "<img src='http://www.google.com/logos/olympics10-skijump-hp.png' /><img src='http://www.google.com/logos/olympics10-skijump-hp.png' />";
var parser = new DOMParser();
var dom = parser.parseFromString("<div id='mydiv' >" + s + "</div>", "text/xml");
var imgs = dom.getElementsByTagName("img");
var stored = [];
for (var i = 0; i < imgs.length; i++) {
var img = imgs[i];
stored.push(img.getAttribute("src"));
img.setAttribute("myindex", i);
img.removeAttribute("src"); // setAttribute("src", null) would set the literal string "null"
}
$(document.body).append(new XMLSerializer().serializeToString(dom));
alert("Images appended");
window.setTimeout(function () {
alert("loading images");
$("#mydiv img").each(function () {
this.src = stored[$(this).attr("myindex")];
})
alert("images loaded");
}, 2000);
})();

Is there a getElementsByTagName() like function for javascript string variables?

I can use the getElementsByTagName() function to get a collection of elements from an element in a web page.
I would like to be able to use a similar function on the contents of a javascript string variable instead of the contents of a DOM element.
How do I do this?
EDIT
I can do this by creating an element on the fly.
var myElement = new Element('div');
myElement.innerHTML = "<strong>hello</strong><em>there</em><strong>hot stuff</strong>";
var emCollection = myElement.getElementsByTagName('em');
alert(emCollection.length); // This gives 1
But creating an element on the fly for the convenience of using the getElementsByTagName() function just doesn't seem right and doesn't work with elements in Internet Explorer.
Injecting the string into DOM, as you have shown, is the easiest, most reliable way to do this. If you operate on a string, you will have to take into account all the possible escaping scenarios that would make something that looks like a tag not actually be a tag.
For example, you could have
<button value="<em>"/>
<button value="</em>"/>
in your markup - if you treat it as a string, you may think you have an <em> tag in there, but in actuality, you only have two button tags.
By injecting into the DOM via innerHTML you are taking advantage of the browser's built-in HTML parser, which is pretty darn fast. Doing the same via regular expressions would be a pain, and browsers don't generally provide DOM-like functionality for finding elements within strings.
One other thing you could try would be parsing the string as XML, but I suspect this would be more troublesome and slower than the DOM injection method.
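To make the escaping problem concrete, here is a small sketch of what a string-based count does with the button markup above. The function naiveCountTags is a made-up name for a simple regex counter and is not meant as a recommended approach.

```javascript
// Hypothetical naive counter: looks for "<tagName" followed by whitespace or ">".
// It assumes tagName contains no regex metacharacters.
function naiveCountTags(html, tagName) {
  const matches = html.match(new RegExp("<" + tagName + "[\\s>]", "ig"));
  return matches ? matches.length : 0;
}

const markup = '<button value="<em>"/><button value="</em>"/>';
console.log(naiveCountTags(markup, "em")); // 1, although the markup contains no <em> element
```

A real parser treats the attribute values as plain text, which is why the DOM route reports zero <em> elements for the same input.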
function countTags(html, tagName) {
var matches = html.match(new RegExp("<" + tagName + "[\\s>]", "ig"));
return matches ? matches.length : 0;
}
alert(
countTags(
"<strong>hello</strong><em>there</em><strong>hot stuff</strong>",
"em"
)
); // 1
var domParser = new DOMParser();
var htmlString = "<strong>hello</strong><em>there</em><strong>hot stuff</strong>";
var docElement = domParser.parseFromString(htmlString, "text/html").documentElement;
var emCollection = docElement.getElementsByTagName("em");
for (var i = 0; i < emCollection.length; i++) {
console.log(emCollection[i]);
}
HTML in a string is nothing special. It's just text in a string. It needs to be parsed into a tree for it to be useful. This is why you need to create an element, then call getElementsByTagName on it, as you show in your example.
