I am trying to parse a large XML file using JavaScript. Looking online, it seems that the easiest way to start is to use the browser's DOM parser. This works, and I can get elements by ID. I can also get the "class" attribute for those elements, and it returns what I would expect. However, I don't appear to be able to get elements by class.
The following was tried in the latest Chrome:
xmlString = '<?xml version="1.0"?>';
xmlString = xmlString + '<example class="test" id="example">content</example>'
parser = new DOMParser();
xmlDoc = parser.parseFromString(xmlString,"text/xml");
xmlDoc.getElementById("example");
// returns the example element (good)
xmlDoc.getElementById("example").getAttribute("class");
// returns "test" (good)
xmlDoc.getElementsByClassName("test");
// returns [] (bad)
Any ideas?
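This should get all elements of a given class, assuming that the tag name is consistent. XML tag names are case-sensitive, so the name must match the document exactly.
var elements = xmlDoc.getElementsByTagName('example');
var classArray = [];
for (var i = 0; i < elements.length; i++) {
// getAttribute works on XML elements, whereas className may be missing in older browsers
if (elements[i].getAttribute('class') === 'test') {
classArray.push(elements[i]);
}
}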
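If the tag name is not consistent, the same idea can be applied to every element in the document. The following is only a sketch: elementsWithClass is a hypothetical helper name, not a DOM API, and it assumes the class attribute holds a whitespace-separated list of names, as it does in HTML.

```javascript
// Hypothetical helper: collect every element whose class attribute
// contains the given class name, whatever its tag name is.
function elementsWithClass(root, className) {
  const matches = [];
  const all = root.getElementsByTagName("*"); // "*" matches every element
  for (let i = 0; i < all.length; i++) {
    const attr = all[i].getAttribute("class");
    // Treat the attribute as a whitespace-separated list of names.
    if (attr !== null && attr.trim().split(/\s+/).indexOf(className) !== -1) {
      matches.push(all[i]);
    }
  }
  return matches;
}

// Browser-only usage; DOMParser is not available in every JavaScript runtime.
if (typeof DOMParser !== "undefined") {
  const xmlDoc = new DOMParser().parseFromString(
    '<example class="test" id="example">content</example>',
    "text/xml"
  );
  console.log(elementsWithClass(xmlDoc, "test").length);
}
```

Because the helper only relies on getElementsByTagName and getAttribute, it should behave the same for XML and HTML documents.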
You can use jQuery to parse the XML and then query it with a class selector. http://jquery.com
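Changing the parser type from XML to HTML should work. Keep in mind that the HTML parser wraps the content in html, head, and body elements, so the resulting tree differs from the XML one.
parser = new DOMParser();
xmlDoc = parser.parseFromString(xmlString, "text/html");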
Related
var html = '<p>sup</p>'
I want to run document.querySelectorAll('p') on that text without inserting it into the dom.
In jQuery you can do $(html).find('p')
If that's not possible, what's the cleanest way to do a temporary insert that doesn't interfere with anything, query only that element, and then remove it?
(I'm doing ajax requests and trying to parse the returned html)
With IE 10 and above, you can use the DOMParser object to parse an HTML string directly into a DOM document.
var parser = new DOMParser();
var doc = parser.parseFromString(html, "text/html");
var paragraphs = doc.querySelectorAll('p');
You can create a temporary element, append the HTML to it, and run querySelectorAll on it:
var element = document.createElement('div');
element.insertAdjacentHTML('beforeend', '<p>sup</p>');
element.querySelectorAll('p');
Let's say I have the following string:
var myString = "<p>hello</p><script>console.log('hello')</script><h1>Test</h1><script>console.log('world')</script>"
I would like to use split to get an array with the contents of the script tags. e.g. I want my output to be:
["console.log('hello')", "console.log('world')"]
I tried myString.split(/[<script></script>]/), but did not get the expected output.
Any help is appreciated.
You can't parse (X)HTML with regex.
Instead, you can parse it using innerHTML.
var element = document.createElement('div');
element.innerHTML = myString; // Parse HTML properly (but unsafely)
However, this is not safe. Even if innerHTML doesn't run the JS inside script elements, malicious strings can still run arbitrary JS, e.g. with <img src="//" onerror="alert()">.
To avoid that problem, you can use DOMImplementation.createHTMLDocument to create a new document, which can be used as a sandbox.
var doc = document.implementation.createHTMLDocument(); // Sandbox
doc.body.innerHTML = myString; // Parse HTML properly
Alternatively, new browsers support DOMParser:
var doc = new DOMParser().parseFromString(myString, 'text/html');
Once the HTML string has been parsed to the DOM, you can use DOM methods like getElementsByTagName or querySelectorAll to get all the script elements.
var scriptElements = doc.getElementsByTagName('script');
Finally, [].map can be used to obtain an array with the textContent of each script element.
var arrayScriptContents = [].map.call(scriptElements, function(el) {
return el.textContent;
});
The full code would be
var doc = document.implementation.createHTMLDocument(); // Sandbox
doc.body.innerHTML = myString; // Parse HTML properly
[].map.call(doc.getElementsByTagName('script'), function(el) {
return el.textContent;
});
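JavaScript code:
function myFunction() {
var str = "<p>hello</p><script>console.log('hello')</script><h1>Test</h1><script>console.log('world')</script>";
// With the g flag, match() returns the whole matches, tags included,
// so read capture group 1 from matchAll() to get only the contents.
var re = /<script\b[^>]*>([\s\S]*?)<\/script>/g;
var contents = Array.from(str.matchAll(re), function (m) { return m[1]; });
console.log(contents);
}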
You have to escape the forward slash like so: \/. Note that the capturing group makes split keep the matched tags in the resulting array as well.
myString.split(/(<script>|<\/script>)/)
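For the exact string in the question, split can also be made to work without a capturing group. This is a minimal sketch that assumes every tag is written as a plain <script> with no attributes and that the tags are balanced; for arbitrary HTML, parsing into a DOM remains the more robust route. As a side note, /[<script></script>]/ from the question is a character class, so it splits on any single one of those characters rather than on the tags.

```javascript
const myString = "<p>hello</p><script>console.log('hello')</script><h1>Test</h1><script>console.log('world')</script>";

// Without a capturing group, split() drops the delimiters themselves.
const pieces = myString.split(/<script>|<\/script>/);

// The pieces alternate between text outside and text inside the script tags,
// so the odd indexes hold the script contents.
const scriptContents = pieces.filter(function (piece, index) {
  return index % 2 === 1;
});

console.log(scriptContents);
```

The odd-index trick breaks as soon as a tag carries attributes or is left unclosed, which is why it is only suitable for input whose shape is known in advance.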
I am trying to parse the below xml data by traversing through each node.
<example>
<name>BlueWhale</name>
<addr>101 Yesler Way, Suite 402</addr>
<city>Seattle</city>
<state>Washington</state>
</example>
Now I want to visit each node without using getElementsByTagName and print each nodeName and nodeValue in JavaScript, using things like the root element, firstChild, and nextSibling, which I am not sure how to use.
I am trying it in the following manner:
var txt = " <example> <name>BlueWhale</name> <addr>101 Yesler Way, Suite 402</addr> <city>Seattle</city> <state>Washington</state> </example> "
var domParser = new DOMParser();
var xml = domParser.parseFromString(txt, "text/xml");
var el = xml.documentElement.nodeName;
console.log(el);
and then print each node's name and value.
Could anyone please help.
If your XML is stored in a string variable, you can use jQuery.
var xml = "<example>...";
$(xml).children().each(function() {
var tagName = this.tagName;
var text = $(this).text(); // text content of the child element
});
You should consider using a library that does this for you rather than doing it by hand. You can find one of the commonly used ones here.
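Without a library, the traversal asked about in the question can be sketched with firstChild and nextSibling alone. The function name walk and its visit callback are made up for this illustration; the DOMParser part only runs where that API exists, such as a browser.

```javascript
// Visit a node and all of its descendants in document order,
// using only firstChild and nextSibling.
function walk(node, visit, depth) {
  depth = depth || 0;
  visit(node, depth);
  for (let child = node.firstChild; child; child = child.nextSibling) {
    walk(child, visit, depth + 1);
  }
}

// Browser-only usage; DOMParser is not available in every JavaScript runtime.
if (typeof DOMParser !== "undefined") {
  const txt = "<example><name>BlueWhale</name><city>Seattle</city></example>";
  const xml = new DOMParser().parseFromString(txt, "text/xml");
  walk(xml.documentElement, function (node, depth) {
    // Element nodes have a null nodeValue; their text lives in child text nodes.
    console.log("  ".repeat(depth) + node.nodeName + ": " + node.nodeValue);
  });
}
```

If the XML contains whitespace between elements, as in the question, the walk also visits the resulting #text nodes, so a real version would probably want to skip text nodes that are empty after trimming.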
I have the following code:
var xmlString = ajaxRequest.responseText.toString();
parser = new DOMParser()
doc = parser.parseFromString(xmlString, "text/xml");
The response text is a complete HTML document. After I create the XMLDocument (doc), I want to go over each node, manipulate some stuff and print it.
How can I iterate over the XMLDocument? I want to visit each one of its nodes.
Thanks!
A small example, if you want to get all links from this XML and print their text:
var links = doc.documentElement.getElementsByTagName("a");
for (var i = 0; i < links.length; i++) {
// textContent is safe even when a link has no child nodes, where firstChild would be null
var txt = links[i].textContent;
document.write(txt + '<br>');
}
I'm almost sure this is correct, but I didn't have time to test it.
You may read these articles to go deeper:
getElementsByTagName
nodeName
NodeList
Hope this helps.
Best regards!
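One thing worth checking in this situation: when the text is not well-formed XML, which is common for HTML pages, parseFromString with "text/xml" does not throw but returns a document containing a parsererror element. A small guard, sketched below with the hypothetical helper name firstParseError, can detect that before iterating.

```javascript
// Hypothetical helper: return the parser's error message,
// or null when the document parsed cleanly.
function firstParseError(doc) {
  const errors = doc.getElementsByTagName("parsererror");
  return errors.length > 0 ? errors[0].textContent : null;
}

// Browser-only usage; DOMParser is not available in every JavaScript runtime.
if (typeof DOMParser !== "undefined") {
  const doc = new DOMParser().parseFromString("<p>unclosed", "text/xml");
  const problem = firstParseError(doc);
  if (problem !== null) {
    console.log("Not well-formed XML: " + problem);
  }
}
```

When the guard reports a problem for an HTML response, parsing with "text/html" instead is likely the better fit, since that parser tolerates markup that is not well-formed XML.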
I'm developing a Windows 8 Metro App using JavaScript. I need to take a string of HTML and select elements from it as I would with the DOM.
How can I do that?
Example:
var html = data.responseText; // data.responseText is a string of HTML received from the xhr function.
// Now I need to extract an element from the string like document.getElementById("some_element")...
Thanks!
UPDATE:
I solved it!
var parser = new DOMParser();
var xml = parser.parseFromString(data.responseText, "text/html"); // the MIME type argument is required
I think your approach to the problem isn't the best; you could return JSON or XML instead. But if you need to do it that way:
To my knowledge, you won't be able to use getElementById without inserting the new element into the document (for example, with document.body.appendChild(div) in the example below), but you could do this (div.querySelector("#rawr") also works on a detached element in browsers that support it):
var div = document.createElement("div");
div.innerHTML = '<span id="rawr"></span>'; // here you would put data.responseText
var elements = div.getElementsByTagName("span"); // [<span id="rawr"></span>]
// from there you can check elements[0].id === "rawr" or whatever you like