jQuery navigating XML parent child nodes and selecting appropriate attributes from each?

I am creating a templating system that can be interpreted client-side with JavaScript to construct a fill-in-the-blanks form, e.g. for a letter to a customer.
I have the template constructed and the logic set out in pseudocode; however, I am unfamiliar with jQuery and could use some direction to get started.
The basic idea is that a piece of markup in my text node denotes a field, e.g. ${Prologue}. Each field is added to an array called "fields", which is then used to search for the corresponding node names in the XML.
XML
<?xml version="1.0" encoding="UTF-8"?>
<message>
<text>${Prologue} - Dear ${Title} ${Surname}. This is a message from FUBAR. An engineer called but was unable to gain access, a new appointment has been made for ${ProductName} with order number ${VOLNumber}, on ${AppointmentDate} between ${AppointmentSlot}.
Please ensure you are available at your premises for the engineer. If this is not convenient, go to fubar.com or call 124125121515 before 12:00 noon the day before your appointment. Please refer to your order confirmation for details on what will happen on the day. ${Epilogue} - Free text field for advisor input
</text>
<inputTypes>
<textBox type="text" fixed="n" size="100" alt="Enter a value">
<Prologue size="200" value="BT ENG Appt Reschedule 254159" alt="Prologue field"></Prologue>
<Surname value="Hoskins"></Surname>
<ProductName value=""></ProductName>
<VOLNumber size="8" value="" ></VOLNumber>
<Epilogue value=""></Epilogue>
</textBox>
<date type="datePicker" fixed="n" size="8" alt="Select a suitable appointment date">
<AppointmentDate></AppointmentDate>
</date>
<select type="select" >
<Title alt="Select the customers title">
<values>
<Mr selected="true">Mr</Mr>
<Miss>Miss</Miss>
<Mrs>Mrs</Mrs>
<Dr>Dr</Dr>
<Sir>Sir</Sir>
</values>
</Title>
<AppointmentSlot alt="Select the appointment slot">
<values>
<Morning>9:30am - 12:00pm</Morning>
<Afternoon>1:00pm - 5:00pm</Afternoon>
<Evening>6:00pm - 9:00pm</Evening>
</values>
</AppointmentSlot>
</select>
</inputTypes>
</message>
Pseudocode
Get list of tags from text node and build array called "fields"
For each item in "fields" array:
Find node in xml that equals array item's name
Get attributes of that node
Jump to parent node
Get attributes of parent node
If attributes of parent node != child node then ignore
Else add the parent attributes to the result
Build html for field using all the data gathered from above
Addendums
Is this logic OK? Is it possible to start at the parent of the node and navigate downwards instead?
Also, with regard to inheritance, could we get the parent attributes and, if the child attributes are different, add them to the result? What if the number of attributes in the parent does not equal the number in the child?
Please do not provide fully coded solutions, just a few teasers to get me started.
Here is what I have so far, which extracts the tags from the text node:
// get the value of the "text" node in the XML
var theLetter = $(xml).find("text").text();
var start = theLetter.indexOf('${');
var end = theLetter.indexOf('}', start);
var tag = "";
var tagArray = [];        // field names, e.g. "Prologue"
var tagReplaceArray = []; // full placeholders, e.g. "${Prologue}"
var inputType;
// find all tags and add them to the tag arrays
while (start >= 0)
{
    //console.log("Reach In Loop " + start)
    tag = theLetter.slice(start + 2, end);
    tagArray.push(tag);
    tagReplaceArray.push(theLetter.slice(start, end + 1));
    // look for the next placeholder and its closing brace
    start = theLetter.indexOf('${', start + 1);
    end = theLetter.indexOf('}', start);
}
Any other recommendations or links to similar problems would be welcome.
Thank you!

I am using a similar technique to do html templating.
Instead of working with elements, I find it easier to work with a string and then convert it to html. In your case with jQuery, you could do something similar:
Have your xml as a string:
var xmlString='<?xml version="1.0" encoding="UTF-8"?><message><text>${Prologue} - Dear ${Title} ${Surname}... ';
Iterate through the string to do the replacements with a regex ($1 is the captured placeholder name, for example Surname). Note that the $ and { must be escaped:
xmlString.replace(/\$\{([^}]+)\}/g, function($0, $1) { ... })
Convert to nodes if needed:
var xml=$(xmlString);
The benefits of the regex:
faster (just a string, you're not walking the DOM)
global replace (for example if Surname appears several times), just loop through your object properties once
a simple regex, /\$\{([^}]+)\}/, targets the placeholder
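The string approach above can be sketched as follows. This is only an illustration under stated assumptions: the `values` object and the `fillTemplate` helper are hypothetical names, and placeholders without a matching value are simply left in place.

```javascript
// Hypothetical field values; in practice these would come from the form or the XML.
var values = { Title: "Mr", Surname: "Hoskins" };

// Replace every ${Name} placeholder; unknown names are left untouched.
function fillTemplate(template, values) {
  return template.replace(/\$\{([^}]+)\}/g, function (match, name) {
    return Object.prototype.hasOwnProperty.call(values, name) ? values[name] : match;
  });
}

var result = fillTemplate("Dear ${Title} ${Surname}, order ${VOLNumber}", values);
console.log(result); // Dear Mr Hoskins, order ${VOLNumber}
```

Because the regex carries the `g` flag, a placeholder that appears several times is replaced everywhere in a single pass.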

Get list of tags from text node and build array called "fields"
To create the array I would rather use a regular expression; this is one of its best uses (in my opinion), because we are indeed searching for a pattern:
// txt holds the text of the "text" node
var reg = /\$\{(\w+)\}/gm;
var fields = [];
var m;
while ( (m = reg.exec(txt)) !== null)
{
    fields.push(m[1]); // m[1] is the captured field name
}
For each item in "fields" array
jQuery offers some utility functions :
To iterate through your fields you could do this : $.each(fields, function(index, value){});
Navigating through the nodes and retrieving the values
Just use the jQuery function like you are already doing.
Building the HTML
I would create template objects for each type you want to support (in this example: Text, Select).
Then using said templates you could replace the tokens with the HTML of your templates.
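As a rough sketch of the template-object idea: the `templates` and `fieldInfo` objects below are hypothetical, the field data is written by hand rather than read from the XML, and values are not HTML-escaped, which real code would need to do.

```javascript
// Hypothetical template objects, one per supported input type.
var templates = {
  text: function (name, attrs) {
    return '<input type="text" name="' + name + '" value="' + (attrs.value || "") + '">';
  },
  select: function (name, attrs) {
    var options = (attrs.options || []).map(function (label) {
      return "<option>" + label + "</option>";
    }).join("");
    return '<select name="' + name + '">' + options + "</select>";
  }
};

// Hypothetical field descriptions, as they might be gathered from the XML.
var fieldInfo = {
  Surname: { type: "text", value: "Hoskins" },
  Title: { type: "select", options: ["Mr", "Miss", "Mrs"] }
};

var txt = "Dear ${Title} ${Surname}.";
// Replace each token with the HTML produced by the template for its type.
var formHtml = txt.replace(/\$\{(\w+)\}/g, function (token, name) {
  var known = Object.prototype.hasOwnProperty.call(fieldInfo, name);
  return known ? templates[fieldInfo[name].type](name, fieldInfo[name]) : token;
});
console.log(formHtml);
```

Keeping one function per input type means a new type (for example a date picker) only needs one more entry in `templates`.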
Displaying the HTML
Last step would be to parse the result string and append it at the right place:
var ResultForm = $.parseHTML(txt);
$("#DisplayDiv").append(ResultForm);
Conclusion
As you asked, I did not prepare anything that works right out of the box; I hope it will help you prepare your own answer. (And then I hope you will share it with the community.)

This is just a framework to get you going, like you asked.
The first concept is using a regex to find all matches of ${ }. It returns an array like ["${one}","${t w 0 }","${ three}"].
The second concept is an htmlGenerators object mapping "inputTypes --> child name" to a function responsible for the HTML output.
The third is not to forget about plain JavaScript. .localName will give you the XML element's name, and node.attributes gives you a NamedNodeMap back (remember not to use plain DOM properties on the jQuery object itself; make sure you're referencing the node element jQuery found for you, e.g. $element[0]).
The actual flow is simple:
Find all the '${}' tokens and store the result in an array.
Find all the tokens in the XML document and, using their parents' info, store the HTML in a map of {"${one}":"<input type='text' .../>","${two}":"<select><option value='hello'>world!</option></select>" ...}
Iterate through the map and replace every token in the source text with the HTML you want.
javascript
var $xmlDoc = $(xml); //store the xml document (if xml is a string, $($.parseXML(xml)) keeps element names case-sensitive)
var tokenSource = $xmlDoc.find("message text").text();
var tokenizer = /\$\{[^}]+\}/g; //used to find replacement locations
//.attributes returns a NamedNodeMap, which $.extend() cannot merge directly, so copy it into a plain object first
function attributesToObject(attributes){
    var result = {};
    for(var i = 0; attributes && i < attributes.length; i++){
        result[attributes[i].name] = attributes[i].value;
    }
    return result;
}
var htmlGenerators = {
    "textBox":function(name,$elementParent){
        var parentAttributes = attributesToObject($elementParent[0].attributes);
        //this may be not enough null check work, but you get the idea
        var specificAttributes = attributesToObject($elementParent.find(name)[0].attributes);
        //extend or overwrite the contents of the first obj with contents from 2nd, then 3rd, ... then nth [$.extend()](http://api.jquery.com/jQuery.extend/)
        var combinedAttributes = $.extend({},parentAttributes,specificAttributes);
        //.attr() sets every key as a plain attribute
        return $("<input>").attr(combinedAttributes);
    },
    "date":function(name,$elementParent){
        //whatever you want to do for a 'date' text input
    },
    "select":function(name,$elementParent){
        //put in a default select box implementation, obviously you'll need to copy options attributes too in addition to their value / visible value.
    }
};
var html = {};
var tokens = tokenSource.match(tokenizer) || []; //pull out each ${elementKey}
for(var index = 0; index < tokens.length; index++){
    var elementKey = tokens[index].replace("${","").replace("}","");//chomp ${ and }
    var $elementParent = $xmlDoc.find(elementKey).parent();//we need parent attributes
    //.localName holds the element name of the xml node, in this case "textBox", "date" or "select"
    var parentName = $elementParent.length ? $elementParent[0].localName : null;
    var elementFunction = parentName ? htmlGenerators[parentName] : null; //lookup the html generator function
    if(elementFunction){ //make sure we found one
        html[tokens[index]] = elementFunction(elementKey,$elementParent);//store the result
    }
}
for(var token in html){
    //for every html result, replace its token (generators that are still stubs return undefined)
    if(html[token]){
        tokenSource = tokenSource.replace(token,html[token][0].outerHTML);
    }
}
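On the inheritance question, the merge step can be tried on its own with plain objects. This is a minimal sketch, assuming child attributes should override parent attributes of the same name; `mergeAttributes` is a hypothetical helper and the attribute values are copied by hand from the `<textBox>` and `<Prologue>` elements in the question.

```javascript
// Child attributes win over parent attributes with the same name;
// attributes present on only one side are kept as they are.
function mergeAttributes(parentAttrs, childAttrs) {
  return Object.assign({}, parentAttrs, childAttrs);
}

// Values copied by hand from <textBox> and <Prologue> in the question's XML.
var parentAttrs = { type: "text", fixed: "n", size: "100", alt: "Enter a value" };
var childAttrs = { size: "200", value: "BT ENG Appt Reschedule 254159", alt: "Prologue field" };

var merged = mergeAttributes(parentAttrs, childAttrs);
console.log(merged.size);  // 200
console.log(merged.type);  // text
console.log(merged.value); // BT ENG Appt Reschedule 254159
```

With this kind of merge it does not matter whether parent and child have the same number of attributes: the result is the union of both, with the child taking precedence on conflicts.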

Related

Google Scripts - keep track of element [duplicate]

Update: This is a better way of asking the following question.
Is there an Id like attribute for an Element in a Document which I can use to reach that element at a later time. Let's say I inserted a paragraph to a document as follows:
var myParagraph = 'This should be highlighted when user clicks a button';
body.insertParagraph(0, myParagraph);
Then the user inserts another one at the beginning manually (i.e. by typing or pasting). Now the childIndex of my paragraph changes from 0 to 1. I want to reach that paragraph at a later time and highlight it, but because of the insertion the childIndex is no longer valid. There is no Id-like attribute on the Element interface or on any type implementing it. CacheService and PropertiesService only accept String data, so I can't store myParagraph as an Object.
Do you guys have any idea to achieve what I want?
Thanks,
Old version of the same question (Optional Read):
Imagine that the user selects a word and presses the highlight button of my add-on. Then she does the same thing for several more words. Then she edits the document in a way that changes the start and end indexes of those highlighted words.
At this point she presses the remove highlighting button. My add-on should disable highlighting on all previously selected words. The problem is that I don't want to scan the entire document to find any highlighted text. I just want direct access to the words that were previously selected.
Is there a way to do that? I tried caching selected elements. But when I get them back from the cache, I get TypeError: Cannot find function insertText in object Text. error. It seems like the type of the object or something changes in between cache.put() and cache.get().
var elements = selection.getSelectedElements();
for (var i = 0; i < elements.length; ++i) {
if (elements[i].isPartial()) {
Logger.log('partial');
var element = elements[i].getElement().asText();
var cache = CacheService.getDocumentCache();
cache.put('element', element);
var startIndex = elements[i].getStartOffset();
var endIndex = elements[i].getEndOffsetInclusive();
}
// ...
}
When I get back the element I get TypeError: Cannot find function insertText in object Text. error.
var cache = CacheService.getDocumentCache();
cache.get('text').insertText(0, ':)');
I hope I have clearly explained what I want to achieve.

One direct way is to add a bookmark, which is not dependent on subsequent document changes. It has a disadvantage: a bookmark is visible for everyone...
More interesting way is to add a named range with a unique name. Sample code is below:
function setNamedParagraph() {
var doc = DocumentApp.getActiveDocument();
// Suppose you want to remember namely the third paragraph (currently)
var par = doc.getBody().getParagraphs()[2];
Logger.log(par.getText());
var rng = doc.newRange().addElement(par);
doc.addNamedRange("My Unique Paragraph", rng);
}
function getParagraphByName() {
var doc = DocumentApp.getActiveDocument();
var rng = doc.getNamedRanges("My Unique Paragraph")[0];
if (rng) {
var par = rng.getRange().getRangeElements()[0].getElement().asParagraph();
Logger.log(par.getText());
} else {
Logger.log("Deleted!");
}
}
The first function "marks" the third paragraph as a named range. The second one retrieves this paragraph by the range name regardless of subsequent document changes. Note that we need to handle the case where our "unique paragraph" has been deleted.

Not sure if cache is the best approach. Cache is volatile, so it might happen that the cached value doesn't exist anymore. Probably PropertiesService is a better choice.

Python Selenium Scraping Javascript - Element not found

I am trying to scrape the following Javascript frontend website to practise my Javascript scraping skills:
https://www.oplaadpalen.nl/laadpaal/112618
I am trying to find two different elements by their XPath. The first one is the title, which it does find. The second one is the actual text itself, which it somehow fails to find. It's strange, since I just copied the XPaths from the Chrome browser.
from selenium import webdriver
link = 'https://www.oplaadpalen.nl/laadpaal/112618'
driver = webdriver.PhantomJS()
driver.get(link)
#It could find the right element
xpath_attribute_title = '//*[@id="main-sidebar-container"]/div/div[1]/div[2]/div/div[' + str(3) + ']/label'
next_page_elem_title = driver.find_element_by_xpath(xpath_attribute_title)
print(next_page_elem_title.text)
#It fails to find the right element
xpath_attribute_value = '//*[@id="main-sidebar-container"]/div/div[1]/div[2]/div/div[' + str(3) + ']/text()'
next_page_elem_value = driver.find_element_by_xpath(xpath_attribute_value)
print(next_page_elem_value.text)
I have tried a couple of things: change "text()" into "text", "(text)", but none of them seem to work.
I have two questions:
Why doesn't it find the correct element?
What can we do to make it find the correct element?

Selenium's find_element_by_xpath() method returns the first element node matching the given XPath query, if any. However, XPath's text() function returns a text node—not the element node that contains it.
To extract the text using Selenium's finder methods, you'll need to find the containing element, then extract the text from the returned object.

Keeping your own logic intact, you can extract the labels and the associated values as follows:
for x in range(3, 8):
    label = driver.find_element_by_xpath("//div[@class='labels']//following::div[%s]/label" %x).get_attribute("innerHTML")
    value = driver.find_element_by_xpath("//div[@class='labels']//following::div[%s]" %x).get_attribute("innerHTML").split(">")[2]
    print("Label is %s and value is %s" % (label, value))
Console Output :
Label is Paalcode: and value is NewMotion 04001157
Label is Adres: and value is Deventerstraat 130
Label is pc/plaats: and value is 7321cd Apeldoorn

I would suggest a slightly different approach. I would grab the entire text and then split once on ":". That will get you the title and the value. The code below will get the Paalcode through openingstijden labels.
# find_elements (plural) returns a list that can be indexed
rows = driver.find_elements_by_css_selector("div.leftblock > div.labels > div")
for x in range(2, 8):
    s = rows[x].text
    t = s.split(":", 1)
    print(t[0]) # title
    print(t[1]) # value
You don't want to split more than once because Status contains more colons.

Going with @JeffC's approach, if you want to first select all those elements using XPath instead of a CSS selector, you may use this code:
xpath_title_value = "//div[@class='labels']//div[label[contains(text(),':')] and not(div) and not(contains(@class,'toolbox'))]"
title_and_value_elements = driver.find_elements_by_xpath(xpath_title_value)
Notice the plural elements in the find_elements_by_xpath method. The xpath above selects div elements that are descendants of a div element that had a class attribute of "labels". The nested label of each selected div must contain a colon. Furthermore, the div itself may not have a class of "toolbox" (Something that certain other divs on the page have), nor must it contain any additional nested divs.
Following which, you can extract the text within the individual div elements (which also contain the text from the nested label elements) and then split them using ":\n" which separates the title and value in the raw text string.
for element in title_and_value_elements:
    element = element.text
    title,value = element.split(":\n")
    print(title)
    print(value,"\n")

Since you want to practise your JS skills, you can also do this in JavaScript. The divs actually contain more data, as you can see if you paste this into the browser console:
var labels = document.querySelectorAll(".labels");
var divs = labels[0].querySelectorAll("div");
for (var div of divs) console.log(div.firstChild, div.textContent);
You can push the results to an array, keeping only the divs whose first child is a label, and return the resulting array into a Python variable:
labels_value_pair = driver.execute_script('''
var scrap = [];
var labels = document.querySelectorAll(".labels");
var divs = labels[0].querySelectorAll("div");
// a plain for loop also works in older JavaScript engines
for (var i = 0; i < divs.length; i++) {
    var first = divs[i].firstChild;
    if (first && first.tagName === "LABEL") scrap.push(first.textContent, divs[i].textContent);
}
return scrap;
''')

I am getting empty values for sports-title and third

I am new to JS.
Can you tell me why I am getting empty values for sports-title and third, even though we have one div with content in it?
sports-title---->{"0":{}}
third---->{}
My code is below.
findStringInsideDiv() {
/*
var str = document.getElementsByClassName("sports-title").innerHTML;
*/
var sportsTitle = document.getElementsByClassName("sports-title");
var third = sportsTitle[0];
var thirdHTML = third.innerHTML
//str = str.split(" ")[4];
console.log("sports-title---->" + JSON.stringify(sportsTitle));
console.log("third---->" + JSON.stringify(third));
console.log("thirdHTML---->" + JSON.stringify(thirdHTML));
if ( thirdHTML === " basketball football swimming " ) {
console.log("matching basketball---->");
var menu = document.querySelector('.sports');
menu.classList.add('sports-with-basketball');
// how to add this class name directly to the first div after body.
// but we are not rendering that div in accordion
//is it possible
}
else{
console.log("not matching");
}
}

When you select an object in the Document Object Model (DOM) using any of the getElement... methods, you get back an object that represents the HTML element. This object includes much more than just the text of the element, and its properties are not own enumerable properties, which is why JSON.stringify prints {} for it. To access the text of the element, use the .textContent property.
In addition, an HTML class can be assigned to several elements, so getElementsByClassName returns an array-like collection (an HTMLCollection) and you have to index into it, for example:
console.log("sports-title---->" + JSON.stringify(sportsTitle[0].textContent));
You can find a brief introduction to the DOM on the W3Schools website: https://www.w3schools.com/js/js_htmldom.asp It gives an overview of different aspects of the DOM, including elements.
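The output in the question can be reproduced without a browser. The sketch below uses hypothetical stand-ins (`FakeElement`, `fakeCollection`) rather than real DOM objects; it only assumes that, like DOM elements, they expose their data through prototype getters instead of own enumerable properties.

```javascript
// Stand-in for a DOM element: its text is exposed through a getter on the
// prototype, so the instance has no own enumerable properties.
class FakeElement {
  constructor(text) {
    Object.defineProperty(this, "_text", { value: text, enumerable: false });
  }
  get textContent() {
    return this._text;
  }
}

// Stand-in for the collection returned by getElementsByClassName:
// the indexed entry is enumerable, length is not.
var fakeCollection = { 0: new FakeElement(" basketball football swimming ") };
Object.defineProperty(fakeCollection, "length", { value: 1, enumerable: false });

console.log(JSON.stringify(fakeCollection));                // {"0":{}}
console.log(JSON.stringify(fakeCollection[0]));             // {}
console.log(JSON.stringify(fakeCollection[0].textContent)); // " basketball football swimming "
```

JSON.stringify only serializes own enumerable properties, so reading the specific property you need (here textContent) is what makes the text show up.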

Maybe this will be helpful.
As you can see, sportsTitle[0].textContent returns the full heading. The 0 is the index, which is why you get "0" when you stringify (serialize) sportsTitle. Why 0? Because you have one <h1> element. See this fiddle: http://jsfiddle.net/cqj6g7f0/3/
I added a second h1; check the console.log output and you will see two indexes, 0 and 1.
If you want to get a single word from the element's text, take a substring with the substr() method: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr
One way is to change the <h1> class attribute to an id, select it with getElementById, read sportsTitle.textContent, and use substr() on that string.
The second way is to keep the class attribute, read sportsTitle[0].textContent, and use substr() on that string.
The second is the better way.
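As an alternative to counting character positions, the text can be split into words. This is a small sketch, not part of the original answer; `wordsOf` is a hypothetical helper and the `heading` string stands in for the value of sportsTitle[0].textContent.

```javascript
// Hypothetical helper: split heading text into words, ignoring extra whitespace.
function wordsOf(text) {
  var trimmed = text.trim();
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}

var heading = " basketball football swimming "; // e.g. sportsTitle[0].textContent
var words = wordsOf(heading);
console.log(words[0]);                         // basketball
console.log(words.indexOf("football") !== -1); // true
```

Checking for a word this way is less brittle than comparing the whole string including its leading and trailing spaces.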

How can I find out whether a HTML-Tagname is Standalone

Is there a way to find out whether an HTML tag name comes in a pair or alone (standalone tag)?
E.g. <div></div>, <em></em>, <p></p>, ... come in pairs, but <br/>, <input>, <area> ... are standalone.
I need a function that checks whether an HTML code snippet is entered correctly. Therefore the function has to determine, among other things, which HTML elements are created with a standalone tag.
Do you have any idea how I can find out whether an HTML element is standalone, other than with something like this:
var myArray = [ list of Standalone-Tags ];
if(jQuery.inArray("test", myArray) != -1 ) { ... }
Thanks.

Browsers don't expose a built-in list of the elements that are defined as empty.
Your most reliable bet would be to create one manually by reading the HTML specification.
Alternatively, you could create an element and see what the browser returns when you convert it to HTML.
var element = prompt("What element name? e.g. br");
var container = document.createElement('div');
var content = document.createElement(element);
container.appendChild(content);
var reg = new RegExp("/" + element);
alert(reg.test(container.innerHTML) ? "Not Empty" : "Empty");
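The manual-list option mentioned above might look like the sketch below. The names `VOID_ELEMENTS` and `isVoidElement` are hypothetical, and the list reflects the void elements of the HTML Living Standard at the time of writing, so it should be checked against the specification before being relied on.

```javascript
// Void elements as listed in the HTML Living Standard at the time of writing;
// check the specification before relying on this list.
var VOID_ELEMENTS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr"
]);

// Tag names are compared case-insensitively, as in HTML.
function isVoidElement(tagName) {
  return VOID_ELEMENTS.has(String(tagName).toLowerCase());
}

console.log(isVoidElement("BR"));  // true
console.log(isVoidElement("div")); // false
```

Unlike the innerHTML test, this lookup does not depend on how a particular browser serializes unknown or obsolete elements.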

How do I parse nodes from an XML document into an HTML page using javascript?

I'm trying to list the Across and Down clues of a crossword puzzle from an XML feed, but can't figure out how to drill into the nodes. Here's a sample of the XML:
<across>
<a1 a="BLOC"
c="Group of like-minded voters"
n="1"
cn="1" />
<a2 a="BATOR"
c="Ulan ___, Mongolia"
n="6"
cn="5" />
<a3 a="OMEN"
c="Black cat, supposedly"
n="12"
cn="10" />
...
<down>
<d1 a="BLIPS"
c="Spots on a radar screen"
n="1"
cn="1" />
<d2 a="LIMIT"
c="Word at the express checkout aisle"
n="2"
cn="2" />
<d3 a="OMANI"
c="Man from Muscat"
n="3"
cn="3" />
<d4 a="CACKLE"
c="Laugh like the Wicked Witch"
n="4"
cn="4" />
I used the following method in my javascript:
document.getElementById('across_clues').innerText = xmlhttp.responseXML.getElementsByTagName('across')[0].childNodes;
document.getElementById('down_clues').innerText = xmlhttp.responseXML.getElementsByTagName('down')[0].childNodes;
I made two divs in my HTML ("#across_clues" and "#down_clues"), but this is what is rendered on my HTML page:
ACROSS
[object NodeList]
DOWN
[object NodeList]
I'm sure this is a rather simple error on my part, as I am very new to parsing XML with javascript, but what am I missing? Is there another call that I need to pass in to actually list the XML data? Please advise.
Thanks,
Carlos
UPDATED QUESTION
OK, so here's what I ran into in my second attempt to solve this problem:
I figured out how to pull in data from individual nodes, one at a time, but I can't figure out how to list an array of nodes all at once. I don't want to list every value in <a1>, <a2>, etc., only 'cn' and 'c' as a pair for each childNode of <across> (same thing with <down>).
Here's my revised code:
var across = xmlhttp.responseXML.getElementsByTagName('across')[0].childNodes;
var down = xmlhttp.responseXML.getElementsByTagName('down')[0].childNodes;
var a1Node = xmlhttp.responseXML.getElementsByTagName('a1')[0];
var a2Node = xmlhttp.responseXML.getElementsByTagName('a2')[0];
var d1Node = xmlhttp.responseXML.getElementsByTagName('d1')[0];
var d1Node = xmlhttp.responseXML.getElementsByTagName('d2')[0];
document.getElementById('across_clues').innerText = a1Node.getAttribute('cn');
document.getElementById('across_clues').innerText = a1Node.getAttribute('c');
document.getElementById('down_clues').innerText = d1Node.getAttribute('cn');
document.getElementById('down_clues').innerText = d1Node.getAttribute('c');
RESULT:
Only one attribute renders into #across_clues and one attribute into #down_clues. Each div only displays the result of the last .getAttribute() call in the list, no matter how many I list. I also tried the function below, but still no success:
function getAcross() {
if (across) {
var acrossNodes = across.childNodes;
if (acrossNodes) {
for (var i = 0; i < acrossNodes.length; i++) {
var cnAttr = acrossNodes[i].getAttribute('cn');
var cAttr = acrossNodes[i].getAttribute('c');
document.getElementById('across_clues').innerText = cnAttr + '.' + cAttr;
}
}
}
}
Can anyone help me out with this? I am a serious novice when it comes to XML and childNodes. Your feedback is much appreciated.
Thanks,
Carlos

Here you are assigning childNodes directly; this is itself an object (a NodeList), and you have to access the individual child nodes by index, like childNodes[0], childNodes[1], ...
Hope this helps you.
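Indexing into the node list and collecting the results before writing to the page can be sketched as below. This is an illustration under assumptions: `formatClues` and `fakeClue` are hypothetical names, and the fake nodes only mimic the nodeType and getAttribute() members of real XML nodes so the sketch runs outside a browser.

```javascript
var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE in browsers

// Build one "cn. c" line per element node; whitespace text nodes are skipped.
function formatClues(nodes) {
  var lines = [];
  for (var i = 0; i < nodes.length; i++) {
    var node = nodes[i];
    if (node.nodeType !== ELEMENT_NODE) continue;
    lines.push(node.getAttribute("cn") + ". " + node.getAttribute("c"));
  }
  return lines.join("\n");
}

// Hypothetical stand-ins for XML nodes.
function fakeClue(cn, c) {
  var attrs = { cn: cn, c: c };
  return { nodeType: ELEMENT_NODE, getAttribute: function (name) { return attrs[name]; } };
}
var whitespace = { nodeType: 3 }; // text node between elements
var acrossNodes = [
  whitespace, fakeClue("1", "Group of like-minded voters"),
  whitespace, fakeClue("5", "Ulan ___, Mongolia")
];

console.log(formatClues(acrossNodes));
```

In the page itself, the string would be assigned once, for example document.getElementById('across_clues').innerText = formatClues(across), instead of overwriting innerText on every loop iteration.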

In your code you try to assign an XML DOM node collection to the innerText property, which expects a string. Additionally, the tags a1, a2, ... are not defined in HTML: what do you expect to get?
In my opinion you can do one of the following (without external libraries, only JavaScript):
(bad way) With JavaScript you can start from xmlhttp.responseXML.documentElement, loop through all the XML DOM tree elements, attributes and values, and build an HTML DOM node tree from them. You can then append the root of that tree to your document's HTML DOM with appendChild().
(true way) The second way is more proper. You need an XSL document to transform your XML document into a fragment using the importStylesheet() and transformToFragment() methods of the XSLTProcessor object. You can then easily append this fragment to your document's HTML DOM with the appendChild() method.
For IE8 and earlier (IE9 not tested) the "true way" above does not work because XSLTProcessor is not implemented. In IE you must use transformNode(stylesheet). This creates an HTML string that you can assign to the innerHTML property of a node in your document's HTML DOM.
Update: XSLTProcessor may not be implemented in some mobile browsers.
