Why isn't nodeValue giving me the text inside the <span>? - javascript

I have the following HTML:
<html>
<body>
<div>
<span> $12.95 </span>
</div>
</body>
</html>
And the following JavaScript:
var all = document.body.getElementsByTagName("*");
for (var i=0, max=all.length; i < max; i++) {
console.log(all[i].nodeValue);
}
I see null in the console for every element, including the <span>. How can I get just the text of all the elements in a page? I know that innerHTML would give me the text, but then the same text would show up more than once: for the <div> I would get <span> $12.95 </span>, and then for the <span> I would get $12.95 again.

The nodeValue of an element node is always null. If you want to use nodeValue to get the contents, you have to traverse down to the text node contained within the span.
http://jsfiddle.net/xLJMb/
var all = document.body.getElementsByTagName("*");
for (var i=0, max=all.length; i < max; i++) {
console.log(all[i].nodeValue);
for(var j = 0, max2 = all[i].childNodes.length; j < max2; j++) {
console.log(all[i].childNodes[j].nodeValue);
}
}
Text Nodes are not elements, so they are not returned directly by getElementsByTagName().
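To get each piece of text exactly once, you can walk the tree yourself and read nodeValue only on text nodes. Below is a minimal sketch of that idea; the helper name collectText is made up for illustration, and it assumes you are happy to skip whitespace-only text nodes.

```javascript
// Hypothetical helper: recursively collect the trimmed, non-empty values of
// every text node under `root`. It only relies on nodeType, nodeValue and
// childNodes, which every DOM node exposes.
function collectText(root) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE
  var out = [];
  (function walk(node) {
    if (node.nodeType === TEXT_NODE) {
      var value = node.nodeValue.trim();
      if (value) out.push(value);
      return;
    }
    var kids = node.childNodes || [];
    for (var i = 0; i < kids.length; i++) walk(kids[i]);
  })(root);
  return out;
}
```

In a browser you would call it as collectText(document.body). Note that it would also pick up the text inside <script> and <style> elements if the body contains any.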

Why not use this HTML:
<div>
<span id="span">$12.95 </span>
</div>
and this script (it requires jQuery):
console.log($('#span').html());

As an addendum to the answer above: in modern browsers, if you want to iterate over text nodes only, you can use the TreeWalker API:
var treeWalker = document.createTreeWalker(
document.body,
NodeFilter.SHOW_TEXT,
// Using an ES6 arrow function, this skips all "empty" (whitespace-only) text nodes.
// A filter is expected to return one of the NodeFilter.FILTER_* constants.
// Equivalent to:
// function (node) { return node.nodeValue.trim() ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP; }
node => node.nodeValue.trim() ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP
);
while (treeWalker.nextNode()) {
console.log(treeWalker.currentNode.nodeValue);
}

Related

Select a sibling DOM element based on the text content of another sibling

I have the following HTML structure:
<p>
<span class="font-weight-medium">
Phone Number:
</span>
<br>
<span>(123) 456-7869</span>
</p>
Now I want to select every span that holds a phone number, using pure JavaScript or a CSS selector.
I am using x-ray.
I tried this, but it doesn't seem to work:
x(html,'span[contains(text(),"Phone Number")]')
(function(err, obj) {
console.log(obj);
});
If you are fine with using jQuery, here is the answer:
$("span:contains(Phone Number:)~span")
or this
$('span:contains(Phone Number:)').siblings('span')
You can get all span elements and loop over them to find the span that follows each one containing "Phone Number:":
var span_tags = document.getElementsByTagName('span');
var regex = / *Phone Number: */;
for (var i=0, max=span_tags.length; i < max; i++) {
var str = span_tags[i].innerHTML;
// span_tags[i+1] is undefined when the label is the last span on the page
if (str.match(regex) && span_tags[i+1]) {
console.log(span_tags[i+1].innerHTML);
}
}
<p>
<span class="font-weight-medium">
Phone Number:
</span>
<span>(123) 456-7869</span>
</p>
I'm not sure there is a CSS selector based solution (:contains is a jQuery pseudo-selector). But here is another approach using indexOf:
var span_tags = document.getElementsByTagName('span');
for (var i=0, max=span_tags.length; i < max; i++) {
if (span_tags[i].innerHTML.indexOf("Phone Number:") != -1 && span_tags[i+1]) {
console.log(span_tags[i+1].innerHTML);
}
}
<p>
<span class="font-weight-medium">
Phone Number:
</span>
<span>(123) 456-7869</span>
</p>
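Both loops above take span_tags[i+1], which is the next span anywhere in the document and not necessarily a sibling of the label. A possible variant that only looks at siblings is sketched below; the function name valuesAfterLabel is hypothetical, and it assumes the value always sits in a later sibling span of the label.

```javascript
// Hypothetical helper: for every span whose text contains `label`, return the
// trimmed text of the next sibling <span>, skipping other siblings such as <br>.
function valuesAfterLabel(spans, label) {
  var results = [];
  for (var i = 0; i < spans.length; i++) {
    var span = spans[i];
    if (span.textContent.indexOf(label) === -1) continue;
    var next = span.nextElementSibling;
    while (next && next.tagName.toUpperCase() !== 'SPAN') {
      next = next.nextElementSibling;
    }
    if (next) results.push(next.textContent.trim());
  }
  return results;
}
```

In a browser this could be called as valuesAfterLabel(document.getElementsByTagName('span'), 'Phone Number:').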

Surrounding each word from user selected text with a DOM element

Given a DOM fragment from the user's text selection (window.getSelection()):
<h1>Hello world!</h1>
<p>How <b>do y<i>o</i>u</b> do?</p>
I'd like to wrap each word in a <span class="foo"></span>, i.e.:
<h1><span class="foo">Hello</span> <span class="foo">world!</span></h1>
<p><span class="foo">How</span> <b><span class="foo">do</span> <span class="foo">y<i>o</i>u</span></b> <span class="foo">do?</span></p>
How can I do this in JavaScript?
Here is what I have so far:
// Get highlighted text
var selection = window.getSelection();
// Iterate over ranges
for (var i = 0, l = selection.rangeCount; i < l; ++i) {(function () {
var range = selection.getRangeAt(i);
var fragment = range.cloneContents();
// HELP HERE: Surround each word in the fragment...
}())}

Stuck at getting elements with JavaScript

I have the following HTML elements, from which I need to extract some specific text, for example "John Doe".
I'm a newbie in JavaScript and have been playing with getElementById etc., but I can't seem to get this one right.
<div id="name">
<p><span id="nameheading">name: </span> John Doe</p>
</div>
Below is what I have tried:
function askInformation()
{
var nameHeading = document.getElementById("nameheading");
var paragraph = document.getElementsByTagName("p").item(0).innerHTML ;
var name = paragraph[4];
console.log(name); // prints letter (n)
}
I need help please
If you want to get the text that follows the span in this markup:
<div id="name">
<p><span id="nameheading">name: </span> John Doe</p>
</div>
You can use something like:
// Get a reference to the span
var span = document.getElementById('nameheading');
// Get the following text
var text = span.nextSibling.data;
However, that depends heavily on the internal structure: it only works if the node directly after the span is a single text node. It may be better to loop over the text node children and collect the content of all of them. You may also want to trim leading and trailing white space.
You could also get a reference to the parent element (here, the P inside the DIV) and use a function like the following, which collects the text children and ignores child elements:
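As a rough sketch of a slightly more forgiving version of nextSibling.data, the hypothetical helper below (the name textFollowing is not from any library) concatenates all consecutive text nodes after a given node and trims the result:

```javascript
// Hypothetical helper: join every text node that directly follows `node`,
// stopping at the first sibling that is not a text node, then trim.
function textFollowing(node) {
  var TEXT_NODE = 3; // same value as Node.TEXT_NODE
  var text = '';
  for (var sib = node.nextSibling; sib && sib.nodeType === TEXT_NODE; sib = sib.nextSibling) {
    text += sib.nodeValue;
  }
  return text.trim();
}
```

With the markup above, textFollowing(document.getElementById('nameheading')) should give "John Doe" in a browser.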
// Return the text of the child text nodes of an element,
// but not descendant element text nodes
function getChildText(element) {
var children = element.childNodes;
var text = '';
for (var i=0, iLen=children.length; i<iLen; i++) {
if (children[i].nodeType === 3) { // 3 === Node.TEXT_NODE
text += children[i].data;
}
}
return text;
}
var text = getChildText(document.getElementById('name').getElementsByTagName('p')[0]);
or more concisely for hosts that support the querySelector interface:
var text = getChildText(document.querySelector('#name p'));
var paragraph = document.getElementsByTagName("p").item(0).innerHTML ;
var name = paragraph.replace('<span id="nameheading">name: </span>','').trim(); // John Doe

How to get the innerHTML of a DIV without some of its inner DIVs?

I have a DIV that contains HTML with images, styles, etc. I want to remove the divs with id = 'quot' or className = 'quote', but I don't understand how to get more than just the innerHTML of each tag. For example, an empty <p></p>, which has no innerHTML, should also be included in the final parsed HTML.
var bodytext = document.getElementById("div_text");
var NewText = "";
if (bodytext.hasChildNodes()){
var children = bodytext.childNodes;
for (var i = 0; i < children.length; i++){
if (children[i].id != "quot" && children[i].className != "quote" && children[i].innerText != ""){
NewText = NewText + children[i].innerHTML;
}
}
}
Source HTML that needs to be parsed:
<div id="div_text">
<p>
Some Text</p>
<p>
Some Text</p>
<p>
<img alt="" src="localhost/i/1.png" /></p>
<div id="quot" class="quote">
any text <div>text of inside div</div>
<table><tr><td>there can be table</td></tr></table>
</div>
<p>
</p>
</div>
Desired output:
<p>
Some Text</p>
<p>
Some Text</p>
<p>
<img alt="" src="localhost/i/1.png" /></p>
<p>
</p>
Just grab a reference to the targeted divs and remove them from their respective parents.
Perhaps something a little like this?
EDIT: Added code to perform operation on a clone, rather than the document itself.
div elements don't have .getElementById method, so we search for an element manually.
window.addEventListener('load', myInit, false);
function removeFromDocument()
{
// 1. take care of the element with id='quot' (if there is one)
var tgt = document.getElementById('quot');
var parentNode;
if (tgt)
{
parentNode = tgt.parentNode;
parentNode.removeChild(tgt);
}
// 2. take care of elements whose class == 'quote'
// getElementsByClassName returns a live collection that shrinks as elements leave the document,
// so keep removing the first entry until the collection is empty. Nested matches such as
// <div class='quote'>outer<div class='quote'>inner</div></div> drop out together with their ancestor.
var tgtList = document.getElementsByClassName('quote');
while (tgtList.length > 0)
{
parentNode = tgtList[0].parentNode;
parentNode.removeChild(tgtList[0]);
}
// 3. remove the containing div
var container = document.getElementById('div_text');
container.outerHTML = container.innerHTML;
}
function cloneAndProcess()
{
var clonedCopy = document.getElementById('div_text').cloneNode(true);
var tgt;// = clonedCopy.getElementById('quot');
var i, n = clonedCopy.childNodes.length;
for (i=0; i<n; i++)
{
if (clonedCopy.childNodes[i].id == 'quot')
{
tgt = clonedCopy.childNodes[i];
var parentNode = tgt.parentNode;
parentNode.removeChild(tgt);
break; // done with for loop - can only have 1 element with any given id
}
}
// 2. take care of elements whose class == 'quote'
// The collection is live: every removal shrinks it, so an index-based loop would skip
// entries or run past the end. Remove the first entry until none are left instead.
// Nested matches (<div class='quote'>outer<div class='quote'>inner</div></div>) leave
// the collection together with their removed ancestor.
var tgtList = clonedCopy.getElementsByClassName('quote');
while (tgtList.length > 0)
{
tgtList[0].parentNode.removeChild(tgtList[0]);
}
// 3. remove the containing div
//var container = clonedCopy; //.getElementById('div_text');
//container.outerHTML = container.innerHTML;
console.log(clonedCopy.innerHTML);
}
function myInit()
{
cloneAndProcess();
//removeFromDocument();
}
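An alternative to the manual id search and the live class collection is querySelectorAll, which is available on elements (including detached clones) and returns a static list. The following is only a sketch; the helper name removeMatching is made up for illustration.

```javascript
// Hypothetical helper: remove every descendant of `root` that matches `selector`.
// querySelectorAll returns a static NodeList, so removing nodes while looping
// over it does not shift the remaining entries.
function removeMatching(root, selector) {
  var matches = root.querySelectorAll(selector);
  for (var i = 0; i < matches.length; i++) {
    var node = matches[i];
    if (node.parentNode) {
      node.parentNode.removeChild(node);
    }
  }
  return matches.length; // number of nodes that matched
}
```

Usage in a browser might look like this: var copy = document.getElementById('div_text').cloneNode(true); removeMatching(copy, '#quot, .quote'); console.log(copy.innerHTML);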

RegEx JavaScript problem

I have this text:
<body>
<span class="Forum"><div align="center"></div></span><br />
<span class="Topic">Text</span><br />
<hr />
<b>Text</b> Text<br />
<hr width=95% class="sep"/>
TextText
<hr />
<b>Text</b> -Text<br />
<hr width=95% class="sep"/>
**Text what i need.**
<hr />
My RegEx for "Text what I need" is /"sep"(.*)hr/m.
It doesn't work. Why?
Don't use a regular expression; use DOM methods instead:
var elems = document.getElementsByTagName("hr");
for (var i=0; i<elems.length; ++i) {
var elem = elems[i];
if (/(?:^|\s+)sep(?:\s|$)/.test(elem.className) &&
elem.nextSibling && elem.nextSibling.nodeType === Node.TEXT_NODE) {
var text = elem.nextSibling.nodeValue;
break;
}
}
This loops over all HR elements, checks whether each one has the class sep, and grabs its next sibling node if that is a text node.
. doesn't match newlines in JavaScript regular expressions. Try:
/"sep"([\s\S]*)hr/m
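To see the difference, here is a small sketch on a shortened sample string (the sample is an assumption cut down from the question's markup, and the lazy patterns are my own variation, not the asker's regex):

```javascript
// Shortened sample input, assumed for illustration.
var sample = '<hr width=95% class="sep"/>\nText what I need.\n<hr />';

// `.` stops at the line break, so the original pattern finds nothing.
// The m flag only changes ^ and $, not what `.` matches.
var withDot = /"sep"(.*)hr/m.exec(sample);

// [\s\S] matches any character, including newlines; the lazy *? stops at
// the first <hr that follows.
var withClass = /"sep"\/>([\s\S]*?)<hr/.exec(sample);

// ES2018 added the dotAll flag `s`, which makes `.` match line breaks too.
var withDotAll = /"sep"\/>(.*?)<hr/s.exec(sample);

console.log(withDot);              // null
console.log(withClass[1].trim());  // Text what I need.
console.log(withDotAll[1].trim()); // Text what I need.
```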
IMO, you're much better off with a different approach, because regex isn't ideal for extracting data from HTML. A better method would be to create a div, set its innerHTML property to the HTML string you have, then use DOM traversal to find the text node you need.
Here's an example of what I mean: http://www.jsfiddle.net/W33n6/. It uses the following code to get the text (html is the HTML string):
var div = document.createElement("div");
div.innerHTML = html;
var hrs = div.getElementsByTagName("hr");
for (var i = 0; i < hrs.length; i++) {
if (hrs[i].className == "sep") {
document.body.innerHTML = hrs[i].nextSibling.nodeValue;
break;
}
}
EDIT: Gumbo's version is a little stricter than mine, checking for the "sep" class among other classes and ensuring the node following is a text node.
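The two stricter checks can be pulled into small helpers. This is just a sketch with made-up names (hasClass, textAfterFirst), not code from either answer; in current browsers element.classList.contains('sep') does the same job as hasClass.

```javascript
// Hypothetical helper: true if `name` is one of the space-separated classes.
function hasClass(element, name) {
  var classes = (element.className || '').split(/\s+/);
  return classes.indexOf(name) !== -1;
}

// Hypothetical helper: trimmed text of the text node that directly follows
// the first element carrying `className`, or null if there is none.
function textAfterFirst(elements, className) {
  for (var i = 0; i < elements.length; i++) {
    var next = elements[i].nextSibling;
    if (hasClass(elements[i], className) && next && next.nodeType === 3) {
      return next.nodeValue.trim();
    }
  }
  return null;
}
```

In a browser the call would be textAfterFirst(div.getElementsByTagName('hr'), 'sep').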
