Loop through textNodes within selection with unknown number of descendants - javascript

I need to find and replace a list of words retrieved from a webservice as an array of objects (each with comma-separated terms). The find and replace only applies to particular elements in the DOM, but those elements can have an unknown and varying number of children, which can themselves be nested to an unknown depth.
The main part I'm struggling with is figuring out how to select all nodes down to textNode level, with an unknown amount of nested elements.
Here is a very stripped-down example:
Retrieved from the webservice:
[{
terms: 'first term, second term',
youtubeid: '123qwerty789'
},{
terms: 'match, all, of these',
youtubeid: '123qwerty789'
},{
terms: 'only one term',
youtubeid: '123qwerty789'
},
etc]
HTML could be something like:
<div id="my-wrapper">
<ol>
<li>This is some text here without a term</li>
<li>This is some text here with only one term</li>
<li>This is some text here that has <strong>the first term</strong> nested!</li>
</ol>
</div>
Javascript:
$('#my-wrapper').contents().each(function(){
// Unfortunately only provides the <ol> -
// How would I modify this to give me all nested elements in a loopable format?
});

The following function is very similar to cbayram's but should be a bit more efficient and it skips script elements. You may want to skip other elements too.
It's based on a getText function I have used for some time, your requirements are similar. The only difference is what to do with the value of the text nodes.
function processTextNodes(element) {
element = element || document.body;
var self = processTextNodes; // arguments.callee throws in strict mode
var el, els = element.childNodes;
for (var i=0, iLen=els.length; i<iLen; i++) {
el = els[i];
// Exclude script element content
// May need to add other node types here
if (el.nodeType == 1 && el.tagName && el.tagName.toLowerCase() != 'script') {
// Have an element node, so process it
self(el);
// Otherwise see if it's a text node
// If working with XML, add nodeType 4 if you want to process
// text in CDATA nodes
} else if (el.nodeType == 3) {
/* do something with el.data */
}
}
/* return a value? */
}
The function should be completely browser agnostic and should work with any conforming DOM (e.g. XML and HTML). Incidentally, it's also very similar to jQuery's text function.
One issue you may want to consider is words split over two or more nodes. It should be rare, but it is difficult to detect when it happens.
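For the "do something with el.data" step, here is a minimal sketch of how the comma-separated terms from the question could be turned into a single pattern and tested against each text node. The helper names (escapeRegExp, buildPattern, eachTextNode) and the plain-object nodes are assumptions made so the sketch runs outside a browser; in a page you would pass real DOM nodes. It does not handle a term that is split across two nodes.

```javascript
// Assumed shape of the webservice data, copied from the question.
var entries = [
  { terms: 'first term, second term', youtubeid: '123qwerty789' },
  { terms: 'only one term', youtubeid: '123qwerty789' }
];

// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Split each comma-separated list into trimmed terms and build one
// case-insensitive pattern, longest terms first.
function buildPattern(list) {
  var terms = [];
  list.forEach(function (entry) {
    entry.terms.split(',').forEach(function (t) {
      t = t.trim();
      if (t) terms.push(escapeRegExp(t));
    });
  });
  terms.sort(function (a, b) { return b.length - a.length; });
  return new RegExp('\\b(?:' + terms.join('|') + ')\\b', 'gi');
}

// Hypothetical stand-ins for DOM nodes (nodeType 1 = element, 3 = text).
function text(data) { return { nodeType: 3, data: data, childNodes: [] }; }
function elem(tagName, children) {
  return { nodeType: 1, tagName: tagName, childNodes: children };
}

// Same traversal idea as processTextNodes, with a callback for text nodes.
function eachTextNode(element, callback) {
  var els = element.childNodes;
  for (var i = 0; i < els.length; i++) {
    var el = els[i];
    if (el.nodeType == 1 && el.tagName.toLowerCase() != 'script') {
      eachTextNode(el, callback);
    } else if (el.nodeType == 3) {
      callback(el);
    }
  }
}

var tree = elem('DIV', [
  elem('LI', [text('This is some text here with only one term')]),
  elem('LI', [text('This has '), elem('STRONG', [text('the first term')]), text(' nested!')])
]);

var pattern = buildPattern(entries);
var found = [];
eachTextNode(tree, function (node) {
  var m = node.data.match(pattern);
  if (m) found = found.concat(m);
});
// found is ['only one term', 'first term']
```

Sorting the terms longest first is a design choice so that, if one term is a prefix of another, the longer alternative is tried first.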

I think you want
$('#my-wrapper *').each
This should select all the descendants of #my-wrapper no matter what they are.
See this fiddle for an example

I'm not sure if you are looking strictly for a jQuery answer, but here is one solution in JavaScript:
var recurse = function(el) {
// if text node or comment node
if(el.nodeType == 3 || el.nodeType == 8) {
// do your work here
console.log("Text: " + el.nodeValue);
}else {
for(var i = 0, children = el.childNodes, len = children.length; i < len; i++) {
recurse(children[i]);
}
}
}
recurse(document.getElementById("my-wrapper"));

Try the below:
$('#my-wrapper li')

Related

Vanilla JS: Find all the DOM elements that just contain text

I want to get all the DOM elements in an HTML document that contain no child elements, only text.
I've got this code right now:
var elements = document.querySelectorAll("body *");
for(var i = 0; i < elements.length; i++) {
if(!elements[i].hasChildNodes()) {
console.log(elements[i])
}
}
This prints of course elements that have absolutely no content (and curiously enough, iframes).
Text counts as a child node, so .childNodes.length equals 1, but I don't know how to distinguish element nodes from text nodes. typeof the first node is always object, sadly.
How can I distinguish text from element nodes?
Basically you are looking for leaf nodes of DOM with something inside the textContent property of the leaf node.
Let's traverse DOM and work out our little logic on leaf nodes.
const nodeQueue = [ document.querySelector('html') ];
const textOnlyNodes = [];
// No "g" flag: test() on a global regex keeps state in lastIndex between calls
const textRegEx = /\w+/;
function traverseDOM () {
// Keep going until every queued node has been visited
while (nodeQueue.length) {
const currentNode = nodeQueue.shift();
// Our leaf node: no child elements
if (!currentNode.childElementCount) {
if (textRegEx.test(currentNode.textContent)) textOnlyNodes.push(currentNode);
continue;
}
// Nodes with child elements
nodeQueue.push(...currentNode.children);
}
}
traverseDOM();
The childElementCount property makes sure that the node is a leaf node, and the RegEx test on the textContent property is just my understanding of what text implies in general. You can tune the expression at any time to make it a better fit for your use case.
You can check for elements that have no .firstElementChild, which means it will only have text (or other invisible stuff).
var elements = document.querySelectorAll("body *");
for (var i = 0; i < elements.length; i++) {
if (!elements[i].firstElementChild) {
console.log(elements[i].nodeName)
}
}
<p>
text and elements <span>text only</span>
</p>
<div>text only</div>
The script element of the stack snippet is included because it also only has text. You can filter out scripts if needed. This will also include elements that cannot have content, like <input>.
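On the original question of telling text apart from elements: typeof is "object" for every node, but the nodeType property differs (1 for elements, 3 for text; in a browser these are available as Node.ELEMENT_NODE and Node.TEXT_NODE). A minimal sketch, assuming a hypothetical hasOnlyText helper and plain-object stand-ins for nodes so it runs outside a browser:

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
var ELEMENT_NODE = 1, TEXT_NODE = 3;
function text(value) {
  return { nodeType: TEXT_NODE, nodeValue: value, childNodes: [] };
}
function elem(nodeName, children) {
  return { nodeType: ELEMENT_NODE, nodeName: nodeName, childNodes: children || [] };
}

// True when the element has at least one non-blank text node child
// and no element children.
function hasOnlyText(el) {
  var sawText = false;
  for (var i = 0; i < el.childNodes.length; i++) {
    var child = el.childNodes[i];
    if (child.nodeType === ELEMENT_NODE) return false;
    if (child.nodeType === TEXT_NODE && /\S/.test(child.nodeValue)) sawText = true;
  }
  return sawText;
}

var p = elem('P', [text('text and elements '), elem('SPAN', [text('text only')])]);
var div = elem('DIV', [text('text only')]);
var input = elem('INPUT', []);

var results = [p, p.childNodes[1], div, input].map(hasOnlyText);
// results is [false, true, true, false]
```

Unlike the firstElementChild check, this version also rejects empty elements such as <input>, because it requires at least one non-blank text node.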

Get all the descendant nodes (also the leaves) of a certain node

I have an HTML document that consists of a <div id="main">. Inside this div there may be several levels of nodes, without a precise structure, because the user creates the document content.
I want to use a JavaScript function that returns all nodes within div id="main", whatever the tag, taking into account that there may be different levels of children.
For example, if I have this document:
...
<div id="main">
<h1>bla bla</h1>
<p>
<b>fruits</b> apple<i>text</i>.
<img src="..">image</img>
</p>
<div>
<p></p>
<p></p>
</div>
<p>..</p>
</div>
...
The function getNodes would return an array of object nodes (I don't know how to represent it, so I list them):
[h1, #text (= bla bla), p, b, #text (= fruits), #text (= _apple), i, #text (= text), img, #text (= image), div, p, p, p, #text (= ..)]
As the example shows, the function must return all nodes, including the leaf nodes (i.e. the #text nodes).
For now I have this function that returns all nodes except the leaves:
function getNodes() {
var all = document.querySelectorAll("#main *");
for (var elem = 0; elem < all.length; elem++) {
//do something..
}
}
In fact, this function applied to the above example returns:
[H1, P, B, I, IMG, DIV, P, P, P]
There aren't #text nodes.
Also, if I test the elements returned by that method in this way:
all[elem].children.length
I find that <p> is a leaf node (I tested on <p>fruits</p>).
But if I build the DOM tree it is clear that it is not a leaf node, and that in this example the leaf nodes are the #text nodes...
Thank you
Classic case for recursion into the DOM.
function getDescendants(node, accum) {
var i;
accum = accum || [];
for (i = 0; i < node.childNodes.length; i++) {
accum.push(node.childNodes[i])
getDescendants(node.childNodes[i], accum);
}
return accum;
}
and
getDescendants( document.querySelector("#main") );
Aside from the already existing and perfectly functional answer, I find it worth mentioning that one can do away with the recursion and the many resulting function calls by simply navigating via the firstChild, nextSibling, and parentNode properties:
function getDescendants(node) {
var list = [], desc = node, checked = false, i = 0;
do {
checked || (list[i++] = desc);
desc =
(!checked && desc.firstChild) ||
(checked = false, desc.nextSibling) ||
(checked = true, desc.parentNode);
} while (desc !== node);
return list;
}
(Whenever we encounter a new node, we add it to the list, then try going to its first child node. If such does not exist, get the next sibling instead. Whenever no child node or following sibling is found, we go back up to the parent, while setting the checked flag to avoid adding that to the list again or reentering its descendant tree.)
This will, in virtually every case, improve performance greatly. Not that there is nothing left to optimize here, e.g. one could cache the nodes where we descend further into the hierarchy so as to later get rid of the parentNode when coming back up. I leave implementing this as an exercise for the reader.
Keep in mind though that iterating through the DOM like this will rarely be the bottleneck in a script. Unless you are going through a large DOM tree many tens/hundreds of times a second, that is — in which case you probably ought to think about avoiding that if at all possible, rather than simply optimizing it.
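Another way to avoid recursion, sketched here as an assumption rather than as what the answer above had in mind, is to keep an explicit stack of nodes still to visit. Like the recursive version, it returns the descendants in document order and leaves out the root itself. The text and elem helpers are hypothetical stand-ins for DOM nodes so the example runs outside a browser.

```javascript
// Hypothetical stand-ins for DOM nodes so the sketch runs outside a browser.
function text(value) {
  return { nodeName: '#text', nodeValue: value, childNodes: [] };
}
function elem(nodeName, children) {
  return { nodeName: nodeName, childNodes: children || [] };
}

// Collect all descendants in document order using an explicit stack.
function getDescendants(root) {
  var list = [];
  var stack = [];
  var i;
  // Push children in reverse so the first child is popped first.
  for (i = root.childNodes.length - 1; i >= 0; i--) stack.push(root.childNodes[i]);
  while (stack.length) {
    var node = stack.pop();
    list.push(node);
    for (i = node.childNodes.length - 1; i >= 0; i--) stack.push(node.childNodes[i]);
  }
  return list;
}

var main = elem('DIV', [
  elem('H1', [text('bla bla')]),
  elem('P', [
    elem('B', [text('fruits')]),
    text(' apple'),
    elem('I', [text('text')]),
    text('.')
  ])
]);

var names = getDescendants(main).map(function (n) { return n.nodeName; });
// names is ['H1', '#text', 'P', 'B', '#text', '#text', 'I', '#text', '#text']
```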
The children property only returns element nodes. If you want all children, I would suggest using the childNodes property. Then you can loop through this NodeList and filter by nodeType, for example skipping nodes of type Node.ELEMENT_NODE or picking whichever other node types you are interested in.
So try something like:
var i, j, nodes
var result=[]
var all = document.querySelectorAll("#main *");
for (var elem = 0; elem < all.length; elem++) {
result.push(all[elem].nodeName)
nodes = all[elem].childNodes;
for (i=0, j=nodes.length; i<j; i++) {
if (nodes[i].nodeType == Node.TEXT_NODE) {
result.push(nodes[i].nodeValue)
}
}
}
If you only need the HTML tags and not the #text nodes, you can simply use this: <elem>.querySelectorAll("*");

Replace all occurrences of a string in an element

I want to replace a particular string in (the text of) all the descendant elements of a given element.
innerHTML cannot be used as this sequence can appear in attributes. I have tried using XPath, but it seems the interface is essentially read-only. Because this is limited to one element, functions like document.getElementsByTagName cannot be used either.
Could anyone suggest a way to do this? Any jQuery or pure DOM method is acceptable.
Edit:
Some of the answers are suggesting the problem I was trying to work around: modifying the text directly on an Element will cause all non-Text child nodes to be removed.
So the problem essentially comes down to how to efficiently select all the Text nodes in a tree. In XPath, you can easily do it as //text(), but the current XPath interface does not allow you to change these Text nodes it seems.
One way to do this is by recursion as shown in the answer by Bergi. Another way is to use the find('*') selector of jQuery, but this is a bit more expensive. Still waiting to see if there are better solutions.
Just use a simple selfmade DOM-iterator, which walks recursively over all nodes:
(function iterate_node(node) {
if (node.nodeType === 3) { // Node.TEXT_NODE
var text = node.data.replace(/any regular expression/g, "any replacement");
if (text != node.data) // there's a Safari bug
node.data = text;
} else if (node.nodeType === 1) { // Node.ELEMENT_NODE
for (var i = 0; i < node.childNodes.length; i++) {
iterate_node(node.childNodes[i]); // run recursive on DOM
}
}
})(content); // any dom node
A solution might be to walk through all available nodes (text nodes included) and apply a regexp pattern to the results. To grab text nodes as well, you need to invoke jQuery's .contents(). For instance:
var search = "foo",
replaceWith = 'bar',
pattern = new RegExp( search, 'g' );
function searchReplace( root ) {
$( root ).contents().each(function _repl( _, node ) {
if( node.nodeType === 3 )
node.nodeValue = node.nodeValue.replace( pattern, replaceWith );
else searchReplace( node );
});
}
$('#apply').on('click', function() {
searchReplace( document.getElementById('rootNode') );
});
Example: http://jsfiddle.net/h8Rxu/3/
Reference: .contents()
Using jQuery:
$('#parent').children().each(function () {
var that = $(this);
that.text(that.text().replace('test', 'foo'));
});
If you prefer to search through all children instead of just immediate children, use .find() instead.
http://jsfiddle.net/ExwDx/
Edit: Documentation for children, each, text, and find.
Sorry, just got it myself:
$('#id').find('*').each(function(){
$.each(this.childNodes, function() {
if (this.nodeType === 3) {
this.data = this.data.toUpperCase();
}
})
})
I used toUpperCase() here to make the result more obvious, but any String operation would be valid there.

how to replace all matching plain text strings in string using javascript (but not tags or attributes)?

imagine this html on a page
<div id="hpl_content_wrap">
<p class="foobar">this is one word and then another word comes in foobar and then more words and then foobar again.</p>
<p>this is a link with foobar in an attribute but only the foobar inside of the link should be replaced.</p>
</div>
using javascript, how to change all 'foobar' words to 'herpderp' without changing any inside of html tags?
ie. only plain text should be changed.
so the successful html changed will be
<div id="hpl_content_wrap">
<p class="foobar">this is one word and then another word comes in herpderp and then more words and then herpderp again.</p>
<p>this is a link with herpderp in an attribute but only the herpderp inside of the link should be replaced. </p>
</div>
Here is what you need to do...
Get a reference to a bunch of elements.
Recursively walk the children, replacing text in text nodes only.
Sorry for the delay, I was sidetracked before I could add the code.
var replaceText = function me(parentNode, find, replace) {
var children = parentNode.childNodes;
for (var i = 0, length = children.length; i < length; i++) {
if (children[i].nodeType == 1) {
me(children[i], find, replace);
} else if (children[i].nodeType == 3) {
children[i].data = children[i].data.replace(find, replace);
}
}
return parentNode;
}
replaceText(document.body, /foobar/g, "herpderp");
It's a simple matter of:
identifying all text nodes in the DOM tree,
then replacing all foobar strings in them.
Here's the full code:
// from: https://stackoverflow.com/questions/298750/how-do-i-select-text-nodes-with-jquery
var getTextNodesIn = function (el) {
// addBack() replaces andSelf(), which was removed in jQuery 3.0
return $(el).find(":not(iframe)").addBack().contents().filter(function() {
return this.nodeType == 3;
});
};
var replaceAllText = function (pattern, replacement, root) {
var nodes = getTextNodesIn(root || $('body'))
var re = new RegExp(pattern, 'g')
nodes.each(function (i, e) {
if (e.textContent && e.textContent.indexOf(pattern) != -1) {
e.textContent = e.textContent.replace(re, replacement);
}
});
};
// replace all text nodes in document's body
replaceAllText('foobar', 'herpderp');
// replace all text nodes under element with ID 'someRootElement'
replaceAllText('foobar', 'herpderp', $('#someRootElement'));
Note that I do a precheck on foobar to avoid processing crazy long strings with a regexp. May or may not be a good idea.
If you do not want to use jQuery, but only pure JavaScript, follow the link in the code snippet ( How do I select text nodes with jQuery? ) where you'll also find a JS only version to fetch nodes. You'd then simply iterate over the returned elements in a similar fashion.
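One caveat with building the pattern via new RegExp(pattern, 'g') is that characters such as "." in the search string are treated as regex syntax. When the search term is a literal string, a split/join replacement avoids the regexp altogether. A minimal sketch, where replaceLiteral is a hypothetical helper name and not part of the answer above:

```javascript
// Replace every literal occurrence of `search` without building a RegExp,
// so characters such as "." or "(" in the search string need no escaping.
function replaceLiteral(str, search, replacement) {
  // Cheap precheck, in the same spirit as the indexOf test above.
  if (search === '' || str.indexOf(search) === -1) return str;
  return str.split(search).join(replacement);
}

var viaRegExp = 'foo.bar fooxbar'.replace(new RegExp('foo.bar', 'g'), 'X');
var viaSplit = replaceLiteral('foo.bar fooxbar', 'foo.bar', 'X');
// viaRegExp is 'X X' because "." matches any character;
// viaSplit is 'X fooxbar' because only the literal string is replaced.
```

Inside the text-node loop you would assign the result back to e.textContent, exactly as the regexp version does.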

Is there a text selector in jQuery?

Is there a text selector in jQuery?
My Code
<anything>Hello World! Hello World!</anything>
Result should be (using jQuery):
<anything>Hello <span>World</span>! Hello <span>World</span>!</anything>
No. jQuery works primarily with elements and gives you very little for handling text.
To do a find-and-replace on text you will need to check each text node separately and do DOM splitText operations to take it apart when a match is found. For example:
function findText(element, pattern, callback) {
for (var childi= element.childNodes.length; childi-->0;) {
var child= element.childNodes[childi];
if (child.nodeType==1) {
var tag= child.tagName.toLowerCase();
if (tag!=='script' && tag!=='style' && tag!=='textarea')
findText(child, pattern, callback);
} else if (child.nodeType==3) {
var matches= [];
var match;
while (match= pattern.exec(child.data))
matches.push(match);
for (var i= matches.length; i-->0;)
callback.call(window, child, matches[i]);
}
}
}
findText(element, /\bWorld\b/g, function(node, match) {
var span= document.createElement('span');
node.splitText(match.index+match[0].length);
span.appendChild(node.splitText(match.index));
node.parentNode.insertBefore(span, node.nextSibling);
});
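To see which pieces the splitText calls produce, here is a string-only sketch of the same idea: cut the text at each match and mark the matched pieces for wrapping. The segment and render helpers are hypothetical names, and render builds a plain string for display only (it does not escape HTML), whereas the DOM version above creates real nodes.

```javascript
// Split a string into plain and matched segments, the same pieces that
// splitText() would produce as separate text nodes.
function segment(data, pattern) {
  var parts = [];
  var last = 0;
  var match;
  pattern.lastIndex = 0; // pattern must have the "g" flag, or exec() never advances
  while ((match = pattern.exec(data))) {
    if (match.index > last) {
      parts.push({ text: data.slice(last, match.index), wrap: false });
    }
    parts.push({ text: match[0], wrap: true });
    last = match.index + match[0].length;
    if (match[0] === '') pattern.lastIndex++; // avoid looping on empty matches
  }
  if (last < data.length) parts.push({ text: data.slice(last), wrap: false });
  return parts;
}

// Render the segments as markup, wrapping matches in <span>.
function render(parts) {
  return parts.map(function (p) {
    return p.wrap ? '<span>' + p.text + '</span>' : p.text;
  }).join('');
}

var parts = segment('Hello World! Hello World!', /\bWorld\b/g);
var html = render(parts);
// html is 'Hello <span>World</span>! Hello <span>World</span>!'
```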
$('anything').html(function(i, v) {
return v.replace(/(World)/g, '<span>$1</span>');
});
The above snippet uses functionality added in jQuery 1.4.
Note: this solution is only safe for elements containing raw text and no child elements.
You can do a regex replacement, etc for your simple case, but for a more general answer: no.
jQuery just doesn't provide much help when dealing with text nodes, it's designed primarily for dealing with element node types (nodeType == 1), not text node types (nodeType == 3)...so yes you can use it where it helps (e.g. .contents() and .filter()), but that won't be often since it's not the library's main purpose.
