Size limit to javascript [node].nodeValue field? - javascript

I'm receiving XML data via an AJAX call. One of the tags has a large amount of text, roughly 4000-5000 characters. In Firefox, the field is being truncated around the 3000th character. Almost everything I've found online says there is no limit to node value sizes, but that it is sometimes implementation dependent; no solid answers.
Does anyone have any suggestions for why this might be occurring, assuming there is no restriction on the size of the nodeValue? Any workarounds if so?
<test>
<foo>very long string...</foo>
</test>
value = testTag.getElementsByTagName("foo").item(0).firstChild.nodeValue;
value is truncated.
- If I print xmlHttp.responseText, all of the data from <foo> is printed.

Check this. It says:
"Also important to note is that although the specifications say that no matter how much text exists between tags, it should all be in one text node, in practice this is not always the case. In Opera 7-9.2x and Mozilla/Netscape 6+, if the text is larger than a specific maximum size, it is split into multiple text nodes. These text nodes will be next to each other in the childNodes collection of the parent element."

@Kooilnc has it right: there is a 4k limit on text nodes in Firefox.
You can work around it by doing this:
function getNodeText(xmlNode) {
if(!xmlNode) return '';
if(typeof(xmlNode.textContent) != "undefined") return xmlNode.textContent;
return xmlNode.firstChild.nodeValue;
}
text = getNodeText(document.getElementsByTagName("div").item(0));
alert(text.length);
See it in action here: http://jsfiddle.net/Bkemk/2/
Function borrowed from here: http://www.quirksmode.org/dom/tests/textnodesize.html
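Note that when textContent is unavailable, the fallback to firstChild.nodeValue still returns only the first chunk of a split text. A minimal sketch of a fallback that joins all the chunks, assuming only the standard nodeType, nodeValue and childNodes properties (the name collectChildText is made up for illustration):

```javascript
// Hypothetical helper: joins the values of all text and CDATA children of an
// element, so text that the parser split into several nodes is read in full.
function collectChildText(element) {
  var TEXT_NODE = 3;          // standard DOM nodeType for text nodes
  var CDATA_SECTION_NODE = 4; // standard DOM nodeType for CDATA sections
  var text = "";
  if (!element || !element.childNodes) return text;
  for (var i = 0; i < element.childNodes.length; i++) {
    var child = element.childNodes[i];
    if (child.nodeType === TEXT_NODE || child.nodeType === CDATA_SECTION_NODE) {
      text += child.nodeValue;
    }
  }
  return text;
}
```

With the XML from the question this would be called as collectChildText(testTag.getElementsByTagName("foo").item(0)).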

What I've come up with instead of targeting a single node:
function getDataOfImmediateChild(parentTag, subTagName)
{
    var val = "";
    var listOfChildTextNodes;
    var directChildren = parentTag.childNodes;
    for (var m = 0; m < directChildren.length; m++)
    {
        if (directChildren[m].nodeName == subTagName)
        {
            /* Found the tag, extract its text value */
            listOfChildTextNodes = directChildren[m].childNodes;
            for (var n = 0; n < listOfChildTextNodes.length; n++)
            {
                /* nodeType 3 is a text node; typeof would only ever report "object" */
                if (listOfChildTextNodes[n].nodeType == 3)
                    val += listOfChildTextNodes[n].nodeValue;
            }
        }
    }
    return val;
}
The nodeType check ensures that each listOfChildTextNodes[n] element really is a text node before its value is appended.

@Ryley
The only reason I iterate over the direct children is that getElementsByTagName and getElementById return nodes from farther down the hierarchy, not just direct children. It is easier to explain with an example:
<foo>
    <zoo>
        <bar>-</bar>
    </zoo>
    <bar></bar>
</foo>
If I call fooTag.getElementsByTagName("bar"), it returns a list containing both <bar> elements, even though I only want the second (since it's the only true child of <foo>). The only way I can think of to enforce this "search only my direct children" rule is to iterate over the children.

Related

Get start index of selected text in entire html page

I want to be able to obtain the character(s) that come before the text that the user has highlighted, and I need to include HTML in this. When the example below is rendered, the user will only see "word". When a user highlights this, I want to be able to figure out that there is an angle bracket (or any other character) immediately before it.
<span>word</span>
The following gives me the selection along with a start and an end index, but these are relative to the node that the text belongs to. They are also DOM offsets and don't account for the HTML markup.
selectedText= childwindow.getSelection();
To get around this, I have tried to get the indexOf value for this piece of text in the HTML string of the page, and this works.
childwindow.document.documentElement.innerHTML.indexOf(selectedText)
The issue with this is that if I highlight something like "the", a word that appears on the page several times, the index is not correct. I can understand why, but I don't know what else to do. I imagine I need to get this from the DOM somehow, as it knows the exact piece of text that I have selected.
Even if this is not possible, is there anything I can grab from the DOM that will help me make the indexOf request more reliable? I can add extra data under the hood to make sure it matches the value I want.
This ought to do the trick:
function textSelected() {
    // Declared locally so it is not an implicit global and is defined on every path
    var t = "";
    if (window.getSelection) {
        t = window.getSelection().toString() || "";
    } else if (document.selection && document.selection.type != "Control") {
        t = document.selection.createRange().text || "";
    }
    return t;
}
The key piece is window.getSelection(), which:
Returns a Selection object representing the range of text selected by
the user or the current position of the caret.
Once you have it, read whatever you need from the Selection object.
The else branch is a fallback for older versions of Internet Explorer, which expose document.selection and its text ranges instead of window.getSelection. More on document.selection:
docs
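The function above returns the selected text but not its position in the page's HTML. One way to get that position is to drop a temporary marker element at the start of the selection and search the serialized HTML for the marker instead of for the selected text. A minimal sketch, assuming a browser with window.getSelection and standard innerHTML serialization (old IE serializes tags differently); the function name and the marker id are made up for illustration, and inserting a node may disturb the visible selection in some browsers:

```javascript
// Sketch: returns the index in documentElement.innerHTML at which the current
// selection starts, or -1 if nothing is selected.
// `win` is the window to inspect (e.g. childwindow from the question).
function getSelectionStartInHtml(win) {
  var selection = win.getSelection();
  if (!selection || selection.rangeCount === 0) return -1;

  // Work on a copy of the range, collapsed to the start of the selection.
  var range = selection.getRangeAt(0).cloneRange();
  range.collapse(true);

  // Temporary marker with an id that is unlikely to exist in the page already.
  var marker = win.document.createElement("span");
  marker.id = "selection-start-marker-" + new Date().getTime();
  range.insertNode(marker);

  var html = win.document.documentElement.innerHTML;
  var index = html.indexOf('<span id="' + marker.id + '"');

  // Remove the marker again. Everything before it in the HTML is unchanged,
  // so the index is also valid for the markup without the marker.
  var parent = marker.parentNode;
  parent.removeChild(marker);
  parent.normalize(); // re-join the text node that insertNode may have split
  return index;
}
```

With index = getSelectionStartInHtml(childwindow), the character before the selection would be childwindow.document.documentElement.innerHTML.charAt(index - 1). Because the lookup is by marker rather than by text, repeated words such as "the" no longer matter.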

IE Issue with Javascript Regex replacement

r = r.replace(/<TR><TD><\/TD><\/TR>/gi, rider_html);
...does not work in IE but works in all other browsers.
Any ideas or alternatives?
I've come to the conclusion that the variable r must not have the value in it you expect because the regex replacement should work fine if there is actually a match. You can see in this jsFiddle that the replace works fine if "r" actually has a match in it.
This is the code from fiddle and it shows the proper replacement in IE.
var r = "aa<TR><TD></TD></TR>bb";
var rider_html = " foo ";
r = r.replace(/<TR><TD><\/TD><\/TR>/gi, rider_html);
alert(r);
So, we can't really go further to diagnose without knowing what the value of "r" is and where it came from or knowing something more specific about the version of IE that you're running in (in which case you can just try the fiddle in that version yourself).
If r came from the HTML of the document, then string matching on it is a bad thing because IE does not keep the original HTML around. Instead it reconstitutes it when needed from the parsed page and it puts some things in different order (like attributes), different or no quotes around attributes, different capitalization, different spacing, etc...
You could do something like this:
var rows = document.getElementsByTagName('tr');
for (var i = 0; i < rows.length; i++) {
var children = rows[i].children;
if (children.length === 1 && children[0].nodeName.toLowerCase() === 'td') {
children[0].innerHTML = someHTMLdata
}
}
Note that this sets the value of the table cell, rather than replacing the whole row. If you want to do something other than this, you'll have to use DOM methods rather than innerHTML and specify exactly what you actually want.
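If string matching on the serialized HTML has to stay, a pattern that tolerates whitespace between the tags is at least less brittle than the exact match. This is only a sketch: it assumes the tags carry no attributes, and replaceEmptyRows is a made-up helper name.

```javascript
// Matches an empty single-cell row regardless of tag case or of whitespace
// (including line breaks) between and inside the tags.
var EMPTY_ROW_PATTERN = /<tr>\s*<td>\s*<\/td>\s*<\/tr>/gi;

// Hypothetical helper: replaces every empty row in `html` with `replacement`.
function replaceEmptyRows(html, replacement) {
  // A function replacement inserts the text literally, so "$" sequences in
  // the replacement are not interpreted as special patterns.
  return html.replace(EMPTY_ROW_PATTERN, function () {
    return replacement;
  });
}
```

For the question's code this would read r = replaceEmptyRows(r, rider_html). It still fails if the browser adds attributes to the tags, which is why the DOM approach is the safer route.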

JavaScript & string length: why is this simple function slow as hell?

I'm implementing a character counter in the UI, so a user can see how many characters are left for input.
To count, I use this simple function:
function typerCount(source, layerID)
{
outPanel = GetElementByID(layerID);
outPanel.innerHTML = source.value.length.toString();
}
source contains the field whose value we want to measure
layerID contains the element ID of the object we want to put the result in (a span or div)
outPanel is just a temporary variable
If I activate this function, the machine really slows down while I type, and I can see that FF is using one core at 100%. You can't type fluently because it hangs after every few letters.
Could the problem be the source.value.length lookup in the second line?
Regards
I can't tell you why it's that slow, there's just not enough code in your example to determine that. If you want to count characters in a textarea and limit input to n characters, check this jsfiddle. It's fast enough to type without obstruction.
It could be having problems with outPanel. Every time you call that function, it will look up that DOM node. If you are targeting the same DOM node, that's very expensive for the browser if it's doing that every single time you type a character.
Also, this is too verbose:
source.value.length.toString();
This is sufficient:
source.value.length;
JavaScript is dynamically typed: the number is converted to a string automatically when it is assigned to innerHTML, so the explicit conversion isn't needed.
I doubt your problem is with the use of innerHTML or getElementById().
I would try to isolate the problem by removing parts of the function and seeing how the cpu is used. For instance, try it all these ways:
var len;
function typerCount(source, layerID)
{
len = source.value.length;
}
function typerCount(source, layerID)
{
len = source.value.length.toString();
}
function typerCount(source, layerID)
{
outPanel = GetElementByID(layerID);
outPanel.innerHTML = "test";
}
As artyom.stv mentioned in the comments, cache the result of your GetElementByID call. Also, as a side note, what is GetElementByID doing? Is it doing anything else other than calling document.getElementById?
How would you cache this, you ask?
var outPanelsById = {};
function getOutPanelById(id) {
var panel = outPanelsById[id];
if (!panel) {
panel = document.getElementById(id);
outPanelsById[id] = panel;
}
return panel;
};
function typerCount(source, layerId) {
var panel = getOutPanelById(layerId);
panel.innerHTML = source.value.length.toString();
};
I'm thinking there has to be something else going on though, as even getElementById calls are extremely fast in FF.
Also, what is "source"? Is it a DOMElement? Or is it something else?

swapping nodes in a liveset of nodes

This has been a challenge for me...
I have a set of nodes in an XML doc. I need to sort them based on a certain node value. So if I iterate through the nodes, and then the node value matches my criteria, I want it to go to the end.
The problem is that the nodes are in a live set: as soon as I move one with appendChild, the set changes under the loop and the iteration skips an entry.
This is my code so far, but as I said, it may miss an entry due to the swapping:
for (var i=1; i <= nElem; i++)
{
var node = getNode(dom,"//item[" + i + "]");
var state = getNodeValue(dom,"//item[" + i + "]/state");
if ((state != 'XX') && (i != nElem))
{
node.parentNode.appendChild(node);
}
}
What I actually want is that all items in state "XX" are at the top.
Does anyone have a good idea for this?
Thanks
You could use array.sort() and pass a custom sort routine:
1. var nodes = getNode(dom, "//item"); gets you an array of items.
2. Remove the entries in nodes from the DOM.
3. Call nodes.sort(sortfunction), where sortfunction takes two arguments, sortfunction(a, b), and returns:
   -1 if a should sort before b
   0 if they are equal
   1 if a should sort after b
4. Add the entries of nodes back to the DOM.
I think, that would do it (as long as I'm not missing something).
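The steps above can be sketched roughly as follows. The names sortNodesByRank and rankOf are made up for illustration, and the sketch assumes that all items share one parent and are its only children (other siblings would end up in front of the sorted items). It skips the explicit removal step, because appendChild on a node that is already in the tree moves it.

```javascript
// Sketch: reorders sibling nodes by a numeric rank (lower rank comes first).
// `items` is an array or array-like list of nodes; `rankOf` is a
// caller-supplied function returning a number for a node.
function sortNodesByRank(items, rankOf) {
  // Snapshot into a plain array first: unlike a live set, it does not
  // change while the nodes are being moved.
  var snapshot = Array.prototype.slice.call(items);

  snapshot.sort(function (a, b) {
    return rankOf(a) - rankOf(b); // negative, zero or positive
  });

  // Re-append in sorted order; each appendChild moves the node to the end,
  // so after the loop the children are in sorted order.
  for (var i = 0; i < snapshot.length; i++) {
    snapshot[i].parentNode.appendChild(snapshot[i]);
  }
  return snapshot;
}
```

For the question, rankOf would return 0 for items in state "XX" and 1 for everything else, which puts the "XX" items at the top. Items with equal rank keep their relative order only if the engine's sort is stable, which older engines do not guarantee.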

find words in html page with javascript

How can I search an HTML page for a word fast?
And how can I get the HTML tag that the word is in (so I can work with the entire tag)?
To find the element that word exists in, you'd have to traverse the entire tree looking in just the text nodes, applying the same test as above. Once you find the word in a text node, return the parent of that node.
var word = "foo",
queue = [document.body],
curr
;
while (curr = queue.pop()) {
if (!curr.textContent.match(word)) continue;
for (var i = 0; i < curr.childNodes.length; ++i) {
switch (curr.childNodes[i].nodeType) {
case Node.TEXT_NODE : // 3
if (curr.childNodes[i].textContent.match(word)) {
console.log("Found!");
console.log(curr);
// you might want to end your search here.
}
break;
case Node.ELEMENT_NODE : // 1
queue.push(curr.childNodes[i]);
break;
}
}
}
This works in Firefox; no promises for IE.
What it does is start with the body element and check to see if the word exists inside that element. If it doesn't, then that's it, and the search stops there. If it is in the body element, then it loops through all the immediate children of the body. If it finds a text node, then see if the word is in that text node. If it finds an element, then push that into the queue. Keep on going until you've either found the word or there's no more elements to search.
You can iterate through DOM elements, looking for a substring within them. Neither fast nor elegant, but for small HTML might work well enough.
I'd try something recursive, like: (code not tested)
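function findText(node, text) {
    if (node.childNodes.length == 0) { // leaf node
        if (node.textContent.indexOf(text) == -1) return [];
        return [node];
    }
    var matchingNodes = [];
    // Index loop: for...in over childNodes would also visit "length" and "item"
    for (var i = 0; i < node.childNodes.length; i++) {
        // concat returns a new array, so the result has to be assigned back
        matchingNodes = matchingNodes.concat(findText(node.childNodes[i], text));
    }
    return matchingNodes;
}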
You can try using XPath, it's fast and accurate
http://www.w3schools.com/Xpath/xpath_examples.asp
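A rough sketch of what the XPath route could look like, assuming a browser that supports document.evaluate (DOM Level 3 XPath; old IE does not). Both function names are made up for illustration. Note that the expression also matches text inside script and style elements, and that contains() is case-sensitive.

```javascript
// Builds an XPath 1.0 expression that selects the parent element of every
// text node containing `word`.
// Assumption: `word` does not contain both ' and " at the same time.
function buildContainsXPath(word) {
  var quote = word.indexOf('"') === -1 ? '"' : "'";
  return "//text()[contains(., " + quote + word + quote + ")]/..";
}

// Returns the matching elements as a plain array.
// `doc` is a Document that implements evaluate().
function findElementsContaining(doc, word) {
  var ORDERED_NODE_SNAPSHOT_TYPE = 7; // value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
  var result = doc.evaluate(buildContainsXPath(word), doc, null,
                            ORDERED_NODE_SNAPSHOT_TYPE, null);
  var elements = [];
  for (var i = 0; i < result.snapshotLength; i++) {
    elements.push(result.snapshotItem(i));
  }
  return elements;
}
```

In a page this would be called as findElementsContaining(document, "foo").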
Also, if XPath is too complicated, you can try a JavaScript library like jQuery, which hides the boilerplate code and makes it easier to express what you want to find.
Also, as of IE8 and the upcoming Firefox 3.5, the Selectors API is implemented. All you need to do is use CSS selectors to express what to search for.
You can probably read the body of the document tree and perform simple string tests on it fast enough without having to go far beyond that. It depends a bit on the HTML you are working with, though: how much control do you have over the pages? If you are working within a site you control, you can probably focus your search on the parts of the page likely to differ from page to page. If you are working with other people's pages, you have a tougher job on your hands, simply because you don't necessarily know what content you need to test against.
Again, if you are going to search the same page multiple times and your data set is large, it may be worth creating some kind of index in memory, whereas if you are only going to search for a few words or use smaller documents, it's probably not worth the time and complexity to build that.
Probably the best thing to do is to get some sample documents that you feel will be representative and just do a whole lot of prototyping based around the approaches people have offered here.
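As a starting point for such prototyping, the in-memory index mentioned above could be sketched like this. The function name is made up, the word splitting is a deliberately naive assumption (lowercased ASCII letters and digits only), and the index goes stale whenever the page changes.

```javascript
// Sketch: walks the tree once and maps each lowercased word to the list of
// elements that directly contain a text node with that word.
// Works on any tree exposing nodeType, nodeValue, childNodes and parentNode.
function buildWordIndex(root) {
  var TEXT_NODE = 3;
  var index = new Map();
  var stack = [root];
  while (stack.length > 0) {
    var node = stack.pop();
    if (node.nodeType === TEXT_NODE) {
      // Naive tokenizer: runs of ASCII letters and digits
      var words = node.nodeValue.toLowerCase().match(/[a-z0-9]+/g) || [];
      for (var i = 0; i < words.length; i++) {
        if (!index.has(words[i])) index.set(words[i], []);
        var elements = index.get(words[i]);
        // Record each containing element only once per word
        if (elements.indexOf(node.parentNode) === -1) elements.push(node.parentNode);
      }
    } else if (node.childNodes) {
      for (var j = 0; j < node.childNodes.length; j++) {
        stack.push(node.childNodes[j]);
      }
    }
  }
  return index;
}
```

After var index = buildWordIndex(document.body), a lookup is index.get("foo") || [], which costs one map access instead of a full tree walk per search.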
form.addEventListener("submit", (e) => {
e.preventDefault();
var keyword = document.getElementById("search_input");
let words = keyword.value;
var word = words,
queue = [document.body],
curr;
while (curr = queue.pop()) {
if (!curr.textContent.toUpperCase().match(word.toUpperCase())) continue;
for (var i = 0; i < curr.childNodes.length; ++i) {
switch (curr.childNodes[i].nodeType) {
case Node.TEXT_NODE: // 3
if (curr.childNodes[i].textContent.toUpperCase().match(word.toUpperCase())) {
console.log("Found!");
console.log(curr);
curr.scrollIntoView();
}
break;
case Node.ELEMENT_NODE: // 1
queue.push(curr.childNodes[i]);
break;
}
}
}
});
