document.querySelector via textContent

Is it possible in JS to select the first element with document.querySelector solely by a given textContent, if that text is visible in the viewport when the code is executed?
I am looking for a way that avoids the following code, because it fetches all elements with the relevant tag name and then filters them by textContent. I want to select by text content directly, not filter.
document.querySelectorAll('tagName').forEach((e) => {
  if (e.textContent.includes('Delete')) {
    e.click();
  }
});

There is no CSS selector that targets textContent.
Also, if you ran your check over every element, it would be easy to predict the first element whose textContent includes the string: it would always be document.documentElement (or null if nothing matches), because an ancestor's textContent contains the text of all its descendants.
You should make your query a bit more strict.
You could probably build an XPath query for this very purpose, but that would end up slower than iterating over all the nodes yourself.
So if performance is an issue, a TreeWalker is the way to go.
Here is a function that grabs elements by textContent.
It takes optional arguments that let you specify:
- whether the match should be partial (by default it is strict, i.e. string === textContent),
- a node to start the search from (defaults to document.documentElement),
- whether you are only interested in elements without child elements.
function getElementByTextContent(str, partial, parentNode, onlyLast) {
  var filter = function(elem) {
    // onlyLast: accept only elements that have no child elements
    var isLast = onlyLast ? !elem.children.length : true;
    var contains = partial ? elem.textContent.indexOf(str) > -1 :
      elem.textContent === str;
    if (isLast && contains)
      return NodeFilter.FILTER_ACCEPT;
    // not a match: skip this element but keep walking its descendants
    return NodeFilter.FILTER_SKIP;
  };
  filter.acceptNode = filter; // for IE
  var treeWalker = document.createTreeWalker(
    parentNode || document.documentElement,
    NodeFilter.SHOW_ELEMENT, {
      acceptNode: filter
    },
    false
  );
  var nodeList = [];
  while (treeWalker.nextNode()) nodeList.push(treeWalker.currentNode);
  return nodeList;
}
// only the elements whose textContent is exactly the string
console.log('strict', getElementByTextContent('This should be found'))
// all elements whose textContent contain the string (your code)
console.log('partial', getElementByTextContent('This should', true))
// only the elements whose textContent is exactly the string and which have no child elements
console.log('strict onlyLast', getElementByTextContent('This should be found', false, null, true))
<p><span>This should be found</span></p>
<span>This should only in partial mode</span><br>
<span>This must not be found</span>
<!-- p should not be found in onlyLast mode -->
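
As a rough sketch of the XPath route mentioned above (not benchmarked here, so the performance remark stays the answerer's claim): the helpers below build an expression that matches elements having a direct text node equal to a given string. The names xpathLiteral and xpathForDirectText are made up for this illustration. XPath 1.0 has no escape character inside string literals, so a string containing both quote types is assembled with concat().

```javascript
// Hypothetical helper: turn an arbitrary string into an XPath 1.0 string literal.
function xpathLiteral(text) {
  if (!text.includes("'")) return "'" + text + "'";
  if (!text.includes('"')) return '"' + text + '"';
  // Both quote characters present: glue single-quoted pieces together with concat().
  var parts = text.split("'").map(function (part) { return "'" + part + "'"; });
  return "concat(" + parts.join(", \"'\", ") + ")";
}

// Hypothetical helper: elements with a direct child text node equal to `text`
// (after whitespace normalization), so ancestors are not matched.
function xpathForDirectText(text) {
  return "//*[text()[normalize-space(.) = " + xpathLiteral(text) + "]]";
}

console.log(xpathForDirectText("Delete"));

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== "undefined") {
  var result = document.evaluate(
    xpathForDirectText("Delete"), document, null,
    XPathResult.FIRST_ORDERED_NODE_TYPE, null
  );
  console.log(result.singleNodeValue); // first matching element, or null
}
```

Asking for FIRST_ORDERED_NODE_TYPE returns only the first match in document order, which is closest to what document.querySelector does.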

No, there is not. document.querySelector only accepts a string argument that describes one or more CSS selectors separated by commas. You cannot give document.querySelector text content to match against.
You will have to check the content of each node differently, one way being the approach you've described in the question.
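
If only the first match is needed, a small variation on that approach stops at the first hit instead of visiting every element. This is just a sketch: findFirstByText is a made-up name, and the stand-in objects only exist so the snippet runs outside a browser.

```javascript
// Hypothetical helper: first item of an iterable of elements whose trimmed
// textContent equals `text`, or null when nothing matches.
function findFirstByText(elements, text) {
  var match = Array.from(elements).find(function (el) {
    return el.textContent.trim() === text;
  });
  return match || null;
}

// Stand-in objects so the sketch runs without a DOM; in a page you would pass
// something like document.querySelectorAll('button').
var fakeButtons = [
  { textContent: " Cancel " },
  { textContent: "Delete" },
  { textContent: "Delete" }
];
console.log(findFirstByText(fakeButtons, "Delete") === fakeButtons[1]); // true
```

This is still filtering, just with an early exit; there is no way around inspecting the nodes one by one.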

Related

Check if an HTML string only has element children (or whitespace between elements) and no element is unknown

I try to test if a string contains some HTML text with some specific properties:
Everything at the top level needs to be wrapped in a tag, so "<div>abc</div><div>xyz</div>" is valid, but "<div>abc</div> 123 <div>xyz</div>" is not. Whitespace between tags is fine.
Every tag needs to be an existing HTML tag, so "<div></div><x></x>" or "<div><x></x></div>" are both invalid since <x></x> is an unknown tag.
console.log(/(<|&lt;)br\s*\/?(>|&gt;)|(<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)<\/\1>)/.test('<br/> span>test<span>'))
// test <br/> -> (<|&lt;)br\s*\/?(>|&gt;)
// test the rest tags -> (<([A-Za-z][A-Za-z0-9]*)\b[^>]*>(.*?)<\/\1>)
Also, I tried using DOMParser:
function isValidHTML(html) {
  const parser = new DOMParser();
  const doc = parser.parseFromString(html, "text/html");
  if (doc.documentElement.querySelector("parsererror")) {
    return doc.documentElement.querySelector("parsererror").innerText;
  } else {
    return true;
  }
}
console.log(isValidHTML("<span>test</span> 123 <p>ss</p>"))
Here, I expect an error, but it returns true.
According to the code, I expect to get false, because my code is not “valid” HTML. How to fix the code?

First of all, note that something like <div>abc</div> 123 <div>xyz</div> is a valid HTML fragment.
Checking for your requirements and checking if an HTML string is something that would commonly be referred to as “valid” are two very different things.
Your requirements ask for a function that I’m going to call htmlStringHasElementOrWhitespaceRootsAndNoUnknownElements, because what you’re looking for is a function that takes a string (an HTML string, presumably) and returns a boolean based on whether the HTML string has certain properties. (htmlStringHas…)
Those properties are:
- The Nodes, when parsed from the string, consist of either Elements or Text nodes that contain only whitespace. These Nodes are all at the root of the parsed structure (see note 1). (…ElementOrWhitespaceRoots…)
- The Elements are all defined in HTML. (…AndNoUnknownElements)
This is a function that checks for these properties:
const htmlStringHasElementOrWhitespaceRootsAndNoUnknownElements = (string) => {
  const parsed = new DOMParser().parseFromString(string, "text/html").body;
  return Array.from(parsed.childNodes)
      .every(({ nodeType, textContent }) => (nodeType === Document.ELEMENT_NODE || nodeType === Document.TEXT_NODE) && (nodeType !== Document.TEXT_NODE || !textContent.trim()))
    && Array.from(parsed.querySelectorAll("*"))
      .every((node) => !(node instanceof HTMLUnknownElement));
};
console.log(htmlStringHasElementOrWhitespaceRootsAndNoUnknownElements("<br><span>test</span> test <div>aa8<x></x><y>asd</y></div>")); // false
every is used to check validity on every Node.
Alternatively, if you want to remove those “invalid” nodes, use filter and call the remove method (either for Elements or for CharacterData nodes, which Texts inherit from) on each node using forEach:
Array.from(parsed.childNodes)
  .filter(({ nodeType, textContent }) => (nodeType !== Document.ELEMENT_NODE && nodeType !== Document.TEXT_NODE) || (nodeType === Document.TEXT_NODE && textContent.trim()))
  .concat(Array.from(parsed.querySelectorAll("*"))
    .filter((node) => node instanceof HTMLUnknownElement))
  .forEach((node) => node.remove());
I’ve started by filtering the set of valid nodes, then negated the predicate, and simplified using De Morgan’s laws.
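
To make that negation step concrete, here is a small sketch that writes both predicates side by side and runs them on stand-in nodes (plain objects, so it works without a DOM). The names isValidRoot and isInvalidRoot are only for illustration; the numeric constants are the standard nodeType values.

```javascript
var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE
var TEXT_NODE = 3;    // same value as Node.TEXT_NODE

// Predicate from the `every` version: the node is an acceptable root.
function isValidRoot(node) {
  return (node.nodeType === ELEMENT_NODE || node.nodeType === TEXT_NODE)
    && (node.nodeType !== TEXT_NODE || !node.textContent.trim());
}

// Negated and simplified with De Morgan's laws, as in the `filter` version.
function isInvalidRoot(node) {
  return (node.nodeType !== ELEMENT_NODE && node.nodeType !== TEXT_NODE)
    || (node.nodeType === TEXT_NODE && Boolean(node.textContent.trim()));
}

// Stand-in nodes: an element, whitespace text, non-whitespace text, a comment (nodeType 8).
var samples = [
  { nodeType: 1, textContent: "abc" },
  { nodeType: 3, textContent: "  \n" },
  { nodeType: 3, textContent: " 123 " },
  { nodeType: 8, textContent: "note" }
];
samples.forEach(function (node) {
  // The two results are always opposite.
  console.log(isValidRoot(node), isInvalidRoot(node));
});
```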
Since the function name is unwieldy, let’s abbreviate that to validHTMLString for now, although you must document what you define as “valid”.
Some test cases:
validHTMLString("<div></div><div></div>"); // true
validHTMLString("<x></x>"); // false
validHTMLString("<div><x></x></div>"); // false
validHTMLString("<img/> <span>test</span>"); // true
validHTMLString("a <div>b</div> c"); // false
Please note that there are some major caveats with this:
First, you’ve been asking about “valid” HTML for a while, but usually “valid HTML” means “conforms to the HTML specification”, which can be checked by an HTML validator.
This is non-trivial to check by yourself, since DOMParser will apply exactly the same fixes to broken HTML that your browser will apply for any website it encounters.
Something like validHTMLString("<p><p></p></p><input></input><span>") will therefore result in true, despite containing three errors (or four errors, as the validator counts).
But DOMParser is the best tool we have, other than writing our own validator from scratch or searching for an existing one.
Regular expressions are guaranteed to be insufficient for the purpose of validating arbitrary HTML strings.
You could attempt comparing the result of serializing the parsed result with the original string, but the serialization includes unrelated fixes which don’t cause a validation error. Example: tags like <img/> are serialized as <img />.
Second, custom elements exist.
Something like <my-element> may be a valid element, with its own class, derived from HTMLElement, after it has been defined.
1: When DOMParser parses HTML, it will try to create a valid HTML document. Your root nodes are the childNodes of the created body.

jQuery.find() returns an object even when there's no matching child element in the DOM

I am trying to find an element with the ID '' that is within the element '', and therefore is its child.
I am using the $.find method to perform the search.
If the child object is found, I'd like to perform some actions, and if the child object isn't found, I'd like to do different things.
However, even though I know that there is no such child element existing, the jQuery.find method reports an object that I am not sure, from inspecting in the Watches window, what it is.
Here's the relevant code snippet:
function CreateResourceKeyTextBox(resourceKeyId, editMode) {
var resourceKeyTableCell = $("#tdKeyResourceKeyId" + resourceKeyId);
var resourceKeyNameTextBox = null;
var alreadyExistingResourceKeyNameTextBox = resourceKeyTableCell.find('#txtResourceKeyName' + resourceKeyId);
if (alreadyExistingResourceKeyNameTextBox != null && typeof alreadyExistingResourceKeyNameTextBox != "undefined") {
resourceKeyTableCell.html('');
resourceKeyNameTextBox = alreadyExistingResourceKeyNameTextBox;
resourceKeyNameTextBox.css('display', 'block');
resourceKeyNameTextBox.appendTo('#tdKeyResourceKeyId' + resourceKeyId);
resourceKeyNameTextBox.css('width', '96%');
}

jQuery query functions always return an object, even if there are no matching DOM elements.
Check the length; it will be 0 if there is no element in the set:
if (alreadyExistingResourceKeyNameTextBox.length ...

jQuery's find method returns a jQuery object whose internal set of matched elements corresponds to your CSS selector.
If the selector fails to match any elements, that internal set is empty. You can get the matched elements as a plain array with the .get method:
var elems = $(css_selector).get()
This returns an array of DOM elements, not a jQuery object, and you can check for an empty array as follows:
var elems = $(css_selector).get()
if (elems.length === 0) {
  // array is empty
} else {
  // array is not empty
}
This behaviour of jQuery avoids errors you might otherwise get: jQuery works without errors whether or not your CSS selector matches any DOM elements. This is beneficial in most cases, where you simply apply changes to whatever elements matched, if any. If the existence of such elements is critical to your business logic, you should check for it explicitly.

You should use alreadyExistingResourceKeyNameTextBox.length != 0 instead I think

If no element is found, jQuery's .find() method always returns an empty jQuery object. If you are getting anything other than that, you need to check your DOM. You can always check the length of the result, i.e. result.length > 0 or result.length === 1, depending on your need.
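
The answers above boil down to a length check. A minimal sketch of why the null test in the question never fails, using plain array-like stand-ins instead of real jQuery objects so it runs anywhere (hasMatches and the sample ID are made up for this example):

```javascript
// Hypothetical helper: true when a jQuery object (or any array-like) holds
// at least one element.
function hasMatches(collection) {
  return collection != null && collection.length > 0;
}

// Stand-ins for jQuery results: an empty set is still an object, with length 0.
var emptySet = { length: 0 };
var oneMatch = { length: 1, 0: { id: "txtResourceKeyName1" } };

// The check from the question passes even for the empty set...
console.log(emptySet != null && typeof emptySet != "undefined"); // true
// ...whereas the length check tells the two cases apart.
console.log(hasMatches(emptySet)); // false
console.log(hasMatches(oneMatch)); // true
```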

Select all elements with a "data-xxx" attribute without using jQuery

Using only pure JavaScript, what is the most efficient way to select all DOM elements that have a certain data- attribute (let's say data-foo)?
The elements may be different, for example:
<p data-foo="0"></p><br/><h6 data-foo="1"></h6>

You can use querySelectorAll:
document.querySelectorAll('[data-foo]');

document.querySelectorAll("[data-foo]")
will get you all elements with that attribute.
document.querySelectorAll("[data-foo='1']")
will only get you ones with a value of 1.

document.querySelectorAll('[data-foo]')
returns a list of all elements that have the attribute data-foo.
If you want an element whose data attribute has a specific value, e.g. given
<div data-foo="1"></div>
<div data-foo="2"></div>
to get the div with data-foo set to "2":
document.querySelector('[data-foo="2"]')
But here comes the twist: what if you want to match the data attribute value against a variable's value? For example, to get the element whose data-foo attribute is set to i:
var i=2;
you can select it dynamically using a template literal:
document.querySelector(`[data-foo="${i}"]`)
Note that even if you don't quote the value in the markup, it is stored as a string. If you write
<div data-foo=1></div>
and then inspect the element in Chrome developer tools, it is shown as
<div data-foo="1"></div>
You can also verify this by running the code below in the console:
console.log(typeof document.querySelector(`[data-foo="${i}"]`).dataset.foo)
Why dataset.foo when the attribute is data-foo? The dataset property drops the data- prefix and converts the rest of the name to camelCase, so data-foo becomes dataset.foo and data-foo-bar would become dataset.fooBar.
I have referred below links:
MDN: data-*
MDN: Using data attributes
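
For reference, the name mapping between a data-* attribute and its dataset key can be sketched as a plain string function. dataAttrToDatasetKey is a made-up name, and this only illustrates the rule (drop the data- prefix, then camelCase the rest); the browser performs the conversion for you.

```javascript
// Hypothetical helper mirroring the data-* to dataset name mapping:
// drop the "data-" prefix, then turn each "-x" (x a lowercase ASCII letter) into "X".
function dataAttrToDatasetKey(attrName) {
  if (attrName.indexOf("data-") !== 0) {
    throw new Error("not a data-* attribute: " + attrName);
  }
  return attrName.slice(5).replace(/-([a-z])/g, function (_, letter) {
    return letter.toUpperCase();
  });
}

console.log(dataAttrToDatasetKey("data-foo"));     // "foo"
console.log(dataAttrToDatasetKey("data-foo-bar")); // "fooBar"
```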

<!DOCTYPE html>
<html>
  <head></head>
  <body>
    <p data-foo="0"></p>
    <h6 data-foo="1"></h6>
    <script>
      var a = document.querySelectorAll('[data-foo]');
      // a NodeList is array-like, so iterate by index
      for (var i = 0; i < a.length; i++) {
        alert(a[i].getAttribute('data-foo'));
      }
    </script>
  </body>
</html>

Here is an interesting solution: it uses the browser's CSS engine to add a dummy property to elements matching the selector and then evaluates the computed style to find the matched elements:
It does dynamically create a style rule [...] It then scans the whole document (using the much decried and IE-specific but very fast document.all) and gets the computed style for each of the elements. We then look for the foo property on the resulting object and check whether it evaluates as “bar”. For each element that matches, we add to an array.

Native JavaScript's querySelector and querySelectorAll methods can be used to target the element(s). Use a template string if your dataset value is a variable.
var str = "term";
var term = document.querySelectorAll(`[data-type="${str}"]`); // quote the value so spaces or digits don't break the selector
console.log(term[0].textContent);
var details = document.querySelector('[data-type="details"]');
console.log(details.textContent);
<dl>
<dt data-type="term">Thing</dt>
<dd data-type="details">The most generic type.</dd>
</dl>
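
When the value comes from a variable it may contain quotes or backslashes, which would break the selector string. A minimal sketch of building the selector defensively is below; attrSelector is a hypothetical helper that only handles backslashes, double quotes and line feeds. In browsers, the built-in CSS.escape() is another option for escaping values.

```javascript
// Hypothetical helper: build [name="value"] with the value written as a quoted CSS string.
function attrSelector(name, value) {
  var escaped = String(value)
    .replace(/\\/g, "\\\\")  // backslash -> \\
    .replace(/"/g, '\\"')    // double quote -> \"
    .replace(/\n/g, "\\a "); // line feed -> \a (CSS hex escape)
  return "[" + name + '="' + escaped + '"]';
}

console.log(attrSelector("data-type", "term"));     // [data-type="term"]
console.log(attrSelector("data-type", 'say "hi"')); // [data-type="say \"hi\""]

// Browser-only usage; skipped where there is no DOM.
if (typeof document !== "undefined") {
  console.log(document.querySelectorAll(attrSelector("data-type", "term")).length);
}
```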

var matches = [];
var allDom = document.getElementsByTagName("*");
for (var i = 0; i < allDom.length; i++) {
  var d = allDom[i];
  // attributes are not exposed as same-named properties, so test the attribute itself
  if (d.hasAttribute("data-foo")) {
    matches.push(d);
  }
}
Not sure who dinged me with a -1, but here's the proof.
http://jsfiddle.net/D798K/2/

While not as pretty as querySelectorAll (which has a litany of issues), here's a very flexible function that recurses the DOM and should work in most browsers (old and new). As long as the browser supports your condition (i.e. data attributes), you should be able to retrieve the element.
To the curious: Don't bother testing this vs. QSA on jsPerf. Browsers like Opera 11 will cache the query and skew the results.
Code:
function recurseDOM(start, whitelist)
{
    /*
    *    @start:        Node    -    Specifies point of entry for recursion
    *    @whitelist:    Object  -    Specifies permitted nodeTypes to collect
    */
    var i = 0,
    startIsNode = !!start && !!start.nodeType,
    // guard on startIsNode so a null start does not throw
    startHasChildNodes = startIsNode && !!start.childNodes && !!start.childNodes.length,
    nodes, node, nodeHasChildNodes;
    if(startIsNode && startHasChildNodes)
    {
        nodes = start.childNodes;
        for(i;i<nodes.length;i++)
        {
            node = nodes[i];
            nodeHasChildNodes = !!node.childNodes && !!node.childNodes.length;
            if(!whitelist || whitelist[node.nodeType])
            {
                //condition here
                if(!!node.dataset && !!node.dataset.foo)
                {
                    //handle results here
                }
                if(nodeHasChildNodes)
                {
                    recurseDOM(node, whitelist);
                }
            }
            node = null;
            nodeHasChildNodes = null;
        }
    }
}
You can then initiate it with the following:
recurseDOM(document.body, {"1": 1}); for speed, or just recurseDOM(document.body);
Example with your specification: http://jsbin.com/unajot/1/edit
Example with differing specification: http://jsbin.com/unajot/2/edit
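
If you want the matches returned rather than handled inside the loop, a non-recursive variant with an explicit stack is one way to do it. This is only a sketch and not the function above: collectNodes and hasDataFoo are made-up names, and the stand-in tree exists so the snippet runs without a DOM.

```javascript
// Hypothetical helper: depth-first, document-order walk that returns matching nodes.
// Works on anything exposing an indexable `childNodes` with a `length`.
function collectNodes(root, predicate) {
  var matches = [];
  var stack = [root];
  while (stack.length) {
    var node = stack.pop();
    // Like recurseDOM, only descendants of the start node are tested.
    if (node !== root && predicate(node)) matches.push(node);
    var children = node.childNodes || [];
    // Push children in reverse so the first child is processed first.
    for (var i = children.length - 1; i >= 0; i--) stack.push(children[i]);
  }
  return matches;
}

function hasDataFoo(node) {
  return node.nodeType === 1 && !!node.dataset && node.dataset.foo !== undefined;
}

// Stand-in tree so the sketch runs without a DOM; in a page, pass document.body.
var tree = {
  nodeType: 1, dataset: {}, childNodes: [
    { nodeType: 1, dataset: { foo: "0" }, childNodes: [
      { nodeType: 3 },
      { nodeType: 1, dataset: { foo: "1" }, childNodes: [] }
    ] },
    { nodeType: 1, dataset: {}, childNodes: [] },
    { nodeType: 1, dataset: { foo: "2" }, childNodes: [] }
  ]
};
var found = collectNodes(tree, hasDataFoo).map(function (n) { return n.dataset.foo; });
console.log(found); // [ '0', '1', '2' ]
```

Comparing against undefined rather than testing truthiness also keeps elements whose data-foo value is an empty string.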

forcing firefox skip "text nodes" in DOM parsing by javascript

Hi
I'm writing JavaScript code to traverse the HTML DOM and highlight elements.
My problem is that Firefox returns whitespace as text nodes.
Is there any way to force it to return only tags? For example, I need "firstChild" to always return the first tag and never any text!
Thanks

You can check if a node is an element with node.nodeType === 1.
You can also implement the newer DOM traversal API as a function.
var dummy = document.createElement("div");
var firstElementChild = ('firstElementChild' in dummy)
  ? function (el) {
      return el.firstElementChild;
    }
  : function (el) {
      el = el.firstChild;
      while (el && el.nodeType !== 1)
        el = el.nextSibling;
      return el;
    };
Usage:
firstElementChild(el)

You can use element.firstElementChild instead. Unfortunately, this isn't supported in IE8 and below.
Alternatively, you might want to write a small function to crawl the childNodes until you find the next element node.
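
Such a crawling function could look roughly like the sketch below. nextElement is a made-up name, and the sibling chain is built from plain objects so the snippet runs without a DOM; real nodes expose the same nodeType and nextSibling properties.

```javascript
var ELEMENT_NODE = 1; // same value as Node.ELEMENT_NODE

// Hypothetical helper: step through siblings until an element node turns up.
function nextElement(node) {
  var current = node ? node.nextSibling : null;
  while (current && current.nodeType !== ELEMENT_NODE) {
    current = current.nextSibling;
  }
  return current || null;
}

// Stand-in sibling chain: <a>, whitespace text, comment, <b>.
var b = { nodeType: 1, nodeName: "B", nextSibling: null };
var comment = { nodeType: 8, nextSibling: b };
var text = { nodeType: 3, nextSibling: comment };
var a = { nodeType: 1, nodeName: "A", nextSibling: text };

console.log(nextElement(a).nodeName); // "B"
console.log(nextElement(b));          // null
```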

Maybe you could try one of the other DOM traversal methods, such as a TreeWalker.

Versatile xml attribute regex with javascript

Basically I have an xml document and the only thing I know about the document is an attribute name.
Given that information, I have to find out if that attribute name exists, and if it does exist I need to know the attribute value.
for example:
<xmlroot>
<ping zipcode="94588" appincome = "1750" ssn="987654321" sourceid="XX9999" sourcepw="ioalot">
<status statuscode="Success" statusdescription="" sessionid="1234" price="12.50">
</status>
</ping>
</xmlroot>
I have the names appincome and sourceid. What are their values?
Also, if there are two appincome attributes in the document I need to know that too; I don't need their values, just that more than one match exists.

Regular expressions may not be the best tool for this, particularly if your JS is running in reasonably modern browsers with XPath support. This regex should work, but beware of false positives if you don't have tight control over the document's contents:
var match, rx = /\b(appincome|sourceid)\s*=\s*"([^"]*)"/g;
while (match = rx.exec(xml)) {
// match[1] is the name
// match[2] is the value
// this loop executes once for each instance of each attribute
}
Alternatively, try this XPath, which won't generate false positives:
var node, nodes = xmldoc.evaluate("//@appincome|//@sourceid", xmldoc, null, XPathResult.UNORDERED_NODE_ITERATOR_TYPE, null);
while (node = nodes.iterateNext()) {
  // node.nodeName is the name
  // node.nodeValue is the value
  // this loop executes once for each instance of each attribute
}
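
To also answer the "more than one match" part of the question, the regex approach can be wrapped so that it counts occurrences. The sketch below is one way to do that, with findAttribute as a made-up name. It handles double-quoted values only and shares the false-positive caveat above, since it scans raw text rather than parsing the XML.

```javascript
// Hypothetical helper: count occurrences of an attribute name in raw XML text
// and report the value of the first occurrence (double-quoted values only).
function findAttribute(xml, name) {
  // Escape regex metacharacters in the attribute name.
  var safeName = name.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  var rx = new RegExp("\\b" + safeName + "\\s*=\\s*\"([^\"]*)\"", "g");
  var matches = Array.from(xml.matchAll(rx));
  return {
    count: matches.length,
    value: matches.length ? matches[0][1] : null,
    duplicated: matches.length > 1
  };
}

var xml = '<xmlroot>' +
  '<ping zipcode="94588" appincome = "1750" ssn="987654321" sourceid="XX9999" sourcepw="ioalot">' +
  '<status statuscode="Success" statusdescription="" sessionid="1234" price="12.50"></status>' +
  '</ping></xmlroot>';

console.log(findAttribute(xml, "appincome"));
console.log(findAttribute(xml, "sourceid"));
console.log(findAttribute(xml, "missing"));
```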
