I am trying to get DOMElements and click on each of them. After that I want to run assertions on the response.
var nodes = this.evaluate(function(){
    var nodes = document.querySelectorAll('.editable .action');
    return nodes;
});
// Print the base URI for each node
for (var i = 0; i < nodes.length; ++i) {
    if (null != nodes[i]) {
        require('utils').dump(nodes[i].baseURI);
    }
}
I have around 5 nodes that match, but nodes[0] is the only one that is not null; the rest are null in CasperJS. However, running the same test in the Chrome browser, I get all the nodes and none of them are null.
CasperJS is built upon PhantomJS and has the same limitations. One of those limitations is that there are two contexts, and the page context, which has access to the DOM, is sandboxed. It's not possible to pass non-primitive objects such as DOM nodes out of the page context.
Documentation:
Note: The arguments and the return value to the evaluate function must be a simple primitive object. The rule of thumb: if it can be serialized via JSON, then it is fine.
Closures, functions, DOM nodes, etc. will not work!
You cannot compare this with Chrome, because Chrome doesn't have two contexts in normal operation.
You can pass representations of DOM nodes out of the page context. CasperJS has some convenience functions for this like casper.getElementsInfo(selector).
If you want to click every element, then there are different ways to achieve that depending on how the elements are positioned on the page. My answer here shows two ways with CSS selectors and XPath expressions outside of the page context.
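As a rough illustration of the second approach, here is a hedged sketch. The start URL, selector, and XPath expression are illustrative only, and it assumes the clicks don't navigate away from the page:
// Sketch only - selector and XPath are illustrative
var x = require('casper').selectXPath;

casper.start('http://example.com/', function () {
    // getElementsInfo() returns serializable snapshots of the matched nodes
    var infos = this.getElementsInfo('.editable .action');
    this.echo(infos.length + ' matching elements found');

    // Click each match by its 1-based position, using an XPath
    // evaluated outside of the page context
    for (var i = 1; i <= infos.length; i++) {
        this.click(x('(//*[contains(@class, "editable")]//*[contains(@class, "action")])[' + i + ']'));
    }
});

casper.run();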
See also:
casper.evaluate(fn, args...)
Please note that .attributes only gets the current attributes, which is not what this question is about.
I want a way to get all the attributes of a DOM element. Not just the ones that are on it now, but the ones that are possible in the future too.
The specific use case is to find the potential attributes in an SVGElement that aren't in an HTMLElement - there's a list on MDN (SVG Attribute reference), but, for obvious reasons, hardcoding is not a good idea.
My initial thought was to iterate through the prototype chain of an instance of each and compare the two lists (with basic filtering for event handlers), but this doesn't actually give the potential svg attributes.
EDIT
IMPORTANT NOTE - pasting your answer code into the console on this page and using document.body as a target should show a list of over 50 attributes, including things like contentEditable, contextMenu, dir, style, etc.
This also needs to work cross-browser.
Could something like this be what you're looking for?
It looks like a DOM element object stores empty keys for all possible attributes. Could you loop over these keys and store them in an array, then from there use something similar to filter out inherited attributes?
HTML
<svg id="blah"></svg>
Javascript
const blah = document.getElementById('blah')
let possibleKeys = []
for (let key in blah) {
  possibleKeys.push(key)
}
Here's a JSBin example ... it looks like it produces a list of all possible attributes but it would need some filtering.
See also this thread.
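To build on the idea above, here is a hedged sketch of the filtering step: collect the keys found on the <svg> element and drop any that also appear on a plain <div>. This relies on enumerable properties rather than attributes proper, so treat it as an approximation:
const blah = document.getElementById('blah');
const plainDiv = document.createElement('div');

// Keys that exist on an ordinary HTML element
const htmlKeys = new Set();
for (let key in plainDiv) {
  htmlKeys.add(key);
}

// Keep only the keys that the <svg> element has and the <div> does not
const svgOnlyKeys = [];
for (let key in blah) {
  if (!htmlKeys.has(key)) {
    svgOnlyKeys.push(key);
  }
}

console.log(svgOnlyKeys);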
Any one of these should work because they return a live HTMLCollection.
var svgElement = window.document.getElementsByClassName("someSvgClass")[0];
//assume someSvgClass is on svg element.
//var svgElement = window.document.getElementsByTagName("svg")[0];
//var svgElement = window.document.getElementsByName("mySvg")[0];
//assume svg has this name.
var svgAttributes = svgElement.attributes;
for (let i = 0; i < svgAttributes.length; i++) {
    console.log(svgAttributes[i]);
}
See the below documentation from MDN on getElementsByTagName()
The Element.getElementsByTagName() method returns a live HTMLCollection of elements with the given tag name. The subtree underneath the specified element is searched, excluding the element itself. The returned list is live, meaning that it updates itself with the DOM tree automatically. Consequently, there is no need to call Element.getElementsByTagName() several times with the same element and arguments.
The documentation for getElementsByName and getElementsByClassName says the same thing: a live node list is returned. If you'd like to try it, I created a fiddle here.
You'll see that svgAttributes list is automatically updated upon clicking "Add Attribute" without re-executing any of those functions.
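If you want to convince yourself of the liveness without the fiddle, here is a quick illustrative check (the exact counts depend on how many attributes the element starts with):
var svgElement = window.document.getElementsByClassName("someSvgClass")[0];
var svgAttributes = svgElement.attributes;

console.log(svgAttributes.length);   // e.g. 1 (just the class attribute)
svgElement.setAttribute("width", "100");
console.log(svgAttributes.length);   // now one more - same collection, no re-query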
There is no API for that and I don't think a workaround is possible because when you change an attribute on a current DOM node, the browser is responsible for re-rendering and updating the webpage in a low-level way that is not exposed to the JavaScript context.
Also, keep in mind that any correctly formatted attribute is actually valid in the context of a DOM tree, even though it might not trigger any change in the way the browser renders the page; this is especially true of data-* attributes.
There might be some vendor-specific API but that wouldn't be useful if you want cross-browser compatibility.
You need to hardcode it, sadly. Given that you specifically want the SVGElement attributes, maybe you can scrape the W3's SVG standard document to automatically create the list?
Edit: I made a snippet to easily scrape the values from the standard:
const uniq = arr => Array.from(new Set(arr));

// Grab the attribute-name cells from the "Alphabetic list of attributes" table
const nameElements = document.querySelectorAll('table[summary="Alphabetic list of attributes"] .attr-name');
const arrNameElements = Array.prototype.slice.call(nameElements);

// Strip the surrounding curly quotes and de-duplicate
const svgAttributes = uniq(arrNameElements.map(el => el.innerText.replace(/‘|’/g, '')));
Just execute it on the svg attributes page, by opening the dev console on the page and pasting in this code :)
Edit 2: I forgot the presentation attributes. I'll let you figure that one out ;)
Let's say that we have a DIV x on the page and we want to duplicate ("copy-paste") the contents of that DIV into another DIV y. We could do this like so:
y.innerHTML = x.innerHTML;
or with jQuery:
$(y).html( $(x).html() );
However, it appears that this method is not a good idea, and that it should be avoided.
(1) Why should this method be avoided?
(2) How should this be done instead?
Update:
For the sake of this question let's assume that there are no elements with ID's inside the DIV x.
(Sorry I forgot to cover this case in my original question.)
Conclusion:
I have posted my own answer to this question below (as I originally intended). I had also planned to accept my own answer :P, but lonesomeday's answer is so amazing that I have to accept it instead.
This method of "copying" HTML elements from one place to another is the result of a misapprehension of what a browser does. Browsers don't keep an HTML document in memory somewhere and repeatedly modify the HTML based on commands from JavaScript.
When a browser first loads a page, it parses the HTML document and turns it into a DOM structure. This is a relationship of objects following a W3C standard (well, mostly...). The original HTML is from then on completely redundant. The browser doesn't care what the original HTML structure was; its understanding of the web page is the DOM structure that was created from it. If your HTML markup was incorrect/invalid, it will be corrected in some way by the web browser; the DOM structure will not contain the invalid code in any way.
Basically, HTML should be treated as a way of serialising a DOM structure to be passed over the internet or stored in a file locally.
It should not, therefore, be used for modifying an existing web page. The DOM (Document Object Model) has a system for changing the content of a page. This is based on the relationship of nodes, not on the HTML serialisation. So when you add an li to a ul, you have these two options (assuming ul is the list element):
// option 1: innerHTML
ul.innerHTML += '<li>foobar</li>';
// option 2: DOM manipulation
var li = document.createElement('li');
li.appendChild(document.createTextNode('foobar'));
ul.appendChild(li);
Now, the first option looks a lot simpler, but this is only because the browser has abstracted a lot away for you: internally, the browser has to convert the element's children to a string, then append some content, then convert the string back to a DOM structure. The second option corresponds to the browser's native understanding of what's going on.
The second major consideration is to think about the limitations of HTML. When you think about a webpage, not everything relevant to the element can be serialised to HTML. For instance, event handlers bound with x.onclick = function(); or x.addEventListener(...) won't be replicated in innerHTML, so they won't be copied across. So the new elements in y won't have the event listeners. This probably isn't what you want.
So the way around this is to work with the native DOM methods:
for (var i = 0; i < x.childNodes.length; i++) {
    y.appendChild(x.childNodes[i].cloneNode(true));
}
Reading the MDN documentation will probably help to understand this way of doing things:
appendChild
cloneNode
childNodes
Now the problem with this (as with option 2 in the code example above) is that it is very verbose, far longer than the innerHTML option would be. This is when you appreciate having a JavaScript library that does this kind of thing for you. For example, in jQuery:
$('#y').html($('#x').clone(true, true).contents());
This is a lot more explicit about what you want to happen. As well as having various performance benefits and preserving event handlers, for example, it also helps you to understand what your code is doing. This is good for your soul as a JavaScript programmer and makes bizarre errors significantly less likely!
You can end up with duplicate IDs, which need to be unique.
A jQuery clone call like $(element).clone(true); will clone data and event listeners, but IDs will also be cloned. So, to avoid duplicate IDs, don't use IDs on items that need to be cloned.
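If you do need to copy elements that carry ids, one hedged workaround (jQuery 1.8+; the selectors here are illustrative) is to strip the ids from the copies before inserting them:
// Clone contents along with data and event handlers
var $copy = $('#x').contents().clone(true);

// Remove ids from the cloned elements so the originals stay unique
$copy.find('[id]').addBack('[id]').removeAttr('id');

$copy.appendTo('#y');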
It should be avoided because then you lose any handlers that may have been on that DOM element.
You can try to get around that by appending clones of the DOM elements instead of completely overwriting them.
First, let's define the task that has to be accomplished here:
All child nodes of DIV x have to be "copied" (together with all its descendants = deep copy) and "pasted" into the DIV y. If any of the descendants of x has one or more event handlers bound to it, we would presumably want those handlers to continue working on the copies (once they have been placed inside y).
Now, this is not a trivial task. Luckily, the jQuery library (and all the other popular libraries as well I assume) offers a convenient method to accomplish this task: .clone(). Using this method, the solution could be written like so:
$( x ).contents().clone( true ).appendTo( y );
The above solution is the answer to question (2). Now, let's tackle question (1):
This
y.innerHTML = x.innerHTML;
is not just a bad idea - it's an awful one. Let me explain...
The above statement can be broken down into two steps:
1. The expression x.innerHTML is evaluated.
2. The return value of that expression (which is a string) is assigned to y.innerHTML.
The nodes that we want to copy (the child nodes of x) are DOM nodes. They are objects that exist in the browser's memory. When evaluating x.innerHTML, the browser serializes (stringifies) those DOM nodes into a string (HTML source code string).
Now, if we needed such a string (to store it in a database, for instance), then this serialization would be understandable. However, we do not need such a string (at least not as an end-product).
In step 2, we are assigning this string to y.innerHTML. The browser evaluates this by parsing the string which results in a set of DOM nodes which are then inserted into DIV y (as child nodes).
So, to sum up:
Child nodes of x --> stringifying --> HTML source code string --> parsing --> Nodes (copies)
So, what's the problem with this approach? Well, DOM nodes may contain properties and functionality which cannot and therefore won't be serialized. The most important such functionality are event handlers that are bound to descendants of x - the copies of those elements won't have any event handlers bound to them. The handlers got lost in the process.
An interesting analogy can be made here:
Digital signal --> D/A conversion --> Analog signal --> A/D conversion --> Digital signal
As you probably know, the resulting digital signal is not an exact copy of the original digital signal - some information got lost in the process.
I hope you understand now why y.innerHTML = x.innerHTML should be avoided.
I wouldn't do it simply because you're asking the browser to re-parse HTML markup that has already been parsed.
I'd be more inclined to use the native cloneNode(true) to duplicate the existing DOM elements.
var node, i = 0;
while (node = x.childNodes[i++]) {
    y.appendChild(node.cloneNode(true));
}
Well it really depends. There is a possibility of creating duplicate elements with the same ID, which is never a good thing.
jQuery also has methods that can do this for you.
What I want to accomplish:
create an object by performing an XSL transformation (on an XML DOM object), let's call it 'fragment'
select some nodes by running an XPATH query on the resulting object
collect those nodes in, let's say, an array
append 'fragment' to a certain container in my HTML document
alter any one of those nodes, not by querying the DOM, but by referencing the said node like so: nodeCollection[x]
My motivation:
This is essentially a way of "caching" the objects that I know for sure I will have to alter later. So instead of walking the DOM to find them each time I need them, I want to "cache" them. It's much faster to select them with XPATH queries and once I do that I will no longer need to do any DOM traversing, which in my case can be rather slow (we're talking a lot of nodes here). I know this may generally come off as a very convoluted solution, but in my particular situation it pays off big time, at least in Firefox (see below).
How I did it in Firefox:
create a document fragment ('xml' and 'xsl' are XML DOM objects):
xsltProcessor=new XSLTProcessor();
xsltProcessor.importStylesheet(xsl);
fragment = xsltProcessor.transformToFragment(xml,document);
XPATH query:
var nodes = document.evaluate("//div", fragment.firstChild, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
collect nodes in an array:
nodeCollection = new Array();
i = 0;
node = nodes.iterateNext();
while (node) {
    nodeCollection[i] = node;
    node = nodes.iterateNext();
    i++;
}
append the fragment to a container in the HTML document:
document.getElementById("container").appendChild(fragment);
alter a node in the collection:
nodeCollection[0].style.border = '1px solid red';
.. and it works as intended.
The problems I ran into in Internet Explorer:
I used
var fragmentObject = new ActiveXObject("Msxml2.DOMDocument.3.0");
xml.transformNodeToObject(xsl,fragmentObject);
to create an object on which I can perform XPATH queries like so:
var nodes = fragmentObject.selectNodes("//div");
, but after this step I don't know how to append 'fragmentObject' to my container in the HTML document and then select individual nodes from 'nodes' and manipulate them like I did on Firefox.
What I think went wrong:
'fragmentObject' is a "Msxml2.DOMDocument.3.0", which is entirely different than a document fragment, so I can't just move nodes from it to my HTML document. If I try something like
container.appendChild(nodes[1]);
I get this error: "SCRIPT5022: DOM Exception: HIERARCHY_REQUEST_ERR (3)", which is what usually happens when trying to insert a node where it doesn't belong (or at least that's the explanation I found for this particular error).
Maybe there's some type of object that I'm unaware of that supports XPath queries and can be appended to the HTML document (?)
Instead of Msxml2.DOMDocument.3.0, try Microsoft.XMLDOM.
If you are adding the node then, as you said, it must be valid where you insert it in the DOM hierarchy; document.appendChild(), for example, will give that error, which is why we always append to document.body instead.
Similarly, check whether this container can directly accept the element.
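One workaround sometimes used with MSXML documents (a sketch only; it relies on the IXMLDOMNode .xml property and re-parses the markup, so it trades the direct node references for a re-select afterwards):
var fragmentObject = new ActiveXObject("Msxml2.DOMDocument.3.0");
xml.transformNodeToObject(xsl, fragmentObject);

// MSXML nodes live in a separate document, so serialize them and let the
// HTML parser rebuild them inside the container instead of appending directly
var container = document.getElementById("container");
container.innerHTML = fragmentObject.documentElement.xml;

// Re-collect the freshly parsed HTML nodes for later manipulation
var nodeCollection = container.getElementsByTagName("div");
nodeCollection[0].style.border = '1px solid red';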
I'm looking for a way to find out whether an element referenced in JavaScript has been inserted into the document.
Let's illustrate the case with the following code:
var elem = document.createElement('div');
// Element has not been inserted in the document, i.e. not present
document.getElementsByTagName('body')[0].appendChild(elem);
// Element can now be found in the DOM tree
jQuery has a :visible selector, but it won't give an accurate result when I need to find out whether an invisible element has been placed somewhere in the document.
Here's an easier method that uses the standard Node.contains DOM API to check whether an element is currently in the DOM:
document.body.contains(MY_ElEMENT);
CROSS-BROWSER NOTE: the document object in IE does not have a contains() method - to ensure cross-browser compatibility, use document.body.contains() instead. (or document.head.contains if you're checking for elements like link, script, etc)
Notes on using a specific document reference vs Node-level ownerDocument:
Someone raised the idea of using MY_ELEMENT.ownerDocument.contains(MY_ELEMENT) to check for a node's presence in the document. While this can produce the intended result (albeit, with more verbosity than necessary in 99% of cases), it can also lead to unexpected results, depending on use-case. Let's talk about why:
If you are dealing with a node that currently resides in a separate document, like one generated with document.implementation.createHTMLDocument(), an <iframe> document, or an HTML Import document, and use the node's ownerDocument property to check for presence in what you think will be your main, visually rendered document, you will be in a world of hurt.
The node property ownerDocument is simply a pointer to whatever document the node currently resides in. Almost every use-case of contains involves checking a specific document for a node's presence. You have no guarantee that ownerDocument is the same document you want to check - only you know that. The danger of ownerDocument is that someone may introduce any number of ways to reference, import, or generate nodes that reside in other documents. If they do so, and you have written your code to rely on ownerDocument's relative inference, your code may break. To ensure your code always produces expected results, only compare against the specifically referenced document you intend to check; don't trust relative inferences like ownerDocument.
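A small illustration of that pitfall:
// A node created and appended inside a detached scratch document
var scratchDoc = document.implementation.createHTMLDocument('scratch');
var el = scratchDoc.createElement('div');
scratchDoc.body.appendChild(el);

el.ownerDocument.contains(el);  // true  - but that's the scratch document
document.body.contains(el);     // false - the node is not in the rendered page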
Do this:
var elem = document.createElement('div');
elem.setAttribute('id', 'my_new_div');
if (document.getElementById('my_new_div')) { } //element exists in the document.
The safest way is to test directly whether the element is contained in the document:
function isInDocument(el) {
    var html = document.body.parentNode;
    while (el) {
        if (el === html) {
            return true;
        }
        el = el.parentNode;
    }
    return false;
}
var elem = document.createElement('div');
alert(isInDocument(elem));
document.body.appendChild(elem);
alert(isInDocument(elem));
You can also use jQuery.contains:
jQuery.contains( document, YOUR_ELEMENT)
Use compareDocumentPosition to see if the element is contained inside document. PPK has browser compatibility details and John Resig has a version for IE.
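For browsers with native support, a minimal sketch might look like this (the CONTAINED_BY bit is set when the element sits inside the document):
function isInDocument(el) {
    // DOCUMENT_POSITION_CONTAINED_BY means `el` is a descendant of `document`
    return !!(document.compareDocumentPosition(el) & Node.DOCUMENT_POSITION_CONTAINED_BY);
}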
function isInDocument(query) {
    return document.querySelectorAll(query).length != 0;
}
// isInDocument("#elemid")
I'm trying to reduce DOM manipulation during my javascript execution, and I thought about lazy-writing DOM elements, versus writing a large section of the DOM hidden and unhiding the needed parts later.
To do this, I've separated out all my DOM content into a JSP which I load into a variable via ajax and manipulate with jQuery. However, if jQuery is doing this manipulation in a hidden div on the DOM, I've achieved no performance gain. Help?
New Answer
I just checked a couple of your recent questions and I think I see what you're asking here.
When you call jQuery like $('selector string', context) it gets routed to this:
jQuery(context).find('selector string')
Do you see that? The context argument becomes the first argument to a [sub] call to jQuery. That in turn gets routed to this batch of shortcut code:
// Handle $(DOMElement)
if ( selector.nodeType ) {
    this[0] = selector;
    this.length = 1;
    this.context = selector;
    return this;
}
so the context argument gets stored on the returned jQuery object in the .context property.
Now judging from your comment to this question, you are aware that you should have your server side code respond with a "Content-Type: text/xml" header. Doing so will instruct the browser to automatically create a DOMDocument tree from your xml response body (FYI it's accessible in the responseXML property on the XMLHttpRequest object itself - jQuery simply passes it on to your success handler).
To answer your question: if you pass this DOMDocument object on to jQuery as the context parameter and proceed to modify attributes on this object, there is no manipulation in any hidden/temporary div that you have to worry about. The DOMDocument is simply mutated in-place.
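As a hedged illustration (the URL and element names here are made up), with an XML response the flow looks roughly like this:
$.ajax({
    url: '/some/endpoint',   // hypothetical endpoint returning Content-Type: text/xml
    dataType: 'xml',
    success: function (xmlDoc) {
        // xmlDoc is the parsed DOMDocument (the responseXML); using it as the
        // context means the attributes are changed on that tree in place -
        // no hidden div, no re-parsing
        $('item', xmlDoc).attr('processed', 'true');
    }
});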
Hope that answers your question.
Original Answer:
If you have html in a string and you need it to be nodes in a DOM structure, you've got to convert it somehow. Putting it inside a throwaway element is the only way I know to do this while leveraging the built-in browser parser:
// HTML string
var s = '<span>text</span>';
var div = document.createElement('div');
div.innerHTML = s;
var elements = div.childNodes;
Many of the libraries do this more or less, including jQuery. If it makes you feel any better, the throwaway div above is not inserted into the DOM, thereby avoiding DOM reflow (a very expensive operation), saving some cycles.
You might want to look at document fragments as a container to hold the generated DOM nodes.
var fragment = document.createDocumentFragment();

// elems: the collection of DOM nodes generated earlier
for (var e = 0; e < elems.length; e++) {
    // append each element to the fragment (nothing touches the live DOM yet)
    fragment.appendChild(elems[e]);
}