Puppeteer.js : get href attribute of link with given text - javascript

Is there a way in Puppeteer to get the href attribute of an anchor element with the text "See more"?
I want to grab the href attribute of an element like this:
See more
Maybe it's possible to do with eval?

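You could try looping through all links and picking the one whose text matches (in Puppeteer, run this inside page.evaluate()):
[...document.querySelectorAll('a')].find(link => link.innerText === "See more")?.href
Note that querySelectorAll returns a NodeList and getElementsByTagName returns an HTMLCollection; neither is a true Array. A NodeList has forEach in current browsers, but an HTMLCollection does not, and neither has find or filter, so convert with the spread operator or Array.from first. See the linked answers on SO for details.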

Maybe you should try native JS:
const url = document.getElementById("Id_of_AnchorTag").href;
Keep in mind that you must be able to uniquely identify the anchor by some method.
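For the Puppeteer case specifically, one option is page.$$eval, which runs a callback in the page against every element matching a selector. The following is a minimal sketch, assuming `page` is a Puppeteer Page that has already navigated to the target document; the helper name getHrefByText is made up for illustration.

```javascript
// Hypothetical helper: returns the href of the first <a> whose trimmed text
// equals `text`, or null when no anchor matches.
// Assumption: `page` is a Puppeteer Page that has already loaded the document.
async function getHrefByText(page, text) {
  return page.$$eval(
    'a',
    (anchors, wanted) => {
      // This callback runs in the browser; `anchors` is a plain Array here.
      const match = anchors.find((a) => a.textContent.trim() === wanted);
      return match ? match.href : null;
    },
    text // extra arguments are passed through to the callback
  );
}
```

Usage would look like `const href = await getHrefByText(page, 'See more');`. Returning null instead of throwing leaves it to the caller to decide what a missing link means.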

Related

Combine Multiple JS Selectors with jQuery Selector

I am selecting a group of table rows by using the following line of JS:
document.getElementById('tab1_content').contentDocument.documentElement.getElementsByClassName("data1_smaller")
These represent entries in a table of contents. I want to return only those above which also contain the word 'CHAPTER', so I was attempting to use the jQuery :contains() selector to accomplish this and attempted to convert the entire thing into a single jQuery selector; so, to begin with, I tried converting the following invalid line:
document.getElementById('tab1_content').contentDocument.documentElement.getElementsByClassName("data1_smaller").$(":contains('CHAPTER')")
to this:
$("#tab1_content > contentDocument > documentElement > .data1_smaller:contains('CHAPTER')")
The selector above doesn't give an error but it fails to find anything. Does anybody know the correct way to do this?
You can achieve what you want with pure vanilla js just like you tried in the beginning. You just need to do some small adjustments to your code. You can use querySelectorAll() to query all elements matching a selector inside your ID. Something like this should work just by looking at your example, but might need some small adjustments.
[...document.getElementById('tab1_content').querySelectorAll(".data1_smaller")].filter((node) => node.textContent.includes('CHAPTER'))
// Edit, saw in the comments that you're accessing content in an iframe
[...document.getElementById('tab1_content').contentWindow.document.querySelectorAll(".data1_smaller")].filter((node) => node.textContent.includes('CHAPTER'))
I found this solution based on Anurag Srivastava's comments:
$("#tab1_content").contents().find(".data1_smaller:contains('CHAPTER')")
The issue was that I was trying to select things that are inside of an iframe, and the .contentDocument.documentElement that I used to access the iframe in JS has to be changed to .contents() in jQuery in order for it to work.
Neither contentDocument nor documentElement is a valid HTML tag, so neither can be used as an element in a selector. Try to select by id or class name.
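The vanilla approach from the answers above can be wrapped in a small function. This is only a sketch: the name findInFrame is hypothetical, and it assumes the iframe is same-origin, since contentDocument is null for a cross-origin frame.

```javascript
// Hypothetical helper: returns the elements inside a same-origin iframe that
// match `selector` and whose text contains `word`.
function findInFrame(iframe, selector, word) {
  // contentDocument is null when the frame is cross-origin. Assumption:
  // callers prefer an empty result in that case over an exception.
  const doc = iframe.contentDocument;
  if (!doc) return [];
  return Array.from(doc.querySelectorAll(selector)).filter((node) =>
    node.textContent.includes(word)
  );
}
```

With the markup from the question, the call would be `findInFrame(document.getElementById('tab1_content'), '.data1_smaller', 'CHAPTER')`.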

Click on Link with javascript

Is there a way to find a link in a web page and click on it with JavaScript code?
I also tried document.getElementById('yourLinkID').click();
but I want to use the URL instead of the ID.
Sounds like querySelector() may be what you're looking for. Something like this perhaps:
document.querySelector('a[href="your_url_here"]').click();
If you have multiple matching elements then you might also take a look at querySelectorAll() and perhaps just invoke .click() on the first matching element.
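One thing to watch for: querySelector returns null when nothing matches, so chaining .click() directly would throw a TypeError. A defensive sketch follows; the name clickLinkByHref is hypothetical, and `root` is assumed to be the document or any element containing the links.

```javascript
// Hypothetical helper: clicks the first <a> under `root` whose href attribute
// is exactly `url`. Returns true if a link was clicked, false otherwise.
function clickLinkByHref(root, url) {
  // Compare the raw attribute so relative URLs match as written in the markup;
  // the `href` property would return the resolved absolute URL instead.
  const link = Array.from(root.querySelectorAll('a[href]')).find(
    (a) => a.getAttribute('href') === url
  );
  if (!link) return false;
  link.click();
  return true;
}
```

Filtering in JavaScript rather than building an `a[href="..."]` selector string also avoids having to escape quotes or other special characters in the URL.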

select all anchor tags having onclick using xpath

I want to select all anchor tags on a web page that have an 'onClick' function defined as an attribute, using XPath.
The web page I am targeting has anchor tags like this:
Delete
After a lot of searching I found the following potential solutions, but none of them seem to work. I have tried (and failed):
//a[@onclick|@href]
//a[onclick]
//a[@onclick]
//a/@*[name()='onclick']
./*[onclick] #this should have selected all nodes with onclick function
I also tried
//a/@onclick
but this only returns the onclick function definition, whereas I want the entire anchor tag.
Question: How do I get all the anchor tags that have an onclick function defined as an attribute using XPath?
The XPath //a[@onclick] should work.
Try executing this in the console of your browser to alert each anchor:
(function(){
var withOnclick = document.evaluate('//a[@onclick]', document, null, XPathResult.ORDERED_NODE_ITERATOR_TYPE, null);
var anchor = withOnclick.iterateNext();
while (anchor) {
alert(anchor.outerHTML);
anchor = withOnclick.iterateNext();
}
}())
$("a[onclick]") should work to find all anchors with onclick if you're using jQuery
Finally I was able to do it using
//a[@onclick]
Earlier I was applying this to the wrong response object and hence was not able to get the links.
To add further, to get the 'url' values one could do:
HtmlXPathSelector(response).select('//a[@onclick]/@href').extract()
This will return the list of all the links that belong to an anchor tag having an 'onclick' function defined as an attribute.
Thanks for your answers.
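In the browser, the same XPath can also be collected into an array instead of being iterated with alert. The sketch below uses a snapshot result, which stays valid even if the DOM changes while it is being read; the function name anchorsWithOnclick is hypothetical, and `doc` is assumed to be a Document.

```javascript
// Numeric value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, written out so the
// function does not depend on the XPathResult global.
const ORDERED_NODE_SNAPSHOT_TYPE = 7;

// Hypothetical helper: returns all <a> elements in `doc` that carry an
// onclick attribute, in document order.
function anchorsWithOnclick(doc) {
  const result = doc.evaluate('//a[@onclick]', doc, null, ORDERED_NODE_SNAPSHOT_TYPE, null);
  const anchors = [];
  for (let i = 0; i < result.snapshotLength; i++) {
    anchors.push(result.snapshotItem(i));
  }
  return anchors;
}
```

Calling `anchorsWithOnclick(document).map((a) => a.outerHTML)` in the console would then list the full markup of each matching anchor.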

Replace part of innerHTML without reloading embedded videos

I have a div with id #test that contains lots of html, including some youtube-embeds etc.
Somewhere in this div there is this text: "[test]"
I need to replace that text with "(works!)".
The normal way of doing this would of course be:
document.getElementById("test").innerHTML = document.getElementById("test").innerHTML.replace("[test]","(works!)");
But the problem is that if I do that, the YouTube embeds will reload, which is not acceptable.
Is there a way to do this?
You will have to target the specific elements rather than the parent block. Assigning to innerHTML re-parses and rebuilds every child of the block, so the embedded videos are recreated and reload.
Maybe a text node (textContent) will help you; see the MSDN documentation for IE9. Other browsers should support it as well.
Change your page so that
[test]
becomes
<span id="replace-me">[test]</span>
now use the following js to find and change it
document.getElementById('replace-me').textContent = '(works!)';
If you need to change more than one place, then use a class instead of an id and use document.getElementsByClassName and iterate over the returned elements and change them one by one.
Alternatively, you can use jQuery and do it even simpler like this:
$('#replace-me').text('(works!)');
Now for this single replacement using jQuery is probably overkill, but if you need to change multiple places (by class name), jQuery would definitely come in handy :)
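If changing the markup is not an option, the text-node idea mentioned above can be sketched as a recursive walk that only edits text nodes and never touches the elements around them. The function name replaceInTextNodes is hypothetical, and the sketch assumes the search string sits entirely inside one text node (it will not match text split across several nodes).

```javascript
const TEXT_NODE = 3; // value of Node.TEXT_NODE

// Hypothetical helper: replaces every occurrence of `search` with `replacement`
// in the text nodes under `root`, leaving element nodes (such as embedded
// iframes) in place. Returns the number of text nodes that were changed.
function replaceInTextNodes(root, search, replacement) {
  let changed = 0;
  for (const child of Array.from(root.childNodes)) {
    if (child.nodeType === TEXT_NODE) {
      if (child.nodeValue.includes(search)) {
        child.nodeValue = child.nodeValue.split(search).join(replacement);
        changed++;
      }
    } else {
      // Recurse into elements; nodes without children contribute nothing.
      changed += replaceInTextNodes(child, search, replacement);
    }
  }
  return changed;
}
```

For the div in the question this would be called as `replaceInTextNodes(document.getElementById('test'), '[test]', '(works!)')`. Because only nodeValue is assigned, the existing iframe elements are never removed or re-inserted.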

Can I get the full HTML representation of an HTMLElement DOM object?

I'm using jquery to parse some HTML, something like:
$(html).contents().each(function(){
var element = this.tagName;
...
I can access the tagName, children, parent... using the DOM or the more friendly jQuery functions.
But at one point I need the whole HTML of the current element (not what innerHTML or .html() return) and I can't figure out a one-liner to get it (I could always attach the tag and the attributes manually to the innerHTML).
For example:
Link
The innerHTML is Link, but I'm looking for the whole element, including its tag and attributes.
Does such a one-liner exist?
Looks like this guy has a pretty nifty solution using jQuery: outerHTML
Just saw the answer for this on the other thread :D
outerHTML
outerHTML 2
