How to manipulate the DOM before in-page scripts are executed? - javascript

Using Puppeteer, how can I run a script in the page context, with the full DOM available, before the in-page JS is executed?
For example, how can I run the following script to remove alt attributes from img elements, before any of the page JS is run?
document.querySelectorAll('img[alt]').forEach(
e => e.removeAttribute('alt')
)
(page.evaluateOnNewDocument looks like it would be useful, but it appears to be executed before the page content is available--at the point at which it runs, the page is blank.)

I think the way to achieve what you are looking for is the following:
1. Disable JavaScript with page.setJavaScriptEnabled(false).
2. Navigate to the page.
3. Extract all the scripts, and the HTML without the scripts.
4. Re-enable JavaScript with page.setJavaScriptEnabled(true).
5. Navigate with page.goto(`data:text/html,${HTMLWithoutScript}`), using the HTML from step 3.
6. Execute your own script.
7. Inject the original scripts extracted in step 3 with page.addScriptTag({ content: script }).
Example
Here is a reproduction of the problem in your example:
const puppeteer = require('puppeteer');
const html = `
<html>
<head></head>
<body>
<img src="https://picsum.photos/200/300?image=1062" alt="dog ">
<img src="https://picsum.photos/200/300?image=1072" alt="car ">
<div class="alts">List of alts: </div>
<script>
const images = document.querySelectorAll('img');
const altsContainer = document.querySelector('.alts');
images.forEach(image => {
const alt = image.getAttribute('alt') || 'missing alt ';
altsContainer.insertAdjacentHTML('beforeend', alt);
})
</script>
</body>
</html>`;
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.goto(`data:text/html,${html}`);
await page.evaluate(() => {
document.querySelectorAll('img[alt]').forEach(
e => e.removeAttribute('alt')
)
});
await page.screenshot({ path: 'image.png' });
await browser.close();
})();
This code produces:
So removing the alt attributes has no effect here: the in-page script has already read them by the time page.evaluate runs.
Solution
(async () => {
const browser = await puppeteer.launch();
const page = await browser.newPage();
await page.setJavaScriptEnabled(false);
await page.goto(`data:text/html,${html}`);
const { script, HTMLWithoutScript } = await page.evaluate(() => {
const script = document.querySelector('script').innerHTML;
document.querySelector('script').innerHTML = '';
const HTMLWithoutScript = document.body.innerHTML;
return { script, HTMLWithoutScript }
});
await page.setJavaScriptEnabled(true);
await page.goto(`data:text/html,${HTMLWithoutScript}`);
await page.evaluate(() => {
document.querySelectorAll('img[alt]').forEach(
e => e.removeAttribute('alt')
)
});
await page.addScriptTag({ content: script });
await page.screenshot({ path: 'image.png' });
await browser.close();
})();
This produces the result you expect in your question.
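One caveat with the data: URL step is that characters such as '#' or '%' in the markup can truncate or corrupt the URL, and the solution above only handles a single inline script. The following is a minimal sketch of the same flow under those assumptions, not a definitive implementation. The names toDataUrl, runBeforePageScripts and patchDom are hypothetical; page is assumed to be a Puppeteer Page, and only setJavaScriptEnabled, goto, evaluate and addScriptTag are called on it.

```javascript
// Encode the markup so characters such as '#' or '%' survive inside the data: URL.
function toDataUrl(html) {
  return `data:text/html;charset=utf-8,${encodeURIComponent(html)}`;
}

// Hypothetical helper: load `url` with scripts disabled, strip every <script>,
// reload the script-free markup, run `patchDom` in the page, then re-inject
// the original scripts in document order.
async function runBeforePageScripts(page, url, patchDom) {
  await page.setJavaScriptEnabled(false);
  await page.goto(url);

  const { scripts, htmlWithoutScripts } = await page.evaluate(() => {
    const nodes = Array.from(document.querySelectorAll('script'));
    // External scripts are remembered by URL, inline scripts by their source text.
    const scripts = nodes.map(node =>
      node.src ? { url: node.src } : { content: node.textContent }
    );
    nodes.forEach(node => node.remove());
    return { scripts, htmlWithoutScripts: document.documentElement.outerHTML };
  });

  await page.setJavaScriptEnabled(true);
  await page.goto(toDataUrl(htmlWithoutScripts));

  // The DOM is complete and no page script has run yet.
  await page.evaluate(patchDom);

  for (const script of scripts) {
    await page.addScriptTag(script);
  }
}
```

With a real page this could be called as runBeforePageScripts(page, url, () => document.querySelectorAll('img[alt]').forEach(e => e.removeAttribute('alt'))). Because the markup is reloaded from a data: URL, relative URLs in the original page (images, stylesheets) will no longer resolve, and script attributes such as type="module" are not preserved by this sketch.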

You can move your script tags into the body instead of the head, then run your script from the body's onload handler. According to MDN, the load event fires when the object has been loaded. Example code is below:
function removeAlt(){
document.querySelectorAll('img[alt]').forEach((e)=>{
e.removeAttribute('alt');
});
}
<body onload="removeAlt()">
<img src="http://placehold.it/64x64" alt="1">
<img src="http://placehold.it/64x64" alt="2">
</body>
Let me know whether this fits your requirement. I tested it, and the function removes the alt attributes from the images.

Related

How to combine cheerio with Puppeteer so it can click on elements

I tried using cheerio to find the element, and if the element is found it should be clicked, but I don't know how to combine this with Puppeteer. The button I want to click is in the 3rd picture.
await page.waitForTimeout(10000)
const contentHTML = await page.content();
const $ = cheerio.load(contentHTML);
const outerHTML = $('<button class="sc-nkuzb1-0 sc-d5trka-0 dsOMxw button" data-theme="home.verifyButton">Authenticate</button>').prop('innerText');
console.log(outerHTML);

HTML element capture using html2canvas error

Hi, I'm writing code using MediaPipe JavaScript and I want to capture the output canvas, which has the landmarks and lines. I'm trying to use html2canvas but I'm getting errors. Could you help me or suggest other solutions? In the end I want to capture the image, store it in sessionStorage or Firestore, and show it on the next HTML page.
This is the part where the HTML elements are.
The output_canvas is where the MediaPipe landmarks are shown.
When I run this code I get the error TypeError: Cannot read properties of undefined (reading 'toDataURL')
<body>
<div class="container">
<div class = "ui-element" id="angle21"></div>
<div class="ui-element" id = "clock"></div>
<div class="ui-element" id = "clock2"></div>
<video class="input_video"></video>
<canvas class="output_canvas" width="1280px" height="720px"></canvas>
<div class="landmark-grid-container"></div>
</div>
</body>
And this is the script part where I use the html2canvas
const takeScreenShot = async () => {
//const screenshotTarget = document.body;
var screenshotTarget = document.querySelector('.output_canvas');
// document.querySelectorAll('.ui-element').forEach(element => {
// element.style.display = 'none'
// })
//var screenshotTarget=document.querySelector('.output_canvas');
const canvas = html2canvas(screenshotTarget, {
allowTaint:true,
foreignObjectRendering: true
});
console.log(canvas)
const base64image = canvas[1].toDataURL();
window.location.href = base64image;
// document.querySelectorAll('.ui-element').forEach(element => {
// element.style.display = 'block'
// })
}
setTimeout(async () => {
await takeScreenShot()
}, 1000 * 8)

Node.js Puppeteer UnhandledPromiseRejectionWarning trying to navigate Google Maps

(node:15348) UnhandledPromiseRejectionWarning: Error: Execution context was destroyed, most likely because of a navigation.
const browser = await puppeteer.launch({headless: false});
const page = await browser.newPage();
page.goto("https://www.google.com/maps/place/Faruk+G%C3%BCll%C3%BCo%C4%9Flu+-+Sunny/#41.0298046,28.7909262,13z/data=!4m8!1m2!2m1!1sfaruk+gulluoglu!3m4!1s0x14caa4f77579848b:0x37c42d8b0cecc146!8m2!3d41.0298046!4d28.8151116");
page.waitFor
const seeAllReviewsButton = "#pane > div > div.widget-pane-content.scrollable-y > div > div > div:nth-child(45) > div > div > button > span";
page.click(seeAllReviewsButton);
I can't navigate to Google Maps Link Of A Business.
A few corrections are needed: you need to await page.goto, page.waitFor, and page.click. Most importantly, page.waitFor() is a method that takes a string, a number, or a function as its argument. All of these methods return a promise, so they need to be awaited or chained with then.
You need to use await before page.goto, page.waitFor and page.click because they return a Promise. Also pass { waitUntil: "domcontentloaded" } to page.goto to wait for the DOM. I also fixed the seeAllReviewsButton selector.
The code below works fine for me.
const puppeteer = require("puppeteer");
(async () => {
const browser = await puppeteer.launch({ headless: false });
const page = await browser.newPage();
await page.goto(
"https://www.google.com/maps/place/Faruk+G%C3%BCll%C3%BCo%C4%9Flu+-+Sunny/#41.0298046,28.7909262,13z/data=!4m8!1m2!2m1!1sfaruk+gulluoglu!3m4!1s0x14caa4f77579848b:0x37c42d8b0cecc146!8m2!3d41.0298046!4d28.8151116",
{ waitUntil: "domcontentloaded" }
);
const seeAllReviewsButton =
"#pane > div > div.widget-pane-content.scrollable-y > div > div > div.section-hero-header-title > div.section-hero-header-title-top-container > div.section-hero-header-title-description > div.section-hero-header-title-description-container > div > div.gm2-body-2.section-rating-line > span:nth-child(3) > span > span:nth-child(1) > span:nth-child(2) > span:nth-child(1) > button";
await page.waitForSelector(seeAllReviewsButton);
await page.click(seeAllReviewsButton);
})();
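To illustrate why the missing await matters, here is a small sketch that needs no browser. fakeStep is a hypothetical stand-in for calls such as page.goto and page.click; the delays are arbitrary assumptions chosen so that the "click" finishes before the "navigation" when nothing is awaited.

```javascript
// Hypothetical stand-in for an asynchronous browser call:
// records its name in `log` once the simulated work has finished.
function fakeStep(name, delayMs, log) {
  return new Promise(resolve => {
    setTimeout(() => {
      log.push(name);
      resolve(name);
    }, delayMs);
  });
}

// Without await, both steps start immediately and the function
// returns before either of them has finished.
async function withoutAwait(log) {
  fakeStep('goto', 30, log);
  fakeStep('click', 5, log);
  log.push('returned');
}

// With await, each step finishes before the next one starts.
async function withAwait(log) {
  await fakeStep('goto', 30, log);
  await fakeStep('click', 5, log);
  log.push('returned');
}
```

In withoutAwait the log starts with 'returned' and the click completes before the navigation, which mirrors the situation in which Puppeteer reports that the execution context was destroyed by a navigation. In withAwait the log reads 'goto', 'click', 'returned'.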

Can't find a working selector on drop down menu while using Puppeteer? [duplicate]

This question already has answers here:
Issue with CSS locator select-react
(2 answers)
Closed 3 years ago.
I'm creating an automated script with Puppeteer and I'm having trouble finding a selector that works. I have tried many different options with no luck.
Note: Don't worry, it's a dummy account, so nothing important is on it.
I tried using
const myacc = '.li.member-nav-item.d-sm-ib.va-sm-m > button';
and a bunch of others, but I'm still getting a selector error.
Code:
const puppeteer = require('puppeteer');
var emailVal = 'kellybrando23434#gmail.com';
var passwordVal = 'd34gfA#4dfW';
const AcceptCookies = '#cookie-settings-layout > div > div > div > div.ncss-row.mt5-sm.mb7-sm > div:nth-child(2) > button';
const loginBtn = 'li.member-nav-item.d-sm-ib.va-sm-m > button';
const email = 'input[type="email"]';
const password = 'input[type="password"]';
const logsubmit = '.loginSubmit.nike-unite-component > input[type="button"]';
const myacc = '.li.member-nav-item.d-sm-ib.va-sm-m > button'; //this line contains error
(async () => {
const browser = await puppeteer.launch({headless: false, slowMo: 150});
const page = await browser.newPage();
await page.setViewport({ width: 1920, height: 1080 })
await page.goto('https://www.nike.com/launch/');
const AcceptCookies = '#cookie-settings-layout > div > div > div > div.ncss-row.mt5-sm.mb7-sm > div:nth-child(2) > button';
await page.click(loginBtn);
console.log("Login Button Clicked...");
await page.waitFor(5000);
console.log("email: " + emailVal);
await page.type(email, emailVal);
console.log("entered email");
await page.type(password, passwordVal);
console.log("waiting 0.5s");
await page.waitFor(500);
console.log("waiting done");
await page.click(logsubmit);
console.log("submitted");
await page.waitFor(10000);
await page.click(myacc);
await page.waitFor(10000);
await browser.close();
})();
I'm trying to get the correct selector ("const myacc = ...") to click the account profile as shown in the picture (highlighted section), but instead I'm getting a selector error ("Error: No node found for selector: ...."). How would you find it in this situation, as there is no id?
Before Picture
After Picture
Almost all of the elements have a unique data-qa attribute intended for testing. I would recommend using it, so replace all your selectors with it.
Here is the example for the 'my account' selector:
const myAcc = '[data-qa="user-name"]';
Also, you may not see the element being clicked because of the screen size, so you may need to maximize the window.
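Note also that the selector in the question starts with '.li', which matches an element with the class "li" rather than an li element, so it finds nothing. As a minimal sketch of the data-qa approach, the helpers below build the attribute selector and wait for the element before clicking it. byDataQa and clickWhenPresent are hypothetical names; page is assumed to be a Puppeteer Page, and only waitForSelector and click are called on it.

```javascript
// Build an attribute selector such as [data-qa="user-name"],
// escaping backslashes and double quotes in the value.
function byDataQa(value) {
  const escaped = String(value).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `[data-qa="${escaped}"]`;
}

// Wait until the element exists in the DOM, then click it.
async function clickWhenPresent(page, selector) {
  await page.waitForSelector(selector);
  await page.click(selector);
}
```

With a real page this could be used as await clickWhenPresent(page, byDataQa('user-name')), which replaces the fixed 10-second waits before the click with a wait on the element itself.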

Using Puppeteer to find element by title

Assuming I have a link in an iframe without an id, where the only uniquely identifiable piece of information is the element title of this link, how would I go about finding it? It could look like this:
<a class="link spacer--double" href="#" tabindex="51" title="Click here to use new code">Use new code</a>
This is what I have attempted so far:
(async() => {
const browser = await puppeteer.launch({
headless: false
});
const page = await browser.newPage();
console.log("starting new page");
var contentHtml = fs.readFileSync('1-iframe.html', 'utf8');
await page.setContent(contentHtml);
const result = await page.evaluate(() => {
let elements = document.getElementById("framecontentscroll").innerText;
for (let element of elements)
console.log(element);
})
})();
But I don't seem to get anything in elements. The id of the iframe is "framecontentscroll".
Is there a way I can go directly to the link element using its title and a querySelector or something similar?
You can use the querySelector to select by title. See:
Select an element by title with JavaScript and tweak from the browser?
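As a rough sketch of how that could look for a link inside an iframe, the helpers below build an attribute selector from the title and query it inside the frame's own document. linkByTitle and findLinkInFrame are hypothetical names; page is assumed to be a Puppeteer Page, and the sketch relies on page.$, ElementHandle.contentFrame() and frame.$.

```javascript
// Build a selector for an <a> element with an exact title attribute.
function linkByTitle(title) {
  const escaped = String(title).replace(/\\/g, '\\\\').replace(/"/g, '\\"');
  return `a[title="${escaped}"]`;
}

// Find the iframe element, switch to its content frame, and query the link there.
// Resolves to null when the iframe or its frame cannot be found.
async function findLinkInFrame(page, frameSelector, title) {
  const frameHandle = await page.$(frameSelector);
  if (!frameHandle) return null;
  const frame = await frameHandle.contentFrame();
  if (!frame) return null;
  return frame.$(linkByTitle(title));
}
```

For the markup in the question this could be called as await findLinkInFrame(page, '#framecontentscroll', 'Click here to use new code'). Reading innerText on the iframe element itself does not reach into the frame's document, which is why the query has to run inside the frame.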
