Output execution time for a Playwright step with AJAX payload - javascript

I am trying to dump out a few key measurements to the console when my test runs, rather than getting them from the reporter output, but I can't see how to grab the time taken for the last step to execute. Here's a simplified version based on the docs for request.timing(), but I don't think what I'm doing is classed as a request:
const { test, expect } = require('@playwright/test');

test('ApplicationLoadTime', async ({ page }) => {
  // Wait for applications to load
  await page.waitForSelector('img[alt="Application"]');
  // Not working! - get time for step execution
  const [fir] = await Promise.all([
    page.click('text=Further information requested'),
    page.waitForSelector('img[alt="Application"]')
  ]);
  console.log(fir.timing());
});
The click on "Further information requested" causes the page to be modified based on an AJAX call in the background and the appearance of the Application img tells me it's finished. Is this possible or do I need to rely on the reports instead?

fir will be undefined in your code, as page.click() doesn't return anything. You need to wait for the request whose timing you're interested in, using page.waitForEvent('requestfinished') or waitForNavigation:
const { test, expect } = require('@playwright/test');

test('ApplicationLoadTime', async ({ page }) => {
  // Wait for applications to load
  await page.waitForSelector('img[alt="Application"]');
  const [fir] = await Promise.all([
    // Wait for the request
    page.waitForEvent('requestfinished', r => r.url() === '<url of interest>'),
    page.click('text=Further information requested'),
    page.waitForSelector('img[alt="Application"]')
  ]);
  console.log(fir.timing());
});
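If you only need the elapsed time rather than the full breakdown, note that the fields of the object returned by request.timing() (other than startTime) are given in milliseconds relative to startTime, or -1 when unavailable, so responseEnd is effectively the request's total duration. A minimal sketch:

const timing = fir.timing();
// responseEnd is relative to startTime, i.e. the time until the last
// byte of the response was received (-1 if unavailable).
console.log(`Request took ${timing.responseEnd} ms`);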

Related

Close the page after certain interval [Puppeteer]

I have used Puppeteer in one of my projects to open webpages in headless Chrome, perform some actions, and then close the page. These actions, however, are user dependent. I want to attach a lifetime to the page, so that it closes automatically after, say, 30 minutes of being open, irrespective of whether any action is performed or not.
I have tried the setTimeout() functionality of Node.js, but it didn't work (or I just couldn't figure out how to make it work).
I have tried the following:
const puppeteer = require('puppeteer-core');

const browser = await puppeteer.connect({ browserURL: browser_url });
const page = await browser.newPage();
// timer starts ticking here upon creation of the new page
// (maybe in a subroutine, so as not to block the main thread)
/**
..
Do something
..
*/
// timer ends and closePage() is triggered.
const closePage = (page) => {
  if (!page.isClosed()) {
    page.close();
  }
}
But this gives me the following error:
Error: Protocol error: Connection closed. Most likely the page has been closed.
Your provided code should work as expected. Are you sure the page is still open after the timeout, and that it is indeed the same page?
You can try this wrapper for opening pages and closing them correctly.
// Since it is async it won't block the event loop;
// using `await` will allow other functions to execute.
async function openNewPage(browser, timeoutMs) {
  const page = await browser.newPage()
  setTimeout(async () => {
    // Use try/catch to avoid unhandled promise rejections.
    try {
      if (!page.isClosed()) {
        await page.close()
      }
    } catch (err) {
      console.error('unexpected error occurred when closing page.', err)
    }
  }, timeoutMs)
  return page // without this, the caller's `page` below would be undefined
}
// use it like so:
const browser = await puppeteer.connect({ browserURL: browser_url });
const min30Ms = 30 * 60 * 1000
const page = await openNewPage(browser, min30Ms);
// ...
The above only closes the tab in your browser. To close the Puppeteer instance you would have to call browser.close(), which may be what you want.
page.close() returns a promise, so you need to define closePage as an async function and use await page.close(). I believe @silvan's answer should address the issue; just make sure to replace the condition
if (page.isClosed())
with
if (!page.isClosed())
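Putting both fixes together, the question's closePage helper becomes (the same logic, just made async so the close is awaited):

const closePage = async (page) => {
  if (!page.isClosed()) {
    await page.close();
  }
};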

How to deep copy a page of puppeteer in javascript?

I'm using Puppeteer to navigate my website. I want to wait for an API that sometimes gets called and sometimes not. I'm using
await page.waitForResponse(response => response.url().includes(myurl), { timeout: 1000 });
to wait for that API. This works fine when the API gets called, but whenever it doesn't, it crashes and the page isn't the same anymore. So I want to deep copy the page, so that I can check for the API via its copy, and even if that page gets damaged I will have another one I can use.
I think you don't need to copy your page. That's probably not easy to do and seems like overkill. Instead, preventing the page from crashing would be a simpler approach.
Try something like this:
async function waitForApi(page, url, timeoutMs) {
  try {
    console.log('waiting', timeoutMs + 'ms for special API. url:', url);
    const opts = { timeout: timeoutMs || 1000 };
    await page.waitForResponse(response => response.url().includes(url), opts);
    console.log('Special API was called!');
    return true;
  } catch (err) {
    console.log('Special API was apparently not called (or may have failed). Error:', err);
    return false;
  }
}

// example call of waitForApi:
const myUrl = '...'
const apiCalled = await waitForApi(page, myUrl, 1000)
if (apiCalled) {
  // do stuff if you want to..
} else {
  // do stuff if you want to..
}
This should now log whether the API was called or not, and you can handle the two cases differently where needed.

Firebase update function is sometimes executed more slowly

I have a simple update function which sometimes executes really slowly compared to other times.
// Five executions with very different execution times
Finished in 1068ms // Almost six times slower than the next execution
Finished in 184ms
Finished in 175ms
Finished in 854ms
Finished in 234ms
The Function is triggered from the frontend and doesn't run on Firebase Cloud Functions.
const startAt = performance.now();
const db = firebase.firestore();
const ref = db.doc(`random/nested/document/${id}`);
ref.update({
  na: boolean // a calculated boolean with array.includes(...)
    ? firebase.firestore.FieldValue.arrayRemove(referenceId)
    : firebase.firestore.FieldValue.arrayUnion(referenceId)
})
.then(() => {
  let endAt = performance.now();
  console.log("Finished in " + (endAt - startAt) + "ms");
});
Is there anything I can improve to fix these performance differences?
Also, the longer execution times don't appear only when removing something from an array or only when adding; they appear on both adding and removing. Sometimes these execution times go up to 3000ms.
Similar to cold-starting a Cloud Function, where everything is spun up, initialized, and made ready for use, a connection to Cloud Firestore also needs to be resolved through DNS, ID tokens need to be obtained to authenticate the request, a socket to the server needs to be opened, and handshakes need to be exchanged between the server and the SDK.
Any new operations on the database can reuse the work already done to initialize the connection, which is why they appear faster.
Showing this as loose pseudocode:
let connection = undefined;

async function initConnectionToFirestore() {
  if (!connection) {
    await loadFirebaseConfig();
    await Promise.all([
      resolveIpAddressOfFirebaseAuth(),
      resolveIpAddressOfFirestoreInstance()
    ]);
    await getIdTokenFromFirebaseAuth();
    await addAuthenticationToRequest();
    connection = await openConnection();
  }
  return connection;
}

async function doUpdate(...args) {
  const connection = await initConnectionToFirestore();
  // do the work
  connection.send(/* ... */);
}

await doUpdate() // has to do the work of initConnectionToFirestore
await doUpdate() // reuses previous work
await doUpdate() // reuses previous work
await doUpdate() // reuses previous work
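If that first-operation latency matters for your app, one possible mitigation (my suggestion, not something from the pseudocode above) is to pay the initialization cost early, before the user triggers the real update. A minimal sketch, where 'warmup/ping' is a hypothetical document path:

// Hypothetical warm-up read at app start: it forces DNS resolution,
// authentication, and the socket handshake, so a later update() finds
// the connection already open.
const db = firebase.firestore();
db.doc('warmup/ping').get().catch(() => { /* warm-up only, ignore errors */ });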

Puppeteer wait for new page after form submit

I'm trying to use Puppeteer to load a page, submit a form (which takes me to a different URL) and then ideally run something once this new page has loaded. I'm using Node.js, and am generalising my logic into separate files, one of which is search.js, as per the below:
const puppeteer = require('puppeteer')

const createSearch = async (term, location) => {
  puppeteer.launch({
    headless: false,
  }).then(async browser => {
    const page = await browser.newPage()
    await page.goto('https://example.com/')
    await page.waitForSelector('body')
    await page.evaluate(() => {
      const searchForm = document.querySelector('form.searchBar--form')
      searchForm.submit() // this takes me to a new page which I need to wait for and then ideally return something.
      // I've tried adding code here, but it doesn't run...
    }, term, location)
  })
}

exports.createSearch = createSearch
I'm then calling my function from my app's entry point...
(async () => {
  // current
  search.createSearch('test')

  // proposed
  search.createSearch('test').then(() => {
    // trigger puppeteer to look at the new page and start running asserts.
  })
})()
Unfortunately, due to the form submitting, I'm unsure how I can wait for the new page to load and run a new function? The new URL will be unknown, and different each time, e.g: https://example.com/page20
After the form submits, you need to wait until the new page loads. Add the following after the await page.evaluate() call:
await page.waitForNavigation();
And then you can perform the action you want.
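For reference, a minimal sketch of the combined flow; wrapping both steps in Promise.all is a common variant that avoids the race where navigation finishes before waitForNavigation is registered (form.searchBar--form is the selector from the question):

await Promise.all([
  page.waitForNavigation(), // resolves once the new page has loaded
  page.evaluate(() => {
    document.querySelector('form.searchBar--form').submit()
  })
])
// The new page is loaded here; safe to run assertions or return data.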

Get Nightmare to wait for next page load after clicking link

I'm using Nightmare.js to scrape public records and am just trying to get the scraper to wait for the next page to load. I'm crawling search results, pressing a next button to (obviously) get to the next page. I can't use nightmare.wait(someConstTime) to accurately wait for the next page to load, because sometimes someConstTime is shorter than the time the next page takes to load (although it's always under 30 seconds). I also can't use nightmare.wait(selector), because the same selectors are always present on all result pages. In that case Nightmare basically doesn't wait at all, because the selector is already present (on the page I already scraped), so it will proceed to scrape the same page several times unless the new page loads before the next loop.
How can I conditionally wait for the next page to load after I click on the next button?
If I could figure out how - I would compare the "Showing # to # of ## entries" indicator of the current page (currentPageStatus) to the last known value (lastPageStatus) and wait until they're different (hence the next page loaded).
I'd do that using this code from https://stackoverflow.com/a/36734481/3491991, but that would require passing lastPageStatus into deferredWait, which I can't figure out how to do.
Here's the code I've got so far:
// Load dependencies
// const { csvFormat } = require('d3-dsv');
const Nightmare = require('nightmare');
const fs = require('fs');
var vo = require('vo');

const START = 'http://propertytax.peoriacounty.org';
var parcelPrefixes = ["01","02","03","04","05","06","07","08","09","10",
                      "11","12","13","14","15","16","17","18","19"];

vo(main)(function(err, result) {
  if (err) throw err;
});

function* main() {
  var nightmare = Nightmare(),
      currentPage = 0,
      isLastPage;
  // Go to Peoria Tax Records Search
  try {
    yield nightmare
      .goto(START)
      .wait('input[name="property_key"]')
      .insert('input[name="property_key"]', parcelPrefixes[0])
      // Click the search button (.btn.btn-success)
      .click('.btn.btn-success');
  } catch (e) {
    console.error(e);
  }
  // Get parcel numbers ten at a time
  try {
    yield nightmare.wait('.sorting_1');
    isLastPage = yield nightmare.visible('.paginate_button.next.disabled');
    while (!isLastPage) {
      console.log('The current page should be:', currentPage); // Display page status
      try {
        const result = yield nightmare
          .evaluate(() => {
            return [...document.querySelectorAll('.sorting_1')]
              .map(el => el.innerText);
          });
        // Save property numbers
        // fs.appendFile('parcels.txt', result, (err) => {
        //   if (err) throw err;
        //   console.log('The "data to append" was appended to file!');
        // });
      } catch (e) {
        console.error(e);
        return undefined;
      }
      yield nightmare
        // Click next page button
        .click('.paginate_button.next');
      // ************* THIS IS WHERE I NEED HELP *************** BEGIN
      // Wait for next page to load before continuing the while loop
      try {
        const currentPageStatus = yield nightmare
          .evaluate(() => {
            return document.querySelector('.dataTables_info').innerText;
          });
        console.log(currentPageStatus);
      } catch (e) {
        console.error(e);
        return undefined;
      }
      // ************* THIS IS WHERE I NEED HELP *************** END
      currentPage++;
      isLastPage = yield nightmare.visible('.paginate_button.next.disabled');
    }
  } catch (e) {
    console.error(e);
  }
  yield nightmare.end();
}
I had a similar issue that I managed to fix. Basically I had to navigate to a search page, select the '100 per page' option and then wait for the refresh. The only problem was that it was a crapshoot whether a manual wait time allowed the AJAX to fire and repopulate with more than 10 results (the default).
I ended up doing this:
nightmare
  .goto(url)
  .wait('input.button.primary')
  .click('input.button.primary')
  .wait('#searchresults')
  .select('#resultsPerPage', "100")
  .click('input.button.primary')
  .wait('.searchresult:nth-child(11)')
  .evaluate(function () {
    // ...
  })
  .end()
With this, the evaluate won't fire until it detects at least 11 elements with the class .searchresult. Given that the default is 10, it has to wait for the reload before this can complete.
You could extend this to scrape the total number of available results from the first page to ensure that there are - in my case - more than 10 available. But the foundation of the concept works.
From what I could understand, you basically need the DOM change to be complete before you start extracting from the page being loaded.
In your case, the element that undergoes the DOM change is the table with CSS selector '#search-results'.
I think a MutationObserver is what you need.
I have used the Mutation Summary library, which provides a nice wrapper over the raw MutationObserver functionality, to achieve something similar:
var observer = new MutationSummary({
  callback: updateWidgets,
  queries: [{
    element: '[data-widget]'
  }]
});
(from the Mutation Summary tutorial)
First, register the MutationSummary observer when the search results are loaded.
Then, after clicking 'Next', use nightmare.evaluate to wait for the MutationSummary callback to return the extracted values.
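Alternatively, the comparison described in the question maps directly onto Nightmare's function form of .wait(fn, ...args), which polls the page until the function returns true. A sketch against the question's '.dataTables_info' selector (untested against the actual site):

// Capture the "Showing # to # of ## entries" text before clicking next...
var lastPageStatus = yield nightmare.evaluate(function () {
  return document.querySelector('.dataTables_info').innerText;
});
yield nightmare
  .click('.paginate_button.next')
  // ...then poll until the status text differs from the captured value.
  .wait(function (previousStatus) {
    return document.querySelector('.dataTables_info').innerText !== previousStatus;
  }, lastPageStatus);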
