I am writing code that waits for a promise to resolve before advancing to the next iteration, and I am also adding a separate wait time for each iteration. However, only the first iteration is printed; if I remove the wait in seconds, everything works fine.
In PlayCode I get the error below; in other editors it simply prints the first iteration and stops:
error: Infinity loop on line 8, char 6. You can increase loop timeout in settings.
https://playcode.io/309050?tabs=console&script.js&output
In the Stack Overflow code snippet editor the code works, but in my project and in other code editors it does not.
const urls = [
  'https://jsonplaceholder.typicode.com/todos/1',
  'https://jsonplaceholder.typicode.com/todos/2',
  'https://jsonplaceholder.typicode.com/todos/3'
];

async function getTodos() {
  for (const [idx, url] of urls.entries()) {
    const todo = await fetch(url);
    console.log(`Received Todo ${idx + 1}:`, todo);
    await wait(1000)
  }
  console.log('Finished!');
}

getTodos();

function wait(ms) {
  return new Promise(r => setTimeout(r, ms));
}
Some online code playgrounds, like CodePen, have built-in infinite-loop detection. This is to prevent the UI from freezing if you accidentally write an infinite loop while scratch coding. It is usually achieved by statically analyzing your code, so the playground cannot tell what your code actually does at runtime (and it may not be infinite at all); the best it can do is guess based on structure.
In your case, your code editor thinks that you're writing an infinite loop. There are probably settings in your editor to disable that check.
Following my previous question, Built method time is not a function, I managed to successfully implement the functions with an appropriate wait time by combining #ggorlen's comment and #Konrad Linkowski's answer. Additionally, in the question puppeteer: wait N seconds before continuing to the next line, which #ggorlen answered, this comment especially helped:
Something else? Run an evaluate block and add your own code to wait for a DOM mutation or poll with setInterval or requestAnimationFrame and effectively reimplement waitForFunction as fits your needs.
Instead, I incorporated waitForSelector, which produces the following script:
const puppeteer = require('puppeteer')

const EXTENSION = '/Users/usr/Library/Application Support/Google/Chrome/Profile 1/Extensions/gidnphnamcemailggkemcgclnjeeokaa/1.14.4_0'

class Agent {
  constructor(extension) {
    this._extension = extension
  }

  async runBrowser() {
    const browser = await puppeteer.launch({
      headless: false,
      devtools: true,
      args: [`--disable-extensions-except=${this._extension}`,
        `--load-extension=${this._extension}`,
        '--enable-automation']
    })
    return browser
  }

  async getPage(twitch) {
    const page = await (await this.runBrowser()).newPage()
    await page.goto('chrome-extension://gidnphnamcemailggkemcgclnjeeokaa/popup.html')
    const nextEvent = await page.evaluate(async () => {
      document.getElementById('launch-trace').click()
    })
    const waitSelector = await page.waitForSelector('.popup-body')
    const finalEvent = (twitch) => new Promise(async (twitch) => page.evaluate(async (twitch) => {
      const input = document.getElementById('user-trace-id')
      input.focus()
      input.value = twitch
    }))
    await finalEvent(twitch)
  }
}

const test = new Agent(EXTENSION)
test.getPage('test')
However, my webpage produces undefined rather than test. I am a little confused by the parameters twitch and k, and by how to properly pass the parameter twitch so it is entered inside the function finalEvent.
Alternatively, I have also tried wrapping finalEvent in a Promise so I can pass the parameter twitch into it as a function argument, but this does not fill in any value:
const finalEvent = (val) => new Promise(async () => await page.evaluate(async () => {
  const nextTime = () => new Promise(async () => setInterval(async () => {
    const input = document.getElementById('user-trace-id')
    input.focus()
    input.value = val
  }, 3000))
  //await nextTime(k)
}))
await finalEvent(twitch)
There are a few issues here. First,
const page = await (await this.runBrowser()).newPage()
discards the browser handle, so the browser can never be closed; this leaks memory and keeps the process alive. Always close the browser when you finish using it:
const browser = await this.runBrowser();
const page = await browser.newPage();
// ... do your work ...
await browser.close();
Here, though, Puppeteer can throw, again leaking the browser and preventing your app from cleanly exiting, so I suggest adding a try/catch block with a finally block that closes the browser.
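For illustration, a minimal sketch of that shape (using puppeteer.launch() directly rather than the question's Agent class):

async function run() {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // ... goto, clicks, evaluate calls ...
  } finally {
    await browser.close(); // runs even if something above throws
  }
}

run().catch(err => console.error(err));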
Generally speaking, try to get the logic working first, then do a refactor to break code into functions and classes. Writing abstractions and thinking about design while you're still battling bugs and logical problems winds up making both tasks harder.
Secondly, there's no need to mark a function async if you never use await in it, as in:
const nextEvent = await page.evaluate(async () => {
document.getElementById('launch-trace').click()
})
Here, nextEvent is undefined because evaluate()'s callback returned nothing. Luckily, you didn't attempt to use it. You also have const waitSelector = await page.waitForSelector('.popup-body') which does return the element, but it goes unused. I suggest enabling ESLint's no-unused-vars rule, because these unused variables make a confusing situation worse and often indicate typos and bugs.
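For reference, a minimal .eslintrc.js sketch that turns the rule on (assuming ESLint is already installed in the project):

// .eslintrc.js - minimal sketch; only the unused-variable check is enabled here
module.exports = {
  parserOptions: { ecmaVersion: 2020, sourceType: 'script' },
  env: { node: true, browser: true, es2020: true },
  rules: {
    'no-unused-vars': 'warn' // would flag variables like nextEvent and waitSelector
  }
};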
On to the main problem,
const finalEvent = (twitch) => new Promise(async (twitch) => page.evaluate(async (twitch) => {
  const input = document.getElementById('user-trace-id')
  input.focus()
  input.value = twitch
}))
await finalEvent(twitch)
There are a number of misunderstandings here.
The first is the age-old Puppeteer gotcha, confusing which code executes in the browser process and which code executes in the Node process. Everything in an evaluate() callback (or any of its family, $eval, evaluateHandle, etc) executes in the browser, so Node variables that look like they should be in scope won't be. You have to pass and return serializable data or element handles to and from these callbacks. In this case, twitch isn't in scope of the evaluate callback. See the canonical How can I pass a variable into an evaluate function? for details.
The second misunderstanding is technically cosmetic in that you can make the code work with it, but it's a serious code smell that indicates significant confusion and should be fixed. See What is the explicit promise construction antipattern and how do I avoid it? for details, but the gist is that when you're working with a promise-based API like Puppeteer, you should never need to use new Promise(). Puppeteer's methods already return promises, so it's superfluous at best to wrap more promises on top of them, and at worst it introduces bugs and messes up error handling.
A third issue is that the first parameter to new Promise((resolve, reject) => {}) is always a resolve function, so twitch is a confusing mislabel. Luckily, it won't matter as we'll be dispensing with the new Promise idiom when using Puppeteer 99.9% of the time.
So let's fix the code, keeping these points in mind:
await page.evaluate(twitch => {
  const input = document.getElementById('user-trace-id');
  input.focus();
  input.value = twitch;
}, twitch);
Note that I'm not assigning the return value to anything because there's nothing being returned by the evaluate() callback.
"Selecting, then doing something with the selected element" is such a common pattern that Puppeteer provides a handy method to shorten the above code:
await page.$eval("#user-trace-id", (input, twitch) => {
input.focus();
input.value = twitch;
},
twitch
);
Now, I can't run or reproduce your code as I don't have your extension, and I'm not sure what goal you're trying to achieve, but even the above code looks potentially problematic.
Usually, you want to use Puppeteer's page.type() method rather than a raw DOM input.value = ..., which doesn't fire any event handlers that might be attached to the input. Many inputs won't register such a change, and it's an untrusted event.
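For example, a short sketch with the question's selector (whether this particular extension needs trusted events is an assumption):

// Sketch: type into the field so input/keyboard events fire,
// instead of assigning .value directly.
await page.waitForSelector('#user-trace-id');
await page.type('#user-trace-id', twitch);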
Also, it's weird that you'd have to .focus() on the input before setting its value. Usually focus is irrelevant to setting a value property, and the value will be set either way.
So there may be more work to do, but hopefully this will point you in the right direction by resolving the first layer of immediate issues at hand. If you're still stuck, I suggest taking a step back and providing context in your next question of what you're really trying to accomplish here in order to avoid an xy problem. There's a strong chance that there's a fundamentally better approach than this.
Why do I need to wrap resolve() with a meaningless async function in Node 10.16.0, but not in Chrome? Is this a Node.js bug?
let shoot = async () => console.log('there shouldn\'t be race condition');

(async () => {
  let c = 3;
  while (c--) {
    // Works also in node 10.16.0
    // console.log(await new Promise(resolve => shoot = async (...args) => resolve(...args)));
    // Works in chrome, but not in node 10.16.0?
    console.log(await new Promise(resolve => shoot = resolve));
  };
})();

(async () => {
  await shoot(1);
  await shoot(2);
  await shoot(3);
})();
resolve() is not async.
Calling resolve() (via shoot()) does not immediately trigger the related await (in the loop); instead it queues up the event. Adding async/await gives the event loop a chance to wake up and consume the queue. In Chrome, await alone is enough, while in Node, await needs to be coupled with an actual async function. This kind of task synchronization is not reliable, and there is a chance of calling the same resolve() twice.
This is an example of what NOT to do in JavaScript.
It’s a Node 10 (possible) bug or (probable) outdated behaviour* in the implementation of promises. According to the ECMAScript spec at the time of this writing, in
await shoot(1);
shoot(1) fulfills the promise created with new Promise(), which enqueues a job for each reaction in that promise’s fulfill reactions
await undefined (what shoot(1) returns) enqueues a job to continue after this statement, because undefined is converted to a fulfilled promise
The reaction in the promise’s fulfill reactions corresponding to the await in the first IIFE was added by PerformPromiseThen and it doesn’t involve any other jobs; it just continues inside that IIFE immediately.
In short, the next shoot = resolve should always run before execution continues after await shoot(n). The Node 12/current Chrome result is correct.
Normally, you shouldn’t come across this type of bug anyway: as I mentioned in the comments, relying on operations creating specific numbers of jobs/taking specific numbers of microticks for synchronization is bad design. If you wanted a sort of stream where each shoot() call always produces a loop iteration (even without the misleading await), something like this would be better:
let shoot;

(async () => {
  let queue = new Queue();

  while (true) {
    await new Promise(resolve => {
      shoot = value => {
        resolve();
        queue.enqueue(value);
      };
    });

    shoot = null; // just an assertion, pretty much

    while (!queue.isEmpty()) {
      let value = queue.dequeue();
      // process value
    }
  }
})();

shoot(1);
shoot(2);
shoot(3);
with an appropriate queue implementation. (Then you could look to async iterators to make consuming the queue neat.)
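As a rough sketch of that async-iterator idea (not the answer's exact code; a plain array stands in for Queue and the names are made up):

// Sketch: shoot() pushes a value and wakes the consumer; the async generator
// yields queued values so they can be consumed with for await...of.
const queue = [];
let notify = null;

function shoot(value) {
  queue.push(value);
  if (notify) { notify(); notify = null; }
}

async function* stream() {
  while (true) {
    while (queue.length) yield queue.shift();
    await new Promise(resolve => { notify = resolve; });
  }
}

(async () => {
  for await (const value of stream()) {
    console.log('process', value);
  }
})();

shoot(1);
shoot(2);
shoot(3);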
* Not sure of the exact history here. Fairly certain the ES spec used to reference microtasks, but they're jobs now. Current stable Firefox matches Node 10. await may take less time than it used to. This kind of thing is the reason for the following advice.
Your code is relying on a somewhat obscure timing issue involving await of a constant (a non-promise). That timing issue apparently does not behave the same in the two environments you have tested.
Each await shoot(n) is really just doing await resolve(n) which is not awaiting a promise. It's awaiting undefined since resolve() has no return value.
So, you're apparently seeing an implementation difference in event loop and promise implementation when you await a non-promise. You are apparently expecting await resolve() to somehow be asynchronous and allow your while() loop to run before running the next await shoot(n), but I'm not aware of a language requirement to do that and even if there is, it's an implementation detail that you probably should not write code that relies on.
I think it's basically just bad code design that relies on micro-details of scheduling of two jobs that are enqueued at about the same time. It's always safer to write the code in a way that enforces the proper sequencing rather than relying on micro-details of scheduler implementation - even if these details are in the specification and certainly if they are not.
node.js is perhaps more optimized or buggy (I don't know which) to not go back to the event loop when doing await on a constant. Or, if it does go back to the event loop, it goes in a prioritized fashion that keeps the current chain of code executing rather than letting other promises go next. In any case, for this code to work, it has to rely on some await someConstant behavior that isn't the same everywhere.
Wrapping resolve() forces the interpreter to go back to the event loop after each await shoot(n) because it is actually now awaiting a promise which gives the while() loop a chance to run and fill shoot with a new value before the next shoot(n) is called.
Synchronicity in JS loops is still driving me up the wall.
What I want to do is fairly simple:
async doAllTheThings(data, array) {
  await array.forEach(entry => {
    let val = //some algorithm using entry keys
    let subVal = someFunc(/*more entry keys*/)
    data[entry.Namekey] = `${val}/${subVal}`;
  });
  return data; //after data is modified
}
But I can't tell if that's actually safe or not. I simply don't like the simple loop pattern:
for (i = 0; i < arrayLength; i++) {
  //do things
  if (i === arrayLength - 1) {
    return
  }
}
I wanted a better way to do it, but I can't tell if what I'm trying is working safely or not, or I simply haven't hit a data pattern that will trigger the race condition.
Or perhaps I'm overthinking it. The algorithm in the array consists solely of some MATH and assignment statements...and a small function call that itself also consists solely of more MATH and assignment statements. Those are supposedly fully synchronous across the board. But loops are weird sometimes.
The Question
Can you use await in that manner, outside the loop itself, to trigger the code to wait for the loop to complete? Or is the only safe way to accomplish this the older manner of simply checking where you are in the loop, and not returning until you hit the end, manually.
One of the best ways to handle async code in loops is to push the promises onto an array and wait for Promise.all. Remember that an async function returns a Promise, so you can do:
async function doAllTheThings(array) {
  const promises = []
  array.forEach((entry, index) => {
    promises.push(new Promise((resolve) => {
      setTimeout(() => resolve(entry + 1), 200)
    }))
  });
  return Promise.all(promises)
}

async function main() {
  const arrayPlus1 = await doAllTheThings([1, 2, 3, 4, 5])
  console.log(arrayPlus1.join(', '))
}

main().then(() => {
  console.log('Done the async')
}).catch((err) => console.log(err))
Another option is to use generators, but they are a little more complex, so if you can just save your promises and wait for them, that is the easier approach.
About the question at the end:
Can you use await in that manner, outside the loop itself, to trigger the code to wait for the loop to complete? Or is the only safe way to accomplish this the older manner of simply checking where you are in the loop, and not returning until you hit the end, manually.
All JavaScript loops are synchronous, so the next line will wait for the loop to finish executing.
If you need to run some async code in the loop, a good approach is the promise approach above.
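For the specific case in the question, where the loop body is purely synchronous, no await is needed at all; a sketch (the val/subVal expressions and the someFunc stub are placeholders, not the real algorithm):

const someFunc = n => n + 1; // stand-in for the helper from the question

function doAllTheThings(data, array) {
  for (const entry of array) {
    const val = entry.x * 2;          // placeholder for "some algorithm"
    const subVal = someFunc(entry.y); // placeholder arguments
    data[entry.Namekey] = `${val}/${subVal}`;
  }
  return data; // only runs after the loop has finished
}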
Another approach for async loops, especially if you have to "pause" or get info from outside the loop, is the iterator/generator approach.
Using Puppeteer, I would like to get all the elements on a page with a particular class name and then loop through and click each one.
Using jQuery, I can achieve this with:
var elements = $("a.showGoals").toArray();
for (i = 0; i < elements.length; i++) {
  $(elements[i]).click();
}
How would I achieve this using Puppeteer?
Update
Tried out Chridam's answer below, but I couldn't get it to work (though the answer was helpful, so thanks due there), so I tried the following and this works:
await page.evaluate(() => {
  let elements = $('a.showGoals').toArray();
  for (i = 0; i < elements.length; i++) {
    $(elements[i]).click();
  }
});
Iterating puppeteer async methods in for loop vs. Array.map()/Array.forEach()
As all puppeteer methods are asynchronous, it does matter how we iterate over them. I've made a comparison and a rating of the most commonly recommended and used options.
For this purpose, I have created a React.js example page with a lot of React buttons here (I just call it Lot Of React Buttons). Here (1) we are able to set how many buttons are rendered on the page; (2) we can turn the black buttons green by clicking on them. I consider it an identical use case to the OP's, and it is also the general case of browser automation (we expect something to happen when we do something on the page).
Let's say our use case is:
Scenario outline: click all the buttons with the same selector
Given I have <no.> black buttons on the page
When I click on all of them
Then I should have <no.> green buttons on the page
There is a conservative and a rather extreme scenario: clicking no. = 132 buttons is not a huge CPU task, while no. = 1320 can take a bit of time.
I. Array.map
In general, if we only want to perform async methods like elementHandle.click in an iteration and we don't need the returned array, it is bad practice to use Array.map. The map call will return before all the iteratees have finished executing, because array iteration methods call their iteratees synchronously, while the puppeteer calls inside them are asynchronous.
Code example
const elHandleArray = await page.$$('button')

elHandleArray.map(async el => {
  await el.click()
})

await page.screenshot({ path: 'clicks_map.png' })
await browser.close()
Specialties
returns another array
parallel execution inside the .map method
fast
132 buttons scenario result: ❌
Duration: 891 ms
By watching the browser in headful mode it looks like it works, but if we check when the page.screenshot happened, we can see the clicks were still in progress. This is because Array.map cannot be awaited by default. It is only luck that the script had enough time to resolve all the clicks before the browser was closed.
1320 buttons scenario result: ❌
Duration: 6868 ms
If we increase the number of elements of the same selector we will run into the following error:
UnhandledPromiseRejectionWarning: Error: Node is either not visible or not an HTMLElement, because we already reached await page.screenshot() and await browser.close(): the async clicks are still in progress while the browser is already closed.
II. Array.forEach
All the iteratees will be executed, but forEach returns before all of them finish execution, which is not the desired behavior in many cases with async functions. In terms of puppeteer it is a very similar case to Array.map, except that Array.forEach does not return a new array.
Code example
const elHandleArray = await page.$$('button')

elHandleArray.forEach(async el => {
  await el.click()
})

await page.screenshot({ path: 'clicks_foreach.png' })
await browser.close()
Specialties
parallel execution inside the .forEach method
fast
132 buttons scenario result: ❌
Duration: 1058 ms
By watching the browser in headful mode it looks like it works, but if we check when the page.screenshot happened, we can see the clicks were still in progress.
1320 buttons scenario result: ❌
Duration: 5111 ms
If we increase the number of elements with the same selector we will run into the following error:
UnhandledPromiseRejectionWarning: Error: Node is either not visible or not an HTMLElement, because we already reached await page.screenshot() and await browser.close(): the async clicks are still in progress while the browser is already closed.
III. page.$$eval + forEach
The best performing solution is a slightly modified version of bside's answer. page.$$eval (page.$$eval(selector, pageFunction[, ...args])) runs Array.from(document.querySelectorAll(selector)) within the page and passes the result as the first argument to pageFunction. It acts as a wrapper around the forEach, so the whole thing can be awaited.
Code example
await page.$$eval('button', elHandles => elHandles.forEach(el => el.click()))
await page.screenshot({ path: 'clicks_eval_foreach.png' })
await browser.close()
Specialties
no side-effects of using async puppeteer method inside a .forEach method
parallel execution inside the .forEach method
extremely fast
132 buttons scenario result: ✅
Duration: 711 ms
By watching the browser in headful mode we see the effect is immediate, also the screenshot is taken only after every element has been clicked, every promise has been resolved.
1320 buttons scenario result: ✅
Duration: 3445 ms
Works just like in the case of 132 buttons, and it is extremely fast.
IV. for...of loop
The simplest option: not that fast, and executed in sequence. The script won't move on to page.screenshot until the loop has finished.
Code example
const elHandleArray = await page.$$('button')

for (const el of elHandleArray) {
  await el.click()
}

await page.screenshot({ path: 'clicks_for_of.png' })
await browser.close()
Specialties
async behavior works as expected at first sight
execution in sequence inside the loop
slow
132 buttons scenario result: ✅
Duration: 2957 ms
By watching the browser in headful mode we can see the page clicks are happening in strict order, also the screenshot is taken only after every element has been clicked.
1320 buttons scenario result: ✅
Duration: 25 396 ms
Works just like in the case of 132 buttons (but it takes more time).
Summary
Avoid using Array.map if you only want to perform async events and you aren't using the returned array; use forEach or for...of instead. ❌
Array.forEach is an option, but you need to wrap it so the next async method only starts after all promises are resolved inside the forEach (see the sketch after this list). ❌
Combine Array.forEach with $$eval for best performance if the order of async events doesn't matter inside the iteration. ✅
Use a for/for...of loop if speed is not vital and if the order of the async events does matter inside the iteration. ✅
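A sketch of the wrapping mentioned in the second bullet, reusing bside's collect-then-Promise.all pattern on the button clicks (the screenshot name is made up and this variant was not part of the timed comparison):

const elHandleArray = await page.$$('button')

const clickPromises = []
elHandleArray.forEach(el => {
  clickPromises.push(el.click()) // collect instead of awaiting inside forEach
})
await Promise.all(clickPromises) // the next line only runs after every click resolves

await page.screenshot({ path: 'clicks_foreach_wrapped.png' })
await browser.close()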
Sources / Recommended materials
Sebastien Chopin: JavaScript: async/await with forEach() (codeburst.io)
Antonio Val: Making array iteration easy when using async/await (Medium)
Using async/await with a forEach loop (Stackoverflow)
Await with array foreach containing async await (Stackoverflow)
Use page.evaluate to execute JS:
const puppeteer = require('puppeteer');

puppeteer.launch().then(async browser => {
  const page = await browser.newPage();
  await page.evaluate(() => {
    let elements = document.getElementsByClassName('showGoals');
    for (let element of elements)
      element.click();
  });
  // browser.close();
});
To get all the elements, you should use the page.$$ method, which is the equivalent of [...document.querySelectorAll(selector)] (spread into an array) from the regular browser API.
Then you could loop through it (map, for, whatever you like) and evaluate each link:
const getThemAll = await page.$$('a.showGoals')

getThemAll.forEach(async link => {
  await page.evaluate(el => el.click(), link)
})
Since you also want to act on the elements you got, I'd recommend using page.$$eval, which does the same as above and then runs an evaluation function over the array of elements, all in one line. For example:
await page.$$eval('a.showGoals', links => links.forEach(link => link.click()))
To explain the line above better: $$eval gathers an array of links and passes it to the callback function as an argument; the callback then runs through every link via the forEach method and finally executes the click function on each one.
Check the official documentation too; they have good examples there.
page.$$() / elementHandle.click()
You can use page.$$() to create an ElementHandle array based on the given selector, and then you can use elementHandle.click() to click each element:
const elements = await page.$$('a.showGoals');

elements.forEach(async element => {
  await element.click();
});
Note: Remember to await the click in an async function. Otherwise, you will receive the following error:
SyntaxError: await is only valid in async function
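If the code after the loop must wait for every click to finish, a for...of variant of the same snippet can be awaited end to end (a sketch, in line with the comparison above):

const elements = await page.$$('a.showGoals');
for (const element of elements) {
  await element.click(); // each click completes before the next starts
}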
I have a browser application where I want to change some text in the UI to say that the page is loading, then run a long process, and once the process is complete to say that the page is finished loading.
Using the code written below I can get this to work when I call ProperlyUpdatesUIAsync, where the text is changed while the long process is running, and then once the long process is complete, it changes again to indicate that it is done.
However, when I use the DoesNotUpdateUIUntilEndAsync method, the UI does not get updated until after the long process is finished, never showing the "loading" message.
Am I misunderstanding how async/await works with JavaScript? Why does it work in the one case but not in the other?
async function ProperlyUpdatesUIAsync(numberOfImagesToLoad) {
  $("#PageStatusLabel").text("Loading..");
  await pauseExecutionAsync(2000);
  $("#PageStatusLabel").text("Loaded");
}

// this method doesn't do anything other than wait for the specified
// time before allowing execution to continue
async function pauseExecutionAsync(timeToWaitMilliseconds) {
  return new Promise(resolve => {
    window.setTimeout(() => {
      resolve(null);
    }, timeToWaitMilliseconds);
  });
}

async function DoesNotUpdateUIUntilEndAsync(numberOfImagesToLoad) {
  $("#PageStatusLabel").text("Loading..");
  await runLongProcessAsync();
  $("#PageStatusLabel").text("Loaded");
}

async function runLongProcessAsync() {
  // there is a for loop in here that takes a really long time
}
Edit:
I experimented with a few things and this new refactor is giving me the desired result, but I do not like it. I wrapped the long running loop in a setTimeout with a timeout setting of 10. With a value of 10, the UI is updated before running the loop. However, a value of 0 or even 1 does not allow the UI to update, and it continues to behave as if the timeout was not declared at all. 10 seems so arbitrary. Can I really rely on that working in every scenario? Shouldn't async/await defer execution until the UI is updated without my having to wrap everything in a timeout?
async function runLongProcessThatDoesNotBlockUIAsync() {
  return new Promise(resolve => {
    window.setTimeout(() => {
      // there is a for loop in here that takes a really long time
      resolve(null);
    }, 10);
  });
}
EDITED
The code in runLongProcessAsync() never yields/surrenders the thread for updates to take place.
Try:
<!DOCTYPE html>
<html>
<script type="text/javascript">
  var keep;

  async function DoesNotUpdateUIUntilEndAsync(numberOfImagesToLoad) {
    document.getElementById("PageStatusLabel").innerHTML = "Loading..";
    p = new Promise((resolve) => { keep = resolve })
    setTimeout(theRest, 0); // let the Loading message appear
    return p;
  }

  async function theRest() {
    await runLongProcessAsync(); // Your await here is useless!
    document.getElementById("PageStatusLabel").innerHTML = "Loaded";
    keep();
  }

  async function runLongProcessAsync() {
    // there is a for loop in here that takes a really long time
    for (var x = 1; x < 1000000000; x++) { b = x ^ 2 }
  }
</script>
<body onload="DoesNotUpdateUIUntilEndAsync(5)">
  <p>Test</p>
  <p id="PageStatusLabel"></p>
</body>
</html>
I'm not sure what you are attempting, but my guess is that you want a Web Worker to give you another thread. Either that, or you don't understand that await just gets rid of the need for callbacks. If your code is purely synchronous, simply labelling it async does nothing.
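If the Web Worker route is what you need, here is a minimal sketch (the loop is the same stand-in busy-work as above, and the worker is built from an inline Blob so the example stays self-contained):

// Sketch: run the long loop on a worker thread so the main thread can repaint.
var workerSource = `
  onmessage = function () {
    var b = 0;
    for (var x = 1; x < 1000000000; x++) { b = x ^ 2 } // the long-running loop
    postMessage('done');
  };
`;
var worker = new Worker(URL.createObjectURL(new Blob([workerSource], { type: 'text/javascript' })));

document.getElementById("PageStatusLabel").innerHTML = "Loading..";
worker.onmessage = function () {
  document.getElementById("PageStatusLabel").innerHTML = "Loaded";
};
worker.postMessage('start');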