Generally, you cannot rely on await tempPage.click("a.orange")
to pause execution until tempPage
has "finish[ed] its tasks". For super simple code that executes synchronously, it may work. But in general, you cannot rely on it.
If the click triggers an Ajax operation, or starts a CSS animation, or starts a computation that cannot be immediately computed, or opens a new page, etc., then the result you are waiting for is asynchronous, and the .click
method will not wait for this asynchronous operation to complete.
What can you do? In some cases you may be able to hook into the code that is running on the page and wait for some event that matters to you. For instance, if you want to wait for an Ajax operation to be done and the code on the page uses jQuery, then you might use ajaxComplete
to detect when the operation is complete. If you cannot hook into any event system to detect when the operation is done, then you may need to poll the page to wait for evidence that the operation is done.
Here is an example that shows the issue:
const puppeteer = require('puppeteer');
function getResults(page) {
return page.evaluate(() => ({
clicked: window.clicked,
asynchronousResponse: window.asynchronousResponse,
}));
}
puppeteer.launch().then(async browser => {
const page = await browser.newPage();
await page.goto("https://example.com");
// We add a button to the page that will click later.
await page.evaluate(() => {
const button = document.createElement("button");
button.id = "myButton";
button.textContent = "My Button";
document.body.appendChild(button);
window.clicked = 0;
window.asynchronousResponse = 0;
button.addEventListener("click", () => {
// Synchronous operation
window.clicked++;
// Asynchronous operation.
setTimeout(() => {
window.asynchronousResponse++;
}, 1000);
});
});
console.log("before clicks", await getResults(page));
const button = await page.$("#myButton");
await button.click();
await button.click();
console.log("after clicks", await getResults(page));
await page.waitForFunction(() => window.asynchronousResponse === 2);
console.log("after wait", await getResults(page));
await browser.close();
});
The setTimeout
code simulates any kind of asynchronous operation started by the click.
When you run this code, you'll see on the console:
before click { clicked: 0, asynchronousResponse: 0 }
after click { clicked: 2, asynchronousResponse: 0 }
after wait { clicked: 2, asynchronousResponse: 2 }
You see that clicked
is immediately incremented twice by the two clicks. However, it takes a while before asynchronousResponse
is incremented. The statement await page.waitForFunction(() => window.asynchronousResponse === 2)
polls the page until the condition we are waiting for is realized.
You mentioned in a comment that the button is closing the tab. Opening and closing tabs are asynchronous operations. Here's an example:
puppeteer.launch().then(async browser => {
let pages = await browser.pages();
console.log("number of pages", pages.length);
const page = pages[0];
await page.goto("https://example.com");
await page.evaluate(() => {
window.open("https://example.com");
});
do {
pages = await browser.pages();
// For whatever reason, I need to have this here otherwise
// browser.pages() always returns the same value. And the loop
// never terminates.
await page.evaluate(() => {});
console.log("number of pages after evaluating open", pages.length);
} while (pages.length === 1);
let tempPage = pages[pages.length - 1];
// Add a button that will close the page when we click it.
tempPage.evaluate(() => {
const button = document.createElement("button");
button.id = "myButton";
button.textContent = "My Button";
document.body.appendChild(button);
window.clicked = 0;
window.asynchronousResponse = 0;
button.addEventListener("click", () => {
window.close();
});
});
const button = await tempPage.$("#myButton");
await button.click();
do {
pages = await browser.pages();
// For whatever reason, I need to have this here otherwise
// browser.pages() always returns the same value. And the loop
// never terminates.
await page.evaluate(() => {});
console.log("number of pages after click", pages.length);
} while (pages.length > 1);
await browser.close();
});
When I run the above, I get:
number of pages 1
number of pages after evaluating open 1
number of pages after evaluating open 1
number of pages after evaluating open 2
number of pages after click 2
number of pages after click 1
You can see it takes a bit before window.open()
and window.close()
have detectable effects.
In your comment you also wrote:
I thought await
was basically what turned an asynchronous function into a synchronous one
I would not say it turns asynchronous functions into synchronous ones. It makes the current code wait for an asynchronous operation's promise to be resolved or rejected. However, more importantly for the issue at hand here, the problem is that you have two virtual machines executing JavaScript code: there's Node which runs puppeteer
and the script that controls the browser, and there's the browser itself which has its own JavaScript virtual machine. Any await
that you use on the Node side affects only the Node code: it has no bearing on the code that runs in the browser.
It can get confusing when you see things like await page.evaluate(() => { some code; })
. It looks like it is all of one piece, and all executing in the same virtual machine, but it is not. puppeteer
takes the parameter passed to .evaluate
, serializes it, and sends it over to the browser, where it executes. Try adding something like await page.evaluate(() => { button.click(); });
in the script above, after const button = ...
. Something like this:
const button = await tempPage.$("#myButton");
await button.click();
await page.evaluate(() => { button.click(); });
In the script, button
is defined before page.evaluate
, but you'll get a ReferenceError
when page.evaluate
runs because button
is not defined on the browser side!