15

I'm scraping data from youtube and trying to get the number of comments. I'm trying to grab the element that contains the value, but in case the comments are disabled for a video, that element doesn't exist at all and waitForSelector() waits, I think, for about 30 seconds before ending the program. How can I tell puppeteer to wait for that element for, say, 5 seconds and if it doesn't exist, move on with the rest of the code?

Here's the code that I'm using-

await page.waitForSelector("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer")
let commentCount = await (await (await page.$("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer")).getProperty("textContent")).jsonValue()
Rohit
  • 1,385
  • 2
  • 15
  • 21
  • So you want to wait for 5 sec instead of 30 sec? – PySaad Aug 13 '19 at 05:42
  • 1
    according to the docs https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#browserwaitfortargetpredicate-options you can set timeout option in `waitForSelector` – Adam Kosmala Aug 13 '19 at 05:45

3 Answers3

31

Below code you can try for waiting -

await page.waitForSelector(yourSelector, {timeout: TIMEOUT});

Ex:

await page.waitForSelector(yourSelector, {timeout: 5000});

UPDATED:

To catch timeouterror and do something -

const {TimeoutError} = require('puppeteer/Errors');

try {
  await page.waitForSelector(yourSelector, {timeout: 5000});
} catch (e) {
  if (e instanceof TimeoutError) {
    // Do something if this is a timeout.
  }
}

Reference:

https://github.com/GoogleChrome/puppeteer/blob/master/docs/api.md#pagewaitforselectorselector-options

PySaad
  • 1,052
  • 16
  • 26
  • it worked, but it still stops execution when that timeout is done. how can I continue to execute the remaining code even though the timeout occurs? – Rohit Aug 13 '19 at 07:23
  • Updated answer, please check – PySaad Aug 13 '19 at 07:30
  • 1
    I did it using `page.evaluate()`, but thanks for the info on `TimeoutError`. I didn't know that. I'll also post the code as an answer here for anyone who might need it. Thanks again! – Rohit Aug 14 '19 at 06:57
5

Try code below, just add a timeout option

try {
    await page.waitForSelector("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer", {timeout: 5000 }); // 5s timeout
} catch (e) {
   // something wrong !!!!!!
}

let commentCount = await (await (await page.$("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer")).getProperty("textContent")).jsonValue()
lx1412
  • 1,160
  • 6
  • 14
1

I managed to run it in the browser and include it in page.evaluate(). Here's the code if anyone needs it in the future -

while(true){
    if(await page.evaluate(async () => { // scroll till there's no more room to scroll or the comment element shows up
        return await new Promise((resolve, reject) => {
           var scrolledHeight = 0  
           var distance = 100 
           var timer = setInterval(() => {
               var scrollHeight = document.documentElement.scrollHeight
               window.scrollBy(0, distance)
               scrolledHeight += distance
               if(scrolledHeight >= scrollHeight || document.querySelector("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer")){
                     clearInterval(timer)
                     resolve(true)
               }
           }, 500)
        })
    })){
        break
    }
}
let commentElement = await page.$("yt-formatted-string.count-text.style-scope.ytd-comments-header-renderer"), commentCount = null
if(commentElement !== null){
    commentCount = await (await commentElement.getProperty('innerHTML')).jsonValue()
}
Rohit
  • 1,385
  • 2
  • 15
  • 21