I've been trying to build a stock scraper for a couple of different websites as a personal project. Recently I've been dabbling in Puppeteer to handle the scraping through a headless browser. However, when I try to scrape Walmart, it consistently fails with `UnhandledPromiseRejectionWarning: TimeoutError: waiting for XPath`. I've tried some debugging, but I don't understand what's going on. The same syntax works on every other website I've tried, so I figured I'd ask for some input.

const puppeteer = require('puppeteer');

async function scrapeWalmart(url) {
    const browser = await puppeteer.launch();
    try {
        const page = await browser.newPage();
        await page.goto(url);

        // Wait for the stock element to render; this is the call that times out.
        const xpath = '//*[@id="blitzitem-container"]/div/div/div/div';
        await page.waitForXPath(xpath);
        const [la] = await page.$x(xpath); // Destructure the first matching element
        const webText = await la.getProperty('textContent');
        const rawWebText = await webText.jsonValue();
        const webTextString = rawWebText.toString();
        console.log("Stock on Walmart: \n");
        console.log(webTextString + "\n");
    } finally {
        // Close the browser even when the wait above throws.
        await browser.close();
    }
}
  • Welcome to SO! It's hard to help without a link to the page or a relevant snippet of the HTML/JS that can reproduce the problem. See [mcve]. Thanks. – ggorlen Jul 18 '21 at 03:35
  • Have you checked that you've actually got to the desired page? I suspect there is some sort of bot protection page which is why the script can't find the data. Try making a screenshot after `page.goto`. – Vaviloff Jul 18 '21 at 10:36
  • @Vaviloff Yeah I’m pretty sure that it’s the desired page. I thought about it being bot protected, & making a screenshot is a good idea. I didn’t even think about that. – ShiftyHJP Jul 18 '21 at 16:51
  • just try it with {headless: false} and see if it's there / works – pguardiario Jul 19 '21 at 01:18
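
The two debugging suggestions in the comments (take a screenshot after `page.goto`, and run with `{headless: false}`) could be combined roughly as below. This is only a sketch, assuming a Puppeteer version that still provides `page.waitForXPath`, as the code in the question does. The names `waitOrDiagnose` and `debugWalmart`, the file name `walmart-debug.png`, and the 10-second timeout are hypothetical choices for illustration, not part of the original code.

```javascript
// Hypothetical helper: wait for an XPath and, if the wait fails, save evidence
// of what the browser actually loaded instead of letting the error escape.
// `page` is expected to be a Puppeteer Page.
async function waitOrDiagnose(page, xpath, timeoutMs = 10000) {
    try {
        await page.waitForXPath(xpath, { timeout: timeoutMs });
        return { found: true };
    } catch (err) {
        // A bot-protection or CAPTCHA page would show up in this screenshot.
        await page.screenshot({ path: 'walmart-debug.png', fullPage: true });
        return {
            found: false,
            title: await page.title(), // title of the page that really loaded
            url: page.url(),           // final URL after any redirects
            error: err.message,
        };
    }
}

// Hypothetical driver: same flow as the question, but with a visible browser.
async function debugWalmart(url) {
    const puppeteer = require('puppeteer'); // loaded only when this runs
    const browser = await puppeteer.launch({ headless: false });
    try {
        const page = await browser.newPage();
        await page.goto(url, { waitUntil: 'networkidle2' });
        const xpath = '//*[@id="blitzitem-container"]/div/div/div/div';
        console.log(await waitOrDiagnose(page, xpath));
    } finally {
        await browser.close();
    }
}
```

If the returned `title` or `url` does not belong to the product page, or the screenshot shows a verification prompt, the XPath is probably fine and the request is being blocked before the element ever renders.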

0 Answers