
Please bear with me, I'm self-taught and not great at writing code.

I'm trying to scrape some data from a web page into Google Sheets, but the page "lazy loads," so not all of the data is there immediately on page load; the delay is a few seconds. My preference would be to use the IMPORTXML formula, but it was only returning partial results for this reason. Next I tried writing a script, because I thought I could use Utilities.sleep to pause long enough for the rest of the page to load, but I'm getting the same results as I did with IMPORTXML.

I wonder if I'm just putting Utilities.sleep in the wrong place in the code? I thought it would need to go after UrlFetchApp.fetch(url) but before the match logic, but I think that can't work because the fetch has already completed by then. Is there a way to add a 5-10 second pause to let the page load before doing the fetch? And does anyone know if there's a way to do this within the IMPORTXML formula itself?

Thanks so much for your time and consideration!

https://www.expeditions.com/destinations/alaska

function lowprice(url) {
  var html, content = '';
  var response = UrlFetchApp.fetch(url);

  // Sleeping here cannot help: fetch() has already returned, and it only
  // ever returns the initial HTML -- UrlFetchApp never executes the page's
  // JavaScript, so lazy-loaded content is never in the response.
  // Utilities.sleep(10 * 1000);

  // this one pulls all price-formatted numbers
  // var regex = /(\$[0-9,]*)/g;

  // (?<=...) is a lookbehind: match only text immediately preceded by it.
  // (?=...) is a lookahead: match only text immediately followed by it.
  // (The original used (?<!<\/p>), a negative lookbehind, which does not
  // mean "followed by </p>".)
  var regex = /(?<="itinerary-card__price">)(\$[0-9,]*)(?=<\/p>)/g;

  if (response) {
    html = response.getContentText();
    if (html) content = html.match(regex);
  }

  return content;
}
  • It's not about the delay, it's about the browser executing JS. With automated systems like IMPORTXML or UrlFetchApp, there's no browser to run the JS in. – TheMaster Jul 08 '22 at 13:18
  • Thank you, TheMaster. Now that you say it, I had heard that before, but I didn't think of the implications for this. Back to the drawing board. Also, thanks for the quick reply. – Clayton Jul 08 '22 at 13:28
