
I have been using Google Sheets' IMPORTXML to scrape Seeking Alpha for a long time. Recently I have found it very difficult to scrape Seeking Alpha news.

For example, to scrape the news from "https://seekingalpha.com/news/3611919-epic-makes-another-attempt-reversing-fortnite-app-store-ban" I use the formula below:

=JOIN(CHAR(13), importxml("https://seekingalpha.com/news/3611919-epic-makes-another-attempt-reversing-fortnite-app-store-ban","//div[@id='bullets_ul']"))

Most of the time it fails with the error "Could not fetch URL".

In the past, I could append "#1234" or some other number to the URL to force it to download again, but that does not seem to work any more.

Any ideas on how I can make the news download more reliable?

ZygD
tangkb

1 Answer


You're looking for a way to force IMPORTXML to refresh. There does not appear to be a keyboard shortcut for this, so you can either find an Apps Script to do it for you, or do it this way. (I'm sure there are other ways to do it.)

  1. Add a checkbox to some cell (B1 here)
  2. Use this:
=JOIN(,importxml(REPT(" ",B1)&"https://seekingalpha.com/news/3611919-epic-makes-another-attempt-reversing-fortnite-app-store-ban","//div[@id='bullets_ul']"))

Changing the URL forces the import to refresh: when the box is unchecked or checked, REPT prepends 0 or 1 space to the URL.

Note that we cannot use a volatile function such as RAND() (or any related function) or NOW() to force the refresh.
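
If you would rather take the Apps Script route mentioned above, here is a minimal sketch. It assumes the IMPORTXML formula sits in cell A2 of a sheet named "News"; both the sheet name and the cell are placeholders, so adjust them to your spreadsheet.

function refreshImportXml() {
  // Assumption: the IMPORTXML formula lives in cell A2 of a sheet named "News".
  const sheet = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("News");
  const cell = sheet.getRange("A2");
  const formula = cell.getFormula(); // remember the existing formula
  cell.setValue("");                 // clear the cell so the cached result is dropped
  SpreadsheetApp.flush();            // apply the clear before rewriting
  cell.setFormula(formula);          // write the formula back, forcing a new fetch
}

You can run this function manually from the Apps Script editor, or attach it to a time-driven trigger so the news refreshes on a schedule.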

General Grievance