I have built a web scraper using goquery. However, it can only retrieve the metadata of the first 14 or 15 articles, because the remaining articles only appear after manually clicking a "Load more" button.

As far as my limited understanding of "asynchronous" goes, I don't think the new articles are loaded asynchronously, because once they become visible I can find their text under "View page source". So I guess that's a plus.

How can I deal with this problem? What are my options to scrape beyond the initial 15 articles?

Mydrive1997
  • If the articles are only visually hidden, why do you stick to the visible ones? Put them all in a list and do your own paging. Or do you mean it's only the next 15 articles preloaded or something? – Peter Krebs May 07 '21 at 12:54
  • Please share some of the code you've written, or maybe the website you're trying to scrape data from (we might be able to help you more) – nicolasassi May 07 '21 at 20:57
  • The URL is: https://cointelegraph.com/tags/markets As you can see, it only displays 14-15 articles at first, so my code can only scrape those. The remaining articles are NOT visually hidden; they are not even there unless I click the "Load more" button first. – Mydrive1997 May 10 '21 at 02:18
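
One option that may be worth checking, sketched here under assumptions: a "Load more" button usually triggers an additional HTTP request, which can be inspected in the browser's developer tools (Network tab) while clicking it. If that request turns out to be a paginated endpoint, it can be called directly in a loop instead of clicking. In the sketch below, the endpoint URL, the `page` query parameter, the JSON shape, and the names `Article`, `fetchPage`, and `collectAll` are all hypothetical; they are not the site's actual API and would have to be replaced with whatever the Network tab shows.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// Article holds the metadata of one article.
// The field names are hypothetical and must match the real response.
type Article struct {
	Title string `json:"title"`
	URL   string `json:"url"`
}

// fetchPage requests one page from a HYPOTHETICAL paginated endpoint
// that is assumed to return a JSON array of articles.
func fetchPage(baseURL string, page int) ([]Article, error) {
	resp, err := http.Get(fmt.Sprintf("%s?page=%d", baseURL, page))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("page %d: unexpected status %s", page, resp.Status)
	}

	var articles []Article
	if err := json.NewDecoder(resp.Body).Decode(&articles); err != nil {
		return nil, err
	}
	return articles, nil
}

// collectAll calls fetch for pages 1, 2, 3, ... and stops at the first
// empty page, the first error, or after maxPages pages.
func collectAll(fetch func(page int) ([]Article, error), maxPages int) ([]Article, error) {
	var all []Article
	for page := 1; page <= maxPages; page++ {
		batch, err := fetch(page)
		if err != nil {
			return all, err // return what was collected so far
		}
		if len(batch) == 0 {
			break // no more articles
		}
		all = append(all, batch...)
	}
	return all, nil
}

func main() {
	// Placeholder URL: replace with the request seen in the Network tab.
	const endpoint = "https://example.com/api/articles"

	articles, err := collectAll(func(p int) ([]Article, error) {
		return fetchPage(endpoint, p)
	}, 5)
	if err != nil {
		fmt.Println("stopped early:", err)
	}
	fmt.Println("collected", len(articles), "articles")
}
```

If the request returns an HTML fragment rather than JSON, the response body could be passed to `goquery.NewDocumentFromReader` inside `fetchPage` and parsed with the same selectors already used for the first 15 articles. If no such request can be found, a headless browser that actually clicks the button would be the fallback.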

0 Answers