
Ideally I would access this website's API, but since I am struggling to do that, I have decided to scrape the page instead. I am starting at this page: https://fantasy.sixnationsrugby.com/#/welcome/login, where I plan to log in and then scrape the data.

The code below works for every other website I have tested, but not this one. I can't pull anything from it: no text, no forms, nothing. As an example, I just want to scrape the main header title, 'Let's Go! Log in to your account'.

  require 'mechanize'

  def scrape
    agent = Mechanize.new

    # Fetch the page and parse the HTML returned by the server
    page = agent.get('https://fantasy.sixnationsrugby.com/#/welcome/login')

    # Extract the header text by CSS selector
    header_title = page.search('div.fs-box-header-title').text.strip
    @output = header_title
  end

Is it something to do with how the page is rendered? Thanks

ldthompson
  • Disable JS in your browser and visit the page. That's why. – max Feb 07 '23 at 15:27
  • So the page basically doesn't load? Does that mean I won't be able to scrape at all? – ldthompson Feb 07 '23 at 15:33
  • Simple web scrapers are not browsers. They are just an HTTP client and an HTML parser cobbled together; they don't run JS or have a render tree, a DOM, or any of the other things needed to actually render this page. While you could do this by automating a browser (for example with Selenium), web scraping is always extremely fragile (read: a shit-show), and you should ask yourself whether the data is available through a public API instead, or whether the project is actually a good idea. – max Feb 07 '23 at 16:00
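
One way to see the point made in the comments: everything after the `#` in the URL is a fragment, which an HTTP client never sends to the server. The sketch below uses only Ruby's standard `uri` library to split the URL from the question; the helper name `split_request` is made up for illustration. It suggests that a client like Mechanize only ever requests `/`, and that the `/welcome/login` route is presumably resolved afterwards by JavaScript in the browser.

```ruby
require 'uri'

# Hypothetical helper: split a URL into the part an HTTP client sends to
# the server and the fragment that only client-side code ever sees.
def split_request(url)
  uri = URI.parse(url)
  {
    host: uri.host,
    sent_to_server: uri.request_uri, # path plus query string, no fragment
    client_only: uri.fragment        # the text after '#', never transmitted
  }
end

parts = split_request('https://fantasy.sixnationsrugby.com/#/welcome/login')
puts "Requested from #{parts[:host]}: #{parts[:sent_to_server]}"
puts "Kept on the client: #{parts[:client_only]}"
```

Here `sent_to_server` comes out as `/` and `client_only` as `/welcome/login`, so the HTML that `agent.get` receives is whatever the server returns for the site root, before any script has run.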
