
I'm trying to parse the contents of the table on this website. I tried the program below:

    import requests
    from bs4 import BeautifulSoup

    url = "http://www.espncricinfo.com/rankings/content/page/211270.html"
    page = requests.get(url)
    soup = BeautifulSoup(page.text, "html.parser")
    batsman_type = soup.find_all('h3')[0].text
    ret = []

    # Collect one dict per table row that has all four data cells
    for tr in soup.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) >= 4:
            row = {'Rank': tds[0].text, 'Name': tds[1].text, 'Country': tds[2].text, 'Rating': tds[3].text}
            ret.append(row)
    print(ret)

When I print the `td` elements, nothing is returned. How can I fix this to get all the contents of the table? If you need any more information, please feel free to ask.

OneCricketeer
steve
  • Possible duplicate of [How can I parse a dynamic page using Python?](https://stackoverflow.com/questions/36225844/how-can-i-parse-a-dynamic-page-using-python) – OneCricketeer Feb 06 '18 at 14:28
  • @cricket_007 Isn't it possible using only BeautifulSoup? – steve Feb 06 '18 at 14:31
  • No. I'm pretty sure that page is loaded with JavaScript. BeautifulSoup can't parse something Requests won't load – OneCricketeer Feb 06 '18 at 14:35
  • You can use `BeautifulSoup` to parse the data, but you **can't** use the `requests` module to get that data. You'll have to use some other tool (like Selenium). – Keyur Potdar Feb 06 '18 at 14:40
  • I'm getting this warning: `warnings.warn('Selenium support for PhantomJS has been deprecated, please use headless '` – steve Feb 06 '18 at 17:10
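
The approach suggested in the comments could be sketched roughly as follows: let a headless browser render the page, then hand the resulting HTML to BeautifulSoup. This is only a sketch under some assumptions. Headless Chrome driven by Selenium is one possible replacement for PhantomJS (the comments do not name a specific browser), and the helper names `fetch_rendered_html` and `parse_rankings` are made up for illustration. It also assumes the table is already present once `driver.get` returns; if the page fills it in later, an explicit wait may be needed.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url):
    """Load the page in a headless browser and return the HTML after scripts run.

    Assumption: Chrome and a matching driver are installed locally.
    """
    # Imported here so parse_rankings() below can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless")  # run Chrome without opening a window
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even if loading fails


def parse_rankings(html):
    """Return one dict per table row that has at least four <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        tds = tr.find_all("td")
        if len(tds) >= 4:  # skips header rows, which use <th> instead of <td>
            rows.append({
                "Rank": tds[0].get_text(strip=True),
                "Name": tds[1].get_text(strip=True),
                "Country": tds[2].get_text(strip=True),
                "Rating": tds[3].get_text(strip=True),
            })
    return rows


if __name__ == "__main__":
    url = "http://www.espncricinfo.com/rankings/content/page/211270.html"
    print(parse_rankings(fetch_rendered_html(url)))
```

Keeping the parsing in its own function means it can be checked against a small HTML string without launching a browser. If `parse_rankings` returns an empty list for the HTML that `requests` downloads, that would be consistent with the table being added by JavaScript after the page loads.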

0 Answers