I am trying to scrape a player stats table for NBA stats using requests and BeautifulSoup, but the response I am getting is not the same as what I see using "Inspect Element".

The div containing this table has the class attribute class="nba-stat-table__overflow". However, whenever I run the following code I get an empty list:

table = soup.find_all('div',attrs={'class="nba-stat-table__overflow'})

Here is my full code:

import os
import pandas as pd
import numpy as np
from bs4 import BeautifulSoup
import requests

url = 'https://stats.nba.com/players/boxscores/?Season=2018-19&SeasonType=Regular%20Season'
response = requests.get(url)
soup = BeautifulSoup(response.content,'html.parser')
table = soup.find_all('div',attrs={'class="nba-stat-table__overflow'})
j3ff

1 Answer

The page is loaded via JavaScript, so requests and bs4 cannot see the table: requests only downloads the initial HTML, and neither module executes JavaScript.
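
One way to confirm this is to check whether the target class appears at all in the HTML that requests receives. The sketch below uses only the standard library's html.parser on two made-up HTML strings, one standing in for response.text and one for what the browser shows after scripts have run. The sample markup and the helper name count_elements_with_class are assumptions for illustration, not the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Count start tags of a given name that carry a given CSS class."""

    def __init__(self, tag, css_class):
        super().__init__()
        self.tag = tag
        self.css_class = css_class
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag != self.tag:
            return
        # The class attribute may hold several space-separated names.
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.count += 1


def count_elements_with_class(html, tag, css_class):
    parser = ClassCounter(tag, css_class)
    parser.feed(html)
    return parser.count


# Hypothetical stand-ins: what a JS-driven page might send to requests
# versus what the browser shows after its scripts have run.
static_html = (
    '<html><body><div id="app"></div>'
    '<script src="app.js"></script></body></html>'
)
rendered_html = (
    '<html><body><div id="app">'
    '<div class="nba-stat-table__overflow"><table></table></div>'
    '</div></body></html>'
)

static_hits = count_elements_with_class(static_html, "div", "nba-stat-table__overflow")
rendered_hits = count_elements_with_class(rendered_html, "div", "nba-stat-table__overflow")
print(static_hits, rendered_hits)
```

If the count is zero for response.text from the real page, the element is being created by JavaScript in the browser. With BeautifulSoup the lookup would be written soup.find_all('div', class_='nba-stat-table__overflow'), and it would still return an empty list for the same reason.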

You could use selenium or requests_html to render the JavaScript, but I noticed that the website fetches its data from an API, which can be called directly. So I called it and extracted the data.

See my previous answer, which explains how to find the API.

import requests
import pandas as pd

params = {
    "Counter": "1000",
    "DateFrom": "",
    "DateTo": "",
    "Direction": "DESC",
    "LeagueID": "00",
    "PlayerOrTeam": "P",
    "Season": "2018-19",
    "SeasonType": "Regular Season",
    "Sorter": "DATE"
}


headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0',
    "x-nba-stats-origin": "stats",
    "x-nba-stats-token": "true",
    "Referer": "https://stats.nba.com/players/boxscores/?Season=2018-19&SeasonType=Regular%20Season"
}


def main(url):
    r = requests.get(url, params=params, headers=headers).json()
    goal = []
    for item in r['resultSets']:
        df = pd.DataFrame(item['rowSet'], columns=item['headers'])
        goal.append(df)

    new = pd.concat(goal)
    print(new)
    new.to_csv("data.csv", index=False)


main("https://stats.nba.com/stats/leaguegamelog")
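
The loop in main relies on each entry of resultSets carrying a headers list and a rowSet list of rows. As a rough illustration of that pairing without pandas, here is a sketch on a made-up payload. The column names and values are invented placeholders, not real API output, and result_sets_to_records is a hypothetical helper name.

```python
# Hypothetical payload shaped like the JSON that main() reads.
sample = {
    "resultSets": [
        {
            "headers": ["PLAYER_NAME", "GAME_DATE", "PTS"],
            "rowSet": [
                ["Player A", "2019-04-10", 20],
                ["Player B", "2019-04-10", 15],
            ],
        }
    ]
}


def result_sets_to_records(payload):
    """Pair every row with its headers, one dict per row."""
    records = []
    for result_set in payload["resultSets"]:
        headers = result_set["headers"]
        for row in result_set["rowSet"]:
            records.append(dict(zip(headers, row)))
    return records


records = result_sets_to_records(sample)
print(records[0])
```

pd.DataFrame(item['rowSet'], columns=item['headers']) does the same pairing in one step, with each header becoming a column name.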

Output: View-Online

  • Thanks for the solution Ahmed, but can you explain to me why my code was not working? Also, how does your code work? Can you share a link where I can learn more about scraping? – Shantanu Bisht Apr 17 '20 at 12:36
  • Please explain to me why I was getting an empty list. I have watched a lot of video tutorials on scraping and did exactly the same thing, but got an empty list even when I was scraping rottentomatoes.com – Shantanu Bisht Apr 17 '20 at 12:38
  • @ShantanuBisht answer updated. If my answer helped you, feel free to [accept and upvote](https://meta.stackexchange.com/a/5235/734454). – αԋɱҽԃ αмєяιcαη Apr 17 '20 at 12:42
  • @Ahmed thanks a lot, and sorry for bugging you with questions, but can you tell me how to know if a website is loaded via JavaScript and using an API? – Shantanu Bisht Apr 17 '20 at 12:48
  • @ShantanuBisht you're welcome. Read that answer and you will understand: [check](https://stackoverflow.com/questions/61044616/using-find-all-function-returns-an-unexpected-result-set/61045691#61045691) – αԋɱҽԃ αмєяιcαη Apr 17 '20 at 12:53