
I've been having trouble scraping data from stats.nba.com. I've done this a few times before, so I'm not sure what has changed, but I wanted to see if anyone else is having the same problem.

I usually just use jsonlite with the request url like so:

fromJSON("http://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Per36&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=")

R just seems to hang when running this code. Interestingly, I can still scrape the NBA's D-League site without any trouble:

fromJSON("http://stats.nbadleague.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=20&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Per36&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=")

Anyone else having this issue?

intern

2 Answers


Try sending a browser-like User-Agent header with the request:

library(httr)
library(rjson)
url = "http://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=&DateTo=&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=Per36&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2016-17&SeasonSegment=&SeasonType=Regular+Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight="
agent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
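# Send the request with a browser-like User-Agent header
data = GET(url, user_agent(agent))
# Extract the response body as text, then parse the JSON
fromJSON(content(data, as = "text", encoding = "UTF-8"))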
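Since the original call appeared to hang, it may also help to add a timeout and an explicit status check, so a blocked request fails fast instead of stalling the session. The sketch below is only an illustration: `fetch_json` is a hypothetical helper name, the 30-second default is an arbitrary choice, and the site may well require other headers beyond the User-Agent.

```r
library(httr)

# Hypothetical helper (not part of the snippet above): fetch a JSON endpoint
# with a browser-like User-Agent and a timeout so the call cannot hang forever.
fetch_json <- function(url, agent, seconds = 30) {
  resp <- GET(url, user_agent(agent), timeout(seconds))
  # Turn HTTP errors (4xx/5xx) into R errors instead of parsing an error page
  stop_for_status(resp)
  # Read the body as text, then parse it; jsonlite is called by namespace
  # so it does not clash with rjson::fromJSON if both packages are loaded
  jsonlite::fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}

# Usage, reusing `url` and `agent` as defined earlier in this answer:
# stats <- fetch_json(url, agent)
```

If the request still times out, that at least confirms the server is not responding to this client, as opposed to R being slow to parse the result.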
Shane Kao

I messed around with this for hours. My best guess is that it has something to do with the "/error" redirect (see picture) on the NBA stats URL, which does not occur on the D-League URL.

[Image not preserved: screenshot of the "/error" redirect]

The code that worked for me reads the JSON as text first using readLines(), then passes the result to fromJSON():

library(jsonlite)
jsonTxt <- readLines("https://stats.nba.com/stats/leaguedashplayerstats?College=&Conference=&Country=&DateFrom=01/01/2017&DateTo=09/30/2017&Division=&DraftPick=&DraftYear=&GameScope=&GameSegment=&Height=&LastNGames=0&LeagueID=00&Location=&MeasureType=Base&Month=0&OpponentTeamID=0&Outcome=&PORound=0&PaceAdjust=N&PerMode=PerGame&Period=0&PlayerExperience=&PlayerPosition=&PlusMinus=N&Rank=N&Season=2016-17&SeasonSegment=&SeasonType=Regular%20Season&ShotClockRange=&StarterBench=&TeamID=0&VsConference=&VsDivision=&Weight=")
json <- fromJSON(txt = jsonTxt)

colnames(json$resultSets)
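To get from the parsed `json` object to a usable table, something like the following might work. It is a rough sketch that assumes each entry of `resultSets` carries a `headers` field (column names) and a `rowSet` field (the rows); those field names are an assumption, so check them against the `colnames()` output above. `result_set_to_df` is a hypothetical helper, and the toy payload only stands in for the real response.

```r
library(jsonlite)

# Hypothetical helper: build a data frame from one entry of `resultSets`,
# ASSUMING each entry has `headers` (column names) and `rowSet` (rows).
result_set_to_df <- function(json, i = 1) {
  rs <- json$resultSets
  rows <- rs$rowSet[[i]]
  # jsonlite usually simplifies rows to a matrix; bind them if it did not
  if (is.list(rows)) rows <- do.call(rbind, rows)
  df <- as.data.frame(rows, stringsAsFactors = FALSE)
  names(df) <- rs$headers[[i]]
  # Mixed-type JSON rows are simplified to character, so re-guess column types
  df[] <- lapply(df, type.convert, as.is = TRUE)
  df
}

# Toy payload with made-up values, shaped like the assumed response
toy <- fromJSON('{"resultSets":[{"name":"Demo","headers":["PLAYER_NAME","PTS"],"rowSet":[["A",10.5],["B",7]]}]}')
toy_df <- result_set_to_df(toy)
```

With the real response, `result_set_to_df(json)` would be the equivalent call.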
pim