
I've looked at a lot of examples of scraping ESPN fantasy football leagues. I am very new to web scraping, so I researched this extensively before posting. Still, I am having trouble accessing my league and getting anything useful. I gather that you should pass cookies with the request to identify yourself when accessing a private league.

import requests
from bs4 import BeautifulSoup

page = requests.get('https://fantasy.espn.com/football/league?leagueId=########',
                    cookies={'SWID': '#######', 'espn_s2': '#######'}
)
soup = BeautifulSoup(page.text, 'html.parser')
test = soup.find_all(class_ = 'team-scores')

print(len(test))
print(type(test))
print(test)

0

<class 'bs4.element.ResultSet'>

[]

Based on this article, https://stmorse.github.io/journal/espn-fantasy-python.html, and some of the posts it references, cookies appear to be important to pass in. However, performing the request without the cookies gives the same result: I compared the soup with and without cookies, and the two were identical.

I know there are APIs for ESPN, but I cannot get any of the code to work for me. I was hoping to scrape the team names, then take each team's results and run every possible schedule for that team to get a distribution of outcomes and see how lucky or unlucky each team in my league was. I was also curious about doing this with Yahoo. At this point I could easily collect the data manually because there isn't much of it, but I would like a more generalizable approach.
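To make the "distribution of outcomes" idea concrete, here is a minimal sketch. It assumes a simplification that is not in the original plan: each week the opponent is drawn independently and uniformly from the other teams, which ignores real scheduling constraints. The function name `win_distribution` and the scores are hypothetical.

```python
from collections import defaultdict

def win_distribution(team, weekly_scores):
    """Return {wins: probability} for `team`, assuming a random opponent each week.

    weekly_scores: list of dicts, one per week, mapping team name -> points.
    Ties count as non-wins in this sketch.
    """
    dist = {0: 1.0}  # before any games: 0 wins with probability 1
    for scores in weekly_scores:
        mine = scores[team]
        others = [pts for name, pts in scores.items() if name != team]
        # Chance of beating a uniformly random opponent this week
        p_win = sum(pts < mine for pts in others) / len(others)
        nxt = defaultdict(float)
        for wins, prob in dist.items():
            nxt[wins + 1] += prob * p_win
            nxt[wins] += prob * (1 - p_win)
        dist = dict(nxt)
    return dist

# Hypothetical three-team, two-week league (made-up numbers)
weeks = [
    {"A": 100.0, "B": 90.0, "C": 110.0},
    {"A": 120.0, "B": 80.0, "C": 95.0},
]
dist = win_distribution("A", weeks)
```

Comparing a team's actual win total with this distribution gives a rough sense of how lucky its real schedule was.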

Any advice or help would be much appreciated by an inexperienced web scraper.

Feil Narley
  • I'm not sure if there is any data you want in your page.text, please print it out and have a look – dabingsou Dec 20 '19 at 01:32
  • @FeilNarley. I'm willing to help you out, however I don't have a fantasy team in ESPN this year (only did the NFL.com platform, and have had no issue accessing that data). So I would need your credentials (log in, etc.). Can you send me the league ID? Secondly, if you're accessing an API, it usually responds in JSON format, so using BeautifulSoup to parse HTML will return nothing. – chitown88 Dec 20 '19 at 11:31
  • Looks like I don't need credentials, just league ID. – chitown88 Dec 20 '19 at 15:41
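The point about JSON versus HTML can be illustrated offline. The sketch below uses two hypothetical stand-ins, not real ESPN responses: a page shell whose content would be filled in later by JavaScript (one likely reason, though not confirmed in this thread, that the `team-scores` search comes back empty), and a small JSON body using the `location` and `nickname` fields that appear in the answer's code.

```python
import json
from html.parser import HTMLParser

class ClassCounter(HTMLParser):
    """Count start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1

# Hypothetical stand-in for a JavaScript-rendered page shell
html_shell = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
# Hypothetical stand-in for a JSON API response
json_body = '{"teams": [{"id": 1, "location": "Team", "nickname": "White"}]}'

# Searching the shell for the class finds nothing, with or without cookies
counter = ClassCounter("team-scores")
counter.feed(html_shell)
team_scores_found = counter.count

# The JSON body is parsed directly into Python dicts and lists
data = json.loads(json_body)
team_names = [t["location"] + " " + t["nickname"] for t in data["teams"]]
```

Printing `page.text`, as suggested in the first comment, is the quickest way to check which kind of response you actually received.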

1 Answer


You'd have to share your league ID for me to test, but here's some code that does some data manipulation for a league. Basically, the data comes back in JSON format, and you then parse through it to work out wins and losses based on weekly points. You can then create a final table that compares the regular-season records to what the overall records would be, and see which teams performed above or below expectation given their schedule:

import requests
import pandas as pd



s = requests.Session()
r = s.get('https://www.espn.com')

swid = s.cookies.get_dict()['SWID']


league_id = 31181


url = 'https://fantasy.espn.com/apis/v3/games/ffl/seasons/2019/segments/0/leagues/%s' %league_id


r = requests.get(url, cookies={"swid": swid}).json()

#Get Team IDs
teamId = {}
for team in r['teams']:
    teamId[team['id']] = team['location'].strip() + ' ' + team['nickname'].strip()


#Get each team's weekly points and calculate their head-to-head records
weeklyPoints = {}
r = requests.get(url, cookies={"swid": swid}, params={"view": "mMatchup"}).json()

weeklyPts = pd.DataFrame()
for each in r['schedule']:
    #each = r['schedule'][0]

    week = each['matchupPeriodId']
    if week >= 14:
        continue

    homeTm = teamId[each['home']['teamId']]
    homeTmPts = each['home']['totalPoints']

    try:
        awayTm = teamId[each['away']['teamId']]
        awayTmPts = each['away']['totalPoints']
    except KeyError:
        # No 'away' entry means the home team has a bye this week; skip it
        continue

    temp_df = pd.DataFrame(list(zip([homeTm, awayTm], [homeTmPts, awayTmPts], [week, week])), columns=['team','pts','week'])

    if homeTmPts > awayTmPts:
        temp_df.loc[0,'win'] = 1
        temp_df.loc[0,'loss'] = 0
        temp_df.loc[0,'tie'] = 0

        temp_df.loc[1,'win'] = 0
        temp_df.loc[1,'loss'] = 1
        temp_df.loc[1,'tie'] = 0

    elif homeTmPts < awayTmPts:
        temp_df.loc[0,'win'] = 0
        temp_df.loc[0,'loss'] = 1
        temp_df.loc[0,'tie'] = 0

        temp_df.loc[1,'win'] = 1
        temp_df.loc[1,'loss'] = 0
        temp_df.loc[1,'tie'] = 0

    elif homeTmPts == awayTmPts:
        temp_df.loc[0,'win'] = 0
        temp_df.loc[0,'loss'] = 0
        temp_df.loc[0,'tie'] = 1

        temp_df.loc[1,'win'] = 0
        temp_df.loc[1,'loss'] = 0
        temp_df.loc[1,'tie'] = 1

    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    weeklyPts = pd.concat([weeklyPts, temp_df], sort=True).reset_index(drop=True)

weeklyPts['win'] = weeklyPts.groupby(['team'])['win'].cumsum()
weeklyPts['loss'] = weeklyPts.groupby(['team'])['loss'].cumsum()
weeklyPts['tie'] = weeklyPts.groupby(['team'])['tie'].cumsum()



# Calculate each team's record compared to all other teams' points week to week
cumWeeklyRecord = {}   
for week in weeklyPts[weeklyPts['pts'] > 0]['week'].unique():
    df = weeklyPts[weeklyPts['week'] == week]

    cumWeeklyRecord[week] = {}
    for idx, row in df.iterrows():
        team = row['team']
        pts = row['pts']
        win = len(df[df['pts'] < pts])
        loss = len(df[df['pts'] > pts])
        tie = len(df[df['pts'] == pts])

        cumWeeklyRecord[week][team] = {}
        cumWeeklyRecord[week][team]['win'] = win
        cumWeeklyRecord[week][team]['loss'] = loss
        cumWeeklyRecord[week][team]['tie'] = tie-1

# Combine those cumulative records to get an overall season record
overallRecord = {}     
for each in cumWeeklyRecord.items():
    for team in each[1].keys():
        if team not in overallRecord.keys():
            overallRecord[team] = {} 

        win = each[1][team]['win']
        loss = each[1][team]['loss']
        tie = each[1][team]['tie']

        if 'win' not in overallRecord[team].keys():
            overallRecord[team]['win'] = win
        else:
            overallRecord[team]['win'] += win

        if 'loss' not in overallRecord[team].keys():
            overallRecord[team]['loss'] = loss
        else:
            overallRecord[team]['loss'] += loss

        if 'tie' not in overallRecord[team].keys():
            overallRecord[team]['tie'] = tie
        else:
            overallRecord[team]['tie'] += tie


# A little cleaning up of the data and calculating win %
overallRecord_df = pd.DataFrame(overallRecord).T
overallRecord_df = overallRecord_df.rename_axis('team').reset_index()
overallRecord_df = overallRecord_df.rename(columns={'win':'overall_win', 'loss':'overall_loss','tie':'overall_tie'})
overallRecord_df['overall_win%'] = overallRecord_df['overall_win'] / (overallRecord_df['overall_win'] + overallRecord_df['overall_loss'] + overallRecord_df['overall_tie'])
overallRecord_df['overall_rank'] = overallRecord_df['overall_win%'].rank(ascending=False, method='min')




regularSeasRecord = weeklyPts[weeklyPts['week'] == 13][['team','win','loss', 'tie']]
regularSeasRecord['win%'] = regularSeasRecord['win'] / (regularSeasRecord['win'] + regularSeasRecord['loss'] + regularSeasRecord['tie'])
regularSeasRecord['rank'] = regularSeasRecord['win%'].rank(ascending=False, method='min')



final_df = overallRecord_df.merge(regularSeasRecord, how='left', on=['team'])

Output:

print (final_df.sort_values('rank').to_string())
                      team  overall_loss  overall_tie  overall_win  overall_win%  overall_rank   win  loss  tie      win%  rank
0             Luck Dynasty            39            0          104      0.727273           1.0  12.0   1.0  0.0  0.923077   1.0
10     Warsaw Widow Makers            48            0           95      0.664336           3.0  10.0   3.0  0.0  0.769231   2.0
2              Team Powell            60            0           83      0.580420           5.0   8.0   5.0  0.0  0.615385   3.0
1               Team White            46            0           97      0.678322           2.0   7.0   6.0  0.0  0.538462   4.0
3   The SouthWest Slingers            55            0           88      0.615385           4.0   7.0   6.0  0.0  0.538462   4.0
5               U MAD BRO?            71            0           72      0.503497           6.0   7.0   6.0  0.0  0.538462   4.0
11            Team Troxell            88            0           55      0.384615           9.0   7.0   6.0  0.0  0.538462   4.0
6          Organized Chaos            72            0           71      0.496503           7.0   6.0   7.0  0.0  0.461538   8.0
7         Jobobes Jabronis            88            0           55      0.384615           9.0   6.0   7.0  0.0  0.461538   8.0
4             Killa Bees!!            98            0           45      0.314685          11.0   4.0   9.0  0.0  0.307692  10.0
9             Faceless Men            86            0           57      0.398601           8.0   3.0  10.0  0.0  0.230769  11.0
8     Rollin with Mahomies           107            0           36      0.251748          12.0   1.0  12.0  0.0  0.076923  12.0
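
The "overall" columns compare each team's weekly score against every other team's score that week. As a rough illustration of that comparison without pandas, here is a minimal sketch on a made-up three-team league. The function name `all_play_record`, the scores, and the `actual_pct` values are all hypothetical.

```python
def all_play_record(weekly_scores):
    """Return {team: [wins, losses, ties]} against every other team, every week."""
    record = {}
    for scores in weekly_scores:
        for team, pts in scores.items():
            rec = record.setdefault(team, [0, 0, 0])
            for other, other_pts in scores.items():
                if other == team:
                    continue
                if pts > other_pts:
                    rec[0] += 1
                elif pts < other_pts:
                    rec[1] += 1
                else:
                    rec[2] += 1
    return record

# Hypothetical weekly scores (made-up numbers)
weeks = [
    {"A": 100.0, "B": 90.0, "C": 110.0},
    {"A": 120.0, "B": 80.0, "C": 95.0},
]
records = all_play_record(weeks)

# All-play win percentage per team
all_play_pct = {team: w / (w + l + t) for team, (w, l, t) in records.items()}

# Hypothetical actual head-to-head win percentages, for comparison only
actual_pct = {"A": 1.0, "B": 0.0, "C": 0.5}
# Positive means the real schedule treated the team better than all-play suggests
luck = {team: actual_pct[team] - all_play_pct[team] for team in all_play_pct}
```

In the table above, the same idea shows up as the gap between `win%` and `overall_win%`.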
chitown88
  • I did have to add in the espn_s2 cookie because the league is private. Thank you so much for the help. If you have any more interest in this, I made a chart showing the distribution of possible outcomes here (https://imgur.com/gallery/cyfqxXK). – Feil Narley Dec 25 '19 at 15:23
  • Oh nice! Thanks. – chitown88 Dec 25 '19 at 18:01
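
For readers with a private league, the comment above says the `espn_s2` cookie is needed as well. A minimal sketch of building such a request follows. It uses only the standard library so that it can be inspected without sending anything; the helper name `build_league_request` and the placeholder cookie values are hypothetical, and the URL pattern is the one used in the answer.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_league_request(league_id, season, swid, espn_s2, view=None):
    """Build (but do not send) a GET request for the league endpoint."""
    url = (
        "https://fantasy.espn.com/apis/v3/games/ffl/seasons/"
        f"{season}/segments/0/leagues/{league_id}"
    )
    if view:
        url += "?" + urlencode({"view": view})
    # Both cookies go in one Cookie header; the values here are placeholders
    cookie = f"SWID={swid}; espn_s2={espn_s2}"
    return Request(url, headers={"Cookie": cookie})

req = build_league_request(
    31181, 2019, "{PLACEHOLDER-SWID}", "PLACEHOLDER_S2", view="mMatchup"
)
```

With the `requests` library used in the answer, the equivalent call would be `requests.get(url, params={"view": "mMatchup"}, cookies={"SWID": swid, "espn_s2": espn_s2})`, with both values copied from a logged-in browser session.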