I'm working with data on how frequently college basketball teams take 2's and 3's. I plan on multiplying the frequency at which they take 3's by 3, and adding it to the frequency at which they take 2's multiplied by 2. A function will be doing this. That function will be put into a much larger function later, but that shouldn't raise any constraints (I don't think).

Here are the first 11 rows of the pandas dataset:

Team         3PtTakeRate  2PtTakeRate
Savannah St  0.577        0.423
Quinnipiac   0.538        0.462
Citadel      0.536        0.464
Villanova    0.535        0.465
Winthrop     0.527        0.473
Longwood     0.501        0.499
Elon         0.500        0.500
Auburn       0.496        0.504
Campbell     0.490        0.510
N Dakota St  0.482        0.518
N Hampshire  0.481        0.519

If it matters, I loaded the data from a csv file with this:

import pandas as pd

TeamShotChoices = pd.read_csv("NCAAExpValue.csv", sep=',')

Here's what my function looks like:

def PtsPerSuccess(Team):
    TeamPts = ((TeamShotChoices.loc[TeamShotChoices['Team']==Team,'3PtTakeRate']) * 3) + ((TeamShotChoices.loc[TeamShotChoices['Team']==Team,'2PtTakeRate']) * 2)
    return TeamPts

The Team argument will be the team name in quotes. For the record, in the larger function, this argument will be pulled from a list of strings, and I'll need to find this value for a team AND the following team... but I should be able to use [i] and [i + 1] as indices. So, again, should be fine...

When I run this function, for example:

PtsPerSuccess('Savannah St')

what I get is this:

0    2.577
dtype: float64

I'm going to be using the 2.577 as a number that I multiply by, and then using that resulting product in an if statement to determine winners of simulated games. So the way this is returning won't work.

What I'm confused by is why it is giving me all of that information. I don't want the 0 (which is the row number), and I don't want the dtype. I just want the function, in this case, to return 2.577.

Nick ODell
Cornel Westside

1 Answer

The simple reason is that you're performing transformations on a pandas object, which normally contains multiple values. pandas doesn't know that each value of Team is unique (how could it?), so it assumes that the selection and multiplication operations produce another object that also contains multiple values.
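
To make that concrete, here is a minimal sketch showing that a boolean-mask selection hands back a Series even when only one row matches. The small DataFrame is a stand-in for the CSV, built from three of the rows shown in the question; the names mask and selected are only for illustration.

```python
import pandas as pd

# Stand-in for NCAAExpValue.csv, using three rows copied from the question.
TeamShotChoices = pd.DataFrame({
    "Team": ["Savannah St", "Quinnipiac", "Citadel"],
    "3PtTakeRate": [0.577, 0.538, 0.536],
    "2PtTakeRate": [0.423, 0.462, 0.464],
})

# The comparison yields one True/False per row, not a single row lookup.
mask = TeamShotChoices["Team"] == "Quinnipiac"

# Selecting with that mask returns every matching row, so the result is a Series.
selected = TeamShotChoices.loc[mask, "3PtTakeRate"]

print(type(selected).__name__)  # Series, even though only one row matched
print(len(selected))            # 1
print(list(selected.index))     # [1]: the original row label is kept
```

The mask could just as well match zero rows or several, which is why pandas keeps the container (and the row label and dtype you see printed) rather than collapsing to a bare number.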

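To disrupt your code the least, you can just change your return statement to return TeamPts.iloc[0]. (Plain TeamPts[0] looks up the index label 0, so it only works for the team that happens to be in row 0, such as Savannah St.)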
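
As a sketch of what the function might look like when it returns a plain number, the version below takes the first element by position with .iloc[0], so it does not depend on which row label the team has. The DataFrame is again a stand-in for the CSV with rows copied from the question, and the row variable is an illustrative addition, not part of the original code.

```python
import pandas as pd

# Stand-in for NCAAExpValue.csv, using three rows copied from the question.
TeamShotChoices = pd.DataFrame({
    "Team": ["Savannah St", "Quinnipiac", "Citadel"],
    "3PtTakeRate": [0.577, 0.538, 0.536],
    "2PtTakeRate": [0.423, 0.462, 0.464],
})

def PtsPerSuccess(Team):
    # Filter once, then reuse the matching row(s) for both columns.
    row = TeamShotChoices.loc[TeamShotChoices["Team"] == Team]
    TeamPts = row["3PtTakeRate"] * 3 + row["2PtTakeRate"] * 2
    # .iloc[0] is positional, so it works whatever the row label is;
    # float() converts the NumPy scalar to a built-in Python float.
    return float(TeamPts.iloc[0])

pts = PtsPerSuccess("Quinnipiac")  # this team sits at row label 1, not 0
print(type(pts).__name__)          # float
print(round(pts, 3))               # 2.538
```

If a team name is missing, .iloc[0] raises an IndexError. If you would rather fail whenever the match is not exactly one row, TeamPts.item() is an alternative that raises a ValueError in that case.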

gmds
  • The thing you proposed at the end will probably work. Thanks for that... Will let you know if it breaks when I put it into a larger function. As for the first point: I'm directly telling it that, in the row, the Team value has to equal the argument. So shouldn't it just pull the right column values from that row? Not really sure why that method isn't working. – Cornel Westside Apr 13 '19 at 01:38