I have a DataFrame imported from a csv file called test_data that's structured like this:
School Name | Test Score | Students in Grade 11 | Students in Grade 12 |
---|---|---|---|
PS 1 | Advanced | 170 | 106 |
PS 1 | Basic | 87 | 53 |
PS 2 | Advanced | 141 | 120 |
PS 2 | Basic | 168 | 167 |
I wanted the data for each school to be in one row only, so I used .pivot_table() to do this:
test_data_by_grade_11 = test_data.pivot_table("Students in Grade 11", ["School Name"], "Test Score")
test_data_by_grade_12 = test_data.pivot_table(("Students in Grade 12", ["School Name"], "Test Score")
test_data_by_grade_11.rename(columns = {"Advanced":"Grade 11 Advanced", "Basic":"Grade 11 Basic"}, inplace = True)
test_data_by_grade_12.rename(columns = {"Advanced":"Grade 12 Advanced", "Basic":"Grade 12 Basic"}, inplace = True)
And I successfully got two tables like this with the word "Test Score" above the headers of the table:
School Name | Grade 11 Advanced | Grade 11 Basic |
---|---|---|
PS 1 | 170 | 87 |
PS 2 | 141 | 168 |
School Name | Grade 12 Advanced | Grade 12 Basic |
---|---|---|
PS 1 | 106 | 53 |
PS 2 | 120 | 167 |
I wanted to merge these pivot tables back together to create a table like this:
School Name | Grade 11 Advanced | Grade 11 Basic | Grade 12 Advanced | Grade 12 Basic |
---|---|---|---|---|
PS 1 | 170 | 87 | 106 | 53 |
PS 2 | 141 | 168 | 120 | 167 |
However, when I try to merge them, I get really skewed data and duplicates that don't match the .csv file, like this:
School Name | Grade 11 Advanced | Grade 11 Basic | Grade 12 Advanced | Grade 12 Basic |
---|---|---|---|---|
PS 1 | 170 | 87 | 87 | 105 |
PS 2 | 170 | 87 | 87 | 136 |
PS 3 | 170 | 87 | 87 | 85 |
PS 4 | 170 | 87 | 87 | 96 |
(The Grade 12 Basic values DO exist in the file, but definitely don't match up with the correct rows).
As a result of these duplicates, the output returns almost 5 million rows, whereas the original .csv file has around 6500. Not sure what's going on here or what I'm missing - I suspect it's something to do with the left_on or right_on properties but I've played around with those and haven't gotten the desired table yet.