1

Trying to convert the indices of one column, into their own columns.

From:

[country, series, year, value]

[USA, A, 1994, 700000]
[USA, B, 1994, 701231]
[UK, A, 1994, 600000]
[UK, B, 1994, 601231]
[China, A, 1994, 6512312]
[China, B, 1994, 6432112]
To:

[country, A, B, year]

[USA, 700000, 701231, 1994]
[UK, 600000, 601231, 1994]
[China, 6512312, 6432112, 1994]

Have tried stacking/unstacking, pivoting, melting, groupby, etc..

I am fairly certain that one of these methods is the key to reorganizing this, I just cant seem to get it right.

End goal is to make a 3d scatterplot wherein one axis is year, another is A and another is B; while country will be indicated by color

Update: Using Pivot, I am able to get the data to Look the right way, but the same issue arises when I try to plot it.

Because 'Year' and 'Country' are indexed, they cannot be identified as elements eligible for axes.

Expected 'A' or 'B' but got 'year'/'country'

Community
  • 1
  • 1
Alex
  • 188
  • 11
  • Have tried multi-indexing as well – Alex Sep 16 '19 at 22:07
  • if you are just hoping to plot, why not separate the data into two data sets and then plot them? otherwise I think you would need melt, reshape to do what you are describing. – Aiden Zhao Sep 16 '19 at 22:12
  • 1
    seems like `pivot_table` to me. `df.pivot_table(index=['country', 'year'], columns='series', values='value')`? – ALollz Sep 16 '19 at 22:12
  • @ALollz Pivoting as you described works well for getting the df to look correct, but when I go to plot it, because Year and Country are now indices, it is unable to add these elements to the plot. Specifically, it expects 'A' or 'B' but gets 'year'. – Alex Sep 16 '19 at 22:26
  • @YongkangZhao I have other plots that display pieces of the information above, but none that combine it all into one 3D plot. Is this what you mean by splitting the sets?/is there a way to create a 3D scatter plot that pulls elements from two separate dataframes?? – Alex Sep 16 '19 at 22:28

1 Answers1

0

Pivot the data, then reset the index and drop the resulting additional column being used as the new index column.

df.pivot_table(index=['country', 'year'], columns='series', values='value')
df.reset_index()

(optional) df.drop(column = 'series')

Pivoting as @ALollz described worked for getting the Data frame to look correct.

Encountered the error described above when trying to graph it, due to 'year' and 'country' being indexed.

Simply reseting the index AFTER running the pivot, resolved the index issue and returned 'country' and 'year' to normal columns, while leaving A and B as new columns. 'Series' will be the new index column, which can be dropped.

Alex
  • 188
  • 11