Rearrange data in table

Question

I have a following DataFrame:

dis Country Price 0 0.8 US 500 1 0.8 England 1000 2 0.8 Spain 1500 3 0.8 Portugal 600 4 0.8 Germany 900 5 0.9 US 2200 6 0.9 England 3000 7 0.9 Spain 600 8 0.9 Portugal 1000 9 0.9 Germany 4000

Though I'd like to rearrange it in the following way:

dis US England Spain Portugal Germany 0.8 500 1000 1500 600 900 0.9 2200 3000 600 1000 4000

I'd be happy for some ideas how to solve this problem.

@Vivek, numpy only supports numeric values and therefor suggesting it's transpose isn't really an answer here. I like rtmh's question, are you using pandas? — kpie, Jul 14 '16 at 02:03
Please, see this SO http://stackoverflow.com/questions/4937491/matrix-transpose-in-python @kpie: Sorry! I just suggested a general solution. — SilverSurfer, Jul 14 '16 at 02:06

AChampion · Accepted Answer · 2016-07-14T02:13:34.267

2

Assuming pandas, you can use set_index and unstack to do what you are looking to do, as long as there are no repeats in the indexes:

>>> import pandas as pd
>>> df = pd.DataFrame({'dis': [0.8, 0.8, 0.9, 0.9], 'Country':['US', 'England', 'US', 'England'], 'Price':[500, 1000, 1500, 2000]})
>>> df
    Country Price   dis
0   US      500     0.8
1   England 1000    0.8
2   US      1500    0.9
3   England 2000    0.9
>>> df.set_index(['dis', 'Country']).unstack()
        Price
Country England US
dis     
0.8     1000    500
0.9     2000    1500

edited Jul 14 '16 at 02:13

answered Jul 14 '16 at 02:08

AChampion

29,683
4
59
75

What about if I repeat indexes? I just encountered that problem. – Monica Jul 18 '16 at 02:12
What do you expect the output to look like when you have repeat indices? – AChampion Jul 18 '16 at 04:01

score 1 · Answer 2 · answered Jul 14 '16 at 02:11

Assuming that you already know the row and column names given of your output table and that your input is a text file of tab seperated values I would do something like this,

afile = open("input.csv","r")
content = [k.split("\t") for k in afile.read().slit("\n")]
#If you already have a list of lists these first 2 lines are unnecessary.

output = {}
for k in content:
    if not(k[1] in output.keys):
        output[k[1]] = {}
    output[k[1]][k[2]] = k[3]

print(output)

Rearrange data in table

2 Answers2