2

I have a following DataFrame:

dis Country Price 0 0.8 US 500 1 0.8 England 1000 2 0.8 Spain 1500 3 0.8 Portugal 600 4 0.8 Germany 900 5 0.9 US 2200 6 0.9 England 3000 7 0.9 Spain 600 8 0.9 Portugal 1000 9 0.9 Germany 4000

Though I'd like to rearrange it in the following way:

dis US England Spain Portugal Germany
0.8 500 1000 1500 600 900 0.9 2200 3000 600 1000 4000

I'd be happy for some ideas how to solve this problem.

Monica
  • 1,030
  • 3
  • 17
  • 37
  • are you using pandas? if so, consider adding the pandas tag – rtmh Jul 14 '16 at 02:00
  • @Vivek, numpy only supports numeric values and therefor suggesting it's transpose isn't really an answer here. I like rtmh's question, are you using pandas? – kpie Jul 14 '16 at 02:03
  • Please, see this SO http://stackoverflow.com/questions/4937491/matrix-transpose-in-python @kpie: Sorry! I just suggested a general solution. – SilverSurfer Jul 14 '16 at 02:06

2 Answers2

2

Assuming pandas, you can use set_index and unstack to do what you are looking to do, as long as there are no repeats in the indexes:

>>> import pandas as pd
>>> df = pd.DataFrame({'dis': [0.8, 0.8, 0.9, 0.9], 'Country':['US', 'England', 'US', 'England'], 'Price':[500, 1000, 1500, 2000]})
>>> df
    Country Price   dis
0   US      500     0.8
1   England 1000    0.8
2   US      1500    0.9
3   England 2000    0.9
>>> df.set_index(['dis', 'Country']).unstack()
        Price
Country England US
dis     
0.8     1000    500
0.9     2000    1500
AChampion
  • 29,683
  • 4
  • 59
  • 75
1

Assuming that you already know the row and column names given of your output table and that your input is a text file of tab seperated values I would do something like this,

afile = open("input.csv","r")
content = [k.split("\t") for k in afile.read().slit("\n")]
#If you already have a list of lists these first 2 lines are unnecessary.

output = {}
for k in content:
    if not(k[1] in output.keys):
        output[k[1]] = {}
    output[k[1]][k[2]] = k[3]

print(output)
kpie
  • 9,588
  • 5
  • 28
  • 50