1

I've a table which has two columns, the first one is the indice of the site, and the second is the number of states per hour during 24 hours. Thus for each site, I've 24(lines)x2(columns) data. How can I transpose the second column (24 lines data per site) into the line which contains 24+1 columns with site indice.

I've found a naive solution by using loop to create new line:

for i in range(numberOfsites):
    i1 = i*24
    i2 = i1 + 24
    newLine = DataFrame(sitesData,index=list(range(i1,i2)), columns=["siteState"]).T

could anyone propose another easier solution? Thank you in advance.

Original data: (thanks to Andy Hayden for the data)

   site_index state
0           1     a
1           1     b
2           1     a
3           2     a
4           2     a
5           2     b

Desired data:

              0  1  2
  site_index
  1           a  b  a
  2           a  a  b
perigee
  • 9,438
  • 11
  • 31
  • 35
  • What is `sitesData` (why can't you just transpose that)? I'm struggling to understand exactly what you're asking here, for one thing, the code you've written only creates only one DataFrame (newLine), presumably that's not what you want. Could you give an example with some fake data? – Andy Hayden Jun 11 '13 at 14:29
  • Since the sitesData contains two columns, first one is the site indice, and the second is the siteState. sitesData contains n site information, thus has n*24 lines data. For each site, it has 24 lines in the siteData. I cannot simply transpose the sitesData. – perigee Jun 11 '13 at 14:33
  • I've given some dummy data in my answer, as I that's my best guess of what you are asking :) – Andy Hayden Jun 11 '13 at 14:57
  • 1
    Please could you edit your question to show both your _actual_ input data and your _expected_ output data? – Sylvain Leroux Jun 11 '13 at 15:14

1 Answers1

2

Although this feels a little hacky, you could use a groupby:

In [11]: df
Out[11]:
   site_index state
0           1     a
1           1     b
2           1     a
3           2     a
4           2     a
5           2     b

In [12]: g = df.groupby('site_index')

In [13]: g.apply(lambda x: x.state.reset_index(drop=True).T)
Out[13]:
            0  1  2
site_index
1           a  b  a
2           a  a  b

This may also be possible using unstack...

Andy Hayden
  • 359,921
  • 101
  • 625
  • 535