python pandas create a list group by value

Question

I have a dataframe in python:

    pID     sID     time 
0   2133    152414  2018-06-16
1   1721    152912  2018-06-17
2   2264    152912  2018-06-18

I want to create a new table with sID as the key and a list of the matching pID values:

        pID time
152414 2133 2018-06-16
152912 1721 2018-06-17
       2264 2018-06-18

What is the best way to do this without iterating over the whole dataframe? I tried:

df.pivot(index='sID', columns=['pID', 'time'])

But got:

ValueError: all arrays must be same length

This happens even for this table of 3 rows. Thanks!

@mxmt it doesn't help: `df.set_index('sID')` returns a dataframe with 3 rows, meaning there are 2 rows with the index 152912, and I get a KeyError. I need a dataframe with only two rows. – oren_isp, Aug 05 '18 at 12:10

Maksim Terpilowski · Accepted Answer · 2018-08-05T12:34:29.750


Try this:

import io
import pandas as pd

f = io.StringIO('''
2133    152414  2018-06-16
1721    152912  2018-06-17
2264    152912  2018-06-18''')

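# A raw string avoids the invalid-escape warning for '\s' on newer Python versions
df = pd.read_csv(f, sep=r'\s+', header=None, names=['pID', 'sID', 'date'])
# set_index returns a new dataframe, so keep the result
df = df.set_index(['sID', 'pID'])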
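
If what is needed is literally one row per sID holding a list of its pID values, a groupby is another option. This is a minimal sketch and not part of the original answer; the dataframe is rebuilt from the question's sample data, and the name `pids_by_sid` is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "pID": [2133, 1721, 2264],
    "sID": [152414, 152912, 152912],
    "time": ["2018-06-16", "2018-06-17", "2018-06-18"],
})

# One entry per sID, with the matching pID values collected into a list
pids_by_sid = df.groupby("sID")["pID"].agg(list)

print(pids_by_sid)
print(pids_by_sid.loc[152912])  # the two pIDs that share sID 152912
```

The result is a Series indexed by sID with two entries, so no explicit loop over the rows is needed.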

edited Aug 05 '18 at 12:34

answered Aug 05 '18 at 12:27


I get this table, but now when I do `pIDs = df[152912]` I get `KeyError: 152912` instead of a table of 2 rows. – oren_isp Aug 05 '18 at 12:57
@oren_ISP You should read the pandas manual, as you have no understanding of indexing at all. Use `df.loc[152912]`. – Maksim Terpilowski Aug 05 '18 at 14:45
To my understanding, the difference between `df[i]` and `df.loc[i]` is that the latter supports editing. Isn't this the case? – oren_isp Aug 07 '18 at 05:47
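
On the last question: both forms can be used for assignment, so editing is not the difference. On a dataframe, `df[key]` looks up a column by name, while `df.loc[key]` looks up rows by index label. The sketch below rebuilds the indexed dataframe from the answer to show this; the variable names `raised` and `rows` are only illustrative.

```python
import pandas as pd

# Same data and index as in the answer above
df = pd.DataFrame({
    "pID": [2133, 1721, 2264],
    "sID": [152414, 152912, 152912],
    "date": ["2018-06-16", "2018-06-17", "2018-06-18"],
}).set_index(["sID", "pID"])

# df[key] searches the columns; the only column here is 'date',
# so asking for 152912 raises KeyError
try:
    df[152912]
    raised = False
except KeyError:
    raised = True
print(raised)

# df.loc[key] searches the row index; with a two-level index, a single
# key matches the first level (sID) and returns the rows under it
rows = df.loc[152912]
print(rows)
```

Here `rows` is a dataframe with two rows, indexed by the remaining level pID.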
