4
df.explode(['X'])


ValueError: column must be a scalar

Hi, could anyone advise on this?

buhtz
Tonz
  • 5
    Why did you think to use `['x']` and not `'x'`? – cs95 Apr 20 '20 at 00:03
  • @Tonz I know this is an old question but if you feel an answer solved the problem, please mark it as 'accepted' by clicking the green check mark. This helps keep the focus on older SO questions which still don't have answers. – delocalizer Dec 10 '21 at 09:44

4 Answers

12

You can supply a list of column names, but only with pandas >= 1.3.0 (a tuple is treated as a single column label, as used with MultiIndex columns): see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html

New in version 1.3.0: Multi-column explode

If you see this ValueError, you must be using an older version of pandas.
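As a rough illustration of the version dependence, here is a minimal sketch that only attempts the list form when the installed pandas is at least 1.3. The toy frame and the column names `X` and `Y` are assumptions for the example, not taken from the question.

```python
import pandas as pd

# Toy data (assumed): two list-valued columns with matching lengths per row
df = pd.DataFrame({"X": [[1, 2], [3]], "Y": [["a", "b"], ["c"]]})

# A single column label works on every pandas version that has explode
single = df.explode("X")

# Read major/minor from the version string, e.g. "1.2.5" -> (1, 2)
major, minor = (int(part) for part in pd.__version__.split(".")[:2])

if (major, minor) >= (1, 3):
    # The list form (multi-column explode) was added in 1.3.0
    multi = df.explode(["X", "Y"])
else:
    # Older pandas raises "ValueError: column must be a scalar" here
    multi = None
```

On an older installation, upgrading pandas or falling back to the single-label form are the two straightforward options.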

delocalizer
3

Use df.explode('X') instead of df.explode(['X']). The example on the pandas explode documentation page shows this.
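If the column argument comes from elsewhere in your code and may arrive as a one-element list, one option is to unwrap it before calling explode. This is only a sketch; `explode_compat` is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def explode_compat(df, column):
    """Explode one column, accepting either 'X' or ['X'] (hypothetical helper)."""
    # Unwrap a one-element list so older pandas receives a scalar label
    if isinstance(column, list) and len(column) == 1:
        column = column[0]
    return df.explode(column)

df = pd.DataFrame({"X": [[1, 2], [3]]})
result = explode_compat(df, ["X"])
```

Lists with more than one element are passed through unchanged, so they still need pandas >= 1.3.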

1

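For those of you working with pandas < 1.3, the following logic performs a multi-column explode and is reasonably efficient. Pass the names of the two columns you want to explode as col1 and col2.

import pandas as pd

def explode(df, col1, col2):
    # Work on a copy so the caller's frame is not modified
    df = df.copy()
    # Pair up the elements of both list columns, row by row
    df['tmp'] = df.apply(lambda row: list(zip(row[col1], row[col2])), axis=1)
    df = df.explode('tmp')
    # Split the pairs back into the two original columns
    df[[col1, col2]] = pd.DataFrame(df['tmp'].tolist(), index=df.index)
    return df.drop(columns='tmp')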
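The same zip-then-explode idea can be stretched to any number of columns. The sketch below is one possible generalization, not the answer author's code; `explode_many`, the `_tmp` column name, and the toy frame are all assumptions made for illustration.

```python
import pandas as pd

def explode_many(df, cols):
    """Explode several list-valued columns in lockstep (hypothetical helper)."""
    out = df.copy()
    # Zip the lists of each row into a list of tuples, one tuple per element.
    # Note: zip silently truncates to the shortest list if lengths differ.
    out["_tmp"] = [
        list(zip(*values))
        for values in out[cols].itertuples(index=False, name=None)
    ]
    out = out.explode("_tmp")
    # Unpack the tuples back into the original columns
    out[cols] = pd.DataFrame(out["_tmp"].tolist(), index=out.index)
    return out.drop(columns="_tmp")

df = pd.DataFrame({
    "id": [1, 2],
    "A": [[1, 2], [3]],
    "B": [["x", "y"], ["z"]],
    "C": [[0.1, 0.2], [0.3]],
})
exploded = explode_many(df, ["A", "B", "C"])
```

Rows whose lists are empty are not handled specially here, so check those separately if they can occur in your data.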
0

If you look at the function signature for explode in pandas versions before 1.3, the column argument has to be a scalar column name (either a str or a tuple), and you are passing a list.

Example

import pandas as pd

# Each column needs one list per index label
df = pd.DataFrame(index=['a', 'b'],
                  data={'col1': [[10, 11], [10, 11]], 'col2': [[1, 2], [1, 2]]})

>>> df.explode('col1')
  col1    col2
a   10  [1, 2]
a   11  [1, 2]
b   10  [1, 2]
b   11  [1, 2]

>>> df.explode(['col1'])
ValueError: column must be a scalar
Eric Truett