df.explode(['X'])
ValueError: column must be a scalar
Hi, could anyone advise on this?
You can supply a list of column names (a tuple is treated as a single column label), but only with pandas >= 1.3.0: see https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.explode.html
New in version 1.3.0: Multi-column explode
If you see this ValueError, you must be using an older version of pandas.
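To make the version dependence concrete, here is a minimal sketch that checks the installed pandas version before choosing how to call explode. The helper name supports_multi_column_explode and the sample columns X and Y are assumptions for illustration, not part of pandas or of the question.

```python
import pandas as pd


def supports_multi_column_explode() -> bool:
    # Hypothetical helper: compare the (major, minor) version against 1.3
    major, minor = (int(part) for part in pd.__version__.split(".")[:2])
    return (major, minor) >= (1, 3)


# Made-up sample data: both list columns have matching lengths in every row
df = pd.DataFrame({"X": [[1, 2], [3]], "Y": [["a", "b"], ["c"]]})

if supports_multi_column_explode():
    # pandas >= 1.3.0 accepts a list of column names
    result = df.explode(["X", "Y"])
else:
    # Older versions only accept a single scalar column name
    result = df.explode("X")

print(pd.__version__)
print(result)
```

On an older installation the else branch explodes only X, so Y keeps its lists; upgrading pandas is the simpler fix if that is an option.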
Use df.explode('X') instead of df.explode(['X']). The example on the pandas explode documentation page explains this.
For those of you working with pandas < 1.3, the following logic executes a multi-column explode and is reasonably efficient. Just supply the names of the two columns you want to explode as col1 and col2. It assumes both columns hold lists of the same length in every row.
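import pandas as pd

def explode(df, col1, col2):
    # Work on a copy so the caller's DataFrame is not modified
    df = df.copy()
    # Pair up the elements of the two list columns row by row
    df['tmp'] = df.apply(lambda row: list(zip(row[col1], row[col2])), axis=1)
    df = df.explode('tmp')
    # Split each (col1, col2) pair back into its own column
    df[[col1, col2]] = pd.DataFrame(df['tmp'].tolist(), index=df.index)
    return df.drop(columns='tmp')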
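The same zip-then-explode idea can be extended to any number of columns. The following is only a sketch under stated assumptions: explode_many is a hypothetical helper name, the sample data is made up, and every row is assumed to hold non-empty lists of equal length in the listed columns.

```python
import pandas as pd


def explode_many(df, cols):
    # Hypothetical helper: explode several list columns in lockstep
    out = df.copy()
    # Zip the lists of each row into a list of tuples, one tuple per new row
    out["tmp"] = [
        list(zip(*values))
        for values in out[cols].itertuples(index=False, name=None)
    ]
    out = out.explode("tmp")
    # Pull the i-th element of every tuple back into the i-th column
    for i, col in enumerate(cols):
        out[col] = [pair[i] for pair in out["tmp"]]
    return out.drop(columns="tmp")


df = pd.DataFrame({"id": [1, 2], "a": [[1, 2], [3]], "b": [["x", "y"], ["z"]]})
result = explode_many(df, ["a", "b"])
print(result)
```

Rows with empty lists would need separate handling, since explode turns them into NaN and there is no tuple to unpack.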
If you look at the function signature for explode in pandas versions before 1.3.0, the column argument has to be a scalar column name (either a str or a tuple), and you are passing a list.
Example
>>> import pandas as pd
>>> df = pd.DataFrame(index=['a', 'b'],
...                   data={'col1': [[10, 11], [10, 11]],
...                         'col2': [[1, 2], [1, 2]]})
>>> df.explode('col1')
  col1    col2
a   10  [1, 2]
a   11  [1, 2]
b   10  [1, 2]
b   11  [1, 2]
>>> df.explode(['col1'])
ValueError: column must be a scalar
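If the calling code has to run on both old and new pandas versions, one option is to unwrap a one-element list before calling explode. This is a small sketch, not part of pandas: explode_compat is a hypothetical wrapper and the sample frame is made up.

```python
import pandas as pd


def explode_compat(df, column):
    # Hypothetical wrapper: turn ['X'] into 'X' so older pandas accepts it
    if isinstance(column, list) and len(column) == 1:
        column = column[0]
    return df.explode(column)


df = pd.DataFrame(
    {"col1": [[10, 11], [12]], "col2": ["p", "q"]},
    index=["a", "b"],
)
exploded = explode_compat(df, ["col1"])
print(exploded)
```

A list with more than one name is passed through unchanged, so it still requires pandas >= 1.3.0.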