My df has a column names_text whose values are lists of lists. I want to transform the inner lists into rows, so that each inner list in names_text forms its own row.

Representative data:

import pandas as pd

d = [['aa',  None, 'xx', [['ps', 'ps1'], ['ps22', 'ps2'], ['ps33', 'ps3']]],
     [None, 'tt', 'jjjj', [['pppp', 'pppp1'], ['pppp22', 'pppp2']]],
     [None, 'uu', None, [['oo', 'oo1'], ['oo', 'oo2'], ['oo45', 'oo2'], ['oo4', 'oo3']]]]

c = ['col1','col2','col3','names_text']
df = pd.DataFrame(d,columns=c)

print(df)

   col1  col2  col3                                       names_text
0    aa  None    xx            [[ps, ps1], [ps22, ps2], [ps33, ps3]]
1  None    tt  jjjj                 [[pppp, pppp1], [pppp22, pppp2]]
2  None    uu  None  [[oo, oo1], [oo, oo2], [oo45, oo2], [oo4, oo3]]

Desired output:

d = [['aa',  None, 'xx', ['ps', 'ps1']],
     ['aa',  None, 'xx', ['ps22', 'ps2']],
     ['aa',  None, 'xx', ['ps33', 'ps3']],
     [None, 'tt', 'jjjj', ['pppp', 'pppp1']],
     [None, 'tt', 'jjjj', ['pppp22', 'pppp2']],
     [None, 'uu', None, ['oo', 'oo1']],
     [None, 'uu', None, ['oo', 'oo2']],
     [None, 'uu', None, ['oo45', 'oo2']],
     [None, 'uu', None, ['oo4', 'oo3']]]

c = ['col1','col2','col3','names_text']
df = pd.DataFrame(d,columns=c)

print(df)

   col1  col2  col3       names_text
0    aa  None    xx        [ps, ps1]
1    aa  None    xx      [ps22, ps2]
2    aa  None    xx      [ps33, ps3]
3  None    tt  jjjj    [pppp, pppp1]
4  None    tt  jjjj  [pppp22, pppp2]
5  None    uu  None        [oo, oo1]
6  None    uu  None        [oo, oo2]
7  None    uu  None      [oo45, oo2]
8  None    uu  None       [oo4, oo3]

Whenever I run df.explode('names_text').reset_index(drop=True), it not only creates rows as intended but also creates around 700 new columns. The resulting DataFrame is too large to fit in memory, so I have to abort the process.

Question: I thought explode should only create new rows. Why does it also create columns?
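
For reference, here is a minimal sketch of the same call on the representative data only. It assumes pandas 0.25 or later, where DataFrame.explode is available. On this small sample the call adds rows and leaves the columns unchanged, so the extra columns seem to come from something in my full data that the sample does not capture. The name exploded is just an illustrative variable.

```python
import pandas as pd

# Representative data from above: names_text holds a list of lists per row.
d = [['aa', None, 'xx', [['ps', 'ps1'], ['ps22', 'ps2'], ['ps33', 'ps3']]],
     [None, 'tt', 'jjjj', [['pppp', 'pppp1'], ['pppp22', 'pppp2']]],
     [None, 'uu', None, [['oo', 'oo1'], ['oo', 'oo2'], ['oo45', 'oo2'], ['oo4', 'oo3']]]]
c = ['col1', 'col2', 'col3', 'names_text']
df = pd.DataFrame(d, columns=c)

# explode unnests only the outer list, so each inner list becomes one row.
exploded = df.explode('names_text').reset_index(drop=True)

# Compare shape and column labels before and after the call.
print(df.shape, exploded.shape)
print(list(exploded.columns))
```

Comparing df.shape and list(df.columns) before and after the call in the same way on the full data should show whether the extra columns are already present before explode runs.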

id345678