Given a list of lists where the inner lists have varying, unknown lengths, e.g.:

>>> import pandas as pd
>>> lol = [[1,2,3], [3,1,1], [3,2], [1], [2,3,4]]
>>> sr = pd.Series(lol)
>>> sr
0    [1, 2, 3]
1    [3, 1, 1]
2       [3, 2]
3          [1]
4    [2, 3, 4]
dtype: object

How can I split each inner list into 3 columns? If an inner list has fewer than 3 elements, it should be padded with None.

The goal is to get a dataframe with 3 columns from the 3 lists, i.e.:

   0    1    2
0  1  2.0  3.0
1  3  1.0  1.0
2  3  2.0  NaN
3  1  NaN  NaN
4  2  3.0  4.0

I've tried doing this:

lol = [[1,2,3], [3,1,1], [3,2], [1], [2,3,4]]
sr = pd.Series(lol)

rows = []
n = 3
for row in sr:
    # Pad a copy so the lists stored in sr are not modified in place.
    row = list(row)
    while len(row) < n:
        row.append(None)
    rows.append(row)

df = pd.DataFrame(rows)
df

[out]:

   0    1    2
0  1  2.0  3.0
1  3  1.0  1.0
2  3  2.0  NaN
3  1  NaN  NaN
4  2  3.0  4.0

Is there a simpler way to achieve the same dataframe?

Is there an easier way to achieve the same final dataframe if n is unknown beforehand?

Is doing max(len(row) for row in sr) the only way?

alvas

3 Answers


Use Series.apply with pd.Series to expand each inner list into a row:

In [149]: sr.apply(pd.Series)
Out[149]:
     0    1    2
0  1.0  2.0  3.0
1  3.0  1.0  1.0
2  3.0  2.0  NaN
3  1.0  NaN  NaN
4  2.0  3.0  4.0
Zero

Convert the Series to a NumPy array, then to a list, and pass that to the DataFrame constructor:

df = pd.DataFrame(sr.values.tolist())
print(df)

   0    1    2
0  1  2.0  3.0
1  3  1.0  1.0
2  3  2.0  NaN
3  1  NaN  NaN
4  2  3.0  4.0
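
The padded columns show up as floats because NaN is a float value. If integer columns are preferred, one option (a sketch, assuming a pandas version that supports the nullable "Int64" dtype; the name df_int is only illustrative) is to cast after construction, so the gaps become pd.NA instead of NaN:

```python
import pandas as pd

lol = [[1, 2, 3], [3, 1, 1], [3, 2], [1], [2, 3, 4]]
sr = pd.Series(lol)

# Build the frame as above; short rows are padded with NaN.
df = pd.DataFrame(sr.values.tolist())

# Cast to the nullable integer dtype so values stay integers
# and missing entries are represented as pd.NA.
df_int = df.astype("Int64")
print(df_int)
```

This only works when every non-missing value is a whole number; otherwise the cast is expected to raise.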

If the input is already a nested list, piRSquared's solution is better.

jezrael

The pd.DataFrame constructor can handle that fine.

lol = [[1,2,3], [3,1,1], [3,2], [1], [2,3,4]]

pd.DataFrame(lol)

   0    1    2
0  1  2.0  3.0
1  3  1.0  1.0
2  3  2.0  NaN
3  1  NaN  NaN
4  2  3.0  4.0
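
On the question of an unknown n: a minimal sketch suggesting that max(len(row) for row in sr) does not need to be computed by hand, since the constructor appears to size the frame to the longest row. The variable name ragged and its data are made up for illustration:

```python
import pandas as pd

# Hypothetical input where the longest inner list has 4 items.
ragged = [[1, 2, 3], [3, 1, 1], [3, 2], [1], [2, 3, 4, 5]]

df = pd.DataFrame(ragged)

# The number of columns follows the longest inner list.
print(df.shape)
print(df)
```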
piRSquared