1

I have a Series with 5 values:

    0 A
    1 A
    2 B
    3 C
    4 E

And I have a dataframe A with n columns.

Is there a way to build a new DataFrame of shape (5, n) in which every column is a copy of the Series and the column names are the same as in DataFrame A?

For example:

DataFrame A looks like:

      col1    col2
0        1       3
1        2       2
2        2       2
3        2       6
4        4       2

and the new DataFrame should look like:

      col1    col2
0        A       A
1        A       A
2        B       B
3        C       C
4        E       E

The best solution I have come up with so far is to make a copy of A and loop over it, replacing the values of the new DataFrame column by column.

Thanks for any kind of help!

error
Lee Tom

2 Answers

2

Use concat:

df = pd.concat([s] * len(df.columns), axis=1, keys=df.columns)
print (df)
  col1 col2
0    A    A
1    A    A
2    B    B
3    C    C
4    E    E
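For reference, here is a minimal self-contained sketch of the concat approach, using the sample data from the question. The names s, df and new_df are only illustrative, and axis is passed by keyword because recent pandas versions no longer accept it positionally.

```python
import pandas as pd

# Sample data taken from the question.
s = pd.Series(list("AABCE"))
df = pd.DataFrame({"col1": [1, 2, 2, 2, 4], "col2": [3, 2, 2, 6, 2]})

# One copy of s per column of df, placed side by side;
# keys supplies the column labels of the result.
new_df = pd.concat([s] * len(df.columns), axis=1, keys=df.columns)
print(new_df)
```

The result is bound to new_df here rather than back to df, so the original frame stays available for comparison.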

Or, if you need a faster solution, use numpy.repeat + numpy.reshape:

l = len(df.columns)
df = pd.DataFrame(np.repeat(s.values, l).reshape(-1, l), columns=df.columns, index=df.index)
print (df)
  col1 col2
0    A    A
1    A    A
2    B    B
3    C    C
4    E    E
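The repeat + reshape idea can be easier to follow when split into steps. The sketch below assumes the question's sample data and works on the underlying NumPy array (via to_numpy()), since calling np.repeat directly on a Series may hand back a Series, which has no reshape method. The names flat, grid and new_df are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data taken from the question.
s = pd.Series(list("AABCE"))
df = pd.DataFrame({"col1": [1, 2, 2, 2, 4], "col2": [3, 2, 2, 6, 2]})

n_cols = len(df.columns)

# Repeat every element n_cols times in place: A A A A B B C C E E
flat = np.repeat(s.to_numpy(), n_cols)

# Fold the flat array into rows of n_cols items, so each row holds
# n_cols copies of one original value.
grid = flat.reshape(-1, n_cols)

new_df = pd.DataFrame(grid, columns=df.columns, index=df.index)
print(new_df)
```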

Or, more simply:

l = len(df.columns)
df = pd.DataFrame(np.column_stack([s] * l), columns=df.columns, index=df.index)
print (df)
  col1 col2
0    A    A
1    A    A
2    B    B
3    C    C
4    E    E

Timings:

np.random.seed(123)

L = list('abcdefghijklmno') 
s = pd.Series(np.random.choice(L, 100))

df = pd.DataFrame(np.random.randint(100, size=(100, 100))).add_prefix('col')

print (df)

In [161]: %timeit pd.concat([s] * len(df.columns), axis=1, keys=df.columns)
100 loops, best of 3: 2.84 ms per loop

In [162]: %timeit pd.DataFrame(np.repeat(s.values,len(df.columns)).reshape(-1,len(df.columns)), columns=df.columns, index=df.index)
1000 loops, best of 3: 199 µs per loop

In [163]: %timeit pd.DataFrame(np.column_stack([s] * len(df.columns)), columns=df.columns, index=df.index)
1000 loops, best of 3: 1 ms per loop

In [164]: %timeit pd.DataFrame({k : s for k in df.columns})
100 loops, best of 3: 2.33 ms per loop
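
Outside IPython, a comparable measurement can be sketched with the standard timeit module. This is only a rough harness under assumed settings (20 calls per run, best of 3 runs); the candidates and results names are illustrative, and the absolute numbers will depend on the machine and the pandas/NumPy versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(123)
s = pd.Series(rng.choice(list("abcdefghijklmno"), 100))
df = pd.DataFrame(rng.integers(100, size=(100, 100))).add_prefix("col")
n = len(df.columns)

# The four approaches from the answers, wrapped as zero-argument callables.
candidates = {
    "concat": lambda: pd.concat([s] * n, axis=1, keys=df.columns),
    "repeat+reshape": lambda: pd.DataFrame(
        np.repeat(s.to_numpy(), n).reshape(-1, n),
        columns=df.columns, index=df.index),
    "column_stack": lambda: pd.DataFrame(
        np.column_stack([s] * n), columns=df.columns, index=df.index),
    "dict comp": lambda: pd.DataFrame({k: s for k in df.columns}),
}

results = {}
for name, fn in candidates.items():
    # Best of 3 runs, 20 calls each, reported per call.
    best = min(timeit.repeat(fn, number=20, repeat=3)) / 20
    results[name] = best
    print(f"{name:>15}: {best * 1e6:.0f} µs per call")
```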
jezrael
1

Use the DataFrame constructor with a dict comprehension:

pd.DataFrame({k : df1.Col for k in df2.columns})

  col1 col2
0    A    A
1    A    A
2    B    B
3    C    C
4    E    E
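
One detail worth keeping in mind with the dict approach: when the dict values are Series, the constructor takes the row index from the Series, not from the other DataFrame. The sketch below assumes the question's data but gives df a hypothetical string index to make the difference visible. The names aligned and positional are illustrative.

```python
import pandas as pd

s = pd.Series(list("AABCE"))  # default index 0..4
df = pd.DataFrame(
    {"col1": [1, 2, 2, 2, 4], "col2": [3, 2, 2, 6, 2]},
    index=list("vwxyz"),  # hypothetical index, differs from s
)

# Dict of Series: the result carries the index of s.
aligned = pd.DataFrame({k: s for k in df.columns})

# Dict of plain arrays plus an explicit index: values are placed
# by position, so the result carries the index of df.
positional = pd.DataFrame({k: s.to_numpy() for k in df.columns}, index=df.index)

print(aligned)
print(positional)
```

If both objects share the same index, as in the question, the two variants give the same table.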
cs95