23

I have defined an empty data frame with

df = pd.DataFrame(columns=['Name', 'Weight', 'Sample'])

and want to append rows in a for loop like this:

for key in my_dict:
   ...
   row = {'Name':key, 'Weight':wg, 'Sample':sm}
   df = pd.concat(row, axis=1, ignore_index=True) 

But I get this error

cannot concatenate object of type '<class 'str'>'; only Series and DataFrame objs are valid

If I use df = df.append(row, ignore_index=True), it works, but append seems to be deprecated, so I want to use concat() instead. How can I fix this?

mahmood
    It would be more efficient to collect all the dicts in a list and concatenate once. –  Feb 15 '22 at 19:47

3 Answers

30

You can convert your dict into a single-row pandas DataFrame and concatenate that:

import pandas as pd
df = pd.DataFrame(columns=['Name', 'Weight', 'Sample'])
for key in my_dict:
  ...
  # convert the dict into a single-row DataFrame
  new_df = pd.DataFrame([row])
  df = pd.concat([df, new_df], axis=0, ignore_index=True)
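
For a version that runs end to end, here is a minimal sketch. The contents of my_dict and the way wg and sm are read from it are made up for illustration, since the original loop body is elided.

```python
import pandas as pd

# Hypothetical input: each key maps to a (weight, sample) pair.
my_dict = {"a": (1.5, "s1"), "b": (2.0, "s2")}

df = pd.DataFrame(columns=["Name", "Weight", "Sample"])
for key in my_dict:
    wg, sm = my_dict[key]  # assumed; stands in for the elided loop body
    row = {"Name": key, "Weight": wg, "Sample": sm}
    # Wrapping the dict in a list makes pandas build a one-row DataFrame.
    new_df = pd.DataFrame([row])
    # Stack the new row under the existing rows and renumber the index.
    df = pd.concat([df, new_df], axis=0, ignore_index=True)

print(df)
```

Note the list around both frames: pd.concat takes the objects to join as a single sequence, not as separate arguments.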
Mael_Jourdain
2

pd.concat needs a list of Series or DataFrame objects as its first argument.

import pandas as pd

my_dict = {'the_key': 'the_value'}

for key in my_dict:
   row = {'Name': 'name_test', 'Weight':'weight_test', 'Sample':'sample_test'}
   df = pd.concat([pd.DataFrame(row, index=[key])], axis=1, ignore_index=True) 

print(df)
                 0            1            2
the_key  name_test  weight_test  sample_test
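
To see where the error in the question comes from, the following sketch passes a plain dict of strings to pd.concat and catches the exception, then repeats the call with the dict wrapped in a DataFrame. The variable names message and fixed are only for illustration.

```python
import pandas as pd

row = {"Name": "name_test", "Weight": "weight_test", "Sample": "sample_test"}

message = None
try:
    # A dict is read as a mapping of labels to objects to concatenate,
    # so its string values are rejected.
    pd.concat(row, axis=1, ignore_index=True)
except TypeError as exc:
    message = str(exc)
    print(message)

# Wrapping the dict in a one-row DataFrame and passing a list works.
fixed = pd.concat([pd.DataFrame([row])], ignore_index=True)
print(fixed)
```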
Matthew Borish
0

As user7864386 suggested, the most efficient approach is to collect the dicts and concatenate them once at the end. If for some reason you have to add rows inside a loop, though, .loc is more efficient than concat, because you don't have to turn your dict into a single-row DataFrame first:

    df.loc[len(df),:] = row

It's rather hard to benchmark this properly, because running %timeit on that line grows the DataFrame and makes each call slower over time, while the alternative

    pd.concat([df, pd.DataFrame([row])], axis=0, ignore_index=True)

does not mutate df, and the assignment df = ... can't be run under %timeit because it raises an UnboundLocalError. Running one %timeit before and one after the other leads me to estimate a speed advantage of about a factor of 2 for .loc, though.
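
The collect-first approach mentioned at the top of this answer might look like the sketch below. The sample my_dict and its (weight, sample) layout are assumptions for illustration, not taken from the question.

```python
import pandas as pd

# Hypothetical input: each key maps to a (weight, sample) pair.
my_dict = {"a": (1.5, "s1"), "b": (2.0, "s2")}

rows = []
for key, (wg, sm) in my_dict.items():
    # Appending to a Python list is cheap; no DataFrame is copied here.
    rows.append({"Name": key, "Weight": wg, "Sample": sm})

# Build the DataFrame once, after the loop has finished.
df = pd.DataFrame(rows, columns=["Name", "Weight", "Sample"])
print(df)
```

Because the frame is built in one step, the rows are never copied repeatedly, which is what makes concatenating inside a loop slow as the frame grows.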

Marius Wallraff