I have a dictionary dict
of dataframes such as:
{
‘table_1’: name color type
Banana Yellow Fruit,
‘another_table_1’: city state country
Atlanta Georgia United States,
‘and_another_table_1’: firstname middlename lastname
John Patrick Snow,
‘table_2’: name color type
Red Apple Fruit,
‘another_table_2’: city state country
Arlington Virginia United States,
‘and_another_table_2’: firstname middlename lastname
Alex Justin Brown,
‘table_3’: name color type
Lettuce Green Vegetable,
‘another_table_3’: city state country
Dallas Texas United States,
‘and_another_table_3’: firstname middlename lastname
Michael Alex Smith }
I would like to merge these dataframes together based on their names so that in the end I will have only 3 dataframes:
table
name color type
Banana Yellow Fruit
Red Apple Fruit
Lettuce Green Vegetable
another_table
city state country
Atlanta Georgia United States
Arlington Virginia United States
Dallas Texas United States
and_another_table
firstname middlename lastname
John Patrick Snow
Alex Justin Brown
Michael Alex Smith
Based on my initial research it seems like this should be possible with Python:
- By using
.split
, dictionary comprehension anditertools.groupby
to group together dataframes inside the dictionary based on key names - Creating dictionary of dictionaries with these grouped results
- Using
pandas.concat
function to loop through these dictionaries and group dataframes together
I don't have a lot of experience with Python and I am a bit lost on how to actually code this.
I have reviewed How to group similar items in a list? and Merge dataframes in a dictionary posts but they were not as helpful because in my case name length of dataframes varies.
Also I do not want to hardcode any dataframe names, because there are more than a 1000 of them.