I have 35 CSV files and I want to merge them all on a shared ID column ('uuid'). Is there a way to merge them all at once? I can do it manually by loading each file into a DataFrame and merging the frames pair by pair, like this:

pd.merge(df_c1, df_c2, on='uuid')

But I'm curious whether there is a smarter way.

s_khan92
  • Does this answer your question? [What's the fastest way to merge multiple csv files by column?](https://stackoverflow.com/questions/18140542/whats-the-fastest-way-to-merge-multiple-csv-files-by-column) – Suraj Mar 23 '20 at 18:26
  • Unfortunately not, because I want to merge them on the identical column "uuid" – s_khan92 Mar 23 '20 at 18:29
  • What do you mean by _manually_ ? Can't you just use a loop? Please clarify what exactly the issue is. – AMC Mar 23 '20 at 21:51

1 Answer

Credit to @cs95 for Pandas Merging 101.

import numpy as np
import pandas as pd

### read / create data frames
df_c1 = pd.DataFrame({'uuid': ['A', 'B', 'C', 'D'], 'valueA': np.random.randn(4)})
df_c2 = pd.DataFrame({'uuid': ['B', 'D', 'E', 'F'], 'valueB': np.random.randn(4)})
df_c3 = pd.DataFrame({'uuid': ['D', 'E', 'J', 'C'], 'valueC': np.ones(4)})

### list of data frames
dfs = [df_c1, df_c2, df_c3]

The following could then be used to concatenate the frames column-wise, aligned on uuid (an outer join by default):

pd.concat([df.set_index('uuid') for df in dfs], axis = 1) #.reset_index() could be used to make uuid a column again
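If you would rather stay close to the pd.merge call from the question, one option is to fold it over the list with functools.reduce. This is only a sketch: `merge_all` is a hypothetical helper name, and the integer values are made-up stand-ins for the random data above so the result is easy to check.

```python
from functools import reduce

import pandas as pd

# Small deterministic frames standing in for the CSV contents (illustrative data)
df_c1 = pd.DataFrame({'uuid': ['A', 'B', 'C', 'D'], 'valueA': [1, 2, 3, 4]})
df_c2 = pd.DataFrame({'uuid': ['B', 'D', 'E', 'F'], 'valueB': [10, 20, 30, 40]})
df_c3 = pd.DataFrame({'uuid': ['D', 'E', 'J', 'C'], 'valueC': [1.0, 1.0, 1.0, 1.0]})
dfs = [df_c1, df_c2, df_c3]


def merge_all(frames, key='uuid', how='outer'):
    """Fold pd.merge over a list of frames, joining each one on `key`."""
    return reduce(lambda left, right: pd.merge(left, right, on=key, how=how), frames)


merged_outer = merge_all(dfs)               # every uuid seen in any frame
merged_inner = merge_all(dfs, how='inner')  # only uuids present in all frames
```

With how='outer' a uuid missing from a file gets NaN in that file's columns; with how='inner' only the uuids found in every file survive (just 'D' in this sample).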

Lastly, you can build that list by reading multiple CSV files with something like this:

import pandas as pd 
import glob
import os

df_list = []

# note: this method assumes all of your csv files are in a single folder
path = '<insert your file path here>'

all_files = glob.glob(os.path.join(path, '*.csv'))

for file in all_files:
    df1 = pd.read_csv(file)
    df_list.append(df1)

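# index each frame by 'uuid' so the columns are aligned on that key (outer join);
# use axis=0 without set_index to append rows instead
concatenated_df = pd.concat([df.set_index('uuid') for df in df_list], axis=1)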
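Putting the pieces together, here is a rough end-to-end sketch that reads a folder and aligns every file on the key. `concat_csvs_on_key` is a hypothetical helper, and the two throwaway files exist only to exercise it. It assumes each file has a 'uuid' column with no repeated values, since aligning on a repeated key is ambiguous; the explicit check just gives a clearer error in that case.

```python
import glob
import os
import tempfile

import pandas as pd


def concat_csvs_on_key(folder, key='uuid'):
    """Read every *.csv in `folder` and align the columns on `key`."""
    frames = []
    # sorted() makes the column order reproducible across runs
    for file in sorted(glob.glob(os.path.join(folder, '*.csv'))):
        df = pd.read_csv(file)
        if not df[key].is_unique:
            raise ValueError(f'duplicate {key} values in {file}')
        frames.append(df.set_index(key))
    # axis=1 places the frames side by side, matching rows by index (outer join)
    return pd.concat(frames, axis=1).rename_axis(key).reset_index()


# Exercise the helper on two throwaway files (illustrative data)
with tempfile.TemporaryDirectory() as tmp:
    pd.DataFrame({'uuid': ['A', 'B'], 'valueA': [1, 2]}).to_csv(
        os.path.join(tmp, 'c1.csv'), index=False)
    pd.DataFrame({'uuid': ['B', 'C'], 'valueB': [3, 4]}).to_csv(
        os.path.join(tmp, 'c2.csv'), index=False)
    result = concat_csvs_on_key(tmp)
```

If two files share a non-key column name, the result will contain that name twice, so you may want to rename columns per file before concatenating.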
tlk27
  • Thanks. It's a really stupid question, but I am getting `SyntaxError: invalid syntax` for the for loop... I am really unable to solve this – s_khan92 Mar 23 '20 at 23:18
  • Not a dumb question at all. I missed an underscore in `all_files` when I initially wrote the for loop -- that may fix your issue if you copied it directly otherwise glad to help with more information – tlk27 Mar 24 '20 at 00:30