-1

So I have 366 CSV files and I want to copy their second columns and write them into a new CSV file. Need a code for this job. I tried some codes available here but nothing works. please help.

Ali Ajaz
  • 59
  • 8
  • What have you tried so far? – TheLazyScripter Oct 18 '19 at 23:21
  • Please update this question to provide the work demonstrating your effort so users can assist you in ironing out the bugs leading to your failure. You'll have much better luck finding assistance since this community isn't for requesting people do your work for you. The work you provide should not only demonstrate the attempts you've made, but also clearly describe the failure you need help overcoming. – Julian Oct 19 '19 at 00:05

4 Answers4

1

Assuming all the 2nd columns are the same length, you could simply loop through all the files. Read them, save the 2nd column to memory and construct a new df along the way.

filenames = ['test.csv', ....]

new_df = pd.DataFrame()

for filename in filenames:
    df = pd.read_csv(filename)
    second_column = df.iloc[:, 1]
    new_df[f'SECOND_COLUMN_{filename.upper()}'] = second_column
    del(df)

new_df.to_csv('new_csv.csv', index=False)
Dominik Sajovic
  • 603
  • 1
  • 8
  • 16
1

This can accomplished with glob and pandas:

import glob
import pandas as pd

mylist = [f for f in glob.glob("*.csv")]
df = pd.read_csv(mylist[0]) #create the dataframe from the first csv
df = pd.DataFrame(df.iloc[:,1]) #only keep 2nd column
for x in mylist[1:]: #loop through the rest of the csv files doing the same
    t = pd.read_csv(x)
    colName = pd.DataFrame(t.iloc[:,1]).columns
    df[colName] = pd.DataFrame(t.iloc[:,1])
    df.to_csv('output.csv', index=False)
Ian-Fogelman
  • 1,595
  • 1
  • 9
  • 15
1
    filenames = glob.glob(r'D:/CSV_FOLDER' + "/*.csv")

    new_df = pd.DataFrame()

    for filename in filenames:
        df = pd.read_csv(filename)
        second_column = df.iloc[:, 1]
        new_df[f'SECOND_COLUMN_{filename.upper()}'] = second_column
        del(df)

    new_df.to_csv('new_csv.csv', index=False)
Ali Ajaz
  • 59
  • 8
0
    import glob
    import pandas as pd

    mylist = [f for f in glob.glob("*.csv")]
    df = pd.read_csv(csvList[0]) #create the dataframe from the first csv
    df = pd.DataFrame(df.iloc[:,0]) #only keep 2nd column
    for x in mylist[1:]: #loop through the rest of the csv files doing the same
        t = pd.read_csv(x)
        colName = pd.DataFrame(t.iloc[:,0]).columns
        df[colName] = pd.DataFrame(t.iloc[:,0])
        df.to_csv('output.csv', index=False)
Ali Ajaz
  • 59
  • 8