I have 366 CSV files and want to copy the second column of each one into a single new CSV file. I need code for this. I tried some of the code posted here, but none of it worked. Please help.
Viewed 731 times
-1
-
What have you tried so far? – TheLazyScripter Oct 18 '19 at 23:21
-
Please update this question to provide the work demonstrating your effort so users can assist you in ironing out the bugs leading to your failure. You'll have much better luck finding assistance since this community isn't for requesting people do your work for you. The work you provide should not only demonstrate the attempts you've made, but also clearly describe the failure you need help overcoming. – Julian Oct 19 '19 at 00:05
4 Answers
1
Assuming all the 2nd columns are the same length, you could simply loop through all the files: read each one, keep its 2nd column in memory, and build a new DataFrame along the way.
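import pandas as pd

filenames = ['test.csv', ....]  # list every CSV file to process here
new_df = pd.DataFrame()
for filename in filenames:
    df = pd.read_csv(filename)
    second_column = df.iloc[:, 1]  # position 1 is the second column
    # one output column per input file, named after the file
    new_df[f'SECOND_COLUMN_{filename.upper()}'] = second_column
    del df  # release the frame before reading the next file
new_df.to_csv('new_csv.csv', index=False)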
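
If the second columns are not all the same length, one possible variation is to collect each column as a Series and join them with pd.concat, which pads the shorter columns with NaN. This is only a sketch under assumptions: the helper name collect_second_columns and the in-memory demo data are made up for illustration, and every file is assumed to have a header row and at least two columns.

```python
import io
import pandas as pd

def collect_second_columns(sources):
    """Hypothetical helper: sources maps a label to a path or file-like object."""
    columns = []
    for label, src in sources.items():
        # usecols=[1] reads only the second column; iloc[:, 0] turns it into a Series
        col = pd.read_csv(src, usecols=[1]).iloc[:, 0]
        columns.append(col.rename(label))  # name the column after its source
    # axis=1 places the Series side by side; shorter ones are padded with NaN
    return pd.concat(columns, axis=1)

# In-memory stand-ins for real files (illustrative data only)
demo = {
    "a.csv": io.StringIO("x,y\n1,10\n2,20\n3,30\n"),
    "b.csv": io.StringIO("x,y\n1,40\n2,50\n"),
}
combined = collect_second_columns(demo)
```

With real files, the mapping could be built from the file names, for example {name: name for name in filenames}, and the result written with combined.to_csv('new_csv.csv', index=False).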

Dominik Sajovic
-
Hey Dominik, Thanks for sharing the code. Tweaked it for all files in the folder; works fine. – Ali Ajaz Oct 20 '19 at 02:41
-
Hi Ali, I would greatly appreciate an upvote to the answer if I have been of help, thank you. :) – Dominik Sajovic Oct 20 '19 at 20:34
-
I will for sure once I get more than 15 reputation points. Won't forget. Thanks again. – Ali Ajaz Oct 20 '19 at 23:16
1
This can be accomplished with glob and pandas:
import glob
import pandas as pd

mylist = glob.glob("*.csv")
df = pd.read_csv(mylist[0])  # create the dataframe from the first csv
df = pd.DataFrame(df.iloc[:, 1])  # only keep 2nd column
df.columns = [mylist[0]]  # name the column after its file
for x in mylist[1:]:  # loop through the rest of the csv files doing the same
    t = pd.read_csv(x)
    # key on the file name: a shared header would overwrite the same column
    df[x] = t.iloc[:, 1]
df.to_csv('output.csv', index=False)

Ian-Fogelman
-
Thanks for sharing the code. There was a mismatch between csvList and mylist. However, it only runs for the very first file. – Ali Ajaz Oct 19 '19 at 21:12
-
1
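import glob
import pandas as pd

# every CSV file in the folder
filenames = glob.glob(r'D:/CSV_FOLDER' + "/*.csv")
new_df = pd.DataFrame()
for filename in filenames:
    df = pd.read_csv(filename)
    second_column = df.iloc[:, 1]  # position 1 is the second column
    # one output column per input file, named after the file path
    new_df[f'SECOND_COLUMN_{filename.upper()}'] = second_column
    del df  # release the frame before reading the next file
new_df.to_csv('new_csv.csv', index=False)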
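
If pandas is not available, the same loop can be sketched with only the standard library. This is a rough sketch, not a tested replacement for the code above: the function names second_column and merge_second_columns and the in-memory demo data are made up for illustration, and each file is assumed to start with a header row and to have at least two columns.

```python
import csv
import io
from itertools import zip_longest

def second_column(lines):
    """Return the header and the values of the second column from CSV lines."""
    # skip blank or one-field rows so row[1] is always valid
    rows = [row for row in csv.reader(lines) if len(row) > 1]
    return rows[0][1], [row[1] for row in rows[1:]]

def merge_second_columns(named_sources):
    """Build output rows: one column per source, short columns padded with ''."""
    headers, columns = [], []
    for name, lines in named_sources:
        _, values = second_column(lines)
        headers.append(name)  # use the source name as the column header
        columns.append(values)
    # zip_longest turns the list of columns into rows of equal width
    return [headers] + [list(r) for r in zip_longest(*columns, fillvalue="")]

# In-memory stand-ins for real files (illustrative data only)
demo = [
    ("a.csv", io.StringIO("x,y\n1,10\n2,20\n")),
    ("b.csv", io.StringIO("x,y\n1,30\n")),
]
rows = merge_second_columns(demo)
```

With real files, each entry could be (path, open(path, newline="")) for the paths returned by glob.glob, and the rows could be written out with csv.writer(out_file).writerows(rows).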

Ali Ajaz
-
So I tried this code for multiple files in one folder and it works perfectly fine. – Ali Ajaz Oct 20 '19 at 02:40
0
import glob
import pandas as pd

mylist = glob.glob("*.csv")
df = pd.read_csv(mylist[0])  # create the dataframe from the first csv (was csvList, which is undefined)
df = pd.DataFrame(df.iloc[:, 1])  # only keep 2nd column (position 1, not 0)
df.columns = [mylist[0]]  # name the column after its file
for x in mylist[1:]:  # loop through the rest of the csv files doing the same
    t = pd.read_csv(x)
    # key on the file name: a shared header would overwrite the same column
    df[x] = t.iloc[:, 1]
df.to_csv('output.csv', index=False)

Ali Ajaz