1

I would like to merge specific columns from two csv files and use the filename as the column header.In this example, I want to merge the third column from each file into a single data frame. the csv files have the same number of rows and columns.

Sample data sets:

File1.csv

V1,V2,V3,V4
1,1986,64,61

File2.csv

V1,V2,V3,V4
1,1990,100,61

Expected Result:

"File1","File2"
64,100

Here's my script:

my.file.list <- list.files(pattern = "*.csv")
my.list <- lapply(X = my.file.list, FUN = function(x) {
        read.csv(x, header=TRUE,colClasses = c("NULL", "NULL", "numeric",    "NULL"), sep = ",")[,1]
    })
my.df <- do.call("cbind", my.list)

How do I add the column headers based from the file names?

I tried this:

sub('.csv','',basename(my.file.list),fixed=TRUE)

but I don't know how to add this as headers.

I'll appreciate any help.

Lyndz
  • 347
  • 1
  • 13
  • 30

2 Answers2

1
my.file.list <- list.files(pattern = "*.csv")
my.list <- list()
for (i in 1:length(my.file.list)) {
    df <- read.csv(my.file.list[[i]], header=TRUE, sep=",")["V3"]
    names(df) <- paste0("FILE", i)
    my.list[[i]] <- df
}
my.df <- do.call("cbind", my.list)
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
0

@Tim Biegeleisen Many thanks for the help. I got the idea now. Here's the improve version of your answer that I can use for files with different filenames.

 my.file.list <- list.files(pattern = "*.csv")
 my.list <- list()
 for (i in 1:length(my.file.list)) {
   df <- read.csv(my.file.list[[i]], header=TRUE, sep=",")["V3"]
    names(df) <-paste0(sub('.csv','',basename(my.file.list[i]),fixed=TRUE), i)
 my.list[[i]] <- df
 }
 my.df <- do.call("cbind", my.list)
Lyndz
  • 347
  • 1
  • 13
  • 30