1

I have 15 dataframes, that I have merged together. 15 dataframe

  1. Here I'm loading my files.
data_files <- list.files()  # Identify file names
                       

for(i in 1:length(data_files)) {                              
  assign(paste0(substr(data_files[i],1,nchar(data_files[i])-4)),             
           read_excel(paste0("",
                           data_files[i])))
}

Problem, they have the same columns names. That's why i want to rename the columns with the following code :

colnames(COMMUNITY)
 [1] "OBJECTID"                   "SOURCE_ID"                  "mean"                       "LMiIndex Fixed 450000 RS"  
 [5] "LMiZScore Fixed 450000 RS"  "LMiPValue Fixed 450000 RS"  "COType Fixed 450000 RS"     "NNeighbors Fixed 450000 RS"
 [9] "ZTransform Fixed 450000 RS" "SpatialLag Fixed 450000 RS"

colnames(COMMUNITY) <-paste("PREFIX",colnames(COMMUNITY),sep="-")
  

I would like to do this to my 15 dataframes, so I tried this :

List_df_EU = list(COMMUNITY,CSR_STRATEGY, EMISSIONS,ENV_PILLAR,ESGCOMBINED,ESGCONTROVERSIES,
                  ESGSCORE,GOV_PILLAR,HUMANRIGHTS,INNOVATION,MANAGEMENT,PRODUCT_RESP, RESSOURCE_USE, SOC_PILLAR, WORKFORCE)


for(i in 1:length(List_df_EU)) {                              
  colnames(List_df_EU[i]) <-paste("AS",colnames(List_df_EU[i]),sep="_")
  
}

It doesn't work, and, I don't know how to retrieve dataframe name, in order to put him as PREFIX of columns.
I could do it for each dataframe seperately, but it would take a long time, and would not be very clever. Even after many web researches, I never found something that was automated.

After that, I use the following line of code to merge, It actually works, but as expected every colnames are identical.

Merged_file <- purrr::reduce(List_df_EU, dplyr::left_join, by = 'OBJECTID', suffix = c(".x", ".y"))
  • 1
    We have no sample data to work with, so there's a bit more inference and less likelihood that you'll get a great answer (quickly). It would help immensely if you reduced this from a problem with 15 frames and umpteen columns to just 3 small frames, perhaps a few rows each, and just enough columns in each that we get the point. The show what you want the output to be. Once you figure it out for 2-3 frames, expanding to 15 should be straight-forward. – r2evans Mar 18 '22 at 13:31

3 Answers3

1

First, refer to elements of the list with double square brackets, like so List_df_EU[[i]] (List_df_EU[i] is a sub-list of 1 element, not the element itself).

Second, we could create List_df_EU with tibble::lst() instead of list(), so that elements are automatically named. Then, "AS" can be replaced with names(List_df_EU)[i].

List_df_EU <- tibble::lst(....)

for(i in 1:length(List_df_EU)) {                              
  colnames(List_df_EU[[i]]) <- paste(
    names(List_df_EU)[i], colnames(List_df_EU[[i]]), sep = "_")
}

Edit
To allow the subsequent join on OBJECTID, we could rename all columns but OBJECTID, for instance using dplyr that has a nice interface for this:

for(i in 1:length(List_df_EU)) {                              
  List_df_EU[[i]] <- dplyr::rename_with(
    List_df_EU[[i]],
    ~ paste(names(List_df_EU)[i], .x, sep = "_"),
    .cols = - OBJECTID
  )
}
Aurèle
  • 12,545
  • 1
  • 31
  • 49
  • Hello it works perfectly fine, so thank you. I'm just a little bit confused, because obviously `Merged_file <- purrr::reduce(List_df_EU, dplyr::left_join, by = "OBJECTID", suffix = c(".x", ".y"))` doens't work since `OBJECTID` does not exist anymore. ` Warning message: Unknown or uninitialised column: `OBJECTID`. ` What should be the expression calling for each `OBJECTID` in each datafram of the list `List_df_EU` ? I tried something like `colnames(List_df_EU[[i]])` but cant figure out how to get an array of each OBJECTID, or figure out a kind of regular expression. – Mathias Lauber Mar 18 '22 at 14:28
  • Sorry I had missed that part. See my edit. – Aurèle Mar 18 '22 at 15:06
  • 1
    Ok thats great ! It makes it so much easier not to rename that column. Well thank you for your help. I had a bad time trying to join dataframes with renamed OBJECTID. – Mathias Lauber Mar 18 '22 at 15:21
0

The easiest may be to bring them all into the same columns but add a column that indicates what file they came from. You could also pivot_wider and separate them again, at that point.

This function is for filling in the column that will be used to identify the source file.

library(tidyverse)
library(data.table)
add_name <- function(flnm) {
  fread(flmn) %>%
  mutate(filename = basename(flmn))
}

Use this to collect the files and build the data frame.

mergedDF <- list.files(urlOrObject) %>%
  map_df(~add_name(.))

Let me know if you have any questions.

Kat
  • 15,669
  • 3
  • 18
  • 51
0
  1. Name your list, then you can get the name prefix:

    List_df_EU = list(COMMUNITY = COMMUNITY,CSR_STRATEGY = CSR_STRATEGY ...)

  2. to set the colnames there is a [] missing:

    colnames(List_df_EU[[i]]) <- ...

Dharman
  • 30,962
  • 25
  • 85
  • 135
Lucas
  • 409
  • 2
  • 10