2

I have a file named 'schema'. Based on the file, I need to rename other data frames. For example, 'Var1' of TableA needs to be renamed to 'Col1'. Similarly, VarA of TableA needs to be renamed to ColA. In short, all variables listed in 'FROM' colume of schema needs to be renamed to column 'To'.

Schema <- read.table(header = TRUE, text = 
                  'Tables   From    To
                A   Var1    Col1
                A   Var2    Col2
                A   Var3    Col3
                B   VarA    ColA
                B   VarB    ColB
                B   VarC    ColC
                ')


A <- data.frame(Var1 = 1:3,
                  Var2 = 2:4,
                  Var3 = 3:5)

B <- data.frame(VarA = 1:3,
                VarB = 2:4,
                VarC = 3:5)
john
  • 1,026
  • 8
  • 19

3 Answers3

2

We could use match:

lapply(list(A = A, B = B), function(i){
  setNames(i, Schema$To[ match(names(i), Schema$From) ])
 })

# $A
#   Col1 Col2 Col3
# 1    1    2    3
# 2    2    3    4
# 3    3    4    5
# 
# $B
#   ColA ColB ColC
# 1    1    2    3
# 2    2    3    4
# 3    3    4    5

Or:

Anew <- setNames(A, Schema$To[ match(names(A), Schema$From) ])
Bnew <- setNames(B, Schema$To[ match(names(B), Schema$From) ])

Or list2env:

list2env(lapply(list(A = A, B = B), function(i){
  setNames(i, Schema$To[ match(names(i), Schema$From) ])
  }), envir = globalenv())

Edit: When there is no match Schema then use keep column name as is:

list2env(lapply(list(A = A, B = B), function(i){
  # check if there is a match, if not keep name unchaged
  x <- as.character(Schema$To[ match(names(i), Schema$From) ])
  ix <- which(is.na(x))
  x[ ix ] <- names(i)[ ix ]
  # retunr with updated names
  setNames(i, x)
  }), envir = globalenv())
zx8754
  • 52,746
  • 12
  • 114
  • 209
  • It works great. But it returns lists. Can it be shown to actual dfs A and B? – john Dec 04 '19 at 11:02
  • @john see edit, I prefer not to repeat code, hence used loop - `lapply` – zx8754 Dec 04 '19 at 11:04
  • This is what I want. I need it in loop. But your solution returns list. Is there anyway to incorporate it in actual dfs? – john Dec 04 '19 at 11:05
  • Thanks. It works great. Only issue - if a column name does not exist in schema which needs to be renamed, so name of the variable becomes blank. Can we keep names of these variables as it is? – john Dec 04 '19 at 16:50
  • @john see edit, we just need to add a check if there is a match. – zx8754 Dec 04 '19 at 21:29
1
  • The following code can extract retrieve the name of tables (A and B) from Schema and to the name replacement task:
r <- Map(function(v) function(v) {
  r <- get(v)
  names(r)[names(r) %in% Schema$From] <- as.character(Schema$To[Schema$From %in% names(r)])
  assign(v,r)},
    as.character(unique(Schema$Tables)))

which gives

> r
$A
  Col1 Col2 Col3
1    1    2    3
2    2    3    4
3    3    4    5

$B
  ColA ColB ColC
1    1    2    3
2    2    3    4
3    3    4    5
  • If you don't want result as list, you can do something like
list2env(Map(function(v) {
  r <- get(v)
  names(r)[names(r) %in% Schema$From] <- as.character(Schema$To[Schema$From %in% names(r)])
  assign(v,r)},
  as.character(unique(Schema$Tables))),envir = .GlobalEnv)

or

for (v in as.character(unique(Schema$Tables))) {
  r <- get(v)
  names(r)[names(r) %in% Schema$From] <- as.character(Schema$To[Schema$From %in% names(r)])
  assign(v,r)
}

then you will keep your object A and B

> A
  Col1 Col2 Col3
1    1    2    3
2    2    3    4
3    3    4    5
> B
  ColA ColB ColC
1    1    2    3
2    2    3    4
3    3    4    5
ThomasIsCoding
  • 96,636
  • 9
  • 24
  • 81
1
lut <- setNames(as.character(Schema$To), Schema$From)

setNames(A, lut[names(A)])

  Col1 Col2 Col3
1    1    2    3
2    2    3    4
3    3    4    5

setNames(B, lut[names(B)])

  ColA ColB ColC
1    1    2    3
2    2    3    4
3    3    4    5
s_baldur
  • 29,441
  • 4
  • 36
  • 69