1

I have data from ACS Census that has a metadata and tables with coded column names, I need to have the column names changed to the one in the metadata so it makes sense when read. Is there a way to replace the coded column name to actual by linking the table with the metadata.

I have used dplyr package to rename the column one by one, but I cant do it for every table as there are like 32000 columns for a single table(dataframe)

Any help would be appreciated, thanks.

Phil
  • 7,287
  • 3
  • 36
  • 66
Pream
  • 517
  • 4
  • 10

3 Answers3

5

We can use rename and evaluate a named vector

library(dplyr)
df2 <- rename(df2, !!! set_names(df1$original, df1$new))
df2
#  A  B  C
#1 1  6 11
#2 2  7 12
#3 3  8 13
#4 4  9 14
#5 5 10 15

data

df1 <-data.frame(original  = c('a', 'b', 'c'), new = c('A', 'B', 'C'),
      stringsAsFactors = FALSE)
df2 <- data.frame(a = 1:5, b =6:10, c = 11:15)
akrun
  • 874,273
  • 37
  • 540
  • 662
  • Akrun could you please explain the role !!! plays in this solution? – Eric Jun 03 '20 at 23:31
  • 1
    @Mcmahoon89 thanks, it is used for splicing the `list` and then evaluate it. According to `help("!!!")` `The big-bang operator !!! forces-splice a list of objects. The elements of the list are spliced in place, meaning that they each become one single argument.` – akrun Jun 03 '20 at 23:33
  • 1
    Thank you Akrun for this solution and explanation. I know it isn't the place, but one day I'd love to learn what resources you used to develop your vast knowledge of R, and in particular your command of the Tidyverse package. – Eric Jun 03 '20 at 23:47
  • @Mcmahoon89 it is some part practise and some related to simple changes in lifestyle. I wish I could give more info without labelling as crazy :=) – akrun Jun 04 '20 at 20:55
2

If quasiquotation is inconvenient, you can overwrite the existing names with a vector. Start with a trivial example - the band_instruments in dplyr

library(dplyr)
foo <- band_instruments

foo
# A tibble: 3 x 2
name  plays 
<chr> <chr> 
1 John  guitar
2 Paul  bass  
3 Keith guitar

Overwrite with a character vector

names(foo) <- c("Moniker", "Jams.On")

R > foo
# A tibble: 3 x 2
Moniker Jams.On
<chr>   <chr>  
1 John    guitar 
2 Paul    bass   
3 Keith   guitar 

So if you have a translation table, even one which has some extra info in it:

foo <- band_instruments
trTbl <- tibble(Names.Now = c("plays", "name", "shoes"),
                Names.Desired = c("Jams.On", "Moniker", "boots"))

replaceVec  <-  tibble(Names.Now =  names(band_instruments)) %>% 
  left_join(trTbl, by = "Names.Now") %>% 
  pull(Names.Desired)
names(foo) <- replaceVec

foo

# A tibble: 3 x 2
  Moniker Jams.On
  <chr>   <chr>  
1 John    guitar 
2 Paul    bass   
3 Keith   guitar 
David T
  • 1,993
  • 10
  • 18
2

You could use the good ol' match approach.

dat  ## before
#   X1 X4 X3 X2
# 1  1  4  7 10
# 2  2  5  8 11
# 3  3  6  9 12

names(dat) <- am$label[match(names(dat), am$code)]

dat  ## after
#   wage hh.size age no.children
# 1    1       4   7          10
# 2    2       5   8          11
# 3    3       6   9          12

Data:

dat <- structure(list(X1=1:3, X2=4:6, X3=7:9, X4=10:12), class="data.frame", 
                 row.names=c(NA, -3L))

am <- structure(list(code=c("X1", "X2", "X3", "X4"), 
                     label=c("age", "wage", "no.children", "hh.size")), 
                class="data.frame", row.names=c(NA, -4L))
jay.sf
  • 60,139
  • 8
  • 53
  • 110