I need to find the column called "L1_Class" using only the 'node' object defined below. Then I'd like to add a column called "L1" that contains the same vector as "L1_Class", all in one short line if possible.

library(dplyr)

node <- "L1" # (in real life this is a folder name in a working directory) "../Mapping/VegType/L1" 

# test dataframe
dat <- data.frame(L1_Class = c(rep("veg", 3), 
                               rep("rock", 4), 
                               rep("snow", 2)),
                  L2_tree = c(rep("conifer", 3), 
                               rep("cottonwood", 4), 
                               rep("snow", 2)),
                  L2_shrub = c(rep("alder", 3), 
                              rep("willow", 4), 
                              rep("snow", 2)),
                  L3_tree = c(rep("alder", 3), 
                               rep("willow", 4), 
                               rep("snow", 2)))

# this is what I need to add only using a function.
dat$L1 <- dat$L1_Class

# end dataframe (i hope).
head(dat)

# I need to find the column called "L1_Class" but only using the 'node' object. 
# Then, I'd like to add a column called "L1" that contains the same vector as "L1_Class"
# All in one short line if possible.

dat_extra_col <- dat %>%
         mutate( across(starts_with(gsub('_[^_]*$', "", node, fixed = TRUE))), .fns = list(node = ~gsub('_[^_]*$', "", ., fixed = TRUE)))

# or worse

dat_extra_col 
dat %>% mutate(!!node := across(starts_with(gsub('_[^_]*$', "", node, fixed = TRUE))))

Thanks for any help!

  • Is it always the first part of the column name? And does it need to be able to handle duplicates (e.g. L2_shrub and L2_tree)? – Tjn25 Oct 08 '21 at 23:18

3 Answers

You can use -

dat[[node]] <- dat[[paste0(node, '_Class')]]

dat

#  L1_Class    L2_tree L2_shrub L3_tree   L1
#1      veg    conifer    alder   alder  veg
#2      veg    conifer    alder   alder  veg
#3      veg    conifer    alder   alder  veg
#4     rock cottonwood   willow  willow rock
#5     rock cottonwood   willow  willow rock
#6     rock cottonwood   willow  willow rock
#7     rock cottonwood   willow  willow rock
#8     snow       snow     snow    snow snow
#9     snow       snow     snow    snow snow

This creates a new column whose name is the value of node ("L1") and copies into it the data from the column named node + "_Class" ("L1_Class").

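Or use grep if not all variables have _Class as a suffix. This assumes exactly one column name matches node; if the "L1" column already exists, or several names contain "L1", [[ will not return a single column.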

dat[[node]] <- dat[[grep(node, names(dat))]]
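A slightly more defensive variant of the grep approach could look like the sketch below. It assumes the source column always starts with node followed by an underscore, and it checks that exactly one column matches before copying. The name src and the cut-down test data are only illustrative.

```r
node <- "L1"

# Cut-down version of the test data from the question
dat <- data.frame(L1_Class = c("veg", "rock", "snow"),
                  L2_tree  = c("conifer", "cottonwood", "snow"))

# Anchor the pattern so only names starting with node followed by "_" match;
# value = TRUE returns the matching names rather than their positions
src <- grep(paste0("^", node, "_"), names(dat), value = TRUE)

# Stop early unless exactly one source column was found
stopifnot(length(src) == 1)

# Copy the matched column into a new column named after node
dat[[node]] <- dat[[src]]
```

Because the pattern requires the underscore, an existing "L1" column would not be picked up as a second match.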
Ronak Shah

With dplyr you can do this

dat %>% mutate(!!node := !!as.name(paste0(node, "_Class")))

output

# A tibble: 9 x 5
  L1_Class L2_tree    L2_shrub L3_tree L1   
  <chr>    <chr>      <chr>    <chr>   <chr>
1 veg      conifer    alder    alder   veg  
2 veg      conifer    alder    alder   veg  
3 veg      conifer    alder    alder   veg  
4 rock     cottonwood willow   willow  rock 
5 rock     cottonwood willow   willow  rock 
6 rock     cottonwood willow   willow  rock 
7 rock     cottonwood willow   willow  rock 
8 snow     snow       snow     snow    snow 
9 snow     snow       snow     snow    snow 
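The same idea can also be written with glue-style names and the .data pronoun, which avoids as.name(). This is only a sketch and assumes a reasonably recent dplyr (1.0 or later, as far as I know, for the "{node}" syntax); the test data is a cut-down version of the one in the question.

```r
library(dplyr)

node <- "L1"

# Cut-down version of the test data from the question
dat <- data.frame(L1_Class = c("veg", "rock", "snow"),
                  L2_tree  = c("conifer", "cottonwood", "snow"))

# "{node}" on the left-hand side injects the value of node as the new name;
# .data[[...]] looks up the source column by its name given as a string
dat_extra_col <- dat %>%
  mutate("{node}" := .data[[paste0(node, "_Class")]])
```

Like the version above, this relies on the source column being named node + "_Class".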
Marek Fiołka

If node is always the first part of the column name and you do not need to handle duplicates (e.g. L2_tree and L2_shrub), you can use substr() to take the first nchar(node) characters of each column name and match() to find the column of interest.

dat[[node]] <- dat[,!is.na(match(substr(colnames(dat), 1, nchar(node)), node))]
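If duplicates are a possibility, the prefix match can be wrapped in a small helper that refuses to guess. This is a sketch only: add_node_col is a hypothetical name, and it assumes the source column starts with node followed by an underscore. It uses base R's startsWith() in place of substr() and match().

```r
# Hypothetical helper: copy the single column whose name starts with
# paste0(node, "_") into a new column named after node
add_node_col <- function(df, node) {
  hits <- names(df)[startsWith(names(df), paste0(node, "_"))]
  # Fail loudly when the prefix is missing or ambiguous
  if (length(hits) != 1) {
    stop("expected exactly one column starting with '", node, "_', found ",
         length(hits))
  }
  df[[node]] <- df[[hits]]
  df
}

# Cut-down version of the test data from the question
dat <- data.frame(L1_Class = c("veg", "rock", "snow"),
                  L2_tree  = c("conifer", "cottonwood", "snow"),
                  L2_shrub = c("alder", "willow", "snow"))

dat_l1 <- add_node_col(dat, "L1")   # adds an "L1" column
# add_node_col(dat, "L2") would stop, because two columns start with "L2_"
```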
Tjn25