
How can I rename the header in R, based on information from the current header?

My data frame looks like this:

header: (index; d__Bacteria.p__Actinobacteriota.c__Actinobacteria ; d__Bacteria.p__Bacteroidota.c__Bacteroidia )

Row 1: (BF13A; 0; 14572)

Row 2: (BF13B; 0; 24215)

etc

I want to rename the columns that carry taxonomy information (those starting with d__Bacteria.) so that each keeps only the part after c__:

header: (index; Actinobacteria ; Bacteroidia )

Row 1: (BF13A; 0; 14572)

Row 2: (BF13B; 0; 24215)

etc.

By the way, I have more than two columns with taxonomic information, so the solution should work on bigger data frames as well.

Peter
jonsor
  • Welcome to Stack Overflow. Please make your question reproducible by including your data as an object in the question. With small datasets like the one in this question, it is easy to paste in `df <- data.frame(var1 = c(…), …)`. This makes it easier for others to test and verify solutions. [MRE] provides guidance. – Peter Apr 13 '21 at 16:37

1 Answer

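You can rename the columns with the tidyverse (rename() comes from dplyr), using new_name = old_name pairs. Something like this:

library(tidyverse)
# rename() returns a modified copy, so assign the result back
df <- df %>%
  rename(
    Actinobacteria = d__Bacteria.p__Actinobacteriota.c__Actinobacteria,
    Bacteroidia = d__Bacteria.p__Bacteroidota.c__Bacteroidia
  )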

This can also be done with base R functions:

names(df)[names(df) == "d__Bacteria.p__Actinobacteriota.c__Actinobacteria"] <- "Actinobacteria"
names(df)[names(df) == "d__Bacteria.p__Bacteroidota.c__Bacteroidia"] <- "Bacteroidia"
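
With many taxonomy columns, one assignment per column gets repetitive. A minimal base R sketch, assuming the old and new names are collected in a named character vector (here called lookup, a name chosen only for this illustration) and using the sample values from the question:

```r
# Sample data rebuilt from the question
df <- data.frame(
  index = c("BF13A", "BF13B"),
  d__Bacteria.p__Actinobacteriota.c__Actinobacteria = c(0, 0),
  d__Bacteria.p__Bacteroidota.c__Bacteroidia = c(14572, 24215)
)

# Hypothetical lookup table: names are the old headers, values the new ones
lookup <- c(
  "d__Bacteria.p__Actinobacteriota.c__Actinobacteria" = "Actinobacteria",
  "d__Bacteria.p__Bacteroidota.c__Bacteroidia" = "Bacteroidia"
)

# Rename only the columns listed in the lookup; all others stay as they are
hit <- names(df) %in% names(lookup)
names(df)[hit] <- unname(lookup[names(df)[hit]])

names(df)
# [1] "index"          "Actinobacteria" "Bacteroidia"
```

This still requires typing every name once, so it mainly helps when the new names cannot be derived from the old ones.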

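Instead of typing out the new column names, you can derive them from the old names by removing everything up to and including ".c__". Columns without that prefix, such as index, are left unchanged:

new_df <- df %>%
  setNames(sub("^.*\\.c__", "", names(.)))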
> colnames(new_df)
[1] "Actinobacteria" "Bacteroidia"
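
To check how a pattern-based rename behaves on a data frame that also has a non-taxonomy column, here is a small base R sketch on the question's sample data. It assumes every taxonomy header contains ".c__" and that stripping with sub() is acceptable; the make.unique() step is an added precaution (not part of the original answer) in case two columns end up with the same class name.

```r
# Sample data rebuilt from the question
df <- data.frame(
  index = c("BF13A", "BF13B"),
  d__Bacteria.p__Actinobacteriota.c__Actinobacteria = c(0, 0),
  d__Bacteria.p__Bacteroidota.c__Bacteroidia = c(14572, 24215)
)

# Drop everything up to and including ".c__"; names without it are untouched
new_names <- sub("^.*\\.c__", "", names(df))

# Guard against duplicate names by appending .1, .2, ... to repeats
names(df) <- make.unique(new_names)

names(df)
# [1] "index"          "Actinobacteria" "Bacteroidia"
```

Because the pattern is applied to all names at once, the same lines work unchanged however many taxonomy columns the data frame has.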
pratap