I have a dataset dtAU whose column names are the following:

...
"SUPG..SU.Product.Group"  
"SUPC..SU.Industry.Code" 
"SU_CAT..SU.Category"               
"FREQUENCY..Frequency" 
"TIME_PERIOD..Time.Period" 
...

I want the column names to be "SUPG", "SUPC", etc., that is, to extract only the characters before the ".." and assign them as the column names.

When I try this:

test <- str_split(colnames(dtAU), "[..]")

I get:

List of 11
 $ : chr [1:3] "ï" "" "DATAFLOW"
 $ : chr [1:5] "SUPG" "" "SU" "Product" ...
 $ : chr [1:5] "SUPC" "" "SU" "Industry" ...
 $ : chr [1:4] "SU_CAT" "" "SU" "Category"
 $ : chr [1:3] "FREQUENCY" "" "Frequency"
 $ : chr [1:4] "TIME_PERIOD" "" "Time" "Period"
 $ : chr "OBS_VALUE"
 $ : chr [1:4] "UNIT_MEASURE" "" "Observation" "Comment"
 $ : chr [1:5] "UNIT_MULT" "" "Unit" "of" ...
 $ : chr [1:4] "OBS_STATUS" "" "Observation" "Comment"
 $ : chr [1:4] "OBS_COMMENT" "" "Observation" "Comment"

But I do not know how to take the first part of each string and use it as the column name.

zx8754
IRT
  • You can use just `sub("[..].*", "", colnames(dtAU) )` instead. See this post https://stackoverflow.com/questions/33683862/first-entry-from-string-split. – JKupzig Mar 11 '22 at 10:36

2 Answers

A possible solution:

library(tidyverse)

n <- c("SUPG..SU.Product.Group",
"SUPC..SU.Industry.Code",
"SU_CAT..SU.Category",               
"FREQUENCY..Frequency", 
"TIME_PERIOD..Time.Period") 

n %>% 
  str_remove("\\.\\..*")
#> [1] "SUPG"        "SUPC"        "SU_CAT"      "FREQUENCY"   "TIME_PERIOD"

Now, to assign the new colnames to your dataframe dtAU, just do the following:

names(dtAU) <- names(dtAU) %>% str_remove("\\.\\..*")
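
As a quick check, here is a minimal sketch on a toy data frame. The `toy` object and its values are made up for illustration and are not the asker's data. It uses base R's `sub()` with the same pattern, so it should run without loading any package:

```r
# Hypothetical stand-in for dtAU; check.names = FALSE keeps the names exactly as typed
toy <- data.frame(
  "SUPG..SU.Product.Group" = c("A", "B"),
  "TIME_PERIOD..Time.Period" = c(2020, 2021),
  "OBS_VALUE" = c(1.5, 2.5),
  check.names = FALSE
)

# Remove the first ".." and everything after it; names without ".." stay unchanged
names(toy) <- sub("\\.\\..*", "", names(toy))

names(toy)
#> [1] "SUPG"        "TIME_PERIOD" "OBS_VALUE"
```

Columns such as OBS_VALUE, which contain no "..", are left as they are because the pattern does not match.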
PaulS
  • Yes, but then I need to assign a new name to each column, so that the column previously named SUPG..SU.Product.Group is now named "SUPG". – IRT Mar 11 '22 at 10:38
  • Just edited my answer to address your need, @IRT. – PaulS Mar 11 '22 at 10:41

You can do either of the following:

# Each line is enough on its own; the two patterns are equivalent
gsub('[..].*', '', names(dtAU)) -> names(dtAU)
gsub('\\..*', '', names(dtAU)) -> names(dtAU)

or if you want to use strsplit:

sapply(strsplit(names(dtAU), split = '\\.'), `[[`, 1) -> names(dtAU)
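
One caveat: these variants all cut at the first single dot, because `[..]` is a character class that matches one literal dot, just like `\\.`. That is fine for the names in the question, but a name with a lone dot before the ".." would be cut too early. A small sketch of the difference, where the second name is made up for illustration and does not come from the question:

```r
# The second name is hypothetical: it has a single dot before the ".."
x <- c("SUPG..SU.Product.Group", "OBS.VALUE..Observation.Value")

# Cuts at the first single dot
sub("\\..*", "", x)
#> [1] "SUPG" "OBS"

# Cuts only at the first double dot
sub("\\.\\..*", "", x)
#> [1] "SUPG"      "OBS.VALUE"

# strsplit equivalent: fixed = TRUE treats ".." as a literal string, not a regex
first_part <- sapply(strsplit(x, "..", fixed = TRUE), `[[`, 1)
first_part
#> [1] "SUPG"      "OBS.VALUE"
```

If none of the real column names contain a single dot before the "..", both approaches give the same result.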
AlexB