
I have a dataframe like this:

  V1      V2      V3 
1  1 3423086 3423685 
2  1 3467184 3467723 
3  1 4115236 4115672 
4  1 5202437 5203057 
5  2 7132558 7133089 
6  2 7448688 7449283 

I want to change the V1 column by adding the prefix chr before each number, like this:

    V1      V2      V3
1 chr1 3423086 3423685
2 chr1 3467184 3467723
3 chr1 4115236 4115672
4 chr1 5202437 5203057
5 chr2 7132558 7133089
6 chr2 7448688 7449283

Is there a way to do this in R?

Bhargav Rao
Lisann

3 Answers


The regex pattern "^" (outside any character-class brackets) matches the position just before the first character of a string (an element of a "character" vector in R). So sub() here simply inserts the prefix "chr" at the start of each element. Note that a "numeric" input is implicitly coerced to "character", so the mode of the column changes.

> dat$V1 <- sub("^", "chr", dat$V1 )
> dat
    V1      V2      V3
1 chr1 3423086 3423685
2 chr1 3467184 3467723
3 chr1 4115236 4115672
4 chr1 5202437 5203057
5 chr2 7132558 7133089
6 chr2 7448688 7449283

I could, of course, have used paste("chr", dat$V1, sep=""), but I thought a regex solution might be neater.
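
As a quick check that the two approaches agree, here is a minimal sketch on a rebuilt copy of the example data. The name `dat` follows the answer; `via_sub` and `via_paste` are names made up for illustration, and `paste0` is used as the base R shorthand for `paste(..., sep = "")`.

```r
# Rebuild the example data frame from the question
dat <- data.frame(
  V1 = c(1, 1, 1, 1, 2, 2),
  V2 = c(3423086, 3467184, 4115236, 5202437, 7132558, 7448688),
  V3 = c(3423685, 3467723, 4115672, 5203057, 7133089, 7449283)
)

# Regex approach: insert "chr" at the start of each (coerced) string
via_sub <- sub("^", "chr", dat$V1)

# Concatenation approach: paste0() is paste() with sep = ""
via_paste <- paste0("chr", dat$V1)

# Both results are character vectors with the same contents
identical(via_sub, via_paste)
```

Either result can then be assigned back with `dat$V1 <- via_paste`.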

IRTFM
  • What did the sledgehammer say to the nut? I'll smash you into `paste`. – Andrie Nov 09 '11 at 13:35
  • What if I need to add `chr` after the numbers, i.e. `1chr`, `2chr`, etc.? – ah bon Sep 01 '21 at 10:36
  • @ah bon `paste(dat$V1, "chr", ...` would seem to be the obvious modification of the second solution. And `sub("$", "chr", dat$V1)` would be the corresponding modification of the first solution, noting that the pattern '$' in that call is the regex end-of-string marker rather than the R extraction operator. – IRTFM Sep 01 '21 at 14:36

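sprintf is a lot more powerful than plain concatenation, because the format string also controls things like padding and number formatting. Note that %i expects whole-number input, so this assumes V1 is still numeric.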

dat$V1 <- sprintf('chr%i', dat$V1)
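
To illustrate what the format string buys you, here is a small sketch. The vector `v1` and the zero-padded and `%s` variants are my own additions for illustration, not part of the question's data.

```r
# Hypothetical chromosome numbers, stored as numeric like V1 in the question
v1 <- c(1, 1, 2, 10)

# Plain prefix: same result as concatenation
plain <- sprintf("chr%i", v1)

# Zero-padded to two digits, which makes the labels sort consistently
padded <- sprintf("chr%02d", v1)

# %s accepts character input, e.g. if the column was already converted
from_chr <- sprintf("chr%s", c("1", "X"))

print(plain)
print(padded)
print(from_chr)
```

If the column might already be character, `%s` is the safer choice, since `%i` raises an error for character input.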
YvanR

We can also use interaction:

df$V1 <- interaction( "chr", df$V1, sep = "")
df

Or using sqldf:

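library(sqldf)
df$V1 <- as.character(df$V1)  # convert first so SQLite concatenates text, not a number
df$V1 <- sqldf("select 'chr' || V1 as V1 from df")$V1  # sqldf returns a data frame; keep only the column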
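
One caveat about the `interaction` approach: it returns a factor, not a character vector. A minimal sketch, using a cut-down data frame and the made-up names `as_factor` and `as_char`, shows the difference and one way to convert if plain strings are wanted:

```r
# Cut-down version of the example data (illustrative only)
df <- data.frame(V1 = c(1, 1, 2), V2 = c(3423086, 3467184, 7132558))

# interaction() combines its arguments into a single factor
as_factor <- interaction("chr", df$V1, sep = "")
class(as_factor)

# Convert to character if downstream code expects plain strings
as_char <- as.character(as_factor)
df$V1 <- as_char
```

Whether the factor is a problem depends on what the column is used for later; for grouping it may be fine, while string operations usually expect character.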
mpalanco
  • How do I use interaction on multiple columns? `df[,2:3] <- interaction(df[2:3], "addtext", sep ="")` throws a sort error. – Vasim Dec 02 '16 at 06:15
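
For the multi-column case raised in the comment above, one possible route is to skip `interaction` and apply `paste0` to each column in turn. This is only a sketch: the data frame, the `cols` vector and the `"addtext"` suffix are placeholders taken from or modelled on the comment.

```r
# Small illustrative data frame
df <- data.frame(V1 = c(1, 2), V2 = c(10, 20), V3 = c(30, 40))

# Columns to modify (hypothetical selection)
cols <- c("V2", "V3")

# Apply the same concatenation to each selected column
df[cols] <- lapply(df[cols], function(x) paste0(x, "addtext"))

print(df)
```

Swapping the argument order inside `paste0` gives a prefix instead of a suffix.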