
I have a data frame that looks like this:

library(tidyverse)

data=data.frame(POS=c(172367,10), SNP=c("ATCG","AG"), QUAL=c(30,20))
data
#>      POS  SNP QUAL
#> 1 172367 ATCG   30
#> 2     10   AG   20

Created on 2022-02-02 by the reprex package (v2.0.1)

and I want to make it look like this:

   POS     SNP    QUAL
   172367  A      30
   172368  T      30
   172369  C      30
   172370  G      30
   10      A      20
   11      G      20

I want to split each multi-character string in SNP into one row per character, and increment POS by one for each successive character so that every base gets its own position.

Any help is highly appreciated.

lovalery
LDT

1 Answer


You can do:

library(dplyr)
library(tidyr)

data %>%
  separate_rows(SNP, sep = "(?<=[ACGT])") %>%
  mutate(POS = ave(POS, POS, FUN = \(x) x + seq_along(x) - 1))

# A tibble: 6 x 3
     POS SNP    QUAL
   <dbl> <chr> <dbl>
1 172367 A        30
2 172368 T        30
3 172369 C        30
4 172370 G        30
5     10 A        20
6     11 G        20
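
If you would rather avoid the extra packages, the same idea can be sketched in base R. This is only a rough alternative, not part of the approach above: it splits each SNP into single characters, repeats every row once per character, and adds a 0-based offset to POS. The names `bases`, `n` and `result` are placeholders chosen for this example. Because the offset is computed from the original row rather than by grouping on POS, it should also behave sensibly if two input rows happen to share the same POS.

```r
# Same input as in the question.
data <- data.frame(POS = c(172367, 10), SNP = c("ATCG", "AG"), QUAL = c(30, 20))

# Split each SNP string into its single characters (one vector per row).
# as.character() guards against SNP being a factor in older R versions.
bases <- strsplit(as.character(data$SNP), "")

# Number of characters contributed by each original row.
n <- lengths(bases)

# Repeat each original row once per character.
result <- data[rep(seq_len(nrow(data)), n), ]

# Replace SNP with the single characters and shift POS by 0, 1, 2, ...
# within each original row; sequence(n) restarts at 1 for every row.
result$SNP <- unlist(bases)
result$POS <- result$POS + sequence(n) - 1

# Drop the repeated row names left over from the row duplication.
rownames(result) <- NULL
result
```

The result is a plain data.frame rather than a tibble, so wrap it in `tibble::as_tibble()` if you need the tidyverse print format.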
Ritchie Sacramento