
Introduction and problem

I have multiple (>2) GRanges objects. I want to find those ranges that are shared by x% or more of all GRanges.

Example data

The example data are given as data frames. Say we want to find the ranges that are shared by 66.7% (2/3) or more of the objects.

gr1 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(1, 10, 20), 
                  end = c(3, 17, 30))


gr2 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 11, 31), 
                  end = c(3, 19, 35))

gr3 <- data.frame(seqnames = rep('chr1', 3),
                  start = c(2, 16, 37), 
                  end = c(3, 22, 40))


Output wanted

A GRanges object. In the example, the algorithm should find:

chr1 2 - 3
Reason: (2-3 is found in gr1, gr2 and gr3; 1 is only found in gr1)

chr1 11 - 22
Reason: (11-17 is found in gr1 and gr2; 10 only in gr1; 18-19 in gr2 and gr3; 20-22 in gr1 and gr3)

What I have done

I know how to find query hits found in all (100%) GRanges; see "R overlap multiple GRanges with findOverlaps()".
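For reference, the 100% case can be sketched as below. This is not necessarily the approach from the linked question (which uses findOverlaps()); it is a minimal alternative that assumes an iterated `intersect()` over the objects is acceptable. The example ranges are rebuilt directly as GRanges so the snippet runs on its own.

```r
library(GenomicRanges)

# Example ranges from the question, built directly as GRanges
gr1 <- GRanges("chr1", IRanges(start = c(1, 10, 20), end = c(3, 17, 30)))
gr2 <- GRanges("chr1", IRanges(start = c(2, 11, 31), end = c(3, 19, 35)))
gr3 <- GRanges("chr1", IRanges(start = c(2, 16, 37), end = c(3, 22, 40)))

# Fold intersect() over the list: keeps only positions present in every object
in_all <- Reduce(function(a, b) intersect(a, b), list(gr1, gr2, gr3))
in_all
```

This only covers "shared by all"; it does not generalise to a threshold such as 2 out of 3.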

dshandel

1 Answer


I asked this same question on the Bioconductor support site, and it was answered correctly there: https://support.bioconductor.org/p/9148540/
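So that this post is useful on its own, here is a minimal sketch of one way to do it. It is not copied from the linked answer and may differ from it. It assumes a coverage-based approach: reduce each object so it counts at most once per position, pool everything, compute per-base coverage, and keep the stretches covered by at least the required number of objects. The names `min_frac`, `min_n`, `pooled` and `shared` are illustrative only.

```r
library(GenomicRanges)

# Example data from the question, as data frames
df1 <- data.frame(seqnames = "chr1", start = c(1, 10, 20), end = c(3, 17, 30))
df2 <- data.frame(seqnames = "chr1", start = c(2, 11, 31), end = c(3, 19, 35))
df3 <- data.frame(seqnames = "chr1", start = c(2, 16, 37), end = c(3, 22, 40))

# Convert each data frame to a GRanges object
grs <- lapply(list(df1, df2, df3), makeGRangesFromDataFrame)

# Fraction of objects that must share a position (assumed parameter)
min_frac <- 2 / 3
min_n <- ceiling(min_frac * length(grs))

# reduce() merges overlapping ranges within one object, so each object
# contributes at most 1 to the coverage at any position
pooled <- unlist(GRangesList(lapply(grs, reduce)))

# Per-base count of how many objects cover each position
cov <- coverage(pooled)

# Contiguous stretches with coverage >= min_n, converted back to GRanges
shared <- as(slice(cov, lower = min_n, rangesOnly = TRUE), "GRanges")
shared
```

On the example data this should give chr1:2-3 and chr1:11-22, matching the output wanted in the question. Strand is ignored by coverage(), so stranded data would need to be split by strand first if that matters.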

dshandel