I have two data sets, each containing Start, End, and Chromosome columns. I want to compare the values from the two files and find any regions that don't overlap (taking into account start, end, and chromosome positions), then collect them in a list using R. What is the best way to run through all the data points from both files and compare them?

File Example 1:

Start   End Chr
0   4   1
26  31  2
48  55  3

File Example 2:

Chr Start   Stop
1   0.779727    4.836056
1   0.852863    3.700089
2   5.334127    21.181346
2   6.218477    6.734267

Thanks

Gavin Simpson
  • Have you looked at the Bioconductor package 'IRanges'? I also see a GenomicRanges package listed, but generally this sort of problem has already been addressed in 'IRanges'. – IRTFM Jun 13 '12 at 18:54
  • The findOverlaps() function from the GenomicRanges package seems to do the trick, thanks! – user1454470 Jun 13 '12 at 19:26
  • @DWin Sounds like you have the Answer there. Care to provide as such so we can close this one down? – Gavin Simpson Jun 13 '12 at 20:03
  • I'm hoping @user1454470 will do that. He has the actual use case, which is not really apparent in the material above. I was throwing out a hunch, which I am happy to see was fruitfully explored. (The GenomicRanges package has a lot of dependencies, so new users should be patient while it installs.) – IRTFM Jun 13 '12 at 20:28
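
Since no worked answer was posted, here is a minimal base R sketch of the comparison described in the question. It is not the GenomicRanges approach from the comments: it swaps in a plain row-by-row check so that it runs without extra packages and accepts the decimal coordinates in File 2 as they are (IRanges, as far as I know, stores integer positions, so those values would need rounding first). The helper name `has_overlap` and the data frame names are hypothetical, File 2's `Stop` column is renamed to `End`, and intervals are assumed to be closed, so regions that only touch at an endpoint count as overlapping.

```r
# Example data copied from the question. Column names are harmonised to
# Chr/Start/End (File 2 calls its end column "Stop"); this is an assumption.
file1 <- data.frame(Chr   = c(1, 2, 3),
                    Start = c(0, 26, 48),
                    End   = c(4, 31, 55))
file2 <- data.frame(Chr   = c(1, 1, 2, 2),
                    Start = c(0.779727, 0.852863, 5.334127, 6.218477),
                    End   = c(4.836056, 3.700089, 21.181346, 6.734267))

# Hypothetical helper: returns TRUE for each row of `query` that overlaps at
# least one row of `subject` on the same chromosome. Intervals are treated as
# closed, so touching endpoints count as an overlap.
has_overlap <- function(query, subject) {
  vapply(seq_len(nrow(query)), function(i) {
    any(subject$Chr == query$Chr[i] &
        subject$Start <= query$End[i] &
        subject$End >= query$Start[i])
  }, logical(1))
}

# Regions in each file with no overlapping partner in the other file.
unmatched1 <- file1[!has_overlap(file1, file2), ]
unmatched2 <- file2[!has_overlap(file2, file1), ]

print(unmatched1)
print(unmatched2)
```

This does one pass over the other file per region, which is fine for small inputs; for genome-scale data the `findOverlaps()` route suggested in the comments is likely the better fit.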
