
I have intervals in a data frame df, as follows:

    df=data.frame(Id=rep("A1",23),
                  start=c(11176,11176,11176,11176,11176,11176,11176,11177,11177,11177,11177,11177,11177,11178,11178,11179,11179,11179,11233,11233,11233,11233,11233),
                  end=c(11205,11206,11206,11206,11206,11206,11207,11206,11206,11208,11206,11208,11209,11206,11206,11203,11204,11204,11263,11263,11263,11263,11264))

I also have a query position, 11180. I would like to find the overlapping intervals, starting from the intervals that contain 11180. For example, the query position (11180) lies within the first interval in df, 11176 to 11205. So I take this interval (11176 to 11205) and find the intervals that overlap it, then the intervals that overlap those, and so on until no further overlap is found, and return the last such interval. I expect my code to give 11179 and 11204 as the last overlapping interval, but it reports only 11178 and 11206. Below is my code:

    for(i in 1:dim(df)[1])
    {
      final_start=temp_start
      final_end=temp_end
      if((findInterval(final_end,c(df$start[i],df$end[i]),rightmost.closed = T,left.open = T)==1L) || (findInterval(final_start,c(df$start[i],df$end[i]),rightmost.closed = T,left.open = T)==1L))
      {
        final_start=df$start[i]
        final_end=df$end[i]
        print(final_start)
        print(final_end)
      }
    }

The code above takes the query range as input and stores its start and end in temp_start and temp_end. It then checks whether temp_end or temp_start falls within each interval in the data frame df. Any guidance would be very useful. Thanks in advance, and let me know if you need any more details.
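For reference, here is a minimal base R sketch of the chaining described above. It is one possible reading of the problem, not a definitive solution. It assumes closed intervals and assumes that "overlap" means two ranges share at least one position. The helper name `chain_overlaps` is hypothetical. Note that the loop above, if temp_start and temp_end are 11176 and 11205, only asks whether one of those two endpoints falls inside each row's interval, so a row such as 11179 to 11204 that lies entirely inside 11176 to 11205 is never matched. The sketch uses a symmetric overlap test instead and keeps widening the range until it stops changing.

```r
df <- data.frame(
  Id    = rep("A1", 23),
  start = c(11176, 11176, 11176, 11176, 11176, 11176, 11176, 11177, 11177,
            11177, 11177, 11177, 11177, 11178, 11178, 11179, 11179, 11179,
            11233, 11233, 11233, 11233, 11233),
  end   = c(11205, 11206, 11206, 11206, 11206, 11206, 11207, 11206, 11206,
            11208, 11206, 11208, 11209, 11206, 11206, 11203, 11204, 11204,
            11263, 11263, 11263, 11263, 11264)
)

# Hypothetical helper: grow a range outward from `query` until no
# additional interval overlaps it, then return the rows that were reached.
chain_overlaps <- function(df, query) {
  lo <- query
  hi <- query
  repeat {
    # Closed-interval overlap test: [start, end] shares a position with [lo, hi]
    hit <- df$start <= hi & df$end >= lo
    if (!any(hit)) return(df[0, ])          # nothing contains the query
    new_lo <- min(lo, df$start[hit])
    new_hi <- max(hi, df$end[hit])
    if (new_lo == lo && new_hi == hi) {     # range stopped growing
      return(df[hit, ])
    }
    lo <- new_lo
    hi <- new_hi
  }
}

res <- chain_overlaps(df, 11180)
tail(res, 1)   # last row reached in the chain: start 11179, end 11204
```

With this data the chain reaches the first 18 rows (spanning 11176 to 11209) and stops before the rows starting at 11233, so the last row reached is 11179 to 11204.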

alexis_laz
Carol
  • Are you trying to find all intervals of "df" that overlap with 11180? Try looking at "IRanges" package -- e.g. look at the `subjectHits` of `findOverlaps(IRanges(11180, 11180), IRanges(df$start, df$end))` which contains the rows of intervals that overlap with 11180. – alexis_laz Jun 10 '16 at 19:20
  • Another very good overlap function comes with `data.table`, called `foverlaps()`. Check out [this link](http://www.rdocumentation.org/packages/data.table/functions/foverlaps) for some great examples to wrap your head around the concept. – Ratnanil Jun 11 '16 at 21:08
  • @alexis_laz I am trying to find the intervals that overlap 11180, and subsequently the other intervals that overlap those intervals, and so on until no overlaps are reported. But my code was able to cover only half the distance. – Carol Jun 13 '16 at 07:21
  • @Carol : I guess, using package "IRanges", something like `df[subjectHits(findOverlaps(IRanges(11180, 11180), IRanges(df$start, df$end))), ]` seems to fit your description? You might, also, find `reduce` (from the same package) helpful. – alexis_laz Jun 13 '16 at 12:30
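The IRanges calls suggested in these comments could be put together roughly as follows. This is only a sketch, assuming the Bioconductor package IRanges is installed. The variable names `ir`, `hits` and `merged` are illustrative. `reduce` merges overlapping (and, by default, adjacent) ranges, so each merged range corresponds to one chain of overlapping intervals.

```r
library(IRanges)  # Bioconductor package; assumed to be installed

df <- data.frame(
  Id    = rep("A1", 23),
  start = c(11176, 11176, 11176, 11176, 11176, 11176, 11176, 11177, 11177,
            11177, 11177, 11177, 11177, 11178, 11178, 11179, 11179, 11179,
            11233, 11233, 11233, 11233, 11233),
  end   = c(11205, 11206, 11206, 11206, 11206, 11206, 11207, 11206, 11206,
            11208, 11206, 11208, 11209, 11206, 11206, 11203, 11204, 11204,
            11263, 11263, 11263, 11263, 11264)
)

ir <- IRanges(start = df$start, end = df$end)

# Rows of df whose interval contains the query position 11180
hits <- findOverlaps(IRanges(11180, 11180), ir)
df[subjectHits(hits), ]

# Merge overlapping intervals into clusters, then pick the cluster
# that contains the query position
merged  <- reduce(ir)
cluster <- merged[subjectHits(findOverlaps(IRanges(11180, 11180), merged))]
cluster
```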

0 Answers