I have one GRanges object with thousands of score columns (tomap), and another with regions of interest and no metadata (roi). I am trying to map the max score from each column in tomap to the corresponding interval in roi.
I also want to retain the names of the score columns (in my real data these are meaningful names and not genralizable like score1, score2 etc...). I can do it for specific columns but am struggling to generalize it to every column.
Here is what I've got so far:
library(GenomicRanges)
tomap <- GRanges(
seqnames = Rle(c("chr1"), c(10)),
ranges = IRanges(1:10*10, end = 1:10*10+5),
score1 = runif(10),score2=runif(10),score3=runif(10),score4=runif(10),score5=runif(10))
roi <- GRanges(
seqnames = Rle(c("chr1"), c(5)),
ranges = IRanges(1:5*20 + floor(runif(5)*4), width = 10))
hits <- findOverlaps(roi, tomap, ignore.strand = TRUE)
ans<-roi
mcols(ans) <- aggregate(tomap, hits, score1=max(score1), score2= max(score2))
ans
#GRanges object with 5 ranges and 3 metadata columns:
# seqnames ranges strand | grouping score1 score2
# <Rle> <IRanges> <Rle> | <ManyToManyGrouping> <numeric> <numeric>
# [1] chr1 22-31 * | 2,3 0.326366489753127 0.925836584065109
# [2] chr1 42-51 * | 4,5 0.92806151532568 0.897841389290988
# [3] chr1 62-71 * | 6,7 0.980487102875486 0.940743445185944
# [4] chr1 83-92 * | 8,9 0.798293181695044 0.381754550151527
# [5] chr1 101-110 * | 10 0.872806148370728 0.953412540955469
As you can see, this works when I specify each score column individually, but how do I do this for thousands of columns?