I have a data frame with 253 rows (locations on a chromosome, in Mbp) and 1 column (the allele score at each location). I need to produce a data frame containing the mean allele score for every 0.5 Mbp interval along the chromosome. Please help with R code that can do this. Thanks.
-3
-
Please read [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask). Stack Overflow is not a code-writing service. – cmaher Mar 26 '18 at 16:42
-
Generally a minimum reproducible example would be preferred. Question: Do you have an 'interval' column? If not, can you generate one? Then you can just restructure with ddply. – SeldomSeenSlim Mar 26 '18 at 16:43
1 Answer
0
The picture in this case is adequate to construct an answer but not adequate to support testing. You should learn to post data in a form that doesn't require re-entry by hand. (That's why you are accumulating negative votes.)
The basic R strategy would be to use `cut` to create a grouping variable and then use a loop construct such as `tapply` to accumulate and apply the mean function. Presumably this is in a dataframe, which I will assume is named something specific like `my_alleles`, with columns `Location` and `Allele_score`:
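tapply(my_alleles$Allele_score,   # act on this vector
       # in groups defined by this factor (0.5 Mbp bins)
       cut(my_alleles$Location,
           # round the top break up to the next 0.5 so the largest location is not dropped
           breaks = seq(0, ceiling(max(my_alleles$Location) * 2) / 2, by = 0.5),
           include.lowest = TRUE  # keep a location of exactly 0 in the first bin
       ),
       # with this function
       FUN = mean)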
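
`tapply` returns a named vector, while the question asks for a dataframe. One possible sketch of that variant follows. The data are simulated stand-ins (the real data were only posted as a picture), and `bin_edges`, `Bin`, `bin_means`, and `Mean_allele_score` are names made up for illustration; `Location` and `Allele_score` are the same assumed column names as in the call above.

```r
# Hypothetical stand-in data: 253 locations (in Mbp) with one allele score each.
set.seed(1)
my_alleles <- data.frame(
  Location     = sort(runif(253, min = 0, max = 30)),
  Allele_score = runif(253)
)

# 0.5 Mbp bin edges; the top edge is rounded up to cover the largest location.
bin_edges <- seq(0, ceiling(max(my_alleles$Location) * 2) / 2, by = 0.5)

# Label every row with the bin it falls in.
my_alleles$Bin <- cut(my_alleles$Location, breaks = bin_edges,
                      include.lowest = TRUE)

# One row per non-empty bin, holding the mean allele score in that bin.
bin_means <- aggregate(Allele_score ~ Bin, data = my_alleles, FUN = mean)
names(bin_means)[2] <- "Mean_allele_score"

head(bin_means)
```

Note that `aggregate` leaves out bins that contain no locations, whereas `tapply` reports them as `NA`.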

IRTFM