I want to generate random frequencies (i.e. the frequencies have to sum to 1) to simulate gene frequencies in a population using R. My solution is:

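freq <- function(x, y)  # random frequency generator, no defined distribution
{
    rn <- sample(1:y, x)      # x distinct integers drawn from 1..y
    total <- sum(rn)          # named 'total' to avoid shadowing sum()
    RG <- numeric(x)          # RG must exist before it is filled
    for (i in 1:x) {
        RG[i] <- rn[i] / total
    }
    return(RG)
}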

Any suggestions on how to constrain the sum to a particular value (e.g. the random numbers have to sum to 100) before the division?

2 Answers

Maybe try this:

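  1. Generate a sample from the uniform distribution on [0, 1]
  2. Sort the values
  3. Add 0 and 1 as end points
  4. Use the sorted values as cut-off points: the differences between consecutive values are the frequencies

This might not be clear, so here is an example. Note that 10 cut-off points give 11 frequencies.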

set.seed(1)
x <- sort(runif(10))
x
## [1] 0.06178627 0.20168193 0.26550866 0.37212390 0.57285336 0.62911404 0.66079779 0.89838968 0.90820779 0.94467527
x <- c(0,x,1)
y <- diff(x)
y
## [1] 0.061786270 0.139895661 0.063826732 0.106615236 0.200729464 0.056260681 0.031683749 0.237591892 0.009818105 0.036467479 0.055324731
sum(y)
## [1] 1
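
The steps above could be wrapped in a small function. This is only a sketch: the name `random_freqs` and the `total` argument are made up for illustration and are not part of the answer. It draws n - 1 cut points so that exactly n frequencies come back, and scales them so they sum to `total`.

```r
# Hypothetical helper (not from the answer above): n random frequencies
# obtained by cutting the interval [0, 1] at n - 1 uniform points.
random_freqs <- function(n, total = 1) {
  stopifnot(n >= 1, total > 0)
  cuts <- sort(runif(n - 1))       # n - 1 sorted cut points in (0, 1)
  total * diff(c(0, cuts, 1))      # n gaps between cut points; they sum to total
}

set.seed(1)
f <- random_freqs(5)
length(f)  # 5
sum(f)     # 1, up to floating-point rounding
```

Calling `random_freqs(5, total = 100)` would give five non-negative real numbers that sum to 100 instead.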
bartektartanus

Try this:

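freq <- function(x, y, ds = 50)  # random frequency generator, no defined distribution
{
  # ds is the desired sum of the random integers before division.
  # Assumes an appropriate range and relation between ds, x and y
  # (in particular y >= x - 1, since sample() draws without replacement).

  ok <- FALSE
  for (i in 1:100) {               # retry up to 100 times
    rn0 <- sample(1:y, x - 1)      # x - 1 distinct integers from 1..y
    if (sum(rn0) < ds) {           # leaves room for a positive last value
      ok <- TRUE
      break
    }
  }
  if (!ok) stop("no draw with sum below ds found; adjust ds, x or y")

  rn <- c(rn0, ds - sum(rn0))      # last value makes the total equal ds

  rn / ds                          # frequencies, which sum to 1
}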
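
If the retry loop is a concern, one alternative (a different technique from the code above, shown only as a sketch) is to pick integer cut points between 1 and the desired sum minus 1 and take the gaps between them. The function name `random_counts` and its arguments are hypothetical. It assumes that positive whole-number counts are wanted and that `total >= n`.

```r
# Hypothetical helper: n positive integer counts that sum exactly to total,
# using sorted integer cut points instead of retrying until the sum fits.
random_counts <- function(n, total = 100) {
  stopifnot(n >= 1, total >= n)
  cuts <- sort(sample(total - 1, n - 1))  # n - 1 distinct cut points in 1..(total - 1)
  diff(c(0, cuts, total))                 # n positive gaps; they sum to total
}

set.seed(1)
counts <- random_counts(5, total = 100)
sum(counts)            # 100
counts / sum(counts)   # frequencies, which sum to 1
```

Because the cut points are distinct, every count is at least 1, so no allele ends up with a frequency of zero.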
Shambho