
Goal

I would like to sample from a bivariate uniform distribution with a specified correlation coefficient in Java.

Question

  • What method can I use to implement such a multivariate uniform distribution?

or

  • Is there an existing package that implements this, so that I don't have to reinvent the wheel?

What I've got so far

The mvtnorm package in R allows sampling from a multivariate normal distribution with specified correlation coefficients. I thought that understanding its method might help me, either by doing something similar with uniform distributions or by repeating their work and using copulas to transform the multivariate normal into a multivariate uniform (as I did in R there).

The source code is written in Fortran, and I don't speak Fortran! The code is based on this paper by Genz and Bretz, but it is too math-heavy for me.
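The copula route mentioned above can be sketched in plain Java without porting mvtnorm. This is only a minimal sketch under stated assumptions: the class name `GaussianCopulaUniform` and its methods are made up for illustration, the normal CDF is an Abramowitz-Stegun approximation (absolute error around 1.5e-7, so the marginals are uniform only to that accuracy), and it relies on the standard Gaussian-copula identity corr(U0, U1) = (6/π)·asin(ρ/2), where ρ is the correlation of the underlying normals.

```java
import java.util.Random;

/** Illustrative sketch: correlated U(0,1) pairs via a Gaussian copula. */
final class GaussianCopulaUniform {
    private final Random rng;
    private final double rho; // correlation of the underlying standard normals

    /** targetR is the desired Pearson correlation of the two uniforms, in [-1, 1]. */
    GaussianCopulaUniform(double targetR, long seed) {
        if (targetR < -1.0 || targetR > 1.0) {
            throw new IllegalArgumentException("targetR must be in [-1, 1]");
        }
        this.rng = new Random(seed);
        // Invert corr(U0, U1) = (6/pi) * asin(rho / 2) to get the normal correlation.
        this.rho = 2.0 * Math.sin(Math.PI * targetR / 6.0);
    }

    /** Returns one pair {u0, u1}, each marginally (approximately) U(0,1). */
    double[] next() {
        double z0 = rng.nextGaussian();
        // Build a second standard normal with correlation rho to z0.
        double z1 = rho * z0 + Math.sqrt(Math.max(0.0, 1.0 - rho * rho)) * rng.nextGaussian();
        // Push each normal through its CDF to obtain uniform marginals.
        return new double[] { phi(z0), phi(z1) };
    }

    /** Standard normal CDF using the Abramowitz-Stegun 7.1.26 approximation of erf. */
    static double phi(double z) {
        double x = Math.abs(z) / Math.sqrt(2.0);
        double t = 1.0 / (1.0 + 0.3275911 * x);
        double poly = t * (0.254829592 + t * (-0.284496736
                + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
        double erf = 1.0 - poly * Math.exp(-x * x);
        return z >= 0 ? 0.5 * (1.0 + erf) : 0.5 * (1.0 - erf);
    }
}
```

Calling `new GaussianCopulaUniform(0.6, 42L).next()` would return one pair with a target correlation of 0.6. If Apache Commons Math is already on the classpath, its `NormalDistribution.cumulativeProbability` could presumably replace the hand-written `phi`.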

Neil Lunn
Remi.b
  • You can use `apache.commons.math` to avoid reinventing the wheel, although it doesn't directly provide a multivariate uniform distribution; you've got e.g. a [multivariate normal distribution](http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/distribution/MultivariateNormalDistribution.html) and a [uniform distribution](http://commons.apache.org/proper/commons-math/apidocs/org/apache/commons/math4/distribution/UniformRealDistribution.html). Anyhow, you can reuse code to build it. ... – lrnzcig May 13 '17 at 14:47
  • ... [continued] You should extend the `AbstractMultivariateRealDistribution` object and copy most of the stuff from `UniformRealDistribution`. The difficult part is, as you mention, combining the two, but maybe you can translate the R code you mention for using copulas? – lrnzcig May 13 '17 at 14:48

1 Answer


I have an idea. Typically, you generate a U(0,1) sample by generating, say, 32 random bits and dividing by $2^{32}$, which gives back one float. For two U(0,1) samples you generate two 32-bit values, divide each, and get two floats back. So far so good. Such a bivariate generator is uncorrelated, which is very simple to check.

Suppose you build your bivariate generator in the following way. Internally, you draw two random 32-bit integers and then produce two U(0,1) values with a shared part. Say you take 24 bits from the first integer and 24 bits from the second, but the upper (or lower, or middle, or ...) 8 bits are the same for both, taken from the first integer and copied into the second.
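A minimal sketch of that construction, assuming the shared part is the top k bits of a 32-bit word (the class name `SharedBitsUniform` and the parameter `sharedBits` are illustrative choices, not an existing API):

```java
import java.util.Random;

/** Illustrative sketch: two U(0,1) values whose top k of 32 bits are identical. */
final class SharedBitsUniform {
    private static final int BITS = 32;
    private static final double SCALE = 1.0 / 4294967296.0; // 2^-32

    private final Random rng;
    private final long highMask; // selects the shared top bits

    SharedBitsUniform(int sharedBits, long seed) {
        if (sharedBits < 0 || sharedBits > BITS) {
            throw new IllegalArgumentException("sharedBits must be in [0, 32]");
        }
        this.rng = new Random(seed);
        this.highMask = sharedBits == 0
                ? 0L
                : (0xFFFFFFFFL << (BITS - sharedBits)) & 0xFFFFFFFFL;
    }

    /** Returns one pair {u0, u1}; each value is uniform on a 2^-32 grid in [0, 1). */
    double[] next() {
        // Two independent unsigned 32-bit integers, held in longs.
        long x0 = rng.nextInt() & 0xFFFFFFFFL;
        long x1 = rng.nextInt() & 0xFFFFFFFFL;
        // Copy the top bits of the first integer into the second one.
        x1 = (x1 & ~highMask) | (x0 & highMask);
        return new double[] { x0 * SCALE, x1 * SCALE };
    }
}
```

With `sharedBits = 0` the two outputs are independent, and with `sharedBits = 32` they are identical. Moving the mask to the low or middle bits changes how much correlation the shared part contributes.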

Clearly, those two U(0,1) would be correlated. We could write them as

$U(0,1)_0 = a_0 + b$

$U(0,1)_1 = a_1 + b$

I omit some coefficients etc. for simplicity. Each one is U(0,1), with mean 1/2 and variance 1/12. Now you have to compute the Pearson correlation as

$r = \dfrac{E[U(0,1)_0 \, U(0,1)_1] - 1/4}{\left(\sqrt{1/12}\right)^2}$

Using the expansion above, it should be easy after some algebra to compute r and compare it with the one you want. You may vary the size of the correlated part b, as well as its position (high bits, low bits, somewhere in the middle), to fit the desired r.
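As a worked version of that algebra, here is a sketch under two assumptions that are mine rather than stated above: the parts $a_0$, $a_1$ and $b$ are mutually independent, and $b$ is the value of the top $k$ bits of a 32-bit word. Writing $U_0$, $U_1$ for the two outputs:

```latex
\begin{aligned}
\operatorname{Cov}(U_0, U_1) &= \operatorname{Cov}(a_0 + b,\; a_1 + b) = \operatorname{Var}(b), \\
\operatorname{Var}(b) &= \frac{2^{2k} - 1}{12} \cdot 2^{-2k} = \frac{1 - 4^{-k}}{12}, \\
r &= \frac{\operatorname{Var}(b)}{\operatorname{Var}(U_0)}
   = \frac{1 - 4^{-k}}{1 - 2^{-64}} \approx 1 - 4^{-k}.
\end{aligned}
```

Under these assumptions, sharing only the top bit already gives r ≈ 0.75, and the 8 shared upper bits from the example give r ≈ 0.99998. By the same argument, sharing the lowest m bits instead gives r = (4^m - 1)/(4^32 - 1), which is about 0.25 for m = 31, so the reachable values of r form a discrete set that depends on where the shared bits sit.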

Realistically speaking, there are infinitely many ways to get the same r with different sampling code and different bivariate distributions. You might want to add more constraints in the future.

Peter O.
Severin Pappadeux