4

Let's say I have the output of a monte-carlo simulation of one variable over several different iterations (think millions). For each iteration, I have the values of the variable at each point in time (ranging from t=1 to t=365).

I would like to produce the following plot: For each point in time, t, on the x axis and for each possible value "y" in a given range, set the color of x,y to "k" where "k" is a count of how many observations are within a vicinity of distance "d" to x,y.

I know you can easily make density heatmaps for 1D data, but is there a good package for doing this on 2 dimensions? Do I have to use kriging?

Edit: The data structure is currently a matrix.

                                     data matrix

                                      day number
             [,1]    [,2]         [,3]      [,4]       [,5]      ... [,365]
iteration    [1,]    0.000213   0.001218    0.000151   0.000108  ... 0.000101
             [2,]    0.000314   0.000281    0.000117   0.000103  ... 0.000305
             [3,]    0.000314   0.000281    0.000117   0.000103  ... 0.000305
             [4,]    0.000171   0.000155    0.000141   0.000219  ... 0.000201
              .
              .
              .
     [100000000,]    0.000141   0.000148    0.000144   0.000226  ... 0.000188

I want to, for each "day" have the pixels running vertically across that "day" to represent the probability density of the iteration's values for that day in color. The result should look like a heatmap.

Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
Alfalfa
  • 131
  • 7
  • Try the `image` function, or `geom_tile`/`geom_raster` in ggplot2. – shadowtalker Jun 13 '15 at 15:59
  • @ssdecontrol if I were to use the image function, how do I handle the x,y coordinates for which I have no data? Would I have to basically construct the x,y matrix for the plot myself by interpolating? I was hoping something existed to do this out of the box... – Alfalfa Jun 13 '15 at 16:16
  • 1
    @Alfalfa missing data is missing data in any context. If you have "holes" in your matrix then you need to handle them. If you mean that your x axis should be something other than "1, 2, 3, 4" then you can just construct the axis yourself. – shadowtalker Jun 13 '15 at 16:17
  • 1
    @Alfalfa there are also some examples [here](http://www.r-bloggers.com/5-ways-to-do-2d-histograms-in-r/) – shadowtalker Jun 13 '15 at 16:19
  • @ssdecontrol the hexbinplot seems like a perfect fit for what I'd like to do. Many thanks! – Alfalfa Jun 13 '15 at 21:25

2 Answers2

3

Here is one solution to what I think you are after.

  1. Generate data.

    myData <- mapply(rnorm, 1000, 200, mean=seq(-50,50,0.5))

This is a matrix with 1000 rows (observations) and 201 time points. In each time point the mean of data there shifts gradually from -50 to 50. By 0.5 each time.

  1. Get densities.

    myDensities <- apply(myData, 2, density, from=-500, to=500)

This will give you a list of densities for each column. In order for them to be plottable side by side we specified the ranges (from -500 to 500) manually.

  1. Obtain density values from the list.

    Ys <- sapply(myDensities, "[", "y")

This is again a list. You need to get a matrix from that.

  1. Get matrix from list.

    img <- do.call(cbind, Ys)

This simply combines all Ys elements by column.

  1. Plot.

    filled.contour(x=1:ncol(img), y=myDensities[[1]]$x, t(img))

I use filled.contour for that. But you can look around for other 2-D plot functions. I also used values obtained from the densities D[[1]]$x.

And here is the result:

densities

The shift from -50 to 50 is visible.

Not sure if this can work well with millions of time points. But plotting million probably makes little sense since you will in any case by limited by the number of pixels. Some kind of pre-processing might be necessary.

Karolis Koncevičius
  • 9,417
  • 9
  • 56
  • 89
  • 1
    Luckily I don't have millions of time points but simply millions of observations per time point. So this will work perfectly. I'll try this out along with the hexbinplot suggested in the comments. Thanks! – Alfalfa Jun 13 '15 at 21:26
  • 1
    Any time filled.contour is the answer then `lattice::levelplot` is also an option. – IRTFM Jun 13 '15 at 21:29
0

Another way to present data over time is to create a video.

The following uses the same matrix data as Karolis:

library(av)

myData <- mapply(rnorm, 1000, 200, mean=seq(-50,50,0.5))

# create function that includes a for loop, the output from 
# each iteration of the for loop will become one frame in
# the animation.
make_plot <- function(myData){
  
  xrange = range(myData)
  
  for(i in seq_along(myData[1,])){
    
    d <- density(myData[,i],
                 bandwidth = 45) # returns the density data
    plot(d, 
         xlim=xrange, 
         ylim=c(0, 0.003), 
         main = paste("Density, day:",i))
 
  }
}

# create video
av_capture_graphics(make_plot(myData),
                    output = "Density change over time.mp4",
                    width = 720,
                    height = 480,
                    framerate = 120)
duncanw
  • 161
  • 4