0

I am working on a dataset with a single regressor, and d many response variables. While working with that, I have encountered a dataframe of correlations, with 1 + dC2 = 1 + d(d-1)/2 which exactly looks like the ones that can be found here. I'm inserting an example of such a dataframe:

   X      r_{1,2}     r_{1,3}      r_{1,4}      r_{1,5}      r_{2,3}      r_{2,4}      r_{.,.}
1 21    0.6002993    0.923644    0.8184414    0.3721132    0.9337539    0.6569090    .........
2 22    0.6498641    0.358339    0.9511748    0.1091543    0.6651190    0.9960394    .........
3 23    0.6825716    0.117533    0.8900186    0.9256916    0.9253819    0.6588873    .........
4 24    0.8280786    0.324110    0.6634117    0.7292685    0.0207334    0.9122315    .........
5 25    0.9520840    0.642721    0.5012283    0.2722650    0.2582217    0.3901019    .........
6 26    0.8714017    0.157062    0.8675581    0.5384571    0.6053657    0.5930488    .........
.  .        .
.  .        .
.  .        .

The dataframe has the first column to be the time variable X, and any other column with title r_{i,j} contains the time-varying correlations between the i-th and j-th variables.

Now, I am trying to make a matrix of plots of time-varying correlations for every pair of variables, using ggplot2 or similar packages. This is probably not very difficult to do manually, but since I have d many variables, the process has to be automated. Essentially, my desired plot looks like the following:

2

Note that:

  • For each plot, the X axis is the same, i.e., the variable X in the dataframe, with axis texts in the far bottom.
  • The Y-scale in each plot should be same, i.e., 0 to 1.
  • Lastly, there should be row and column names like 1,2,3,... in both the rows and columns.

Any help will be appreciated. Please let me know if any clarification is neede. Thanks!

JRC
  • 161
  • 1
  • 11
  • 2
    Would `ggpairs` from the `GGally` package help? [Documentation](https://ggobi.github.io/ggally/reference/ggpairs.html) – Seth Jul 30 '23 at 03:20
  • This looks great. I'll have a detailed look at it. – JRC Jul 30 '23 at 04:43

1 Answers1

2

This is a bit different from a standard pairs plot, such as you might produce with GGally::ggpairs, since the x axis is constant across all plots.

If you want to build this yourself then you could automate using the patchwork package.

First, work out the row and column numbers that are going to appear in your plot, and what their names would be if they were columns in your data frame, even for the blank spaces in your plot that do not have a corresponding column in your data frame:

library(ggplot2)
library(patchwork)

n <- which(choose(seq(20), 2) == ncol(df) - 1)

pairs <- apply(expand.grid(seq(n-1) + 1, seq(n - 1)), 1, function(x) {
  paste0("r_{", x[2], ",", x[1], "}") })

pairs <- c(t(matrix(pairs, nrow = n - 1)))

pairs
#>  [1] "r_{1,2}" "r_{2,2}" "r_{3,2}" "r_{4,2}" "r_{1,3}" "r_{2,3}" "r_{3,3}"
#>  [8] "r_{4,3}" "r_{1,4}" "r_{2,4}" "r_{3,4}" "r_{4,4}" "r_{1,5}" "r_{2,5}"
#> [15] "r_{3,5}" "r_{4,5}"

Now we can create a plot for member of pairs. If it does not exist as a column in the data frame, we create an empty plot. If it is on the top row or the rightmost column, we add a single facet. If it is on the bottom or right, we use axis text, and remove it otherwise:

plots <- lapply(pairs, function(i) {
  is_bottom <- grepl(paste0(",", n), i)
  is_right  <- grepl(paste0(n - 1, ","), i)
  is_left   <- grepl("1,", i)
  is_top    <- grepl(",2", i)
  is_col <- i %in% names(df)
  rowname <- sub("^r_\\{(\\d+),.*$", "\\1", i)
  colname <- sub("^.*,(\\d+).*$", "\\1", i)
  if(!is_col) {
    p <- ggplot(df, aes(X, runif(length(x)))) + geom_blank()
  } else {
    p <- ggplot(df, aes(X, .data[[i]])) +
         geom_line(linewidth = 1.5, color = "gray50", lineend = "round")
  }
  p <- p + scale_y_continuous(NULL, limits = c(0, 1)) +
           scale_x_continuous(NULL, 1:10) +
           theme_classic() 
  
  f <- quote(.~.)
  if(is_top) f[3] <- rowname
  if(is_right) f[2] <- colname
  if(is_top | is_right) p <- p + facet_grid(f)

  p + theme(axis.text.x = if(is_bottom) element_text()  else  element_blank(),
          axis.title.x = if(is_bottom) element_text()  else  element_blank(),
          axis.text.y = if(is_left) element_text()  else  element_blank(),
          panel.border = if(is_col) element_rect(fill = NA) else element_blank(),
          axis.ticks = if(is_col) element_line() else element_blank(),
          axis.line = element_blank())
})

To plot it, we can do:

Reduce(`+`, plots) +
  plot_annotation(caption = "X", theme = theme(
    plot.caption = element_text(hjust = 0.5, size = 12)))

enter image description here


EDIT

To plot a facet with the column name above each panel, change the creation of plots in the above code to:

plots <- lapply(pairs, function(i) {
  is_bottom <- grepl(paste0(",", n), i)
  is_right  <- grepl(paste0(n - 1, ","), i)
  is_left   <- grepl("1,", i)
  is_top    <- grepl(",2", i)
  is_col <- i %in% names(df)
  if(!is_col) {
    p <- ggplot(df, aes(X, runif(length(X)))) + geom_blank()
  } else {
    p <- ggplot(df, aes(X, .data[[i]])) +
      geom_line(linewidth = 1.5, color = "gray50", lineend = "round")
  }
  p <- p + scale_y_continuous(NULL, limits = c(0, 1)) +
    scale_x_continuous(NULL, 1:10) +
    theme_classic() 
  
  f <- quote(.~.)
  f[3] <- i
  if(is_col) p <- p + facet_grid(f)
  
  p + theme(axis.text.x = if(is_bottom) element_text()  else  element_blank(),
            axis.title.x = if(is_bottom) element_text()  else  element_blank(),
            axis.text.y = if(is_left) element_text()  else  element_blank(),
            panel.border = if(is_col) element_rect(fill = NA) else element_blank(),
            axis.ticks = if(is_col) element_line() else element_blank(),
            axis.line = element_blank())
})

And when plotted, this produces

enter image description here


Data used

set.seed(1)

df <- cbind(X = 1:10,
      replicate(10, sort(runif(10))) |>
      as.data.frame() |>
      setNames(combn(1:5, 2, FUN = \(x) paste0("r_{", x[1], ",", x[2], "}"))))

df
#>     X    r_{1,2}   r_{1,3}    r_{1,4}   r_{1,5}   r_{2,3}    r_{2,4}    r_{2,5}
#> 1   1 0.06178627 0.1765568 0.01339033 0.1079436 0.0233312 0.07067905 0.08424691
#> 2   2 0.20168193 0.2059746 0.12555510 0.1862176 0.4772301 0.09946616 0.25801678
#> 3   3 0.26550866 0.3800352 0.21214252 0.4112744 0.5297196 0.24479728 0.29360337
#> 4   4 0.37212390 0.3841037 0.26722067 0.4820801 0.5530363 0.31627171 0.33239467
#> 5   5 0.57285336 0.4976992 0.34034900 0.4935413 0.6470602 0.40683019 0.45906573
#> 6   6 0.62911404 0.6870228 0.38238796 0.5995658 0.6927316 0.43809711 0.47854525
#> 7   7 0.66079779 0.7176185 0.38611409 0.6684667 0.7323137 0.47761962 0.65087047
#> 8   8 0.89838968 0.7698414 0.65167377 0.7237109 0.7829328 0.51863426 0.76631067
#> 9   9 0.90820779 0.7774452 0.86969085 0.7942399 0.7893562 0.66200508 0.87532133
#> 10 10 0.94467527 0.9919061 0.93470523 0.8273733 0.8209463 0.86120948 0.91287592
#>      r_{3,4}   r_{3,5}    r_{4,5}
#> 1  0.3337749 0.1216919 0.05893438
#> 2  0.3390729 0.1433044 0.23962942
#> 3  0.3466835 0.2026923 0.41008408
#> 4  0.3899895 0.2454885 0.45527445
#> 5  0.4763512 0.3253522 0.60493329
#> 6  0.7773207 0.3999944 0.64228826
#> 7  0.8394404 0.4346595 0.77891468
#> 8  0.8643395 0.7111212 0.79730883
#> 9  0.8921983 0.7125147 0.81087024
#> 10 0.9606180 0.7570871 0.87626921

Created on 2023-07-30 with reprex v2.0.2

Allan Cameron
  • 147,086
  • 7
  • 49
  • 87
  • Incredible! Thanks a lot. However, I have the following query: Is it possible to write the names of the pairs of variables on the top of every plot in a way exactly similar to what we see in case we use `facet_wrap` or `facet_grid`? – JRC Jul 30 '23 at 10:41
  • 1
    @JRC do you mean a strip at the top of each of the 10 panels, or four along the top and four along the bottom? – Allan Cameron Jul 30 '23 at 11:14
  • I was actually looking for the first kind that you mentioned. That is, a strip at the top of each of the 10 panels saying "r_{i,j}" for the plot lying in the intersection of the i-th row and j-th column. I'm not really sure how to do that. It'll be really helpful if you could help. Thanks! – JRC Jul 30 '23 at 16:54
  • @JRC ah! My bad luck to spend so long on the wrong guess. Anyway, it's a quick fix - see my edit – Allan Cameron Jul 30 '23 at 17:20
  • 1
    Absolutely wonderful. Thanks a lot for your time and efforts. – JRC Jul 30 '23 at 19:02