I just tried to use pivot_longer on a 2D matrix to get "tidy" data for ggplot. So far this was pretty straightforward with reshape2::melt:

library(tidyverse)
library(reshape2)

x <- c(1, 2, 3, 4)
y <- c(1, 2, 3)

Data      <- matrix(round(rnorm(12, 10, 4)), nrow = 4, ncol = 3)
melt_data <- reshape2::melt(Data)

ggplot2::ggplot(melt_data, ggplot2::aes(x = Var1, y = Var2, fill = value)) +
   geom_tile()

However, pivot_longer needs a tibble or data.frame, so I came up with the following function:

matrix_longer <- function(.data) {
  stopifnot(is.matrix(.data),
            !is.data.frame(.data))

  .data <- as.data.frame(.data)
  names(.data) <- 1:ncol(.data)

  .data$Var1 <- 1:nrow(.data)

  # Convert Var2 to numeric before sorting so that column 10 sorts after column 9
  pivot_longer(.data, cols = as.character(1:(ncol(.data) - 1)),
               names_to = "Var2", values_to = "value") %>%
    mutate(Var2 = as.numeric(Var2)) %>%
    arrange(Var2)
}

And it produces the same output:

own_data <- matrix_longer(Data)

ggplot2::ggplot(own_data, ggplot2::aes(x = Var1, y = Var2, fill = value)) +
   geom_tile()

all(own_data==melt_data)

The question is: Is there a better solution? Should/can I just stick with reshape2::melt? Is it a bad idea to use .data as an argument name?

SebSta
  • Is there a reason why you want to start with a matrix and not a data.frame/tibble? After all, tibbles are the format advocated for ggplot. Regarding `.data` I think it depends on whether you use `.data` from the rlang package: https://rlang.r-lib.org/reference/tidyeval-data.html – robertdj Feb 28 '20 at 10:30
  • I start with a matrix because that's the format I have at hand. I have to transform it for pivot_longer anyway, so I would prefer to do all of that in one function call. – SebSta Feb 28 '20 at 10:36

1 Answer

To get a three-column dataframe of row and column indices and values from a matrix you can simply use as.data.frame.table():

set.seed(9)
Data <- matrix(round(rnorm(12, 10, 4)), nrow = 4, ncol = 3)

as.data.frame.table(Data, responseName = "value")

   Var1 Var2 value
1     A    A    10
2     B    A     9
3     C    A    17
4     D    A     7
5     A    B    10
6     B    B     0
7     C    B    14
8     D    B     7
9     A    C    17
10    B    C    11
11    C    C     9
12    D    C    14
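
The letters appear because the matrix has no dimnames, so placeholder labels are generated. As a small sketch (the matrix `m` and its row and column names are made up for illustration), a matrix that does carry dimnames should keep them as factor levels in the result:

```r
# Hypothetical example matrix with explicit row and column names
m <- matrix(1:6, nrow = 2, ncol = 3,
            dimnames = list(c("r1", "r2"), c("c1", "c2", "c3")))

# Var1 holds the row names, Var2 the column names, value the cell contents
long <- as.data.frame.table(m, responseName = "value")
long
```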

If you want the indices to be integers rather than alphanumeric values (factors by default), you can do:

library(dplyr)

as.data.frame.table(Data, responseName = "value") %>%
  mutate_if(is.factor, as.integer)

   Var1 Var2 value
1     1    1    10
2     2    1     9
3     3    1    17
4     4    1     7
5     1    2    10
6     2    2     0
7     3    2    14
8     4    2     7
9     1    3    17
10    2    3    11
11    3    3     9
12    4    3    14
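
In dplyr 1.0.0 and later, mutate_if() is superseded by across(). Assuming such a version is installed, the same conversion could be sketched as follows (the name `long` is only illustrative):

```r
library(dplyr)

set.seed(9)
Data <- matrix(round(rnorm(12, 10, 4)), nrow = 4, ncol = 3)

# Convert every factor column (Var1, Var2) to its integer level index
long <- as.data.frame.table(Data, responseName = "value") %>%
  mutate(across(where(is.factor), as.integer))
```

The factor levels are created in row and column order, so the integer codes correspond to the original matrix indices.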
Ritchie Sacramento