Create dataframe from a matrix

Question

How to get a data frame with the same data as an already existing matrix has?

A simplified example of my matrix:

mat <- matrix(c(0, 0.5, 1, 0.1, 0.2, 0.3, 0.3, 0.4, 0.5),
              ncol = 3, nrow = 3,
              dimnames = list(NULL, c("time", "C_0", "C_1")))

> mat
     time C_0 C_1
[1,]  0.0 0.1 0.3
[2,]  0.5 0.2 0.4
[3,]  1.0 0.3 0.5

I would like to create a data frame that looks like this:

     name   time   val
1    C_0    0.0    0.1
2    C_0    0.5    0.2
3    C_0    1.0    0.3
4    C_1    0.0    0.3
5    C_1    0.5    0.4
6    C_1    1.0    0.5

All my attempts are quite clumsy, for example:

data.frame(cbind(c(rep("C_1", 3), rep("C_2", 3)),
                 rbind(cbind(mat[,"time"], mat[,"C_0"]),
                       cbind(mat[,"time"], mat[,"C_1"]))))

Does anyone have an idea of how to do this more elegantly? Please note that my real data has a few more columns (40 columns).

You are looking for `?melt`. Also search for `[r] reshape2` in the SO search box — Ricardo Saporta, Apr 08 '13 at 17:22
More info here: [www.statmethods.net/management/reshape.html](http://www.statmethods.net/management/reshape.html). — Backlin, Apr 08 '13 at 17:23

score 53 · Accepted Answer · edited Feb 03 '20 at 18:56

If you change your time column into row names, then you can use as.data.frame(as.table(mat)) for simple cases like this.

Example:

data <- c(0.1, 0.2, 0.3, 0.3, 0.4, 0.5)
dimnames <- list(time=c(0, 0.5, 1), name=c("C_0", "C_1"))
mat <- matrix(data, ncol=2, nrow=3, dimnames=dimnames)
as.data.frame(as.table(mat))
  time name Freq
1    0  C_0  0.1
2  0.5  C_0  0.2
3    1  C_0  0.3
4    0  C_1  0.3
5  0.5  C_1  0.4
6    1  C_1  0.5

In this case time and name are both factors. You may want to convert time back to numeric, or it may not matter.

score 12 · Answer 2 · answered Apr 08 '13 at 17:37

12

You can use stack from the base package. But, you need first to coerce your matrix to a data.frame and to reorder the columns once the data is stacked.

mat <- as.data.frame(mat)
res <- data.frame(time= mat$time,stack(mat,select=-time))
res[,c(3,1,2)]

  ind time values
1 C_0  0.0    0.1
2 C_0  0.5    0.2
3 C_0  1.0    0.3
4 C_1  0.0    0.3
5 C_1  0.5    0.4
6 C_1  1.0    0.5

Note that stack is generally more efficient than the reshape2 package.

answered Apr 08 '13 at 17:37

agstudy

119,832
17
199
261

1

I prefer the `stack` solution, also because I won't need to load `reshape2` – user1981275 Apr 08 '13 at 17:58
1

(Technically, `stack` is in the `utils` package, loaded by default in interactive R sessions.) – r2evans Aug 29 '19 at 17:00

Josh O'Brien · Answer 3 · 2013-04-08T17:37:00.913

6

melt() from the reshape2 package gets you close ...

library(reshape2)
(res <- melt(as.data.frame(mat), id="time"))
#   time variable value
# 1  0.0      C_0   0.1
# 2  0.5      C_0   0.2
# 3  1.0      C_0   0.3
# 4  0.0      C_1   0.3
# 5  0.5      C_1   0.4
# 6  1.0      C_1   0.5

... although you may want to post-process its results to get your preferred column names and ordering.

setNames(res[c("variable", "time", "value")], c("name", "time", "val"))
#   name time val
# 1  C_0  0.0 0.1
# 2  C_0  0.5 0.2
# 3  C_0  1.0 0.3
# 4  C_1  0.0 0.3
# 5  C_1  0.5 0.4
# 6  C_1  1.0 0.5

edited Apr 08 '13 at 17:37

answered Apr 08 '13 at 17:31

Josh O'Brien

159,210
26
366
455

Thanks! I tried `melt(mat, id="time")` after the comments above but couldn't figure out that I need `as.data.frame(mat)` – user1981275 Apr 08 '13 at 17:39

score 1 · Answer 4 · answered Mar 10 '18 at 02:29

1

Using dplyr and tidyr:

library(dplyr)
library(tidyr)

df <- as_data_frame(mat) %>%      # convert the matrix to a data frame
  gather(name, val, C_0:C_1) %>%  # convert the data frame from wide to long
  select(name, time, val)         # reorder the columns

df
# A tibble: 6 x 3
   name  time   val
  <chr> <dbl> <dbl>
1   C_0   0.0   0.1
2   C_0   0.5   0.2
3   C_0   1.0   0.3
4   C_1   0.0   0.3
5   C_1   0.5   0.4
6   C_1   1.0   0.5

answered Mar 10 '18 at 02:29

sbha

9,802
2
74
62

1

Any tips for how to do this on one of the triangles of a matrix? I have a mirrored matrix so I end up with duplicate values. using upper.tri strips out the names. – jzadra May 31 '18 at 18:19
1

Hmm - not sure off the top of my head. That might be better broken off as its own separate question with some example data. – sbha May 31 '18 at 22:21

score 0 · Answer 5 · edited Aug 01 '23 at 15:01

0

I've found the following "cheat" to work very neatly and error-free

dimnames <- list(time=c(0, 0.5, 1), name=c("C_0", "C_1"))
mat <- matrix(data, ncol=2, nrow=3, dimnames=dimnames)
head(mat, 2) #this returns the number of rows indicated in a data frame format
df <- data.frame(head(mat, 2)) #"data.frame" might not be necessary

Et voila!

edited Aug 01 '23 at 15:01

Amit Kohli

2,860
2
24
44

answered Dec 28 '16 at 20:30

Gregory

121
1
5

score 0 · Answer 6 · answered Mar 30 '22 at 00:09

A Renewed Approach using pivot_longer from the tidyr package and dplyr syntax. For get the same output

> mat <- matrix(c(0, 0.5, 1, 0.1, 0.2, 0.3, 0.3, 0.4, 0.5),
+               ncol = 3, nrow = 3,
+               dimnames = list(NULL, c("time", "C_0", "C_1")))
> mat
     time C_0 C_1
[1,]  0.0 0.1 0.3
[2,]  0.5 0.2 0.4
[3,]  1.0 0.3 0.5

mat %>% as_tibble() %>%
  pivot_longer(cols=-time) %>%
  select(name,time,value) %>% 
  arrange(name)  

# A tibble: 6 x 3
  name   time value
  <chr> <dbl> <dbl>
1 C_0     0     0.1
2 C_0     0.5   0.2
3 C_0     1     0.3
4 C_1     0     0.3
5 C_1     0.5   0.4
6 C_1     1     0.5

Create dataframe from a matrix

6 Answers6

Linked

Related