I am trying to convert the current data structure into the wanted one. I only partially know the schematics of the wanted structure: it contains one more list(...) level and a column of class factor.

Current data structure

> print(dat.m)

         [,1] [,2]
ave_max  150   61
ave       60    0
lepo      41    0

dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L), .Dimnames = list(
    c("ave_max", "ave", "lepo"), NULL))

Wanted data structure

> print(dat.m)

     Vars    M1    M2 
1 ave_max   150    61 
2 ave        60     0 
3 lepo       41     0 

I know it is schematically close to the following, where the arguments elided as c(...) in structure(c(...), ...) and in row.names = c(...) are unknown:

structure(list(Vars = structure(c(...), .Label = c("ave_max", 
"ave", "lepo"), class = "factor"), M1 = c(150, 60, 
41), M2 = c(61, 0, 0)), .Names = c("Vars", "M1", "M2"), 
class = "data.frame", row.names = c(...))

R: 3.4.0 (backports)
OS: Debian 8.7

Léo Léopold Hertz 준영

2 Answers

We can use the tidyverse:

library(tidyverse)
dat.m %>% 
    as.data.frame() %>% 
    rownames_to_column('Vars') %>%
    rename(M1 = V1, M2 = V2)
#     Vars  M1 M2
#1 ave_max 150 61
#2     ave  60  0
#3    lepo  41  0

If we need to use data.table:

library(data.table)
setnames(setDT(as.data.frame(dat.m), keep.rownames = TRUE), c('Vars', 'M1', 'M2'))[]
akrun
  • @LéoLéopoldHertz준영 I updated with a `data.table` method. I am not sure why you get that message. I have data.table_1.10.5 and the devel version of dplyr – akrun May 21 '17 at 17:23
  • Assume you have `n` events, here only two. How can you expand `setnames(setDT(as.data.frame(...))` to cover `M1`, `M2`, ..., `Mn`? – Léo Léopold Hertz 준영 May 21 '17 at 20:25
  • @LéoLéopoldHertz준영 You can do `setnames(setDT(as.data.frame(...)), c('Vars', paste0('M', seq_len(n))))` – akrun May 22 '17 at 04:12
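
For comparison, the same idea can be sketched in base R only. This is a minimal sketch, not taken from either answer: it assumes the matrix from the question, derives `n` from `ncol(dat.m)`, and stores `Vars` as a factor whose levels follow the row order, as in the wanted structure. The name `df` is only illustrative.

```r
# Example matrix from the question
dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L),
                   .Dimnames = list(c("ave_max", "ave", "lepo"), NULL))

# as.data.frame() names the unnamed matrix columns V1, V2, ...
df <- as.data.frame(dat.m)

# Rename to M1, M2, ..., Mn for any number of columns
names(df) <- paste0("M", seq_len(ncol(dat.m)))

# Prepend the row names as a factor column, keeping the row order as level order
df <- cbind(Vars = factor(rownames(dat.m), levels = rownames(dat.m)), df)

# Replace the character row names with the default 1, 2, 3, ...
rownames(df) <- NULL
df
#      Vars  M1 M2
# 1 ave_max 150 61
# 2     ave  60  0
# 3    lepo  41  0
```

Calling `dput(df)` on the result shows the `structure(list(...), class = "data.frame", ...)` form sketched in the question.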

If you don't insist on M1, M2, etc. as column names, there is an even shorter data.table solution:

library(data.table)   # CRAN version 1.10.4 used
as.data.table(dat.m, keep.rownames = "Vars")
#      Vars  V1 V2
#1: ave_max 150 61
#2:     ave  60  0
#3:    lepo  41  0

If you do insist on M1, M2, etc. as column names and your matrix dat.m has many columns, the columns can be renamed:

DT <- as.data.table(dat.m, keep.rownames = "Vars")
setnames(DT, stringr::str_replace(names(DT), "^V(?=\\d+$)", "M"))
DT
#      Vars  M1 M2
#1: ave_max 150 61
#2:     ave  60  0
#3:    lepo  41  0

The regular expression uses a look-ahead assertion to ensure that only column names consisting of V followed by nothing but one or more digits are changed. Others like Vars, V, V17b, VV3 aren't touched.
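
The behaviour of the pattern can be checked on a few names in isolation. The sketch below swaps in base R's `sub()` with `perl = TRUE` for `stringr::str_replace()`, so that it runs without extra packages; the vector `nms` is made up for illustration.

```r
# Hypothetical column names, including the edge cases mentioned above
nms <- c("Vars", "V", "V1", "V12", "V17b", "VV3")

# perl = TRUE enables the look-ahead (?=...) in base R
sub("^V(?=\\d+$)", "M", nms, perl = TRUE)
# [1] "Vars" "V"    "M1"   "M12"  "V17b" "VV3"
```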


If your matrix has many columns and the purpose of your operation is not just to have nice column headers for printing, you may consider reshaping your data from wide to long form. The long form is preferred by ggplot2, for instance.

DT_long <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars")
DT_long
#      Vars variable value
#1: ave_max       V1   150
#2:     ave       V1    60
#3:    lepo       V1    41
#4: ave_max       V2    61
#5:     ave       V2     0
#6:    lepo       V2     0

In long form, it is often easier to manipulate your data, for instance, to rename the former column names, which are now values of the variable column:

DT_long[, variable := stringr::str_replace(variable, "^V", "M")]
DT_long
#      Vars variable value
#1: ave_max       M1   150
#2:     ave       M1    60
#3:    lepo       M1    41
#4: ave_max       M2    61
#5:     ave       M2     0
#6:    lepo       M2     0

Finally, you can reshape from long to wide form again:

dcast(DT_long, Vars ~ ...)
#      Vars  M1 M2
#1:     ave  60  0
#2: ave_max 150 61
#3:    lepo  41  0

Note that the cast formula recognizes two special variables: `.` and `...`. Here, `.` represents no variable, while `...` represents all variables not otherwise mentioned in the formula. (See ?data.table::dcast for details.)
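
To illustrate what `...` stands for here, the following sketch compares the `...` formula with one that names the remaining column explicitly. It assumes the example matrix from the question and passes `value.var = "value"` explicitly rather than relying on the guessed default; the names `wide_dots` and `wide_explicit` are only illustrative. In this data, `variable` is the only column left besides `Vars` and the value column, so both calls should return the same table.

```r
library(data.table)

# Example matrix from the question
dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L),
                   .Dimnames = list(c("ave_max", "ave", "lepo"), NULL))

DT_long <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars")

# `...` expands to every column not named in the formula and not used as value.var
wide_dots     <- dcast(DT_long, Vars ~ ..., value.var = "value")

# The same cast with the remaining column spelled out
wide_explicit <- dcast(DT_long, Vars ~ variable, value.var = "value")

all.equal(wide_dots, wide_explicit)
```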

Uwe
  • @LéoLéopoldHertz준영 I've added an explanation of `...`. – Uwe May 23 '17 at 09:23
  • Difficult to answer properly without seeing the data (See [mcve]). Please, post a new question and include the result of `dput(DT_long)`. Thank you. – Uwe May 23 '17 at 09:56