
I am trying to apply dcast to a long table, continuing from the answer in the thread "How to get this data structure in R?".

Code

dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L), .Dimnames = list(
    c("ave_max", "ave", "lepo"), NULL))
library("ggplot2")
library("data.table")
dat.m <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars") # https://stackoverflow.com/a/44128640/54964
dat.m

print("New step")
# http://stackoverflow.com/a/44090815/54964
minmax <- dat.m[dat.m$Vars %in% c("ave_max","lepo"), ]
absol  <- dat.m[dat.m$Vars %in% c("ave"), ]
#minm   <- dcast(minmax, Vars ~ variable)
minm   <- dcast(minmax, Vars ~ ...)
absol <- merge(absol, minm, by = "Vars", all.x = T)

absol

#Test function    
ggplot(absol, aes(x = Vars, y = value, fill = variable)) +
       geom_bar(stat = "identity") +
       geom_errorbar(aes(ymin = lepo, ymax = ave_max), width = .25)

Output

dcast, melt

      Vars variable value
1: ave_max       V1   150
2:     ave       V1    60
3:    lepo       V1    41
4: ave_max       V2    61
5:     ave       V2     0
6:    lepo       V2     0
[1] "New step"
   Vars variable value V1 V2
1:  ave       V1    60 NA NA
2:  ave       V2     0 NA NA
Error in FUN(X[[i]], ...) : object 'lepo' not found
Calls: <Anonymous> ... by_layer -> f -> <Anonymous> -> f -> lapply -> FUN -> FUN
Execution halted

Expected output: the ggplot test call above should run without error.

Testing Uwe's proposal

Aim is to get to this data structure

dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L), .Dimnames = list(c("ave_max", "ave", "lepo"), NULL)) 

from this data structure

dat.m <- structure(list(ave_max = c(15L, 6L), ave = c(6L, NA), lepo = c(4L, NA)), .Names = c("ave_max", "ave", "lepo"), class = "data.frame", row.names = c(NA, -2L))

Attempts

dat.m <- structure(list(ave_max = c(15L, 6L), ave = c(6L, NA), lepo = c(4L, NA)), .Names = c("ave_max", "ave", "lepo"), class = "data.frame", row.names = c(NA, -2L))

# ...
  1. Code and output

    dat.m <- setDT(dat.m)
    

    Output wrong

            ave_max      ave      lepo
    1:           15        6         4
    2:            6       NA        NA
    Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
      $ ave_max: int  15 6
      $ ave    : int  6 NA
      $ lepo   : int  4 NA
      - attr(*, ".internal.selfref")=<externalptr> 
    
  2. Code and output

    dat.m <- as.matrix(dcast(melt(setDT(dat.m), measure.vars = names(dat.m)), variable ~ rowid(variable))[, variable := NULL]); 
    dimnames(dat.m) <- list(names(dat.m), NULL);
    

    Output wrong

     Error in `:=`(variable, NULL) : 
    Check that is.data.table(DT) == TRUE. Otherwise, := and `:=`(...) are defined for use in j, once only and in particular ways. 
     See help(":=").
    

R: 3.4.0 (backports)
OS: Debian 8.7.

Léo Léopold Hertz 준영
  • Please, remove the call to `library("reshape2")` as `data.table` has its own fast implementations of `melt()` and `dcast()`. – Uwe May 23 '17 at 10:12
  • And also see my [answer](https://stackoverflow.com/a/44131736/3817004) to your question [How to do histograms of this row-column table in R ggplot?](https://stackoverflow.com/q/44030346/3817004). – Uwe May 23 '17 at 10:14
  • It is very difficult to answer your questions if you (1) edit essential parts of your Q, (2) use the same variable name `dat.m` for different objects. Better to stay with `dat.m` for a matrix object and `dat.df` for a data.frame object. My answer below has been tested to produce the result as shown from the given data. – Uwe May 23 '17 at 16:40

2 Answers


The OP has supplied data as a matrix:

dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L), .Dimnames = list(
  c("ave_max", "ave", "lepo"), NULL))

#    dat.m
#        [,1] [,2]
#ave_max  150   61
#ave       60    0
#lepo      41    0
class(dat.m)
#[1] "matrix"

For this data set, the OP wants to use ggplot2 to create a bar chart with error bars where the height of the bars is given by the values of ave and the lower and upper limits of the error bars by lepo and ave_max, resp., in each column.

As ggplot2 expects data to be supplied as a data.frame, the data needs to be transformed. For this, data.table is used:

library(data.table)   # CRAN version 1.10.4 used

# convert to data.table & transpose
transposed <- dcast(melt(as.data.table(dat.m, keep.rownames = "Vars"), 
                         id.vars = "Vars"), variable ~ ...)
setnames(transposed, "variable", "Vars")

library(ggplot2)
ggplot(transposed, aes(x = Vars, y = ave, ymin = lepo, ymax = ave_max)) +
  geom_col() +
  geom_errorbar(width = .25)
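
Before plotting, it can help to inspect the reshaped table. In the question's attempt, `dcast(minmax, Vars ~ ...)` produced columns named V1 and V2, so `lepo` and `ave_max` never existed as columns for `aes()` to find. The following is a minimal sketch that rebuilds `dat.m` from the question so it runs on its own; the intermediate name `long` is introduced here only for illustration.

```r
library(data.table)

# Input matrix as given in the question
dat.m <- structure(c(150L, 60L, 41L, 61L, 0L, 0L), .Dim = c(3L, 2L),
                   .Dimnames = list(c("ave_max", "ave", "lepo"), NULL))

# Long format: one row per (Vars, variable) pair
long <- melt(as.data.table(dat.m, keep.rownames = "Vars"), id.vars = "Vars")

# Wide format: one row per original matrix column (V1, V2),
# one column per original row name
transposed <- dcast(long, variable ~ ...)
setnames(transposed, "variable", "Vars")

# The columns referenced in aes() must exist by name
names(transposed)
stopifnot(all(c("ave", "lepo", "ave_max") %in% names(transposed)))
```

If the `stopifnot()` check passes, the `ggplot()` call has every column it refers to.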
Uwe
  • @LéoLéopoldHertz준영 Try `dat.m2 <- as.matrix(dcast(melt(setDT(dat.df), measure.vars = names(dat.df)), variable ~ rowid(variable))[, variable := NULL]); dimnames(dat.m2) <- list(names(dat.df), NULL); dat.m2` – Uwe May 23 '17 at 14:54
  • @LéoLéopoldHertz준영 How is `DT` created? What returns `str(DT)`? – Uwe May 23 '17 at 15:15
  • @LéoLéopoldHertz준영 You need to coerce the data.frame to data.table using `setDT(dat.m)`. – Uwe May 23 '17 at 15:45

The OP has edited his question and is supplying the data as a data.frame:

dat.df <- structure(list(ave_max = c(15L, 6L), ave = c(6L, NA), lepo = c(4L, NA)), 
                    .Names = c("ave_max", "ave", "lepo"), class = "data.frame", 
                    row.names = c(NA, -2L))

dat.df
#  ave_max ave lepo
#1      15   6    4
#2       6  NA   NA
class(dat.df)
#[1] "data.frame"

He is now asking to transform this data.frame into a matrix which is similar to the one used as input data in this answer.

This can be achieved by using data.table:

library(data.table)   # CRAN version 1.10.4 used
# transpose the input data frame, use rowid() to create columns,
# remove a character column to ensure matrix will be of type integer,
# finally, coerce to matrix
dat.m2 <- as.matrix(
  data.table::dcast(
    data.table::melt(setDT(dat.df), measure.vars = names(dat.df)), 
    variable ~ rowid(variable)
  )[, variable := NULL]
)
# add row names, remove column names
dimnames(dat.m2) <- list(names(dat.df), NULL)

dat.m2
#        [,1] [,2]
#ave_max   15    6
#ave        6   NA
#lepo       4   NA

str(dat.m2)
# int [1:3, 1:2] 15 6 4 6 NA NA
# - attr(*, "dimnames")=List of 2
#  ..$ : chr [1:3] "ave_max" "ave" "lepo"
#  ..$ : NULL

class(dat.m2)
#[1] "matrix"
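
As a cross-check on the result, and not part of the approach above, the same shape can be reached with base R alone. This sketch assumes every column of the data frame is an integer, as in the question; the name `dat.m3` is made up for the comparison.

```r
# Input data frame as given in the question
dat.df <- data.frame(ave_max = c(15L, 6L), ave = c(6L, NA), lepo = c(4L, NA))

# as.matrix() keeps the integer type because all columns are integer;
# t() swaps rows and columns
dat.m3 <- t(as.matrix(dat.df))

# Same final step as above: row names from the column names, no column names
dimnames(dat.m3) <- list(names(dat.df), NULL)

dat.m3
str(dat.m3)
```

If the two results differ, the most likely cause is that a different `melt()` or `dcast()` was picked up than the one intended.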

Edit: I've amended the above code to use the double colon operator to explicitly state the namespace from which melt() and dcast() should be taken. Normally, this wouldn't be necessary as data.table is already loaded. However, the OP is reporting issues which might be caused by package reshape2 being loaded after data.table. The data.table package has its own faster implementations of reshape2::dcast() and reshape2::melt(). When both packages have been loaded, name clashes can occur.
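
To check which implementation an unqualified call would pick up in a given session, something like the following can be used. It is a small diagnostic sketch that relies only on base R and utils helpers, and it assumes data.table is installed; reshape2 is deliberately not loaded here.

```r
library(data.table)

# Namespace that an unqualified call currently resolves to
environmentName(environment(melt))
environmentName(environment(dcast))

# Attached packages that provide an object of this name, in search order
find("melt")

# The qualified form does not depend on the search order
environmentName(environment(data.table::dcast))
```

If the first two calls report anything other than data.table, another package is masking the functions and the `data.table::` prefix is the safer choice.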

Uwe
  • Make sure that the input data set is a data.table or a data.frame which is coerced by using `setDT()`. In addition, I respectfully suggest disclosing the full context of your various questions. For instance, the link you cited is not about using `data.table` within a function but within a function which is part of a package. – Uwe May 23 '17 at 20:31
  • Found my mistake: `library(reshape2)` was called in another function, which complicated the case. Maybe I should start using some API to limit such events in the future, for example to declare which packages are acceptable in functions. Is that possible? – Léo Léopold Hertz 준영 May 23 '17 at 22:16
  • Where possible, I prefer to use the double colon operator, e.g., `stringr::str_replace()` rather than loading the whole package, e.g., `library(stringr)` and somewhere else in the code `str_replace()`. So, it might help to write `data.table::melt()` or `data.table::dcast()` just in case `library(reshape2)` has been loaded. – Uwe May 23 '17 at 23:04