Very well-written question.
I see you based your example on the ?as.data.cube examples, so I will try to answer your question using those examples too.
# Original example goes as follows
library(data.cube)
library(data.table)
set.seed(1L)
dt = CJ(color = c("green","yellow","red"),
year = 2011:2015,
status = c("active","inactive","archived","removed"))[sample(30)]
dt[, "value" := sample(4:7/2, nrow(dt), TRUE)]
dc = as.data.cube(
x = dt, id.vars = c("color","year","status"),
measure.vars = "value",
hierarchies = sapply(c("color","year","status"),
function(x) list(setNames(list(character()), x)),
simplify=FALSE)
)
str(dc)
Your error seems to be raised when checking the validity of the hierarchies.
Unfortunately, the error message is not very meaningful; I created issue #18, so this should get improved one day.
So let's compare the hierarchies from the manual with those created in your example.
sapply(c("color","year","status"),
function(x) list(setNames(list(character()), x)),
simplify=FALSE) -> h
str(h)
#List of 3
# $ color :List of 1
# ..$ :List of 1
# .. ..$ color: chr(0)
# $ year :List of 1
# ..$ :List of 1
# .. ..$ year: chr(0)
# $ status:List of 1
# ..$ :List of 1
# .. ..$ status: chr(0)
hierarchies = list(time <- list("year, month"), color <- list("color"),
status <- list("status"))
str(hierarchies)
#List of 3
# $ :List of 1
# ..$ : chr "year, month"
# $ :List of 1
# ..$ : chr "color"
# $ :List of 1
# ..$ : chr "status"
We can see that the hierarchies in the manual are a list of named elements, while those in your example are a list of unnamed elements.
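I believe you used <- where = should be used. The <- operator is not always equivalent to the = operator: inside a function call, = names an argument, whereas <- assigns a variable in the calling environment and passes the value unnamed. You can read more about exactly this case in 3.1.3.1 Assignment <- vs =.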
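To see the difference in isolation, here is a minimal base R sketch. The names dim_time and dim_color are made up for illustration, and nothing here depends on data.cube.

```r
# `<-` inside a call assigns variables as a side effect
# and passes the values as unnamed arguments
unnamed <- list(dim_time <- list("year"), dim_color <- list("color"))
names(unnamed)       # NULL: the list elements carry no names
exists("dim_time")   # TRUE: a variable was created in the calling environment

# `=` inside a call names the arguments, which is what a named list needs
named <- list(dim_time = list("year"), dim_color = list("color"))
names(named)         # "dim_time" "dim_color"
```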
So let's see if fixing that is sufficient:
hierarchies = list(time = list(c("year, month")), color = list("color"),
status = list("status"))
dc <- as.data.cube(dt, id.vars = c("color", "year", "month", "status"),
measure.vars = "value",
hierarchies = hierarchies)
We still get the same error, so the names, while required, were not the root cause of the issue. After taking a closer look, I now see that you want to build a time dimension without a primary key for it.
An important note: you cannot pass multiple column names as a single string, so "year, month" should be written as c("year","month").
Still, the primary key of the time dimension needs to be a single field, of which year and month are just attributes.
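The single-string point can be checked with plain base R. This is only a sketch on a toy data.frame (the data is made up), showing that "year, month" is one name that matches no column:

```r
# toy data for illustration only
df <- data.frame(year = 2011L, month = 5L, value = 2.5)

one_string <- "year, month"        # a single string, not two column names
two_names  <- c("year", "month")   # a character vector of two column names

length(one_string)                 # 1
one_string %in% names(df)          # FALSE: no column is called "year, month"
all(two_names %in% names(df))      # TRUE
```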
So let's make a primary key for the time dimension. As our time dimension has year-month granularity, we will create the key at that granularity.
library(data.table)
set.seed(42)
dt <- CJ(color = c("green","yellow","red"),
year = 2011:2015,
month = 1:12,
status = c("active","inactive","archived","removed")
)[sample(600)
][, yearmonth:=sprintf("%04d%02d", year, month) # this ensures four digits for year and two digits for month
]
dt[, "value" := sample(4:7/2, nrow(dt), TRUE)]
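As a quick check of the key format, here is a small base R sketch with made-up inputs. It shows that the zero-padding in sprintf gives every key the same width, which plain pasting does not. Note that sprintf returns a character vector.

```r
years  <- c(2011L, 2011L, 2015L)
months <- c(1L, 11L, 12L)

# zero-padded: every key is exactly six characters wide
key <- sprintf("%04d%02d", years, months)
key                     # "201101" "201111" "201512"
nchar(key)              # 6 6 6

# without padding, January becomes a five-character key
paste0(years, months)   # "20111" "201111" "201512"
```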
Now let's define the hierarchies; note that year has been changed to yearmonth.
In the hierarchies below, the vector c("year","month") means that those attributes depend on yearmonth. Please see ?as.data.cube for examples of more complex hierarchies.
hierarchies = list(
color = list(color = list(color = character())),
yearmonth = list(yearmonth = list(yearmonth = c("year","month"))),
status = list(status = list(status = character()))
)
dc = as.data.cube(
x = dt, id.vars = c("color","yearmonth","status"),
measure.vars = "value",
hierarchies = hierarchies
)
str(dc)
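The idea that year and month depend on yearmonth can be sanity-checked outside of data.cube. Below is a rough base R sketch; is_dependent is a hypothetical helper I made up for this answer, not a data.cube function. It tests whether each key value maps to exactly one combination of attribute values.

```r
# Hypothetical helper (not part of data.cube): TRUE when every value of `key`
# occurs with exactly one combination of the `attrs` columns
is_dependent <- function(df, key, attrs) {
  if (length(attrs) == 0L) return(TRUE)
  n_keys   <- length(unique(df[[key]]))
  n_combos <- nrow(unique(df[, c(key, attrs), drop = FALSE]))
  n_keys == n_combos
}

# toy time dimension with year-month granularity
tm <- expand.grid(year = 2011:2012, month = 1:12)
tm$yearmonth <- sprintf("%04d%02d", tm$year, tm$month)

is_dependent(tm, "yearmonth", c("year", "month"))  # TRUE: a valid single-field key
is_dependent(tm, "year", "month")                  # FALSE: year alone does not determine month
```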
Our data.cube has been successfully created. Let's try to query it using the yearmonth key:
dc[, .(yearmonth=201105L)] -> d
as.data.table(d)
dc[, .(yearmonth=201105L), drop=FALSE] -> d
as.data.table(d)
Now let's query it using the attributes of the dimension: year, month, and both.
dc[, .(year=2011L)] -> d
as.data.table(d) # note that the dimension is not dropped because it still has more than one value
dc[, .(month=5L)] -> d
as.data.table(d)
dc[, .(year=2011L, month=5L)] -> d
as.data.table(d) # here the dimension has been dropped because only a single element was left in it; you can of course use `drop=FALSE` if needed.
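If the dropping behaviour looks surprising, base R arrays do the same thing, which may help as an analogy. The sketch below uses a plain array with made-up dimnames, not a data.cube, so treat the parallel as an assumption rather than a description of data.cube internals.

```r
# toy 2 x 3 x 4 array with named dimensions
a <- array(1:24, dim = c(2, 3, 4),
           dimnames = list(color  = c("green", "red"),
                           year   = c("2011", "2012", "2013"),
                           status = c("a", "b", "c", "d")))

dim(a[, "2011", ])                 # 2 4: the year dimension is dropped (single element)
dim(a[, "2011", , drop = FALSE])   # 2 1 4: the year dimension is kept
dim(a[, c("2011", "2012"), ])      # 2 2 4: more than one element, nothing to drop
```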
Hope that helps, good luck!