I was reading the zoo FAQs, and came across something that I found surprising.

A "zoo" object may be (1) a numeric vector, (2) a numeric matrix or (3) a factor but may not contain both a numeric vector and factor.

Is it unreasonable to expect this to hold? And what are the reasons that this cannot be implemented in zoo? Basically, I would like to think of a zoo object as a dataframe with time ordering.

tchakravarty
  • Any chance you can show us some code? Like, how you create a zoo object and how you would like to create one? – Spacedman Dec 28 '12 at 10:45
  • zoo was intended to generalize the `"ts"` class in R to irregularly spaced series with an arbitrary index class. The `"ts"` class is also based on matrices. One of the reasons to stick to matrices is that operations on matrices in R are much faster than on data.frames. If your non-numeric data represents IDs of some sort, then they probably identify separate series anyway. In that case, the `split=` arg in `read.zoo` handles that. Workarounds include separate objects for each class, converting factors to numeric (and maintaining the level information elsewhere), or using some other representation. – G. Grothendieck Dec 28 '12 at 14:01
  • @G.Grothendieck Thanks Gabor. Your answer is exactly the kind of background I was looking for. If you make it an answer, I will mark it. – tchakravarty Dec 28 '12 at 17:02
  • The background @G.Grothendieck provided is in the zoo vignette, [*zoo: An S3 Class and Methods for Indexed Totally Ordered Observations* (PDF)](http://cran.r-project.org/web/packages/zoo/vignettes/zoo.pdf). – Joshua Ulrich Dec 28 '12 at 17:51
  • @JoshuaUlrich Thanks. I have been reading all the zoo docs, since I anticipate using it extensively in an upcoming project. I will read the vignette as well. – tchakravarty Dec 28 '12 at 17:55
  • @fgnu : If no-one else bothers, you're welcome to collect Gabor's comment and Josh Ulrich's documentation link and post them as an answer yourself, if there's information there that's not in any of the other answers (although it would be polite to wait a few hours and see if they want to do it themselves) – Ben Bolker Dec 28 '12 at 18:06
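
One of the workarounds mentioned in the comments above (converting a factor to numeric and keeping the level information elsewhere) might look roughly like the sketch below. It assumes the zoo package is installed; the variable names (`price`, `grade`, `lv`) and all data values are hypothetical and chosen only for illustration.

```r
library(zoo)  # assumption: the zoo package is installed

# Hypothetical data: one numeric series and one factor on the same dates
dates <- as.Date("2012-12-24") + 0:2
price <- c(10.5, 11.0, 10.8)
grade <- factor(c("low", "high", "low"))

# Store the factor as integer codes and keep the level labels separately
codes <- as.integer(grade)
lv <- levels(grade)

# Both columns are now numeric, so they fit into a single matrix-based zoo object
z <- zoo(cbind(price = price, grade = codes), order.by = dates)
is.numeric(coredata(z))

# Rebuild the factor from the stored codes and the saved levels
grade_back <- factor(lv[coredata(z)[, "grade"]], levels = lv)
identical(as.character(grade_back), as.character(grade))
```

The cost of this approach is that the level labels live outside the zoo object, so they have to be carried along and reapplied by hand.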

2 Answers

zoo objects are a matrix with an index attribute. Therefore, you cannot mix types in zoo for the same reason you cannot mix types in a matrix (i.e. a matrix is just a vector with a dim attribute and you can't mix types in a vector).
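
As a minimal base-R sketch of that point (no zoo involved, and all values made up for illustration), binding a numeric and a character column into a matrix forces a single common type, whereas a data.frame keeps one type per column:

```r
# Base R only; the values below are hypothetical.
num <- c(1.5, 2.5, 3.5)
chr <- c("a", "b", "c")

# A matrix is an atomic vector plus a dim attribute
v <- 1:6
dim(v) <- c(3, 2)
is.matrix(v)   # TRUE

# Binding a numeric and a character column coerces everything to one type
m <- cbind(num, chr)
typeof(m)      # "character"

# A data.frame is a list of columns, so each column keeps its own type
df <- data.frame(num = num, chr = chr)
sapply(df, class)
```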

Joshua Ulrich

You write

> Basically, I would like to think of a zoo object as a dataframe with time ordering.

and you are simply off-base here. "Wishing alone" does not make it so. In a nutshell, zoo and xts can cope with a numeric matrix (or a vector as a special case; both are really vectors, with or without a dimension attribute), and a factor is already a stretch.

In all the years that zoo has existed, data.frame has never been a supported data type, and it never will be, due to internal architectural and implementation choices. Performance on data.frame objects is also worse than on matrices.

But you could consider data.table as an alternative.

Dirk Eddelbuettel
  • "Is it unreasonable to expect this to hold? And what are the reasons that this cannot be implemented in `zoo`?" – tchakravarty Dec 28 '12 at 16:22
  • Yes there are. But if you find a way, feel free to send a patch. – Dirk Eddelbuettel Dec 28 '12 at 16:28
  • You are really original, which is why it is your birthright to ask a question that has been asked 100+ times on r-help, r-sig-finance, SO, ... and expect us all to jump in and repeat it all again, just for you, as you are so special. A matrix is at the core of zoo (and xts). – Dirk Eddelbuettel Dec 28 '12 at 17:04
  • Dirk, with all due respect, you chose to answer an unoriginal question. You could have left out all the irrelevant stuff and answered tersely, or even left a link in the commentspace. I guess I never saw the point of answering with animosity and disrespect, so characteristic of the R-list. – tchakravarty Dec 28 '12 at 17:09
  • Dirk, didn't mean to dredge up an ancient question, but would it be right to say that what I asked in this question is exactly what the Python `pandas.DataFrame` provides with its mixed `dtypes` array, and the `index` attribute? Not meaning to rake up an argument, simply looking for your opinion to the extent that you might be familiar with the `pandas.DataFrame` object. Thanks, as always. – tchakravarty Apr 12 '15 at 14:18