
I want to convert a numeric variable to POSIXct using anytime. My issue is that anytime(<numeric>) converts the input variable as well - I want to keep it.

Simple example:

library(anytime)
t_num <- 1529734500
anytime(t_num)
# [1] "2018-06-23 08:15:00 CEST"
t_num
# [1] "2018-06-23 08:15:00 CEST"

This differs from the 'non-update by reference' behaviour of as.POSIXct in base R:

t_num <- 1529734500
as.POSIXct(t_num, origin = "1970-01-01")
# [1] "2018-06-23 08:15:00 CEST"
t_num
# [1] 1529734500

Similarly, anydate(<numeric>) also updates by reference:

d_num <- 17707
anydate(d_num)
# [1] "2018-06-25"
d_num
# [1] "2018-06-25"

I can't find an explicit description of this behaviour in ?anytime. I could use as.POSIXct as above, but does anyone know how to handle this within anytime?
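
For the date case, the base R comparison can be sketched the same way. This is only an illustration, assuming the 1970-01-01 origin that the output above implies; `as.Date()` returns a new object and leaves `d_num` alone:

```r
# Base R only; anytime is not involved here.
d_num <- 17707                                # days since 1970-01-01
d <- as.Date(d_num, origin = "1970-01-01")
format(d)
# [1] "2018-06-25"
class(d_num)                                  # the input is still a plain numeric
# [1] "numeric"
```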

Henrik
  • Just noticing your `t_num` is not _really_ real (e.g. `data.table:::isReallyReal(t_num)`), i.e., you can just replace it with `1529734500L` :) – MichaelChirico Jun 25 '18 at 02:54
  • just that `class(1529734500)` is `numeric` but `class(1529734500L)` is not, so you can eschew the `1*`/`0+` kludges by starting with an integer – MichaelChirico Jun 25 '18 at 10:32
  • Enforcing the 'it is a cast so you get a copy'. And of course 1529734500 _is_ `numeric`, but as you point out, `data.table` reminds us we can express _the same value_ using an `integer`. – Dirk Eddelbuettel Jun 25 '18 at 11:33

5 Answers


anytime author here: this is standard R and Rcpp and passing-by-SEXP behaviour: you cannot protect a SEXP being passed from being changed.

The view that anytime takes is that you are asking for an input to be converted to a POSIXct as that is what anytime does: from char, from int, from factor, from anything. As a POSIXct really is a numeric value (plus a S3 class attribute) this is what you are getting.
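
To make the "numeric plus S3 class attribute" point concrete, here is a small base R sketch. It uses `as.POSIXct()` rather than anytime, and the UTC time zone is just an assumption to keep the output reproducible:

```r
# Base R only: a POSIXct is a double vector that carries class (and tzone) attributes.
p <- as.POSIXct(1529734500, origin = "1970-01-01", tz = "UTC")
typeof(p)        # underlying storage type
# [1] "double"
class(p)         # the S3 class attribute
# [1] "POSIXct" "POSIXt"
as.numeric(p)    # drop the attributes to see the plain number again
# [1] 1529734500
```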

If you do not want this (counter to the design of anytime) you can do what @Moody_Mudskipper and @PKumar showed: use a temporary expression (or variable).

(I also think the data.table example is a little unfair as data.table -- just like Rcpp -- is very explicit about taking references where it can. So of course it refers back to the original variable. There are idioms for deep copy if you need them.)

Lastly, an obvious trick is to use format if you just want different display:

R> d <- data.frame(t_num=1529734500)
R> d[1, "posixct"] <- format(anytime::anytime(d[1, "t_num"]))
R> d
       t_num             posixct
1 1529734500 2018-06-23 01:15:00
R> 

That would work the same way in data.table, of course, as the string representation is a type change. Ditto for IDate / ITime.
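
A stripped-down sketch of the same idea, using base R's `as.POSIXct()` in place of anytime and assuming UTC so the output is reproducible: `format()` returns a character vector, so the numeric input cannot be affected.

```r
# Base R stand-in for the anytime call above; tz = "UTC" is an assumption.
t_num <- 1529734500
s <- format(as.POSIXct(t_num, origin = "1970-01-01", tz = "UTC"))
typeof(s)        # a string, i.e. a different type from the input
# [1] "character"
s
# [1] "2018-06-23 06:15:00"
t_num            # the numeric input is unchanged
# [1] 1529734500
```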

Edit: The development version in the GitHub repo has had functionality to preserve the incoming argument since June 2017. So the next CRAN version, whenever I push it, will have it too.

Dirk Eddelbuettel
  • See my updated answer showing that it works differently with integer inputs – moodymudskipper Jun 24 '18 at 14:26
  • Great explanation, especially for understanding why this only affects _numeric_ input to `anytime`. It would be helpful to add this to `?anytime`. – MichaelChirico Jun 24 '18 at 14:26
  • @Moody_Mudskipper: That is _precisely_ the gotcha example we have shown a number of times with `Rcpp`: input of `int` to `numeric` function does a cast, hence a copy, hence no change on (now copied) input! – Dirk Eddelbuettel Jun 24 '18 at 14:28
  • @MichaelChirico: Sure. Where? Under 'Details' ? – Dirk Eddelbuettel Jun 24 '18 at 14:28
  • maybe Note? and/or a quick example: `t = 0; anytime(t); t` – MichaelChirico Jun 24 '18 at 14:32
  • @Dirk what about a `numbyref` logical argument to make it 100% explicit ? – moodymudskipper Jun 24 '18 at 14:39
  • @Henrik: As I tried to explain, that is _baked into_ R and Rcpp: int and char are different types, and when `POSIXct` comes out at the end they are not affected as they are distinct copies -- `numeric` gets converted for efficiency reasons. You generally do not want spurious copies. – Dirk Eddelbuettel Jun 24 '18 at 14:43
  • @Moody_Mudskipper: No, not worth it. We very rarely see `numeric` as input anyway. – Dirk Eddelbuettel Jun 24 '18 at 14:44
  • Given this inconsistency, any reason not to throw `is.numeric` cases into `.POSIXct`? From what I can tell the speed is ~ the same. – MichaelChirico Jun 24 '18 at 14:53
  • It is not an inconsistency. It is design (of R and its `SEXP`), and how a typed language works. – Dirk Eddelbuettel Jun 24 '18 at 15:33
  • One small question, Dirk: In light of "_It is design (of R and its SEXP) and how a typed language works_", would you mind elaborating on the difference in update-by-reference behavior between `anytime()` and its `base` analogue `as.POSIXct()`? To me, this may seem more like a design choice of `anytime` (rather than of R and its SEXP)? Which would be perfectly fine! Cheers – Henrik Jun 24 '18 at 16:49
  • `POSIXct` is not a native `SEXP` type; it is a `numeric` with an S3 class attribute. Hence "basically the same" at the C++ level. – Dirk Eddelbuettel Jun 24 '18 at 16:50
  • @DirkEddelbuettel FYI, I removed the `data.table` stuff from my Q (not needed to illustrate my point), so you may want to do a corresponding edit of your answer. Cheers – Henrik Jun 24 '18 at 17:24
  • The observed behavior for non-`numeric` vs. `numeric` input is different. That's the very definition of inconsistency (whether it's by design or not is orthogonal). Anyway as long as it's clearly stated in the documentation the design is up to you. – MichaelChirico Jun 25 '18 at 02:54
  • I just tried to look into adding this functionality and ... realized younger me already did a year ago. Just use the development version from github. – Dirk Eddelbuettel Jun 25 '18 at 22:54
  • @DirkEddelbuettel Thanks! I assume you didn't see my 'self-answer' when you made your edit ;) – Henrik Jun 26 '18 at 10:20

You could hack it like this:

library(anytime)
t_num <- 1529734500
anytime(t_num+0)
# POSIXct[1:1], format: "2018-06-23 08:15:00"
t_num
# [1] 1529734500

Note that an integer input will be treated differently:

t_int <- 1529734500L
anytime(t_int)
# POSIXct[1:1], format: "2018-06-23 08:15:00"
t_int
# [1] 1529734500
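
The difference comes down to storage type. As a rough base R sketch (it does not call anytime; it only shows the types involved), the `L` suffix gives an integer, while both the plain literal and the `+0` expression give a double:

```r
t_num <- 1529734500      # plain numeric literal: stored as a double
t_int <- 1529734500L     # L suffix: stored as an integer (below .Machine$integer.max)
typeof(t_num)
# [1] "double"
typeof(t_int)
# [1] "integer"
typeof(t_int + 0)        # arithmetic with a double yields a new double object
# [1] "double"
```

An integer has to be cast to double before it can become a `POSIXct`, and per the comments above that cast produces a copy, which is why `t_int` survives.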
moodymudskipper

If you do this, it will work:

t_num <- 1529734500
anytime(t_num*1)

# [1] "2018-06-23 06:15:00 UTC"
t_num
# [1] 1529734500
PKumar

Any reason to be married to anytime?

.POSIXct(t_num, tz = 'Europe/Berlin')
# [1] "2018-06-23 08:15:00 CEST"

.POSIXct(x, tz) is a wrapper for structure(x, class = c('POSIXct', 'POSIXt'), tzone = tz), so you do not need to declare an origin. It is essentially as.POSIXct.numeric, except that the latter is flexible in allowing non-UTC origin dates; see print(as.POSIXct.numeric).
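
A quick way to check that equivalence is sketched below. The variable names `p1` to `p3` are only illustrative, and UTC is used instead of 'Europe/Berlin' so the result does not depend on the local time zone database:

```r
t_num <- 1529734500
p1 <- .POSIXct(t_num, tz = "UTC")
p2 <- structure(t_num, class = c("POSIXct", "POSIXt"), tzone = "UTC")
p3 <- as.POSIXct(t_num, origin = "1970-01-01", tz = "UTC")
all.equal(p1, p2)    # same underlying number, class and tzone
# [1] TRUE
all.equal(p1, p3)
# [1] TRUE
t_num                # none of the three calls touches the input
# [1] 1529734500
```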

MichaelChirico

When I did my homework before posting the question, I checked the open anytime issues. I have now browsed the closed ones as well, where I found exactly the same issue as mine:

anytime is overwriting inputs

There the package author writes:

I presume because as.POSIXct() leaves its input alone, we should too?

So from anytime version 0.3.1 (unreleased):

Numeric input is now preserved rather than silently cast to the return object type


Thus, one answer to my question is: "wait for 0.3.1"*.

When 0.3.1 is released, the behaviour of anytime(<numeric>) will agree with anytime(<non-numeric>) and as.POSIXct(<numeric>), and workarounds will no longer be needed.


*Didn't have to wait too long: 0.3.1 is now released: "Numeric input is now preserved rather than silently cast to the return object type"
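
For code that may still run against an older installation, one possible guard is sketched below. It relies only on the 0.3.1 threshold quoted above; the `preserves_input` name and the `+ 0` fallback (borrowed from the other answers) are my own choices, not part of anytime:

```r
if (requireNamespace("anytime", quietly = TRUE)) {
  # TRUE when the installed version already preserves numeric input
  preserves_input <- utils::packageVersion("anytime") >= "0.3.1"
  t_num <- 1529734500
  # On older versions, pass a temporary copy so t_num itself is not modified
  res <- if (preserves_input) anytime::anytime(t_num) else anytime::anytime(t_num + 0)
  # Either way, the input should still be a plain numeric afterwards
  stopifnot(is.numeric(t_num), !inherits(t_num, "POSIXct"))
}
```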

Henrik