
I have a data frame with columns of years, months, days, and hours. I want to add a column in which each row represents the POSIXlt object defined by the years, months, days, and hours on that row.

The conversion for each row is straightforward, for example:

library(dplyr)
library(stringr)
library(lubridate)
years <- 2022
months <- 2
day <- 25
hours <- 13
pos_times <- c(years, months, day, hours) %>%  
                 str_c(collapse = " ") %>%  
                 ymd_h  
pos_times %>% 
  str
pos_times

And that yields the following correct output:

> pos_times %>% 
+   str
 POSIXct[1:1], format: "2022-02-25 13:00:00"
> pos_times
[1] "2022-02-25 13:00:00 UTC"

To apply the operation to more than one set of years, months, days, and hours, and so produce the new column as a vector, I use the following code:

df <- data.frame(years = c(2010, 2011),
                 month = c(11, 12),
                 day = c(1, 2),
                 hour = c(3, 5))

N <- nrow(df)
vec_time <- rep(NA, N) 
for(i in 1:N){
  pos_time <- (df[i, 1:4]) %>%  
    str_c(collapse = " ") %>%  
    ymd_h  
  print(paste("Structure of calculated object for row number", i))
  pos_time %>% str
  vec_time[i] <- pos_time
}
print("Structure of vector of calculated objects")
vec_time %>% 
  str


Its output is wrong:

[1] "Structure of calculated object for row number 1"
 POSIXct[1:1], format: "2010-11-01 03:00:00"
[1] "Structure of calculated object for row number 2"
 POSIXct[1:1], format: "2011-12-02 05:00:00"
> print("Structure of vector of calculated objects")
[1] "Structure of vector of calculated objects"
> vec_time %>% 
+   str
 num [1:2] 1.29e+09 1.32e+09

In each iteration pos_time is again correctly shown as a POSIXct object, but the values of the vector vec_time are numeric.

I realise that a POSIXct object is stored as just a number, but I want my data frame to show the date-time objects as such.

JeremyC

2 Answers


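The ymd_h() function returns POSIXct objects, which are stored as numeric values representing the number of seconds since the Unix epoch (January 1, 1970, 00:00:00 UTC). When you assign such an object to an element of a plain vector created with rep(NA, N), only the underlying number is kept and the POSIXct class is dropped, which is why vec_time ends up numeric.

If you format each result as a character string, your loop will work, although the new column will then hold strings rather than date-times. Change your code to this: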

N <- nrow(df)
vec_time <- rep(NA_character_, N) 

for(i in 1:N){
  pos_time <- ymd_h(paste(df[i, 1:4], collapse = "-"), tz = "UTC")
  vec_time[i] <- format(pos_time, format = "%Y-%m-%d %H:%M:%S")
}

df$datetime <- vec_time
df
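
If you would rather keep the loop and still end up with a real POSIXct column instead of strings, one option is to pre-allocate vec_time as a POSIXct vector, so that each assignment goes through the POSIXct method and the class survives. This is a minimal sketch in base R, not part of the original answer: it swaps in ISOdatetime() for ymd_h() so that it runs without extra packages, and it assumes UTC is the time zone you want.

```r
df <- data.frame(years = c(2010, 2011),
                 month = c(11, 12),
                 day   = c(1, 2),
                 hour  = c(3, 5))

N <- nrow(df)

# Pre-allocate a vector of missing values that already has class POSIXct
# (assumption: UTC is the intended time zone).
vec_time <- as.POSIXct(rep(NA_real_, N), origin = "1970-01-01", tz = "UTC")

for (i in seq_len(N)) {
  # Assigning into a POSIXct vector keeps the class and the time zone.
  vec_time[i] <- ISOdatetime(df$years[i], df$month[i], df$day[i],
                             df$hour[i], 0, 0, tz = "UTC")
}

df$datetime <- vec_time
str(df$datetime)
```

The only change that matters is the type of the pre-allocated vector; the body of the loop could equally call ymd_h() as in the question.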

I would suggest this:

library(dplyr)
library(lubridate)

df %>% 
  mutate(datetime = ymd_h(paste(years, month, day, hour, sep = "-")))

  years month day hour            datetime
1  2010    11   1    3 2010-11-01 03:00:00
2  2011    12   2    5 2011-12-02 05:00:00
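
The vectorised call keeps the date-time class because the whole POSIXct vector is created in one step, so nothing is ever assigned element by element into an untyped vector. The difference can be checked with a small base-R sketch; the names whole, piecewise, and restored are made up for this illustration.

```r
# Two date-times created in a single vectorised call keep their class.
whole <- as.POSIXct(c("2010-11-01 03:00:00", "2011-12-02 05:00:00"), tz = "UTC")
class(whole)

# Copying them one at a time into a plain NA vector keeps only the
# underlying seconds since the epoch, so the class is lost.
piecewise <- rep(NA, 2)
for (i in 1:2) piecewise[i] <- whole[i]
class(piecewise)

# The numbers can still be turned back into date-times afterwards.
restored <- as.POSIXct(piecewise, origin = "1970-01-01", tz = "UTC")
all(restored == whole)
```

So the numeric vector from the question is not wrong as such: it holds the right instants and has only lost the attributes that tell R to print them as date-times.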
TarJae
    Very many thanks for such a quick response. I don't understand why your code works and mine doesn't. Just one of life's little mysteries! – JeremyC Feb 25 '23 at 13:46

You can use ISOdate() from base R, which builds date-times directly from numeric year, month, day, and hour values:

library(dplyr)

df %>%
  mutate(datetime = ISOdate(years, month, day, hour))

# # A tibble: 2 × 5
#   years month   day  hour datetime           
#   <dbl> <dbl> <dbl> <dbl> <dttm>             
# 1  2010    11     1     3 2010-11-01 03:00:00
# 2  2011    12     2     5 2011-12-02 05:00:00
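
Two defaults of ISOdate() are worth knowing when you use it this way: the hour defaults to 12 if you leave it out, and the time zone defaults to "GMT". Its sibling ISOdatetime() takes every field and the time zone explicitly. A short sketch, with variable names chosen only for illustration:

```r
# Hour omitted: ISOdate() falls back to 12:00 in GMT.
noon_default <- ISOdate(2010, 11, 1)
format(noon_default, "%Y-%m-%d %H:%M:%S", tz = "GMT")

# ISOdatetime() needs minutes and seconds too, and lets you name the zone.
explicit <- ISOdatetime(2010, 11, 1, 3, 0, 0, tz = "UTC")
format(explicit, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Impossible dates come back as NA rather than raising an error.
is.na(ISOdate(2011, 2, 30))
```

Because invalid combinations turn into NA silently, it may be worth checking the new column with anyNA() after building it.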
Darren Tsai
  • Thank you for drawing my attention to ISOdate. I had not heard of it. I see that it is a wrapper for strptime. I had tried to use that function but for some reason could not get it to work as I wanted. – JeremyC Feb 26 '23 at 15:25