0

This is my setup: I have an excel-file with hourly electricity prices. I want to index them by the hourly interval, file here: Data. I load the data the usual way.

library(readxl)
library(tidyverse)

rm(list = ls())

DK1 <- read_excel("DK1.xlsx") 
time_index <- as.POSIXct(DK1$Datetime, format="%Y/%m/%d %H:%M:%S", tz=Sys.timezone())
test <- xts(DK1[,-1], order.by = time_index)

This is just one of many ways I've tried to index it in XTS to no avail. The index row looks wrong and I do not know what to do.

enter image description here

UPDATE 1: dput(head(DK1))

enter image description here

manotheshark
  • 4,297
  • 17
  • 30
NotPzl
  • 21
  • 4
  • Please check your `format`. The `DateTime` columns shows `%Y-%m-%d %H:%M:%S` format – akrun Feb 10 '20 at 15:52
  • You put slashes ("/") between year, month, day in your `format=` parameter but your data appears to have dashes ("-"). Those need to match! – MrFlick Feb 10 '20 at 15:53
  • 1
    Please don't use images a I can't copy the dput. – akrun Feb 10 '20 at 16:12

2 Answers2

1

It appears that read_excel is converting your time column into a datetime, but with all the dates set to "1899-12-31". This can be seen by running:

> str(DK1)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   8760 obs. of  6 variables:
 $ Date      : POSIXct, format: "2019-01-01" "2019-01-01" "2019-01-01" "2019-01-01"...
 $ Hours     : POSIXct, format: "1899-12-31 00:00:00" "1899-12-31 01:00:00" "1899-12-31 02:00:00" "1899-12-31 03:00:00" ...
 $ Datetime  : chr  "2019-01-01 00:00:00" "2019-01-01 01:00:00" "2019-01-01 02:00:00" "2019-01-01 03:00:00" ...
 $ DK1       : num  211.5 75.2 -30.5 -74 -55.3 ...

This is more of a data import problem and the Datetime concat in excel can be performed in R. Generally it's simpler to have all data manipulation performed in a single spot.

library(readxl)
library(xts)

DK1 <- read_excel("DK1.xlsx") 
# pasting date and time together in new column name for comparison
# note the use of strftime to remove the date information discussed earlier
DK1$Datetime2 <- paste(DK1$Date, strftime(DK1$Hours, "%H:%M:%S", tz = "UTC"))
# the format / in excel need to change to - for how it's displayed in R
DK1$time_index <- as.POSIXct(DK1$Datetime, format = "%Y-%m-%d %H:%M:%S", tz = Sys.timezone())

# filtering out the NA value of 2019-03-10 02:00:00 which is when daylight savings occurred
DK1 <- DK1[!is.na(DK1$time_index), ]
DK1a <- xts(DK1[, "DK1"], order.by = DK1$time_index)

> head(DK1a)
                       DK1
2019-01-01 00:00:00 211.48
2019-01-01 01:00:00  75.20
2019-01-01 02:00:00 -30.47
2019-01-01 03:00:00 -74.00
2019-01-01 04:00:00 -55.33
2019-01-01 05:00:00 -93.72
manotheshark
  • 4,297
  • 17
  • 30
0

We can select the numeric column and then order.by the 'Date' which is already a Datetime class

library(xts)
xts(DK1$DK1, order.by = DK1$Date)

as the format is in the default format, we don't have to specify the format

akrun
  • 874,273
  • 37
  • 540
  • 662
  • When I run "as.POSIXct(DK1$Datetime, format = "%Y-%m-%d %H:%M:%S" " I get the following error: Error in xts(DK1[, -1], order.by = as.POSIXct(DK1$Datetime, format = "%Y-%m-%d %H:%M:%S")) : 'order.by' cannot contain 'NA', 'NaN', or 'Inf' – NotPzl Feb 10 '20 at 16:05
  • @NotPzl Could you please updatee your post with dput of a small example data – akrun Feb 10 '20 at 16:06
  • @NotPzl `dput(head(DK1))` would help – akrun Feb 10 '20 at 16:07
  • Sure! Just uploaded it – NotPzl Feb 10 '20 at 16:11
  • @NotPzl Can you please paste the ouptut of `dput` instead of image – akrun Feb 10 '20 at 16:13
  • structure(list(Date = structure(c(1546300800, 1546300800, 1546300800, 1546300800, 1546300800, 1546300800), class = c("POSIXct", "POSIXt" ), tzone = "UTC"), Hours = structure(c(-2209075200, -2209071600, -2209068000, -2209064400, -2209060800, -2209057200), class = c("POSIXct", "POSIXt"), tzone = "UTC"), Datetime = c("2019-01-01 00:00:00", "2019-01-01 01:00:00", "2019-01-01 02:00:00", "2019-01-01 03:00:00", "2019-01-01 04:00:00", "2019-01-01 05:00:00"), DK1 = c(211.48, 75.2, -30.47, -74, -55.33, -93.72)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame")) – NotPzl Feb 10 '20 at 16:13
  • @NotPzl Based on your dput, it would be `xts(DK1$DK1, order.by = DK1$Date)` – akrun Feb 10 '20 at 16:14
  • Even so, without the hourly mark, the row index comes out all wrong: X2019.01.01.1 etc. – NotPzl Feb 10 '20 at 16:17
  • @NotPzl Not clear. with your dput, the hour part is all 00:00:00 – akrun Feb 10 '20 at 16:17
  • Datetime = c("2019-01-01 00:00:00", "2019-01-01 01:00:00", "2019-01-01 02:00:00", "2019-01-01 03:00:00", "2019-01-01 04:00:00", "2019-01-01 05:00:00") I dont understand, I see the hourly increments? – NotPzl Feb 10 '20 at 16:19
  • @NotPzl From your dput, it is only showing the 'hour' as 00:00:00, so it iss not clear to me – akrun Feb 10 '20 at 16:20
  • I don't know what to say. Guess I'll maniputelate the excel-file a different instead since I'm kinda. Thanks a ton though – NotPzl Feb 10 '20 at 16:22