I have data like this (it's a time series problem):

Time                    y
2017-01-01 00:00:00   34002
2017-01-01 01:00:00   37947
2017-01-01 02:00:00   41517
2017-01-01 03:00:00   44476
2017-01-01 04:00:00   46234

I want to extract the hour, the day of the week, and a day-off (weekend) flag as categorical variables, but my code fails:

data = pd.DataFrame(dataset)
data.columns = ["y"]

data.index = pd.to_datetime(data)
data["hour"] = data.index.hour
data["weekday"] = data.index.weekday
data['is_weekend'] = data.weekday.isin([5,6])*1

data.head()

Python raises the following error, and I don't know how to fix it:

      2 data.columns = ["y"]
      3 
----> 4 data.index = pd.to_datetime(data)
      5 data["hour"] = data.index.hour
      6 data["weekday"] = data.index.weekday

ValueError: to assemble mappings requires at least that [year, month, day] be specified: [day,month,year] is missing
Max AweTery

1 Answer

Elaborating on the answer included in @MrFruppes comment:

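The problem is that the original code passes the whole DataFrame data to pd.to_datetime(), rather than the DataFrame's index. Given a DataFrame, pd.to_datetime() tries to assemble dates from columns named year, month and day, which is why the error complains that they are missing.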
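
What pd.to_datetime() does with a whole DataFrame can be seen in isolation. The following is a minimal sketch: the frames parts and values are made up for illustration and are not part of the original question.

```python
import pandas as pd

# Hypothetical frame holding the component columns that
# pd.to_datetime() looks for when it is given a DataFrame.
parts = pd.DataFrame({"year": [2017, 2017], "month": [1, 1], "day": [1, 2]})
assembled = pd.to_datetime(parts)  # one Timestamp per row

# Hypothetical frame shaped like the question's data: only a "y" column,
# so there is nothing to assemble dates from.
values = pd.DataFrame({"y": [34002, 37947]})
try:
    pd.to_datetime(values)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(assembled)
print(error_message)
```

The first call succeeds because the frame has year, month and day columns. The second fails with the same kind of ValueError as in the question.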

A DataFrame's index is available through its .index attribute. Passing that index to pd.to_datetime() and assigning the result back to data.index replaces the original values with a DatetimeIndex, which exposes attributes such as .hour and .weekday.

import pandas as pd
data = pd.DataFrame(dataset)
data.columns = ['y']

Here, we access the .index and convert it.

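# Convert only the index, not the whole DataFrame
data.index = pd.to_datetime(data.index)
data["hour"] = data.index.hour
data["weekday"] = data.index.weekday  # Monday=0 ... Sunday=6
# 1 for Saturday (5) or Sunday (6), otherwise 0
data["is_weekend"] = data["weekday"].isin([5, 6]) * 1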

data.head()
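
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes that dataset is a Series of values indexed by timestamp strings, which is a guess, since the question does not show how dataset is built. The final cast to the category dtype is an optional extra for the "categorical variables" part of the question and is not part of the fix itself.

```python
import pandas as pd

# Sample rows copied from the question. The shape of `dataset` is an
# assumption: a Series of values indexed by timestamp strings.
timestamps = [
    "2017-01-01 00:00:00",
    "2017-01-01 01:00:00",
    "2017-01-01 02:00:00",
    "2017-01-01 03:00:00",
    "2017-01-01 04:00:00",
]
dataset = pd.Series([34002, 37947, 41517, 44476, 46234], index=timestamps)

data = pd.DataFrame(dataset)
data.columns = ["y"]

# Convert the index (not the DataFrame) to datetimes.
data.index = pd.to_datetime(data.index)
data["hour"] = data.index.hour
data["weekday"] = data.index.weekday  # Monday=0 ... Sunday=6
data["is_weekend"] = data["weekday"].isin([5, 6]).astype(int)

# Optional: store the derived features with the category dtype.
for col in ["hour", "weekday", "is_weekend"]:
    data[col] = data[col].astype("category")

print(data.head())
print(data.dtypes)
```

All five sample rows fall on 2017-01-01, a Sunday, so weekday is 6 and is_weekend is 1 in every row.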
E. Ducateme