I have the following dataframe:

df = pd.DataFrame.from_dict({'Date': {0: '2021-01-01 00:00:00',
  1: '2021-01-02 00:00:00',
  2: '2021-01-03 00:00:00',
  3: '2021-01-04 00:00:00',
  4: '2021-01-05 00:00:00',
  5: '2021-01-06 00:00:00',
  6: '2021-01-07 00:00:00',
  7: '2021-01-08 00:00:00',
  8: '2021-01-09 00:00:00',
  9: '2021-01-10 00:00:00',
  10: '2021-01-11 00:00:00',
  11: '2021-01-12 00:00:00',
  12: '2021-01-13 00:00:00',
  13: '2021-01-14 00:00:00',
  14: '2021-01-15 00:00:00',
  15: '2021-01-16 00:00:00',
  16: '2021-01-17 00:00:00',
  17: '2021-01-18 00:00:00',
  18: '2021-01-19 00:00:00',
  19: '2021-01-20 00:00:00'}})

I want to create a simple dummy variable: 1 when the date in the dataframe equals a specific date, 0 otherwise. I did this:

def int_21(x):
    if x == '2021-01-07':
        return '1'
    else:
        return '0'

df['comm0'] = df['Date'].apply(int_21)

However, it returns only 0s. Why? What am I doing wrong?

Thanks!

Rollo99

2 Answers

import pandas as pd

Use pd.to_datetime() to convert your Date column from string to datetime:

df['Date']=pd.to_datetime(df['Date'])

Then use apply() and compare against a datetime value:

df['comm0']=df['Date'].apply(lambda x:1 if x==pd.to_datetime('2021-01-07') else 0)

Or, as suggested by @anky, simply use:

df['comm0']=pd.to_datetime(df['Date']).eq('2021-01-07').astype(int)

Or, if you are familiar with numpy, you can use np.where() after converting your Date column to datetime:

import numpy as np
df['comm0']=np.where(df['Date']=='2021-01-07',1,0)
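
As a quick check that the three variants above agree, here is a minimal sketch on a shortened stand-in for the question's dataframe (five rows instead of twenty; the names target, via_apply, via_eq and via_where are only illustrative):

```python
import numpy as np
import pandas as pd

# Shortened stand-in for the question's frame (illustrative data only)
df = pd.DataFrame({'Date': ['2021-01-05 00:00:00', '2021-01-06 00:00:00',
                            '2021-01-07 00:00:00', '2021-01-08 00:00:00',
                            '2021-01-09 00:00:00']})

# Parse the strings once so every comparison below is datetime vs datetime
df['Date'] = pd.to_datetime(df['Date'])
target = pd.Timestamp('2021-01-07')

# Element-wise apply, vectorized eq, and np.where should all give the same dummy
via_apply = df['Date'].apply(lambda x: 1 if x == target else 0)
via_eq = df['Date'].eq(target).astype(int)
via_where = pd.Series(np.where(df['Date'] == target, 1, 0), index=df.index)

print(via_eq.tolist())  # [0, 0, 1, 0, 0]
```

The vectorized forms (eq and np.where) avoid calling a Python function per row, which usually matters only on larger frames.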
Anurag Dabas

It's a problem with types.

df['Date'] holds strings, not datetime objects, so comparing each element with '2021-01-07' (another string) always fails: the stored values also contain the time part (00:00:00), and two different strings are never equal.
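
A small sketch of that mismatch, using one value from the question (the variable names are only illustrative):

```python
import pandas as pd

raw = '2021-01-07 00:00:00'   # value as stored in the Date column
target = '2021-01-07'         # value used in the original comparison

# String comparison is exact, so the trailing time part makes the two differ
strings_equal = raw == target

# Once parsed, both sides are Timestamps at midnight and compare equal
timestamps_equal = pd.to_datetime(raw) == pd.to_datetime(target)

print(strings_equal, timestamps_equal)  # False True
```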

As a solution, you can convert the elements to datetime, as follows:

def int_21(x):
    if x == pd.to_datetime('2021-01-07'):
        return '1'
    else:
        return '0'

df['Date'] = pd.to_datetime(df['Date'])
df['comm0'] = df['Date'].apply(int_21)

Or you can keep the string objects, but then the value you compare against must have the same format as the dates:

def int_21(x):
    if x == '2021-01-07 00:00:00':
        return '1'
    else:
        return '0'
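
For completeness, a minimal sketch of the string-only route on a three-row stand-in for the question's data. The comm0_prefix column is an illustrative alternative, not part of the original answer: it matches on the date prefix so the time part does not need to be spelled out.

```python
import pandas as pd

# Three-row stand-in for the question's frame (illustrative data only)
df = pd.DataFrame({'Date': ['2021-01-06 00:00:00',
                            '2021-01-07 00:00:00',
                            '2021-01-08 00:00:00']})

def int_21(x):
    # Compare against the full string, including the time part
    return '1' if x == '2021-01-07 00:00:00' else '0'

df['comm0'] = df['Date'].apply(int_21)

# Alternative: match only the date prefix and return integers instead of strings
df['comm0_prefix'] = df['Date'].str.startswith('2021-01-07').astype(int)

print(df['comm0'].tolist())         # ['0', '1', '0']
print(df['comm0_prefix'].tolist())  # [0, 1, 0]
```

Both variants only work while the column is still of string type; after pd.to_datetime() the elements are Timestamps and the datetime comparison should be used instead.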
  • Thanks a lot, the silly mistake is clear now. What is weird is that I tried adding the zeroes, but I kept getting the same results. Thanks again for your help. – Rollo99 Apr 15 '21 at 15:56