
I am trying to extract the month from a date column using pd.DatetimeIndex.

The column's values are as follows:

reg_date
2013-06-10T00:00:00.000Z
2014-09-30T00:00:00.000Z
2014-09-30T00:00:00.000Z
2014-09-30T00:00:00.000Z
2014-10-01T00:00:00.000Z



type(df.reg_date) yields

pandas.core.series.Series

and I have used the following:

 df['reg_month'] = pd.DatetimeIndex(df['reg_date']).month

This worked on earlier data, but DatetimeIndex fails here with the error below:


TypeError                                 Traceback (most recent call last)
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    302             try:
--> 303                 values, tz = tslib.datetime_to_datetime64(arg)
    304                 return DatetimeIndex._simple_new(values, name=name, tz=tz)

pandas/_libs/tslib.pyx in pandas._libs.tslib.datetime_to_datetime64()

TypeError: Unrecognized value type: <class 'str'>

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-22-4e7ef5ca2997> in <module>()
----> 1 df['reg_month'] = pd.DatetimeIndex(df['reg_date']).month

C:\ProgramData\Anaconda3\lib\site-packages\pandas\util\_decorators.py in wrapper(*args, **kwargs)
    116                 else:
    117                     kwargs[new_arg_name] = new_arg_value
--> 118             return func(*args, **kwargs)
    119         return wrapper
    120     return _deprecate_kwarg

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\indexes\datetimes.py in __new__(cls, data, freq, start, end, periods, copy, name, tz, verify_integrity, normalize, closed, ambiguous, dtype, **kwargs)
    340                 is_integer_dtype(data)):
    341             data = tools.to_datetime(data, dayfirst=dayfirst,
--> 342                                      yearfirst=yearfirst)
    343 
    344         if issubclass(data.dtype.type, np.datetime64) or is_datetimetz(data):

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in to_datetime(arg, errors, dayfirst, yearfirst, utc, box, format, exact, unit, infer_datetime_format, origin)
    378         result = _convert_listlike(arg, box, format, name=arg.name)
    379     elif is_list_like(arg):
--> 380         result = _convert_listlike(arg, box, format)
    381     else:
    382         result = _convert_listlike(np.array([arg]), box, format)[0]

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    304                 return DatetimeIndex._simple_new(values, name=name, tz=tz)
    305             except (ValueError, TypeError):
--> 306                 raise e
    307 
    308     if arg is None:

C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py in _convert_listlike(arg, box, format, name, tz)
    292                     dayfirst=dayfirst,
    293                     yearfirst=yearfirst,
--> 294                     require_iso8601=require_iso8601
    295                 )
    296 

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslib.pyx in pandas._libs.tslib.array_to_datetime()

pandas/_libs/tslibs/parsing.pyx in pandas._libs.tslibs.parsing.parse_datetime_string()

C:\ProgramData\Anaconda3\lib\site-packages\dateutil\parser.py in parse(timestr, parserinfo, **kwargs)
   1180         return parser(parserinfo).parse(timestr, **kwargs)
   1181     else:
-> 1182         return DEFAULTPARSER.parse(timestr, **kwargs)
   1183 
   1184 

C:\ProgramData\Anaconda3\lib\site-packages\dateutil\parser.py in parse(self, timestr, default, ignoretz, tzinfos, **kwargs)
    557 
    558         if res is None:
--> 559             raise ValueError("Unknown string format")
    560 
    561         if len(res) == 0:

ValueError: Unknown string format
Keren Caelen
Abhishek Pal
  • UPDATE: I tried df['year'] = df['reg_date'].dt.year and the error was AttributeError: Can only use .dt accessor with datetimelike values. Is there a fast way to convert these values to datetime? – Abhishek Pal Apr 27 '18 at 04:13

2 Answers


You can convert your data to datetime objects:

import pandas as pd
df['reg_date'] = pd.to_datetime(df['reg_date'], errors='coerce')

And then you can extract the month as below:

df['month'] = df['reg_date'].dt.month

Output:

    reg_date    month
0   2013-06-10  6
1   2014-09-30  9
2   2014-09-30  9
3   2014-09-30  9
4   2014-10-01  10

Here are the docs.
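
As a rough sketch of what errors='coerce' does here, the example below parses a small hypothetical sample (the question's values plus one null and one malformed string, neither of which comes from the original data) and then lists the rows that could not be parsed. The names parsed and bad_rows are illustrative only, and utc=True is an assumption made because the strings end in Z.

```python
import pandas as pd

# Hypothetical sample: values from the question plus a null and a bad string.
df = pd.DataFrame({
    "reg_date": [
        "2013-06-10T00:00:00.000Z",
        "2014-09-30T00:00:00.000Z",
        None,
        "not a date",
        "2014-10-01T00:00:00.000Z",
    ]
})

# Unparseable values become NaT instead of raising ValueError.
parsed = pd.to_datetime(df["reg_date"], errors="coerce", utc=True)

# Rows that held a value in the source but still failed to parse.
bad_rows = df.loc[parsed.isna() & df["reg_date"].notna(), "reg_date"]
print(bad_rows)

# The month is missing (NaN) wherever the date is NaT.
df["reg_month"] = parsed.dt.month
print(df)
```

Inspecting bad_rows before overwriting the column makes it easier to see which strings are being silently turned into NaT.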

harvpan
  • 8,571
  • 2
  • 18
  • 36
  • The code is running right now. Was I supposed to pass reg_date as an argument to pd.to_datetime? – Abhishek Pal Apr 27 '18 at 04:25
  • I get the same error when I execute the first block of code: ValueError: Unknown string format @Harv Ipan – Abhishek Pal Apr 27 '18 at 04:40
  • I simply copied your data and ran it in my environment. Are you sure that, before you run my code, all your values are in `string` format? – harvpan Apr 27 '18 at 04:46
  • Yes, just checked; type(df.reg_date[0]) returns str @Harv Ipan – Abhishek Pal Apr 27 '18 at 04:50
  • Your `df` likely contains a bad row with a string format that can't be parsed. You could use `pd.to_datetime(df['reg_date'], errors='coerce')` to convert the bad strings to a null value, `NaT` in this case. – ALollz Apr 27 '18 at 04:50
  • Are you sure you do not have `null` values? @AbhishekPal – harvpan Apr 27 '18 at 04:51
  • @Harv Ipan Yes, the issue was due to null values; ALollz's suggestion solved it. Thanks for your help too, cheers mate!! – Abhishek Pal Apr 27 '18 at 04:53
  • @AbhishekPal, for the sake of completeness, I have added `errors='coerce'` to my answer. – harvpan Apr 27 '18 at 04:54
import pandas as pd

# Split each ISO 8601 string and collect year, month and day as strings.
# Note: this assumes every value is a string; null values will raise an error.
n = {"year": [], "month": [], "day": []}
for i in df['reg_date']:
    year, month, day = i.split("T")[0].split("-")
    n["year"].append(year)
    n["month"].append(month)
    n["day"].append(day)

# 'n' is now a dictionary holding the separated day, month and year from df["reg_date"].

Another approach:

df["reg_date"] = df["reg_date"].apply(lambda x: x.split("T")[0])

# df["reg_date"] now contains only the date part (as a string) for each record.
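
Since the thread established that the column contains null values, a plain x.split(...) would fail on those rows. A possible variant, sketched here on a hypothetical three-row sample, uses the pandas .str accessor, which passes missing values through instead of raising. The names date_part and parts are illustrative only.

```python
import pandas as pd

# Hypothetical sample with one missing value.
df = pd.DataFrame({
    "reg_date": ["2013-06-10T00:00:00.000Z", None, "2014-10-01T00:00:00.000Z"]
})

# Keep the text before "T"; missing values stay missing.
date_part = df["reg_date"].str.split("T").str[0]

# Split "YYYY-MM-DD" into three string columns.
parts = date_part.str.split("-", expand=True)
parts.columns = ["year", "month", "day"]
print(parts)
```

The resulting columns still hold strings, so pd.to_datetime remains the better choice when actual date arithmetic is needed.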