I have a Series `s` that looks like this:

10241715000
  201709060
   11202017
     112017
     111617
     102417
     110217
    1122018

I tried the following code to convert `s` to datetime:

pd.to_datetime(s.str[:7], format='%-m%d%Y', errors='coerce')

but it returned `s` as it was, without any conversion being done. I was expecting something like:

NaT
NaT
2017-01-20
NaT
NaT
NaT
NaT
2018-01-12

The format is defined according to the strftime directives, where `%-m` indicates the month as a decimal number without zero padding, e.g. 1, and `%Y` indicates the year as a decimal number, e.g. 2018. What is the issue here? I am using pandas 0.22.0 and Python 3.5.

UPDATE

import numpy as np
import pandas as pd

data = np.array(['10241715000', '201709060', '11202017', '112017', '111617', '102417',
                 '110217', '1122018'])

s = pd.Series(data)

pd.to_datetime(s.str[-7:], format='%-m%d%Y', errors='coerce')

0    1715000
1    1709060
2    1202017
3     112017
4     111617
5     102417
6     110217
7    1122018
dtype: object
daiyue

1 Answer

The string slice should be `[-7:]`, not `[:7]`, and the format should use `%m` rather than `%-m`:

pd.to_datetime(s.astype(str).str[-7:], format='%m%d%Y', errors='coerce')
Out[189]: 
0          NaT
1          NaT
2   2017-01-20
3   2017-01-01
4          NaT
5          NaT
6          NaT
7   2018-11-02
Name: a, dtype: datetime64[ns]
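
As a rough illustration of why `%m` is used above instead of `%-m`, the sketch below tries both formats with the standard library's `datetime.strptime`. It assumes plain CPython rather than pandas 0.22.0, whose own parser may react differently to a bad directive, and the helper name `can_parse` is made up for this example. `%-m` is a platform-specific extension for formatting dates, and CPython's parser rejects it as a bad directive, while `%m` accepts a one-digit month.

```python
from datetime import datetime


def can_parse(text, fmt):
    """Return True if datetime.strptime accepts `text` with format `fmt`."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        # Raised for an unsupported directive as well as for non-matching text.
        return False


# '%-m' is not a parsing directive, so this attempt fails.
print(can_parse("1202017", "%-m%d%Y"))

# '%m' matches one or two digits, so the month can be read as 1 here.
print(can_parse("1202017", "%m%d%Y"))
print(datetime.strptime("1202017", "%m%d%Y").date())
```

Because `%m` and `%d` each accept one or two digits, a seven-digit string such as `1122018` can be split in more than one way, which is why it comes out as `2018-11-02` above.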

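Update

To read `1122018` as `2018-01-12`, left-pad the slice with zeros to eight characters so that `%m` always has two digits to consume: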

pd.to_datetime(s.str[-7:].str.pad(8,'left','0'), format='%m%d%Y', errors='coerce')
Out[208]: 
0          NaT
1          NaT
2   2017-01-20
3          NaT
4          NaT
5          NaT
6          NaT
7   2018-01-12
dtype: datetime64[ns]
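
The same padding idea can be sketched without pandas, which may make it easier to see what happens to each value. This is only an illustration using the standard library; the helper name `parse_mdy` is hypothetical, and it returns `None` where pandas would give `NaT`.

```python
from datetime import datetime


def parse_mdy(raw):
    """Parse the last 7 characters of `raw` as MDDYYYY; return None on failure."""
    # Keep the last 7 characters, then left-pad with zeros to 8 (MMDDYYYY).
    padded = str(raw)[-7:].zfill(8)
    try:
        return datetime.strptime(padded, "%m%d%Y").date()
    except ValueError:
        # Covers impossible dates, a '00' month, and leftover characters.
        return None


data = ['10241715000', '201709060', '11202017', '112017',
        '111617', '102417', '110217', '1122018']

for value in data:
    print(value, parse_mdy(value))
```

Six-digit inputs such as `112017` become `00112017` after padding, so they fail on the month `00`, which matches the `NaT` rows in the output above.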
BENY
  • I have updated my question with some code, and it still didn't work. – daiyue Oct 31 '18 at 14:57
  • What if I want to convert `1122018` to `2018-01-12` instead of `2018-11-02`? What `format` should I define? – daiyue Oct 31 '18 at 15:02
  • Also, I am wondering why it returns the entire series unchanged, since `to_datetime` should return `NaT` when it can't convert a string to a datetime using the given `format`. – daiyue Oct 31 '18 at 15:04