I had a pandas data frame in which one column contained dates given as strings (for example "2014-10-17"). I wanted to have these values as Python date objects. I decided to make this transformation in two steps:

df.col = pandas.to_datetime(df.col)
df.col = df.col.map(lambda x: x.date())
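For reference, here is a minimal, self-contained sketch of the two steps that also records the column dtype after each one. The sample values are made up for illustration, and the variable names (`dtype_after_step1`, `dtype_after_step2`, `first_value`) are not from my real code:

```python
import datetime
import pandas as pd

# Hypothetical sample data; only the column name "col" comes from the question.
df = pd.DataFrame({"col": ["2014-10-17", "2014-10-18", "2014-10-19"]})

# Step 1: strings -> pandas datetimes (a datetime64 dtype).
df["col"] = pd.to_datetime(df["col"])
dtype_after_step1 = df["col"].dtype

# Step 2: each pandas Timestamp -> a plain datetime.date (object dtype).
df["col"] = df["col"].map(lambda x: x.date())
dtype_after_step2 = df["col"].dtype
first_value = df["col"].iloc[0]

print(dtype_after_step1, dtype_after_step2, first_value)
```

After the first step the column is stored with a datetime64 dtype, while after the second step it is an object column holding ordinary `datetime.date` instances.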

Before the first step, after the first step and after the second step I used the same operation to check the content of the column:

df.col.tolist()[:5]

I noticed that when the dates were given as strings or as datetime.date objects, the above operation was relatively fast. In contrast, when the dates were given as pandas datetime objects (the result of pandas.to_datetime), the operation was considerably slower.
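The comparison can be reproduced with a rough sketch like the one below. The series length `n`, the repeat count, and the names `strings`, `stamps`, `dates` and `timings` are arbitrary choices for illustration, not taken from my actual data, and the absolute numbers will depend on the pandas version and the machine:

```python
import timeit
import pandas as pd

n = 100_000  # arbitrary size, chosen only to make the difference measurable

strings = pd.Series(["2014-10-17"] * n)        # object dtype, Python str
stamps = pd.to_datetime(strings)               # datetime64 dtype
dates = stamps.map(lambda x: x.date())         # object dtype, datetime.date

# Time the same tolist() call on each representation.
timings = {}
for name, series in [("str", strings), ("datetime64", stamps), ("date", dates)]:
    timings[name] = timeit.timeit(series.tolist, number=5)
    print(name, timings[name])

# For the datetime64 series, tolist() returns pandas Timestamp objects.
print(type(stamps.tolist()[0]))
```

One thing this makes visible is that `tolist()` on the datetime64 series has to build a `pandas.Timestamp` object for every element, whereas the two object-dtype series already hold ready-made Python objects. I assume this may be related to the slowdown, but I have not confirmed it.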

Can someone explain this behavior?

Roman
  • Is there a specific reason you want to convert the entire column/series to a list and then display only the first 5 rows? Why not just show `df.col.head(5)`, or `list(df.col.head(5))` if you insist on having a list? – EdChum Oct 17 '14 at 14:31
  • @EdChum, in this particular case I do not have a real reason to convert to lists. But in general, it might happen that I want to have a list from a series and if `pandas.datetime` series require significantly more time for this operation, it would be nice to know that to avoid performance problems. – Roman Oct 17 '14 at 14:44

0 Answers