I have a DatetimeIndex that I need to convert into a DataFrame column of strings in a specific format. My code is below; how can I optimize it?

import numpy as np
import pandas as pd

original = pd.date_range(start='20210520 09:00:00', end='20210520 12:00:00', freq='30min')
time = np.vectorize(lambda s: s.strftime('%H:%M:%S'))(original.to_pydatetime())
result = pd.DataFrame(time, columns=['time'])
print('original:')
print(original)
print('result:')
print(result)
original:
DatetimeIndex(['2021-05-20 09:00:00', '2021-05-20 09:30:00',
               '2021-05-20 10:00:00', '2021-05-20 10:30:00',
               '2021-05-20 11:00:00', '2021-05-20 11:30:00',
               '2021-05-20 12:00:00'],
              dtype='datetime64[ns]', freq='30T')
result:
       time
0  09:00:00
1  09:30:00
2  10:00:00
3  10:30:00
4  11:00:00
5  11:30:00
6  12:00:00
jaried

1 Answer


Instead of this:

time = np.vectorize(lambda s: s.strftime('%H:%M:%S'))(original.to_pydatetime())

Use:

time=original.time.astype(str)
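As a rough sketch of what this one-liner does: `original.time` returns a NumPy object array of `datetime.time` values, and `astype(str)` converts each one with `str()`. One consequence is that a time with a fractional second keeps it in the output, unlike the fixed `'%H:%M:%S'` format. The `with_micro` index below is a made-up example and is not part of the question.

```python
import pandas as pd

original = pd.date_range(start='20210520 09:00:00', end='20210520 12:00:00', freq='30min')

times = original.time          # NumPy object array of datetime.time values
as_text = times.astype(str)    # str() is applied to each element
print(type(times[0]).__name__, as_text[:2])

# Hypothetical index with a fractional second (assumption, not from the question)
with_micro = pd.DatetimeIndex(['2021-05-20 09:00:00.500000'])
micro_text = with_micro.time.astype(str)
print(micro_text)  # the fractional part is kept in the string
```

For the half-hourly data in the question this makes no difference, since every timestamp falls on a whole second.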

Performance:

%%timeit
original = pd.date_range(start='20210520 09:00:00', end='20210520 12:00:00', freq='30min')
time = np.vectorize(lambda s: s.strftime('%H:%M:%S'))(original.to_pydatetime())
result = pd.DataFrame(time, columns=['time'])

>>>925 µs ± 53.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%%timeit
original = pd.date_range(start='20210520 09:00:00', end='20210520 12:00:00', freq='30min')
time=original.time.astype(str)
result = pd.DataFrame(time, columns=['time'])
      
>>>724 µs ± 12 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
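Another option, not benchmarked here, is `DatetimeIndex.strftime`, which applies a format string to the whole index and returns an `Index` of strings. A minimal sketch, reusing the question's `original` and wrapping the strings in a dict so the DataFrame gets a default integer index:

```python
import pandas as pd

original = pd.date_range(start='20210520 09:00:00', end='20210520 12:00:00', freq='30min')

# Format every timestamp in one call; the result is an Index of strings
time = original.strftime('%H:%M:%S')

# Passing a dict keeps the column name explicit and gives a default RangeIndex
result = pd.DataFrame({'time': time})
print(result)
```

This keeps the explicit format string from the question, so the output stays `HH:MM:SS` even if the timestamps carry fractional seconds. Whether it is faster than the two variants timed above would need to be measured on your own data.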


Anurag Dabas