Most recent answer: if moving to a straight strptime() has not improved the running time, then my suspicion is that there is actually no problem here: you have simply written a program, one of whose main purposes in life is to call strptime() very many times, and you have written it well enough, with so little other work going on, that the strptime() calls are quite properly being allowed to dominate the runtime. I think you could count this as a success rather than a failure, unless you find that (a) some Unicode or LANG setting is making strptime() do extra work, or (b) you are calling it more often than you need to. Try, of course, to call it only once for each date to be parsed. :-)
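If the same timestamp string shows up on many lines, one way to make sure each distinct date is parsed only once is to memoize the parse. This is only a sketch: the function name parse_stamp, the sample strings, and the assumption that timestamps repeat are all mine, not something taken from your program.

```python
from datetime import datetime
from functools import lru_cache

# Assumed format for timestamps such as "10/Oct/2000:13:55:36".
FORMAT = "%d/%b/%Y:%H:%M:%S"

@lru_cache(maxsize=None)
def parse_stamp(stamp):
    """Parse a timestamp string; repeated strings are served from the cache."""
    return datetime.strptime(stamp, FORMAT)

# Hypothetical input: two lines logged in the same second, then a third.
stamps = [
    "10/Oct/2000:13:55:36",
    "10/Oct/2000:13:55:36",
    "10/Oct/2000:13:55:37",
]
times = [parse_stamp(s) for s in stamps]

# cache_info() reports how many calls were answered without calling strptime().
info = parse_stamp.cache_info()
print(info.hits, info.misses)
```

Whether this helps depends entirely on how often identical strings recur; if every timestamp is unique, the cache only adds overhead and memory.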
Follow-up answer after seeing example date string: Wait! Hold on! Why are you parsing the line instead of just using a formatting string like:
"%d/%b/%Y:%H:%M:%S"
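For instance, here is a minimal sketch of handing the whole timestamp to strptime() in one call. The sample string is a made-up value in the layout that format implies, and %b is matched against the current locale's month abbreviations, so "Oct" assumes an English or C locale.

```python
from datetime import datetime

# Hypothetical timestamp in day/month-name/year:hour:minute:second layout.
stamp = "10/Oct/2000:13:55:36"

# One call does all the splitting and converting.
parsed = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S")
print(parsed)
```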
Original off-the-cuff answer: If the month were an integer, you could do something like this:
import datetime

# Build the datetime directly from the already-split fields.
new_entry['time'] = datetime.datetime(
    int(parsed_line['year']),
    int(parsed_line['month']),
    int(parsed_line['day']),
    int(parsed_line['hour']),
    int(parsed_line['minute']),
    int(parsed_line['second'])
)
and avoid creating a big string just to make strptime() split it back apart again. I wonder if there is a way to access the month-name logic directly to do that one textual conversion?
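One possible way to do that single textual conversion yourself is a plain dictionary lookup. This is a sketch under assumptions: the MONTHS table hard-codes English three-letter abbreviations, and build_datetime and the example dict are hypothetical names standing in for your own parsed_line structure.

```python
import datetime

# Assumption: month names are English three-letter abbreviations such as "Oct".
MONTHS = {
    "Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
    "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12,
}

def build_datetime(parsed_line):
    """Build a datetime from a dict of string fields (hypothetical layout)."""
    return datetime.datetime(
        int(parsed_line["year"]),
        MONTHS[parsed_line["month"]],  # the one textual conversion
        int(parsed_line["day"]),
        int(parsed_line["hour"]),
        int(parsed_line["minute"]),
        int(parsed_line["second"]),
    )

# Hypothetical already-split fields.
example = {
    "year": "2000", "month": "Oct", "day": "10",
    "hour": "13", "minute": "55", "second": "36",
}
print(build_datetime(example))
```

If you need the names to follow the current locale instead of being hard-coded, the standard library's calendar.month_abbr sequence could be used to build the same kind of table, though I have not measured whether either approach beats strptime() for your data.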