I am using the benedict Python library to parse an .xml file (sample below):

data_source = """
<?xml version="1.0" encoding="utf-8"?>
<RunInfo Version="5">
   <Run Id="210910_A00154_0856_BH2TTNDMXY" Number="856">
        <Date>9/10/2021 3:08:02 PM</Date>
   </Run>
</RunInfo>"""

What I ultimately want is the time as a timestamp, but without the date, i.e. only 3:08:02 PM.

Given that

type(data['RunInfo']['Run']['Date']) results in str

I did pd.to_datetime(data['RunInfo']['Run']['Date'])

but the result still includes the date, for obvious reasons.

So I thought about slicing out only the part I want to parse (3:08:02 PM) and then converting it to a timestamp with pd.to_datetime(data['RunInfo']['Run']['Date'][-10:], format="%H:%M:%S")

But pd.to_datetime() still outputs a date, now a seemingly random one, which is worse.

Does anyone know how I can parse only the time from the original .xml file?

U13-Forward
BCArg

1 Answer

We can convert the string 9/10/2021 3:08:02 PM to datetime format like so:

>>> df['timestamp'] = pd.to_datetime(df['timestamp'])

Then, to extract the time as a string:

>>> df['timestamp'].dt.strftime("%I:%M:%S %p")
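Since the value in the question is a single str rather than a DataFrame column, the same idea can be sketched on a scalar. The snippet below is only an illustration: it hard-codes the sample date string instead of reading it from the parsed XML, uses the standard-library datetime module in place of pandas, and assumes the date is month/day/year (which matches the 210910 prefix of the Run Id). The names raw, parsed, time_only and time_text are made up for this example.

```python
from datetime import datetime

# Sample value from the XML in the question, hard-coded for illustration.
# In the asker's code this would be data['RunInfo']['Run']['Date'].
raw = "9/10/2021 3:08:02 PM"

# Assumed layout: month/day/year, then a 12-hour clock with an AM/PM marker.
# %I (12-hour) together with %p is needed for the PM to be honoured;
# %H would expect a 24-hour value.
parsed = datetime.strptime(raw, "%m/%d/%Y %I:%M:%S %p")

# A datetime.time object: it carries hour, minute and second, but no date.
time_only = parsed.time()

# The same time as text, in the 12-hour format used in the answer above.
time_text = parsed.strftime("%I:%M:%S %p")

print(time_only)
print(time_text)
```

A pandas Timestamp also has .time() and .strftime(), so the last two steps should work the same way on the result of pd.to_datetime(). Note that %I zero-pads the hour, so the text form starts with 03 rather than 3.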
tlentali