Hello, and thanks for taking a moment to read my issue. I have a column (Series) in a pandas DataFrame that I am trying to parse into a proper YYYY-MM-DD HH:MM (%Y-%m-%d %H:%M) format. The problem is that the data does not contain a year.

cur_date is what I currently have to work with.

cur_date
Jan-20 14:05
Jan-4 05:07
Dec-31 12:07
Apr-12 20:54
Jan-21 06:12
Nov-3 04:10
Feb-5 11:45
Jan-7 07:09
Dec-3 12:11

req_date is what I am aiming to achieve.

req_date
2023-01-20 14:05
2023-01-04 05:07
2022-12-31 12:07
2022-04-12 20:54
2022-01-21 06:12
2021-11-03 04:10
2021-02-05 11:45
2021-01-07 07:09
2020-12-03 12:11

I know I can write something like df['cur_date'] = pd.to_datetime(df['cur_date'], format='%b-%d %H:%M'), but this does not let me assign a descending year to each row.
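To illustrate the limitation, here is a minimal sketch using only the standard library. It assumes a fixed placeholder year of 2023 for every row (my choice for illustration, not something the data provides) and an English locale for `%b`. With one shared year, the parsed values no longer follow the newest-first order of the source rows.

```python
from datetime import datetime

# First four rows of cur_date, in their original (newest-first) order.
cur_date = ["Jan-20 14:05", "Jan-4 05:07", "Dec-31 12:07", "Apr-12 20:54"]

# Assumption: 2023 is a placeholder year applied to every row.
fixed = [datetime.strptime(f"2023 {s}", "%Y %b-%d %H:%M") for s in cur_date]

# With a single shared year, Dec-31 is parsed as later than Jan-4,
# so the sequence is no longer descending.
is_descending = all(a >= b for a, b in zip(fixed, fixed[1:]))
print(is_descending)
```

The jump from Jan-4 to Dec-31 is the point where the year would need to drop by one, which a single format string cannot express.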

I tried various packages, one being dateparser, which has some options for handling incomplete dates, such as settings={'PREFER_DATES_FROM': 'past'}. However, it cannot look back at the previous row's value to infer the year, which is what I need.

parsem
  • Pardon me if I've misunderstood, but how do you propose to correctly determine the year associated with a date if you haven't been given it explicitly? Or is the year provided separately in a different column? – Vin Jan 21 '23 at 05:42
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Jan 21 '23 at 14:34

1 Answer

I hope this code works for you :)

Note: when two consecutive rows have the same epoch value, it is up to you whether to change the year or not.

import time

current_year = 2023
last = {"ly":current_year, "epoch":0}

def set_year(tt):
    epoch = time.mktime(tt)   

    if epoch > last["epoch"] and last["epoch"] != 0: # the first row keeps current_year; alternatively, compare it with the current time
        last["ly"] -= 1

    last["epoch"] = epoch

    return str(last["ly"])

def transform_func(x):
    time_tup = time.strptime(f"{current_year}-"+x, "%Y-%b-%d %H:%M") # constant year, so rows are compared by month, day and time only
    time_format = time.strftime("%m-%d %H:%M", time_tup)
    ly = set_year(time_tup)

    return f"{ly}-{time_format}"

df["req_date"] = df["cur_date"].transform(transform_func)
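
The same idea can also be sketched without module-level state, so that running it twice gives the same result. This is only a sketch under stated assumptions: the rows are sorted newest first, consecutive rows are less than a year apart, the first row belongs to `start_year`, and `%b` is parsed in an English locale. The name `assign_years` is hypothetical, and Feb-29 rows are not handled specially. Equal timestamps keep the same year.

```python
from datetime import datetime

def assign_years(stamps, start_year, fmt="%b-%d %H:%M"):
    """Attach a year to each stamp, assuming stamps are newest first."""
    year = start_year
    prev = None
    out = []
    for s in stamps:
        cur = datetime.strptime(f"{year} {s}", f"%Y {fmt}")
        # A value later than the previous row means we crossed into
        # the previous year, so step the year back and re-parse.
        if prev is not None and cur > prev:
            year -= 1
            cur = datetime.strptime(f"{year} {s}", f"%Y {fmt}")
        out.append(cur)
        prev = cur
    return out

cur_date = ["Jan-20 14:05", "Jan-4 05:07", "Dec-31 12:07", "Apr-12 20:54",
            "Jan-21 06:12", "Nov-3 04:10", "Feb-5 11:45", "Jan-7 07:09",
            "Dec-3 12:11"]
req_date = [d.strftime("%Y-%m-%d %H:%M") for d in assign_years(cur_date, 2023)]
print(req_date[0], req_date[-1])  # 2023-01-20 14:05 2020-12-03 12:11
```

With a DataFrame, the returned list could be assigned directly, for example df["req_date"] = assign_years(df["cur_date"], 2023).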
SimoN SavioR