
I have a csv like so:

Category,Position,Name,Time
A,1,Tom Smith,00:45:01.23

There are multiple rows in the same format.

I take the time of the first-place rider in category 'A' and calculate a cut-off time that is 15% longer, i.e. if they take 1 minute 40 seconds, the cut-off is 1 minute 55 seconds. Anybody in category A with a time above this cut-off is then given 0 points in a new CSV.

I have this code:

import csv
from datetime import datetime


def convert(seconds):  # function to convert amount of seconds to a time format
    seconds = seconds % (24 * 3600)
    hour = seconds // 3600
    seconds %= 3600
    minutes = seconds // 60
    seconds %= 60
    return "%d:%02d:%02d" % (hour, minutes, seconds)


with open("results.csv", 'rt', encoding='UTF-8', errors='ignore') as file:  # opening the full results file
    reader = csv.reader(file, skipinitialspace=True, escapechar='\\')  # CSV reader (spaces after delimiters are ignored)
    MaleCategoryList = []  # setting category as blank so a change is recognised
    for row in reader:
        if row[0] not in MaleCategoryList:
            if row[0] == "A":
                firstPlaceTime = datetime.strptime(row[3], "%H:%M:%S.%f")
                timeInSecs = firstPlaceTime.second + firstPlaceTime.minute * 60 + firstPlaceTime.hour * 3600
                timeDifference = timeInSecs * 1.15
                MaxTime = datetime.strptime(convert(timeDifference), "%H:%M:%S")
        # some code here which is not relevant i.e calculate points
        if cat == "A" and datetime.strptime(row[3], "%H:%M:%S.%f") > MaxTime:
            points = 0
            position_for_file = "DQ Time-Cut"
            cat = "Time Cut"
        data = {'Position': position_for_file, 'Category': cat, 'Name': name, 'Club': club,
                'Points': points, 'Time': time}  # dictionary of data to write to CSV

I feel it is very messy and inefficient, as there are lots of if statements and it relies on lots of calculations that seem unnecessary. Do you have any ideas for how I could rewrite or improve it?

PythonIsBae
  • I'd recommend looking into the pandas package for this type of data processing. See https://pandas.pydata.org/ – ScootCork Jun 27 '20 at 12:21
  • @ScootCork I will have a look at pandas, thank you! But I would prefer to use Python's own modules unless that is significantly more complex. – PythonIsBae Jun 27 '20 at 12:25
  • OK cool, it's definitely worth looking into, both for maintainability and speed. If you would like feedback on your working code, then the question is probably better addressed at Code Review, see: https://meta.stackexchange.com/questions/90362/where-can-i-post-code-for-others-to-review – ScootCork Jun 27 '20 at 12:42
  • In any case, please include sample input and expected output (for all the rows in your CSV) as text as part of your question. – Roy2012 Jun 27 '20 at 12:55
  • @Roy2012 I have included example input, example output is explained. – PythonIsBae Jun 27 '20 at 13:05

1 Answer


You could simplify the time arithmetic by using timedelta. The time you parse from the string can be converted to a timedelta object by subtracting the date part from it (strptime adds a default date when it creates the datetime object). Multiply that timedelta by 0.15 to get the 15% to add to the original datetime object. Example:

from datetime import datetime, timedelta

dt = datetime.strptime('00:01:40.00', "%H:%M:%S.%f")
add = (dt - datetime(*dt.timetuple()[:3]))*0.15
dt_new = dt + add

print(dt_new.time())
# 00:01:55

By the way, I'd also suggest using pandas for what you want to do (whenever you have data that fits nicely in a table...). But I'd use the same concept there (timedelta), so it won't hurt to gain the experience of doing this in pure Python.
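As a rough illustration of how the timedelta idea could be applied to whole rows, here is a minimal sketch using only the standard library. It is not the asker's full script: the helper names `parse_time` and `apply_time_cut` are hypothetical, the sample rows other than Tom Smith are made up, and it assumes the first row of a category is the winner. The points calculation is left out, so only riders over the cut-off get a `Points` value here.

```python
import csv
import io
from datetime import datetime, timedelta

# Hypothetical sample data in the question's format
# (only the Tom Smith row comes from the question).
SAMPLE = """Category,Position,Name,Time
A,1,Tom Smith,00:45:01.23
A,2,Example Rider,00:50:00.00
A,3,Another Rider,00:52:30.00
B,1,Third Rider,00:55:00.00
"""


def parse_time(text):
    """Parse 'HH:MM:SS.ff' into a timedelta (an elapsed duration)."""
    dt = datetime.strptime(text, "%H:%M:%S.%f")
    return timedelta(hours=dt.hour, minutes=dt.minute,
                     seconds=dt.second, microseconds=dt.microsecond)


def apply_time_cut(rows, category="A", factor=1.15):
    """Return copies of rows, marking riders in `category` over the cut-off.

    Assumption: the first row seen for the category is the winner.
    """
    cutoff = None
    result = []
    for row in rows:
        row = dict(row)  # copy so the input rows are not modified
        if row["Category"] == category:
            elapsed = parse_time(row["Time"])
            if cutoff is None:
                cutoff = elapsed * factor  # winner's time plus 15%
            if elapsed > cutoff:
                row["Points"] = 0
                row["Position"] = "DQ Time-Cut"
        result.append(row)
    return result


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
processed = apply_time_cut(rows)
for row in processed:
    print(row)
```

Working with timedelta throughout means no conversion to seconds and back to a string is needed, and the comparison `elapsed > cutoff` works directly on durations.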

FObersteiner
  • This is a good method of getting `dt_new.time`, but it doesn't work with this line: `if cat == "A" and datetime.strptime(row[3], "%H:%M:%S.%f") > MaxTime:`. As you say, `dt_new.time` includes the date part but the CSV doesn't, so you need to convert one or the other. – PythonIsBae Jun 27 '20 at 13:43
  • @PythonIsBae: to clarify, datetime.strptime will *always* add a date part. If it's not defined in the input string, a default is used (1900-1-1). I used `dt_new.time()` only for the print statement, which might have been misleading. For the comparisons in your code, compare full datetime objects; there is no need to strip the date part. – FObersteiner Jun 27 '20 at 13:50
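The point made in the last comment can be checked with a small sketch. The variable names and the rider time `00:02:00.00` are made up for illustration; the winner's time is the one from the answer's example. Both values come from strptime with the same format, so both carry the same default date and compare directly.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"

# Winner's time from the answer's example; strptime fills in a default date.
winner = datetime.strptime("00:01:40.00", FMT)
midnight = winner.replace(hour=0, minute=0, second=0, microsecond=0)

# Cut-off: winner's time plus 15% of the elapsed duration.
max_time = winner + (winner - midnight) * 0.15

# Hypothetical rider time parsed the same way.
rider = datetime.strptime("00:02:00.00", FMT)

print(winner.date())     # 1900-01-01
print(max_time)          # 1900-01-01 00:01:55
print(rider > max_time)  # True
```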