I have a csv file with dates in format M/D/YYYY from 1948 to 2017. I'm able to plot other columns/lists associated with each date by list index. I want to be able to ask the user for a start date, and an end date, then return/plot the data from only within that period.

The problem is that the dates read in from the csv are strings, so I cannot use if date[x] >= startDate and date[x] <= endDate, because I don't see a way to turn dates in this format into integers.

Here is my csv file

I am already able to read in the dates from the csv to its own list.

How can I take the dates in my list and only return the ones within the user specified date range?

Here is my function for plotting the entire dataset right now:

from matplotlib import pyplot

#CSV Plotting function
def CSV_Plot(data, header, column1, column2):

  # Scatter plot of column1 (x axis) against column2 (y axis)
  #pyplot.plot([item[column1] for item in data], [item[column2] for item in data])
  pyplot.scatter([item[column1] for item in data], [item[column2] for item in data])
  pyplot.xlabel(header[column1])
  pyplot.ylabel(header[column2])
  pyplot.show()

  return True

CSV_Plot(mycsvdata, data_header, dateIndex, rainIndex)

This is how I am asking the user to input the start and end dates:

#Ask user for start and end dates in M/D/YYYY format
startDate = input('Please provide the start date (M/D/YYYY) of the period for the data you would like to plot: ')
endDate = input('Please provide the end date (M/D/YYYY) of the period for the data you would like to plot: ')
Ryan Brad

1 Answer

You need to compare the dates.

I would suggest parsing the dates from your CSV into datetime objects, and also turning the user input values into datetime objects.

How do you create a datetime object from a string? Specify the format string, and strptime() will parse it for you. Details here: Converting string into datetime

In your case, it could be something like

from datetime import datetime

# Considering date is in M/D/YYYY format
datetime_object1 = datetime.strptime(date_string, "%m/%d/%Y")

Then you can compare them with the usual comparison operators (>, <, >=, <=). Here you can find details of how to compare the dates.
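To make the comparison step concrete, here is a minimal sketch of filtering rows by a date range. It assumes each row is a list with the date string at a known index, as in the question; the names parse_date, filter_by_date and sample_rows, and the sample values themselves, are made up for illustration. In a real script, start and end would come from the input() prompts rather than literals.

```python
from datetime import datetime

# Assumed format of the date strings (M/D/YYYY); change it if your data differs.
DATE_FORMAT = "%m/%d/%Y"

def parse_date(text, fmt=DATE_FORMAT):
    """Turn a date string into a datetime object."""
    return datetime.strptime(text.strip(), fmt)

def filter_by_date(rows, date_index, start, end, fmt=DATE_FORMAT):
    """Return the rows whose date lies between start and end, inclusive."""
    return [
        row for row in rows
        if start <= parse_date(row[date_index], fmt) <= end
    ]

# Hypothetical sample rows: [date, rainfall]
sample_rows = [
    ["1/1/1948", "0.47"],
    ["6/15/1980", "0.00"],
    ["12/14/2017", "0.10"],
]

start = parse_date("1/1/1950")    # would normally be parse_date(startDate)
end = parse_date("12/31/2000")    # would normally be parse_date(endDate)

selected = filter_by_date(sample_rows, 0, start, end)
print(selected)
```

The filtered list can then be passed to the plotting function in place of the full dataset.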

Filip Kubicz
  • Thank you for the quality answer. Do you think it would be better for me to do this within my plotting function's list comprehension, or should I convert the entire list into datetime objects and then run my function against that? – Ryan Brad Apr 25 '21 at 17:47
  • It depends on whether you only need to plot the data or want to keep this filtered range for other operations. If you just filter it for plotting, doing it in the list comprehension is fine. If you perform other operations on it, it will be better to structure your data somehow, for example as a dictionary where the datetime is the key and the value is a namedtuple holding the rest of the row. Take a look at https://www.geeksforgeeks.org/namedtuple-in-python/ – Filip Kubicz Apr 25 '21 at 18:16
  • In your case it could be e.g. `Weather = namedtuple('Weather', ['prcp', 'tmax', 'tmin', 'rain'])` and then you will be able to access e.g. the rain value like this: `data[datetime1].rain` – Filip Kubicz Apr 25 '21 at 18:16
  • Thanks Filip. Lastly, can someone help me understand the format I need to use for my datetime object? The dates in my list are of the format '1948-01-01' and I get the following error: `ValueError: time data '1948-01-01' does not match format '%m/%d/%Y'` – Ryan Brad Apr 25 '21 at 19:05
  • I think the format for '1948-01-01' could be `%Y-%m-%d` -> please check if month or day is first in your data. – Filip Kubicz Apr 26 '21 at 10:51
  • Please see https://docs.python.org/3/library/datetime.html#strftime-and-strptime-format-codes – Filip Kubicz Apr 26 '21 at 11:09
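The suggestions in the comments above (a dictionary keyed by datetime with namedtuple values, and the `%Y-%m-%d` format for dates like '1948-01-01') could be combined roughly as follows. This is only a sketch: the rows and their values are invented for illustration, and the column order is an assumption based on the field names in the `Weather` namedtuple.

```python
from collections import namedtuple
from datetime import datetime

Weather = namedtuple("Weather", ["prcp", "tmax", "tmin", "rain"])

# Hypothetical rows in the assumed order: date, prcp, tmax, tmin, rain
rows = [
    ["1948-01-01", "0.47", "51", "42", "TRUE"],
    ["1948-01-02", "0.59", "45", "36", "TRUE"],
    ["1948-01-03", "0.00", "45", "35", "FALSE"],
]

# Build a dictionary keyed by datetime; year comes first, so use %Y-%m-%d.
data = {}
for date_text, prcp, tmax, tmin, rain in rows:
    key = datetime.strptime(date_text, "%Y-%m-%d")
    data[key] = Weather(float(prcp), int(tmax), int(tmin), rain == "TRUE")

# Keep only the entries inside the requested period (inclusive).
start = datetime(1948, 1, 2)
end = datetime(1948, 1, 3)
in_range = {day: w for day, w in data.items() if start <= day <= end}

for day, weather in sorted(in_range.items()):
    print(day.date(), weather.prcp, weather.rain)
```

Keeping the parsed dates as keys means the strings are converted once, and the same dictionary can serve both plotting and any later calculations.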