I'm trying to write a Python program that plots the following Covid-19 data, all read from a CSV file:
- cases per day
- total cases
- deaths per day
- total deaths
I'm reading the data from the file and populating four lists with the values (plus a fifth list for the dates). I can iterate over the lists and print out the data, and it all looks OK.
The problem is the graph that gets produced: the data looks OK for the first 25 entries or so, and then all four lines rise in parallel. The scale on the y-axis is wrong too. It seems to restart for each of the four data sets.
Here's the code:
import csv
import matplotlib.pyplot as plt
from datetime import datetime

#create the arrays to hold virus data
caseDate, cases, casesCum, deaths, deathsCum = [], [], [], [], []
leg1 = "Cases"
leg2 = "Cases - Cum"
leg3 = "Deaths"
leg4 = "Deaths - Cum"

#open virus data file
filename = 'Virus Data.csv'
with open(filename) as f:
    reader = csv.reader(f)
    header_row = next(reader)
    #record data in arrays
    for row in reader:
        current_date = datetime.strptime(row[0], "%d/%m/%Y")
        caseDate.append(current_date)
        cases.append(row[1])
        casesCum.append(row[2])
        deaths.append(row[3])
        deathsCum.append(row[4])

for i in range(len(cases)):
    print("Entry ", i, ": Cases: ", cases[i], "; Cum Cases: ", casesCum[i], "; Deaths: ", deaths[i], "; Cum Deaths: ", deathsCum[i])
#data now recorded

#plot data in chart
fig = plt.figure(dpi=128, figsize=(12, 6))
plt.plot(caseDate, cases, c="red", alpha=0.5)
plt.plot(caseDate, casesCum, c="blue", alpha=0.5)
plt.plot(caseDate, deaths, c="green", alpha=0.5)
plt.plot(caseDate, deathsCum, c="black", alpha=0.5)

#format plot
title = "Covid-19 Statistics"
plt.title(title, fontsize=20)
plt.xlabel("", fontsize=16)
fig.autofmt_xdate()
plt.ylabel("Cases / Cum Cases / Deaths / Cum Deaths", fontsize=12)
plt.tick_params(axis='both', which='major', labelsize=16)
plt.show()
It's almost there, but not quite. When I create the chart in Excel from the same file, it looks the way I expect.
I'd like to get the same result in Python.
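
One likely cause, assuming the counts reach the plot exactly as they were read: csv.reader returns every field as a string, and Matplotlib treats a list of strings as categories, placing each distinct value at the next position in order of first appearance instead of on a numeric scale. That would fit both symptoms (lines climbing in parallel, and a y-axis whose labels appear to restart for each series). Below is a minimal sketch of the reading step with the values converted to int. The sample rows and header names are made up for illustration; only the column order (date, cases, cumulative cases, deaths, cumulative deaths) is taken from the code above, and load_virus_data is a hypothetical helper name.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows standing in for 'Virus Data.csv' (made-up values).
SAMPLE_CSV = """Date,Cases,Cum Cases,Deaths,Cum Deaths
01/03/2020,5,5,0,0
02/03/2020,9,14,1,1
03/03/2020,12,26,0,1
"""


def load_virus_data(f):
    """Read the CSV and return the dates plus four lists of integers."""
    reader = csv.reader(f)
    next(reader)  # skip the header row
    case_date, cases, cases_cum, deaths, deaths_cum = [], [], [], [], []
    for row in reader:
        case_date.append(datetime.strptime(row[0], "%d/%m/%Y"))
        # csv.reader yields strings, so convert each count to a number
        cases.append(int(row[1]))
        cases_cum.append(int(row[2]))
        deaths.append(int(row[3]))
        deaths_cum.append(int(row[4]))
    return case_date, cases, cases_cum, deaths, deaths_cum


# With the real file this would be: with open('Virus Data.csv') as f: ...
case_date, cases, cases_cum, deaths, deaths_cum = load_virus_data(
    io.StringIO(SAMPLE_CSV)
)
print(cases_cum)
```

With numeric lists, the plt.plot calls can stay as they are and all four series should share one continuous y-axis. If the file contains blank cells, decimals, or thousands separators, int() will raise a ValueError, so those cases would need float() or some cleaning first.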