I am trying to plot this data with matplotlib.

I am able to get all of the data and make a graph of jones vs time, but my scale becomes wacky when I try to use the falls data set.

The graph and code I created are below.

graph

import gspread
from oauth2client.service_account import ServiceAccountCredentials
import pandas as pd
from pandas import DataFrame
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']

credentials = ServiceAccountCredentials.from_json_keyfile_name(
         'pandastest-273019-00f3c09845fb.json', scope) # Your json file here

gc = gspread.authorize(credentials)

wks = gc.open("data").sheet1

data = wks.get_all_values()
headers = data.pop(0)

df = pd.DataFrame(data, columns=headers)
pd.set_option("display.max_rows",None,"display.max_columns", None)
print(df.head())

dates = []
fort = []
jones = []
pat = []
ferry = []
middle = []
north = []
canton = []
for x in data[:]:
    for i in range(0, len(x)):
        try:
            x[i] = int(x[i])
        except:
            continue

    print(x)
    dates.append(x[0])
    fort.append(x[1])
    jones.append(x[2])
    pat.append(x[3])
    ferry.append(x[4])
    middle.append(x[5])
    north.append(x[6])
    canton.append(x[7])

#fort
x = [dt.datetime.strptime(d,'%m/%d/%Y').date() for d in dates]
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%m/%d/%Y'))
plt.gca().xaxis.set_major_locator(mdates.DayLocator(interval=29))
plt.plot(x,fort)
plt.gcf().autofmt_xdate()
plt.legend(["fort", ])
plt.xlabel("Dates")
plt.ylabel("Matter")
plt.show()

Does anyone know why this is happening and how I can fix it?

William Miller
Alexis B.
  • Have you tried with only parts of the dataset to see whether you still get the weirdness? My only idea is it's because of the missing data point, but I'd hope pandas would know how to deal with that – Nathan majicvr.com Apr 03 '20 at 07:26

1 Answer

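Your y-axis values are strings; they need to be converted to numeric values. The bare try/except in your loop silently leaves any cell that int() cannot parse (for example a decimal or an empty cell) as a string. Convert explicitly instead, something like this: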

for x in data[:]:
    # ...
    fort.append(int(x[1]))
    # ...
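
One wrinkle with a plain int() call is that it raises on blank cells and on decimals. Here is a minimal sketch of a more tolerant conversion. It assumes that unparseable cells should become NaN, which matplotlib draws as a gap in the line; that is a guess about what you want, not something the question states. The helper name to_number and the sample row are made up for illustration.

```python
import math

def to_number(cell):
    """Convert a spreadsheet cell (a string) to float; return NaN if it cannot be parsed."""
    try:
        return float(cell)
    except (TypeError, ValueError):
        # Blank or non-numeric cell: keep a placeholder so the list stays aligned with the dates.
        return math.nan

# Hypothetical row in the same shape as one entry of `data`: date first, then readings.
row = ["4/1/2020", "12", "", "3.5"]
values = [to_number(c) for c in row[1:]]
```

Inside the loop this would read fort.append(to_number(x[1])), and likewise for the other columns.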

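Of course, there is a much easier way to approach this problem: once the data has been read into a pandas DataFrame, you can extract the column data directly. Cells returned by get_all_values() are strings, so convert them to numbers as you extract them:

x = pd.to_datetime(df["Date"], format="%m/%d/%Y")
# errors="coerce" turns blank or non-numeric cells into NaN instead of raising
fort = pd.to_numeric(df["Fort McHenry Channel"], errors="coerce").values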

In either case the output will be

(image: the resulting plot)
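
To check that a column really comes out numeric before plotting, here is a small sketch with made-up rows in the shape that wks.get_all_values() returns, where every cell is a string. The sample values are invented; only the column names "Date" and "Fort McHenry Channel" are taken from the snippet above. It uses pd.to_numeric with errors="coerce", which turns blank cells into NaN.

```python
import pandas as pd

# Hypothetical rows shaped like wks.get_all_values(): a header row, then string cells.
data = [
    ["Date", "Fort McHenry Channel"],
    ["04/01/2020", "12"],
    ["04/02/2020", ""],      # a missing reading
    ["04/03/2020", "15"],
]
headers = data.pop(0)
df = pd.DataFrame(data, columns=headers)

# Parse the dates and coerce the readings to floats; the blank cell becomes NaN.
x = pd.to_datetime(df["Date"], format="%m/%d/%Y")
fort = pd.to_numeric(df["Fort McHenry Channel"], errors="coerce")

print(fort.dtype)  # a float dtype means the y-axis will be scaled numerically
```

If the printed dtype is object rather than a float type, the values are still strings and matplotlib will treat them as categories, which is what scrambles the y-axis.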

William Miller