
I have the following data:

ProductName    01/01/2016    01/07/2016    01/14/2017
ABC              12             34            51
XYZ               9             76            12
PQR              12             23             7
DEF              54              4            34

I want to plot a time-series scatterplot showing the total sales on each day. I have created the following function:

def scatterplot(x_data, y_data, x_label, y_label, title):
    _, ax = plt.subplots()
    ax.scatter(x_data, y_data, s=30, color='#539caf', alpha=0.75)

    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)

I am not sure how to call this function to get the desired result. The plot should show the date on the x-axis and the total sales on the y-axis.

ImportanceOfBeingErnest
ComplexData

1 Answer


If your data is in a pandas DataFrame, you may take the column headers as x values and the sum of the data along the vertical axis (i.e. the total number of products sold that day) as y values.

import pandas as pd
import matplotlib.pyplot as plt

# Replicate the data from the question in a DataFrame
v = [[12, 34, 51], [9, 76, 12], [12, 23, 7], [54, 4, 34]]
df = pd.DataFrame(v, columns=["01/01/2016", "01/07/2016", "01/14/2017"],
                  index=["ABC", "XYZ", "PQR", "DEF"])
print(df)


def scatterplot(x_data, y_data, x_label, y_label, title):
    fig, ax = plt.subplots()
    ax.scatter(x_data, y_data, s = 30, color = '#539caf', alpha = 0.75)

    ax.set_title(title)
    ax.set_xlabel(x_label)
    ax.set_ylabel(y_label)
    fig.autofmt_xdate()

# Use the column headers, parsed as dates, as x values
x = pd.to_datetime(df.columns, format='%m/%d/%Y')
# Sum all values of the DataFrame along the vertical axis (one total per date)
y = df.values.sum(axis=0)
scatterplot(x, y, "Date", "Total sales", "Total sales per day")

plt.show()
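
To check the numbers before plotting, the same totals can also be computed with pandas' own `DataFrame.sum`, which keeps each date attached to its total. This is only a minimal sketch using the data from the question; the names `totals`, `x` and `y` are chosen here for illustration.

```python
import pandas as pd

# Same data as in the question: one row per product, one column per date.
df = pd.DataFrame(
    [[12, 34, 51], [9, 76, 12], [12, 23, 7], [54, 4, 34]],
    columns=["01/01/2016", "01/07/2016", "01/14/2017"],
    index=["ABC", "XYZ", "PQR", "DEF"],
)

# Parse the column headers into real dates so the x-axis is spaced by time.
df.columns = pd.to_datetime(df.columns, format="%m/%d/%Y")

# Sum down each column: one total per date, as a Series indexed by date.
totals = df.sum(axis=0)
print(totals)

# These can be passed to scatterplot() in place of x and y above.
x = totals.index
y = totals.to_numpy()
```

The totals for the three dates are 87, 137 and 104. Note that the third date is in 2017, so that point will sit far to the right of the other two on a date axis.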


ImportanceOfBeingErnest