I have Daily Crude oil prices downloaded from FRED, about 10k observations, some values are blank(code cleans them). I believe that I cannot share excel sheets here, so I will just give you a screenshot of what the data looks like:
I calculate the differences and returns and clean up the data but I am kind of stuck.
Here is what the code looks like to get you started:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
data = pd.read_csv("DCOILWTICO.csv")
nan_value = float("NaN")
data.replace("", nan_value, inplace=True)
data.replace(".", nan_value, inplace=True)
data['Previous'] = data['DCOILWTICO'].shift(1)
data.dropna(subset=['Previous'],inplace=True)
data.replace("", nan_value, inplace=True)
data.replace(".", nan_value, inplace=True)
data['DCOILWTICO'] = data['DCOILWTICO'].astype(float)
data['Previous'] = data['Previous'].astype(float)
data['Diff'] = data['DCOILWTICO'] - data['Previous']
data['Return'] = (data['DCOILWTICO'] - data['Previous'])/data['Previous']
Here comes the question: I am trying to duplicate the graph below.(which I believe was generated using Mathematica) The difficult part is to be able to create the bins in the right way. Looking at the graph it looks like there are around 200 bins. On the x-axis are the returns and on the y axis are the frequencies(which have been binned).