I am trying to plot the average return by age for different companies. I want the age values to be evenly spaced along the x-axis, even though the step size between them increases and is not uniform. My goal looks like this:
fig, ax = plt.subplots(figsize=(10,6), sharex=True, sharey=True)
ax.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1980-1989") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color="r")
ax2 = ax.twinx()
ax2.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1990-1998") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color = "g")
ax3 = ax.twinx()
ax3.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "1999-2000") & (ipo_cut["Age"] >= 2)].groupby("Age").mean(), color="grey")
ax4 = ax.twinx()
ax4.plot(ipo_cut[["IR", "Age", "Timespan"]][(ipo_cut["Timespan"] == "2001-2003") & (ipo_cut["Age"] >= 2)].groupby("Age").mean())
ax.set(title ="Average First-day Returns by Age of Firm at Time of IPO",
xlabel = "Age",
ylabel = "Average First-Day Return in %")
ax.set_ylim([0,1])
ax2.set_ylim([0,1])
ax3.set_ylim([0,1])
ax4.set_ylim([0,1])
ax.set_xlim([0,70])
ax2.set_xlim([0,70])
ax3.set_xlim([0,70])
ax4.set_xlim([0,70])
ax2.get_yaxis().set_visible(False)
ax3.get_yaxis().set_visible(False)
ax4.get_yaxis().set_visible(False)
plt.xticks([0,1,2,3,4,5,6,7,8,9,10, 11,12,13,15,17,20,30,40,50,60,70])
plt.setp(ax.get_xticklabels(), rotation = 90)
fig.show()
and the output looks like the following:
Based on other posts I could display only the desired years of "Age", but the axis is unfortunately not evenly distributed. I feel like I am really close to a solution, but could not find an answer by myself.
A solution in seaborn would also be very helpful to me.
EDIT with some additional information: The shapes of the x-vector and the data do not match. I have filtered my dataframe for only the relevant ages, but some information is missing. Here is a minimal reproducible example:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
#create dataframe
x = np.arange(0,11)
df = pd.DataFrame(data = np.random.rand(len(x)), index = x, columns=["IR"])
df.index.name = "Age"
df.loc[[0, 9], "IR"] = np.nan
df.dropna(inplace = True)
# open figure
fig, ax = plt.subplots(figsize=(10,6))
# create x-values
x = [0,1,2,5,7,9,10]
# plot
ax.plot(x, df, color="r")
ax.set(
title ="Average First-day Returns by Age of Firm at Time of IPO",
xlabel = "Age",
ylabel = "Average First-Day Return in %",
ylim = [0,1],
xlim = [0,10])
ax.tick_params(axis='x', labelrotation=90)
fig.show()
My goal is to plot the whole dataframe without gaps. So the idea is that I have the line plot in the background with the needed x-values on the x-axis. I hope this information helps.
SOLUTION:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
#create dataframe
x = np.arange(0,10)
df = pd.DataFrame(data = np.random.rand(len(x)), index = x, columns=["IR"])
df.index.name = "Age"
#create nan in sample
df.loc[[0, 9], "IR"] = np.nan
# slice data for unbroken line (solution by max)
lg = df["IR"].isna()
# create ticks for even distribution
n = 10
a = np.arange(n)
# open figure
fig, ax = plt.subplots(figsize=(10,6))
# create x-values for tick labels
x = (0,5,10,13,15,24,30,40,55,70)
# plot
ax.plot(a[~lg], df[~lg], color="r")
ax.set(
title ="Average First-day Returns by Age of Firm at Time of IPO",
xlabel = "Age",
ylabel = "Average First-Day Return in %",
ylim = [0,1],
xlim = [0,10])
ax.xaxis.set_ticks(a) #set the ticks to be a
ax.xaxis.set_ticklabels(x) # change the ticks' names to x
ax.tick_params(axis='x', labelrotation=90)
fig.show()
Thanks to the input from max and the post "How to make ticks evenly spaced despite their value?", I found a solution to my problem. If your original dataframe has x-values with a step size other than one, I suggest simply calling reset_index().
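The reset_index() suggestion could look roughly like the sketch below. The ages and random returns are made-up sample data, and the Agg backend line is only there so the snippet runs without a display; neither comes from my real dataframe.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# hypothetical sample: ages with uneven steps and one missing return
ages = [0, 5, 10, 13, 15, 24, 30, 40, 55, 70]
rng = np.random.default_rng(0)
df = pd.DataFrame({"IR": rng.random(len(ages))},
                  index=pd.Index(ages, name="Age"))
df.loc[13, "IR"] = np.nan

# reset_index() moves "Age" into a column and leaves a 0..n-1 index,
# which is used as the evenly spaced x-position of each age
flat = df.reset_index()
valid = flat.dropna(subset=["IR"])  # skip NaN rows so the line is unbroken

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(valid.index.to_numpy(), valid["IR"].to_numpy(), color="r")
ax.set_xticks(flat.index.to_numpy())                   # one tick per age, evenly spaced
ax.set_xticklabels(flat["Age"].astype(str).tolist())   # label each tick with the real age
ax.tick_params(axis="x", labelrotation=90)
```

The line is drawn against the row positions, so the distance between two ticks no longer depends on the difference between the ages; only the labels carry the real values.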