This is a follow-up to a question I asked previously, this time about how to change the x ticks.

Multiindex scatter plot

Let me repeat the set-up, so you don't have to go to the link. Suppose I have the following data:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = {'Value': {('1', 1): 3.0,
                  ('1', 2): 4.0,
                  ('1', 3): 51.0,
                  ('1', 4): 10.0,
                  ('1', 5): 2.0,
                  ('1', 6): 17.0,
                  ('1', 7): 14.0,
                  ('1', 8): 7.0,
                  ('1', 9): 2.0,
                  ('1', 10): 1.0}}
df = pd.DataFrame(data)

Let's say this represents values for something for the first ten days in January. I want to plot this data, so I use:

df.plot()
plt.show()

Now, suppose I have another data set that has values for a subset of these dates with slightly different values but the same index values:

df1 = df[df['Value'] < 10].copy()  # copy so the assignment below does not act on a view of df
df1['Value'] = df1['Value'] * 2

Per the answer, I can overlay a scatter plot as:

ax = df.plot()
df1.reindex(df.index).plot(marker='o',linestyle='none',color='g', ax=ax)
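To see what the `reindex` call contributes, here is a minimal sketch that rebuilds the same ten values and prints the realigned subset. Building the index with `pd.MultiIndex.from_product` and the name `aligned` are my own choices for illustration, not part of the original answer.

```python
import pandas as pd

# Same ten values as above, indexed by (month, day).
index = pd.MultiIndex.from_product([["1"], range(1, 11)])
values = [3.0, 4.0, 51.0, 10.0, 2.0, 17.0, 14.0, 7.0, 2.0, 1.0]
df = pd.DataFrame({"Value": values}, index=index)

# Subset with doubled values; .copy() avoids assigning into a view of df.
df1 = df[df["Value"] < 10].copy()
df1["Value"] = df1["Value"] * 2

# Reindexing restores all ten rows. Days missing from df1 become NaN,
# so no marker is drawn for them and the others keep their x positions.
aligned = df1.reindex(df.index)
print(aligned)
```

Without the `reindex`, `df1` has only six rows, so its points would be plotted at positions 0 to 5 instead of at the days they belong to.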

In a more general example, where the x-axis represents the 365 days of the year (in non-leap years), how can I get the x ticks to represent the first day of each month? The best solution I could come up with is:

plt.xticks(np.arange(0,365,30),['1/1','2/1','3/1','4/1','5/1','6/1','7/1','8/1','9/1','10/1','11/1','12/1'])

Of course, this isn't accurate, since the months aren't all 30 days long. Is there an easier, more accurate way?

user21359

1 Answer

Assuming that I understand you, it's not too difficult. Most of the work is done by matplotlib rather than pandas, which means you have considerable latitude in setting ticks.

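import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt

# One value per month, dated on the 1st of each month of 2017.
df = pd.DataFrame(
    data={
        'Values': [3, 4, 5, 6, 2, 7, 6, 3, 4, 2, 4, 5],
        'Months': [datetime(2017, m, 1) for m in range(1, 13)]
    }
)
# Plot only 'Values' against the default integer index 0..11;
# recent pandas versions may otherwise also draw the datetime column.
ax = df.plot(y='Values')
# One tick per row, labelled with the abbreviated month name.
ax.set_xticks(list(range(12)))
ax.set_xticklabels([d.strftime('%b') for d in df['Months']])
plt.show()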

You also asked about the precision of tick placement. In the code above the tick positions are simply a linear function of the month number. In a plot, there is almost nothing to be gained from doing anything more sophisticated.

Here I make a more careful calculation of tick positions based on the lengths of the months and the total number of days between the 1st of January and the 1st of December, and compare this with the simpler calculation used in the plot. There's little difference.

>>> for m in range(1, 13):
...     '%.3f %.3f' % ((m-1)/11, (datetime(2017,m,1)-datetime(2017,1,1)).days/334)
...     
'0.000 0.000'
'0.091 0.093'
'0.182 0.177'
'0.273 0.269'
'0.364 0.359'
'0.455 0.452'
'0.545 0.542'
'0.636 0.635'
'0.727 0.728'
'0.818 0.817'
'0.909 0.910'
'1.000 1.000'
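If the x-axis really does hold one position per day, as in the question, the exact tick positions are easy to compute with `datetime` alone. This is a minimal sketch; the helper name `month_start_offsets` is hypothetical, and it assumes position 0 on the axis corresponds to January 1.

```python
from datetime import date


def month_start_offsets(year):
    """Zero-based day-of-year position of the 1st of each month."""
    jan1 = date(year, 1, 1)
    return [(date(year, month, 1) - jan1).days for month in range(1, 13)]


# 2017 is a non-leap year, matching the 365-day axis in the question.
offsets = month_start_offsets(2017)
labels = [f"{month}/1" for month in range(1, 13)]

print(offsets)
# [0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334]
```

These lists can be passed to `ax.set_xticks(offsets)` and `ax.set_xticklabels(labels)` in place of the evenly spaced positions, provided the data are plotted against integer positions 0 to 364.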

Here's the plot.

[Figure: Monthly ticks]
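As an alternative to positioning ticks by hand, matplotlib's own date locators can place them when the x values are real dates. The sketch below is a different technique from the code above: it plots with matplotlib directly rather than through `df.plot()`, and the daily values are made-up placeholder data.

```python
from datetime import date, timedelta

import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Placeholder data: one arbitrary value per day of 2017 (a non-leap year).
days = [date(2017, 1, 1) + timedelta(days=i) for i in range(365)]
values = [(i * 7) % 50 for i in range(365)]

fig, ax = plt.subplots()
ax.plot(days, values)
ax.set_xlim(days[0], days[-1])

# Put a major tick on the 1st of every month, labelled with the month name.
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonthday=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b"))
```

Call `plt.show()` afterwards to display the figure. Because the locator works on actual dates, the uneven month lengths are handled without any arithmetic on your part.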

Bill Bell