
I have multiple time series in different files, and I know that pandas can infer the frequency of the DatetimeIndex for each:

pd.infer_freq(data.index)

Is there a programmatic way to get the approximate number of periods per year for whatever frequency string comes back? For instance:

'M' -> 12
'BM' -> 12
'B' -> 252
'D' -> 365
rhaskett
  • https://stackoverflow.com/questions/54945550/pandas-datetimeindex-number-of-periods-in-a-frequency-string is close but I'm not sure how to adapt the answer. – rhaskett Feb 08 '21 at 17:59
  • Though this is a cool question, there are only a handful of common frequencies, so in practice it may be easier to work out the values yourself and store them in a dict in some script. The solution I proposed is a bit overkill and won't always give the number you want, but if you pass a `calendar` to `pd.bdate_range` to exclude holidays, you should be able to adapt it to get that number. – ALollz Feb 08 '21 at 18:42
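The dict approach suggested in the comment above could look roughly like the sketch below. The table name `PERIODS_PER_YEAR` and the helper `periods_per_year` are hypothetical, the values are the conventional approximations from the question, and the 'ME', 'BME', 'QE' and 'YE' entries assume the alias spellings used by newer pandas releases. Anchored aliases such as 'W-SUN' are reduced to their base; multiples such as '2D' are not handled.

```python
# Hypothetical lookup table: conventional approximate periods per year.
# 'ME', 'BME', 'QE', 'YE' are assumed to be the newer spellings of 'M', 'BM', 'Q', 'A'.
PERIODS_PER_YEAR = {
    "D": 365, "B": 252, "W": 52,
    "M": 12, "ME": 12, "BM": 12, "BME": 12,
    "Q": 4, "QE": 4,
    "A": 1, "Y": 1, "YE": 1,
}


def periods_per_year(freq):
    """Look up a frequency alias, ignoring any anchor suffix like '-SUN' or '-DEC'."""
    base = freq.split("-")[0]
    return PERIODS_PER_YEAR[base]  # raises KeyError for aliases not in the table


print(periods_per_year("B"))      # 252
print(periods_per_year("W-SUN"))  # 52
```

The string returned by `pd.infer_freq(data.index)` can be passed straight to `periods_per_year`, as long as it is not `None`.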

1 Answer


Here's one alternative. Create a date_range with the given frequency, then group by year and take the most common number of periods that fall within a single year. The periods argument should be large enough that, for the given frequency, the range spans many years. The default rarely needs changing unless the frequency is very fine, such as ns (and for those it is more efficient to calculate the number by hand).

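import pandas as pd

def infer_periods_in_year(freq, periods=10**4):
    """
    freq : str, pandas frequency alias.
    periods : int, large enough that the given freq spans many years.
    """

    while True:
        try:
            s = pd.Series(data=pd.date_range('1970-01-01', freq=freq, periods=periods))
            break
        # If periods is too large for the representable date range, shrink and retry
        except (pd.errors.OutOfBoundsDatetime, OverflowError, ValueError):
            periods //= 10  # integer division keeps periods an int
            if periods == 0:
                raise  # e.g. an invalid freq string; avoid looping forever

    # Count periods per calendar year and return the most common count
    return s.groupby(s.dt.year).size().value_counts().index[0]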

infer_periods_in_year('D')
#365
infer_periods_in_year('BM')
#12
infer_periods_in_year('M')
#12
infer_periods_in_year('B')
#261
infer_periods_in_year('W')
#52
infer_periods_in_year('min', periods=10**7)
#525600
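Since the question starts from data that is already loaded, a related option is to skip the frequency string and count the timestamps per calendar year in the index itself. This is only a sketch of that idea, not part of the approach above: `observed_periods_per_year` is a hypothetical name, and the result reflects gaps or partial first and last years in the actual data.

```python
import pandas as pd


def observed_periods_per_year(index):
    """Most common number of timestamps per calendar year in a DatetimeIndex."""
    # Count how many timestamps fall in each year (year -> count)
    per_year = pd.Series(index.year).value_counts()
    # Take the most frequent count; on ties, mode() sorts ascending, so the smallest wins
    return int(per_year.mode().iloc[0])


daily = pd.date_range("2000-01-01", "2009-12-31", freq="D")
print(observed_periods_per_year(daily))  # 365 (seven of the ten years are non-leap)
```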
ALollz