
I am trying to find a package that performs temporal disaggregation of time series. R has a package for this called tempdisagg:

https://journal.r-project.org/archive/2013/RJ-2013-028/RJ-2013-028.pdf

Is anyone aware of a similar package in Python?

If no such package exists in Python, is there an example showing how to call the functions of that R package from Python?

Nimantha
rsc05
  • A general approach I recommend is to use R magics in Jupyter notebooks using rpy2. You just import input dataframes from Python to R and then output from R to Python, while the package-specific code is still written in R. – krassowski Feb 04 '20 at 16:09
  • @krassowski Do you have a good tutorial on this package, or a similar one, that shows how to do so? – rsc05 Feb 04 '20 at 16:28

1 Answer


I've created an open-source Python package called timedisagg that is based on the R tempdisagg package. It implements the basic Chow-Lin and Litterman methods and, like the R package, supports the average, sum, first and last conversion choices.

Given the following function call in R to disaggregate sales.a as a function of exports.q:

model <- td(sales.a ~ 0 + exports.q, method = "chow-lin-maxlog", conversion = "sum")

A similar call can be made using timedisagg as below:

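from timedisagg.td import TempDisagg

# Same conversion and method choices as in the R call above
td_obj = TempDisagg(conversion="sum", method="chow-lin-maxlog")
# expected_dataset is a pandas dataframe in the format shown below
final_disaggregated_output = td_obj(expected_dataset)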

where the expected_dataset is a pandas dataframe with the following format:

      index  grain            X            y
0     1972      1   1432.63900          NaN
1     1972      2   1456.89100          NaN
2     1972      3   1342.56200          NaN
3     1972      4   1539.39400          NaN
4     1973      1   1535.75400          NaN
5     1973      2   1578.45800          NaN
6     1973      3   1574.72400          NaN
7     1973      4   1652.17100          NaN
8     1974      1   2047.83400          NaN
9     1974      2   2117.97100          NaN
10    1974      3   1925.92600          NaN
11    1974      4   1798.19000          NaN
12    1975      1   1818.81700   136.702329
13    1975      2   1808.22500   136.702329
14    1975      3   1649.20600   136.702329
15    1975      4   1799.66500   136.702329
16    1976      1   1985.75300   151.056074
17    1976      2   2064.66300   151.056074
18    1976      3   1856.38700   151.056074
19    1976      4   1919.08700   151.056074
..     ...    ...          ...          ...
152   2010      1  19915.79514   988.309676
153   2010      2  19482.48000   988.309676
154   2010      3  18484.64900   988.309676
155   2010      4  18026.46869   988.309676
156   2011      1  19687.52100          NaN
157   2011      2  18913.06608          NaN

Here X is exports.q and y is sales.a.
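As a rough illustration of that layout, here is a minimal sketch of how such a dataframe might be assembled with pandas from a quarterly indicator and an annual series. The helper name build_td_frame and the toy numbers are hypothetical and not part of timedisagg. Only the column names (index, grain, X, y) and the convention of repeating the annual value across its quarters, with NaN where no annual value exists, are taken from the table above.

```python
import pandas as pd


def build_td_frame(quarterly: pd.Series, annual: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: lay out data in the index/grain/X/y format.

    quarterly: indicator indexed by a (year, quarter) MultiIndex
    annual:    low-frequency series indexed by year
    """
    frame = quarterly.rename("X").reset_index()
    frame.columns = ["index", "grain", "X"]
    # Repeat each annual value on all quarters of its year;
    # years with no annual observation become NaN.
    frame["y"] = frame["index"].map(annual)
    return frame


# Toy data, made up purely for illustration
quarters = pd.MultiIndex.from_product([[2000, 2001, 2002], [1, 2, 3, 4]])
quarterly = pd.Series([float(v) for v in range(100, 112)], index=quarters)
annual = pd.Series({2001: 50.0})

expected_dataset = build_td_frame(quarterly, annual)
print(expected_dataset)
```

The resulting frame can then be passed to td_obj in the same way as expected_dataset in the call above.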

The output final_disaggregated_output will appear as below where y_hat is the disaggregated sales:

   index  grain         X   y      y_hat
0   1972      1  1432.639 NaN  21.656879
1   1972      2  1456.891 NaN  22.219737
2   1972      3  1342.562 NaN  20.855413
3   1972      4  1539.394 NaN  23.937916
4   1973      1  1535.754 NaN  24.229008
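With conversion="sum", the quarterly estimates are meant to add up to the annual value in every year where y is observed. A small sanity check along those lines could look like the sketch below. The function name sum_conversion_gaps and the toy numbers are hypothetical, and the check only assumes the column names shown in the output above.

```python
import pandas as pd


def sum_conversion_gaps(result: pd.DataFrame) -> pd.Series:
    """Hypothetical check: per-year sum of y_hat minus the observed annual y.

    Years where y is NaN (no annual observation) are skipped.
    """
    observed = result.dropna(subset=["y"])
    by_year = observed.groupby("index").agg(
        y=("y", "first"),            # annual value, repeated on each quarter
        y_hat_sum=("y_hat", "sum"),  # sum of the disaggregated quarters
    )
    return by_year["y_hat_sum"] - by_year["y"]


# Toy data, made up purely for illustration (not package output)
toy = pd.DataFrame({
    "index": [2000] * 4 + [2001] * 4,
    "grain": [1, 2, 3, 4] * 2,
    "y": [float("nan")] * 4 + [100.0] * 4,
    "y_hat": [20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 25.0, 26.0],
})

gaps = sum_conversion_gaps(toy)
print(gaps)
```

Gaps close to zero for every observed year suggest the output is consistent with the annual series; large gaps usually point to a problem in how the input data were laid out.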

Edit - If someone needs help working their data into my package, feel free to raise an issue on the package's Git repository.

jstephenj14