0

Could anyone help me with calculating compound inflation rates, please? I did not find a solution online so I used a good old for loop to calculate one cell at a time (code attached). There must be a better way to accomplish this.

Data: https://data.worldbank.org/indicator/NY.GDP.DEFL.KD.ZG

import pandas as pd

df_gdp_deflator = pd.read_csv(r"API_NY.GDP.DEFL.KD.ZG_DS2_en_csv_v2_3358227.csv", header=2)

df_gdp_deflator_0 = df_gdp_deflator.copy()

base_year = 2020

for country_code in df_gdp_deflator["Country Code"]:
    for year in range(base_year, 1959, -1):
        if year == base_year:
            df_gdp_deflator_0.loc[df_gdp_deflator_0["Country Code"] == country_code, str(year)] = 1
        elif year != base_year:
            df_gdp_deflator_0.loc[df_gdp_deflator_0["Country Code"] == country_code, str(year)] = df_gdp_deflator_0.loc[df_gdp_deflator_0["Country Code"] == country_code, str(year + 1)] * (1 + df_gdp_deflator.loc[df_gdp_deflator["Country Code"] == country_code, str(year + 1)] / 100)

df_gdp_deflator_0.to_csv(r"df_gdp_deflator_0.csv")
Tomerikoo
  • 18,379
  • 16
  • 47
  • 61
Howard
  • 81
  • 1
  • 8
  • Have you read this? It shows you how to use **pow** to **compound.** – NeoTheNerd Dec 14 '21 at 12:48
  • Related: [adjust pandas dollar column for inflation](https://stackoverflow.com/q/56122433/6045800) – Tomerikoo Dec 14 '21 at 14:17
  • Thank you @NeoTheNerd, the pow function is useful but, if understand correctly, the referenced discussion concerns a fixed rate rather than a set of rates that vary from year to year? – Howard Dec 15 '21 at 13:11
  • Is the compounding formula constant from year to year? and a set of rates, I'd need to understand, what the set of rates entails. When I'm compounding to get information like company free cash flows, I compound the value from the last value and increment by one. – NeoTheNerd Dec 15 '21 at 13:17
  • Thank you @Tomerikoo, numpy.cumprod is perhaps the function I was looking for. I will need to figure out how to apply it but it seems promising. – Howard Dec 15 '21 at 13:18
  • @NeoTheNerd ah by varying rates I was referring to the annual % of inflation. It is not constant. The cash flow of course also goes up and down over the years. Therefore, there is a series of pairs of unique rate and value. – Howard Dec 15 '21 at 13:24

1 Answers1

1

1. Clean the data

I would suggest to firstly have a cleaner data frame to work, e.g. by creating a long dataframe making it simpler to work:

import pandas as pd

df = pd.read_csv("API_NY.GDP.DEFL.KD.ZG_DS2_en_csv_v2_3358227.csv", header=2)
print(df)
#                     Country Name Country Code  ...        2020 Unnamed: 65
# 0                          Aruba          ABW  ...         NaN         NaN
# 1    Africa Eastern and Southern          AFE  ...    5.510381         NaN
# 2                    Afghanistan          AFG  ...    5.497478         NaN
# 3     Africa Western and Central          AFW  ...    1.591080         NaN
# 4                         Angola          AGO  ...   15.096997         NaN
# ..                           ...          ...  ...         ...         ...
# 261                       Kosovo          XKX  ...    0.975958         NaN
# 262                  Yemen, Rep.          YEM  ...         NaN         NaN
# 263                 South Africa          ZAF  ...    5.286184         NaN
# 264                       Zambia          ZMB  ...   21.628327         NaN
# 265                     Zimbabwe          ZWE  ...  610.000000         NaN

id_vars = ['Country Code']
value_vars = [str(_) for _ in range(1960, 2020 + 1) if str(_) in df.columns]
var_name = "year"
value_name = "rate"
long = df.melt(id_vars=id_vars, value_vars=value_vars, var_name=var_name, value_name=value_name)
long.rate = long.rate / 100 + 1
print(long)
#       Country Code  year      rate
# 0              ABW  1960       NaN
# 1              AFE  1960       NaN
# 2              AFG  1960       NaN
# 3              AFW  1960       NaN
# 4              AGO  1960       NaN
# ...            ...   ...       ...
# 16221          XKX  2020  1.009760
# 16222          YEM  2020       NaN
# 16223          ZAF  2020  1.052862
# 16224          ZMB  2020  1.216283
# 16225          ZWE  2020  7.100000

2. Use groups

Pandas groupb is very useful and allows here grouping by country code:

compounded = pd.DataFrame.from_records([
    {
        "Country Code": code, 
        "Compounded Rate": 100 * group.rate.product()
    } for code, group in long.groupby("Country Code")
    ])
print(compounded)
#     Country Code  Compounded Rate
# 0            ABW     2.668155e+02
# 1            AFE     1.740744e+04
# 2            AFG     2.913292e+02
# 3            AFW     2.338532e+03
# 4            AGO     5.692971e+12
# ..           ...              ...
# 261          XKX     1.398405e+02
# 262          YEM     5.087848e+03
# 263          ZAF     1.816078e+04
# 264          ZMB     1.138452e+07
# 265          ZWE     2.436689e+04

Note that the compounded inflation rate is calculated via group.rate.product() by multiplying (1 + ) the yearly inflation rates for a given CC.

However, e.g. for US, the result doesn't seem the be in line with the official numbers: we have 691.071696 vs 877.79 reported.

The discrepancy can be explained by the data: e.g. the 2020/2021 inflation is 101.211289 in the dataset whereas the gov website reports 101.40.

Then again, I might also have made a mistake somewhere...

niko
  • 5,253
  • 1
  • 12
  • 32
  • Hi, niko. Many thanks for the detailed and structured answer, making it easy to follow. I have two questions: – Howard Dec 15 '21 at 13:33
  • (1) is long.rate the equivalent of long["rate"]; (2) is the "Compounded Rate" in the output a single number for each country rather than a set of numbers that corresponds to multiple years? – Howard Dec 15 '21 at 13:38
  • (1) Yes. In general: `df.col_name == df["col_name"]` (works with restriction, e.g. no white space). (2) What I called compounded (inflation) rate is a single number per country. If you wanted the vector representing the cumulative compounded rate, you would use `group.rate.cumprod()` (instead of `prod`). – niko Dec 15 '21 at 14:12
  • many thanks! cumprod is the exact function that allows varying rates over the years to be cumulated. – Howard Mar 07 '22 at 12:43