Recode or bin multiple numeric columns as categorical values and summing the counts

Question

I want to recode or bin multiple numeric columns as categorical values and summing the counts. The columns are ages by individual year from 18yrs to 90+ yrs with the cells containing summed totals by age. I want to create categories to reflect age cohorts and maintain the now cumulative count.
The aim is to categorise the ages in to the following cohorts:

'young_adults_18_29', 'younger_working_age_30_49', 'older_working_age_50_64', 'retirement_age_65_79', 'older_adults_80+'

score 0 · Answer 1 · edited Aug 18 '22 at 04:08

0

# df <- original dataframe
max_age = df.columns[-1]    # I assumed that the last column value is also the number(age).

df['young_adults_18_29'] = sum(df[i] for i in range(18, 30))
df['younger_working_age_30_49'] = sum(df[i] for i in range(30, 50))
df['older_working_age_50_64'] = sum(df[i] for i in range(50, 65))
df['retirement_age_65_79'] = sum(df[i] for i in range(65, 80))
df['older_adults_80+'] = sum(df[i] for i in range(80, max_age + 1))

df_new = df[['la_code', 'young_adults_18_29', 'younger_working_age_30_49', 'older_working_age_50_64', 'retirement_age_65_79', 'older_adults_80+']]

edited Aug 18 '22 at 04:08

Sunderam Dubey

1
11
20
40

answered Aug 09 '22 at 02:41

jaemmin

341
1
12

Thank you for your code. I attempted to implement the code, however, I get key errors for the first line on the column headed 18. I may need to reindex the headers. – db2020 Aug 09 '22 at 09:00
I have also attempted using bins and perhaps the headers are what was stopping it working. – db2020 Aug 09 '22 at 09:01
I guess the type of 18 is string. If so, edit the code as sum(df[str(i)] for i in range(18, 30)) – jaemmin Aug 09 '22 at 23:40

Recode or bin multiple numeric columns as categorical values and summing the counts

1 Answers1