I'm working with pandas on a statistical project. I have a dataset in which population is recorded per block, and I need to assign it to the individual plots within each block. Is there a method to fill the plots of a block with whole-number values (no decimals) so that they add up to the block's total population?
The input dataframe looks like this:
plot_id block_id block_pop
1 1 5
2 1 5
3 2 11
4 2 11
5 2 11
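For reproducibility, the sample input above can be built as follows. This is only a minimal sketch: the column names and values are copied from the table, and the variable name `df` matches the code further down.

```python
import pandas as pd

# Sample input: one row per plot, with the population of its parent block.
df = pd.DataFrame({
    "plot_id": [1, 2, 3, 4, 5],
    "block_id": [1, 1, 2, 2, 2],
    "block_pop": [5, 5, 11, 11, 11],
})

print(df)
```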
- Calculate the number of plots per block:
group_1 = df.groupby('block_id')['plot_id'].count().reset_index().rename(columns={'plot_id': 'n_plots'})
df = df.merge(group_1, on='block_id')
- Calculate the mean population per plot (integer division, discarding the remainder):
df['pop_mean'] = df['block_pop'] // df['n_plots']
The step I'm stuck on is distributing the remainder among some of the plots in each block as integers, not floats, so that the plot values sum to the block's total population.
The expected result is something like:
plot_id block_id block_pop n_plots pop_mean final_plot_pop
1 1 5 2 2 3
2 1 5 2 2 2
3 2 11 3 3 4
4 2 11 3 3 4
5 2 11 3 3 3
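One possible way to get this result is sketched below. It assumes the extra units go to the first plots of each block in row order; the expected output suggests this, but nothing in the requirement mandates it. The names `remainder` and `position` are only illustrative, and `n_plots` is computed here with `transform` instead of the merge used earlier.

```python
import pandas as pd

df = pd.DataFrame({
    "plot_id": [1, 2, 3, 4, 5],
    "block_id": [1, 1, 2, 2, 2],
    "block_pop": [5, 5, 11, 11, 11],
})

# Number of plots in each block, broadcast back to every row.
df["n_plots"] = df.groupby("block_id")["plot_id"].transform("count")

# Whole-number share per plot, ignoring the remainder.
df["pop_mean"] = df["block_pop"] // df["n_plots"]

# Population left over in each block after the integer division.
remainder = df["block_pop"] % df["n_plots"]

# 0-based position of each plot within its block, in row order.
position = df.groupby("block_id").cumcount()

# Assumption: the first `remainder` plots of each block get one extra unit.
df["final_plot_pop"] = df["pop_mean"] + (position < remainder).astype(int)

print(df)
```

Because the remainder is always smaller than the number of plots in the block, every extra unit finds a plot, and the plot values sum to `block_pop` for each block. A different ordering (for example, random) could be used by changing how `position` is computed.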
Any help would be very much appreciated.