I'm working with pandas on a statistical project. I have a population dataset in which the population is given per block, and I need to assign it to the individual plots within each block. Is there a method that fills the plots of a block with integer values (no decimals) so that they add up to the block's total population?

The input dataframe looks like this:

plot_id   block_id   block_pop
      1          1           5
      2          1           5
      3          2          11
      4          2          11
      5          2          11

What I have done so far:

  1. Calculate the number of plots per block:

    # count the plots in each block, then attach the count to every row
    group_1 = df.groupby('block_id')['plot_id'].count().reset_index().rename(columns={'plot_id': 'n_plots'})
    df = df.merge(group_1, on='block_id')

  2. Calculate the mean population per plot (integer division, discarding the remainder):

    df['pop_mean'] = df['block_pop'] // df['n_plots']

  3. The step I'm stuck on is distributing the remainder among some of the block's plots as integers, not floats, so that the plot values add up to the total block population.

The expected result is something like:

plot_id   block_id   block_pop   n_plots   pop_mean   final_plot_pop
      1          1           5         2          2                3
      2          1           5         2          2                2
      3          2          11         3          3                4
      4          2          11         3          3                4
      5          2          11         3          3                3
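
For reference, here is a minimal sketch of one way the `final_plot_pop` column could be filled. It assumes the leftover units go to the first plots of each block in row order, which matches the expected table above but is only one possible rule. It also computes `n_plots` with `transform('count')` instead of the merge, and the names `remainder` and `rank` are introduced here purely for illustration.

```python
import pandas as pd

# Sample data copied from the input table in the question.
df = pd.DataFrame({
    "plot_id": [1, 2, 3, 4, 5],
    "block_id": [1, 1, 2, 2, 2],
    "block_pop": [5, 5, 11, 11, 11],
})

# Number of plots in each block, broadcast back to every row.
df["n_plots"] = df.groupby("block_id")["plot_id"].transform("count")

# Integer share per plot and the units left over in each block.
df["pop_mean"] = df["block_pop"] // df["n_plots"]
remainder = df["block_pop"] % df["n_plots"]

# 0-based position of each plot within its block (row order).
rank = df.groupby("block_id").cumcount()

# Assumption: the first `remainder` plots of a block get one extra unit.
df["final_plot_pop"] = df["pop_mean"] + (rank < remainder).astype(int)

print(df)
```

Because the remainder is always smaller than the number of plots in the block, each plot receives at most one extra unit, and the plot values in a block sum to `block_pop`. A different rule (for example, giving the extra units to randomly chosen plots) would only change how `rank` is built.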

Any help would be very much appreciated.

Rodrigo Vargas