I am busy messing around in SQLite and want to calculate the Gini coefficient per US states based on median income per capita and the populations per county.
The process is explained here: https://medium.com/google-cloud/calculating-gini-coefficient-in-bigquery-3bc162c82168
Only I have no idea how to do this for my particular data set. I understand that I would need to determine the % of income and population per county and then aggregate that per state to get the cumulative values, but other than that I am completely lost.
The output would then be the Gini coefficient per state but dont know how I could create a new entity with these values.
A sample of my data set is as follows, and although not complete, there should be enough data to get it working.
State | County | TotalPop | IncomePerCap | |
---|---|---|---|---|
Alabama | Autauga | 55221 | 24974 | |
Alabama | Baldwin | 195121 | 27317 | |
Alabama | Barbour | 26932 | 16824 | |
Alabama | Bibb | 22604 | 18431 | |
Alabama | Blount | 57710 | 20532 | |
Alabama | Bullock | 10678 | 17580 | |
Alabama | Butler | 20354 | 18390 | |
Alabama | Calhoun | 116648 | 21374 | |
Alabama | Chambers | 34079 | 21071 | |
Alabama | Cherokee | 26008 | 21811 | |
Alaska | Bristol Bay Borough | 970 | 38267 | |
Alaska | Denali Borough | 2060 | 35315 | |
Alaska | Dillingham Census Area | 4979 | 22257 | |
Alaska | Fairbanks North Star Borough | 99705 | 33244 | |
Alaska | Haines Borough | 2560 | 33902 | |
Alaska | Hoonah-Angoon Census Area | 2128 | 30943 | |
Alaska | Juneau City and Borough | 32531 | 39979 | |
Alaska | Kenai Peninsula Borough | 57221 | 31537 | |
Alaska | Ketchikan Gateway Borough | 13699 | 32021 | |
Alaska | Kodiak Island Borough | 13973 | 30657 |