I have a complex data frame; a minimal example is as follows:
df <- structure(list(District = c("Adilabad", "Adilabad", "Adilabad",
"Adilabad", "Adilabad", "Adilabad", "Adilabad", "Adilabad", "Adilabad",
"Adilabad"), Subdistt = c("Adilabad", "Adilabad", "Adilabad",
"Tamsi", "Tamsi", "Tamsi", "Tamsi", "Tamsi", "Tamsi", "Tamsi"
), TRU = c("Total", "Rural", "Urban", "Total", "Rural", "Urban",
"Rural", "Rural", "Urban", "Urban"), Level = c("District", "District",
"District", "Sub-District", "Sub-District", "Sub-District", "Village",
"Village", "Town", "Town"), No_HH = c(1277, 364, 913,
1277, 364, 913, 117, 247, 614, 299)), .Names = c("District",
"Subdistt", "TRU", "Level", "No_HH"), row.names = c(NA, 10L), class = "data.frame")
which looks like this:
   District Subdistt   TRU        Level No_HH
1  Adilabad Adilabad Total     District  1277
2  Adilabad Adilabad Rural     District   364
3  Adilabad Adilabad Urban     District   913
4  Adilabad    Tamsi Total Sub-District  1277
5  Adilabad    Tamsi Rural Sub-District   364
6  Adilabad    Tamsi Urban Sub-District   913
7  Adilabad    Tamsi Rural      Village   117
8  Adilabad    Tamsi Rural      Village   247
9  Adilabad    Tamsi Urban         Town   614
10 Adilabad    Tamsi Urban         Town   299
Each column is, in a sense, a subdivision of the one before it. I have to validate that the Sub-District and District figures equal the sum of the rows beneath them, at the Rural, Urban and Total levels.
For example, the sum of rows 7 and 8 equals the value in row 5, which is the Rural Sub-District row. In the full df there are many rural sub-districts, and the sum of all rural sub-districts is given in the Rural District row, which is row 2.
A minimal expected output will be as follows:
  District Subdistt   TRU        Level No_HH
1 Adilabad    Tamsi Rural Sub-District   364
2 Adilabad    Tamsi Urban Sub-District   913
364 is the sum of 117 + 247 (rows 7 and 8 of the minimal example above) and 913 is the sum of 614 + 299 (rows 9 and 10).
Currently I am able to subset to a particular value but don't know how to sum based on these complex selections. Can someone help?
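To make the intended check concrete, here is a rough base R sketch of the kind of comparison described above. It is only an illustration under assumptions that may not hold for the full data: that Village and Town rows are the lowest level, that they roll up into the Sub-District row sharing the same District, Subdistt and TRU, and that Sub-District rows roll up into the District row sharing the same District and TRU. The names `leaf_sums`, `sub_sums`, `check_sub`, `check_dist`, `No_HH_sum` and `matches` are made up for this example, and the Total rows are left out of the sub-district check.

```r
# Same values as the minimal example above, rebuilt so the sketch runs on its own.
df <- data.frame(
  District = "Adilabad",
  Subdistt = c(rep("Adilabad", 3), rep("Tamsi", 7)),
  TRU      = c("Total", "Rural", "Urban", "Total", "Rural", "Urban",
               "Rural", "Rural", "Urban", "Urban"),
  Level    = c(rep("District", 3), rep("Sub-District", 3),
               "Village", "Village", "Town", "Town"),
  No_HH    = c(1277, 364, 913, 1277, 364, 913, 117, 247, 614, 299),
  stringsAsFactors = FALSE
)

# Assumption: Village and Town rows are the lowest level.
# Sum them per District / Subdistt / TRU.
leaf <- df[df$Level %in% c("Village", "Town"), ]
leaf_sums <- aggregate(No_HH ~ District + Subdistt + TRU, data = leaf, FUN = sum)
names(leaf_sums)[names(leaf_sums) == "No_HH"] <- "No_HH_sum"

# Compare against the reported Sub-District rows (Total rows excluded here).
reported_sub <- df[df$Level == "Sub-District" & df$TRU != "Total", ]
check_sub <- merge(reported_sub, leaf_sums,
                   by = c("District", "Subdistt", "TRU"), all.x = TRUE)
check_sub$matches <- check_sub$No_HH == check_sub$No_HH_sum

# Assumption: Sub-District rows roll up into the District row with the same TRU.
sub_rows <- df[df$Level == "Sub-District", ]
sub_sums <- aggregate(No_HH ~ District + TRU, data = sub_rows, FUN = sum)
names(sub_sums)[names(sub_sums) == "No_HH"] <- "No_HH_sum"

reported_dist <- df[df$Level == "District", c("District", "TRU", "No_HH")]
check_dist <- merge(reported_dist, sub_sums,
                    by = c("District", "TRU"), all.x = TRUE)
check_dist$matches <- check_dist$No_HH == check_dist$No_HH_sum

check_sub
check_dist
```

In this sketch, a `FALSE` in `matches` would mark a group whose reported value differs from the sum of its component rows, and an `NA` would mark a group with no component rows at all.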