I have a data frame of prescribing data from UK practices. The original data is at http://datagov.ic.nhs.uk/T201207.exe. I've wrangled it into a PCT-level data frame, sorted by PCT and then by the 'items' column in descending order, so the most common prescriptions come first within each PCT.
        pct sha chem.code items      nic act.cost
32360   5ZW Q39 0212000Y0 12421 17811.40 16888.21
28769   5ZW Q39 0209000A0  8741  7834.43  7554.72
4439    5ZW Q39 0103050P0  7733 21566.51 20210.05
...
82763   5D7 Q30 0603020L0     1     1.08     1.13
152673  5D7 Q30 1502010C0     1     0.92     0.85
5149    5D7 Q30 0104020N0     1     0.70     0.68
149501  5D7 Q30 1311060I0     1     0.50     0.49
There are 151 PCTs and each has over 1000 items. I want to extract the top 50 items for each PCT. I know I could write a for loop and just iterate over the levels of pct, but that's not the R way. I haven't figured out how to use apply or sapply to do the subset over the levels. Those functions seem better suited to working on entire columns than to picking out a subset of the rows.
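For reference, here is a minimal sketch of the kind of thing I'm after, using only base R. It assumes the data frame is called df (a hypothetical name) and uses a small made-up data set with n = 2 in place of the real data and n = 50. The idea is to split the rows by pct, keep the first n rows of each piece with head, and bind the pieces back together.

```r
# Toy stand-in for the real data; the values are made up for illustration.
df <- data.frame(
  pct   = rep(c("5ZW", "5D7"), each = 4),
  items = c(40, 30, 20, 10, 9, 7, 5, 1),
  stringsAsFactors = FALSE
)
n <- 2  # would be 50 for the real data

# Sort by PCT, then by items descending, so head() picks the largest counts.
df <- df[order(df$pct, -df$items), ]

# One data frame per PCT, keep the first n rows of each, then recombine.
top_n <- do.call(rbind, lapply(split(df, df$pct), head, n))
```

With the real data this should give at most 50 rows per PCT. The row names of the result are prefixed with the PCT code, which is a side effect of rbind on a named list.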