I have a data set like the one below. For each combination of client, task and subtask, I want to separate out the top 10% most extreme values of time. I want two data sets as output: one containing the extreme values for all the combinations, and the other containing the normal (remaining) values for all the combinations.
client task subtask time
a abc t1 12
a abc t2 23
b xyz t3 334
c ijk t1 1
c ijk t1 12
b xyz t1 12
a xyz t2 23
b ijk t3 24
a ijk t2 344
c xyz t3 34343
b ijk t2 34
c xyz t3 34
a xyz t1 23
c ijk t1 223
a ijk t1 23
b xyz t3 21
b ijk t1 45
a xyz t2 23
c ijk t3 45
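
To make the desired output concrete, here is a minimal sketch in plain Python of one possible reading of the requirement. It rests on several assumptions that are not stated above: "top 10%" is taken to mean the largest ceil(10% of n) time values within each client/task/subtask group, a group with a single row is treated as entirely normal because there is nothing to compare it against, and ties are broken arbitrarily. The names `RAW`, `split_extremes` and `percent` are hypothetical and exist only for this illustration.

```python
from collections import defaultdict

# Sample rows copied from the data set above: client, task, subtask, time.
RAW = """\
a abc t1 12
a abc t2 23
b xyz t3 334
c ijk t1 1
c ijk t1 12
b xyz t1 12
a xyz t2 23
b ijk t3 24
a ijk t2 344
c xyz t3 34343
b ijk t2 34
c xyz t3 34
a xyz t1 23
c ijk t1 223
a ijk t1 23
b xyz t3 21
b ijk t1 45
a xyz t2 23
c ijk t3 45
"""


def split_extremes(rows, percent=10):
    """Split rows into (extreme, normal) lists per (client, task, subtask).

    Assumption: the top `percent` of a group is its ceil(percent% * n)
    largest times, and single-row groups are kept entirely as normal.
    """
    groups = defaultdict(list)
    for client, task, subtask, time in rows:
        groups[(client, task, subtask)].append(time)

    extreme, normal = [], []
    for key, times in groups.items():
        ordered = sorted(times, reverse=True)  # largest times first
        n = len(ordered)
        # Integer ceiling division avoids floating-point rounding surprises.
        k = (n * percent + 99) // 100 if n > 1 else 0
        extreme.extend(key + (t,) for t in ordered[:k])
        normal.extend(key + (t,) for t in ordered[k:])
    return extreme, normal


rows = [
    (client, task, subtask, int(time))
    for client, task, subtask, time in (line.split() for line in RAW.splitlines())
]
extreme, normal = split_extremes(rows)

for row in sorted(extreme):
    print(*row)
```

With groups this small, the ceiling rule flags one row in every group that has two or more rows, so the exact rule (ceiling versus floor, or a percentile threshold instead of a row count) matters a lot and should be chosen to match the real data volume.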