I have a function like this:
remove_outliers <- function(x) {
  # 99th percentile of x
  qnt <- quantile(x, probs = 0.99)
  y <- x
  # replace values above the 99th percentile with NA
  y[x > qnt] <- NA
  y
}
The purpose is to remove outliers in the top 1% of the data by replacing their values with NA. How can I apply this function separately within each level of a factor variable?
For example, an original dataset with groups A and B:
group share
A 100
A 50
A 30
A 10
... ...
B 100
B 90
B 80
B 60
... ...
Should end up like this:
group share
A NA
A 50
A 30
A 10
... ...
B NA
B 90
B 80
B 60
... ...
I have already tried by, tapply, and sapply, but they all change the structure of the output dataset.
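One base R function that may keep the structure intact is ave(), which applies a function within groups and returns a vector of the same length and order as its input. Below is a minimal sketch, assuming the data frame is called df (a hypothetical name) and holds only the eight rows shown explicitly in the example; the elided rows are not filled in.

```r
# Hypothetical data frame built from the rows shown explicitly above
df <- data.frame(
  group = factor(c("A", "A", "A", "A", "B", "B", "B", "B")),
  share = c(100, 50, 30, 10, 100, 90, 80, 60)
)

remove_outliers <- function(x) {
  qnt <- quantile(x, probs = 0.99)
  y <- x
  y[x > qnt] <- NA
  y
}

# ave() splits share by group, applies FUN to each piece, and
# puts the results back in the original row positions
df$share <- ave(df$share, df$group, FUN = remove_outliers)
print(df)
```

Because ave() returns a plain vector aligned with the input rows, it can be assigned straight back to the column, so the data frame keeps its original rows and columns.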