
I need to group my data by 2 or 3 grouping variables and apply a function to each group. tapply does this when I use a standard function (mean, median, sd) with a single grouping variable, but when I pass more than one grouping variable it does not work.

The code:

  tipo      <-  rep(LETTERS[1:3], 9)
  vendedor  <-  rep(LETTERS[11:13], 9)        
  produto   <-  rep(LETTERS[17:19],9)
  valor     <-  trunc(rnorm(27,1000,50)) 
  dados     <-  data.frame(tipo, vendedor, produto, valor)
  funcao    <-  function(dados) c(media = mean(valor), 
                                 desvio = sd(valor)*0.23)
  simplify2array(tapply(dados$valor, dados$tipo, funcao))
  simplify2array(tapply(dados$valor, list(dados$tipo, dados$vendedor), funcao))

The output of the first tapply call, which works:

> simplify2array(tapply(dados$valor, dados$tipo, funcao))
            A          B          C
media  998.370370 998.370370 998.370370
desvio   9.763732   9.763732   9.763732

The output of the second tapply call, which does not work correctly:

> simplify2array(tapply(dados$valor, list(dados$tipo, dados$vendedor), funcao))
  K         L         M        
A Numeric,2 NULL      NULL     
B NULL      Numeric,2 NULL     
C NULL      NULL      Numeric,2

Does anyone know how I can fix this?

josliber

1 Answer


As I understand it, you have a function funcao that returns 2 elements (media and desvio), and you want to apply it across each tipo/vendedor pairing using tapply. You can do this with:

funcao <- function(valor) c(media = mean(valor), desvio = sd(valor)*0.23)
simplify2array(tapply(dados$valor, paste(dados$tipo, dados$vendedor), funcao))
#              A K       B L        C M
# media  967.11111 989.11111 1001.55556
# desvio  12.55158  12.63768   11.27241

Basically, all I have done is change the grouping variable from list(dados$tipo, dados$vendedor) to paste(dados$tipo, dados$vendedor), which simply pastes the tipo and vendedor values together into a single label per row. Thanks to @thelatemail's comment, I also updated funcao to use its argument.
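
For comparison, here is a minimal sketch of the same idea using interaction(..., drop = TRUE), as suggested in the comments, next to the list(...) grouping from the question. It rebuilds the question's data; set.seed and the names grade, grupos and resultado are my own additions for illustration, and the values will differ from the output shown above because the data are random.

```r
# Rebuild the question's data. set.seed() is an addition (not in the
# question) so that repeated runs give the same numbers.
set.seed(1)
tipo     <- rep(LETTERS[1:3], 9)
vendedor <- rep(LETTERS[11:13], 9)
valor    <- trunc(rnorm(27, 1000, 50))
dados    <- data.frame(tipo, vendedor, valor)

# The function has to work on its argument, not on the global 'valor'.
funcao <- function(x) c(media = mean(x), desvio = sd(x) * 0.23)

# Grouping by list(...) builds a full 3 x 3 grid of tipo x vendedor cells.
# Only A/K, B/L and C/M occur in this data, so the other cells are NULL.
grade <- tapply(dados$valor, list(dados$tipo, dados$vendedor), funcao)

# interaction(..., drop = TRUE) builds one factor whose levels are only
# the combinations that actually occur, so every group has data.
grupos    <- interaction(dados$tipo, dados$vendedor, drop = TRUE)
resultado <- simplify2array(tapply(dados$valor, grupos, funcao))
print(resultado)
```

The result is a 2 x 3 numeric matrix with one column per observed tipo/vendedor combination. The column labels are joined with "." rather than the space that paste produces.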

josliber
  • Or `interaction(tipo,vendedor,drop=TRUE)` - also the function should be edited to act on each group, like `function(x) c(media=mean(x), desvio=sd(x)*0.23)` or it will just give the same results for every group. – thelatemail Oct 12 '15 at 01:36
  • @thelatemail good eye -- I hadn't noticed the function from the OP wasn't using the passed argument! – josliber Oct 12 '15 at 01:39
  • Hi, thanks! Your suggestion works, but when I apply another function it does not. Is my code wrong? Where am I going wrong? `homo <- function (a){ a <- a[order(a$valor),] n <- nrow(a) # sobra <- rep(NA, n -1) for(i in 1:n){ a$sobra[i] = round(((a$valor[i+1] / a$valor[i])*100)-100, dig = 2) } a <- subset (a, a$sobra < 50) return (a) }` – Woldinei Meier Oct 13 '15 at 17:51
  • @WoldineiMeier it is quite difficult to read code posted in comments. If you need to edit this question to provide more details please do so. If you are asking a new question then please use the "Ask Question" button to ask a new question. – josliber Oct 13 '15 at 18:52