I am trying to calculate a z-statistic over regular intervals of rows (one value per group of 15 rows).

mean = 77
std = 31
samp.45 = rnorm(45,mean,std)

z.test = function(a, mu, sd){
  zeta = (mean(a) - mu) / (sd / sqrt(length(a)))
  return(zeta)
}

z.hypothesis = function(a, mu, sd){
  z.stat = z.test(a, mu, sd)
  if(abs(z.stat) > 1.96){
    return(1)
  }
  else{
    return(0)
  }
}

group = as.numeric(ceiling(1:45/15))
df <- as.data.frame(cbind(samp.45, group))
## Correct this
tapply(df$samp.45, as.factor(df$group), z.hypothesis(df$samp.45,mean,std)) 

I was planning to use tapply to apply the function to each group and return the output. I know that simple functions like mean can be passed directly and give the desired result (see below), but how can I get similar output for my own function, which takes extra arguments? Any other approach is also welcome.

> tapply(df$samp.45, as.factor(df$group), mean)
       1        2        3 
78.19556 79.65747 68.91818 
Manish

2 Answers

tapply(df$samp.45, as.factor(df$group), function(x) z.hypothesis(x,mean,std))
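
As a possible alternative to the anonymous function, tapply forwards any arguments given after FUN to the function itself, so the extra parameters can be passed by name. A minimal sketch, assuming the definitions from the question; the names mu0 and sd0 and the set.seed call are my own additions for illustration and are not in the original post:

```r
set.seed(1)                      # assumption: fixed seed so the run is repeatable
mu0 <- 77                        # hypothetical names, chosen to avoid shadowing mean()
sd0 <- 31
samp.45 <- rnorm(45, mu0, sd0)
group <- ceiling(seq_along(samp.45) / 15)   # three groups of 15 rows

# Same logic as the question's functions, written compactly
z.test <- function(a, mu, sd) (mean(a) - mu) / (sd / sqrt(length(a)))
z.hypothesis <- function(a, mu, sd) as.numeric(abs(z.test(a, mu, sd)) > 1.96)

# Extra named arguments after FUN are passed on to z.hypothesis for each group
res <- tapply(samp.45, group, z.hypothesis, mu = mu0, sd = sd0)

# Equivalent call using an anonymous function, as in the answer above
res_anon <- tapply(samp.45, group, function(x) z.hypothesis(x, mu0, sd0))
```

Both calls should give one 0/1 value per group; which form to use is mostly a matter of taste.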

pieca
  • That's it?! :P Well, I thought about it, then felt maybe since the function is already defined it would be inappropriate. – Manish Jun 04 '18 at 17:12
In the tidyverse you can try:

library(tidyverse)
MEAN <- 77  # same value as `mean` in the question, renamed so it does not mask the function name
df %>% 
  group_by(group) %>% 
  summarise(zh = z.hypothesis(samp.45, MEAN, std))
# A tibble: 3 x 2
  group    zh
  <dbl> <dbl>
1     1     0
2     2     0
3     3     0

Avoid using the names of built-in functions as variable names. That is why I renamed mean to MEAN.

Roman