
I have been trying to use the Hmisc package to produce output similar to below.

                                            Group
Step      Method                   G1           G2         G3 .......   

s1          m1          N          45            26         17
                       Min          2             2         3
                       Max          7             6         4
                       Mean         3.5          4.5        2.5
                       Sdev         2.6          3.6         1

            m2          N          
                       Min          
                       Max          
                       Mean        
                       Sdev  

s2          m1          N          
                       Min          
                       Max        
                       Mean         
                       Sdev        

            m2          N          
                       Min          
                       Max          
                       Mean        
                       Sdev    

My raw data looks like below.

       Site    Step  Method   Group   Outcome
        a1      s1     m1      g1      3.6
        a1      s1     m4      g1      2.3
        a2      s2     m1      g2      14
        a3      s1     m3      g1      7
        a3      s3     m6      g1      1
        a4      s1     m1      g3      6.2

I am trying to compute the n, min, mean, sdev, and max of the site outcomes in each group, by step and method. I am using the sites as my unique identifiers. Not every site has every step, and not every step has every method, so there are missing values.

I have been playing with the Hmisc package and have been able to compute the n, mean, min, and max using fun=summary, but only for one method at a time, and the result is displayed as a not very readable matrix. I know that the package uses LaTeX (I am a total novice with this), and I have used the option summary(....,file="data.tex"), which I think saves a .dvi file. I right-click on that file and tell it to convert to PDF, but the PDF looks broken, with data in the wrong places. I really don't know what I am doing wrong, so any feedback/input is greatly appreciated. Cheers.

user2117897

2 Answers


The tabular function in the tables package was meant to create SAS-like tables. You can try something like this (dat being your example data):

library(tables)
(tab1 <- tabular(Step*Method*Heading()*Outcome*((n = 1) + min + max + mean + sd) ~ Group, 
        data = dat))

                  Group          
 Step Method      g1    g2   g3  
 s1   m1     n     1.0     0  1.0
             min   3.6   Inf  6.2
             max   3.6  -Inf  6.2
             mean  3.6   NaN  6.2
             sd     NA    NA   NA
      m3     n     1.0     0  0.0
             min   7.0   Inf  Inf
             max   7.0  -Inf -Inf
             mean  7.0   NaN  NaN
             sd     NA    NA   NA
             ...   ...   ...  ...

To process the table further, with LaTeX for example, latex(tab1) creates a nicely formatted LaTeX tabular.

NOTE: You can easily improve the formatting of the table like this:

tabular(Step*RowFactor(Method, levelnames = c("M1", "M2", "M3", "M4"), spacing = 1)*
                Heading()*Outcome*
                (Format()*(N= 1) + (Min = min) + (Max = max) + (Mean = mean) + 
                    (Sdev = sd)) ~ 
                Factor(Group, levelnames = c("G1", "G2", "G3")), 
        data = dat)

Applying this to all sites is also straightforward, using tabular(Site*Step*...).

adibender
  • This is exactly the sort of thing I was looking for. I got the tabular to work but am running into some errors though. The main one is that when I try to run tabular for all the methods at once, I get the error "evaluation nested too deeply:infinite recursion/options(expressions=)?". I get tabular to work when I do one method at a time though, but when I do the latex(tab1) to give me the latex code, and I copy and paste it, I get the error "Missing \begin{document}" in my Tex thing. Thanks for the advice. – user2117897 Mar 05 '13 at 03:11
  • As for the first issue, it's hard to say without the data and the code you used. Can you post the code that triggered the error? And is your data freely accessible on the web? – adibender Mar 05 '13 at 11:07
  • As for the second issue: latex(tab1) only gives you the LaTeX code for the table. In your document you need to put it between \documentclass{article}\begin{document}...\end{document}, where ... is the code produced by latex(tab1). But that's more of a LaTeX issue than an R one. – adibender Mar 05 '13 at 11:09
  • Thanks. I got rid of the infinite recursion error. Was a hidden character in my response vector I think. I took it out and it runs now. I got the pdf created also, however, the output is cutoff on the page and it only prints 1 page of output even though it should need a second page. – user2117897 Mar 06 '13 at 01:24
  • @user2117897: try latex(tab1, options = list(tabular = "longtable")) – adibender Mar 06 '13 at 01:34
  • @user2117897: also, for more information on using `tabular` and `latex.tabular`, check out the package vignette (http://cran.r-project.org/web/packages/tables/vignettes/tables.pdf) if needed – adibender Mar 06 '13 at 01:47
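
The "Missing \begin{document}" error from the comments can be avoided by wrapping the table code in a minimal document before compiling. Below is a rough sketch in base R. The helper wrap_latex_document and the placeholder table body are hypothetical and not part of the tables package; it also assumes that latex(tab1) prints its code to the console so that capture.output can collect it, which may differ between package versions.

```r
# Hypothetical helper (not from the tables package): wrap table code in a
# minimal LaTeX document so that it compiles on its own.
wrap_latex_document <- function(body, packages = "longtable") {
  c("\\documentclass{article}",
    sprintf("\\usepackage{%s}", packages),  # one \usepackage line per package
    "\\begin{document}",
    body,
    "\\end{document}")
}

# Placeholder table body for illustration only. With the tables package loaded,
# and assuming latex() prints to the console, this could instead be:
#   body <- capture.output(latex(tab1, options = list(tabular = "longtable")))
body <- c("\\begin{longtable}{ll}",
          "A & B \\\\",
          "\\end{longtable}")

doc <- wrap_latex_document(body)

# Write the complete document to a temporary .tex file.
tex_file <- file.path(tempdir(), "table.tex")
writeLines(doc, tex_file)
```

The resulting file can then be compiled with pdflatex or a TeX editor. The longtable option needs \usepackage{longtable} in the preamble, which is why it is the default here; other options may need further packages.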

i'm assuming you don't care about the formatting (which might be incorrect), you could just use the aggregate function :)

# run any function, grouped by whatever variables you want..
aggregate( Outcome ~ Step + Method + Group , data = x , summary )

# the summary function doesn't include standard deviations,
# so run that separately
aggregate( Outcome ~ Step + Method + Group , data = x , sd )

assuming your data looks like this..

# read in your data
x <- read.table( header = TRUE , text = "Site    Step  Method   Group   Outcome
        a1      s1     m1      g1      3.6
        a1      s1     m4      g1      2.3
        a2      s2     m1      g2      14
        a3      s1     m3      g1      7
        a3      s3     m6      g1      1
        a4      s1     m1      g3      6.2")

if it's just performing a task by group, look at ?aggregate and ?tapply and in the future include groupwise in your search terms.
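
As a small illustration of the tapply route, here is a sketch that puts the groups in columns, which is closer to the layout in the question. It assumes the example data from the question and only computes the mean per step and group; the object name means is just for illustration.

```r
# Example data from the question.
x <- read.table(header = TRUE, text = "Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2")

# Mean outcome for every Step (rows) by Group (columns) combination.
# Combinations that do not occur in the data come back as NA.
means <- tapply(x$Outcome, list(Step = x$Step, Group = x$Group), mean)
print(means)
```

Swapping mean for min, max, sd or length gives the other statistics, one matrix at a time.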

if you want to run it all in one line, you can create a quick custom function that just lumps the output of summary together with the output of sd..

# alternatively, you can tack a standard deviation onto the summary function if you like..
swsd <- function( x ) c( summary( x ) , Sdev = sd( x ) )

# ..and then run that through `aggregate` instead :)
aggregate( Outcome ~ Step + Method + Group , data = x , swsd )
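
One thing to be aware of: when the function passed to aggregate returns several values, the result holds them in a single matrix column rather than in separate columns. The following sketch, which assumes the example data from the question, uses a hypothetical helper named stats for the five statistics asked for and then flattens the result with do.call(data.frame, ...).

```r
# Example data from the question.
x <- read.table(header = TRUE, text = "Site Step Method Group Outcome
a1 s1 m1 g1 3.6
a1 s1 m4 g1 2.3
a2 s2 m1 g2 14
a3 s1 m3 g1 7
a3 s3 m6 g1 1
a4 s1 m1 g3 6.2")

# Hypothetical helper: the five statistics from the question, with names.
stats <- function(v) {
  c(N = length(v), Min = min(v), Max = max(v), Mean = mean(v), Sdev = sd(v))
}

# aggregate() stores the five values in one matrix column called Outcome.
res <- aggregate(Outcome ~ Step + Method + Group, data = x, FUN = stats)

# Flatten the matrix column into ordinary columns (Outcome.N, Outcome.Min, ...).
flat <- do.call(data.frame, res)
print(flat)
```

With the six example rows every combination has a single observation, so Sdev is NA throughout; with the full data it would be filled in wherever a combination has at least two sites.
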
Anthony Damico