
I am attempting to create my first custom function in R (yay!). I've got something that sort of works now but I think it could be improved.

Basically, I want to create my own custom table within R that can be run through xtable for a final report. I want the table to follow this format for each column:

group1mean, group1sd, group2mean, group2sd, t-value, p-value.

Currently, my function does this. However, it produces column names (e.g., V3 and V4) that I would like to leave blank, and I would like it to loop through multiple dependent variables and append the results as new rows in the matrix automatically. Right now, I have to write a line of code for each dependent variable manually (in the example below the DVs are PWB, SWB, and EWB).

Here is my code so far:

data <- read.delim("~/c4044sol.txt", header=TRUE)

library(psych)

proc.ttest <- function(dv, group, decimals) {

    x1 <- describeBy(dv, group, mat = TRUE)
    stat1 <- t.test(dv ~ group)
    output1 <- c(paste(round(x1$mean[1], digits = decimals), "(", round(x1$sd[1], digits = decimals), ")", sep = " "),
                 paste(round(x1$mean[2], digits = decimals), "(", round(x1$sd[2], digits = decimals), ")", sep = " "),
                 round(stat1$statistic, digits = 2), round(stat1$p.value, digits = 3))

    return(output1)
}

toprow <- c("M (SD)", "M (SD)", "t", "p")

outtable <- rbind(toprow,
              proc.ttest(data$PWB, data$college, 2),
              proc.ttest(data$SWB, data$college, 2),
              proc.ttest(data$EWB, data$college, 2))


colnames(outtable) <- c("College graduate", "Less than college graduate", "", "")
row.names(outtable) <- c("", "PWB", "SWB", "EWB")

library(xtable)
xtable(outtable)

So to repeat, I would like to suppress the column names "V3" and "V4" (leave them blank) and make the code run automatically on a list of variables. Is either of these things possible? Thanks for your time.

zero323
graywolf97
  • Hello, why specifically do you need "blank" column names? What would that allow you to do that you cannot do with having column names? – Ricardo Saporta Oct 01 '13 at 18:33
  • I have essentially two rows of column names - one for the groups "college" and "non college" and one under that for the statistics, M1, SD1, M2, SD2, t, p. As it works now, when you use xtable it prints "V3" above t and "V4" above p. It's non-essential information for the final report. So it's an aesthetic thing. – graywolf97 Oct 01 '13 at 18:49
  • Is the issue one of formatting for printing output or one of managing the data? If the latter, it sounds like using a `list` of `data.frame`s would be your best bet – Ricardo Saporta Oct 01 '13 at 18:50
  • It's the former. The data are all where I want them to be. I just need the top level to be formatted correctly for printing via xtable. – graywolf97 Oct 01 '13 at 19:13
  • It's like trying to use a hammer on a nail. – Ricardo Saporta Oct 01 '13 at 19:14

2 Answers


Try keeping outtable as you have it, but without toprow. Instead, use toprow as the names:

toprow <- c("M (SD)", "M (SD)", "t", "p")

outtable <- rbind( # toprow,
              proc.ttest(data$PWB, data$college, 2),
              proc.ttest(data$SWB, data$college, 2),
              proc.ttest(data$EWB, data$college, 2))

colnames(outtable) <- toprow
## outtable is a matrix, so the headers are set with colnames(), not names()
## note that the parens and spaces are
##   not best practices, but this should still
##   get you your desired results
Ricardo Saporta
  • I see what you're trying to do here and I understand why you would do it this way. However, it's not working for my specific purposes. Because I have statistics and group names I want them both to be in the table. So I need the very top row to be the group labels "College Graduate" and "Less than college graduate". Then immediately below that I need another row that has the labels for each statistic M (SD), M (SD), t, p. Your solution only handles the second set. – graywolf97 Oct 01 '13 at 18:57
  • This doesn't sound like a data-handling issue, but rather an output/printing issue. If so, I would highly suggest using other means, including `cat`, `print`, etc. – Ricardo Saporta Oct 01 '13 at 19:13
  • I think you might be right. I think I need to actually suppress the column names via the print function and actually include all the labels I want in the first two rows of the matrix. Thanks for your help. – graywolf97 Oct 01 '13 at 19:29

I fixed the extra column labels printing issue by putting all the labels I actually wanted in the final table in the first two rows of the matrix...

toptoprow <- c("College graduate", "Less than college graduate", "", "")
toprow <- c("M (SD)", "M (SD)", "t", "p")


outtable <- rbind(toptoprow, toprow,
              proc.ttest(data$PWB, data$college, 2),
              proc.ttest(data$SWB, data$college, 2),
              proc.ttest(data$EWB, data$college, 2))

And then suppressing the colnames using the print function (as suggested by Ricardo)...

print(xtable(outtable), hline.after=c(-1,1,nrow(outtable)),include.colnames=FALSE)

I would still like to automate the function itself so that I can give it a list of variable names, have it run the function on each variable, and populate the final matrix with the results. But one baby step at a time...
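For the automation step, one possible approach is to pass the variable names as a character vector, `lapply` over them, and `rbind` the resulting rows. The sketch below is only an illustration under a few assumptions: `ttest_row` and `ttest_table` are hypothetical helper names, `demo` is simulated data standing in for the real file, and base R `tapply` is swapped in for `describeBy` so the example runs without the psych package.

```r
# Hypothetical helper: one table row for a single dependent variable.
# Uses base R tapply() in place of psych::describeBy() (an assumption
# made to keep the sketch self-contained).
ttest_row <- function(dv, group, decimals = 2) {
  m  <- tapply(dv, group, mean, na.rm = TRUE)   # group means
  s  <- tapply(dv, group, sd, na.rm = TRUE)     # group standard deviations
  tt <- t.test(dv ~ group)                      # Welch two-sample t-test
  fmt <- function(i) paste0(round(m[i], decimals), " (", round(s[i], decimals), ")")
  c(fmt(1), fmt(2),
    unname(round(tt$statistic, 2)),
    round(tt$p.value, 3))
}

# Hypothetical helper: run ttest_row() on each named column and stack the rows.
ttest_table <- function(data, dvs, group, decimals = 2) {
  rows <- lapply(dvs, function(v) ttest_row(data[[v]], data[[group]], decimals))
  out <- do.call(rbind, rows)
  rownames(out) <- dvs
  out
}

# Simulated data standing in for c4044sol.txt (not the real data).
set.seed(1)
demo <- data.frame(
  college = rep(c("grad", "nongrad"), each = 20),
  PWB = rnorm(40), SWB = rnorm(40), EWB = rnorm(40)
)

stats <- ttest_table(demo, c("PWB", "SWB", "EWB"), "college")

# Put both label rows inside the matrix, as in the answer above.
outtable <- rbind(c("College graduate", "Less than college graduate", "", ""),
                  c("M (SD)", "M (SD)", "t", "p"),
                  stats)
```

The resulting `outtable` can then be printed the same way as before, e.g. `print(xtable(outtable), include.colnames=FALSE)`. Adding another dependent variable only means adding its name to the character vector.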

graywolf97