3

I'm using xtabs to tabulate some data that contains NAs. In order to make sure the totals are complete, I'm using addNA to count the ones with missing factor levels.

However, this causes problems when using xtable to export to LaTeX for Sweaving, because the row and column names now contain NAs. I have a workaround:

rownames(tab)[is.na(rownames(tab))]<-"NA"
colnames(tab)[is.na(colnames(tab))]<-"NA"

But this becomes tiresome when there are lots of tables. Is there a way of doing this more automatically? Or is there a better way of producing the tables in the first place?

James

2 Answers

6

Interesting question. I couldn't find a way of dealing with this using xtable itself, either. So the best I can suggest is to turn your workaround into a little function that can then be called easily.

For example:

library(xtable)

# Construct some data (unseeded, so the counts vary between runs)
df <- data.frame(
  x1 = addNA(sample(c(NA, LETTERS[1:4]), 100, replace = TRUE)),
  x2 = addNA(sample(c(NA, letters[24:26]), 100, replace = TRUE))
)

# Create a function to rename NA row and column names of a table or matrix
rename_NA <- function(x){
  rownames(x)[is.na(rownames(x))] <- "NA"
  colnames(x)[is.na(colnames(x))] <- "NA"
  x
}

tab <- rename_NA(xtabs(~x1+x2, data=df))
xtable(tab)

This produces valid LaTeX without error:

% latex table generated in R 2.13.0 by xtable 1.5-6 package
% Wed Apr 27 17:20:21 2011
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrrrr}
  \hline
 & x & y & z & NA \\ 
  \hline
A & 4.00 & 7.00 & 10.00 & 4.00 \\ 
  B & 6.00 & 5.00 & 4.00 & 2.00 \\ 
  C & 8.00 & 4.00 & 4.00 & 2.00 \\ 
  D & 8.00 & 5.00 & 1.00 & 6.00 \\ 
  NA & 5.00 & 2.00 & 7.00 & 6.00 \\ 
   \hline
\end{tabular}
\end{center}
\end{table}
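
If there are many tables, or some have more than two dimensions, the same idea can be applied to every dimension at once. This is only a rough sketch: `rename_NA_all`, `df3` and `tabs` are made-up names for illustration, and it assumes the tables are collected in a list.

```r
# Hypothetical helper (not part of xtable): replace NA dimnames with the
# string "NA" along every dimension of a table, matrix or array.
rename_NA_all <- function(x) {
  dimnames(x) <- lapply(dimnames(x), function(d) {
    if (is.null(d)) return(NULL)  # dimension without names: leave as is
    d[is.na(d)] <- "NA"
    d
  })
  x
}

# Small illustrative data set: three factors, each with an explicit NA level
df3 <- data.frame(
  g1 = addNA(c("A", "B", NA, "A")),
  g2 = addNA(c("x", NA, "y", "x")),
  g3 = addNA(c(NA, "p", "q", "p"))
)

# Collect the tables in a list and fix them all in one call
tabs <- list(
  two_way   = xtabs(~ g1 + g2, data = df3),
  three_way = xtabs(~ g1 + g2 + g3, data = df3)
)
tabs <- lapply(tabs, rename_NA_all)

dimnames(tabs$two_way)
```

The two-way tables in `tabs` can then be passed to `xtable()` in the same way as `tab` above.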
Andrie
  • Thanks, I've decided to "go in early" and use a modified `addNA` function, which I'll post below. Accepting your answer though as it still solves the problem nicely. – James Apr 28 '11 at 10:25
2

Another solution to consider is a modified addNA that labels the missing level with the string "NA" in the first place:

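# Variant of base addNA() that can label the NA level with the string "NA"
addNA2 <- function (x, ifany = FALSE, as.string = TRUE)
{
    if (!is.factor(x))
        x <- factor(x)
    # With ifany = TRUE, return x unchanged when it has no missing values
    if (ifany && !any(is.na(x)))
        return(x)
    ll <- levels(x)
    if (!any(is.na(ll)))
        ll <- c(ll, NA)
    x <- factor(x, levels = ll, exclude = NULL)
    # Relabel the NA level so table dimnames contain "NA" rather than NA
    if (as.string) levels(x)[is.na(levels(x))] <- "NA"
    x
}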
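
To see what the extra relabelling step does, here it is in isolation on a toy factor. This is just an illustrative sketch; `f`, `f_na`, `f_str` and `tab` are made-up names.

```r
# A factor with one missing value
f <- factor(c("a", "b", NA, "a"))

# addNA() adds NA as a level, but the level label itself is still NA
f_na <- addNA(f)
levels(f_na)

# Relabel that level as the string "NA" (the step addNA2 adds)
f_str <- f_na
levels(f_str)[is.na(levels(f_str))] <- "NA"
levels(f_str)

# The missing value is now counted under an ordinary "NA" label
tab <- table(f_str)
names(tab)
```

Because the label is now a plain string, the dimnames of any table built from the factor contain no NA and need no fixing afterwards.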
James
  • +1 Nice one, James. If you _really_ feel strongly about this, you can go one step further and overwrite the behaviour of addNA itself. – Andrie Apr 28 '11 at 19:53
  • @Andrie Not brave enough to overwrite the base functions just in case it messes with anything. Would probably need to change the default of `as.string` to `FALSE` to maintain compatibility. – James May 09 '11 at 14:32