Here is a data.table with many rows:

library(data.table)
dt <- data.table(a = sample(letters, 1000, replace = TRUE), b = rnorm(1000))

We can view it simply by typing its name:

dt

... which produces a very convenient view showing only the first and last 5 rows.

However, when I use xtable to print it in a knitr PDF report, xtable prints all 1000 rows:

print(xtable(dt))

Any ideas how to fix this?

I want a pretty table from xtable, but in data.table's default truncated format.

A hack like rbind with the first and last five rows is possible, but not very elegant.
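
For reference, the rbind hack mentioned above could be sketched roughly as follows. This is only an illustration of the workaround (the name `short` is made up here), and it loses the original row numbers and the `---` separator that data.table's own print method shows:

```r
library(data.table)
library(xtable)

dt <- data.table(a = sample(letters, 1000, replace = TRUE), b = rnorm(1000))

# Keep only the first and last five rows, then hand the result to xtable.
# `short` is a hypothetical name used for illustration.
short <- rbind(head(dt, 5), tail(dt, 5))
print(xtable(short))
```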

Shambho

1 Answer

Here is a modified version of print.data.table that returns the formatted object rather than printing it:

firstLast <- function(x, ...) {
    UseMethod("firstLast")
}

firstLast.data.table  <- function (x,
                                   topn = getOption("datatable.print.topn"),
                                   nrows = getOption("datatable.print.nrows"),
                                   row.names = TRUE, ...) {
    if (!is.numeric(nrows)) 
        nrows = 100L
    if (!is.infinite(nrows)) 
        nrows = as.integer(nrows)
    if (nrows <= 0L) 
        return(invisible())
    if (!is.numeric(topn)) 
        topn = 5L
    topnmiss = missing(topn)
    topn = max(as.integer(topn), 1L)
    if (nrow(x) == 0L) {
        if (length(x) == 0L) 
            return("Null data.table (0 rows and 0 cols)\n")
        else return(paste("Empty data.table (0 rows) of ", length(x), 
            " col", if (length(x) > 1L) 
                "s", ": ", paste(head(names(x), 6), collapse = ","), 
            if (ncol(x) > 6) 
                "...", "\n", sep = ""))
    }
    if (topn * 2 < nrow(x) && (nrow(x) > nrows || !topnmiss)) {
        toprint = rbind(head(x, topn), tail(x, topn))
        rn = c(seq_len(topn), seq.int(to = nrow(x), length.out = topn))
        printdots = TRUE
    }
    else {
        toprint = x
        rn = seq_len(nrow(x))
        printdots = FALSE
    }
    toprint = data.table:::format.data.table(toprint, ...)
    if (isTRUE(row.names)) 
        rownames(toprint) = paste(format(rn, right = TRUE), ":", 
            sep = "")
    else rownames(toprint) = rep.int("", nrow(toprint))  # match the (possibly truncated) matrix, not x
    if (is.null(names(x))) 
        colnames(toprint) = rep("NA", ncol(toprint))
    if (printdots) {
        toprint = rbind(head(toprint, topn), `---` = "", tail(toprint, 
            topn))
        rownames(toprint) = format(rownames(toprint), justify = "right")
        return(toprint)
    }
    if (nrow(toprint) > 20L) 
        toprint = rbind(toprint, matrix(colnames(toprint), nrow = 1))
    return(toprint)
}

This can be used to prepare a large data.table for formatting by xtable:

library(xtable)
xtable(firstLast(dt))
% latex table generated in R 3.1.2 by xtable 1.7-4 package
% Tue Feb 17 20:15:12 2015
\begin{table}[ht]
\centering
\begin{tabular}{rll}
  \hline
 & a & b \\ 
  \hline
   1: & i & -0.6356429 \\ 
     2: & w & -1.1533783 \\ 
     3: & r & -0.7459959 \\ 
     4: & x &  1.5646809 \\ 
     5: & o & -1.8158744 \\ 
    --- &  &  \\ 
   996: & z & -1.0835897 \\ 
   997: & a &  0.9219506 \\ 
   998: & q &  0.3388118 \\ 
   999: & l & -1.7123250 \\ 
  1000: & l &  0.1240633 \\ 
   \hline
\end{tabular}
\end{table}
Ista
  • Many thanks. This is exactly what I needed. I compared the function you wrote here with the one I got from `getAnywhere(print.data.table)` and found very little difference. Would you please explain what you changed? Any recommended reading would also be much appreciated! – Shambho Feb 18 '15 at 20:00
  • Functions in R generally return a value, and this value is usually the interesting thing. But there are exceptions to this general rule, and `print` functions are one of those exceptions. We don't use `print` for its return value, but for printing something to the console. My modification of `print.data.table` just makes it behave more like most other functions, i.e., it returns a value rather than printing it. – Ista Feb 18 '15 at 21:18
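
The distinction described in the last comment can be illustrated with a minimal base-R sketch. The function names `show_summary` and `format_summary` are made up for this example and are not part of data.table or xtable:

```r
# A print-style function: its useful work is a side effect (console output),
# and it returns nothing of interest.
show_summary <- function(x) {
    cat("n =", length(x), "mean =", mean(x), "\n")
    invisible(NULL)
}

# A value-returning counterpart: builds the same text but hands it back,
# so another function (such as xtable, in the answer above) can use it.
format_summary <- function(x) {
    paste("n =", length(x), "mean =", mean(x))
}

printed  <- show_summary(c(1, 2, 3))    # writes to the console; printed is NULL
returned <- format_summary(c(1, 2, 3))  # writes nothing; returned holds the text
```

Only the second form can be passed on to further processing, which is why `firstLast` returns the formatted matrix instead of printing it.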