Please consider the following. I have started writing reproducible documents with R Markdown and want to produce some output for a report. Since I am working with more than one data.frame and their column names are not very informative or pretty, I would like to use the col.names argument of knitr::kable().


Problem: Since the data.frame is fairly big and I want to display only specific columns throughout the report, I would like the new column names to appear automatically, depending on the columns I choose.

I can do this by hand like in the following example:

library(knitr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

knitr::kable(iris %>% head(),
             col.names = c("Sepal length", "Sepal width", "Petal length",
                           "Petal width", "Species"))

| Sepal length| Sepal width| Petal length| Petal width|Species |
|------------:|-----------:|------------:|-----------:|:-------|
|          5.1|         3.5|          1.4|         0.2|setosa  |
|          4.9|         3.0|          1.4|         0.2|setosa  |
|          4.7|         3.2|          1.3|         0.2|setosa  |
|          4.6|         3.1|          1.5|         0.2|setosa  |
|          5.0|         3.6|          1.4|         0.2|setosa  |
|          5.4|         3.9|          1.7|         0.4|setosa  |

But when I reduce this data.frame to display only certain columns, I have to set col.names by hand again (here deleting the labels I no longer need), because kable() throws an error when the number of labels does not match the number of columns:

knitr::kable(iris %>% filter(Species == "setosa") %>% 
           select(Sepal.Length, Sepal.Width, Species) %>% head(),
         col.names = c("Sepal length", "Sepal width", "Species"))

| Sepal length| Sepal width|Species |
|------------:|-----------:|:-------|
|          5.1|         3.5|setosa  |
|          4.9|         3.0|setosa  |
|          4.7|         3.2|setosa  |
|          4.6|         3.1|setosa  |
|          5.0|         3.6|setosa  |
|          5.4|         3.9|setosa  |

Question: Is there a way to avoid this, for example by using switch and specifying only once that "Sepal.Length" = "Sepal length" etc.? The solution should also handle any new columns I create, for instance through dplyr::mutate(), either by keeping the newly added column name as is or by letting me specify its label at the beginning of the document, without throwing an error whenever that column does not (yet) exist.

Frederick

1 Answer

You can change the column names of the data.frame itself and then select columns by position in select().

For example:

library(dplyr)
colnames(iris) <- c("Sepal length", "Sepal width", "Petal length",
                           "Petal width", "Species")

knitr::kable(iris %>% head() %>% select(1,2,5), format = "markdown")    

| Sepal length| Sepal width|Species |
|------------:|-----------:|:-------|
|          5.1|         3.5|setosa  |
|          4.9|         3.0|setosa  |
|          4.7|         3.2|setosa  |
|          4.6|         3.1|setosa  |
|          5.0|         3.6|setosa  |
|          5.4|         3.9|setosa  |

Created on 2018-08-21 by the reprex package (v0.2.0.9000).

TC Zhang
  • Yes, but this would also change the column names for the entire data analysis (i.e. I would have to use the full names like "Sepal length" whenever I don't want to use column numbers). I have also found column numbers less reliable than column names when working with `data.frames` whose number of columns changes. – Frederick Aug 21 '18 at 07:46
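
One possible way to address the concern raised in this comment, offered only as a minimal sketch and not as part of the original answer: keep the data.frame's own column names untouched and define the display labels once in a named character vector, then translate names to labels only at the moment the table is printed. The names `pretty_names`, `label_cols()` and `pretty_kable()` below are hypothetical helpers introduced for illustration; they are not part of knitr or dplyr. Columns without an entry in the lookup (such as a column added later with mutate()) keep their original name, and labels defined for columns that are not present are simply ignored.

```r
library(knitr)
library(dplyr)

# Hypothetical lookup table: original column name -> display label.
# Define it once, e.g. in the setup chunk of the report.
pretty_names <- c(
  Sepal.Length = "Sepal length",
  Sepal.Width  = "Sepal width",
  Petal.Length = "Petal length",
  Petal.Width  = "Petal width"
)

# Hypothetical helper: translate column names to labels and fall back to
# the original name when no label has been defined for a column.
label_cols <- function(nms, lookup = pretty_names) {
  hit <- nms %in% names(lookup)
  nms[hit] <- unname(lookup[nms[hit]])
  nms
}

# Hypothetical wrapper around knitr::kable() that applies the labels
# to whatever columns the data.frame currently has.
pretty_kable <- function(df, ...) {
  knitr::kable(df, col.names = label_cols(names(df)), ...)
}

# The analysis keeps using the original names; only the printed header changes.
iris %>%
  filter(Species == "setosa") %>%
  select(Sepal.Length, Sepal.Width, Species) %>%
  mutate(Sepal.Ratio = Sepal.Length / Sepal.Width) %>%
  head() %>%
  pretty_kable()
```

In this sketch the header would show "Sepal length" and "Sepal width", while Species and the new Sepal.Ratio column keep their names because they have no entry in the lookup. Adding a label for Sepal.Ratio to `pretty_names` ahead of time would be harmless in tables where that column does not exist.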