
I am creating a plot where I want to display labels using ggrepel. The minimal example below builds a label with two components separated by a comma: the first is the iris species and the second is the sample size for that group.

# needed libraries
set.seed(123)
library(dplyr)    # provides %>% and n()
library(ggplot2)
library(ggrepel)

# creating a dataframe with label column
(df <- iris %>%
  dplyr::group_by(Species) %>%
  dplyr::summarise(n = n(), mean = mean(Sepal.Length)) %>%
  purrrlyr::by_row(
    .d = .,
    ..f = ~ paste("list(~",
                  .$Species,
                  ",",
                  .$n,
                  ")",
                  sep = ""),
    .collate = "rows",
    .to = "label",
    .labels = TRUE
  ))
#> # A tibble: 3 x 4
#>   Species        n  mean label               
#>   <fct>      <int> <dbl> <chr>               
#> 1 setosa        50  5.01 list(~setosa,50)    
#> 2 versicolor    50  5.94 list(~versicolor,50)
#> 3 virginica     50  6.59 list(~virginica,50)

# displaying labels
ggplot(iris, aes(Species, Sepal.Length)) +
  geom_point() +
  ggrepel::geom_label_repel(data = df,
                            aes(x = Species, y = mean, label = label),
                            parse = TRUE)

Created on 2018-11-17 by the reprex package (v0.2.1)

My question is how I can get rid of the space between these two components. Although I have specified sep = "" in the paste() function, the rendered labels still contain an unwanted space after the comma: they appear as "setosa, 50", "versicolor, 50", and "virginica, 50" but should instead read "setosa,50", "versicolor,50", and "virginica,50".

Indrajeet Patil
  • For your simple example you could make your labels in such a way that you don't need to parse them. If you use `dplyr::mutate(label = paste0(Species, ",", n))` and remove `parse = TRUE` from `geom_label_repel()` you don't get the extra space. I'm guessing you have reasons to use the label-making code you did on your real use case but I don't think it's needed here. – aosmith Nov 19 '18 at 23:07
  • Yes, that's correct. My actual function displays equations, so that's why I am using plotmath in a list and then parsing it. Here is an example- https://indrajeetpatil.github.io/ggstatsplot/articles/ggcoefstats.html#repeated-measures-anova-aovlist. Look at the degrees of freedom for F-statistic. There is a space between numerator df and denominator df that I want to remove and thus the question. – Indrajeet Patil Nov 20 '18 at 00:27
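
The non-parsed alternative suggested in the first comment can be sketched as follows. This is a minimal example that assumes dplyr, ggplot2, and ggrepel are installed; the names df_plain and p_plain are illustrative and not part of the original post. It only applies when the label is plain text, not when it contains plotmath.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

set.seed(123)

# Build plain-text labels; nothing here is plotmath, so no parsing is needed
df_plain <- iris %>%
  group_by(Species) %>%
  summarise(n = n(), mean = mean(Sepal.Length)) %>%
  mutate(label = paste0(Species, ",", n))

# parse = TRUE is omitted, so the label is drawn exactly as written
p_plain <- ggplot(iris, aes(Species, Sepal.Length)) +
  geom_point() +
  geom_label_repel(data = df_plain,
                   aes(x = Species, y = mean, label = label))
```

Printing p_plain draws the plot; the labels come straight from the label column without passing through the plotmath parser.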

1 Answer


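Below is an updated version of your code that places the comma (without a trailing space) between the species name and the sample size. In plotmath, list() draws its arguments as a comma-separated list with a space after each comma, whereas * juxtaposes its operands and a quoted "," is drawn literally. Your labels will therefore look like "~setosa*\",\"*50" instead of list(~setosa,50).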

(df <- iris %>%
    dplyr::group_by(Species) %>%
    dplyr::summarise(n = n(), mean = mean(Sepal.Length)) %>%
    purrrlyr::by_row(
      .d = .,
      ..f = ~ paste("~",
                    .$Species,
                    "*\",\"*",
                    .$n,
                    "",
                    sep = ""),
      .collate = "rows",
      .to = "label",
      .labels = TRUE
    ))
#> # A tibble: 3 x 4
#>   Species        n  mean label               
#>   <fct>      <int> <dbl> <chr>               
#> 1 setosa        50  5.01 "~setosa*\",\"*50"
#> 2 versicolor    50  5.94 "~versicolor*\",\"*50"
#> 3 virginica     50  6.59 "~virginica*\",\"*50"


# displaying labels
ggplot(iris, aes(Species, Sepal.Length)) +
  geom_point() +
  stat_smooth(method="lm",size=0.6,se=FALSE,colour="black")+
  ggrepel::geom_label_repel(data = df,
                            aes(x = Species, y = mean, label = label),
                            parse = TRUE)

This produces the following plot:

[Figure: plot with labels without spaces]
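
The difference between the two label forms can be checked without ggplot2. The following is a minimal base-R sketch; the vectors species and n and the names label_list, label_star, and exprs are illustrative and not taken from the code above. It builds both kinds of label string and confirms that the juxtaposed form is a valid R expression, which is what parse = TRUE requires.

```r
# Hypothetical inputs standing in for the summarised iris data
species <- c("setosa", "versicolor", "virginica")
n <- c(50, 50, 50)

# list() form: plotmath draws a comma-separated list, with a space after the comma
label_list <- paste0("list(~", species, ",", n, ")")

# `*` form: plotmath juxtaposes the operands, and the quoted "," is a literal string
label_star <- paste0("~", species, "*\",\"*", n)

# Each label must parse as an R expression for parse = TRUE to work
exprs <- lapply(label_star, function(x) parse(text = x)[[1]])

print(label_list)
print(label_star)
```

To compare the rendering in base graphics, one option is to call plot.new() and then text(0.5, 0.5, parse(text = label_star[1])), swapping in label_list[1] to see the spaced version.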

Hope it helps.

Taher A. Ghaleb