Warning Message: pairwise_count Function

Question

I'm attempting to follow this tutorial on using the pairwise_count function in the widyr package.

In particular, consider this line of code, where data is a tibble which includes the columns "word" and "section":

data %>% pairwise_count(word, section, sort = TRUE)

However, I received the following warning messages:

distinct_() is deprecated as of dplyr 0.7.0. Please use distinct() instead.
tbl_df() is deprecated as of dplyr 1.0.0. Please use tibble::as_tibble() instead.

I suspect that the pairwise_count function in the widyr package uses some outdated functions, causing these warnings. Is there a more up-to-date package or function in the tidyverse I can use as a replacement? Otherwise, is there a way to use the function without triggering these warnings?

It would be helpful if you included the data and code in this post itself instead of sending people to another website. — Ronak Shah, Sep 20 '20 at 09:52
@RonakShah I've updated the question with the relevant line of code. — dext, Sep 20 '20 at 10:21
What @RonakShah meant is that you should include a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) in your question, not just the line of code that generated the warning messages. That said, I included a [minimal reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) as part of my answer. — Len Greski, Sep 20 '20 at 10:54

Len Greski · Answer 1 · 2020-09-20T11:08:23.960

Code from the widyr section of Chapter 4 of Text Mining with R triggers deprecation warnings for the distinct_() and tbl_df() functions. Since Chapter 4 of the book contains over 100 lines of code, the example below is whittled down to the relevant section and the minimum number of packages needed to reproduce the warning messages.

library(dplyr)
library(janeaustenr)
library(tidytext)
austen_section_words <- austen_books() %>%
     filter(book == "Pride & Prejudice") %>%
     mutate(section = row_number() %/% 10) %>%
     filter(section > 0) %>%
     unnest_tokens(word, text) %>%
     filter(!word %in% stop_words$word)

austen_section_words

library(widyr)

# count words co-occurring within sections
word_pairs <- austen_section_words %>%
     pairwise_count(word, section, sort = TRUE)

word_pairs

...generates the following:

> # count words co-occurring within sections
> word_pairs <- austen_section_words %>%
+      pairwise_count(word, section, sort = TRUE)
Warning messages:
1: `distinct_()` is deprecated as of dplyr 0.7.0.
Please use `distinct()` instead.
See vignette('programming') for more help
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
2: `tbl_df()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated. 
> 
> word_pairs
# A tibble: 796,008 x 3
   item1     item2         n
   <chr>     <chr>     <dbl>
 1 darcy     elizabeth   144
 2 elizabeth darcy       144
 3 miss      elizabeth   110
 4 elizabeth miss        110
 5 elizabeth jane        106
 6 jane      elizabeth   106
 7 miss      darcy        92
 8 darcy     miss         92
 9 elizabeth bingley      91
10 bingley   elizabeth    91
# … with 795,998 more rows

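These messages are generated because widyr::pairwise_count() delegates to pairwise_count_(), which calls dplyr::distinct_(). The tbl_df() warning does not come from distinct_() itself: as the backtrace further down shows, it is raised by a separate internal helper, widyr:::custom_melt(). The source of pairwise_count_() is: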

#' @rdname pairwise_count
#' @export
pairwise_count_ <- function(tbl, item, feature, wt = NULL, ...) {
  if (is.null(wt)) {
    func <- squarely_(function(m) m %*% t(m), sparse = TRUE, ...)
    wt <- "..value"
  } else {
    func <- squarely_(function(m) m %*% t(m > 0), sparse = TRUE, ...)
  }

  tbl %>%
    distinct_(.dots = c(item, feature), .keep_all = TRUE) %>%
    mutate(..value = 1) %>%
    func(item, feature, wt) %>%
    rename(n = value)
}
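To make the `m %*% t(m)` step concrete, here is a minimal base R sketch of the same idea on a toy data set. The data frame `toy` and the names `m` and `co` are made up for illustration, and the sketch uses a dense matrix rather than the sparse one widyr builds, so treat it as an approximation of the technique and not as widyr's implementation.

```r
# Hypothetical toy data: which word appears in which section
toy <- data.frame(
  word    = c("darcy", "elizabeth", "darcy", "jane", "elizabeth", "jane"),
  section = c(1, 1, 2, 2, 3, 3)
)

# Drop duplicate (word, section) rows, the job distinct_() does above
toy <- unique(toy)

# Binary word-by-section matrix: 1 if the word occurs in the section
m <- 1 * (unclass(table(toy$word, toy$section)) > 0)

# Entry [i, j] counts the sections in which words i and j both occur;
# the diagonal counts the sections each word occurs in
co <- m %*% t(m)
co
```

Because the product is symmetric, every pair shows up twice, which is why the output above lists both darcy/elizabeth and elizabeth/darcy with the same count.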

We can see the sources of the warnings when we print the warning messages with lifecycle::last_warnings().

<deprecated>
message: `tbl_df()` is deprecated as of dplyr 1.0.0.
Please use `tibble::as_tibble()` instead.
This warning is displayed once every 8 hours.
Call `lifecycle::last_warnings()` to see where this warning was generated.
backtrace:
  9. widyr::pairwise_count(., word, section, sort = TRUE)
 10. widyr::pairwise_count_(...)
  3. dplyr::distinct_(., .dots = c(item, feature), .keep_all = TRUE)
  3. dplyr::mutate(., ..value = 1)
 10. widyr:::func(., item, feature, wt)
 19. widyr:::new_f(tbl, item, feature, value, ...)
  7. widyr:::custom_melt(.)
 15. dplyr::tbl_df(.)

>

As of this writing (September 2020), version 0.1.3 is the current release of widyr. Resolving these warning messages requires changes to the package itself: replacing the call to dplyr::distinct_() in widyr::pairwise_count_() and the call to dplyr::tbl_df() in widyr:::custom_melt(). Since this is a currently supported R package, the way to initiate this process is to report an issue on the widyr GitHub Issues page.

As noted in the text of the warning message, distinct_() has been replaced with dplyr::distinct(), and tbl_df() has been replaced with tibble::as_tibble().
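For illustration, the replacement could look roughly like the sketch below. It assumes the column names arrive as character strings, as they do in pairwise_count_(); the data frame `toy` and the vector `cols` are hypothetical, and this is one possible translation, not the fix the widyr maintainers adopted.

```r
library(dplyr)

# Hypothetical toy data with one duplicated (word, section) pair
toy <- data.frame(
  word    = c("a", "a", "b", "b"),
  section = c(1, 1, 1, 2),
  other   = 1:4
)

# Column names held as strings, like `item` and `feature` in pairwise_count_()
cols <- c("word", "section")

# Deprecated form: distinct_(toy, .dots = cols, .keep_all = TRUE)
# One non-deprecated equivalent: turn the strings into symbols and splice them
deduped <- toy %>%
  distinct(!!!rlang::syms(cols), .keep_all = TRUE)

# Deprecated form: tbl_df(deduped)
as_tbl <- tibble::as_tibble(deduped)
as_tbl
```

With `.keep_all = TRUE`, the first row of each (word, section) combination is kept along with its other columns.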

Suppressing the warnings

One can suppress the warnings produced by pairwise_count() by wrapping the call in suppressWarnings().

library(widyr)
suppressWarnings(
# count words co-occurring within sections
word_pairs <- austen_section_words %>%
     pairwise_count(word, section, sort = TRUE))

...and the output:

> suppressWarnings(
+ # count words co-occurring within sections
+ word_pairs <- austen_section_words %>%
+      pairwise_count(word, section, sort = TRUE))
> 
> word_pairs
# A tibble: 796,008 x 3
   item1     item2         n
   <chr>     <chr>     <dbl>
 1 darcy     elizabeth   144
 2 elizabeth darcy       144
 3 miss      elizabeth   110
 4 elizabeth miss        110
 5 elizabeth jane        106
 6 jane      elizabeth   106
 7 miss      darcy        92
 8 darcy     miss         92
 9 elizabeth bingley      91
10 bingley   elizabeth    91
# … with 795,998 more rows
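Note that suppressWarnings() hides every warning raised inside the call, not only the deprecation notices. If that is too broad, a more selective approach is sketched below using base R's withCallingHandlers(). The helper `muffle_deprecated()` and the stand-in function `noisy()` are hypothetical names invented for this example, and matching on the word "deprecated" in the message is an assumption about the wording of the warnings.

```r
# Hypothetical stand-in for a call that raises a deprecation warning
# plus an unrelated warning that should remain visible
noisy <- function() {
  warning("`old_fn()` is deprecated as of pkg 1.0.0.")
  warning("input contained missing values")
  "result"
}

# Muffle only warnings whose message mentions "deprecated";
# all other warnings propagate as usual
muffle_deprecated <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("deprecated", conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Record the warnings that still get through, to show the filter is selective
seen <- character()
out <- withCallingHandlers(
  muffle_deprecated(noisy()),
  warning = function(w) {
    seen <<- c(seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
seen
```

Applied to the example above, this would read word_pairs <- muffle_deprecated(austen_section_words %>% pairwise_count(word, section, sort = TRUE)).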

Appendix

This code was run on version 4.0.2 of R, with the following packages, as reported by sessionInfo():

R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.6

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] tidytext_0.2.5    janeaustenr_0.1.5 widyr_0.1.3       tidyr_1.1.1      
[5] dplyr_1.0.2      

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5       rstudioapi_0.11  magrittr_1.5     tidyselect_1.1.0
 [5] lattice_0.20-41  R6_2.4.1         rlang_0.4.7      fansi_0.4.1     
 [9] stringr_1.4.0    tools_4.0.2      grid_4.0.2       packrat_0.5.0   
[13] broom_0.7.0      utf8_1.1.4       cli_2.0.2        ellipsis_0.3.1  
[17] assertthat_0.2.1 tibble_3.0.3     lifecycle_0.2.0  crayon_1.3.4    
[21] Matrix_1.2-18    purrr_0.3.4      vctrs_0.3.2      tokenizers_0.2.1
[25] SnowballC_0.7.0  glue_1.4.1       stringi_1.4.6    compiler_4.0.2  
[29] pillar_1.4.6     generics_0.0.2   backports_1.1.8  pkgconfig_2.0.3

See remark of Julia Silge in [this github issue](https://github.com/dgrtwo/widyr/issues/11#issuecomment-658876284) — phiver, Sep 20 '20 at 18:12
@phiver - If I understand the github issue correctly, the linked issue refers to an error trying to compute `distinct()` for variables not found in the data, which was resolved by tidytext 0.1.9.9. The latest source code for `pairwise_count()` in github as of September 20, 2020 still uses `distinct_()` as noted in my answer, so an update to the development version of `widyr` won't eliminate the warning messages. — Len Greski, Sep 20 '20 at 19:38
correct. Julia says you will still see a warning about `distinct_()`. Tidyverse packages are nice, but there is too much interdependency :-( . — phiver, Sep 21 '20 at 08:01
