The tidytext book has examples with a tidier for topicmodels:

library(tidyverse)
library(tidytext)
library(topicmodels)
library(broom)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
                           word = c("dog", "cat", "chicken"),
                           n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

animal_lda <- tidy(animal_lda, matrix = "beta")

# Console output
Error in as.data.frame.default(x) : 
  cannot coerce class "structure("LDA_VEM", package = "topicmodels")" to a data.frame
In addition: Warning message:
In tidy.default(animal_lda, matrix = "beta") :
  No method for tidying an S3 object of class LDA_VEM , using as.data.frame

This replicates the error that is also seen here, but in this instance library(tidytext) is loaded.

Below is a list of all packages and their corresponding versions:

 packageVersion("tidyverse")
 ‘1.2.1’

 packageVersion("tidytext")
 ‘0.1.6’   

 packageVersion("topicmodels")
 ‘0.2.7’  

 packageVersion("broom")
 ‘0.4.3’

Output from function call sessionInfo():

R version 3.4.3 (2017-11-30)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] broom_0.4.3       tidytext_0.1.6    forcats_0.2.0     stringr_1.2.0     dplyr_0.7.4       purrr_0.2.4       readr_1.1.1       tidyr_0.8.0      
 [9] tibble_1.4.2      ggplot2_2.2.1     tidyverse_1.2.1   topicmodels_0.2-7

loaded via a namespace (and not attached):
 [1] modeltools_0.2-21 slam_0.1-42       NLP_0.1-11        reshape2_1.4.3    haven_1.1.1       lattice_0.20-35   colorspace_1.3-2  SnowballC_0.5.1  
 [9] stats4_3.4.3      yaml_2.1.16       rlang_0.1.6       pillar_1.1.0      foreign_0.8-69    glue_1.2.0        modelr_0.1.1      readxl_1.0.0     
[17] bindrcpp_0.2      bindr_0.1         plyr_1.8.4        munsell_0.4.3     gtable_0.2.0      cellranger_1.1.0  rvest_0.3.2       psych_1.7.8      
[25] tm_0.7-3          parallel_3.4.3    tokenizers_0.1.4  Rcpp_0.12.15      scales_0.5.0      jsonlite_1.5      mnormt_1.5-5      hms_0.4.1        
[33] stringi_1.1.6     grid_3.4.3        cli_1.0.0         tools_3.4.3       magrittr_1.5      lazyeval_0.2.1    janeaustenr_0.1.5 crayon_1.3.4     
[41] pkgconfig_2.0.1   Matrix_1.2-12     xml2_1.2.0        lubridate_1.7.2   assertthat_0.2.0  httr_1.3.1        rstudioapi_0.7    R6_2.2.2         
[49] nlme_3.1-131      compiler_3.4.3   
Sam
Isaiah

4 Answers

6

Deleting .Rhistory and .RData led to correct behaviour.
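
The cleanup described above can be sketched in base R. This is only an illustration: `remove_stale_workspace` is a hypothetical helper name, and it assumes the saved workspace lives in the project directory. It permanently deletes the files, so copy anything you still need out of `.RData` first.

```r
# Hypothetical helper: delete a saved workspace and history file, if present.
# Assumption: both files sit directly in `dir` (the project directory).
remove_stale_workspace <- function(dir = ".") {
  candidates <- file.path(dir, c(".RData", ".Rhistory"))
  present <- candidates[file.exists(candidates)]
  removed <- file.remove(present)  # logical vector, one entry per file
  present[removed]                 # paths that were actually deleted
}

# Usage: run once in the project directory, then restart the R session.
# remove_stale_workspace(".")
```

Restart R afterwards so that nothing from the old workspace is still in memory.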

Isaiah
2

Wow, that is extremely mysterious to me. I am not able to reproduce that error. I installed all the same versions as you, except that I am on macOS instead of Windows. I do have tests for the LDA tidiers that run and pass on Windows on AppVeyor, so I would expect this to work.

The code you have should work without loading broom, for what it's worth.

library(tidyverse)
library(tidytext)
library(topicmodels)

year_word_counts <- tibble(year = c("2007", "2008", "2009"),
                           word = c("dog", "cat", "chicken"),
                           n = c(1753L, 1157L, 1057L))

animal_dtm <- cast_dtm(data = year_word_counts, document = year, term = word, value = n)

animal_lda <- LDA(animal_dtm, k = 5, control = list( seed = 1234))

class(animal_lda)
#> [1] "LDA_VEM"
#> attr(,"package")
#> [1] "topicmodels"

tidy(animal_lda, matrix = "beta")
#> # A tibble: 15 x 3
#>    topic term                                                beta
#>    <int> <chr>                                              <dbl>
#>  1     1 dog     0.0000000000000000000000000000000000000000000372
#>  2     2 dog     0.0000000000000000000000000000000000000000000372
#>  3     3 dog     0.0000000000000000000000000000000000000000000372
#>  4     4 dog     1.00                                            
#>  5     5 dog     0.0000000000000000000000000000000000000000000372
#>  6     1 cat     0.0000000000000000000000000000000000000000000372
#>  7     2 cat     0.0000000000000000000000000000000000000000000372
#>  8     3 cat     0.0000000000000000000000000000000000000000000372
#>  9     4 cat     0.0000000000000000000000000000000000000000000372
#> 10     5 cat     1.00                                            
#> 11     1 chicken 0.0000000000000000000000000000000000000000000372
#> 12     2 chicken 0.0000000000000000000000000000000000000000000372
#> 13     3 chicken 1.00                                            
#> 14     4 chicken 0.0000000000000000000000000000000000000000000372
#> 15     5 chicken 0.0000000000000000000000000000000000000000000372

Created on 2018-02-14 by the reprex package (v0.2.0).

What happens if you load library(methods) as well?
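
If the error persists, a quick diagnostic along these lines may help narrow it down. It is a sketch under assumptions, not a confirmed explanation of the bug: `describe_dispatch` is a hypothetical helper, and it assumes the tidier is registered as an S3 method for class "LDA", the parent class of "LDA_VEM".

```r
# Hypothetical helper: report how R sees an object and whether an S3 method
# for the given generic/class pair can currently be found.
describe_dispatch <- function(x, generic = "tidy", s3_class = "LDA") {
  list(
    class        = class(x),                 # class attribute as stored
    is_s4        = isS4(x),                  # TRUE for topicmodels fits
    is_a         = methods::is(x, s3_class), # does x extend s3_class?
    method_found = !is.null(
      utils::getS3method(generic, s3_class, optional = TRUE)
    )
  )
}

# Usage, after loading tidytext and topicmodels and fitting the model:
# describe_dispatch(animal_lda)
```

If `is_a` or `method_found` comes back `FALSE` in the failing session but `TRUE` in a fresh one, that points at session state rather than at the package versions.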

Julia Silge
  • I tried two combinations: without broom, and without broom but with methods. Both combinations gave the same error. – Isaiah Feb 15 '18 at 14:00
  • This was also happening on another Windows PC. I tried deleting .Rhistory and .RData. On both PCs, the code then ran correctly. On the second machine, I noticed that before the deletion, when RStudio was opening the project, I would get the message "loading required package topicmodels". After deleting the two files, the loading message went away. – Isaiah Feb 15 '18 at 14:15
  • Are you saving your workspace as .RData between sessions? I wonder if something wacky is going on with that... For the record, I strongly encourage all R users to move away from such a workflow: http://stat545.com/block002_hello-r-workspace-wd-project.html#workspace-and-working-directory – Julia Silge Feb 15 '18 at 17:17
  • Not usually, but does happen. No more! – Isaiah Feb 15 '18 at 21:04
1

Adding to the very helpful answer provided by Julia Silge:

I too believe that the interaction between loading .RData and the topicmodels package is the culprit here. But you can still work with your saved workspace:

I was able to eliminate the problem by restarting RStudio, loading the topicmodels package, and only then loading the .RData file. Done in this sequence, the error message disappears. Loading the data first and then the package does not work.

One more word on workspaces: in the case of LDA, using these along with your R scripts is really the only way I could figure out to work efficiently. Depending on the parameters and the size of the corpus, fitting an LDA model may take several hours. It is crucial to be able to save the model fit so that you can do further analyses down the road.

  • I often use RDS files in such cases: `saveRDS(animal_lda, "animal_lda.rds"); animal_lda <- readRDS("animal_lda.rds")` – Isaiah May 24 '19 at 23:42
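
The RDS approach from the comment above can be wrapped in a small caching helper, so an expensive fit runs only once. This is a minimal sketch in base R: `fit_or_load` is a hypothetical name, and `fit_fun` stands for any zero-argument function that returns the fitted model.

```r
# Hypothetical helper: reuse a cached fit if one exists, otherwise fit and save.
fit_or_load <- function(path, fit_fun) {
  if (file.exists(path)) {
    return(readRDS(path))  # reuse the previously saved fit
  }
  fit <- fit_fun()         # expensive step, e.g. a call to LDA()
  saveRDS(fit, path)       # store only this one object, not the workspace
  fit
}

# Usage (assumes topicmodels is loaded and animal_dtm exists):
# animal_lda <- fit_or_load("animal_lda.rds", function() {
#   LDA(animal_dtm, k = 5, control = list(seed = 1234))
# })
```

Because the object is read back explicitly in a script, it is easy to make sure the library() calls for topicmodels and tidytext come before it, which matches the load order described in the answer above.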
0

I had the same issue when I loaded an LDA model I had saved. In the end, for no apparent reason, it worked again after I restarted the R session.

Léo Henry