How to resolve R Error using text2vec glove function: unused argument (grain_size = 100000)?

Question

I'm working through the text2vec vignette in the documentation to create word embeddings for some tweets:

head(twtdf$Tweet.content)
[1] "$NFLX $GS $INTC $YHOO $LVS\n$MSFT $HOG $QCOM $LUV $UAL\n$MLNX $UA $BIIB $GOOGL $GM $V\n$SKX $GE $CAT $MCD $AAL $SBUX"            
[2] "Good news frequent fliers. @AmericanAir says lower fares will be here for awhile"                   
[3] "Wall St. closing out the week with more earnings. What to watch:\n▶︎ $MCD\n▶︎ $AAL\n▶︎ $CAT\n"
[4] "Barrons loves $AAL at low multiple bc it's \"insanely profitable\". Someone tell them how cycles+ multiples work."               
[5] "These airlines are now offering in-flight Wi-Fi $DAL $AAL"

I pretty much followed the guide as given:

library(text2vec)
require(text2vec)

twtdf <- read.csv("tweets.csv",header=T, stringsAsFactors = F)
twtdf$ID <- seq.int(nrow(twtdf))

tokens = twtdf$Tweet.content %>% tolower %>%  word_tokenizer
length(tokens)
it = itoken(tokens)
# create vocabulary
v = create_vocabulary(it) %>% 
  prune_vocabulary(term_count_min = 5)

# create co-occurrence vectorizer
vectorizer = vocab_vectorizer(v, grow_dtm = F, skip_grams_window = 5L)

#dtm <- create_dtm(it, vectorizer, grow_dtm = R)

it = itoken(tokens)
tcm = create_tcm(it, vectorizer)
glove_model = glove(tcm, word_vectors_size = 50, vocabulary = v, x_max = 10, learning_rate = .2)

fit(tcm, glove_model, n_iter = 15)

#when this was executed, R couldn't find the function
#fit <- GloVe(tcm = tcm, word_vectors_size = 50, x_max = 10, learning_rate = 0.2, num_iters = 15)

However, whenever I execute the line that creates glove_model, I get the following error:

Error in .subset2(public_bind_env, "initialize")(...) : 
  unused argument (grain_size = 100000)
In addition: Warning message:
'glove' is deprecated.
Use 'GloVe' instead.

I did try using GloVe instead, but R reports that it can't find the function, even after reinstalling the text2vec package and loading it with require().

To make sure it wasn't a formatting issue with my data, I ran the same code on the movie_review data and encountered the same problem. Just to be thorough, I also tried specifying the grain_size argument explicitly, but got the same error. I checked the issues on the Git repository and found nothing relevant, and nothing turned up on this site or in a general web search either.

Has anyone else encountered this, or is it a newcomer problem?

Where did you find `glove(tcm, word_vectors_size = 50, vocabulary = v, x_max = 10, learning_rate = .2)`? The documentation on CRAN is clear: create the model with `glove = GlobalVectors$new(word_vectors_size = 50, vocabulary = vocab, x_max = 10)`. – Dmitriy Selivanov, Apr 11 '17 at 13:08
I can't find where I got that from, but thank you very much! – xq1515426, Apr 11 '17 at 14:48
So everything works now? If so, I will post my comment as an answer. – Dmitriy Selivanov, Apr 11 '17 at 16:19
Yeah, it's working great. Please do and I'll accept it right away. Thanks again! – xq1515426, Apr 11 '17 at 22:12

score 4 · Answer 1 · answered Jun 29 '20 at 15:29

Apparently the GlobalVectors constructor has changed once more: it now takes rank instead of word_vectors_size, and it seems to take the vocabulary information directly from the TCM.

glove = GlobalVectors$new(rank = 50, x_max = 10)
wv_main = glove$fit_transform(tcm, n_iter = 10, convergence_tol = 0.01, n_threads = 8)
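For context, here is a minimal end-to-end sketch of how those two lines fit into the rest of the pipeline. It assumes a recent text2vec release (0.6 or later, where the constructor takes `rank`) and uses the bundled movie_review data mentioned in the question rather than the tweets. The variable names are illustrative, and summing the main and context vectors follows the package vignette; it is one common choice, not the only one.

```r
library(text2vec)

# Assumption: text2vec >= 0.6, where GlobalVectors$new() accepts `rank`.
data("movie_review")

# Tokenize and build a pruned vocabulary.
tokens <- word_tokenizer(tolower(movie_review$review))
it <- itoken(tokens, progressbar = FALSE)
vocab <- prune_vocabulary(create_vocabulary(it), term_count_min = 5L)

# Build the term co-occurrence matrix; the window size is passed to
# create_tcm() here, not to vocab_vectorizer().
vectorizer <- vocab_vectorizer(vocab)
tcm <- create_tcm(it, vectorizer, skip_grams_window = 5L)

# Fit GloVe; the vocabulary is taken from the TCM's dimnames.
glove <- GlobalVectors$new(rank = 50, x_max = 10)
wv_main <- glove$fit_transform(tcm, n_iter = 10, convergence_tol = 0.01, n_threads = 2)

# Combine main and context vectors (terms in rows, 50 columns).
wv_context <- glove$components
word_vectors <- wv_main + t(wv_context)
```

If the constructor call fails with an "unused argument" error, the installed version probably predates this interface.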

score 1 · Accepted Answer · answered Apr 12 '17 at 05:36

Just use the correct constructor for the model: glove = GlobalVectors$new(word_vectors_size = 50, vocabulary = vocab, x_max = 10)

glove() is the old function from a very old version of the package.

Dmitriy Selivanov

When I use the suggested constructor, `glove <- GlobalVectors$new(word_vectors_size = 50, vocabulary = vocab, x_max = 20)`, I get the following error: `Error in .subset2(public_bind_env, "initialize")(...) : unused arguments (word_vectors_size = 50, vocabulary = vocab)`. Any thoughts on why? – nigus21, Feb 10 '20 at 22:25
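Since every "unused argument" error in this thread comes from constructor arguments that do not match the installed release, a reasonable first step is to check which text2vec version is actually loaded. A small sketch (the variable name `installed_text2vec` is just illustrative):

```r
# Report the installed text2vec version, or NA if the package is missing.
installed_text2vec <- if (requireNamespace("text2vec", quietly = TRUE)) {
  as.character(utils::packageVersion("text2vec"))
} else {
  NA_character_
}
print(installed_text2vec)

# In an interactive session, the help page for the installed version
# lists the arguments its constructor accepts:
# ?text2vec::GlobalVectors
```

Comparing that version against the documentation you are following should show whether the constructor expects `word_vectors_size` and `vocabulary` or `rank`.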
