
I would like to pre-initialise GloVe word vectors and biases using the initial parameter of fit_transform. The documentation of the function says to pass a named list with "w_i", "w_j", "b_i", "b_j" values - the initial word vectors and biases.

So I run fit_transform and extract them, then create a new GloVe instance and pass the extracted values to it via the initial parameter. Although I expect training to continue from where the first fit_transform stopped, the cost always explodes, which indicates that either I am not doing it the right way or it is not supported.

I tried passing the initial parameter to GloVe$new() only, to glove_model$fit_transform() only, and to both. The error/cost explodes any time I use the initial parameter.

# A. make vectorizer and tcm
vectorizer <- vocab_vectorizer(vocab)
tcm <- create_tcm(it_train, vectorizer, skip_grams_window = 2, skip_grams_window_context = "left")
# B. create GloVe model and fit_transform - first pass
glove_model <- GloVe$new(word_vectors_size = 300, vocabulary = vocab, x_max = 10)
wv <- glove_model$fit_transform(tcm, n_iter = 10, progressbar = FALSE, shuffle = FALSE, learning_rate = 0.25, lambda = 1e-5) # convergence_tol = 0.01,
# C. extract parameters from the fitted model into a named list
initialisationParamsNames <- c("w_i", "w_j", "b_i", "b_j")
initialParam <- lapply(initialisationParamsNames, function(x) glove_model$.__enclos_env__$private[[x]])
names(initialParam) <- initialisationParamsNames
# D. fit_transform again, warm-started with the initial parameter from the first pass
glove_model <- GloVe$new(word_vectors_size = 300, vocabulary = vocab, x_max = 10, initial = initialParam)
wv2 <- glove_model$fit_transform(tcm, n_iter = 10, progressbar = FALSE, shuffle = FALSE, learning_rate = 0.01, lambda = 1e-5, initial = initialParam) # convergence_tol = 0.01,

The output of the first pass (B) is:

INFO [2019-10-12 12:23:52] 2019-10-12 12:23:52 - epoch 1, expected cost 0.3355
INFO [2019-10-12 12:24:00] 2019-10-12 12:24:00 - epoch 2, expected cost 0.1273
INFO [2019-10-12 12:24:08] 2019-10-12 12:24:08 - epoch 3, expected cost 0.0930
INFO [2019-10-12 12:24:16] 2019-10-12 12:24:16 - epoch 4, expected cost 0.0804
INFO [2019-10-12 12:24:24] 2019-10-12 12:24:24 - epoch 5, expected cost 0.0735
INFO [2019-10-12 12:24:32] 2019-10-12 12:24:32 - epoch 6, expected cost 0.0686
INFO [2019-10-12 12:24:40] 2019-10-12 12:24:40 - epoch 7, expected cost 0.0648
INFO [2019-10-12 12:24:48] 2019-10-12 12:24:48 - epoch 8, expected cost 0.0618
INFO [2019-10-12 12:24:55] 2019-10-12 12:24:55 - epoch 9, expected cost 0.0594
INFO [2019-10-12 12:25:03] 2019-10-12 12:25:03 - epoch 10, expected cost 0.0574

while on the second pass the cost explodes from 0.0574 to over 1000:

Warning in glove_model$fit_transform(tcm, n_iter = 10, progressbar = FALSE,  :
  Cost is too big, probably something goes wrong... try smaller learning rate
INFO [2019-10-12 12:27:49] 2019-10-12 12:27:49 - epoch 1, expected cost 1018.4479
Warning in glove_model$fit_transform(tcm, n_iter = 10, progressbar = FALSE,  :
  Cost is too big, probably something goes wrong... try smaller learning rate
INFO [2019-10-12 12:27:57] 2019-10-12 12:27:57 - epoch 2, expected cost 1062.0293
Warning in glove_model$fit_transform(tcm, n_iter = 10, progressbar = FALSE,  :
  Cost is too big, probably something goes wrong... try smaller learning rate
INFO [2019-10-12 12:28:05] 2019-10-12 12:28:05 - epoch 3, expected cost 1062.0293

I expected the cost to resume from around 0.0574, but it does not.

The parameters stated in the documentation appear to match the source code.

Thank you very much for your help

Melt
  • Weighting *initial* by 0.95, iterating with just 5 epochs at a time, and updating *initial* only when the cost improves on the previous best seems to avoid the exploding cost, while still using the previously computed weights and biases to a degree. – Melt Oct 12 '19 at 13:24
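For reference, a sketch of the workaround described in the comment above. The 0.95 damping factor and the 5-epoch chunks are heuristic choices, not anything from the text2vec documentation, and get_final_cost is a hypothetical placeholder (I read the per-epoch cost off the futile.logger output by hand):

    # Warm-start GloVe in short chunks: damp the previous parameters by 0.95
    # and keep them only when the final cost beats the previous best.
    param_names <- c("w_i", "w_j", "b_i", "b_j")
    damp <- 0.95        # heuristic damping factor
    best_cost <- Inf
    best_param <- NULL  # NULL on the first round -> cold start

    for (round in 1:4) {
      if (is.null(best_param)) {
        model <- GloVe$new(word_vectors_size = 300, vocabulary = vocab, x_max = 10)
      } else {
        init <- lapply(best_param, function(m) m * damp)  # weight *initial* by 0.95
        model <- GloVe$new(word_vectors_size = 300, vocabulary = vocab, x_max = 10,
                           initial = init)
      }
      wv <- model$fit_transform(tcm, n_iter = 5, learning_rate = 0.05,
                                progressbar = FALSE, shuffle = FALSE, lambda = 1e-5)
      cost <- get_final_cost(model)  # hypothetical accessor: last epoch's expected cost
      if (cost < best_cost) {        # update *initial* only when the cost improved
        best_cost <- cost
        best_param <- lapply(param_names, function(x)
          model$.__enclos_env__$private[[x]])
        names(best_param) <- param_names
      }
    }

This avoids the explosion in my runs, but it still does not explain why passing the undamped parameters back in makes the cost blow up.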
