
I have been using the text package for a couple of days. Everything works fine as long as I call BERT or Electra, for example. However, when I call "roberta-base" or "xlm-roberta-base" on some texts, I very often get an error.

Example of a text that triggers the problem:

"627% but if they had lower striked than 16 I would have gone even further OTM. This could really fall off a cliff."

I keep getting the following error:

Error in dplyr::bind_cols():
! Can't recycle ..1 (size 44) to match ..2 (size 32).

The backtrace shows the following:

Backtrace:

     ▆
  1. ├─base::system.time(embeddings_CL2 <- textEmbed(xx, model = "roberta-base"))
  2. ├─text::textEmbed(xx, model = "roberta-base")
  3. │ └─text::textEmbedRawLayers(...)
  4. │   └─text:::sortingLayers(x = hg_embeddings, layers = layers, return_tokens = return_tokens)
  5. │     └─dplyr::bind_cols(tokens_layer_number, layers_4_token)
  6. │       └─vctrs::vec_cbind(!!!dots, .name_repair = .name_repair, .error_call = current_env())
  7. └─vctrs::stop_incompatible_size(...)
  8.   └─vctrs:::stop_incompatible(...)
  9.     └─vctrs:::stop_vctrs(...)
 10.       └─rlang::abort(message, class = c(class, "vctrs_error"), ..., call = call)

Any idea why? Thanks.

Best,
Luigi


1 Answer

This works for me using text_0.9.99.8 from GitHub (https://github.com/OscarKjell/text).

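library(text)

# The sentence from the question that failed with "roberta-base"
test_text <- "627% but if they had lower striked than 16 I would have gone even further OTM. This could really fall off a cliff."

# Embed the text with the RoBERTa base model
test_emb <- textEmbed(test_text,
                      model = "roberta-base")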
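
If you are currently on a different release, one way to try the GitHub version is sketched below. This is a minimal sketch, not an official install procedure: it assumes the remotes package is acceptable for the install (devtools::install_github() should behave similarly) and that the repository linked above is the one to install from.

```r
# Assumption: 'remotes' is used for the GitHub install.
install.packages("remotes")
remotes::install_github("OscarKjell/text")

# After restarting R, check which version of text is installed.
library(text)
packageVersion("text")
```

After the install, rerunning the textEmbed() call above should show whether the error still occurs with your version.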
Oscar Kjell
    Right! Thanks. I was getting the above error with the CRAN version. With 0.9.99.8 from GitHub everything is fine. Thanks! – Luigi Curini Apr 04 '23 at 16:12