Questions tagged [text2vec]

text2vec is an R package that provides a fast, memory-efficient framework for text mining applications: vectorization, word embeddings, topic modelling, and more.

The goal of text2vec is to provide tools that make text mining in R easy while running at C++ speeds:

Core parts written in C++
Small memory footprint
Concise, pipe friendly API
No need to load all data into RAM; process it in chunks
Easy vertical scaling across multiple cores and threads

See the development page on GitHub.

111 questions

votes

1 answer

In text2vec package in R, could not find function "create_vocab_corpus"

I was trying to understand the text2vec package from http://dsnotes.com/articles/text2vec but at the following step: Now we can construct DTM. Again, since all functions related to corpus construction have streaming API, we have to create iterator…

r text2vec

asked May 01 '16 at 12:27

Saurabh Yadav

-1

votes

1 answer

text2vec document similarity code returns two values

I am learning to assess text similarity between documents. Going through the text2vec tutorial (http://text2vec.org/similarity.html) on the topic, I noticed that the code returns two values for similarity. Here is the tail end of the code in the…

r nlp text2vec

asked Apr 28 '20 at 17:14

nicholas heimpel

-1

votes

1 answer

Combine two words in a corpus with R

So here is my code ny <- read.csv2("nyt.csv", sep = "\t", header = T) ny_texte <- as.vector(ny) iterator <- itoken(ny_texte, preprocessor=tolower, tokenizer=word_tokenizer, …

r text-mining corpus text2vec

asked Dec 23 '19 at 23:52

florian joly

-1

votes

1 answer

Text Similarity - Cosine - Control

I would like to ask if anybody could check my code, because it was behaving weirdly: it went from not working and giving me errors to suddenly working without my changing anything. The code is at the bottom. Background: So my goal is to calculate text…

r cosine-similarity linguistics quanteda text2vec

asked Nov 15 '18 at 18:43

Kamil Liskutin

-1

votes

1 answer

How to convert text fields into numeric/vector space for an SVM in RStudio?

I am attempting to train a Support Vector Machine to aid in the detection of similarity between strings. My training data consists of two text fields and a third field that contains 0 or 1 to indicate similarity. This last field was calculated with…

r svm data-mining text2vec vector-space

asked Jul 03 '17 at 21:21

UbuntuNewbie

-2

votes

0 answers

How to make embedding models sensitive to numbers?

I have a set of data, but it is presented in the form of logs such as v0.1.1, v0.2.3, and when I try it with a pretrained text2vec model I find it hard to pinpoint the exact version number or update date, seeing as it seems to be insensitive to the…

deep-learning nlp dataset word-embedding text2vec

asked Aug 21 '23 at 08:02

Omnis
