Highest Voted 'fasttext' Questions

2

votes

1 answer

Gensim: Any chance to get word frequency in Word2Vec format?

I am doing my research with a fastText pre-trained model and I need word frequencies for further analysis. Do the .vec or .bin files provided on the fastText website contain word frequency information? If yes, how do I get it? I am using…

python-3.6 gensim fasttext

asked Nov 06 '19 at 17:28

qby pony

33
4

2

votes

2 answers

FastText .bin file cannot fit in memory, even though I have enough RAM

I'm trying to load one of the FastText pre-trained models, which comes as a .bin file. The .bin file is 2.8GB, and I have 8GB of RAM and an 8GB swap file. Unfortunately, the model starts loading, occupies almost 15GB, and then it breaks…

python gensim fasttext

asked Oct 16 '19 at 07:10

Kioko Key

119
1
10

2

votes

1 answer

Training a model from multiple corpora

Imagine I have a fastText model that has been trained on Wikipedia articles (as explained on the official website). Would it be possible to train it again with another corpus (scientific documents) that could add new / more pertinent…

python artificial-intelligence gensim training-data fasttext

asked Sep 25 '19 at 13:28

bladeous

83
2

2

votes

1 answer

fastText: how to load a .csv column into model.predict

I am new to Python and NLP. I have followed this tutorial (https://fasttext.cc/docs/en/supervised-tutorial.html) to train my fastText supervised model in Python. I have a csv with a Text column and I would like to predict labels for every row from the…

python fasttext

asked Sep 24 '19 at 13:08

Hector Escaton

33
1
5

2

votes

1 answer

Understanding wordNgrams in fastText

I'm trying to understand what the -wordNgrams parameter in fastText does. Let's take the following text as an example: "The quick brown fox jumps over the lazy dog". Now, with a context window size of 2 at the word 'brown', we would…

word2vec fasttext

asked Sep 12 '19 at 13:20

Kleyson Rios

2,597
5
40
65

2

votes

1 answer

Difference between max length of word ngrams and size of context window

In the description of the fastText library for Python (https://github.com/facebookresearch/fastText/tree/master/python) there are several arguments for training a supervised model, among them: ws: size of the context…

python nlp fasttext

asked Aug 15 '19 at 08:42

Akim Tsvigun

91
1
8

2

votes

1 answer

Are Principal Components of different word2vec models measuring the same thing?

I need to run multiple word2vec models over a period of time. For example, I will be running word2vec once every month. To reduce the computing workload, I would like to run word2vec only on the data that was accumulated during the last month. My…

math gensim word2vec fasttext

asked Jul 24 '19 at 09:23

griischdoffer

31
3

2

votes

1 answer

Gensim most_similar() with fastText word vectors returns useless/meaningless words

I'm using Gensim with fastText word vectors to return similar words. This is my code: import gensim model = gensim.models.KeyedVectors.load_word2vec_format('cc.it.300.vec') words = model.most_similar(positive=['sole'],topn=10) print(words) This…

gensim fasttext

asked Apr 26 '19 at 18:02

user2797134

73
1
7

2

votes

1 answer

What are the defaults for gensim's fasttext?

I cannot find anything about the default parameter values for gensim's fastText here. Or are they the same as for the original Facebook fastText implementation?

gensim fasttext

asked Feb 02 '19 at 11:00

user9937436

2

votes

1 answer

fasttext keeps predicting one label

I am trying to use fastText to label some data as [url] or [PN], just to test it. After training on 6k of each label, it keeps predicting [PN] at prediction time. Training command: fasttext supervised -input input.txt -output model -minn 0 -maxn 0 -epoch 100…

text-classification fasttext

asked Jan 21 '19 at 13:33

Exorcismus

2,243
1
35
68

2

votes

3 answers

Reading a large pre-trained fastText word embedding file in Python

I am doing sentiment analysis and I want to use pre-trained fastText embeddings; however, the file is very large (6.7 GB) and the program takes ages to compile. fasttext_dir = '/Fasttext' embeddings_index = {} f = open(os.path.join(fasttext_dir,…

python keras sentiment-analysis fasttext

asked Jan 19 '19 at 16:02

BlueMango

463
7
21

2

votes

1 answer

How to get list of context words in Gensim

How do I get the most frequent context words from a pretrained fastText model? For example, for the word 'football' and the corpus ["I like playing football with my friends"], get the list of context words: ['playing', 'with','my','like']. I try to use model_wiki =…

python gensim word2vec fasttext

asked Dec 28 '18 at 09:02

Alexander Chaptykov

63
8

2

votes

1 answer

How does Facebook's fastText library handle numerical data in input for word vectorization?

I am using Facebook's fastText for text classification. I want to know how the fastText library handles numbers in a text string provided as input for word vectorization. Does fastText typecast each number as a string before creating word…

facebook nlp vectorization fasttext

asked Oct 29 '18 at 02:40

DK818

135
6

2

votes

1 answer

How to prepare data for word2vec in gensim and fasttext?

I want to train word2vec and fasttext to get vectors for a specific dataset that I have. What should my model take as input? My file is like this: Customer_4: I want to book a ticket to New York. Agent_9: Okay, when do you want the tickets…

python machine-learning gensim word2vec fasttext

asked Oct 25 '18 at 06:33

tstseby

1,259
3
10
20

2

votes

0 answers

gensim error: 'NoneType' object is not subscriptable during training in fastText

While implementing fastText in Python 3.7, I ran into an unexpected exception in a thread, which leads to "'NoneType' object is not subscriptable". The error (screenshot) with the full stack trace is as follows: What exactly is this…

python python-3.x nltk gensim fasttext

asked Oct 12 '18 at 18:48

M S

894
1
13
41
