Questions tagged [fasttext]

fastText is a library for efficient learning of word representations and sentence classification. See https://github.com/facebookresearch/fastText for more information.

465 questions
3
votes
2 answers

Supervised classification with fastText API returns empty array when tested on Windows

I am trying to build a supervised classifier using the fastText API. My data is 'output.txt' with 15000 rows, 2 columns (gender and name), and 2 classes, m/f: __label__F Mary __label__F Santa ... __label__M John code: #model =…
Raady
  • 1,686
  • 5
  • 22
  • 46
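For reference, fastText's supervised mode reads one example per line, with each label carrying the `__label__` prefix, as in the excerpt above. The following is a minimal sketch of producing such lines from (gender, name) rows. The helper name `to_fasttext_line` and the sample rows are hypothetical, and the training call appears only as a comment because it needs the `fasttext` package and a real data file.

```python
def to_fasttext_line(label: str, text: str) -> str:
    """Format one example as '__label__<LABEL> <text>' on a single line."""
    # Collapse internal whitespace (including newlines) so that each
    # example occupies exactly one line of the training file.
    clean_text = " ".join(text.split())
    return f"__label__{label.strip().upper()} {clean_text}"


# Hypothetical sample rows in (gender, name) order.
rows = [("f", "Mary"), ("f", "Santa"), ("m", "John")]
lines = [to_fasttext_line(gender, name) for gender, name in rows]
print("\n".join(lines))

# Assumed next step with the official Python bindings (not run here):
#   import fasttext
#   model = fasttext.train_supervised(input="output.txt")
#   model.predict("Mary")
```

Checking the generated file by eye before training is a cheap way to rule out formatting problems, such as a missing prefix or stray line breaks, as the cause of empty predictions.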
3
votes
0 answers

Use fasttext for the character embeddings?

We have pre-trained fastText word embeddings. Can we use them to find character embeddings? I found a blog (this link), but in that blog the author has just averaged the character over all the words. Is there any other way to have character…
hEcuLE
  • 31
  • 6
3
votes
2 answers

How to convert gensim Word2Vec model to FastText model?

I have a Word2Vec model which was trained on a huge corpus. While using this model for a neural network application, I came across quite a few "Out of Vocabulary" words. Now I need to find word embeddings for these "Out of Vocabulary" words. So I did…
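The out-of-vocabulary handling asked about here comes from fastText's subword model: a word is wrapped in `<` and `>` boundary markers and broken into character n-grams, and the vector for an unseen word is built from the vectors of its n-grams. A plain Word2Vec model stores no n-gram vectors, so it has nothing to build from. The sketch below covers the n-gram extraction step only. The function name `char_ngrams` is illustrative, and the 3 to 6 range is assumed from the library's documented `minn`/`maxn` defaults for unsupervised training.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of a word wrapped in boundary markers."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n across the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Trigrams of "where", including the boundary markers.
print(char_ngrams("where", 3, 3))
# → ['<wh', 'whe', 'her', 'ere', 're>']
```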
3
votes
2 answers

fine tuning pre-trained word2vec Google News

I am currently using the Word2Vec model trained on the Google News corpus (from here). Since this is trained on news only until 2013, I need to update the vectors and also add new words to the vocabulary based on the news coming after 2013. Suppose I…
ayush singhal
  • 1,879
  • 2
  • 18
  • 33
3
votes
1 answer

Use of fasttext Pre-trained word vector as embedding in tensorflow script

Can I use fastText word vectors like the ones here: https://github.com/facebookresearch/fastText/blob/master/pretrained-vectors.md in a TensorFlow script as an embedding, instead of word2vec or GloVe, without using the fastText library?
Aggounix
  • 251
  • 5
  • 15
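The pretrained `.vec` files are plain text: a header line with the vocabulary size and the dimension, then one line per word with the word followed by its vector components. That makes them readable without the fastText library. Below is a minimal sketch of a parser using only the standard library. `load_vec` and the tiny in-memory sample are hypothetical, and the resulting vectors would still need to be handed to an embedding layer in whatever framework is used.

```python
import io


def load_vec(handle, limit=None):
    """Parse text-format word vectors into (dimension, {word: vector})."""
    header = handle.readline().split()
    _count, dim = int(header[0]), int(header[1])
    vectors = {}
    for line_no, line in enumerate(handle):
        if limit is not None and line_no >= limit:
            break  # optionally read only the first `limit` words
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip lines that do not match the declared dimension
        vectors[word] = [float(v) for v in values]
    return dim, vectors


# Tiny stand-in for a real .vec file: 2 words, 3 dimensions.
sample = io.StringIO("2 3\nhello 0.1 0.2 0.3\nworld -0.4 0.5 0.6\n")
dim, vectors = load_vec(sample)
print(dim, sorted(vectors))
```

Vectors loaded this way cover only the words listed in the file; subword-based vectors for unseen words need the binary model and the library itself.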
3
votes
0 answers

Python FastText: How to create a corpus from a Dataframe Column

I need to create a corpus for my email classifier. Right now I am using fasttext 0.8.3, but it expects a text file as input, whereas I need to pass a DataFrame as input. It shows an error when I use the following code: ``` import fasttext …
3
votes
0 answers

Segmentation Fault error using Fasttext

I am using Fasttext on Linux 6.7, but I keep getting a Segmentation fault error. It does not matter if I use my own data, or run the examples included with the Fasttext installation package. In either case, I get the same error. I am running…
Josh
  • 31
  • 2
2
votes
2 answers

Language detection for short string in a user content generated context

I have some questions about language detection for short strings. I need to detect the language of text sent in a chat, and I am faced with 2 problems: the length of the message, and the errors that may be in it and the noise (emoji, etc.). But for the noise,…
Jourdelune
  • 131
  • 8
2
votes
0 answers

How can I deploy a fasttext model on google cloud?

I just want to have a model I can reach via a REST API, and the model just has to be this: import fasttext ft = fasttext.load_model('pretrained model location') But I want it to be on the Google Cloud platform because the model takes 7GB of RAM.…
2
votes
1 answer

Can I use a different corpus for fasttext build_vocab than train in Gensim Fasttext?

I am curious to know if there are any implications of using a different source when calling build_vocab and train on a Gensim FastText model. Will this impact the contextual representation of the word embedding? My intention for doing this is…
2
votes
1 answer

How to run non-spark model training task (using fasttext) efficiently on a databricks cluster?

I want to train some models using fasttext and since it doesn't use spark, it will be running on my driver. The number of training jobs that will be running simultaneously is very large and so is the size of the data. Is there a way to make it run…
BBloggsbott
  • 195
  • 2
  • 12
2
votes
1 answer

Training fasttext word embedding on your own corpus

I want to train fastText on my own corpus. However, I have a small question before continuing. Do I need each sentence as a separate item in the corpus, or can I have many sentences as one item? For example, I have this DataFrame: text …
BlueMango
  • 463
  • 7
  • 21
2
votes
2 answers

Difference between Gensim's FastText and Facebook's FastText

I came to realize that there exists the original implementation of FastText here, with which you can use fasttext.train_unsupervised to generate word vectors (see this link as an example). However, it turns out that gensim also supports…
Perl Del Rey
  • 959
  • 1
  • 11
  • 25
2
votes
1 answer

How to find similar sentences using FastText (sentences with Out of Vocabulary words)

I am trying to create an NLP model which can find similar sentences. For example, it should be able to say that "Software Engineer", "Software Developer", "Software Dev", "Soft Engineer" are similar sentences. I have a dataset with a list of roles…
rspenpal
  • 45
  • 5
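One common baseline for this kind of task is to average the word vectors of each phrase and compare the averages by cosine similarity. The sketch below uses made-up 2-dimensional vectors in place of real fastText embeddings, so the numbers are illustrative only. With the real Python bindings, `model.get_sentence_vector(text)` would be one possible source for the phrase vectors.

```python
import math

# Hypothetical toy embeddings standing in for real fastText vectors.
toy_vectors = {
    "software": [1.0, 0.0],
    "engineer": [0.0, 1.0],
    "developer": [0.1, 0.9],
    "chef": [-1.0, 0.2],
}


def phrase_vector(phrase):
    """Average the vectors of the known words in a phrase."""
    words = [w for w in phrase.lower().split() if w in toy_vectors]
    if not words:
        raise ValueError(f"no known words in {phrase!r}")
    dim = len(toy_vectors[words[0]])
    return [sum(toy_vectors[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


same = cosine(phrase_vector("Software Engineer"), phrase_vector("Software Developer"))
other = cosine(phrase_vector("Software Engineer"), phrase_vector("Chef"))
print(same > other)
```

The toy lookup simply drops unknown words. A subword-based model can instead compose a vector for an abbreviation such as "Dev" from its character n-grams, which is the property that makes it attractive for the out-of-vocabulary case described above.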
2
votes
1 answer

Gensim most_similar: find synonyms only (not antonyms)

Is there a way to let model.wv.most_similar in gensim return positive-meaning words only (i.e. that shows synonyms but not antonyms)? For example, if I do: import fasttext.util from gensim.models.fasttext import load_facebook_model from…
Jinhua Wang
  • 1,679
  • 1
  • 17
  • 44