Questions tagged [fasttext]

fastText is a library for efficient learning of word representations and sentence classification. See https://github.com/facebookresearch/fastText for more information.

465 questions
4
votes
0 answers

train word2vec with pretrained vectors

I am training word vectors on a particular text corpus using fastText. fastText provides all the necessary mechanics and options for training word vectors, and when viewed with t-SNE, the vectors look amazing. I notice gensim has a wrapper for fastText…
bicepjai
  • 1,615
  • 3
  • 17
  • 35
4
votes
1 answer

Handling C++ arrays in Cython (with numpy and pytorch)

I am trying to use Cython to wrap a C++ library (fastText, if it's relevant). The C++ library classes load a very large array from disk. My wrapper instantiates a class from the C++ library to load the array, then uses Cython memory views and…
Bob
  • 1,274
  • 1
  • 13
  • 26
4
votes
1 answer

Specifying the # of hidden units in Facebook fasttext

In the paper on fasttext for supervised classification, the authors specified various quantities of hidden units by altering some parameter (h is the one on pages 3,4 - In table 1 you see "It has 10 hidden units and we evaluate it with and without…
Adam P.
  • 89
  • 5
4
votes
2 answers

Error installing Fasttext on Windows 10 Python 3

I am trying to install fastText using pip install fastText on Windows 10. I have Python 3 installed in Anaconda. I tried reading several posts, but they do not give a clear idea of exactly what changes I should make to install it. I am getting the…
ayush singhal
  • 1,879
  • 2
  • 18
  • 33
3
votes
1 answer

Language names of Languages supported by Fasttext

I am trying to find out the names of languages supported by Fasttext's LID tool, given these language codes listed here: af als am an ar arz as ast av az azb ba bar bcl be bg bh bn bo bpy br bs bxr ca cbk ce ceb ckb co cs cv cy da de diq dsb dty dv…
3
votes
1 answer

Can't install fasttext on docker container

I am trying to build a python docker container. Here is my dockerfile: # syntax=docker/dockerfile:1 FROM python:3.8-slim WORKDIR /src COPY req.ini req.ini RUN apt-get update RUN pip install --upgrade pip setuptools wheel RUN pip install -r…
TheGainadl
  • 523
  • 1
  • 6
  • 14
3
votes
1 answer

How to reduce RAM consumption of gensim fasttext model through training parameters?

Which parameters, when training a gensim fastText model, have the biggest effect on the resulting model's size in memory? gojomo's answer to this question mentions ways to reduce a model's size during training, apart from reducing embedding…
dasWesen
  • 579
  • 2
  • 11
  • 28
3
votes
1 answer

How to Find Top N Similar Words in a Dictionary of Words / Things?

I have a list of str that I want to map against. The words could be "metal" or "st. patrick". The goal is to map a new string against this list and find Top N Similar items. For example, if I pass through "St. Patrick", I want to capture "st…
Ian Yu
  • 57
  • 5
3
votes
0 answers

How to use pre-trained FastText embeddings with existing Seq2Seq model?

I'm new to NLP and I am trying to understand how to use pre-trained word embeddings like fastText with an existing Seq2Seq model. The Seq2Seq model I'm working with is the following. The encoder is simple and the decoder is Pointer Generator…
sarah
  • 31
  • 1
3
votes
1 answer

How does pre-trained FastText handle multi-word queries?

Using the pre-trained model: import fasttext.util fasttext.util.download_model('en', if_exists='ignore') # English ft = fasttext.load_model('cc.en.300.bin') Checking ft.words there aren't entries with spaces or _ in it, but if I query the model…
sarabert96
  • 63
  • 3
3
votes
1 answer

fasttext error: predict processes one line at a time (remove '\n')

Hi, I have a dataframe column that contains text. I want to use a fastText model to make predictions from it. I can achieve this by passing an array of text to the fastText model. import fasttext d = {'id':[1, 2, 3], 'name':['a', 'b', 'c']} df =…
Osca
  • 1,588
  • 2
  • 20
  • 41
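
A minimal sketch of one common workaround for the error named in the title above: the fastText Python `predict` call rejects any string that contains a newline, so newlines can be replaced before predicting. The helper name `strip_newlines` is hypothetical, and the `model` and `df` names in the final comment are assumptions (a loaded fastText model and the dataframe from the excerpt), so that call is left as a comment rather than executed.

```python
def strip_newlines(texts):
    """Return copies of the input strings with line breaks replaced by spaces.

    Hypothetical helper: fastText's predict() expects one line per input,
    so every item is flattened to a single line first.
    """
    return [" ".join(str(t).splitlines()) for t in texts]


samples = ["line one\nline two", "trailing\n"]
cleaned = strip_newlines(samples)

# Assumed usage with a loaded model and the dataframe from the excerpt:
# labels, probs = model.predict(strip_newlines(df["name"].tolist()))
```

Cleaning the text up front keeps the prediction call itself unchanged, which may be easier than catching the `ValueError` per row.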
3
votes
2 answers

Why is FastText not finding multi-word phrases?

The FastText pre-trained model works great for finding similar words: from pyfasttext import FastText model = FastText('cc.en.300.bin') model.nearest_neighbors('dog', k=2000) [('dogs', 0.8463464975357056), ('puppy', 0.7873005270957947), ('pup',…
dzieciou
  • 4,049
  • 8
  • 41
  • 85
3
votes
2 answers

Write a fasttext customised transformer

I have a trained customised fasttext model (fasttext is a word embedding algorithm developed by Facebook). I managed to get the expected result in a function but now I want to rewrite it into a customised transformer so I can add it into my sklearn…
Osca
  • 1,588
  • 2
  • 20
  • 41
3
votes
1 answer

convert dataframe to fasttext data format

I want to convert a dataframe to fastText format. My dataframe: text label Fan bake vs bake baking What's the purpose of a bread box? …
Osca
  • 1,588
  • 2
  • 20
  • 41
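
A minimal sketch of the conversion asked about above, assuming the usual fastText supervised format in which each line is `__label__<label>` followed by the text. The helper name `to_fasttext_line` is hypothetical, the first row comes from the excerpt, and the second row is invented purely for illustration. Replacing spaces in labels with underscores is an assumption, not something the question states.

```python
def to_fasttext_line(text, label, prefix="__label__"):
    """Format one (text, label) pair as a fastText supervised training line."""
    # Collapse all whitespace (including newlines) so the example stays on one line.
    clean_text = " ".join(str(text).split())
    # Assumption: labels must not contain spaces, so join their words with underscores.
    clean_label = "_".join(str(label).split())
    return f"{prefix}{clean_label} {clean_text}"


rows = [
    ("Fan bake vs bake", "baking"),          # row taken from the excerpt
    ("How long should\ndough rest?", "bread making"),  # hypothetical row
]
lines = [to_fasttext_line(text, label) for text, label in rows]

# With a pandas dataframe that has "text" and "label" columns (assumed), the same
# helper could be applied row by row:
# lines = df.apply(lambda r: to_fasttext_line(r["text"], r["label"]), axis=1)
```

The resulting strings can then be written one per line to a plain-text file for training.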
3
votes
1 answer

Looking for an effective NLP Phrase Embedding model

The goal I want to achieve is to find a good word-and-phrase embedding model that can do the following: (1) It has embeddings for the words and phrases that I am interested in. (2) I can use the embeddings to compare similarity between two things (could be word or…
Trent
  • 53
  • 1
  • 7