Questions tagged [fasttext]

fastText is a library for efficient learning of word representations and sentence classification. See https://github.com/facebookresearch/fastText for more information.

465 questions
0
votes
1 answer

How do I limit word length in FastText?

I am using FastText to compute skipgrams on a corpus containing a long sequence of characters with no spaces. After an hour or so, FastText produces a model containing vectors (of length 100) corresponding to "words" of length 50 characters from the…
Gavin
  • 91
  • 1
  • 9
0
votes
1 answer

How to load Gensim FastText model in native FastText

I trained a FastText model in Gensim. I want to use it to encode my sentences. Specifically, I want to use this feature from native FastText: ./fasttext print-word-vectors model.bin < queries.txt How do I save the model in Gensim so that it is the…
matlibplotter
  • 525
  • 5
  • 12
0
votes
0 answers

Value error when adding a word embedding layer to the CNN model

I am trying to add a FastText embedding layer to the famous text classification architecture with CNN: https://github.com/dennybritz/cnn-text-classification-tf I load my FastText embedding like this: embedding =…
Kerem
  • 1,494
  • 2
  • 16
  • 27
0
votes
1 answer

How to change parameters of the fastText API in a Python script

We have fastText commands to run from the command prompt. I have cloned the GitHub repository and, for example, to change the network parameters for supervised learning, the command I used is like ./fasttext supervised -input FT_Race_data.txt…
Raady
  • 1,686
  • 5
  • 22
  • 46
0
votes
0 answers

Document tags in vectorization models

I am fairly new to Python and unsupervised learning methods, and I have a quick question. The doc2vec model has a docvecs property holding all trained vectors for the 'document tags' seen during training. Are there similar properties that…
Dela
  • 115
  • 2
  • 12
0
votes
1 answer

Is there a .predict method in the official Python bindings for fastText?

I know there are unofficial bindings with a .predict method in Python (fasttext, pyfasttext), but they do not work with recent models trained with the official FastText bash tool or do not have all the options. Official Python bindings have only…
Maciej Osowski
  • 115
  • 1
  • 2
  • 7
0
votes
1 answer

Add custom dataset to fasttext classification deep learning

Based on this GitHub link https://github.com/brightmart/text_classification, I want to run "fasttext" classification, but there are some files that I couldn't find, so I want to add my custom dataset as input and after that run…
brelian
  • 403
  • 2
  • 15
  • 31
0
votes
1 answer

When using Facebook-Fasttext to classify new text, why is the return type a list?

I'm trying to classify new text with the Facebook-Fasttext module. The code is as follows: #!usr/bin/python 2.7 import sys import jieba reload(sys) sys.setdefaultencoding('utf-8') import fasttext lines=[line.strip() for line in…
Arthur
  • 1
  • 1
0
votes
1 answer

Setting max length of char n-grams for fastText

I want to compare word2vec and fastText models based on this comparison tutorial: https://github.com/jayantj/gensim/blob/fast_text_notebook/docs/notebooks/Word2Vec_FastText_Comparison.ipynb According to this, the semantic accuracy of fastText model…
utengr
  • 3,225
  • 3
  • 29
  • 68
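For the n-gram length question above, a minimal sketch of character n-gram extraction may help show what the minimum and maximum lengths control. It assumes the scheme described in the fastText paper: the word is wrapped in `<` and `>` boundary markers, and every substring with a length between the two bounds is collected. The helper name `char_ngrams` is hypothetical; the parameter names `minn` and `maxn` mirror the `-minn` and `-maxn` command-line options, and the defaults of 3 and 6 follow the paper. This is an illustration, not the library's implementation.

```python
def char_ngrams(word, minn=3, maxn=6):
    """Return the character n-grams of `word` with lengths minn..maxn.

    The word is wrapped in '<' and '>' boundary markers first.
    `char_ngrams` is a hypothetical helper, not a fastText or Gensim call.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        # Slide a window of width n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


if __name__ == "__main__":
    print(char_ngrams("where", minn=3, maxn=3))
```

In this sketch, a `maxn` smaller than `minn` produces an empty list, so no subword units would contribute to the word.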
0
votes
2 answers

Real reason for speed up in fasttext

What is the real reason for the speed-up, given that the pipeline in the fastText paper uses techniques (negative sampling and hierarchical softmax) already present in the earlier word2vec papers? I am not able to clearly understand the actual difference,…
0
votes
1 answer

Failed to write core dump, A fatal error has been detected by the Java Runtime Environment

I got an error when I used fastText to get a vector for a word using the JFastText library in Java. The error is: A fatal error has been detected by the Java Runtime Environment: # SIGSEGV (0xb) at pc=0x00007f412c606444, pid=14379,…
user5520049
0
votes
1 answer

Why is the cosine_similarity of a pretrained fastText model high between two sentences that are not related at all?

I am wondering why the pre-trained fastText model for Korean Wikipedia does not seem to work well. model = fasttext.load_model("./fasttext/wiki.ko.bin") model.cosine_similarity("테스트 테스트 이건 테스트 문장", "지금 아무 관계 없는 글 정말로 정말로") (in…
DSDS
  • 57
  • 7
-1
votes
0 answers

Using fasttext pretrained model embeddings with LSTM

Does someone know how to use fastText pretrained models to convert words into vectors for an LSTM embedding layer? The fastText pretrained embedding models have an embedding size of 300. If someone has a code example, I would really appreciate it.
Aslan
  • 1
-1
votes
1 answer

Should I Pass Word2Vec and FastText Vectors Separately or Concatenate Them for Deep Learning Model in Smart Contract Vulnerability Detection?

I have been working with word embeddings lately, and I have a question. Consider vulnerability detection in smart contracts. The input is smart contract files labeled with 0 or 1, stating whether they are vulnerable or not. Now I am performing 2 different…
-1
votes
1 answer

Subword vector in fastText?

I can't figure out what a subword input vector is. I read in the paper that the subword is hashed: the subword becomes a hash code, and a hash code is a number, not a vector. Example: the input vector of the word eating is [0,0,0,1,0,0,0,0,0]. So what is the input…
Gin
  • 1
  • 2
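On the hashing question above, a rough sketch may help separate the hash code from the vector: the hash of an n-gram is only used as a row index into a table of vectors, so the number selects a vector rather than being one. The code below assumes a 32-bit FNV-1a hash taken modulo the number of buckets; the names `fnv1a_32` and `subword_vector`, the 10-bucket table, and the 4-dimensional rows are all illustrative assumptions, and the library's own hash may treat non-ASCII bytes differently.

```python
FNV_OFFSET = 2166136261  # 32-bit FNV-1a offset basis
FNV_PRIME = 16777619     # 32-bit FNV-1a prime


def fnv1a_32(text):
    """32-bit FNV-1a hash over the UTF-8 bytes of `text`."""
    h = FNV_OFFSET
    for byte in text.encode("utf-8"):
        h ^= byte
        h = (h * FNV_PRIME) & 0xFFFFFFFF  # keep the value within 32 bits
    return h


def subword_vector(ngram, table):
    """Return the row of `table` selected by the hashed n-gram.

    Hypothetical helper: the hash is reduced modulo the number of rows
    (buckets), so different n-grams can collide on the same row.
    """
    return table[fnv1a_32(ngram) % len(table)]


if __name__ == "__main__":
    # Toy embedding table: 10 buckets, 4 dimensions (both sizes are assumptions).
    table = [[float(b)] * 4 for b in range(10)]
    for gram in ["<ea", "eat", "ati", "tin", "ing", "ng>"]:
        print(gram, fnv1a_32(gram) % len(table), subword_vector(gram, table))
```

In a trained model the rows would be learned parameters rather than the placeholder values used here.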