Pass a word2vec model from gensim package to langchain FAISS vectorstore

Question

I want to pass my trained gensim word2vec model as an embedding model to FAISS.from_documents(). When I do, I get this error:

AttributeError: 'Word2Vec' object has no attribute 'embed_documents'

My code:


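from gensim.models import Word2Vec

# list_pre_word2vec is a list of tokenized sentences (lists of words)
w2v_model = Word2Vec(sentences=list_pre_word2vec)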

# # #
# # # some training magic for Word2Vec
# # #

from langchain.vectorstores import FAISS

# storing embeddings in the vector store
vectorstore = FAISS.from_documents(clean, w2v_model)

clean is a list of langchain documents.

How can I pass my word2vec model as an embedding model to FAISS.from_documents(), and how can I use the model in langchain? Is this a recommended approach, or is there a better (easier, more efficient) way?

Thanks in advance.

Does [this](https://stackoverflow.com/a/68681997/8658598) answer help? — doneforaiur, Aug 30 '23 at 06:34
No, that post describes an approach to solve the problem when the attribute "most_similar" isn't found. — Christian01, Aug 30 '23 at 07:07

score 0 · Answer 1 · answered Aug 30 '23 at 15:38

The langchain FAISS.from_documents() method's documentation…

https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.faiss.FAISS.html#langchain.vectorstores.faiss.FAISS.from_documents

…suggests it expects a 2nd argument of type Embeddings, where Embeddings is langchain's own class with an interface that's nothing-at-all like that of a Gensim Word2Vec model.

Embeddings looks like some sort of utility interface for turning full texts into vector-embeddings - whereas a Word2Vec model is only, at core, a mapping from individual words to word-vectors, not a multiword-text embedding method.

The word2vec algorithm is not a generic way to turn multiword texts into single vectors. If you wanted to use the vectors for single words from a Word2Vec model as the inputs to some other method of turning multiword texts into vectors, you'd want to consciously & explicitly choose such a method – and then your next steps would depend on the method you'd chosen.
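As one illustration of such a choice, the simplest method is to average the word-vectors of the words in a text. The sketch below is only that, a sketch: the class name MeanWordVectorEmbeddings is hypothetical, the whitespace tokenization is a placeholder for whatever preprocessing the model was trained with, and it merely duck-types the two methods (embed_documents and embed_query) that langchain's Embeddings interface appears to require. Averaging is a crude baseline, not something word2vec or langchain prescribes.

```python
from typing import List


class MeanWordVectorEmbeddings:
    """Hypothetical adapter: embeds a text as the mean of its word-vectors.

    `word_vectors` is any lookup supporting `word in lookup` and
    `lookup[word]`, such as a plain dict or (by assumption) the `.wv`
    attribute of a trained gensim Word2Vec model.
    """

    def __init__(self, word_vectors, vector_size: int):
        self.word_vectors = word_vectors
        self.vector_size = vector_size

    def _embed(self, text: str) -> List[float]:
        # Naive tokenization; assumed to match how the training data was split.
        tokens = text.lower().split()
        vectors = [self.word_vectors[t] for t in tokens if t in self.word_vectors]
        if not vectors:
            # No known words: fall back to a zero vector.
            return [0.0] * self.vector_size
        # Average each dimension across all found word-vectors.
        return [float(sum(column)) / len(vectors) for column in zip(*vectors)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self._embed(text) for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return self._embed(text)


# Toy word-vectors standing in for a trained model.
toy_vectors = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
toy_embedder = MeanWordVectorEmbeddings(toy_vectors, vector_size=2)
toy_result = toy_embedder.embed_query("cat dog")  # mean of the two vectors
```

With a real model, this would presumably be constructed as MeanWordVectorEmbeddings(w2v_model.wv, w2v_model.vector_size) and passed as the second argument to FAISS.from_documents(). Depending on the langchain version, the object may need to actually subclass langchain's Embeddings base class rather than just mimic its methods, so check that against your installed version.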
