How does Elasticsearch create a vector representation of a document?

Asked Jul 11 '23 at 15:47

For its approximate nearest neighbor (ANN) search using HNSW (Hierarchical Navigable Small World) graphs, Elasticsearch measures document similarity by comparing documents represented as vectors. How are these vectors created? I am familiar with word embeddings for individual words (à la Word2Vec) and with bag-of-words (BOW) representations. Are the document vectors built directly from some combination of word embeddings, for example the embeddings of a predefined set of keywords? A pointer to where the Elasticsearch documentation describes this process would be helpful.
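To make the "amalgam of word embeddings" idea concrete, here is a minimal sketch of what I have in mind. It is only my guess at one possible approach, not Elasticsearch's documented behavior: the embedding table is a made-up toy with three dimensions, and `document_vector` and `cosine_similarity` are hypothetical helper names of my own.

```python
import math

# Hypothetical toy word embeddings (made-up values, not from any real model).
TOY_EMBEDDINGS = {
    "vector": [0.9, 0.1, 0.0],
    "search": [0.8, 0.2, 0.1],
    "cat": [0.0, 0.9, 0.3],
    "dog": [0.1, 0.8, 0.4],
}


def document_vector(text, embeddings=TOY_EMBEDDINGS):
    """Average the embeddings of the known words in `text` (assumed scheme)."""
    words = [w for w in text.lower().split() if w in embeddings]
    if not words:
        raise ValueError("no known words in document")
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(embeddings[word]):
            total[i] += value
    return [value / len(words) for value in total]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


doc_a = document_vector("vector search")
doc_b = document_vector("search vector")
doc_c = document_vector("cat dog")

# Averaging ignores word order, so doc_a and doc_b get the same vector,
# while doc_c points in a different direction.
print(cosine_similarity(doc_a, doc_b))
print(cosine_similarity(doc_a, doc_c))
```

Under this assumed scheme, documents sharing the same words land close together regardless of word order. Is this roughly what happens, or do the vectors come from somewhere else entirely?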

Paul Chernoch

0 Answers