How can I use FAISS (Facebook AI Similarity Search) to compute the cosine similarity between a text and a list of target texts, and return the maximum cosine similarity together with the target text from the list that is most similar?

This is what I have so far:

    import faiss
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer

    # Preprocess data as needed
    documents = [
        "This is the first document",
    ]

    documents2 = [
        "first doc",
        "This is the first document",
    ]

    # Use TF-IDF to convert the text documents into numerical vectors
    vectorizer = TfidfVectorizer()
    data = vectorizer.fit_transform(documents)
    data = data.toarray()

    # Normalize the vectors
    data = data / np.linalg.norm(data, axis=1, keepdims=True)

    # Create an index using FAISS
    index = faiss.IndexFlatIP(data.shape[1])  # Same number of dimensions as the data
    index.add(data)  # Add the data to the index

    # Search for the nearest neighbors of the given text documents
    query = vectorizer.transform(documents2).toarray()  # The texts to find nearest neighbors for
    k = 2  # The number of nearest neighbors to return
    distances, indices = index.search(query, k)

    # Read off the similarity score
    similarity_score = distances[0][0]  # The inner product equals the cosine of the angle between normalized vectors

However, this does not give the result I am looking for. Can someone guide me, please?
