I want to create an index of nearly 10M vectors of size 1024. Here is the code that I used.
import numpy as np
import faiss
import random
f = 1024
vectors = []
no_of_vectors=10000000
for k in range(no_of_vectors):
    v = [random.gauss(0, 1) for z in range(f)]
    vectors.append(v)
np_vectors = np.array(vectors).astype('float32')
index = faiss.IndexFlatL2(f)
index.add(np_vectors)
faiss.write_index(index, "faiss_index.index")
The code works for a small number of vectors, but it runs out of memory once the number of vectors reaches about 2M. I also tried calling index.add() directly instead of appending the vectors to a list (vectors = []), but that didn't work either.
How can I create an index for such a large number of vectors?
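For context, here is a rough estimate of how much memory the script above needs. It is a back-of-the-envelope sketch, not a measurement: it assumes 64-bit CPython, where a float object takes 24 bytes and each list slot holds an 8-byte pointer, and the helper names are made up for illustration.

```python
# Rough memory estimate for the script above (assumptions noted inline).
DIM = 1024                 # vector dimension (f in the script)
FLOAT32_BYTES = 4          # bytes per value once converted to float32

# Assumption: 64-bit CPython, where a float object takes 24 bytes and
# each list slot holds an 8-byte pointer to it.
PY_FLOAT_BYTES = 24
PY_POINTER_BYTES = 8


def list_of_lists_gb(n_vectors, dim=DIM):
    """Approximate size of the nested Python list, in GB (10**9 bytes)."""
    return n_vectors * dim * (PY_FLOAT_BYTES + PY_POINTER_BYTES) / 1e9


def float32_array_gb(n_vectors, dim=DIM):
    """Size of the same data stored as a float32 array, in GB."""
    return n_vectors * dim * FLOAT32_BYTES / 1e9


for n in (2_000_000, 10_000_000):
    print(f"{n:>10,} vectors: "
          f"nested list ~{list_of_lists_gb(n):.1f} GB, "
          f"float32 array ~{float32_array_gb(n):.1f} GB")
```

Under these assumptions the nested list is about eight times larger than the float32 data it holds, which may be why the script fails long before 10M vectors. IndexFlatL2 stores the vectors uncompressed, so the float32 figure is also roughly the size of the index itself.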