I am trying to implement a RAG (retrieval-augmented generation) architecture on AWS with documents that are in Spanish.
My question is: does it matter whether I generate the document embeddings with a model trained only on English or with a multilingual model? Or do I need to generate the embeddings with a model trained specifically on Spanish?
I am currently using GPT-J-6B to generate the embeddings and Falcon-40B to generate the response (inference), but the similarity search does not return good results.
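For context, this is roughly the pipeline I have in mind (a minimal Python sketch, not my exact setup: the endpoint name `gptj-6b-embeddings` and the JSON request/response shapes are assumptions based on how SageMaker JumpStart text-embedding endpoints are typically invoked):

```python
import json

import boto3
import numpy as np

runtime = boto3.client("sagemaker-runtime")


def embed(texts, endpoint_name="gptj-6b-embeddings"):
    """Return one embedding vector per input text.

    Assumes the embedding model is deployed as a SageMaker endpoint that
    accepts {"text_inputs": [...]} and returns {"embedding": [[...], ...]};
    adjust the payload format to whatever your endpoint actually expects.
    """
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"text_inputs": texts}),
    )
    payload = json.loads(response["Body"].read())
    return np.array(payload["embedding"])


def cosine_similarity(query_vec, doc_matrix):
    """Cosine similarity between one query vector and each document vector."""
    query_norm = query_vec / np.linalg.norm(query_vec)
    doc_norms = doc_matrix / np.linalg.norm(doc_matrix, axis=1, keepdims=True)
    return doc_norms @ query_norm


# Example: embed Spanish documents and a Spanish query, then rank the documents.
docs = [
    "El contrato vence el 31 de diciembre.",
    "La factura fue pagada en marzo.",
]
doc_vectors = embed(docs)
query_vector = embed(["¿Cuándo vence el contrato?"])[0]
scores = cosine_similarity(query_vector, doc_vectors)
print(sorted(zip(scores, docs), reverse=True))
```

With this kind of setup, the top-ranked documents for Spanish queries often look almost random, which is what makes me suspect the embedding model rather than the retrieval code itself.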
My other question is: is it good practice to use the same model both to generate the embeddings and to run the inference (response generation)?