I think most people who work with LangChain to build products or automate tasks have noticed how badly answer quality degrades when you ask multiple questions through the same chain (i.e. the chain keeps getting longer). It answers the first two or three questions correctly, then it may hallucinate and return a wrong answer, fail to answer at all, or even crash, regardless of the GPT model used.
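To make the failure mode concrete, here's a plain-Python sketch (not LangChain code, and the helper names are made up) of what I mean by the chain getting longer: every turn appends to the running history, so the prompt re-sent to the model grows without bound, which is where the degradation seems to kick in.

```python
# Hypothetical sketch: each question/answer pair is appended to the
# history, and the whole history is re-sent to the model on every call,
# so the effective prompt grows every turn.
history = []

def ask(question, answer):
    # In a real chain the answer would come from the LLM; stubbed here.
    history.append(("user", question))
    history.append(("assistant", answer))
    # Return the total size of the context that would be re-sent.
    return sum(len(text) for _, text in history)

sizes = [ask(f"question {i}", f"answer {i}") for i in range(5)]
# The context is strictly increasing turn over turn.
print(sizes)
```

By the fourth or fifth turn the model is being handed everything that came before, which matches where I start seeing wrong or missing answers.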
Has anyone found a way to optimize this? One thing I always try is prompt engineering, but it doesn't help much.