I am working with a dataset in CSV format (approximately 1000 observations and 12 variables) and building a custom-trained chatbot using the ChatGPT API in Python. I was able to train the model, but when I ask questions, the chatbot does not give the correct results. For example, when I ask "What is the average age of the employees?", the result is 15.5, which is incorrect (it should be around 40). Another example: for "How many males are there in the dataset?", the output is 60, but there are 340 males in the dataset.
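For reference, this is roughly how the expected values can be checked directly against the data. It is only a minimal sketch: the column names `age` and `gender`, the value `male`, and the tiny inline sample standing in for the real file are all assumptions for illustration, not the actual dataset.

```python
import csv
import io
from statistics import mean

# Hypothetical sample standing in for the real CSV file.
# The column names "age" and "gender" are assumptions.
SAMPLE_CSV = """age,gender
35,male
45,female
40,male
"""

def ground_truth(csv_text):
    """Return (average age, number of males) computed directly from CSV text."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    average_age = mean(float(row["age"]) for row in rows)
    male_count = sum(1 for row in rows if row["gender"].strip().lower() == "male")
    return average_age, male_count

average_age, male_count = ground_truth(SAMPLE_CSV)
print(average_age, male_count)
```

Numbers computed this way are what I compare the chatbot's answers against.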
I am fairly sure this has something to do with how the data is preprocessed, but I could not find a way around it. My other idea is to convert the data to JSON format, from which the model might be able to learn more accurately.
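The JSON conversion I have in mind would look something like the sketch below. It uses only the standard library; the function name `csv_to_json_records` and the sample columns are hypothetical, and I have not verified that this format actually improves the chatbot's answers.

```python
import csv
import io
import json

def csv_to_json_records(csv_text):
    """Convert CSV text into a JSON array with one object per row.

    Values stay as strings, because csv.DictReader does not infer types.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Hypothetical two-column sample; the real dataset has 12 variables.
print(csv_to_json_records("age,gender\n35,male\n45,female\n"))
```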
Has anyone else run into this issue? How did you manage to solve it?