I'm using LLMs to classify products into specific categories (a multi-class problem).

  1. One way to do it would be to ask for a yes/no answer for a specific category and loop through the categories.

  2. Another way would be to ask for the probability that a given product belongs to each of those classes.

The second option lets me adjust the prediction thresholds in post-processing and deliberately over- or under-classify certain classes.
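
To make the thresholding idea concrete, here is a minimal sketch. It assumes the model's answer has already been parsed into a dict of per-class probabilities; the category names, the threshold values, and the `classify` helper are all hypothetical and only illustrate the idea.

```python
from typing import Dict, Optional


def classify(
    probs: Dict[str, float],
    thresholds: Dict[str, float],
    default_threshold: float = 0.5,
) -> Optional[str]:
    """Pick the class whose probability clears its threshold by the widest margin.

    Returns None when no class reaches its threshold.
    """
    best_label: Optional[str] = None
    best_margin = 0.0
    for label, p in probs.items():
        # Classes without an explicit threshold fall back to the default.
        margin = p - thresholds.get(label, default_threshold)
        if margin >= 0 and (best_label is None or margin > best_margin):
            best_label, best_margin = label, margin
    return best_label


# Hypothetical model output for one product.
probs = {"electronics": 0.45, "toys": 0.40, "clothing": 0.15}

# Lowering the threshold for "toys" over-classifies that class on purpose.
thresholds = {"electronics": 0.50, "toys": 0.30}

print(classify(probs, thresholds))  # toys
print(classify(probs, {}))          # None (nothing reaches the default 0.5)
```

Lowering a class's threshold makes it win more often and raising it makes it win less often, which only helps if the probabilities the model reports are reasonably trustworthy in the first place.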

However, the word on the street is that RLHF-trained OpenAI models such as gpt-3.5-turbo and gpt-4 are worse at estimating probabilities than text completion models like text-davinci-003, supposedly because RLHF training makes the model "think" more like a human (and humans are bad at estimating probabilities).

Is there any literature I can read up on or should know about before I go ahead and run 100 tests?

I haven't tried anything yet, since testing is time- and cost-intensive, and I would like a baseline understanding of how to tackle the problem before starting.
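
For reference, one way such a test run could be scored is the multi-class Brier score, which is the mean squared difference between the reported probabilities and the one-hot true label (lower is better). This is only a sketch under assumptions: the `brier_score` helper and the sample data are hypothetical, and the predictions are assumed to be already parsed into dicts of per-class probabilities.

```python
from typing import Dict, List


def brier_score(predictions: List[Dict[str, float]], truths: List[str]) -> float:
    """Mean multi-class Brier score over a set of labelled examples."""
    if len(predictions) != len(truths):
        raise ValueError("predictions and truths must have the same length")
    total = 0.0
    for probs, truth in zip(predictions, truths):
        # Include the true label even if the model assigned it no probability.
        labels = set(probs) | {truth}
        total += sum(
            (probs.get(label, 0.0) - (1.0 if label == truth else 0.0)) ** 2
            for label in labels
        )
    return total / len(predictions)


# Hypothetical parsed outputs for two products and their true categories.
predictions = [
    {"toys": 1.0, "clothing": 0.0},  # confident and correct
    {"toys": 0.5, "clothing": 0.5},  # undecided
]
truths = ["toys", "toys"]

print(brier_score(predictions, truths))  # 0.25
```

Comparing this score across models on the same labelled sample would give a rough indication of which one reports more usable probabilities.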

FAD