
What are the most suitable open source LLMs and frameworks for fine-tuning? I intend to use this model in quite a specific domain, perhaps as a physics mentor for a school. How long might it take (with a 3070 Ti 11Gb) to achieve acceptable accuracy for this purpose? I assume that the process of fine-tuning on a new language is the same as fine-tuning on any other data, or is it not?

I couldn't find any open source LLMs that support the language I need, or that are even partially trained on it, which would have made fine-tuning less complex. There have been LLMs that support languages from the same family, but I believe this is more likely to cause issues and confusion, since it will be harder for the model to distinguish between the languages.
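As context for the VRAM question above: a quick back-of-the-envelope calculation (assuming mixed-precision training with the Adam optimizer, the common setup) shows why full fine-tuning of even a small LLM won't fit on a consumer GPU. The byte counts per parameter below are standard rules of thumb, not measurements of any specific model:

```python
# Rough VRAM estimate for FULL fine-tuning in mixed precision with Adam:
# fp16 weights (2 B) + fp16 gradients (2 B) + fp32 master weights (4 B)
# + fp32 Adam moment estimates (8 B) ~= 16 bytes per parameter,
# ignoring activations and framework overhead.

BYTES_PER_PARAM = 2 + 2 + 4 + 8  # = 16

def full_finetune_gib(num_params: float) -> float:
    """Approximate GiB of GPU memory needed to fully fine-tune a model."""
    return num_params * BYTES_PER_PARAM / 2**30

for n in (1e9, 3e9, 7e9):
    print(f"{n / 1e9:.0f}B params -> ~{full_finetune_gib(n):.0f} GiB")
# 1B params -> ~15 GiB
# 3B params -> ~45 GiB
# 7B params -> ~104 GiB
```

Even a 1B-parameter model overshoots an 8 to 11 GB card under full fine-tuning, which is why parameter-efficient methods (LoRA/QLoRA) or gradient checkpointing and quantization are the usual route on this class of hardware.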

  • Those are a lot of questions in one. I cannot answer all of them, so I'll just use a comment to guide you. It is unlikely that training an LLM from scratch is the best approach. Try to find a pre-trained model which speaks your language and use fine-tuning or prompting. Training an LLM from scratch is very expensive (millions of dollars) and infeasible for a single person at home. To learn about fine-tuning, you can follow this guide as a starting point: https://github.com/ray-project/ray-educational-materials/blob/main/NLP_workloads/Text_generation/LLM_finetuning_and_batch_inference.ipynb – h2stein Jun 27 '23 at 09:29
  • @h2stein Thanks for the response! Is it possible to fine-tune an LLM to hold at least an elementary conversation on a school subject within 2-3 months of fine-tuning? Or will it still produce nonsense? I apologize if I am repeating myself, but I need to know, at least approximately, whether to take this on. – cptatoo Jun 27 '23 at 09:52
  • To be honest, I can't tell. As far as I have understood, your problem is the language, not the domain knowledge. There are three key questions to ask: Which base model do you use to start with and how big does it need to be? Where do you get enough training data for fine-tuning the model to speak your language fluently? How do you evaluate that the model is good enough for your application? The most important issue is human resources, not hardware resources. Have a look at this blog post for information on what it takes to train and to fine-tune a model: https://www.mosaicml.com/blog/mpt-30b – h2stein Jun 27 '23 at 22:14
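To make the "fine-tune a pre-trained model" suggestion from the comments concrete: parameter-efficient methods such as LoRA train only small low-rank adapter matrices instead of the full model, which is what makes fine-tuning feasible on a single consumer GPU. The sketch below estimates the trainable-parameter count for a hypothetical 7B-parameter decoder; the layer count, hidden size, and rank are illustrative assumptions, not figures from any particular model:

```python
# Back-of-the-envelope: trainable parameters under LoRA vs. full fine-tuning,
# for a hypothetical 7B-parameter model (32 layers, hidden size 4096).

def lora_params(num_layers: int, hidden_size: int, rank: int,
                matrices_per_layer: int = 2) -> int:
    """LoRA adds two low-rank factors (d x r and r x d) per adapted weight
    matrix, so each adapted matrix contributes 2 * hidden_size * rank
    trainable parameters."""
    return num_layers * matrices_per_layer * 2 * hidden_size * rank

full = 7_000_000_000  # full fine-tuning updates every weight
lora = lora_params(num_layers=32, hidden_size=4096, rank=8)

print(f"LoRA trainable params: {lora:,}")         # 4,194,304
print(f"Fraction of full model: {lora / full:.4%}")  # 0.0599%
```

Training well under 0.1% of the weights is why LoRA-style fine-tuning (e.g. via Hugging Face `peft`), combined with 4-bit quantization of the frozen base model (QLoRA), can run on an 8 to 11 GB card where full fine-tuning cannot.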

0 Answers