
I wish to fine-tune a GPT-2 implementation on some text data, and then use that model to complete a text prompt. I can do the first part easily enough using Max Woolf's gpt-2-simple implementation, and Neil Shepherd's fork of OpenAI's GPT-2 allows the model to be trained on new data and to complete text.
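For reference, the gpt-2-simple workflow I have working looks roughly like this (a sketch; the corpus file name, model size, and step count are placeholders):

    import gpt_2_simple as gpt2

    gpt2.download_gpt2(model_name="124M")        # fetch the small pre-trained model

    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, "my_corpus.txt", model_name="124M", steps=1000)   # fine-tune on my text file
    gpt2.generate(sess, prefix="Some text prompt")                        # complete a prompt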

However, my corpus is too small to train on without getting gibberish back. Is there any way I can combine the two functions? Ideally, I'd like to do this via a Python interface (as opposed to the CLI), as I'd like to use pandas for data cleaning and what-have-you. Thanks.


1 Answer


Huggingface's Transformers package has a GPT-2 implementation (including pre-trained models) for PyTorch and TensorFlow. You can easily work with them in Python.
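For instance, loading a pre-trained GPT-2 and completing a prompt might look roughly like this (a minimal sketch using the PyTorch classes; the prompt and generation settings are placeholders):

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    prompt = "Some text prompt"
    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_length=50, do_sample=True)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))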

Fine-tuning GPT-2, however, requires a lot of memory, and I am not sure whether you will be able to do full backpropagation through the entire model. If not, you can fine-tune just the few highest layers.
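A rough sketch of what that could look like, freezing everything except the top transformer blocks (keeping two blocks trainable is an arbitrary choice, not a recommendation):

    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # freeze all parameters by default
    for param in model.parameters():
        param.requires_grad = False

    # unfreeze only the last two transformer blocks and the final layer norm
    for block in model.transformer.h[-2:]:
        for param in block.parameters():
            param.requires_grad = True
    for param in model.transformer.ln_f.parameters():
        param.requires_grad = True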

  • Great, thanks. I've looked at the Huggingface repo and it does offer GPT-2. However (and maybe I'm just being dim), I don't see any functionality for fine-tuning GPT-2, even if it's just the highest layers? – Lodore66 Jan 28 '20 at 09:39
  • Once you have it loaded, it behaves like any other PyTorch model. You define a loss function and an optimizer (telling it which parameters to optimize). Nothing GPT-specific, just plain PyTorch. – Jindřich Jan 28 '20 at 09:42
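In concrete terms, such a plain-PyTorch fine-tuning loop might look roughly like this (a sketch; `texts` stands in for the cleaned training strings, and the optimizer settings are arbitrary):

    import torch

    # assumes `model` and `tokenizer` are the GPT-2 objects loaded above
    optimizer = torch.optim.Adam(
        (p for p in model.parameters() if p.requires_grad), lr=1e-5
    )

    model.train()
    for text in texts:
        input_ids = tokenizer.encode(text, return_tensors="pt")
        loss = model(input_ids, labels=input_ids)[0]   # LM loss is returned when labels are given
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()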