
I am trying to replicate the results of this demo, whose author primes GPT-3 with just the following text:

gpt.add_example(Example('apple', 'slice, eat, mash, cook, bake, juice'))
gpt.add_example(Example('book', 'read, open, close, write on'))
gpt.add_example(Example('spoon', 'lift, grasp, scoop, slice'))
gpt.add_example(Example('apple', 'pound, grasp, lift'))

I only have access to GPT-2, via the Hugging Face Transformers library. How can I prime GPT-2 Large on Hugging Face to replicate the above examples? The issue is that, with the Hugging Face interface, one doesn't get to prime with an input and a corresponding output separately (as the author of the GPT-3 demo did above).

Similarly, this tutorial describes using Hugging Face, but there's no example that clearly shows how to prime it with separate input and output examples.

Does anyone know how to do this?


Desired output: use GPT-2 to return something like, for the input "potato", the output "peel, slice, cook, mash, bake" (as in the GPT-3 demo: https://www.buildgpt3.com/post/41/). Obviously the exact list of output verbs won't be the same, since GPT-2 and GPT-3 are not identical models.

Mobeus Zoom

1 Answer


The only thing a GPT model can do is predict which token should come next. Technically, there is no separate input and output: it is a decoder-only model, so it only produces output. Priming the model means forcing its output to begin with text that you choose and then letting the model continue generating from there.

What happens in the demo is:

  1. You provide GPT-3 with natural-language examples of what it should do, something like this:
What can I do with an apple? slice, eat, mash, cook, bake, juice
What can I do with a book? read, open, close, write on
What can I do with a spoon? lift, grasp, scoop, slice
  2. When a query comes (e.g., knife), you create a sentence of the same form as the examples:
What can I do with a knife?
  3. Let the model continue generating until it starts a new line beginning with What, or until it breaks in a strange way, which can always happen with a stochastic model. (And hope the model picked up the pattern you meant with the priming examples.)
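Translated to GPT-2 with the Hugging Face transformers library, those steps could look roughly like the sketch below. Treat it only as a sketch: the choice of gpt2-large, the exact prompt wording, and the sampling settings are my own assumptions, not taken from the demo.

    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2-large")
    model = GPT2LMHeadModel.from_pretrained("gpt2-large")

    # Priming examples and the query are packed into one plain-text prompt.
    prompt = (
        "What can I do with an apple? slice, eat, mash, cook, bake, juice\n"
        "What can I do with a book? read, open, close, write on\n"
        "What can I do with a spoon? lift, grasp, scoop, slice\n"
        "What can I do with a potato?"
    )
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids

    # Let the model continue the prompt; the sampling settings are arbitrary.
    output_ids = model.generate(
        input_ids,
        max_new_tokens=30,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )

    # Strip the prompt and keep only the first generated line, i.e. stop as soon
    # as the model starts a new "What can I do with ..." question.
    continuation = tokenizer.decode(output_ids[0][input_ids.shape[1]:])
    print(continuation.split("\n")[0].strip())

Because GPT-2 is smaller and sampling is stochastic, you may need several tries (or less aggressive sampling) before it follows the pattern as cleanly as in the GPT-3 demo.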

Here is an example from Hugging Face's demo of what GPT-2 does with such a prompt. The text in bold was generated by the model.

[Screenshot of the Hugging Face demo: the priming text followed by GPT-2's continuation, shown in bold.]

Jindřich
  • It's possible to prime GPT-3 with an input and output (see, e.g., https://github.com/shreyashankar/gpt3-sandbox), so it should also be possible for GPT-2. How to do it is my question. – Mobeus Zoom Feb 03 '21 at 14:19
  • 1
    My answer says how the priming is done. Priming does not change the model parameters. You just start generation from the hidden states after providing the model with initial examples. If you want to make it more efficient, you can cache the state of the model after the priming examples, but that is all. – Jindřich Feb 03 '21 at 16:18
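The state caching mentioned in that last comment could look roughly like this with GPT-2 in transformers. This is only an illustration under my own assumptions: the greedy decoding loop, the newline stopping rule, and the use of past_key_values to carry the cached state are my choices, not something specified in the answer.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    priming = (
        "What can I do with an apple? slice, eat, mash, cook, bake, juice\n"
        "What can I do with a book? read, open, close, write on\n"
        "What can I do with a spoon? lift, grasp, scoop, slice\n"
    )

    # Run the priming text once and keep the cached key/value states.
    with torch.no_grad():
        priming_ids = tokenizer(priming, return_tensors="pt").input_ids
        past = model(priming_ids, use_cache=True).past_key_values

    # Each new query reuses the cache instead of re-encoding the priming text.
    input_ids = tokenizer("What can I do with a potato?", return_tensors="pt").input_ids
    generated = []
    with torch.no_grad():
        for _ in range(30):
            out = model(input_ids, past_key_values=past, use_cache=True)
            past = out.past_key_values
            next_id = out.logits[0, -1].argmax().item()  # greedy decoding for simplicity
            if "\n" in tokenizer.decode([next_id]):      # stop at the end of the line
                break
            generated.append(next_id)
            input_ids = torch.tensor([[next_id]])

    print(tokenizer.decode(generated))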