
I use the following code to get the most likely replacements for a masked word:

!pip install git+https://github.com/huggingface/transformers.git
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

# fill-mask pipeline returns the top candidates for the [MASK] token with their scores
unmasker = pipeline('fill-mask', model='bert-base-uncased', top_k=100)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')
model = AutoModelForMaskedLM.from_pretrained('bert-base-uncased')

results = unmasker("The sun is [MASK].")
for i in results:
    print(i["token_str"], i["score"] * 100)

For example, the most likely replacements for "[MASK]" in the sequence "The sun is [MASK]." are "rising" (33.61%), "shining" (9.33%), and "up" (7.38%).
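For reference, the same scores can also be computed directly with the tokenizer and model loaded above; this is a minimal sketch of what the fill-mask pipeline does internally:

import torch

inputs = tokenizer("The sun is [MASK].", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# locate the [MASK] position and softmax over the vocabulary at that position
mask_index = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
probs = logits[0, mask_index].softmax(dim=-1)

# top 5 candidate tokens with their probabilities
top = probs.topk(5)
for score, token_id in zip(top.values[0], top.indices[0]):
    print(tokenizer.decode(token_id.item()), round(score.item() * 100, 2))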

My question: is there a way to achieve the same with GPT-3? There are "complete" and "insert" presets in the OpenAI Playground; however, they give me full sentences (instead of single words) and no probabilities. Can someone help?

diggi2395

1 Answer


First of all, I don't think you can access properties like tokens or scores in GPT-3; all you have is the generated text.

Second of all, in my experience GPT-3 is ALL about the correct prompt. You just have to give it instructions as if you were talking to a human being.

In your specific case, I would use a prompt like this:

Prompt:

The sun is [MASK].

Replace [MASK] with the most probable 5 words to replace, and give me their probabilities.

Result:

The sun is shining.

  1. shining - 0.47
  2. bright - 0.18
  3. sunny - 0.13
  4. hot - 0.10
  5. beautiful - 0.09

If you want to do that programmatically, here's the code:

import openai

openai.organization = "your org key, if you have one"
openai.api_key = "your API key"
openai.Engine.list()  # optional: verifies the credentials and lists the available engines

my_prompt = '''The sun is [MASK].

Replace [MASK] with the most probable 5 words to replace, and give me their probabilities.'''

# Set the parameters here as you like
response = openai.Completion.create(
  engine="text-davinci-002",
  prompt=my_prompt,
  temperature=0,    # 0 makes the output as deterministic as possible
  max_tokens=500,
  # top_p=1,
  # frequency_penalty=0.0,
  # presence_penalty=0.0,
  # stop=["\n"]
)

print(response['choices'][0]['text'])
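The completion comes back as plain text, so if you want the words and probabilities as Python values you still have to parse the model's answer yourself. A rough sketch, assuming GPT-3 sticks to the numbered "word - probability" format shown above (it is not guaranteed to, so adjust the pattern if needed):

import re

completion_text = response['choices'][0]['text']

# pull lines like "1. shining - 0.47" into (word, probability) pairs
pairs = re.findall(r'\d+\.\s*(\w+)\s*-\s*([\d.]+)', completion_text)
candidates = [(word, float(prob)) for word, prob in pairs]
print(candidates)  # e.g. [('shining', 0.47), ('bright', 0.18), ...]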
SilentCloud
Thank you very much for that detailed response! I assumed that there was no direct access to tokens/scores, but I wanted to make sure I wasn't missing anything. Your solution gives the exact output that I wanted, thanks! – diggi2395 Aug 16 '22 at 09:16