I would like to create a dataset for fine-tuning GPT-3. According to https://beta.openai.com/docs/guides/fine-tuning, the dataset should look like this:
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...
For this reason, I am creating the dataset in the following way:
import json
# Data to be written
dictionary = {
"prompt": "<text1>", "completion": "<text to be generated1>"}, {
"prompt": "<text2>", "completion": "<text to be generated2>"}
with open("sample2.json", "w") as outfile:
    json.dump(dictionary, outfile)
However, when I load the file, the result looks like this, which is not the format we want:
import json
# Opening JSON file
with open('sample2.json', 'r') as openfile:
    # Reading from json file
    json_object = json.load(openfile)
print(json_object)
print(type(json_object))
>> [{'prompt': '<text1>', 'completion': '<text to be generated1>'}, {'prompt': '<text2>', 'completion': '<text to be generated2>'}]
<class 'list'>
Could you please let me know how I can solve this problem?
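
For reference, two dicts separated by a comma form a Python tuple, and json.dump serializes a tuple as a single JSON array, which is why json.load returns a list. The target format shown at the top has one JSON object per line (JSON Lines), so one possible approach is to serialize each record separately with json.dumps and write a newline after it. The following is a minimal sketch, not a definitive implementation: the helper names write_jsonl and read_jsonl, the file name sample2.jsonl, and the use of the temporary directory are all assumptions made for illustration.

```python
import json
import os
import tempfile

# Records as a plain list of dicts (same placeholder data as in the question).
records = [
    {"prompt": "<text1>", "completion": "<text to be generated1>"},
    {"prompt": "<text2>", "completion": "<text to be generated2>"},
]


def write_jsonl(path, rows):
    """Write each dict as its own JSON object on a separate line."""
    with open(path, "w", encoding="utf-8") as outfile:
        for row in rows:
            outfile.write(json.dumps(row) + "\n")


def read_jsonl(path):
    """Read the file back, parsing one JSON object per non-empty line."""
    with open(path, "r", encoding="utf-8") as infile:
        return [json.loads(line) for line in infile if line.strip()]


# Hypothetical output location; any writable path would do.
path = os.path.join(tempfile.gettempdir(), "sample2.jsonl")
write_jsonl(path, records)

# Print the raw file contents to inspect the line-per-record layout.
with open(path, "r", encoding="utf-8") as infile:
    print(infile.read())
```

Note that json.load on the whole file would fail for this layout, because the file as a whole is no longer a single JSON document; it has to be read line by line, as read_jsonl does.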