I'm trying to create a fine-tuned GPT-3 model. To that end, I have some content that I've formatted in Word and am trying to bring into Excel (in order to import the training dataset).

My inputs are in the format:

*Point A
    *Subpoint A1
    *Subpoint A2
*Point B
    *Subpoint B1
    *Subpoint B2

However, when I copy the contents into Excel, the cell converts this to:

*Point A
*Subpoint A1
*Subpoint A2
*Point B
*Subpoint B1
*Subpoint B2

Is there any way for me to preserve my original formatting?

Is there a better way to do this?

Any help is appreciated greatly :)

Regards, Galeej

galeej

1 Answer

It's definitely possible to copy-paste to Excel and keep the original formatting. What you need to do is choose Keep Source Formatting under Paste Options.

But, I don't think the fine-tuned model will return a completion with the original formatting (if that's your goal).


EDIT

Try using the CLI preparation tool to get some suggestions on how to structure the training dataset.
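If Excel is only a stepping stone to the training file, one option is to skip it and write the prompt/completion pairs straight to JSONL, where the indentation survives as ordinary whitespace inside a JSON string. The following is a minimal sketch, not a definitive implementation: `build_jsonl` is a hypothetical helper name, the completion text is a placeholder, and whether the outline belongs in the prompt or the completion depends on your use case.

```python
import json


def build_jsonl(pairs):
    """Return JSONL text with one {"prompt", "completion"} object per line.

    `pairs` is an iterable of (prompt, completion) string tuples.
    Hypothetical helper, written only for illustration.
    """
    lines = []
    for prompt, completion in pairs:
        record = {"prompt": prompt, "completion": completion}
        # json.dumps escapes newlines as \n, so each record stays on one line
        # while the leading spaces of every sub-point are kept verbatim.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# The outline from the question, with four spaces per indentation level.
outline = (
    "*Point A\n"
    "    *Subpoint A1\n"
    "    *Subpoint A2\n"
    "*Point B\n"
    "    *Subpoint B1\n"
    "    *Subpoint B2"
)

# The completion here is a placeholder, not real training data.
jsonl_text = build_jsonl([(outline, " placeholder completion")])

# Reading the record back returns the outline with its indentation intact.
restored = json.loads(jsonl_text.splitlines()[0])["prompt"]
```

The resulting text can be saved to a `.jsonl` file and then checked with the CLI preparation tool, which should flag any structural problems before you start fine-tuning.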

Rok Benko
  • Hi, no, the goal isn't to have ChatGPT complete with the original formatting. My understanding is that better-formatted and better-structured inputs will help the fine-tuning process. Keep Source Formatting is not an option because the pasted content is split across cells. For example, Point A is in A1 and Subpoint A1 is in B1. – galeej Feb 15 '23 at 02:41
  • Currently, my workaround is to add delimiters (:- when the indentation starts and :| when the indentation ends, :-- and :|| when the second level of indentation starts and ends). I've written a function in my Python code that loops through and recreates the indentation programmatically. However, the pain is going back to each dataset and marking the appropriate indentation. – galeej Feb 15 '23 at 02:43
  • Aha, I get it. I edited my answer. Have you tried using the CLI preparation tool to get some feedback? Such a structure may not even be appropriate in the first place. Before you spend too much time on formatting, make sure it is even appropriate for the OpenAI API. – Rok Benko Feb 15 '23 at 11:24
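
The manual step described in the comments (going back through each dataset to add the delimiters) could in principle be automated. The sketch below assumes the outline is available as plain text with four spaces per level, as in the question's example, and that each marker sits on its own line. Both are guesses about the commenter's scheme, and `mark_indentation` is a hypothetical name, not the commenter's actual function.

```python
def mark_indentation(text, indent_width=4):
    """Insert ":-" / ":|" style markers wherever the indentation depth changes.

    Assumed scheme: ":" plus one "-" per level opens a level,
    ":" plus one "|" per level closes it, each on its own line.
    """
    out = []
    depth = 0
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stripped = line.lstrip(" ")
        level = (len(line) - len(stripped)) // indent_width
        # Open one marker per level gained (":-" for level 1, ":--" for level 2).
        while depth < level:
            depth += 1
            out.append(":" + "-" * depth)
        # Close one marker per level lost (":||" for level 2, then ":|").
        while depth > level:
            out.append(":" + "|" * depth)
            depth -= 1
        out.append(stripped)
    # Close any levels still open at the end of the text.
    while depth > 0:
        out.append(":" + "|" * depth)
        depth -= 1
    return "\n".join(out)


sample = "*Point A\n    *Subpoint A1\n    *Subpoint A2\n*Point B\n    *Subpoint B1"
marked = mark_indentation(sample)
```

Only leading spaces are counted here, so tab-indented text would need to be converted first.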