I'm trying to run fine-tuning on ChatGPT, and it keeps failing on my JSONL file with the following error:

... <some more error stack>
  File "/usr/local/Cellar/python@3.11/3.11.3/Frameworks/Python.framework/Versions/3.11/lib/python3.11/http/client.py", line 1256, in putheader
    values[i] = one_value.encode('latin-1')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'latin-1' codec can't encode character '\u201c' in position 7: ordinal not in range(256)

This is the content of the JSONL file that I'm passing to the script:

{"prompt":"Why is my AC not working? ->","completion":" There can be several reasons why your AC is not working. Here are a few things you can check:\n    1. Power supply: Make sure that the AC unit is receiving power. Check the circuit breaker to ensure it hasn't tripped or the fuse hasn't blown.\n    2. Thermostat: Check the thermostat to make sure it's set to the desired temperature and that it's on the correct setting (e.g., 'cool' instead of 'heat').\n    3. Air filter: Check the air filter and replace it if it's dirty. A clogged air filter can prevent the AC from working efficiently.\n    4. Refrigerant level: If your AC unit is not blowing cold air, it could be due to a low refrigerant level. You may need to have a professional recharge the refrigerant.\n    5. Outdoor unit: Check the outdoor unit to ensure that it's clear of debris and there's nothing obstructing the airflow.\nIf none of these troubleshooting steps work, it may be time to call a professional AC technician to diagnose and fix the issue."}

Searching shows that \u201c is LEFT DOUBLE QUOTATION MARK. Where in my file is it, and why does it cause an error? The file is saved as UTF-8, too.

I tried all kinds of transformations with the JSON and Pandas libraries, to no avail.

  • It's likely something produced by a word editor, which changes `"` into `“ (\u201c)` and `” (\u201d)` as you type it, depending on the context. – qrsngky May 18 '23 at 07:14
  • How can I prevent my editor from doing that? I tried both Sublime and VSCode. – Navneet Sinha May 18 '23 at 07:23
  • Sublime and VSCode don't produce "smart quotes". Your JSON file (as shown here) also doesn't contain such quotes. There should be another source of it. – qrsngky May 18 '23 at 07:37
  • Yep, the source was Slack. I'm disappointed. I spent too much time looking for an error in the file, but the error was there because Slack did this to my message: export OPENAI_API_KEY=“”. Note the smart quotes. – Navneet Sinha May 18 '23 at 08:52
  • It was probably not changed by Slack itself; the message likely already had those quotes when it came from some other source. – qrsngky May 18 '23 at 08:54

1 Answer

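The API key I had set contained the erroneous quotes: OPENAI_API_KEY had been exported with smart quotes (“”) instead of plain ASCII quotes. The JSONL file was never the problem. The traceback ends in putheader in http/client.py, which encodes HTTP header values as latin-1, and latin-1 cannot represent \u201c.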
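A quick way to confirm this kind of problem is to check the key for characters that latin-1 cannot encode before making any request. The sketch below is only illustrative: `find_unencodable` is a hypothetical helper name, and it assumes the key ends up in a header value of the form `Bearer <key>`. That form would fit the "position 7" in the error, since `Bearer ` is seven characters long.

```python
import os
import unicodedata


def find_unencodable(value, encoding="latin-1"):
    """Return (index, character, Unicode name) for each character
    in `value` that cannot be encoded with `encoding`."""
    problems = []
    for index, ch in enumerate(value):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            problems.append((index, ch, unicodedata.name(ch, "UNKNOWN")))
    return problems


if __name__ == "__main__":
    # Assumption: the key is read from OPENAI_API_KEY and sent as "Bearer <key>".
    key = os.environ.get("OPENAI_API_KEY", "")
    header_value = f"Bearer {key}"
    for index, ch, name in find_unencodable(header_value):
        print(f"position {index}: {ch!r} ({name})")
```

If this prints anything, re-export the variable with the quotes typed by hand in the terminal rather than pasted from a chat message.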