
I have a JSON file and I'm trying to read it using the code below:

import json

with open('sample.json') as file:
    data = json.load(file)

and I'm getting the error below:

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

My guess was that the JSON file was not valid, so I opened it in a text editor, copied the data, and ran it through an online JSON validator, which confirmed that it is valid JSON.

I then wanted to understand what my notebook actually reads from the file, so I printed the string. Surprisingly, it contained a lot of unwanted characters, such as \n and some stray characters at the start, which are definitely not there when I open the JSON file in Notepad.

with open('sample.json') as file:
    test_text = file.read()
print(test_text)

Output I get in the Python notebook when I print the file contents:

'[{"iteration" : {"id" :"value"},
"filename" : "testfile.json"
}\n,
{"iteration" : {"id" :"value1"},
"filename" : "testfile2.json"
}\n]'

Please advise what I'm doing wrong and how to fix this.

Stramzik
  • The characters at the beginning look like a UTF-8 BOM. Maybe this will help? https://stackoverflow.com/questions/13156395/python-load-json-file-with-utf-8-bom-header – Andrej Kesely Aug 19 '20 at 11:01
  • @AndrejKesely Thank you very much, setting the encoding to utf-8-sig did the trick. Out of curiosity, what are these encoding patterns? – Stramzik Aug 19 '20 at 11:09
  • What process creates this file, or where does this file come from? – CobyC Aug 19 '20 at 11:09
  • @CobyC The file was created on Azure DataFactory and encoded in UTF-8 format. – Stramzik Aug 19 '20 at 11:14
  • @Stramzik See the last two paragraphs of [this section](https://docs.python.org/3/library/codecs.html#encodings-and-unicode) from the Python codecs docs. – snakecharmerb Aug 19 '20 at 11:17
  • @Stramzik I see that you managed to get it to work using the linked answer. – CobyC Aug 19 '20 at 11:22

1 Answer


Use the encoding parameter of open() and set it to utf-8-sig, which skips the UTF-8 BOM at the start of the file if one is present:

import json

with open('sample.json', encoding='utf-8-sig') as f:
    data = json.load(f)

The original answer comes from this question.
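To see why this helps, here is a minimal sketch that reproduces the problem on a throwaway file. The temporary path and the sample payload are made up for illustration, and the cp1252 line only assumes (it is not confirmed in the question) that the notebook fell back to a Windows default encoding, which would explain the stray characters at the start of the printed string.

```python
import codecs
import json
import os
import tempfile

# Hypothetical sample data, written with a leading UTF-8 BOM to mimic the file in the question.
payload = '[{"iteration": {"id": "value"}, "filename": "testfile.json"}]'

# Assumption: under a Windows default such as cp1252, the three BOM bytes
# decode to three visible junk characters instead of being skipped.
bom_as_cp1252 = codecs.BOM_UTF8.decode("cp1252")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.json")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + payload.encode("utf-8"))

    # Checking the first raw bytes shows whether a BOM is present.
    with open(path, "rb") as f:
        has_bom = f.read(3) == codecs.BOM_UTF8

    # Plain utf-8 keeps the BOM as the character U+FEFF, which json rejects.
    try:
        with open(path, encoding="utf-8") as f:
            json.load(f)
        plain_utf8_ok = True
    except json.JSONDecodeError:
        plain_utf8_ok = False

    # utf-8-sig drops the BOM if present and otherwise behaves like utf-8.
    with open(path, encoding="utf-8-sig") as f:
        data = json.load(f)

print(has_bom, plain_utf8_ok, data[0]["filename"])
```

Because utf-8-sig also reads files without a BOM correctly, it is a reasonable default when you do not control how the file was produced.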

CobyC