
I am trying to parse data from an API with Python and requests.

SO reference: Python codecs and utf-8 BOM error

I listed the references above because I updated the script after each error I received.

import requests
import codecs
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
data = json.load(codecs.open(r.json(), 'utf-8-sig'))
# reads = r.json()
# data = reads.decode('utf-8-sig')

with open('data.json', 'w') as f:
    json.dump(data, f)

I want to save the response from the API https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/ to a .json file.

Initially I received the error below, so I applied the codecs fix from the referenced SO answer.

json.decoder.JSONDecodeError: Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

This is the fix from the SO answer:

data = json.load(codecs.open(r.json(), 'utf-8-sig'))

Now I receive this error:

TypeError: expected str, bytes or os.PathLike object, not dict

However, I cannot resolve the TypeError because I need to load using codecs to avoid the utf-8-sig error.

How can I parse and write from requests and avoid both errors?

EDIT

Updated using the answer below; however, it fails to write the file to disk.

import requests
import codecs
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
data = json.load(codecs.open(r.text, 'r', 'utf-8-sig'))

with open('data.json', 'w') as f:
    f.write(data)

Answer

import requests
import json

r = requests.get(
    "https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")

# The with block closes the file, so the data is flushed to disk.
with open('data.json', 'w') as output:
    output.write(r.text)
sayth

2 Answers


codecs.open opens a local file using a given encoding. codecs.decode converts an in-memory object. So I think you're after:

data = json.loads(codecs.decode(r.content, 'utf-8-sig'))

Note that I've used r.content, the raw response bytes, which means the requests library will not attempt to do any decoding or parsing of its own. json.loads then parses the decoded string (json.load expects a file object instead). Unless you want to modify the data before saving though, you could just save the response directly to disk:

with open('data.json', 'w') as f:
    f.write(r.text)
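
If the data does need to be parsed before saving, the decode-then-parse step might look like the sketch below. It assumes the body is UTF-8 JSON with a leading BOM. `sample_body` is a made-up stand-in for `r.content` so the snippet runs without a network call, and `parse_bom_json` is a hypothetical helper name, not part of requests or json.

```python
import codecs
import json


def parse_bom_json(raw_bytes):
    """Decode UTF-8 bytes, dropping a leading BOM if present, then parse JSON."""
    text = codecs.decode(raw_bytes, "utf-8-sig")
    return json.loads(text)


# Made-up stand-in for r.content: a BOM followed by a small JSON document.
sample_body = codecs.BOM_UTF8 + b'{"Success": true, "Meetings": []}'

# Decoding as plain UTF-8 keeps the BOM as the first character, which
# json.loads rejects with the "Unexpected UTF-8 BOM" error.
try:
    json.loads(sample_body.decode("utf-8"))
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True

data = parse_bom_json(sample_body)
```

With a real response this would be called as `parse_bom_json(r.content)`, and the result can then be written out with `json.dump(data, f)`.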
Alex Taylor
  • Updated; I needed to add 'r' in the codecs call. However, it still fails to write the JSON file, instead printing it out in the console. – sayth Apr 06 '17 at 03:18
  • Instead of `r.text`, I needed to use `r.content` because `codecs.decode` was expecting binary input. Otherwise it worked for me! Thanks @alex-taylor. – Hel Dec 22 '18 at 17:53

To answer your updated question: your script never reached the code that writes data to the file. If you scroll up in your output, I believe the error you got is:

IOError: [Errno 63] File name too long:...

The first parameter of codecs.open(r.text, 'r', 'utf-8-sig') is a filename, as the documentation for codecs.open explains, so the whole response body was being treated as a file name. I think Alex Taylor's answer is enough to write the response to a file, but if you really need to decode the response, you could try:

data = codecs.decode(r.content, 'utf-8-sig')
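
To illustrate the difference between the two calls, here is a small sketch that decodes the same bytes both ways. The `payload` bytes and the `payload.json` file name are made up for the example; a temporary directory is used so nothing is left on disk.

```python
import codecs
import os
import tempfile

payload = codecs.BOM_UTF8 + b'{"ok": true}'  # made-up bytes with a BOM

# codecs.decode works on the in-memory bytes directly.
decoded_in_memory = codecs.decode(payload, "utf-8-sig")

# codecs.open needs a path, so the bytes are written to a temporary file first.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "payload.json")
    with open(path, "wb") as f:
        f.write(payload)
    with codecs.open(path, "r", "utf-8-sig") as f:
        decoded_from_file = f.read()
```

Both results are the same BOM-free string; only the source (bytes in memory versus a file on disk) differs.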

Another error in your code: data = json.load(codecs.open(r.text, 'r', 'utf-8-sig')) would make data a parsed Python object rather than a string, so f.write(data) cannot write it to the file. You can just dump it to your file with json.dump:

import requests
import json
import codecs

r = requests.get("https://api.tatts.com/sales/vmax/web/data/racing/2017/4/05/mr/")
# r.content is the raw bytes; 'utf-8-sig' strips the leading BOM before parsing.
data = json.loads(codecs.decode(r.content, 'utf-8-sig'))

with open('data.json', 'w') as f:
    json.dump(data, f)

And you can load it back later with:

with open('data.json', 'r') as f:
    data = json.load(f)
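
One thing worth checking in this round trip is the type passed to json.dump. The sketch below, using made-up sample data and temporary files, suggests that dumping a parsed object loads back as the same object, whereas dumping a raw JSON string loads back as a string that still needs json.loads.

```python
import json
import os
import tempfile

parsed = {"Success": True, "Meetings": []}      # made-up parsed object
raw_text = '{"Success": true, "Meetings": []}'  # the same document as a string

with tempfile.TemporaryDirectory() as tmp_dir:
    dict_path = os.path.join(tmp_dir, "from_dict.json")
    str_path = os.path.join(tmp_dir, "from_str.json")

    with open(dict_path, "w") as f:
        json.dump(parsed, f)
    with open(str_path, "w") as f:
        json.dump(raw_text, f)  # writes one quoted, escaped JSON string

    with open(dict_path) as f:
        loaded_from_dict = json.load(f)
    with open(str_path) as f:
        loaded_from_str = json.load(f)
```

If the loaded value turns out to be a str, the file holds a double-encoded document, and parsing before dumping avoids that.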
shizhz