
Update

My asyncio-powered GET requester is giving me issues today, but only for some of the workItem requests it makes.

It is not a particular workItem that is causing the issue. I can retrieve a single workItem and then run the same function again and get rejected with the error:

Unexpected UTF-8 BOM (decode using utf-8-sig): line 1 column 1 (char 0)

The line that triggers the error is: workItem = await resp.json() in this function:

async def getWorkItem(self, session, url):
    async with session.get(url, headers=headers) as resp:
        workItem = await resp.json()  # <-- the problem
        workItem = pd.json_normalize(workItem['value'])
        workItem.columns = workItem.columns.str.replace(
            'fields.', '', regex=False)
        return workItem

How might I try to solve this decoding issue?

John Stud

1 Answer


You could strip the BOM before decoding the JSON. A minimal example:

test.py:

import aiohttp
import asyncio
import json

async def main():
    async with aiohttp.ClientSession() as session:
        # async with session.get('http://127.0.0.1:9989/data.json') as resp:
        async with session.get('http://127.0.0.1:9989/with_bom.json') as resp:
            raw_text = await resp.text()
            text_without_bom = raw_text.encode().decode('utf-8-sig')
            work_items = json.loads(text_without_bom)
            print(type(work_items))
            print(work_items)

asyncio.run(main())

Explanation:

  1. Use text() instead of json() to get the raw text.
  2. Remove the BOM by re-encoding the text and decoding it with utf-8-sig.
  3. Use json.loads() to parse the resulting str into Python objects.

The code above works for responses both with and without a BOM.
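
To see why the utf-8-sig step matters without needing a server, here is a minimal standard-library sketch. The helper name loads_tolerating_bom and the two sample payloads are made up for illustration; they stand in for whatever the real endpoint returns.

```python
import json


def loads_tolerating_bom(raw: bytes):
    """Parse JSON from raw bytes, ignoring a leading UTF-8 BOM if present."""
    # 'utf-8-sig' drops a leading BOM (EF BB BF) and otherwise behaves like 'utf-8'.
    return json.loads(raw.decode("utf-8-sig"))


# Hypothetical payloads standing in for the server's two kinds of response.
plain = b'{"value": [{"id": 1}]}'
with_bom = b"\xef\xbb\xbf" + plain

# Decoding as plain UTF-8 keeps the BOM as U+FEFF, which json.loads rejects.
try:
    json.loads(with_bom.decode("utf-8"))
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True

result_plain = loads_tolerating_bom(plain)
result_bom = loads_tolerating_bom(with_bom)
print(bom_rejected, result_plain == result_bom)
```

In the question's getWorkItem, the same helper could probably be applied to await resp.read(), which returns the body as bytes, in place of await resp.json(). That would skip the encode/decode round trip, though I have not run it against the real API.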

atline
  • Thank you very much -- this solved it and made it more robust! I am still unclear why I could pull the workItem sometimes, but every `n` times, I would get the BOM error. – John Stud Feb 08 '22 at 13:02