0

I have a function that copies a csv file to a database. I'm trying to do that asynchronously:

import psycopg
from config import config
from pathlib import WindowsPath
from psycopg import sql
import asyncio

async def main():
    conn = await psycopg.AsyncConnection.connect(f'postgresql://{config.USER_PG}:{config.PASS_PG}@{config.HOST_PG}:{config.PORT_PG}/{config.DATABASE_PG}')

    p = WindowsPath(r'.\data\product_version.csv')

    async with conn:
        if p.exists():
            with p.open(encoding='utf-8-sig') as f:

                columns = list(next(f).strip().lower().split(','))


                async with conn.cursor() as cur:
                    await cur.execute(sql.SQL("TRUNCATE TABLE {} RESTART IDENTITY CASCADE").format(sql.Identifier('product_version_map')))

                    async with cur.copy(sql.SQL("COPY {} ({}) FROM STDIN WITH CSV").format(sql.Identifier('product_version_map'),sql.SQL(', ').join(map(sql.Identifier, columns)))) as copy:
                            while data :=  await f.read():
                                await copy.write(data)

        else:
            print(f'You need the product_version file')



if __name__=='__main__':
    asyncio.set_event_loop_policy(asyncio.WindowsSelectorEventLoopPolicy())
    asyncio.run(main()) 

But I'm getting this error:

psycopg.errors.QueryCanceled: COPY from stdin failed: error from Python: TypeError - object str can't be used in 'await' expression

f here has class string, it's the row of the file. the error comes from this line:

while data :=  await f.read():

This is documentation I'm referring to when building this code:

https://www.psycopg.org/psycopg3/docs/basic/copy.html#asynchronous-copy-support

moth
  • 1,833
  • 12
  • 29

1 Answers1

1

You are openning the file for read as a sync file - there is no await for reading from it. f.read() is resolved and returns a string before Python even gets to the await part: the keyword would expect the target expression on the right to be an "awaitable" object - that is not the case. (And it is exactly what the error message is telling you).

Simply drop the "await" keyword on this line, and your program will gte past this part.

In the code you show you are not really taking any advantage of possible parallelisations with async code, so, just take the await away. If you have this in production in a larger system where this would actually matter, and you got maximum performance in all other steps, then you can bring in async file reading and plug into this part for some gains.

jsbueno
  • 99,910
  • 10
  • 151
  • 209
  • is it that complicated to make a stringio buffer in python an async object ? I'm learning this stuff so I would like to implement and test. – moth Feb 22 '23 at 03:26
  • https://www.psycopg.org/psycopg3/docs/basic/copy.html#asynchronous-copy-support this link provides the code . i was really wondering how to make that work with any string buffer – moth Feb 22 '23 at 03:27
  • 1
    it does not make sense to make a stringio buffer an async object. There is no I/O involved, so there would not be a moment when the CPU is idle while writing to the stringbuffer which the async runtime could use to perform other work: if the CPU stops copying the data into the buffer, nothing happens and one just get the overhead of context switching. The idea of async is that you use the CPU for other stuff when the peripheral components of your computer are busy flushing your data to other devices. – jsbueno Feb 22 '23 at 14:59
  • So what could be an example of read async object that would be useful to copy to a postgres database? Since they put in their docs this copy operation could be used in an async way... – moth Feb 22 '23 at 21:16