
I have a script that runs a model and builds a large table for a client; the last step is to save the result in a Teradata table. Currently I'm using teradatasql like this, where SubCat_p is the final dataframe that I have to save into a table I've already created:

import teradatasql as tsql

with tsql.connect(host="<hostname>",
                  user="<user_name>",
                  password="<user_password>") as connect:
    with connect.cursor() as cur:
        # Bind every row of the dataframe in a single executemany call
        cur.executemany(
            """insert into <TABLE_NAME> (CustomerID, ItemSubCategoryID, Est, Comprador)
               values (?,?,?,?)""",
            SubCat_p.values.tolist())

The script works fine for most dataframes, but when the result is a big dataframe (on the order of millions of rows) the script dies with an error:

panic: runtime error: gobytes: length out of range

goroutine 17 [running, locked to thread]:
main._Cfunc_GoBytes(...)
        _cgo_gotypes.go:72
main.goCreateRows.func1(0x2?, 0x8924a486?)
        /tmp/13851/goside.go:566 +0x46
main.goCreateRows(0x0, 0x1, 0xc00012de20?, 0x7f98d52199b8?, 0xc00007c000?, 0x7f99740fe890, 0x7f987f8c4f10)
        /tmp/13851/goside.go:566 +0x98
Aborted (core dumped)

Is there an alternative to my method (or to teradatasql) that avoids this error? I found the JayDeBeApi module, which works in a similar way, but I failed to create the connection (I think it takes a JDBC URL rather than a host name, and I don't know what that URL should be).

– Jero
  • Use the `sqlalchemy` package with the `teradatasqlalchemy` dialect to get the higher-level DataFrame `to_sql` method. You can then use the _chunksize_ option to limit the batch size if you still have issues. Or you could break the data up into chunks yourself rather than passing a single list with all the row values (see the first two sketches below). – Fred Jul 18 '22 at 22:45
  • I would try `teradataml.copy_to_sql` or `teradataml.fastload`. Usually `teradataml.fastload` is more efficient for a large data frame (see the last sketch below). https://pypi.org/project/teradataml/ – Kota Mori Jul 20 '22 at 05:28
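A minimal sketch of Fred's first suggestion, assuming the `teradatasqlalchemy` dialect is installed (`pip install teradatasqlalchemy`); the host, credentials, table name, and the 10,000-row chunk size are placeholders carried over from the question, not tested values:

from sqlalchemy import create_engine

# SQLAlchemy URL for the Teradata dialect: teradatasql://user:password@host
engine = create_engine("teradatasql://<user_name>:<user_password>@<hostname>")

# chunksize caps how many rows are bound per INSERT batch, so the driver
# never receives a single parameter list with millions of rows.
SubCat_p.to_sql(
    "<TABLE_NAME>",
    con=engine,
    if_exists="append",  # the target table already exists
    index=False,
    chunksize=10_000,
)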
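If you would rather stay with plain teradatasql, Fred's second suggestion is to batch the rows yourself so that no single executemany() call carries the whole dataframe. A sketch, with an arbitrary 10,000-row batch size:

import teradatasql as tsql

rows = SubCat_p.values.tolist()
batch_size = 10_000  # tune to taste

with tsql.connect(host="<hostname>",
                  user="<user_name>",
                  password="<user_password>") as connect:
    with connect.cursor() as cur:
        for start in range(0, len(rows), batch_size):
            # Each call binds at most batch_size rows
            cur.executemany(
                """insert into <TABLE_NAME> (CustomerID, ItemSubCategoryID, Est, Comprador)
                   values (?,?,?,?)""",
                rows[start:start + batch_size])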
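And a sketch of Kota Mori's suggestion, assuming `teradataml` is installed; fastload() uses Teradata's FastLoad protocol, which is built for bulk-loading large dataframes (same placeholders as above):

from teradataml import create_context, fastload, remove_context

create_context(host="<hostname>",
               username="<user_name>",
               password="<user_password>")

# Bulk-append the dataframe to the existing table via FastLoad
fastload(df=SubCat_p, table_name="<TABLE_NAME>", if_exists="append")

remove_context()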

0 Answers