
I have a dataframe with the dtypes shown below, and I want to insert it into a Postgres DB, but the insert fails with the error: can't adapt type 'numpy.int64'

id_code           int64
sector           object
created_date    float64
updated_date    float64

How can I convert these types to native Python types, such as from int64 (which is essentially 'numpy.int64') to a classic int, which would then be acceptable to Postgres via the psycopg2 client?

data['id_code'].astype(np.int) defaults to int64

It is nonetheless possible to convert from one numpy type to another (e.g., from int to float):

data['id_code'].astype(float)

changes to

dtype: float64
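
A quick check (a minimal sketch with made-up data) confirms that the individual values are still numpy scalars after astype:

import pandas as pd

data = pd.DataFrame({'id_code': [1, 2]})  # made-up frame for illustration
print(data['id_code'].astype(float).dtype)          # float64
print(type(data['id_code'].astype(float).iloc[0]))  # <class 'numpy.float64'>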

The bottom line is that psycopg2 doesn't seem to understand numpy data types. If anyone has ideas on how to convert them to classic types, that would be helpful.
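
For what it's worth, individual numpy scalars expose .item(), which returns the equivalent native Python object; a minimal sketch:

import numpy as np

value = np.int64(7)
print(type(value))         # <class 'numpy.int64'>
print(type(value.item()))  # <class 'int'>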

Updated: Insertion to DB

def insert_many():
    """Add data to the table."""
    sql_query = """INSERT INTO classification(
                id_code, sector, created_date, updated_date)
                VALUES (%s, %s, %s, %s);"""
    # fh, sheetname and db are defined elsewhere
    data = pd.read_excel(fh, sheet_name=sheetname)
    # to_records() yields rows of numpy scalars (e.g. numpy.int64)
    data_list = list(data.to_records())

    conn = None
    try:
        conn = psycopg2.connect(db)
        cur = conn.cursor()
        cur.executemany(sql_query, data_list)
        conn.commit()
        cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
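
For context, to_records() keeps numpy scalar types in each row, which is what psycopg2 then fails to adapt; a minimal sketch with made-up data:

import pandas as pd

data = pd.DataFrame({'id_code': [1]})
first_row = list(data.to_records())[0]
print(type(first_row[1]))  # <class 'numpy.int64'> (field 0 is the index)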
— qboomerang
  • Can you show the code where you do the insert? – Michael May 31 '18 at 14:29
  • @Michael sure, updated description of the issue. – qboomerang May 31 '18 at 16:47
  • Have you tried defining id_code as a BIGINT? – fiacre Jun 02 '18 at 00:51
  • If you mean defining it as BIGINT on the DB side, no, I did not try that approach, since I understood the issue to be the psycopg2 client not being able to map numpy types to Python natives. – qboomerang Jun 02 '18 at 13:44
  • Ints come back out of a DataFrame as numpy.int64: `data = [[1],[2]]; df = pd.DataFrame(data); print(type(df.iloc[0][0]))` results in `<class 'numpy.int64'>`. – Chris Oct 31 '19 at 22:12
  • also have a look at [this question](https://stackoverflow.com/questions/47423930/how-to-convert-pandas-dataframe-columns-to-native-python-data-types). `data['id_code'].astype('object')` will write directly to postgres in my use case. – lunguini May 01 '20 at 16:34

4 Answers


Add the following somewhere in your code, before you run the inserts:

import numpy
from psycopg2.extensions import register_adapter, AsIs

def addapt_numpy_float64(numpy_float64):
    # AsIs renders the value verbatim into the SQL statement
    return AsIs(numpy_float64)

def addapt_numpy_int64(numpy_int64):
    return AsIs(numpy_int64)

# Tell psycopg2 how to adapt numpy scalars when binding query parameters
register_adapter(numpy.float64, addapt_numpy_float64)
register_adapter(numpy.int64, addapt_numpy_int64)
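
A hedged usage sketch (assuming the question's classification table and connection string db, and that the remaining columns are nullable); once the adapters are registered, psycopg2 can bind numpy scalars directly:

import numpy as np
import psycopg2

conn = psycopg2.connect(db)
cur = conn.cursor()
cur.execute('INSERT INTO classification (id_code) VALUES (%s);',
            (np.int64(42),))  # adapted via the registered AsIs adapter
conn.commit()
conn.close()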
— Tung Nguyen

Update:

If the value includes NaN, it still fails. It seems that psycopg2 can't adapt the np.int64 format; the following method works for me.

import numpy as np
from psycopg2.extensions import register_adapter, AsIs

# Register AsIs itself as the adapter, so np.int64 values are rendered verbatim
register_adapter(np.int64, AsIs)
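
Note that registering an adapter does not turn NaN into SQL NULL by itself; a common workaround (a sketch, not part of the original answer) is to replace NaN with None before inserting, since psycopg2 adapts None to NULL:

import numpy as np
import pandas as pd

data = pd.DataFrame({'created_date': [1.5, np.nan]})  # made-up frame with a NaN
data = data.astype(object).where(pd.notnull(data), None)
print(data['created_date'].tolist())  # [1.5, None]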

Original answer:

Same problem here; I successfully solved it after transforming the Series to an ndarray of ints.

You can try the following:

data['id_code'].values.astype(int)
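
If you need genuinely native Python ints rather than an int64 ndarray, tolist() on the resulting array does the conversion; a quick sketch with made-up data:

import pandas as pd

data = pd.DataFrame({'id_code': [1, 2]})
ids = data['id_code'].values.astype(int).tolist()
print(type(ids[0]))  # <class 'int'>, not numpy.int64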


— Youngmi Huang
    Thanks, `register_adapter` helped. Note that with those imports, fully-qualified references don't work, you want just: `register_adapter(np.int64, AsIs)` – Vic Jul 28 '21 at 00:56
  • Thank you, this fixed `ProgrammingError: can't adapt type 'numpy.int64'` – M.Viking Mar 27 '23 at 17:32

I'm not sure why your data_list contains NumPy data types, but the same thing happens to me when I run your code. Here is an alternative way to construct data_list so that integers and floats end up as their native Python types:

data_list = [list(row) for row in data.itertuples(index=False)] 

Alternate approach

I think you could accomplish the same thing in fewer lines of code by using pandas to_sql:

import sqlalchemy
import pandas as pd

data = pd.read_excel(fh, sheet_name=sheetname)
engine = sqlalchemy.create_engine("postgresql://username@hostname/dbname")
# to_sql takes the table name first, then the connection
data.to_sql('classification', engine, if_exists='append', index=False)
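
Because to_sql hands the frame to SQLAlchemy, pandas takes care of converting the values on insert, so in practice the numpy adapter problem should not come up.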
— Michael
  • In my environment, generating the list using a comprehension still results in numpy data types: `type([list(row) for row in data.itertuples(index=False)][0][0])` gives `numpy.int64`. – qboomerang May 31 '18 at 19:51
  • Actually, I noted that the data_list generated from the comprehension as suggested is parsed without any issues by psycopg2, despite the numpy data types. – qboomerang May 31 '18 at 22:23

I had the same issue and fixed it using:

df = df.convert_dtypes()
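
For reference, a minimal sketch of what convert_dtypes does (assuming pandas >= 1.0): it switches columns to pandas' nullable extension dtypes:

import pandas as pd

df = pd.DataFrame({'id_code': [1, 2], 'sector': ['a', 'b']})
print(df.convert_dtypes().dtypes)
# id_code     Int64
# sector     string
# dtype: object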