
I'm trying to insert data from a pandas DataFrame into a PostgreSQL table.

The table I'm trying to insert into looks like:

city_id date        forecast
5       29.05.2019  0
1       29.05.2019  0
151     29.05.2019  0
55      29.05.2019  0
...

types:

  • city_id - numpy.int64
  • date - datetime.date
  • forecast - numpy.int64

And the block of code that inserts the data into the DB:

        with psycopg2.connect(f"host='{hostname}' \
                              dbname='{database}' \
                              user='{username}' \
                              password='{password}'") as connection:
            with connection.cursor() as cursor:
                connection.set_client_encoding('UTF8')
                for i in df_with_new_one.index:
                    date = df_with_new_one['date'][i]
                    city_id = df_with_new_one['city_id'][i]
                    value = df_with_new_one['forecast'][i]

                    cursor.execute("INSERT INTO forecast \
                    (city_id, computed_time, date, value) \
                    VALUES (%s, %s, %s, %s)", (city_id, now, date, value)) 

Where now is a timestamp saved via datetime.datetime.now()

And I get a ProgrammingError:

    ProgrammingError: can't adapt type 'numpy.int64'

I checked with type(df_with_new_one['forecast'][0]) and the type is numpy.int64

So I gather that PostgreSQL can accept only plain Python int and float, and the first thing I tried was converting np.int64 into a plain int with:

  • tolist()
  • pd.to_numeric()
  • int() as in (int(city_id), now, date, int(value))
  • .astype(int)
  • .value.astype('int')

Upd.:

  • city_id = int(df_with_new_one['city_id'][i])
  • value = int(df_with_new_one['forecast'][i])

Unfortunately, none of them worked for me.

When I tried int(), I got another error:

    TypeError: cannot convert the series to <class 'int'>
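For reference, pandas raises this exact error whenever int() is handed a Series with more than one element instead of a single scalar, which suggests the lookup itself returned a Series. A minimal standalone reproduction (not tied to my data):

```python
import pandas as pd

s = pd.Series([5, 151])  # more than one element
try:
    int(s)
except TypeError as exc:
    print(exc)  # cannot convert the series to <class 'int'>
```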

Answers that I found, but none of which helped me:

Are there any other methods to change the type of the values?

Lumos
    The problems seems to be that you try to push a non-scalar (a Pandas series) as a field of your database. – norok2 Jul 03 '19 at 07:59

3 Answers


The problem was wrong indexation:

  • the first index ran from 83 to 1161, and after 1161 (where the numbering should have continued) it was 83 again, with the next values 83 + 1 and so on.

Thus, the problem was solved by .reset_index():

df_with_new_one.reset_index(drop = True, inplace = True)

Thank you all for your answers!

Lumos

You can use numpy.ndarray.item():

type(np.arange(1)[0])
# numpy.int64
type(np.arange(1)[0].item())
# int
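Applied to a single value pulled from the frame, this looks like the following (a small sketch with a standalone numpy scalar):

```python
import numpy as np

v = np.int64(151)
print(type(v))         # <class 'numpy.int64'>
print(type(v.item()))  # <class 'int'>
```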
Nils Werner
  • Should I do it like this: ```df_with_new_one['forecast'] = df_with_new_one['forecast'].apply(lambda x: x.item())``` ? – Lumos Jul 03 '19 at 08:46

First, your method is not very efficient because you repeatedly call cursor.execute instead of calling cursor.executemany.

But as you pass individual values, it is easy to convert them just in time:

            ...
            for i in df_with_new_one.index:
                date = df_with_new_one['date'][i]
                city_id = int(df_with_new_one['city_id'][i])
                value = int(df_with_new_one['forecast'][i])
            ...
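Combining the two points, a minimal sketch of the executemany approach, using a tiny hypothetical stand-in for df_with_new_one (column and table names taken from the question; the live connection is commented out):

```python
import datetime
import pandas as pd

# Tiny stand-in for df_with_new_one (column names from the question)
df_with_new_one = pd.DataFrame({
    'city_id': [5, 1],
    'date': [datetime.date(2019, 5, 29)] * 2,
    'forecast': [0, 0],
})
now = datetime.datetime.now()

# Convert every numpy scalar to a plain int, then batch the rows
rows = [
    (int(df_with_new_one['city_id'][i]), now,
     df_with_new_one['date'][i], int(df_with_new_one['forecast'][i]))
    for i in df_with_new_one.index
]
# With a live connection this becomes a single call:
# cursor.executemany(
#     "INSERT INTO forecast (city_id, computed_time, date, value) "
#     "VALUES (%s, %s, %s, %s)", rows)
```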
Serge Ballesta
  • I tried, but got another error: ```TypeError: cannot convert the series to ``` I'll read about ```cursor.executemany```, thank you! – Lumos Jul 03 '19 at 08:12
  • @Doppie: Are you sure you have correctly copied my code? `df_with_new_one['city_id']` is certainly a `Series`, but `df_with_new_one['city_id'][i]` should not be. – Serge Ballesta Jul 03 '19 at 08:16
  • Yes, except ```date = int(df_with_new_one['date'][i])```, because ```data``` is ```datetime.date```. Thus ```int()``` for ```city_id``` and ```value``` – Lumos Jul 03 '19 at 08:21
  • @Doppie: My bad. I had put one `int()` in the wrong place. Should be better that way. – Serge Ballesta Jul 03 '19 at 08:24