
I'm using pandas to dynamically insert datasets into a Redshift database.

But when I try to change the length of a VARCHAR field, it doesn't seem to work.

import pandas as pd
import sqlalchemy

engine = sqlalchemy.create_engine("postgresql://credentials")

df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7], ['a', 'b', 'a', 'c', 'a', 'a', 'a']])

df.to_sql('test_table', con=engine,
          if_exists='append',
          schema='test_schema',
          index=False,
          dtype={'0': sqlalchemy.types.VARCHAR(length=500)})

In psql, when I run \d test_table, it returns:

    Table "test_schema.test_table"
Column |          Type          | Modifiers
-------+------------------------+-----------
0      | character varying(256) |
1      | character varying(256) |
2      | character varying(256) |
3      | character varying(256) |
4      | character varying(256) |
5      | character varying(256) |
6      | character varying(256) |

Column 0 still only accepts 256 characters, not the 500 I specified.
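One thing worth checking (an editor's hypothesis, not confirmed in the question): the DataFrame above is built without explicit column names, so its labels are the integers 0 through 6, while the `dtype` mapping is keyed by the string `'0'`. Older pandas versions silently ignore `dtype` keys that don't match a column label and fall back to the default VARCHAR(256). A minimal sketch of the two ways to make the key match:

import pandas as pd
import sqlalchemy

df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7], ['a', 'b', 'a', 'c', 'a', 'a', 'a']])
print(df.columns.tolist())  # [0, 1, 2, 3, 4, 5, 6] -- integer labels, not strings

# Option 1: key the mapping by the integer label pandas actually uses
dtype_by_int = {0: sqlalchemy.types.VARCHAR(length=500)}

# Option 2: cast the labels to strings first, then key by '0'
df.columns = df.columns.astype(str)
dtype_by_str = {'0': sqlalchemy.types.VARCHAR(length=500)}

Either way, the key passed in `dtype` has to be identical to the label in `df.columns` for the override to take effect.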

  • It seems to me that you can't change the type of a table while inserting. If you want to send `str` you might consider converting the `dtype` with `df["0"] = df["0"].astype('str')`. Finally: in general, inserting into RS is pretty slow; it is usually better to unload to S3 and then copy. – rpanai Oct 11 '18 at 18:08
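For completeness, the S3 staging pattern the comment mentions looks roughly like this. It is only a sketch: the bucket, key, and IAM role are placeholders, it reuses the `df` and `engine` from above, and it assumes the target table already exists with the desired column types, since COPY loads into an existing table rather than creating one:

import io
import boto3
import sqlalchemy

# Stage the frame as a CSV object in S3 (bucket and key are hypothetical)
buf = io.StringIO()
df.to_csv(buf, index=False, header=False)
boto3.client('s3').put_object(Bucket='my-bucket',
                              Key='staging/test_table.csv',
                              Body=buf.getvalue())

# Bulk-load with a single COPY instead of row-by-row INSERTs
copy_sql = """
    COPY test_schema.test_table
    FROM 's3://my-bucket/staging/test_table.csv'
    IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
    CSV;
"""
with engine.begin() as conn:
    conn.execute(sqlalchemy.text(copy_sql))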
