10

I've successfully brought in one table using dask read_sql_table from a oracle database. However, when I try to bring in another table I get this error KeyError: 'Only a column name can be used for the key in a dtype mappings argument.'

I've checked my connection string and schema and all of that is fine. I know the table name exists and the column i'm trying to use as an index is a primary key on the table in the oracle database.

Can someone please explain why this error occurs when the column name clearly exists?

I know I can use Pandas chunk, but would rather use dask in this scenario.

below is how i'm connecting to the oracle database and the last bit of the error message

host='*******'
port='*****'
sid='****'
user='******'
password='*****'

con_string = 'oracle://' + user + ':' + password + '@' + host + ':' + port + '/' + sid 
engine = create_engine(con_string)

df =ddf.read_sql_table('table_name', uri=con_string, index_col='id', npartitions=None, schema='*****')

~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors, **kwargs) 5855
if col_name not in self: 5856 raise KeyError( -> 5857 "Only a column name can be used for the " 5858 "key in a dtype mappings argument." 5859 )

KeyError: 'Only a column name can be used for the key in a dtype mappings argument.'

Today for another table I added all the column names in ddf.read_sql_table and the query worked. But for another table I tried to do the same thing by listing all the column names and I got the KeyError above.

Thanks everyone.

Pete
  • 107
  • 1
  • 1
  • 6
  • This is the read_sql_table calling from its original documentaction: read_sql_table(table_name, con, schema=None, index_col=None, coerce_float=True, parse_dates=None, columns=None, chunksize=None). I believe schema is not necessary, you are already informing that as 'user' in your conn string. – powerPixie Nov 11 '19 at 06:18
  • Thanks powerPixie, I took the schema out but then got the error of the table doesn't exists. I believe that I need to specify the schema because the oracle database has several other schemas. But when I add the schema again I got the error I mentioned in my original post. As far as I know the user that has been created is so I can just read the database. Cheers. – Pete Nov 11 '19 at 21:58

2 Answers2

16

This error generally occurs when there is a mismatch in the column name or there exists a column in the dtype mapping which is not present in the table.

Juhi Sharma
  • 161
  • 1
  • 4
0

The index_col is not one of the columns (as it is the index of the dataframe). To fix your issue you'd have to provide the columns arg to read_sql_table and provide a list of all the columns except the index_col.

E.g. in your case, let's say your table has columns id, foo and bar: df = ddf.read_sql_table('table_name', uri=con_string, index_col='id', npartitions=None, schema='*****', columns=['foo', 'bar'])

Pieter
  • 11
  • 3