2

I tried to write a pandas dataframe into a HIVE table on a remote server using to_sql method. Currently I have the Jaydebeapi connection object created and I'm able to use read_sql method from pandas plus this connection object to query data from that table. However when I tried to write back the code generates errors. Could you please help me understand this issue?

Below is the code sample I tried:

import pandas as pd
import jaydebeapi
conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver","jdbc:hive2://server:port/;transportMode=http;httpPath=gateway/default/hive;ssl=true", ["user", "password"])
dff = pd.DataFrame([[1,2],[4,7]], columns=['A','B'])
dff.to_sql(name='table', con=conn, if_exists='append')

And the error message I got is:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 501, in execute
    is_rs = self._prep.execute()
jpype._jexception.org.apache.hive.service.cli.HiveSQLExceptionPyRaisable: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:59 cannot recognize input near ''table'' ';' '<EOF>' in expression specification
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1378, in execute
    cur.execute(*args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 503, in execute
    _handle_sql_exception()
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 156, in _handle_sql_exception_jpype
    reraise(exc_type, exc_info[1], exc_info[2])
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 57, in reraise
    raise value.with_traceback(tb)
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 501, in execute
    is_rs = self._prep.execute()
jaydebeapi.DatabaseError: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:59 cannot recognize input near ''table'' ';' '<EOF>' in expression specification
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 417, in rollback
    self.jconn.rollback()
jpype._jexception.java.sql.SQLExceptionPyRaisable: java.sql.SQLException: Method not supported
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1382, in execute
    self.con.rollback()
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 419, in rollback
    _handle_sql_exception()
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 156, in _handle_sql_exception_jpype
    reraise(exc_type, exc_info[1], exc_info[2])
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 57, in reraise
    raise value.with_traceback(tb)
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 417, in rollback
    self.jconn.rollback()
jaydebeapi.DatabaseError: java.sql.SQLException: Method not supported
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3267, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-b63282bc3842>", line 1, in <module>
    dff.to_sql(name='table', con=conn, if_exists='append')
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py", line 2130, in to_sql
    dtype=dtype)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 450, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1480, in to_sql
    table.create()
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 561, in create
    if self.exists():
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 549, in exists
    return self.pd_sql.has_table(self.name, self.schema)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1492, in has_table
    return len(self.execute(query, [name, ]).fetchall()) > 0
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1386, in execute
    raise_with_traceback(ex)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\compat\__init__.py", line 404, in raise_with_traceback
    raise exc.with_traceback(traceback)
  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\io\sql.py", line 1382, in execute
    self.con.rollback()
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 419, in rollback
    _handle_sql_exception()
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 156, in _handle_sql_exception_jpype
    reraise(exc_type, exc_info[1], exc_info[2])
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 57, in reraise
    raise value.with_traceback(tb)
  File "C:\ProgramData\Anaconda3\lib\site-packages\jaydebeapi\__init__.py", line 417, in rollback
    self.jconn.rollback()
pandas.io.sql.DatabaseError: Execution failed on sql: SELECT name FROM sqlite_master WHERE type='table' AND name=?;
org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:59 cannot recognize input near ''table'' ';' '<EOF>' in expression specification
unable to rollback
sanyassh
  • 8,100
  • 13
  • 36
  • 70
Enos
  • 21
  • 3

1 Answers1

0

You cannot do it that way. You need to create manually the table (or with an SQL script executed on the cursor, not on the connection) and then you run an UPDATE statement with all the rows of the DB that are well formatted. Sorry about that.

Luigi Neri
  • 36
  • 5