Questions tagged [pyhive]

107 questions
2
votes
1 answer

How to add new dialect to Alembic besides built-in dialects?

Alembic supports only 5 built-in dialects: https://github.com/sqlalchemy/alembic/tree/master/alembic/ddl Now I want to manage schemas in Apache Hive via Alembic, and I noticed that PyHive supports SQLAlchemy interfaces, so technically Alembic can support…
shawnzhu
2
votes
1 answer

Using pyhive with kerberos ticket to connect to kerberized hadoop cluster

I would like to connect to Hive on our kerberized Hadoop cluster and then run some HQL queries from a machine that already has its own Kerberos client. The client works, and the keytab has been passed and tested. Our Hadoop runs HWS 3.1 and…
la_femme_it
2
votes
0 answers

PyHive - Long running query timeout - [Errno 110] Connection timed out

I am running Hive queries from Python using PyHive. One of the queries takes around 12-15 minutes to complete. I can see that it completed in the Hadoop ResourceManager UI, but I get the timeout error below in Python. Error: Traceback (most recent…
Mahendra
2
votes
3 answers

Python/PyHive - extract specific error message from exception

I'm facing an exception similar to this one and I'm trying to handle it based on the error itself. The problem is that pyhive.exc.OperationalError is very generic and covers everything from timeouts to non-existent tables, so I would need the exact…
Craig
2
votes
3 answers

Insert values from file to an existing table on hive

I am new to the Hadoop ecosystem. I was trying to create a Hive table from a CSV file using the query below. CREATE EXTERNAL TABLE IF NOT EXISTS proxy_data( date_time TIMESTAMP, time_taken INT, c_ip STRING, sc_status INT, s_action STRING, sc_bytes INT, …
Akash
2
votes
0 answers

Error Can't use pyhive to connect to Hive Database

This is the code used to connect to our Hive database. It ran fine a week ago, but now it seems to be failing to even open a session and get a cursor to execute the queries. The issue was temporarily fixed when I explicitly added a…
2
votes
2 answers

How can i connect to presto pyhive?

I want to connect to Presto using PyHive in Zeppelin. I followed https://github.com/dropbox/PyHive and used the connect function with the correct parameters. %python from pyhive import presto cursor = presto.connect(host='localhost', …
lil
2
votes
1 answer

Python Pyhive module cannot import name hive

I want to connect Python to Hive using PyHive. I'm running the Python script below on my local machine. #!/usr/bin/env python # coding: utf-8 from pyhive import hive from TCLIService.ttypes import TOperationState def mysql_connect(host, port,…
user7422128
1
vote
0 answers

Standalone program runs on python 3.9 to connect hive using pyhive but AWS lambda throws No module named 'sasl.saslwrapper'

I have a small program that gets the list of databases in Hive via a Thrift server endpoint. I am using PyHive. When I run it as a standalone program, it works perfectly fine. I am using Python 3.9. Now when I run the same code from Lambda, it…
1
vote
0 answers

Connecting to multiple hosts in Hive with SqlAlchemy

I already had a working connection through ODBC using the Cloudera ODBC Driver for Apache Hive, where I had my DSN set and all I needed was to call pyodbc.connect(f"DSN={mydsn}", autocommit=True). Since I'm planning to use pandas on the query result,…
Kropiciel
1
vote
1 answer

How to use dbt seed properly with dbt-spark[PyHive] running in EMR?

Problem I am trying to implement a new process using dbt seeds. When I use it in a Redshift connection there is no problem, but when I try to use it with dbt-spark[PyHive] in EMR some problems arise. First Try seed-paths: ["seeds"] seeds: …
1
vote
0 answers

pyhive is it possible to stop a query job if hit ctrl+c

I use PyHive in Jupyter to connect to Hive/Presto for some ad hoc analysis. One annoyance is that if I cancel a submitted query job via 'ctrl + c', it only stops Jupyter, but doesn't stop the query job remotely. Is there a way, when 'ctrl + c', it…
user1269298
1
vote
0 answers

Error while installing SASL in Python 3.9.5 windows

I am trying to install PyHive to execute Hive queries in Python. As a prerequisite I tried installing SASL, and it gives the error below. My Python version is 3.9.5 and I am using Windows. Can someone please help with this error? ERROR: Command…
1
vote
0 answers

pyHive error TTransportException: Could not start SASL

I am trying to connect to a remote Hive instance using PyHive. conn = hive.Connection(host='*********', port=10001, database='default', username='********', auth='KERBEROS', kerberos_service_name='hive').cursor() But I always get this…
Anthony Lauly
1
vote
0 answers

Getting error "UnicodeError: label too long" while trying to connect Hive from python

I am trying to connect to a Hive DB using the host and user credentials below, and I get the error "UnicodeError: label too long". Is there a way to overcome this issue? I tried the script below: from pyhive import hive import re, os, time import pandas as pd import…
Dilip