I was looking for help here (and in many other places):
- How to save Pandas dataframe to a hive table?
- Pandas dataframe in pyspark to hive
- How to insert a pandas dataframe into an existing Hive external table using Python (without PySpark)?
But I don't think I completely understand the proposals presented, because I failed with all of them.
What I am trying to do is:
- Extract data from a Hive table in schema1 into a pandas dataframe.
- Do some operations on the columns and save the result as a pandas dataframe.
- Export the pandas dataframe to a Hive table in schema2.
I did steps 1-2 as follows:
- Extract data from the Hive table into a pandas dataframe.
import pandas as pd
import puretransport
import sqlalchemy as db

# Thrift transport with SASL authentication for HiveServer2
transport = puretransport.transport_factory(host='my_host_name',
                                            port=10000,
                                            username='my_username',
                                            password='my_password',
                                            use_ssl=True)

# SQLAlchemy engine using the PyHive dialect over that transport
engine = db.create_engine("hive://my_username@/schema1",
                          connect_args={'thrift_transport': transport})

print("Selecting data from table", end=" ")
# read the table in chunks and concatenate into one dataframe
# (chunksize=5 is very small; a larger value means fewer round trips)
tab1 = []
for chunk in pd.read_sql_query(
        """select * from schema1.my_table""", con=engine, chunksize=5):
    tab1.append(chunk)
df = pd.concat(tab1)
print("DONE")
- Do some operations on the columns and save the result as a pandas dataframe.
my_code_returning_dataframe...
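(For illustration only, the operations are along these lines; the column names below are made up and my real code differs:)

df['amount_eur'] = df['amount'] * df['fx_rate']           # derive a new column
df['category'] = df['category'].str.strip().str.lower()   # normalize text values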
- Export the pandas dataframe to a Hive table in schema2.
what_should_i_do_there?
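The closest thing I could come up with is the untested sketch below, which reuses the same PyHive/SQLAlchemy setup from step 1 (pointed at schema2) together with pandas' DataFrame.to_sql; the table name and connection details are placeholders:

import pandas as pd
import puretransport
import sqlalchemy as db

# same transport/engine pattern as in step 1, but for schema2
transport2 = puretransport.transport_factory(host='my_host_name',
                                             port=10000,
                                             username='my_username',
                                             password='my_password',
                                             use_ssl=True)
engine2 = db.create_engine("hive://my_username@/schema2",
                           connect_args={'thrift_transport': transport2})

# pandas issues INSERT statements through the engine;
# method='multi' batches many rows into one INSERT ... VALUES
df.to_sql(name='my_target_table',      # placeholder table name
          con=engine2,
          schema='schema2',
          if_exists='append',          # don't try to recreate an existing table
          index=False,
          method='multi',
          chunksize=1000)

But I don't know whether to_sql actually works with the Hive dialect, or whether I should instead write the dataframe to a file and LOAD DATA it into the table.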
Thank you in advance for any help.