I'm intermittently getting the error message
DAG did not succeed due to VERTEX_FAILURE.
when running Hive queries via PyHive. Hive is running on an EMR cluster where hive.vectorized.execution.enabled
is set to false
in the hive-site.xml file for this reason.
I can set the above property through the configuration on the Hive connection and my query has run successfully every time I've executed it, however I want to confirm that this has fixed the issue and that it is definitely the case that hive-site.xml is being ignored.
Can anyone confirm if this is the expected behavior, or alternatively is there any way to inspect the Hive configuration via PyHive as I've not been able to find any way of doing this?
Thanks!