Running a simple Hive query on Amazon EMR, such as
SELECT COUNT(*) FROM TABLENAME
on any external table gives the following error:
InputFormatWrapper can not support RecordReaders that don't return same key & value objects. current reader class : class org.apache.hadoop.mapreduce.lib.db.MySQLDBRecordReader
Running
set mapred.map.tasks=1;
resolves the above error for the table, but it still persists for the view.
The tasks run fine if I switch the Hive execution engine from Tez to MapReduce:
set hive.execution.engine=mr;
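A minimal sketch of that workaround, scoped to a single session so that other queries can keep using Tez. It assumes Tez is the default engine on the cluster and reuses the TABLENAME placeholder from the query above; it only sidesteps the error and does not explain it.

```sql
-- Fall back to MapReduce for the query against the JDBC-backed external table
SET hive.execution.engine=mr;

SELECT COUNT(*) FROM TABLENAME;

-- Switch back to Tez for the rest of the session
-- (assumption: Tez was the engine in use before the workaround)
SET hive.execution.engine=tez;
```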
I used the Qubole and MySQL JDBC connector JARs to connect to the external database.
Sample external table:
CREATE EXTERNAL TABLE `TEST` (
  `TEST_ID` int,
  `TEST` int,
  `STATE` string,
  `CITY` string,
  `CITY_TYPE` string,
  `INTERNAL_TAT` int,
  `LP_COMMIT_TAT` int
)
STORED BY 'org.apache.hadoop.hive.jdbc.storagehandler.JdbcStorageHandler'
TBLPROPERTIES (
  "mapred.jdbc.driver.class" = "com.mysql.jdbc.Driver",
  "mapred.jdbc.url" = "jdbc:mysql://TEST_URL",
  "mapred.jdbc.username" = "USERNAME",
  "mapred.jdbc.password" = "PASSWORD",
  "mapred.jdbc.input.table.name" = "TEST",
  "mapred.jdbc.output.table.name" = "TEST",
  "mapred.jdbc.hive.lazy.split" = "true"
);
I couldn't find anything related online, and the same question remains unanswered in many other places.