I want to know how many records are processed, or what percentage of records is processed, by a query to fetch its result in Hive.
I tried describe formatted on the query, but it did not work:
describe formatted (select * from sample)
Use the explain command:
explain extended select * from sample
Note that the number of rows shown in the plan is an estimate taken from statistics, because the query has not actually been executed yet. The number of rows actually processed
becomes known only after execution.
See manual here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Explain
After the command has finished, the counters in the log look like this:
Counters=FileSystemCounters.FILE_BYTES_READ:165364556525,
FileSystemCounters.FILE_BYTES_WRITTEN:398475913171,
FileSystemCounters.FILE_READ_OPS:0,
FileSystemCounters.FILE_LARGE_READ_OPS:0,
FileSystemCounters.FILE_WRITE_OPS:0,
FileSystemCounters.HDFS_BYTES_READ:2403609087417,
FileSystemCounters.HDFS_BYTES_WRITTEN:2401487507859,
FileSystemCounters.HDFS_READ_OPS:185667,
FileSystemCounters.HDFS_LARGE_READ_OPS:0,
HIVE.RECORDS_IN:204428194,
HIVE.RECORDS_OUT_0:63070586,
HIVE.RECORDS_OUT_1_schema.table_name:39980068,
HIVE.RECORDS_OUT_INTERMEDIATE:126141195,
HIVE.SKEWJOINFOLLOWUPJOBS:0,
Shuffle Errors.BAD_ID:0,Shuffle
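If you want a percentage out of those counters, one option is to parse the log line yourself. Below is a minimal sketch in Python, not anything Hive ships. The helper names (parse_counters, percent_written) and the shortened SAMPLE_LOG string are made up for illustration. Treating HIVE.RECORDS_OUT_0 divided by HIVE.RECORDS_IN as "percent of records" is my assumption about what you want to measure; the counter values are copied from the log above.

```python
import re

# Shortened sample built from the counters shown above.
# In practice this text would be read from the job log (assumption).
SAMPLE_LOG = (
    "Counters=FileSystemCounters.HDFS_READ_OPS:185667,"
    "FileSystemCounters.HDFS_LARGE_READ_OPS:0,"
    "HIVE.RECORDS_IN:204428194,"
    "HIVE.RECORDS_OUT_0:63070586,"
    "HIVE.RECORDS_OUT_INTERMEDIATE:126141195,"
    "Shuffle Errors.BAD_ID:0"
)

# A counter is "<name>:<integer>"; names may contain dots and spaces,
# but not commas, colons, equals signs, or newlines.
COUNTER_RE = re.compile(r"([^,:=\n]+):(\d+)")


def parse_counters(text):
    """Return a dict mapping counter name to integer value."""
    return {name.strip(): int(value) for name, value in COUNTER_RE.findall(text)}


def percent_written(counters, out_key="HIVE.RECORDS_OUT_0"):
    """Rows written under out_key as a percentage of HIVE.RECORDS_IN."""
    records_in = counters["HIVE.RECORDS_IN"]
    if records_in == 0:
        return 0.0
    return 100.0 * counters[out_key] / records_in


counters = parse_counters(SAMPLE_LOG)
print("rows read:", counters["HIVE.RECORDS_IN"])
print("percent written:", round(percent_written(counters), 2))
```

The regular expression tolerates the "Counters=" prefix and counters separated by commas, spaces, or line breaks, so the same function should also cope with a multi-line dump like the one above. Pass a different out_key (for example the per-table RECORDS_OUT_1_... counter) if that is the output you care about.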