
I am having issues with both the Tez and MapReduce execution engines. Both appear to be related to permissions, but for the life of me, I am lost.

When I execute it through Tez, I get this message:

org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741961_1140 file=/tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar

Looking at the file permissions in HDFS, however, they appear correct:

drwx------ - hiveuser hadoop 0 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd

drwx------ - hiveuser hadoop 0 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/.tez

-rw-r--r-- 3 hiveuser hadoop 259706 2016-11-11 09:54 /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar
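
A plain HDFS read of the jar (a sketch, using the path from the error above) is one way to take Tez out of the picture; it should fail with the same BlockMissingException if no DataNode can serve the block:

hdfs dfs -cat /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar > /dev/null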

On MapReduce, the message is this:

Could not obtain block: BP-300459168-127.0.1.1-1478287363661:blk_1073741825_1001 file=/hdp/apps/2.5.0.0-1245/mapreduce/mapreduce.tar.gz

File permissions on that one:

-r--r--r-- 3 hdfsuser hadoop 51232019 2016-11-04 16:40 /hdp/apps/2.5.0.0-1245/mapreduce/mapreduce.tar.gz

Can anyone tell me what I am missing here? Please?

Eva Donaldson

2 Answers


1) Run hadoop fsck HDFS_FILE and check whether that particular HDFS file is healthy. If not, the file is corrupted: remove it, copy the jar back into HDFS, and try the command below.

2) Run hadoop dfsadmin -report and check that the value of Missing blocks is 0.

3) Check the NameNode web UI: Startup Progress -> Safe Mode should show 100%; otherwise leave safe mode:

hadoop dfsadmin -safemode leave

Then run fsck again and delete the files with missing blocks.
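
Put together, the checks might look like this (a sketch using the jar path from the Tez error above; on Hadoop 2.x the hdfs forms of these commands replace the deprecated hadoop forms):

# Check the health of the specific file and see where its blocks should live
hdfs fsck /tmp/hive/hiveuser/_tez_session_dir/03029ffd-a9c2-43de-8532-1e1f322ec0cd/hive-hcatalog-core.jar -files -blocks -locations

# Cluster-wide summary; "Missing blocks" should be 0
hdfs dfsadmin -report

# Confirm the NameNode is not stuck in safe mode, and leave it if it is
hdfs dfsadmin -safemode get
hdfs dfsadmin -safemode leave

# Remove files whose blocks cannot be recovered, then copy the jar back into place
hdfs fsck /tmp/hive/hiveuser/_tez_session_dir -delete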

Nirmal Ram
  • Not in safe mode. No corrupted blocks. I had already run these checks, and since I use Ambari they are also all available from the main overview page. All is well, so sadly it is definitely not related to any of that. – Eva Donaldson Nov 11 '16 at 18:05

I finally figured this out and thought it would be friendly of me to post the solution. It was one of those issues where, once you finally get it, you think, "Ugh, that was so obvious." One important note: if you are having trouble with Hive, make sure to check the YARN logs too!

My solution to this and so many other issues was to make sure every node had the IP addresses of all the other nodes in its hosts file. This ensures Ambari picks up the correct IP for each hostname. I am on Ubuntu, so I did the following on each node:

$ vim /etc/hosts

The file came out looking like this:

127.0.0.1       localhost
#127.0.1.1      ambarihost.com ambarihost
# Assigning static IP here so ambari gets it right
192.168.0.20    ambarihost.com ambarihost

#Other hadoop nodes
192.168.0.21    kafkahost.com kafkahost
192.168.0.22    hdfshost.com hdfshost
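
A quick way to confirm the change on each node (a sketch; the hostnames and addresses are the ones from the file above) is to check that the names now resolve to the static IPs rather than 127.0.1.1:

$ getent hosts ambarihost.com
192.168.0.20    ambarihost.com ambarihost

# On the Ambari host itself, the resolved address should no longer be 127.0.1.1
$ hostname -i
192.168.0.20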
Eva Donaldson