My directory structure is as follows:

  • /data/year=/month=/day=/source1/abc.log
  • /data/year=/month=/day=/source2/def.log
  • /data/year=/month=/day=/source3/xyz.log

I want to create a Hive table partitioned by year, month, and date, but MSCK REPAIR TABLE complains about the subfolder 'source1'.

Create table statement:

  CREATE EXTERNAL TABLE SAMPLE ( col1 STRING, col2 STRING )
  PARTITIONED BY (year STRING, month STRING, date STRING)
  STORED AS ORC
  LOCATION "s3n://blah/data/"
  TBLPROPERTIES ("orc.compress"="SNAPPY");

MSCK REPAIR TABLE gives "unexpected component source1". Any idea how to create an external table without moving the files around? Thanks for your help.

kamoor

2 Answers

Could you please try setting the following property:

  hive.msck.path.validation = skip (or) ignore

in hive-site.xml, and then run 'MSCK REPAIR TABLE' on your table.

(Referred from the Hive Manual, under the 'Recover Partitions (MSCK REPAIR TABLE)' section.)
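
As a rough sketch of the same idea without touching hive-site.xml, the property can usually also be set for the current session before running the repair. This assumes your Hive version and deployment allow the property to be changed with SET; the value "ignore" is used here because the comment below reports that it was the one that worked.

```sql
-- Session-level alternative to editing hive-site.xml
-- (assumption: this property is settable at session level in your setup).
SET hive.msck.path.validation=ignore;

-- Re-scan the table location and register the partitions that are found.
MSCK REPAIR TABLE SAMPLE;

-- Check which partitions were registered.
SHOW PARTITIONS SAMPLE;
```

Checking the output of SHOW PARTITIONS afterwards is worthwhile, since with validation relaxed Hive no longer rejects directory names it would otherwise complain about.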

Aditya
    Thanks for your reply. "skip" did not work, but "ignore" did. I had to look at the Hive source to find the fix; see the msck method in DDLTask.java. Please edit your answer so that I can mark it as accepted. – kamoor Aug 11 '16 at 17:18

This error probably occurs because your path contains an extra folder after the partition directories (source1, source2, source3). I had a similar issue when I forgot a partition in the creation statement.
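
One way to act on this without moving files, sketched under several assumptions: the table is declared with a fourth partition column (called `source` here, a hypothetical name), and each partition is registered explicitly with its own location, since folders named 'source1' rather than 'source=source1' would not be picked up as partitions automatically. The values 2016, 08 and 11 are placeholders, not taken from the question.

```sql
-- Assumes SAMPLE was created with
--   PARTITIONED BY (year STRING, month STRING, `date` STRING, source STRING)
-- where `source` is a hypothetical extra partition column.
-- `date` is quoted with backticks because newer Hive versions reserve it.
-- The year/month/day values below are placeholders for illustration only.
ALTER TABLE SAMPLE ADD IF NOT EXISTS
  PARTITION (year='2016', month='08', `date`='11', source='source1')
  LOCATION 's3n://blah/data/year=2016/month=08/day=11/source1/';
```

Each source folder needs its own ADD PARTITION statement, so this suits a scripted load better than a one-off repair.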

Paulo Moreira