
Bulkload failed while processing the reducer with the following error. We are running the MapReduce job on an M5 cluster, trying to update an M7 table.

java.io.IOException: bulkLoaderClose() on '/home/test/account122' failed
with error: Function not implemented (38). 
at com.mapr.fs.Inode.checkError(Inode.java:1611) 
at com.mapr.fs.Inode.checkError(Inode.java:1583) 
at com.mapr.fs.Inode.bulkLoaderClose(Inode.java:1278) 
at com.mapr.fs.MapRHTable.bulkLoaderClose(MapRHTable.java:119) 
at com.mapr.fs.hbase.BulkLoadRecordWriter.close(BulkLoadRecordWriter.java:160)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:621)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:458)
at org.apache.hadoop.mapred.Child$4.run(Child.java:278)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1566) at org.apache.hadoop.mapred.Child.main(Child.java:267)

HBase version is 0.98.12-mapr-1506.

The MapR-DB table is enabled for bulkloading. The reducers process all the data, but the job somehow fails at the bulkload step.

Please help.

Rahul Kadukar
Abhiram

3 Answers


What utility are you using to bulk load? You can use ImportTsv to bulk load into MapR-DB tables.

Ranjit
  • I was using MapReduce to do the bulk load with HFileOutputFormat2. The job runs but fails while moving the HFiles. I did try ImportTsv; it also fails with the same error. – Abhiram Jan 07 '16 at 01:24

I tested the data file and import below. Try whether this works on your cluster.

1) Add the data below to a file in MapR-FS on your cluster, replacing the path with your own, e.g. /mapr/demo.mapr.com/home/datafile.csv:

2014,1,1,1,3,2014-01-01,AA,N338AA,1,JFK,LAX,0914,14.00,1238,13.00,0.00,,385.00,359.00,2475.00,,,,,,
2014,1,1,2,4,2014-01-02,AA,N338AA,1,JFK,LAX,0857,-3.00,1226,1.00,0.00,,385.00,340.00,2475.00,,,,,,

export CF="cf1"

2) Recreate the table and its column family:

maprcli table delete -path /home/test/account122
maprcli table create -path /home/test/account122
maprcli table cf create -path /home/test/account122 -cfname $CF

3) Run the import job:

java -cp `hbase classpath` org.apache.hadoop.hbase.mapreduce.ImportTsv \
  -Dimporttsv.separator=, \
  -Dimporttsv.columns=$CF:year,$CF:qtr,$CF:month,$CF:dom,$CF:dow,HBASE_ROW_KEY,$CF:carrier,$CF:tailnum,$CF:flightnumber,$CF:origin,$CF:dest,$CF:deptime,$CF:depdelay,$CF:arrtime,$CF:arrdelay,$CF:cncl,$CF:cnclcode,$CF:elaptime,$CF:airtime,$CF:distance,$CF:carrierdelay,$CF:weatherdelay,$CF:nasdelay,$CF:securitydelay,$CF:aircraftdelay,$CF:dummy \
  /home/test/account122 \
  /mapr/demo.mapr.com/home/datafile.csv

Ranjit
  • I realized after I posted this that you wanted to do a bulk load. Please use the below for bulk load:

    [mapr@maprdemo ~]$ cat voter_data
    1,david Davidson,49,socialist,369.78,5108
    2,priscilla steinbeck,61,democrat,111.76,2987

    % hbase org.apache.hadoop.hbase.mapreduce.ImportTsv -Dimporttsv.columns=HBASE_ROW_KEY,cf1:name,cf2:age,cf2:party,cf3:contribution_amount,cf3:voter_number -Dimporttsv.separator=, -Dimporttsv.bulk.output=/user/mapr/dummy /user/mapr/voter_data_table /user/mapr/voter_data

    No HFiles are created in the output directory:

    [mapr@maprdemo ~]$ ls /user/mapr/dummy
    _SUCCESS
    – Ranjit Jan 07 '16 at 19:50
  • Before running ImportTsv, create the table: hbase> create '/user/mapr/voter_data_table', {NAME=>'cf1'} – Ranjit Jan 07 '16 at 19:54

By default, MapR-DB tables do not support bulkloading. The error Function not implemented (38) indicates that bulkload is not enabled on this table.
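If that is the cause, a possible fix is to recreate the table with full bulkload enabled before running the job. This is a sketch, assuming the `-bulkload` option of `maprcli table create` is available in this MapR release; the path and column family name are the ones from the question:

```shell
# Recreate the table with full bulkload enabled
# (the bulkload attribute must be set at table creation time)
maprcli table delete -path /home/test/account122
maprcli table create -path /home/test/account122 -bulkload true
maprcli table cf create -path /home/test/account122 -cfname cf1

# After the bulk load job finishes, switch the table back to
# normal mode so regular reads and writes are allowed again
maprcli table edit -path /home/test/account122 -bulkload false
```

Note that while a table is in full bulkload mode it is typically unavailable for normal client operations, so the `table edit` step afterwards matters.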

Abhiram