
I am unable to load data into a new partitioned table from a table that already exists in Hive.

The query I am running in Hive after creating the table is:

INSERT INTO TABLE ba_data.PNR_INFO1_partitioned PARTITION(pnr_create_dt) select * from pnr_info1_external;

The error I am getting is:

    Caused by: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hive/warehouse/ba_data.db/pnr_info1_partitioned/.hive-staging_hive_2016-08-09_17-47-47_508_8688474345886508021-1/_task_tmp.-ext-10002/pnr_create_dt=18%2F12%2F2013/_tmp.000000_3 could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.
        at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1549)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3200)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:641)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:482)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033)

        at org.apache.hadoop.ipc.Client.call(Client.java:1468)
        at org.apache.hadoop.ipc.Client.call(Client.java:1399)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
        at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:399)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1532)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1349)
        at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:588)

After searching around, I found suggestions that the NameNode and DataNode data directories should be deleted and the NameNode reformatted. I have done that cleanup as well, but I am still getting the same error.

I have also set the replication factor to 1, and all the Hadoop processes are running fine.
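For reference, this is a minimal sketch of how the replication factor is set in hdfs-site.xml (the property name is standard HDFS configuration; the value 1 matches my single-DataNode setup):

```xml
<configuration>
  <!-- Default block replication; 1 is only sensible for a single-node setup -->
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
```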

Please suggest how I should proceed to resolve this issue. Your suggestions are much appreciated.

Avinash

2 Answers


To perform partitioning on a value, you first need to:

1. Create a table with all the fields.
2. Load the data into that table.
3. Create a table with the partition column and its type.
4. Copy the data from the first table into the partitioned table.
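A sketch of those four steps in HiveQL (the table names, column names, and input path below are made up for illustration; the statements assume a working Hive setup):

```sql
-- 1. Staging table with all fields, including the eventual partition column
CREATE TABLE staging (id INT, city STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';

-- 2. Load data into the staging table
LOAD DATA LOCAL INPATH '/tmp/data.tsv' INTO TABLE staging;

-- 3. Partitioned table: the partition column is declared separately,
--    not in the main column list
CREATE TABLE target (id INT) PARTITIONED BY (city STRING);

-- 4. Copy from staging into the partitioned table; the partition column
--    must come last in the SELECT list
SET hive.exec.dynamic.partition.mode=nonstrict;
INSERT INTO TABLE target PARTITION (city)
SELECT id, city FROM staging;
```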

This image shows how to create the table, load it, create the partitioned table, and copy the data from the table into the partitioned table.

Mahesh Gupta
  • Hi, I have followed the same approach and tried it. While loading the data into the partitioned table, I am getting this error. – Avinash Aug 11 '16 at 04:53
  • You shouldn't load data into the partitioned table directly; just copy the data from the previously created table into the partitioned table ;) – Mahesh Gupta Aug 11 '16 at 07:49
  • I suspect it is due to the infrastructure (4 GB RAM) running on a native Hadoop and Hive platform. The table I am partitioning is close to 1 GB (1.9 million records). I created multiple tables with 50,000 records each and copied the data into the actual partitioned table, which worked fine for now. Hopefully this won't happen on a larger cluster. Thanks for your response. – Avinash Aug 11 '16 at 13:02

I think dynamic partitioning needs to be enabled. The following works:

set hive.exec.dynamic.partition.mode=nonstrict;

create table parttable (id int) partitioned by (partcolumn string) 
row format delimited fields terminated by '\t'
lines terminated by '\n'
;

create table source_table (id int,partcolumn string)
row format delimited fields terminated by '\t'
lines terminated by '\n'
;

insert into source_table values (1,'Chicago');
insert into source_table values (2,'Chicago');
insert into source_table values (3,'Orlando');

set hive.exec.dynamic.partition=true;

insert overwrite table parttable partition(partcolumn) select id,partcolumn from source_table;

Sagar Shah
  • Hi, I have followed the same approach and tried it. While loading the data into the partitioned table, I am getting this error. – Avinash Aug 11 '16 at 04:53
  • If you think it is a capacity issue, you can rule it out by limiting the table size with a temp table: create table temp as select * from pnr_info1_external limit 500; INSERT INTO TABLE ba_data.PNR_INFO1_partitioned PARTITION(pnr_create_dt) select * from temp; – Sagar Shah Aug 15 '16 at 21:04
  • Yes, you are correct. We can LIMIT to 500, but I have around 1.9 million records, and there is no concept of row id in Hive. We would have to write a UDF to generate one, which would be another task to perform. – Avinash Aug 16 '16 at 05:52