
I have a POJO class for employee records, shown below. I am able to read the streaming data, and I want to insert that data into HBase.

@JsonInclude(Include.NON_NULL)
public class empData implements Serializable {

    private String id;
    private String name;

    @Override
    public String toString() {
        return "id=" + id + ", name="+ name ;
    }
    public String getId() {
        return id;
    }
    public void setId(String id) {
        this.id = id;
    }
    public String getName() {
        return name;
    }
    public void setName(String name) {
        this.name = name;
    }

}

Below is the Spark code:

empRecords.foreachRDD(new Function<JavaRDD<empData>, Void>() {

    private static final long serialVersionUID = 1L;

    @Override
    public Void call(JavaRDD<empData> empDataEvent) throws Exception {

        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "**********");
        HBaseAdmin.checkHBaseAvailable(config);
        config.set(TableInputFormat.INPUT_TABLE, "tableName");

        Job newAPIJobConfiguration1 = Job.getInstance(config);
        newAPIJobConfiguration1.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "empHbase");
        newAPIJobConfiguration1.setOutputFormatClass(org.apache.hadoop.hbase.mapreduce.TableOutputFormat.class);

        JavaPairRDD<ImmutableBytesWritable, Put> inserts = empDataEvent.mapToPair(new PairFunction<Row, ImmutableBytesWritable, Put>() {

            public Tuple2<ImmutableBytesWritable, Put> call(Row row) throws Exception {
                Put put = new Put(Bytes.toBytes(row.getString(0)));
                put.add(Bytes.toBytes("empA"), Bytes.toBytes("id"), Bytes.toBytes(row.getString(1)));
                put.add(Bytes.toBytes("empA"), Bytes.toBytes("name"), Bytes.toBytes(row.getString(2)));
                return new Tuple2<ImmutableBytesWritable, Put>(new ImmutableBytesWritable(), put);
            }
        });

        inserts.saveAsNewAPIHadoopDataset(newAPIJobConfiguration1.getConfiguration());
        return null;
    }
});
jssc.start();
jssc.awaitTermination();

The problem in the code is this step:

JavaPairRDD<ImmutableBytesWritable, Put> inserts = empDataEvent.mapToPair(new PairFunction<Row, ImmutableBytesWritable, Put>()

How do I use empDataEvent here? How do I map empData objects in mapToPair so that I can insert them into HBase? Any help is appreciated.


1 Answer


Aman,

In your code you refer to "Row"; can you elaborate on where it is coming from? There is no reference to it anywhere else.

See the updated code below; it uses your empData class instead of the "Row" object.

JavaPairRDD<ImmutableBytesWritable, Put> inserts = empDataEvent.mapToPair(new PairFunction<empData, ImmutableBytesWritable, Put>() {

    public Tuple2<ImmutableBytesWritable, Put> call(empData row) throws Exception {
        Put put = new Put(Bytes.toBytes(row.getId()));
        put.add(Bytes.toBytes("empA"), Bytes.toBytes("id"), Bytes.toBytes(row.getId()));
        put.add(Bytes.toBytes("empA"), Bytes.toBytes("name"), Bytes.toBytes(row.getName()));
        return new Tuple2<ImmutableBytesWritable, Put>(new ImmutableBytesWritable(), put);
    }
});
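
For completeness, here is a rough end-to-end sketch of how the corrected mapToPair slots back into the foreachRDD from the question. It reuses the names from the question (empRecords, the empHbase table, the empA column family); the zookeeper quorum value is a placeholder, and Put.addColumn is assumed to be available (it replaces the three-argument Put.add, which is deprecated from HBase 1.0 on), so switch back to put.add(...) on older HBase versions.

import scala.Tuple2;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableOutputFormat;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapreduce.Job;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.api.java.function.PairFunction;

empRecords.foreachRDD(new Function<JavaRDD<empData>, Void>() {

    private static final long serialVersionUID = 1L;

    @Override
    public Void call(JavaRDD<empData> empDataEvent) throws Exception {

        // One output configuration per batch, pointing at the target table.
        Configuration config = HBaseConfiguration.create();
        config.set("hbase.zookeeper.quorum", "your-zk-quorum"); // placeholder

        Job job = Job.getInstance(config);
        job.getConfiguration().set(TableOutputFormat.OUTPUT_TABLE, "empHbase");
        job.setOutputFormatClass(TableOutputFormat.class);

        // The row key travels inside the Put; TableOutputFormat ignores the
        // ImmutableBytesWritable key, so an empty one is fine.
        JavaPairRDD<ImmutableBytesWritable, Put> inserts =
            empDataEvent.mapToPair(new PairFunction<empData, ImmutableBytesWritable, Put>() {

                public Tuple2<ImmutableBytesWritable, Put> call(empData emp) throws Exception {
                    Put put = new Put(Bytes.toBytes(emp.getId()));
                    put.addColumn(Bytes.toBytes("empA"), Bytes.toBytes("name"),
                            Bytes.toBytes(emp.getName()));
                    return new Tuple2<ImmutableBytesWritable, Put>(new ImmutableBytesWritable(), put);
                }
            });

        inserts.saveAsNewAPIHadoopDataset(job.getConfiguration());
        return null;
    }
});

Since the employee id is already the row key, writing it again into the empA:id column (as the snippet above does) is optional; this sketch stores only the name.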