
I'm having an odd problem when I fetch data from HBase using the Phoenix Spark integration:

val rdd = sc.phoenixTableAsRDD(tableName, allColumns, zkUrl = Some(hostPort)).map(tupleToObject)

I get an RDD with all the records, but in a few of them some of the non-primary-key fields are not populated (seemingly at random). If I query those same records with a simple Phoenix client, I find that the fields do in fact have values.

Example:
rdd.foreach(x => {
  println("Field A -> " + x._1) // not a primary key column
  println("Field B -> " + x._2) // not a primary key column
})

Output:
  Field A 2
  Field B 3
  Field A null
  Field B 4
  Field A null  
  Field B null  

What am I missing? Is it possible that Apache Phoenix does not guarantee data consistency?

Felix