I'm having an odd problem. When I fetch data from HBase using the Spark Phoenix integration:
val rdd = sc.phoenixTableAsRDD(tableName, allColumns, zkUrl = Some(hostPort)).map(tupleToObject)
I get an RDD with all the records, but a few of them are missing values for some of the non-primary-key fields (seemingly at random). If I query those same records with a simple Phoenix client, the fields do in fact have values.
Example:
rdd.foreach(x => {
println("Field A -> " + x._1) //not a primary key value
println("Field B -> " + x._2) //not a primary key value
})
Output:
Field A -> 2
Field B -> 3
Field A -> null
Field B -> 4
Field A -> null
Field B -> null
What am I missing? Is it possible that Apache Phoenix does not guarantee data consistency?