I have an issue with my HBase cluster.
I am running an HBase cluster with Phoenix on EMR (emr-5.8.0) with S3 as the storage layer. The cluster has 1 master and 5 slave nodes (4.x large). After a region server dies, some data is missing when I query a table. I only see this issue when the storage mode is S3; with HDFS it works fine. Here are the steps I followed.
- Launched the cluster with the HDFS replication factor set to 3.
- Created the tables and loaded the data using Phoenix.
- Cross-checked the data I loaded into the tables and confirmed I could see it.
- Deliberately terminated an EC2 instance that is part of the cluster, which kills a region server.
- I could see EMR resizing the cluster and bringing up a new node.
- When I query the table after the whole cluster is stable again, which usually takes 5-10 minutes, some of the data that was on the dead RS is missing.
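To make the comparison in the last step concrete, here is a minimal sketch of how the before/after check could be expressed. It assumes the row counts per table are collected separately (for example with a Phoenix `SELECT COUNT(*)` before and after the node is terminated); the function name, table names, and numbers are hypothetical placeholders, not values from my cluster.

```python
from typing import Dict


def missing_rows(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return, per table, how many rows counted before the failure are absent afterwards.

    Tables whose count did not drop are left out of the result.
    A table missing from `after` is treated as having zero rows.
    """
    report: Dict[str, int] = {}
    for table, count_before in before.items():
        count_after = after.get(table, 0)
        if count_after < count_before:
            report[table] = count_before - count_after
    return report


if __name__ == "__main__":
    # Hypothetical counts taken before terminating the node and after recovery.
    before = {"EVENTS": 1_000_000, "USERS": 50_000}
    after = {"EVENTS": 998_500, "USERS": 50_000}
    for table, lost in missing_rows(before, after).items():
        print(f"{table}: {lost} rows missing after recovery")
```

Recording the counts per table this way should make it easier to tell whether the loss is limited to the regions that were hosted on the terminated node.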
I believe HBase replays the WAL once the new node is brought up, and I can also see the WAL file under the new RS’s directory on HDFS. But somehow I don’t see the complete data in the table.
Could you please let me know what could be going wrong? Also, please let me know if I need to set any properties.
I would be happy to provide more details if you need.