
I have run into a problem where a small number of shards of a large table remain in the "Underreplicated" state. My small cluster has 5 nodes holding this single table. The table has 200M records, 5 shards, and 2 replicas, and is partitioned into 200 partitions. Everything was fine at first. After some testing (i.e. taking a node down and bringing it back up), 3 shards became underreplicated.

When I look at the table "sys.shards", I see that the status of those three shards remains "initializing".
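A minimal sketch of that kind of check, assuming the rows have already been fetched from "sys.shards". The column order (id, table_name, primary, state), the state values "STARTED" and "INITIALIZING", the table name "mytable", and the sample rows are all assumptions made for illustration, not output from the real cluster:

```python
from collections import Counter

# Hypothetical rows as they might come back from a query on sys.shards.
# Assumed column order: (id, table_name, primary, state).
sample_rows = [
    (0, "mytable", True, "STARTED"),
    (0, "mytable", False, "STARTED"),
    (1, "mytable", True, "STARTED"),
    (1, "mytable", False, "INITIALIZING"),
    (2, "mytable", False, "INITIALIZING"),
    (3, "mytable", False, "INITIALIZING"),
]


def summarize_shards(rows):
    """Count shard copies per state and list those that are not STARTED."""
    rows = list(rows)
    by_state = Counter(state.upper() for _, _, _, state in rows)
    not_started = [row for row in rows if row[3].upper() != "STARTED"]
    return by_state, not_started


by_state, not_started = summarize_shards(sample_rows)
print(dict(by_state))
for shard_id, table, primary, state in not_started:
    kind = "primary" if primary else "replica"
    print(f"shard {shard_id} of {table} ({kind}) is {state}")
```

Listing the shard copies that are not started, together with whether each is a primary or a replica, makes it easier to tell whether only replicas are stuck after the node came back.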

Please advise: what could be the problem? By the way, each node has 100 GB of storage (20% used) and a 4 GB heap (50% used).

Thanks!

admdrew
Newair
  • Strange. Can you please tell us which Crate Data, JVM, and OS versions you are using? Do you have any related log output? Does a full cluster restart change anything? It might also be better to move this to a Crate GitHub issue. Thx – Sebastian Utz Jul 22 '14 at 11:45
  • Hi Sebastian, a full cluster restart doesn't seem to solve the problem. I dropped the table and reloaded the data to try to reproduce it, but so far I have not been able to. Crate: 0.39.3, JVM: IBM J9VM (build 2.5, JRE 1.7.0), OS: ORHL 6.3 (Santiago) – Newair Jul 22 '14 at 18:20
  • Thanks for the information. To be honest, we have no clue why this happened. We have never experienced such a problem in any development or production setup, nor can we reproduce it. The only thing that comes to mind is that we have never tested Crate Data with the IBM J9VM, only with Oracle's JVM and OpenJDK. On Linux we always recommend using OpenJDK. If this happens to you again, log output would be really helpful. – Sebastian Utz Jul 25 '14 at 09:33
  • Thanks, to be safe I will switch to OpenJDK. – Newair Jul 25 '14 at 14:24

0 Answers