I have a 3-node cluster with a replication factor of 2 and a replicated table named stats.

I recently noticed, via /replicas_status, that one of the replica databases is lagging behind:

db.stats:   Absolute delay: 0. Relative delay: 0.
db2.stats:  Absolute delay: 912916. Relative delay: 912916.
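
To watch this over time, the delay lines above can be parsed with a short script. This is a minimal sketch in Python, assuming the status text has exactly the line format shown above; the function names and the `threshold` value are hypothetical, not part of ClickHouse.

```python
import re

# Matches lines such as
# "db2.stats:  Absolute delay: 912916. Relative delay: 912916."
DELAY_LINE = re.compile(
    r"^(?P<table>\S+):\s+Absolute delay:\s*(?P<absolute>\d+)\.\s*"
    r"Relative delay:\s*(?P<relative>\d+)\.\s*$"
)


def parse_replica_delays(text):
    """Return {table: (absolute_delay, relative_delay)} for every matching line."""
    delays = {}
    for line in text.splitlines():
        match = DELAY_LINE.match(line.strip())
        if match:
            delays[match["table"]] = (
                int(match["absolute"]),
                int(match["relative"]),
            )
    return delays


def lagging_tables(delays, threshold=300):
    """List tables whose absolute delay exceeds the (hypothetical) threshold."""
    return sorted(
        table for table, (absolute, _) in delays.items() if absolute > threshold
    )


sample = """db.stats:   Absolute delay: 0. Relative delay: 0.
db2.stats:  Absolute delay: 912916. Relative delay: 912916."""

print(lagging_tables(parse_replica_delays(sample)))  # prints ['db2.stats']
```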

Here is the data from system.replication_queue:

Row 1:
──────
database:               db2
table:                  stats
replica_name:           replica_2
position:               3
node_name:              queue-0001743101
type:                   GET_PART
create_time:            2018-06-19 20:57:42
required_quorum:        0
source_replica:         replica_1
new_part_name:          20180619_20180619_823572_823572_0
parts_to_merge:         []
is_detach:              0
is_currently_executing: 0
num_tries:              917943
last_exception:
last_attempt_time:      2018-06-29 15:32:50
num_postponed:          118617
postpone_reason:
last_postpone_time:     2018-06-29 15:32:23

Row 2:
──────
database:               db2
table:                  stats
replica_name:           replica_2
position:               4
node_name:              queue-0001743103
type:                   MERGE_PARTS
create_time:            2018-06-19 20:57:48
required_quorum:        0
source_replica:         replica_1
new_part_name:          20180619_20180619_823568_823573_1
parts_to_merge:         ['20180619_20180619_823568_823568_0','20180619_20180619_823569_823569_0','20180619_20180619_823570_823570_0','20180619_20180619_823571_823571_0','20180619_20180619_823572_823572_0','20180619_20180619_823573_823573_0']
is_detach:              0
is_currently_executing: 0
num_tries:              917943
last_exception:         Code: 234, e.displayText() = DB::Exception: No active replica has part 20180619_20180619_823568_823573_1 or covering part, e.what() = DB::Exception
last_attempt_time:      2018-06-29 15:32:50
num_postponed:          199384
postpone_reason:        Not merging into part 20180619_20180619_823568_823573_1 because part 20180619_20180619_823572_823572_0 is not ready yet (log entry for that part is being processed).
last_postpone_time:     2018-06-29 15:32:35
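
The two rows depend on each other: the merge is postponed because one of its source parts is still waiting on the GET_PART entry. A rough way to spot such chains in a longer queue is sketched below in Python. It assumes the rows have already been fetched from system.replication_queue as dictionaries keyed by the column names above; the function name `blocking_parts` is hypothetical.

```python
def blocking_parts(rows):
    """Map each MERGE_PARTS entry to the source parts still awaiting a GET_PART.

    `rows` is a list of dicts using the system.replication_queue column
    names shown above (type, new_part_name, parts_to_merge).
    """
    # Parts that the replica has queued for fetching but does not have yet.
    pending_fetches = {
        row["new_part_name"] for row in rows if row["type"] == "GET_PART"
    }
    blocked = {}
    for row in rows:
        if row["type"] != "MERGE_PARTS":
            continue
        waiting_on = [p for p in row["parts_to_merge"] if p in pending_fetches]
        if waiting_on:
            blocked[row["new_part_name"]] = waiting_on
    return blocked


# The two queue entries from the question, reduced to the relevant columns.
rows = [
    {
        "type": "GET_PART",
        "new_part_name": "20180619_20180619_823572_823572_0",
        "parts_to_merge": [],
    },
    {
        "type": "MERGE_PARTS",
        "new_part_name": "20180619_20180619_823568_823573_1",
        "parts_to_merge": [
            "20180619_20180619_823568_823568_0",
            "20180619_20180619_823569_823569_0",
            "20180619_20180619_823570_823570_0",
            "20180619_20180619_823571_823571_0",
            "20180619_20180619_823572_823572_0",
            "20180619_20180619_823573_823573_0",
        ],
    },
]

print(blocking_parts(rows))
```

For the rows above this reports that the merge into 20180619_20180619_823568_823573_1 is waiting on part 20180619_20180619_823572_823572_0, which matches the postpone_reason in Row 2.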

Any clue how to deal with it?

Should I detach the broken replica's partition and attach it again?

wedi

2 Answers

Stop all inserts to this cluster; the replication queue should then clear on its own.

delsanic

I found one possible cause, because I ran into a similar problem.
Check the free disk space on the replica and clean up some tables;
that can resolve the replication_queue problem.
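
A quick way to check free space from a script is sketched below in Python. It is only an illustration: `DATA_PATH` is a placeholder that should point at the ClickHouse data directory (commonly /var/lib/clickhouse, which is an assumption about your install), and the 10% threshold is arbitrary.

```python
import shutil

# Placeholder: replace with the ClickHouse data directory on the replica
# (commonly /var/lib/clickhouse; that default is an assumption here).
DATA_PATH = "."


def free_fraction(path):
    """Return the fraction of the filesystem holding `path` that is free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.free / usage.total


def is_low_on_space(path, min_free=0.10):
    """True if less than `min_free` (an arbitrary threshold) of the disk is free."""
    return free_fraction(path) < min_free


if is_low_on_space(DATA_PATH):
    print(f"Low disk space on {DATA_PATH}: {free_fraction(DATA_PATH):.1%} free")
else:
    print(f"Disk space looks fine on {DATA_PATH}: {free_fraction(DATA_PATH):.1%} free")
```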

james.peng