My cluster is composed of 3 linux_x64 servers. It contains 1 controller node and 3 data nodes. The server version of DolphinDB is v2.0.1.1.

dfsReplicationFactor=2
dataSync=1

The schema of the database is:

2021.09.09 08:42:57.180: execution was completed [34ms]
partitionSchema->[2021.09.06,2021.09.05,2021.09.04,2021.09.03,2021.09.02,2021.09.01,2021.08.31,2021.08.30,...]
databaseDir->dfs://dwd
engineType->OLAP
partitionSites->
partitionTypeName->VALUE
partitionType->1

When I insert data into the database "dfs://dwd", I get an error:

Failed to add new value partitions to database dfs://dwd.Please manaually add new partitions [2021.09.07].

Then I use the following script to manually add partitions:

db=database("dfs://dwd")
addValuePartitions(db,2021.09.07..2021.09.09)

The error is:

<ChunkInRecovery>openChunks failed on '/dwd/domain', chunk cf57375e-b4b3-dc87-9b41-667a5e91a757 is in RECOVERING state
jinwandalaohu

1 Answer

The repair procedure is as follows:

Step 1: On the controller node, use getClusterChunksStatus to get the chunk ID of '/dwd/domain'. Sample code:

select * from rpc(getControllerAlias(), getClusterChunksStatus) where  file like "%/domain%" and state != 'COMPLETE'

Step 2: Use getAllChunks on the data nodes to get the partition information for that chunk ID. In the code below, the chunk ID "4503a64f-4f5f-eea4-4247-a0d0fc3941a1" is the one obtained in step 1; replace it with the ID from your own cluster.

select * from pnodeRun(getAllChunks)  where chunkId="4503a64f-4f5f-eea4-4247-a0d0fc3941a1"

Step 3: Use copyReplicas to copy the partition replica to another data node. Assuming the result of step 2 shows that the replica is on datanode3, copy it to datanode1:

rpc(getControllerAlias(), copyReplicas{`datanode3, `datanode1, "4503a64f-4f5f-eea4-4247-a0d0fc3941a1"})

Step 4: Use getClusterChunksStatus again to check whether the state is COMPLETE. If it is, the repair was successful.
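
As a minimal sketch of this check, the step 1 query can be rerun without the state filter so that the repaired chunk shows up along with its current state. It uses only the functions and columns that already appear above. Retrying the addValuePartitions call from the question afterwards is my assumption about the natural next step, not something confirmed here.

```
// Rerun the step 1 query without the state filter;
// the '/dwd/domain' chunk should now report state = 'COMPLETE'.
select * from rpc(getControllerAlias(), getClusterChunksStatus) where file like "%/domain%"

// Assumption: once the chunk is COMPLETE, the manual partition call
// from the question can be retried.
db=database("dfs://dwd")
addValuePartitions(db,2021.09.07..2021.09.09)
```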

dbaa9948