
I am testing MongoDB sharding on my local machine. Everything seems to run fine, except that my first chunk is redundant (it never holds any documents). I initially had a single mongod instance running with 100,000 entries like this:

{ "_id" : ObjectId("53d788d26d664906cb359203"), "ind" : 123, "123" : 123, "someThing" : 5656 } .

I had indexed this collection on `ind`. So there were one hundred thousand entries, with `ind` ranging from 0 to 100000.

I then deployed sharding on top of this collection.
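For reference, the setup was roughly the following (a sketch from memory; the exact helper calls and port numbers may differ slightly from what I actually typed):

    // On the mongos shell; the three shards are the mongod instances on ports 104-106
    sh.addShard("localhost:104")
    sh.addShard("localhost:105")
    sh.addShard("localhost:106")

    // Enable sharding for the database and shard the collection on the existing 'ind' index
    sh.enableSharding("test")
    use test
    db.data.ensureIndex({ ind: 1 })              // index already existed before sharding
    sh.shardCollection("test.data", { ind: 1 })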

When I ran sh.status() on my mongos instance, I got this:

    --- Sharding Status ---
      sharding version: { "_id" : 1, "version" : 4, "minCompatibleVersion" : 4, "currentVersion" : 5, "clusterId" : ObjectId("53d74302e7df70cc9b8394e3") }
      shards:
        { "_id" : "shard0000", "host" : "localhost:104" }
        { "_id" : "shard0001", "host" : "localhost:105" }
        { "_id" : "shard0002", "host" : "localhost:106" }
      databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
            test.data
                shard key: { "ind" : 1 }
                chunks:
                    shard0001    1
                    shard0002    2
                    shard0000    1
                { "ind" : { "$minKey" : 1 } } -->> { "ind" : 0 } on : shard0001 Timestamp(2, 0)
                { "ind" : 0 } -->> { "ind" : 25000 } on : shard0002 Timestamp(3, 2)
                { "ind" : 25000 } -->> { "ind" : 50000 } on : shard0002 Timestamp(3, 3)
                { "ind" : 50000 } -->> { "ind" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 1)

This was my initial state. After adding some more data:

    --- Sharding Status ---
      sharding version: { "_id" : 1, "version" : 4, "minCompatibleVersion" : 4, "currentVersion" : 5, "clusterId" : ObjectId("53d74302e7df70cc9b8394e3") }
      shards:
        { "_id" : "shard0000", "host" : "localhost:104" }
        { "_id" : "shard0001", "host" : "localhost:105" }
        { "_id" : "shard0002", "host" : "localhost:106" }
      databases:
        { "_id" : "admin", "partitioned" : false, "primary" : "config" }
        { "_id" : "test", "partitioned" : true, "primary" : "shard0000" }
            test.data
                shard key: { "ind" : 1 }
                chunks:
                    shard0001    4
                    shard0002    4
                    shard0000    4
                { "ind" : { "$minKey" : 1 } } -->> { "ind" : 0 } on : shard0001 Timestamp(2, 0)
                { "ind" : 0 } -->> { "ind" : 25000 } on : shard0002 Timestamp(3, 2)
                { "ind" : 25000 } -->> { "ind" : 50000 } on : shard0002 Timestamp(3, 3)
                { "ind" : 50000 } -->> { "ind" : 87449 } on : shard0001 Timestamp(4, 0)
                { "ind" : 87449 } -->> { "ind" : 149796 } on : shard0001 Timestamp(5, 0)
                { "ind" : 149796 } -->> { "ind" : 224694 } on : shard0002 Timestamp(6, 0)
                { "ind" : 224694 } -->> { "ind" : 299592 } on : shard0001 Timestamp(7, 0)
                { "ind" : 299592 } -->> { "ind" : 374490 } on : shard0002 Timestamp(8, 0)
                { "ind" : 374490 } -->> { "ind" : 524286 } on : shard0000 Timestamp(8, 1)
                { "ind" : 524286 } -->> { "ind" : 674082 } on : shard0000 Timestamp(7, 2)
                { "ind" : 674082 } -->> { "ind" : 992211 } on : shard0000 Timestamp(7, 3)
                { "ind" : 992211 } -->> { "ind" : { "$maxKey" : 1 } } on : shard0000 Timestamp(3, 5)

My shard key is `ind`. But the chunk `{ "ind" : { "$minKey" : 1 } } -->> { "ind" : 0 } on : shard0001 Timestamp(2, 0)` is clearly not right: all of my `ind` values are non-negative, so no document ever falls into that range.

My first chunk will always be empty.

What should I do?

user3848191
  • Could you clarify what the problem is? The first chunk won't always be empty. You could add a document with `ind < 0`. It might not be empty now, if you added any documents with `ind < 0`. – wdberkeley Jul 31 '14 at 18:06
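As the comment notes, the first chunk only looks useless because no document has a negative shard-key value; anything inserted through mongos with `ind < 0` would land in it. A minimal illustration (hypothetical document and values):

    // Connected to the mongos, test database
    db.data.insert({ ind: -5, someThing: 42 })   // falls into the { $minKey } -->> { ind: 0 } chunk
    sh.status()                                  // the first chunk on shard0001 now holds one document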

1 Answer


It looks like you have an empty first chunk. From the MongoDB docs: a chunk is empty if it has no documents within its shard key range.

IMPORTANT: Empty chunks can make the balancer assess the cluster as properly balanced when it is not. Empty chunks can occur under various circumstances, including:

- If a pre-split creates too many chunks, the distribution of data to chunks may be uneven.
- If you delete many documents from a sharded collection, some chunks may no longer contain data.
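You can confirm that a chunk is empty with the dataSize command, run against the mongos (a sketch; the namespace and bounds below are taken from your sh.status() output):

    // Reports size: 0 and numObjects: 0 for an empty chunk
    use test
    db.runCommand({
        dataSize: "test.data",
        keyPattern: { ind: 1 },
        min: { ind: MinKey },
        max: { ind: 0 }
    })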

The problem in your case is that there are no consecutive chunks (empty + non-empty) on the same shard that can be merged: mergeChunks only combines contiguous chunk ranges that reside on the same shard. So for now you will have to live with it. See: Merge Chunks in a Sharded Cluster (MongoDB manual).
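For completeness, if the empty chunk and its neighbour ever do end up contiguous on the same shard (for example after the balancer, or a manual moveChunk, relocates one of them), the merge itself would look something like this (a sketch, run against the mongos, with bounds taken from the sh.status() output above):

    // Merge the empty chunk [MinKey, 0) with its neighbour [0, 25000)
    // once both ranges live on the same shard
    db.adminCommand({
        mergeChunks: "test.data",
        bounds: [ { ind: MinKey }, { ind: 25000 } ]
    })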

vmr