I've built a CouchDB cluster of 4 nodes to store the tweets I retrieved.
The cluster was configured to have 8 shards and keep 3 copies of each document:
[cluster]
q=8
r=2
w=2
n=3
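For reference, newer CouchDB 2.x releases expose the shard-to-node mapping over HTTP, which is how I check where each range lives. A minimal sketch, assuming the database is called tweets and with placeholder host and credentials:

import requests  # third-party: pip install requests

COUCH = "http://localhost:5984"  # placeholder host
AUTH = ("admin", "password")     # placeholder credentials
DB = "tweets"                    # database name assumed

# GET /{db}/_shards lists each hash range together with the nodes
# that hold a copy of that shard.
resp = requests.get(f"{COUCH}/{DB}/_shards", auth=AUTH)
resp.raise_for_status()
for hash_range, nodes in sorted(resp.json()["shards"].items()):
    print(hash_range, "->", ", ".join(nodes))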
I haven't added any views or additional indexes, and the database size shown in Fauxton is 4.3 GB.
However, CouchDB is taking up an exceptionally large amount of disk space on one of the nodes:
$ ansible -i hosts -s -m shell -a 'du /vol/couchdb/shards/* -sh' couchdb
crake.couchdb.cloud | SUCCESS | rc=0 >>
363M /vol/couchdb/shards/00000000-1fffffff
990M /vol/couchdb/shards/20000000-3fffffff
17G /vol/couchdb/shards/40000000-5fffffff
1.4G /vol/couchdb/shards/60000000-7fffffff
359M /vol/couchdb/shards/80000000-9fffffff
989M /vol/couchdb/shards/a0000000-bfffffff
12G /vol/couchdb/shards/c0000000-dfffffff
1.6G /vol/couchdb/shards/e0000000-ffffffff
darter.couchdb.cloud | SUCCESS | rc=0 >>
1.4G /vol/couchdb/shards/00000000-1fffffff
367M /vol/couchdb/shards/20000000-3fffffff
1001M /vol/couchdb/shards/40000000-5fffffff
1.4G /vol/couchdb/shards/60000000-7fffffff
1.4G /vol/couchdb/shards/80000000-9fffffff
364M /vol/couchdb/shards/a0000000-bfffffff
998M /vol/couchdb/shards/c0000000-dfffffff
1.4G /vol/couchdb/shards/e0000000-ffffffff
bustard.couchdb.cloud | SUCCESS | rc=0 >>
1004M /vol/couchdb/shards/00000000-1fffffff
1.4G /vol/couchdb/shards/20000000-3fffffff
1.4G /vol/couchdb/shards/40000000-5fffffff
365M /vol/couchdb/shards/60000000-7fffffff
1001M /vol/couchdb/shards/80000000-9fffffff
1.4G /vol/couchdb/shards/a0000000-bfffffff
1.4G /vol/couchdb/shards/c0000000-dfffffff
364M /vol/couchdb/shards/e0000000-ffffffff
avocet.couchdb.cloud | SUCCESS | rc=0 >>
1.4G /vol/couchdb/shards/00000000-1fffffff
1.4G /vol/couchdb/shards/20000000-3fffffff
368M /vol/couchdb/shards/40000000-5fffffff
999M /vol/couchdb/shards/60000000-7fffffff
1.4G /vol/couchdb/shards/80000000-9fffffff
1.4G /vol/couchdb/shards/a0000000-bfffffff
364M /vol/couchdb/shards/c0000000-dfffffff
1001M /vol/couchdb/shards/e0000000-ffffffff
On crake.couchdb.cloud, two of the shards, 40000000-5fffffff and c0000000-dfffffff, are far larger than the others.
I once tried deleting those large shard files on crake.couchdb.cloud and letting CouchDB rebuild them itself. Disk usage was balanced after the rebuild, but it gradually became unbalanced again once I started adding new documents to the database.
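One thing worth noting: CouchDB storage files are append-only and only shrink on compaction, so part of such a gap could in principle be uncompacted garbage. Comparing sizes.file with sizes.active should show that; a minimal sketch, again with placeholder host, credentials and database name:

import requests  # pip install requests

COUCH = "http://localhost:5984"  # placeholder host
AUTH = ("admin", "password")     # placeholder credentials
DB = "tweets"                    # database name assumed

# sizes.file is what the shard files occupy on disk, sizes.active is
# the live data; a big gap means compaction would reclaim the difference.
sizes = requests.get(f"{COUCH}/{DB}", auth=AUTH).json()["sizes"]
print("file:", sizes["file"], "active:", sizes["active"],
      "reclaimable:", sizes["file"] - sizes["active"])

If the gap were large, POST /{db}/_compact should reclaim most of it, though note that GET /{db} aggregates sizes over the whole cluster and cannot single out one node.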
I'm using MD5(tweet[id_str]) as the document ID. Could this be the cause of the issue?
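For context, my understanding (from skimming CouchDB's mem3 source, so treat this as an assumption) is that the shard is picked by crc32-hashing the document ID, not by the ID's own hex value. This little simulation shows where MD5-hex IDs would land under that assumption:

import hashlib
import zlib
from collections import Counter

Q = 8  # number of shard ranges, matching q=8

def shard_range(doc_id, q=Q):
    """Map a doc ID to a hash range, assuming mem3-style crc32 placement."""
    h = zlib.crc32(doc_id.encode("utf-8")) & 0xFFFFFFFF
    step = 2 ** 32 // q
    lo = (h // step) * step
    return "{:08x}-{:08x}".format(lo, lo + step - 1)

# Simulate a stream of doc IDs built like MD5(tweet['id_str']).
counts = Counter(
    shard_range(hashlib.md5(str(i).encode()).hexdigest())
    for i in range(100000)
)
for rng, n in sorted(counts.items()):
    print(rng, n)

In my runs the counts come out roughly even across the 8 ranges.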
What really confuses me is that even if I had made a mistake somewhere, it should have eaten up space on 3 different nodes, since the data are replicated across the cluster (n=3).
Please help, thanks.
UPDATE
Later I deleted all the VPS instances and rebuilt the cluster with 3 CouchDB nodes, namely Avocet, Bustard and Crake. The new cluster configuration is as follows:
[cluster]
q=12
r=2
w=2
n=2
Before the rebuild, I replicated all the data to a separate CouchDB instance so I could transfer them back once the new cluster was up. The disk usage was balanced after the restoration.
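The transfer itself was an ordinary one-shot replication, roughly like the following sketch (URLs and credentials are placeholders of mine):

import requests  # pip install requests

CLUSTER = "http://admin:password@cluster.example:5984"  # placeholder
BACKUP = "http://admin:password@backup.example:5984"    # placeholder

# POST /_replicate runs a one-shot replication; create_target creates
# the destination database if it does not exist yet.
resp = requests.post(f"{CLUSTER}/_replicate", json={
    "source": f"{BACKUP}/tweets",
    "target": f"{CLUSTER}/tweets",
    "create_target": True,
})
resp.raise_for_status()
print(resp.json())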
Additionally, I introduced HAProxy on the 4th node, Darter, as a load balancer, so this time all my Twitter retrievers send their requests through the balancer (a sketch of the write path follows the df output below). However, the disk usage became unbalanced again, and once more it was the 3rd node, Crake, that took up much more space:
bustard.couchdb.cloud | SUCCESS | rc=0 >>
Filesystem Size Used Avail Use% Mounted on
/dev/vdc 81G 9.4G 68G 13% /vol
avocet.couchdb.cloud | SUCCESS | rc=0 >>
Filesystem Size Used Avail Use% Mounted on
/dev/vdc 81G 9.3G 68G 13% /vol
crake.couchdb.cloud | SUCCESS | rc=0 >>
Filesystem Size Used Avail Use% Mounted on
/dev/vdc 81G 30G 48G 39% /vol
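For completeness, the write path through the balancer looks roughly like this (host, port, credentials and the sample tweet are placeholders of mine):

import hashlib
import requests  # pip install requests

BALANCER = "http://darter.couchdb.cloud:5984"  # HAProxy front end (port assumed)
AUTH = ("admin", "password")                   # placeholder credentials
DB = "tweets"                                  # database name assumed

def store_tweets(tweets):
    """Bulk-insert tweets through the balancer, using MD5(id_str) as _id."""
    docs = [dict(t, _id=hashlib.md5(t["id_str"].encode()).hexdigest())
            for t in tweets]
    resp = requests.post(f"{BALANCER}/{DB}/_bulk_docs",
                         json={"docs": docs}, auth=AUTH)
    resp.raise_for_status()

store_tweets([{"id_str": "850006245121695744", "text": "example tweet"}])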
The database size is only 4.2 GB, and Crake used approximately 7 times that much disk space!
I'm completely clueless now...
UPDATE 2
The _dbs info from all the nodes:
crake.couchdb.cloud | SUCCESS | rc=0 >>
{
"db_name": "_dbs",
"update_seq": "11-g2wAAAABaANkABtjb3VjaGRiQGNyYWtlLmNvdWNoZGIuY2xvdWRsAAAAAmEAbgQA_____2phC2o",
"sizes": {
"file": 131281,
"external": 8313,
"active": 9975
},
"purge_seq": 0,
"other": {
"data_size": 8313
},
"doc_del_count": 0,
"doc_count": 7,
"disk_size": 131281,
"disk_format_version": 6,
"data_size": 9975,
"compact_running": false,
"instance_start_time": "0"
}
avocet.couchdb.cloud | SUCCESS | rc=0 >>
{
"db_name": "_dbs",
"update_seq": "15-g2wAAAABaANkABxjb3VjaGRiQGF2b2NldC5jb3VjaGRiLmNsb3VkbAAAAAJhAG4EAP____9qYQ9q",
"sizes": {
"file": 159954,
"external": 8313,
"active": 10444
},
"purge_seq": 0,
"other": {
"data_size": 8313
},
"doc_del_count": 0,
"doc_count": 7,
"disk_size": 159954,
"disk_format_version": 6,
"data_size": 10444,
"compact_running": false,
"instance_start_time": "0"
}
bustard.couchdb.cloud | SUCCESS | rc=0 >>
{
"db_name": "_dbs",
"update_seq": "15-g2wAAAABaANkAB1jb3VjaGRiQGJ1c3RhcmQuY291Y2hkYi5jbG91ZGwAAAACYQBuBAD_____amEPag",
"sizes": {
"file": 159955,
"external": 8313,
"active": 9999
},
"purge_seq": 0,
"other": {
"data_size": 8313
},
"doc_del_count": 0,
"doc_count": 7,
"disk_size": 159955,
"disk_format_version": 6,
"data_size": 9999,
"compact_running": false,
"instance_start_time": "0"
}
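For reference, the _dbs info above comes from each node's node-local interface on port 5986 (separate from the clustered port 5984), fetched on every node via ansible along these lines:

import requests  # pip install requests

# Run on the node itself (e.g. via ansible), because the node-local
# interface on port 5986 typically listens on localhost only.
info = requests.get("http://localhost:5986/_dbs").json()
print(info["db_name"], info["doc_count"], info["sizes"])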