Sharding on single machine

Question

I want to set up sharding on my local server. Please help me step by step. I want to run multiple mongod instances on my 16-core CPU. Also, I have a collection of 7 million documents; will any of them go missing when I enable sharding?

I use this script to create 3 shards, each a single-member replica set:

# clean everything up
echo "killing mongod and mongos"
killall mongod
killall mongos
echo "removing data files"
rm -rf /media/mongo/data/config
rm -rf /media/mongo/data/shard*
rm -rf /data/config
rm -rf /data/shard*

# For mac make sure rlimits are high enough to open all necessary connections
ulimit -n 2048

# start a replica set and tell it that it will be shard0
mkdir -p /media/mongo/data/shard0/rs0 
mongod --replSet s0 --logpath "/var/log/mongodb/s0-r0.log" --dbpath /media/mongo/data/shard0/rs0 --port 37017 --fork --shardsvr --smallfiles



sleep 5
# connect to one server and initiate the set
mongo --port 37017 << 'EOF'
config = { _id: "s0", members:[
{ _id : 0, host : "localhost:37017" },
]};
rs.initiate(config)
EOF

# start a replica set and tell it that it will be shard1
mkdir -p /media/mongo/data/shard1/rs0 
mongod --replSet s1 --logpath "/var/log/mongodb/s1-r0.log" --dbpath /media/mongo/data/shard1/rs0 --port 47017 --fork --shardsvr --smallfiles



sleep 5

mongo --port 47017 << 'EOF'
config = { _id: "s1", members:[
{ _id : 0, host : "localhost:47017" },
]};
rs.initiate(config)
EOF

# start a replica set and tell it that it will be shard2
mkdir -p /media/mongo/data/shard2/rs0 
mongod --replSet s2 --logpath "/var/log/mongodb/s2-r0.log" --dbpath /media/mongo/data/shard2/rs0 --port 57017 --fork --shardsvr --smallfiles



sleep 5

mongo --port 57017 << 'EOF'
config = { _id: "s2", members:[
{ _id : 0, host : "localhost:57017" },
]};
rs.initiate(config)
EOF


# now start 3 config servers
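# remove old config server logs (same paths as --logpath below; -f ignores missing files)
rm -f /var/log/mongodb/cfg-a.log /var/log/mongodb/cfg-b.log /var/log/mongodb/cfg-c.log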
mkdir -p /media/mongo/data/config/config-a /media/mongo/data/config/config-b /media/mongo/data/config/config-c
mongod --logpath "/var/log/mongodb/cfg-a.log" --dbpath /media/mongo/data/config/config-a --port 57040 --fork --configsvr --smallfiles
mongod --logpath "/var/log/mongodb/cfg-b.log" --dbpath /media/mongo/data/config/config-b --port 57041 --fork --configsvr --smallfiles
mongod --logpath "/var/log/mongodb/cfg-c.log" --dbpath /media/mongo/data/config/config-c --port 57042 --fork --configsvr --smallfiles


# now start the mongos on port 27018
rm -f /var/log/mongodb/mongos-1.log
sleep 5
mongos --port 27018 --logpath "/var/log/mongodb/mongos-1.log" --configdb localhost:57040,localhost:57041,localhost:57042 --fork
echo "Waiting 60 seconds for the replica sets to fully come online"
sleep 60
echo "Connecting to mongos and enabling sharding"

# add shards and enable sharding on the test db
mongo --port 27018 << 'EOF'
db.adminCommand( { addshard : "s0/"+"localhost:37017" } );
db.adminCommand( { addshard : "s1/"+"localhost:47017" } );
db.adminCommand( { addshard : "s2/"+"localhost:57017" } );
db.adminCommand({enableSharding: "IBSng"});
EOF

sleep 5
echo "Done setting up sharded environment on localhost"

But I don't know how to add a shard key to my collection. Read/write (I/O) speed is very important to me.
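The script above stops at `enableSharding`; the remaining step is to declare a shard key for each collection through mongos. A minimal sketch follows, assuming a hypothetical collection `IBSng.connection_log` and a hypothetical key field `user_id` (neither name appears in the question). The function only prints the mongo shell commands, so they can be reviewed before being piped to the mongos on port 27018.

```shell
# Print the mongo shell commands that shard one collection.
# NS and KEY are placeholders: substitute the real namespace and key field.
NS="IBSng.connection_log"   # hypothetical <database>.<collection>
KEY="user_id"               # hypothetical shard key field

shard_collection_js() {
  cat <<EOF
// a non-empty collection needs an index that starts with the shard key
db.getSiblingDB("${NS%%.*}").getCollection("${NS#*.}").ensureIndex({ "${KEY}": 1 });
sh.shardCollection("${NS}", { "${KEY}": 1 });
sh.status();
EOF
}

# Review the output first, then run it against mongos:
#   shard_collection_js | mongo --port 27018
shard_collection_js
```

The choice of key matters more than the command itself: a key with few distinct values, or one that only ever increases, tends to concentrate writes on a single shard.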

I hope you are using this for a test/trial setup, because setting up shards on a single machine defeats the very purpose of read/write scalability. – vmr Oct 09 '14 at 09:38
Then how can I improve my queries on a single machine? In my case the aggregate query is much slower than in PostgreSQL. Please guide me on how to run queries faster. – user255327 Oct 11 '14 at 05:50
What is the best approach for faster queries on a single powerful server? – user255327 Oct 22 '14 at 08:13

score 0 · Answer 1 · answered Oct 22 '14 at 08:31

Sharding will not improve query performance if all the shards run on a single machine that holds a large number of documents (7 million is not small).

Reason: MongoDB uses memory-mapped files, which means a copy of your data and indexes is kept in RAM, and queries are served from RAM whenever possible. In the current scenario your queries are slow because your data plus indexes are too large to fit in RAM, so there is a lot of I/O activity to fetch data from disk, and that is the bottleneck. Extra mongod instances on the same machine share the same RAM and disk, so they do not remove this bottleneck.
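One way to check this reasoning is to compare data plus index size with physical RAM. The sketch below is plain shell arithmetic. The byte counts are illustrative placeholders, not measurements from the question; in practice they would come from the `dataSize` and `indexSize` fields that `db.stats()` reports in the mongo shell.

```shell
# fits_in_ram DATA_BYTES INDEX_BYTES RAM_BYTES
# Prints "fits" when data + indexes are no larger than RAM, else "exceeds".
fits_in_ram() {
  total=$(( $1 + $2 ))
  if [ "$total" -le "$3" ]; then
    echo "fits"
  else
    echo "exceeds"
  fi
}

GB=$(( 1024 * 1024 * 1024 ))
# Illustrative numbers only: 14 GB of data, 3 GB of indexes, 22 GB of RAM.
fits_in_ram $(( 14 * GB )) $(( 3 * GB )) $(( 22 * GB ))
```

This is only a rough upper bound, since the operating system and other processes also need memory, so less than the full physical RAM is available for the working set.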

What else can be done to improve query performance (in case sharding is not an option):

Increase RAM on your machine
Use indexes
Redesign schema
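For the "Use indexes" item, a quick check is to create an index on the field the query filters on and then read the query plan. This is a sketch under assumptions: the collection `connection_log`, the field `user_id`, and the filter value 42 are hypothetical, and `ensureIndex` is the 2.x shell helper (`createIndex` on 3.0 and later). The function prints the mongo shell commands so they can be reviewed before running.

```shell
COLL="connection_log"   # hypothetical collection name
FIELD="user_id"         # hypothetical field the query filters on

index_check_js() {
  cat <<EOF
var d = db.getSiblingDB("IBSng");
d.${COLL}.ensureIndex({ "${FIELD}": 1 });
// 2.x plans: BtreeCursor = index used, BasicCursor = full collection scan
// 3.0+ plans: IXSCAN = index used, COLLSCAN = full collection scan
printjson(d.${COLL}.find({ "${FIELD}": 42 }).explain());
EOF
}

# Review the output first, then run it (27018 is the mongos port from the
# question's script; use the mongod port for a standalone server):
#   index_check_js | mongo --port 27018
index_check_js
```

If the plan still shows a full scan after the index exists, the query shape or the schema is the more likely culprit.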

Note: even if you implement the above points, there is a limit to how much query performance can improve on one machine; with sharding across several machines, linear scaling of read/write throughput can be expected.

vmr

Thank you very much. But why is it not better to run multiple mongod instances on a multi-core CPU? Also, we have 22 GB of RAM. Have you seen this [post](http://stackoverflow.com/questions/6477613/mongodb-sharding-on-single-machine-does-it-make-sense)? – user255327 Oct 22 '14 at 09:33
If you are using a server-class machine and not an x86 commodity machine, then for 7 million documents the entire working set will fit into RAM. You need not shard at all. The problem might be with the schema of your data (I am presuming you are using indexes), or your query plan might not be optimal. To help us help you, please show us your schema and queries. – vmr Oct 22 '14 at 09:45
OK, I have raised this issue in a few other places. Please see this [page](https://groups.google.com/forum/#!topic/mongodb-user/RcC3EYRT7IY) and this [page](https://groups.google.com/forum/#!topic/mongodb-user/3Cplrcg0_js) – user255327 Oct 22 '14 at 11:51
