How to decide when to use replicate sets for mongodb in production

Question

We are currently hosting the MongoDB using its official docker image in ec2, for our production environment, its 32gb memory server dedicated to just this service.

How can using replica sets help us in the improvement of the performance of our MongoDB, we are currently facing that the response for queries is getting slower day by day.

Are there any measures through which we can determine that investing in the replica set will provide worthy benefits as well and will not be premature optimization.

Replica set is for high availability. It is not for scaling. If your queries are getting slower with more data, it's more likely indexing issues. — kevinadi, Jan 09 '20 at 05:14
The queries are performing IXSCAN so I guess its the best we can achive, can replica sets help us by dividing the load on a single mongo node to the other replicas? — Raj Saraogi, Jan 09 '20 at 06:09
Even though it's doing IXSCAN, it might not be the best index so it can still be slow. I suggest you open another question describing your queries, your indexes, and the explain() output of the queries. — kevinadi, Jan 09 '20 at 20:48

score 1 · Answer 1 · answered Jan 09 '20 at 06:16

MongoDB replication is a high availability solution (see note at the end of the post for more details on Replication). Replication is not a performance improvement solution.

MongoDB query performance depends upon various factors: size of collection, size of document, database design, query definition and indexes. Inadequate hardware (memory, hard drive, cpu and network) can affect the query performance. The number of operations at a given time can also affect the performance.

For faster query performance the main consideration is using indexes. Indexes affect directly the query filter and sort operations. To find if your query is performing optimally and using the proper indexes generate a query plan using the explainwith "executionStats" mode; study the plan. Explain can be run on MongoDB find, update, delete and aggregation queries. All these queries can benefit from indexes. See Query Optimization.

Adding capabilities to the existing hardware is known as vertical scaling; and replication is not vertical scaling.

Replication:

This is configured as a replica-set - a primary node and multiple secondary nodes. The primary is the main point of contact for application - all writes happen on the primary, (and reads, by default). The data written to the primary is replicated to the secondaries. This way data redundancy is accomplished. When the primary goes down one of the secondaries takes over as primary and keep the system running via a failover process. Data durability, high availability, redundancy and failover are the man concepts with replication. In MongoDB a replica-set cluster can have up to fifty nodes.

Yes, I understand that, can the replica sets be used for the read queries, like by using this can we divert our write queries to master node and read queries to replica nodes, if this can be done will the performance improve? — Raj Saraogi, Jan 09 '20 at 06:45
You can specify _Read Preference_ for queries and it can direct the reads to secondary nodes. I think it is not meant for query performnace; see [Read Preference Use Cases](https://docs.mongodb.com/manual/core/read-preference/index.html#use-cases). — prasad_, Jan 09 '20 at 07:09
Yeah, the query performance will not improve I got that, but can it reduce the load on the master node who is facing a lot of write operations will this lead to performance operations? — Raj Saraogi, Jan 09 '20 at 09:07
But, there are write operations going on on the secondaries too (the replication is a write). You should generally follow the _Read preference use cases_ link in the previous comment. — prasad_, Jan 09 '20 at 09:27

score 0 · Answer 2 · answered Jan 11 '20 at 18:15

It is recommended to use replica-set in production due to HA functionality.

As a result of source limits on one hand and the need of HA in production on the other hand, I would suggest you to create a minimal replica-set which will consist of Primary, Secondary and an Arbiter (an arbiter does not contain any data and is very low memory consumer).

Also, Writes typically effect your memory performance much more than reads. In order to achieve better write performance I would advice you to create more shards (the more masters you have, the more writes you can handle at the same time).

However, I'm not sure what case your mongo's performance to slow so fast. I think you should:

Check what is most effect your production's performance (complicated queries or hard writes).
Change your read preference to "nearest".
Consider to disable Read Concern "majority" (remember that by default there is a write "majority" concern. Members should be up to date).
Check for a better index.
And of curse create a replica-set!

Good Luck! :P

How to decide when to use replicate sets for mongodb in production

2 Answers2