Questions tagged [sharding]

Sharding is a technique of partitioning database tables by row ("horizontally"); typically this technique requires a key to be selected that determines how the rows are to be partitioned.

Sharding is a concept in database design; it refers to the technique of physically partitioning a table or collection by row (also known as horizontal partitioning). To partition the data, a shard key (one or more fields) must be defined; the database engine uses it to determine which partition each record belongs to.
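
As a rough illustration of how a key can determine the partition, here is a minimal sketch of hash-based routing in Python. It is not how any particular database implements sharding; the names `shard_for` and `partition_rows`, the `user_id` field, and the choice of MD5 as a stable hash are all assumptions made for the example.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a shard key to a shard index using a stable hash (hypothetical helper)."""
    # A stable digest is used instead of Python's built-in hash(), which is
    # randomized per process for strings and so would not route consistently.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


def partition_rows(rows, key_field, num_shards):
    """Distribute rows (dicts) across num_shards lists by the value of key_field."""
    shards = [[] for _ in range(num_shards)]
    for row in rows:
        index = shard_for(str(row[key_field]), num_shards)
        shards[index].append(row)
    return shards


if __name__ == "__main__":
    # Illustrative data only; "user_id" stands in for whatever shard key is chosen.
    rows = [{"user_id": i, "name": f"user{i}"} for i in range(10)]
    for index, shard in enumerate(partition_rows(rows, "user_id", 3)):
        print(index, [row["user_id"] for row in shard])
```

With plain modulo routing like this, changing the number of shards remaps most keys, which is one reason real systems often use range-based chunks or consistent hashing instead.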

1666 questions
22
votes
2 answers

120 mongodb collections vs single collection - which one is more efficient?

I'm new to mongodb and I'm facing a dilemma regarding my DB Schema design: Should I create one single collection or put my data into several collections (we could call these categories I suppose). Now I know many such questions have been asked, but…
user2297996
  • 1,382
  • 3
  • 18
  • 28
21
votes
4 answers

How does django handle multiple memcached servers?

In the django documentation it says this: ... One excellent feature of Memcached is its ability to share cache over multiple servers. This means you can run Memcached daemons on multiple machines, and the program will treat the group of…
Apreche
  • 30,042
  • 8
  • 41
  • 52
20
votes
3 answers

Sharding GridFS on MongoDB

I'm researching GridFS and the possibility of sharding it across different machines. Reading the documentation here, the suggested shard key is chunks.files_id. This key will be linked to the _id of the files collection, thus this _id is…
ALoR
  • 4,904
  • 2
  • 23
  • 25
20
votes
4 answers

Primary shard is not active or isn't assigned is a known node ?

I am running an elastic search version 4.1 on windows 8. I tried to index a document through java. When running a JUNIT test the error appears as below. org.elasticsearch.action.UnavailableShardsException: [wms][3] Primary shard is not active or…
Prem Singh Bist
  • 1,273
  • 5
  • 22
  • 37
20
votes
5 answers

Auto sharding postgresql?

I have a problem where I need to load a lot of data (5+ billion rows) into a database very quickly (ideally in less than 30 minutes, but quicker is better), and I was recently advised to look into PostgreSQL (I failed with MySQL and was looking at…
Lostsoul
  • 25,013
  • 48
  • 144
  • 239
19
votes
4 answers

what is a good way to horizontal shard in postgresql

What is a good way to shard horizontally in PostgreSQL: 1. pgpool 2 or 2. gridsql? Which is the better way to use sharding? Also, is it possible to partition without changing client code? It would be great if someone could share a simple tutorial or cookbook…
pylabs
  • 31
  • 1
  • 1
  • 6
19
votes
4 answers

How to change the shard key

I know that it is impossible to change a shard key. But if I have set an incorrect shard key, how can I change it?
Bill
  • 191
  • 1
  • 1
  • 3
19
votes
3 answers

Why NoSQL say traditional RDBMS is not good at scalable

I've read some articles saying that an RDBMS such as MySQL does not scale well, but that NoSQL databases such as MongoDB can shard well. I want to know which features of an RDBMS prevent it from sharding well.
shuitu
  • 253
  • 1
  • 2
  • 7
19
votes
1 answer

How to Programmatically Pre-Split a GUID Based Shard Key with MongoDB

Let's say I am using a fairly standard 32 character hex GUID, and I have determined that, because it is randomly generated for my users, it is perfect for use as a shard key to horizontally scale writes to the MongoDB collection that I will be…
Adam Comerford
  • 21,336
  • 4
  • 65
  • 85
17
votes
1 answer

Conceptually, how does database sharding differ from a federation

Can anyone provide the key conceptual differences between: MongoDB Sharding SQL Server Federated instances. They appear to be very similar but I don't know if I've missed anything major. Thanks.
dubs
  • 6,511
  • 4
  • 19
  • 35
17
votes
3 answers

Does RabbitMQ call the callback function for a consumer when it has some message for it?

Does RabbitMQ call the callback function for a consumer when it has some message for it, or does the consumer have to poll the RabbitMQ client? So on the consumer side, if there is a PHP script, can RabbitMQ call it and pass the message/parameters…
jeff musk
  • 1,032
  • 1
  • 10
  • 31
17
votes
4 answers

Clustering, Sharding or simple Partition / Replication

We have created a Facebook application and it got a lot of virality. The problem is that our database started getting REALLY FULL (some tables have more than 25 million rows now). It got to the point that the app just stopped working because there…
albertosh
  • 2,416
  • 7
  • 25
  • 32
16
votes
2 answers

what is the difference in indexing and sharding

What is the difference between indexing and sharding? What is the role of each?
rajan sthapit
  • 4,194
  • 10
  • 42
  • 66
16
votes
1 answer

Why is full text search of MongoDB shards directly much faster than going through the cluster manager (mongos) instance?

I have been very unhappy with full text search performance in MongoDB so I have been looking for outside-the-box solutions. With a relatively small collection of 25 million documents sharded across 8 beefy machines (4 shards with redundancy) I see…
Chris Seline
  • 6,943
  • 1
  • 16
  • 14
16
votes
1 answer

Why MongoDB config servers must be one or three only?

After reading the official documentation for the MongoDB sharding architecture I have not found out why you need to have one or three config servers, and not another number. The MongoDB documentation on Config Servers says: "If one or two config…
sephiroth66
  • 185
  • 1
  • 1
  • 7