I am designing the schema for a geographically distributed application that will be spread across different countries and cities. It has related collections such as:
- Shops (spread across countries and cities)
- The last 15 days of transactions for the shops (older transactions go to a historical store)
etc.
Is it possible to ensure that a shop and its transactions are co-located on the same shard? Currently, each document in the transactions collection stores only the shop's unique _id as a reference.
Suppose I shard the Shop collection with a key such as {region, country, city, shop_id}. Do I then have to store the same fields in the transactions collection (region, country, city, shop_id, instead of just the shop_id) and choose a shard key like {region, country, city, shop_id, tx_id} to ensure that each transaction is placed on the same shard as its shop?
In other words, if a 'child' collection has records logically related to records of a 'parent' collection, does the entire shard key of the 'parent' collection have to be part of the shard key of the 'child' to ensure that they are co-located on the same shard?
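To make the question concrete, here is a rough sketch in plain Python (no database involved) of the two transaction layouts I am comparing. All field values and the helper names `shard_key_value` and `shares_parent_prefix` are made up for illustration; the only thing the sketch checks is whether the child's shard key starts with the parent's shard key.

```python
# Hypothetical shard keys, taken from the question above.
SHOP_SHARD_KEY = ("region", "country", "city", "shop_id")
TX_SHARD_KEY = ("region", "country", "city", "shop_id", "tx_id")

# Example documents; every value here is invented for illustration.
shop = {
    "shop_id": "shop-001",
    "region": "APAC",
    "country": "IN",
    "city": "Bengaluru",
    "name": "Example Store",
}

# Option A: the transaction stores only a reference to the shop.
tx_reference_only = {"tx_id": "tx-1001", "shop_id": "shop-001", "amount": 250}

# Option B: the transaction repeats every shard-key field of the shop.
tx_denormalized = {
    "tx_id": "tx-1001",
    "shop_id": "shop-001",
    "region": "APAC",
    "country": "IN",
    "city": "Bengaluru",
    "amount": 250,
}


def shard_key_value(doc, key_fields):
    """Return the shard key as a tuple, or None if any key field is missing."""
    if any(field not in doc for field in key_fields):
        return None
    return tuple(doc[field] for field in key_fields)


def shares_parent_prefix(parent_key, child_key):
    """True if the child's key exists and begins with the parent's full key."""
    return child_key is not None and child_key[: len(parent_key)] == parent_key


shop_key = shard_key_value(shop, SHOP_SHARD_KEY)
key_a = shard_key_value(tx_reference_only, TX_SHARD_KEY)
key_b = shard_key_value(tx_denormalized, TX_SHARD_KEY)

print("Option A shares the shop's key prefix:", shares_parent_prefix(shop_key, key_a))
print("Option B shares the shop's key prefix:", shares_parent_prefix(shop_key, key_b))
```

Option A cannot even form the compound key, since the region, country and city fields are absent; Option B can, and its key begins with the shop's key. What I do not know is whether sharing that prefix is enough for the two collections to actually land on the same shard, which is the core of my question.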
Thanks and Regards, Archanaa Panda