0

I am very new to Cosmos DB (DocumentDB). While going through the documentation, I keep reading that DocumentDB is schema-free, but I feel that a collection is analogous to a schema and that both are logical views.

Wikipedia defines schema as: 'The term "schema" refers to the organization of data as a blueprint of how the database is constructed'. I believe a collection is the same: it is the organization of documents, stored procedures, triggers, and UDFs.

So my question is: how is a schema different from a collection?

lambodar
  • 2
    A collection does not enforce a specific kind of document the way a schema does. So one document collection can hold documents with user data, documents with log data, or anything else. A collection is merely a set of possibly unrelated documents. A collection is like a database: a database can have multiple (possibly unrelated) tables, but the database itself does not describe or enforce what data is stored. – Peter Bons May 29 '17 at 07:38
  • thanks @PeterBons sounds logical! – lambodar May 29 '17 at 07:46

2 Answers

4

Collections really have nothing to do with schema. They are just an organizational construct for documents. With Cosmos DB, they serve as:

  • a transaction boundary. Within a collection, you can perform multiple queries / updates within a transaction, utilizing stored procedures. These updates are constrained to a single collection (more specifically, to a single partition within a collection).
  • a billing/performance boundary. Cosmos DB lets you specify the number of Request Units (RU) / second to allocate to a collection. Every collection can have a different RU setting. Every collection has a minimum cost (due to minimum amount of RU that must be allocated), regardless of how much storage you consume.
  • a server-side code boundary. Stored procedures, triggers, etc. are uploaded to a specific collection.

Whether you choose to create a single collection per object type, or store multiple object types within a single collection, is entirely up to you, and is unrelated to the shape of your data.
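As a minimal sketch of the "multiple object types in one collection" option, the snippet below models a collection as a plain Python list of dicts. It is not a Cosmos DB client, and the `type` field and the `documents_of_type` helper are hypothetical names chosen for illustration, not anything the database requires.

```python
# In-memory stand-in for a collection: a plain list of dicts, not a Cosmos DB client.
# The "type" field is an assumed, application-level discriminator.
collection = [
    {"id": "1", "type": "user", "name": "John"},
    {"id": "2", "type": "log", "message": "login ok", "level": "info"},
    {"id": "3", "type": "user", "name": "King", "email": "king@asfaf"},
]


def documents_of_type(docs, doc_type):
    """Return the documents whose (hypothetical) `type` field equals doc_type."""
    return [d for d in docs if d.get("type") == doc_type]


users = documents_of_type(collection, "user")
logs = documents_of_type(collection, "log")
```

Nothing in the list itself constrains what a document looks like; the grouping by kind lives entirely in application code.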

David Makogon
3

Schemas in relational databases differ from schemas in document databases. In simple terms, a relational schema is stricter than a document schema: records in an RDBMS table must strictly adhere to the table's schema, whereas a document collection allows some flexibility in what each document stores.

Conventionally, a collection is a set of documents that follow the same schema. But document DBs don't stop you from storing documents with different schemas in a single collection. That is the flexibility they give users.

Let us take an example. Assume we are storing some customer information. In a relational DB, we might have a structure like

Customer ID INT
Name        VARCHAR(50)
Phone       VARCHAR(15)
Email       VARCHAR(255)

Depending on whether a customer has an email or a phone number, those columns hold proper values or nulls.

ID, Name, Phone, Email
1, John, 83453452, -
2, Victor, -, -
3, Smith, 34535345, smith@jjjj

In document databases, however, fields need not appear in a document at all if they don't have any values.

[
{
  "id": "123",
  "name": "John",
  "phone": "2572525"
},
{
  "id": "456",
  "name": "Stephen"
},
{
  "id": "789",
  "name": "King",
  "phone": "2572525",
  "email": "king@asfaf"
}
]

However, for maintainability it is advisable to stick to a schema in document DBs, even though they allow documents of any shape to be stored in a collection.
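Since the database will not enforce a shape, one way to "stick to a schema" is to check it in application code. The following is only a sketch under stated assumptions: `REQUIRED_FIELDS`, `missing_fields`, and `contact_line` are hypothetical names, and treating `id` and `name` as required while `phone` and `email` stay optional is an assumed convention based on the customer example above.

```python
# Assumed application-level convention: every customer document has these fields.
REQUIRED_FIELDS = ("id", "name")

# The customer documents from the example above, as plain Python dicts.
customers = [
    {"id": "123", "name": "John", "phone": "2572525"},
    {"id": "456", "name": "Stephen"},
    {"id": "789", "name": "King", "phone": "2572525", "email": "king@asfaf"},
]


def missing_fields(doc, required=REQUIRED_FIELDS):
    """List the required fields that are absent from a document."""
    return [field for field in required if field not in doc]


def contact_line(doc):
    """Format a customer, tolerating absent optional fields."""
    phone = doc.get("phone", "n/a")
    email = doc.get("email", "n/a")
    return f"{doc['name']}: phone={phone}, email={email}"


# Keep only documents that satisfy the assumed convention.
valid_customers = [c for c in customers if not missing_fields(c)]
```

Here the optional fields are read with `dict.get`, so a document without `phone` or `email` is still handled, while a document lacking `id` or `name` would be reported by `missing_fields` before it is written or used.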

Ravi Chandra
  • Thanks for the answer @Ravi, I appreciate your effort. Coming to the point: forget about relational DBs. I have worked with Cassandra, which is schema-based, and the point you made about null values is just a choice in NoSQL and has nothing to do with collections or schemas, as far as I understand. – lambodar May 29 '17 at 07:36
  • Conventionally a collection is a set of documents which follows the same schema. But document DBs don't stop one from storing documents with different schema in a single collection. It is the flexibility it gives to the users. – Ravi Chandra May 29 '17 at 07:39
  • 3
    I disagree that a collection conventionally contains documents with the same schema. There's no such convention. Perhaps people have been taught to do this (e.g. one collection per object type), but there's no rule for this, nor a convention. – David Makogon May 29 '17 at 12:31
  • 2
    Agree with @David. You can store any type of documents in a single collection and use a Type/Repository pattern by implementing a base `type` attribute in documents. Here is a [related question & answer](https://stackoverflow.com/a/27466029/5641598) regarding this scenario. – Matias Quaranta May 29 '17 at 15:07