If you want to partition away cold data then a leading boolean index column is a valid way to do that. That column must be added to all indexes that you want to hot/cold partition. This includes the clustered index. This is quite awkward. The query optimizer requires that you add a dummy predicate where IsActive IN (0, 1)
to make it able to still seek on such indexes. Of course, this will now also touch the cold data. So you probably need to know the concrete value of IsActive
or try the 1
value first and be sure that it matches 99% of the time.
Depending on the schema this can be impractical. I have never seen a good case for this but I'm sure it exists.
A different way to do that is to use partitioning. Here, the query optimizer is used to probing multiple partitions anyway but again you don't want it to probe cold data. Even if it does not find anything this will pull pages into memory making the partitioning moot.
The historical table idea (e.g. HistoricalOrders
) is the same thing in different clothes.
So in order to make this work you need:
- Modify all indexes that you are about (likely all), or partition, or create a history table.
- Have a way to almost never need to probe the cold partition.
I think (2) kills it for most cases.
Among the 3 solutions I'd probably pick the indexing solution because it is simplest. If you are worried about people making mistakes by writing bad queries all the time, I'd pick a separate table. That makes mistakes hard but makes the code quite awkward.
Note, that many indexes are already naturally partitioned. Indexes on the identity column or on an increasing datetime column are hot at the end and cold elsewhere. An index on (OrderStatus INT, CreateDateTime datetime2)
would have one hot spot per order status and be cold otherwise. So those are already solved.
Some further discussion.