While it might be tempting to use tag-aware sharding for this, it's actually neither simple nor very efficient. Here is why:
1) Your range of keys that should live on the "old" shard changes every day. If your cut-off is five days ago, then every day at midnight you will need to update the tag ranges to reflect that it's a new day.
2) As soon as you add the day that was five days ago to the range that belongs on the "old" shard, the balancer will need to migrate that data to the old shard. The problem is that this shard will hold loads of old data, and therefore probably really huge indexes, so writing to it will be much slower. On top of that, reading and removing the five-day-old data from your "active" shard(s) may interfere with the queries on "current" data.
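To make the daily churn described in the two points above concrete, here is a minimal Python sketch of the boundary arithmetic a nightly job would have to repeat. It assumes the shard key is (or starts with) a timestamp and that "old" means whole days up to and including the day five days ago. The function names and the five-day default are hypothetical, and the sketch only computes the dates; it does not talk to MongoDB.

```python
from datetime import date, datetime, timedelta


def day_start(d: date) -> datetime:
    """Midnight at the start of calendar day d."""
    return datetime(d.year, d.month, d.day)


def newly_old_window(today: date, cutoff_days: int = 5):
    """Return the [start, end) slice of data that becomes "old" today.

    This is the day that was cutoff_days ago; it is the chunk of data the
    balancer would have to migrate once the tag range is extended over it.
    """
    start = day_start(today - timedelta(days=cutoff_days))
    return start, start + timedelta(days=1)


def old_range_upper_bound(today: date, cutoff_days: int = 5) -> datetime:
    """Exclusive upper bound of the range tagged for the "old" shard."""
    return newly_old_window(today, cutoff_days)[1]


if __name__ == "__main__":
    # The bound moves forward by exactly one day, every day.
    for offset in range(3):
        today = date(2024, 1, 10) + timedelta(days=offset)
        print(today, "->", old_range_upper_bound(today))
```

Every time that bound moves, the old tag range has to be replaced with the new one (in the shell, with helpers such as sh.addTagRange), and a full day of data becomes eligible for migration at once.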
So it may not be such a great option, although it is a valid one to consider.
I would suggest considering something else: insert the data into this cluster and also into another "archival" replica set, then use a TTL (time to live) index on this cluster to "expire" data once it is older than, say, a week. It's just something to consider if you don't actually need to query older data very often.
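As a rough illustration of the TTL idea, the sketch below builds the document for MongoDB's createIndexes command with an expireAfterSeconds option. The collection name "events", the field name "createdAt", and the helper name are assumptions made for the example; the indexed field has to hold a date value for TTL expiry to work.

```python
def ttl_index_command(collection: str, field: str, max_age_days: int = 7) -> dict:
    """Build a createIndexes command for a single-field TTL index.

    Documents become eligible for deletion once the date stored in `field`
    is more than max_age_days in the past.
    """
    seconds = max_age_days * 24 * 60 * 60
    return {
        "createIndexes": collection,  # the command name must be the first key
        "indexes": [
            {
                "key": {field: 1},
                "name": f"{field}_ttl",  # hypothetical index name
                "expireAfterSeconds": seconds,
            }
        ],
    }


if __name__ == "__main__":
    # "events" and "createdAt" are placeholder names for this example.
    print(ttl_index_command("events", "createdAt"))
```

With PyMongo, a document like this could be passed to db.command(...) on the active cluster, while the archival replica set simply gets no TTL index. Expired documents are removed by a background task, so they disappear shortly after the cut-off rather than at the exact second.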
Another option is to leave things the way they are. If your data is well balanced, it means you are already handling more TPS than you would if you were querying against "old" data. Remember, only data that is actually being used is loaded into physical RAM; if you aren't reading some old data, it will just sit quietly on disk. Just make sure that all your queries use indexes efficiently, because a collection scan can negate what I described in an instant!
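One way to check for collection scans is to look at the winning plan in a query's explain output and see whether any stage is COLLSCAN. The helper below is a small sketch of that check; its name is hypothetical and the two sample plans are hand-written stand-ins for real explain output, not results from an actual server.

```python
def uses_collection_scan(plan) -> bool:
    """Return True if any stage nested in an explain plan is a COLLSCAN.

    Pass the winning plan (explain["queryPlanner"]["winningPlan"]) rather
    than the whole explain document, so rejected plans are not counted.
    The walk is recursive, so nested inputStage / inputStages / per-shard
    plans are all covered.
    """
    if isinstance(plan, dict):
        if plan.get("stage") == "COLLSCAN":
            return True
        return any(uses_collection_scan(value) for value in plan.values())
    if isinstance(plan, list):
        return any(uses_collection_scan(item) for item in plan)
    return False


# Hand-written sample plans for illustration only.
indexed_plan = {
    "stage": "FETCH",
    "inputStage": {"stage": "IXSCAN", "indexName": "createdAt_1"},
}
scanning_plan = {"stage": "COLLSCAN", "direction": "forward"}

if __name__ == "__main__":
    print(uses_collection_scan(indexed_plan))
    print(uses_collection_scan(scanning_plan))
```

A query whose winning plan contains COLLSCAN reads every document, including the old ones, which is exactly what pulls cold data off disk and into RAM.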