The docs confirm your understanding of how DynamoDB scales out in the On-Demand capacity mode (emphasis mine):
> Peak Traffic and Scaling Properties
>
> [...] On-demand capacity mode instantly accommodates up to double the previous peak traffic on a table. For example, if your application’s traffic pattern varies between 25,000 and 50,000 strongly consistent reads per second where 50,000 reads per second is the previous traffic peak, on-demand capacity mode instantly accommodates sustained traffic of up to 100,000 reads per second. If your application sustains traffic of 100,000 reads per second, that peak becomes your new previous peak, enabling subsequent traffic to reach up to 200,000 reads per second.
>
> If you need more than double your previous peak on the table, DynamoDB automatically allocates more capacity as your traffic volume increases to help ensure that your workload does not experience throttling. However, throttling can occur if you exceed double your previous peak within 30 minutes. For example, if your application’s traffic pattern varies between 25,000 and 50,000 strongly consistent reads per second where 50,000 reads per second is the previously reached traffic peak, DynamoDB recommends spacing your traffic growth over at least 30 minutes before driving more than 100,000 reads per second.
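To make the rule concrete, here's a minimal sketch of that behavior in Python. The function name and request rates are illustrative; only the "double the previous peak" and "30 minutes" figures come from the quote above.

```python
def may_throttle(requested_rps: float, previous_peak_rps: float) -> bool:
    """On-demand instantly absorbs up to double the previous peak; pushing
    past that within roughly 30 minutes risks throttling (per the docs above)."""
    return requested_rps > 2 * previous_peak_rps

# With a previous peak of 50,000 reads per second:
print(may_throttle(90_000, 50_000))   # False -> absorbed instantly
print(may_throttle(120_000, 50_000))  # True  -> ramp up over 30+ minutes instead
```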
Concerning the strategy of setting the initial peak value for new tables by first deploying them in Provisioned Capacity mode with large RCU/WCU values and then switching them over to On-Demand: that also works. When you switch, DynamoDB sets the starting value for the previous peak to half the provisioned RCUs/WCUs, and since double the previous peak is always supported, you retain the throughput you originally provisioned.
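A sketch of how that could look with boto3 (the table name, key schema, and the 40,000 RCU/WCU figures are just placeholders): create the table in provisioned mode, wait for it to become active, then flip the billing mode to on-demand.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# 1. Create the table in provisioned mode with a high RCU/WCU ceiling.
dynamodb.create_table(
    TableName="my-table",
    AttributeDefinitions=[{"AttributeName": "pk", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "pk", "KeyType": "HASH"}],
    BillingMode="PROVISIONED",
    ProvisionedThroughput={"ReadCapacityUnits": 40_000, "WriteCapacityUnits": 40_000},
)

# Wait until the table is ACTIVE before changing the billing mode.
dynamodb.get_waiter("table_exists").wait(TableName="my-table")

# 2. Switch to on-demand. The previous peak starts at half the provisioned
#    values (20,000 RCU / 20,000 WCU here), and double that is always
#    available, so the original throughput is retained.
dynamodb.update_table(
    TableName="my-table",
    BillingMode="PAY_PER_REQUEST",
)
```

Keep in mind that you can only switch a table between capacity modes once every 24 hours.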
The docs don't explicitly state that this elevated peak persists indefinitely, but they also don't mention scaling back down, and in practice I haven't seen that happen either. In my experience, AWS wouldn't leave a behavior like that out of the docs.
A scale-down is also unlikely given the architecture of DynamoDB, which AWS explains in a really cool tech talk from re:Invent 2018. DynamoDB scales in partitions, and the number of partitions for a table can only increase. Each storage partition is capable of:
- Serving up to 3,000 RCUs
- Serving up to 1,000 WCUs
- Storing 10 GB of data
As soon as any of those limits is reached, a partition split happens - two new partitions are created and the data is distributed among them. This happens as many times as necessary until the newly configured parameters (RCU, WCU, storage) can be accommodated.
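For intuition, here is a rough back-of-the-envelope model based on those per-partition limits. It's a simplification, not an official formula, and the actual partition-placement logic isn't published.

```python
import math

def estimate_partitions(rcu: int, wcu: int, size_gb: float) -> int:
    """Rough lower bound on partition count from the per-partition limits
    above (3,000 RCU, 1,000 WCU, 10 GB). Simplified model, not an AWS API."""
    by_throughput = math.ceil(rcu / 3000 + wcu / 1000)
    by_size = math.ceil(size_gb / 10)
    return max(by_throughput, by_size, 1)

# Example: 12,000 RCU and 4,000 WCU on 150 GB of data
# -> max(ceil(4 + 4), ceil(15)) = 15 partitions at minimum.
print(estimate_partitions(12_000, 4_000, 150))  # 15
```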
It's not stated explicitly, but since you can switch between on-demand and provisioned capacity almost instantly and in either direction, it's fair to assume that the underlying architecture is the same, or at least very similar, with a different billing model on top of it.
Since the number of partitions can only ever go up, it's unlikely that the peak capacity will go down.
That being said: it's not part of the published API and is considered an implementation detail, so there is no guarantee or promise that it will always stay like this.