A retention policy drops entire chunks, and chunks are defined by time intervals, so it makes no sense to define the policy by size rather than by time. The policy drops a chunk only once the entire chunk is older than the given interval. So if the chunk interval is 7 days and the retention policy is 3 days, the oldest data at the moment of a drop will be 10 days old (the dropped chunk contains data from 10 to 3 days old). Internally, chunks are represented by tables, so dropping a chunk means dropping a table, which is the most efficient way to delete data in PostgreSQL. Deleting row by row is much more expensive than dropping or truncating a table, and it doesn't free the space until VACUUM is run.
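The 7-day/3-day arithmetic can be checked with a small simulation. This is only a sketch in plain Python, not TimescaleDB code: the function name `droppable_chunks` and the use of whole day numbers are assumptions made for illustration, and the rule it encodes (a chunk qualifies once its end is at or before the cutoff) is the behaviour described above.

```python
def droppable_chunks(chunk_starts, chunk_days, drop_after_days, now_day):
    """Return start days of chunks that lie entirely before the retention cutoff."""
    cutoff = now_day - drop_after_days
    # A chunk covers [start, start + chunk_days); it qualifies only when
    # its end is at or before the cutoff, i.e. every row in it is old enough.
    return [start for start in chunk_starts if start + chunk_days <= cutoff]


chunk_starts = [0, 7, 14]  # three consecutive 7-day chunks (hypothetical data)

# Day 9: the first chunk still holds rows younger than 3 days, so nothing is dropped.
print(droppable_chunks(chunk_starts, 7, 3, now_day=9))   # []
# Day 10: the first chunk qualifies; its oldest row (day 0) is now 10 days old.
print(droppable_chunks(chunk_starts, 7, 3, now_day=10))  # [0]
```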
TimescaleDB expects you to know your application's load well enough to estimate the desired size as a time interval.
The time dimension column is not required to have a time type; it can also be a number. What matters is that the column increases over time and that it is clear how to use it in queries and when defining the chunk size. It is therefore possible to use a counter as the time dimension column and increment it for each row by 1 or by the row size. Note that synchronizing the counter can become a bottleneck.
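To see why incrementing by row size makes chunks roughly size-bounded, here is a minimal sketch in plain Python. `SizeCounter`, `chunk_index` and the 1000-byte chunk interval are hypothetical names and values chosen for illustration; in a real setup the counter would live in the database or the application and would need to be shared between writers, which is where the bottleneck mentioned above comes from.

```python
class SizeCounter:
    """Hypothetical counter that advances by the size of each inserted row."""

    def __init__(self):
        self.value = 0

    def next_for(self, row_bytes):
        # The row gets the current position; the counter then moves past it.
        position = self.value
        self.value += row_bytes
        return position


def chunk_index(position, chunk_interval):
    """Index of the chunk that a given counter position falls into."""
    return position // chunk_interval


counter = SizeCounter()
row_sizes = [400, 700, 300, 900]  # made-up row sizes in bytes
positions = [counter.next_for(size) for size in row_sizes]

print(positions)                                  # [0, 400, 1100, 1400]
print([chunk_index(p, 1000) for p in positions])  # [0, 0, 1, 1]
```

With this scheme a chunk interval of 1000 means each chunk starts a new range after about 1000 bytes of rows, so an integer retention interval effectively becomes a size limit.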
It is also possible to write a user-defined action, which lets you define your own action that runs on a regular schedule as a custom policy.
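The decision logic of such a size-based custom policy could look roughly like the sketch below. It is plain Python and only illustrates the selection step: the function `chunks_to_drop`, the tuple layout and the sample numbers are assumptions, and reading chunk sizes from the database or actually dropping the chunks is not shown.

```python
def chunks_to_drop(chunks, max_total_bytes):
    """Pick the oldest chunks to drop until the rest fits into max_total_bytes.

    chunks: list of (name, range_end, size_bytes) tuples.
    """
    total = sum(size for _, _, size in chunks)
    to_drop = []
    # Walk from the oldest chunk to the newest, dropping while over budget.
    for name, _, size in sorted(chunks, key=lambda chunk: chunk[1]):
        if total <= max_total_bytes:
            break
        to_drop.append(name)
        total -= size
    return to_drop


# Hypothetical chunks: (name, end of time range, size in bytes)
chunks = [("c3", 21, 500), ("c1", 7, 400), ("c2", 14, 300)]

print(chunks_to_drop(chunks, max_total_bytes=900))  # ['c1']
print(chunks_to_drop(chunks, max_total_bytes=500))  # ['c1', 'c2']
```

Because whole chunks are dropped, the remaining size can end up well below the limit; smaller chunks give finer control at the cost of more tables.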
Summary of the three possible solutions:
- Give a good estimate of the chunk size as a time interval, which is the approach TimescaleDB expects.
- Define a numerical time dimension column with a counter-like implementation.
- Write a custom policy using a user-defined action.