1

[Question posted by a user on YugabyteDB Community Slack]

I have a table with TTL and a secondary index, using YugabyteDB 2.9.0 and I’m getting the following error when I try to insert a row:

SyntaxException: Feature Not Supported

Below is my schema:

CREATE TABLE lists.list_table (
    item_value text,
    list_id uuid,
    created_at timestamp,
    updated_at timestamp,
    is_deleted boolean,
    valid_from timestamp,
    valid_till timestamp,
    metadata jsonb,
    PRIMARY KEY ((item_value, list_id))
) WITH default_time_to_live = 0
    AND transactions = {'enabled': 'true'};

CREATE INDEX list_created_at_idx ON lists.list_table (list_id, created_at)
WITH transactions = {'enabled': 'true'};

We have two types of queries (80% & 20% distribution):

select * from list_table where list_id= <id> and item_value = <value> 
select * from list_table where list_id= <id> and created_at>= <created_at>

We expect per list_id there would be around 1000-10000 entries. The TTL would be around 1 month.

dh YB
  • 965
  • 3
  • 10

1 Answers1

2

It is a restriction, it’s currently not supported to transactionally expire rows using TTL out of a table which are indexed (i.e. atomic expiry of TTL entries in both table and index). There are several workarounds to this:

a) In YCQL, we also support an index with a weaker consistency. This is not well documented today, but you can see the details here: https://github.com/YugaByte/yugabyte-db/issues/1696

The main issue to call out when using this variant of index is that error handling (on INSERT failure), is that it is an application side responsibility to retry the INSERT on failure. As noted in the above issue << If an insert/update or batch of such operations fails, it is the app's responsibility to retry the operation so that the index is consistent. Much like in a 2-table case, it would have been the apps responsibility to retry (in case of a failure between the update to the two tables) to make sure both tables are in sync again. >>

This type of index supports a TTL at the table & index level. (which is recommended to keep the same): https://github.com/yugabyte/yugabyte-db/issues/2481#issuecomment-537177471

b)Another workaround is to use a background cleanup job to periodically delete stale records (instead of using TTL).

c)Avoid using indexes and store data in two tables. one organized by the original primary key and one organized by the index columns you wanted (as the primary key). Both tables can have TTL. But it is an application side responsibility to INSERT to both tables when data is added to the database.

The first table's PK would be ((list_id, item_value)), identical to the current main table. nstead of an index you'll have a second table; the second table's PK would be ((list_id), created_at) and both tables would have a TTL. The application must insert the data into both tables. In the 2nd table you have a choice: (option 1) Duplicate all the columns from the main table here including your JSON columns etc. This makes Q2 lookup fast, the row has everything it needs; but increases your storage requirements. (option 2): In addition to the PK, just store the item_value column in the second table. For Q2, you must first lookup the 2nd table and get the item_value, and then use list_id and item_value and retrieve the data from the main table (much like an index would do under the covers).

d)Another workaround, is if we could avoid the index and pick the PK to be ((list_id, item_value), created_at).

This would not affect the performance of Q1 because with (where list_id and item_value) provided it can use the PK to find the rows. But it would be slower for Q2 where list_id and created_at are provided because while it can still use list_id, it must filter out the data using the created_at value without the help of an index. So if Q2 is really 20% of your queries, you probably do not want to scan 1 to 10k items to find your matching row.

To clarify option (c), with the example in mind: The first table's PK would be ((list_id, item_value)); it is the same as your current main table. Instead of an index you'll have a second table; the second table's PK would be ((list_id), created_at). both tables would have a TTL The application would have to insert entries into both tables. In the 2nd table you have a choice: (option 1) duplicate all the columns from the main table, including your JSON columns etc. This makes Q2 lookup fast, the row has everything it needs; but increases your storage requirements. (option 2): in addition to the Primary Key, just store the item_value column in the second table. For Q2, you must first lookup the 2nd table and get the item_value, and then use list_id and item_value and retrieve the data from the main table (much like an index would do under the covers)

dh YB
  • 965
  • 3
  • 10