
We have set an expiry (TTL) on columns in Bigtable. Over time, the number of rows that hold no data (only keys) has increased. I am looking for an efficient way to delete these empty rows from a table.

For example:

key: key1 column1: value1 (ttl 1 day) column2: value2 (ttl 1 day)

In my use case, once both of these values are garbage collected, the key no longer has any importance, so the key itself should be eligible for garbage collection.

deep

1 Answer


You can use the cbt command-line tool, as described in the Cloud Bigtable CLI documentation:

Delete a row:

Example: cbt deleterow <table-id> <row-key> app-profile=<app-profile-id>

Delete all rows:

cbt deleteallrows <table-id>

Optionally, you can use the Cloud Bigtable Client Libraries, as per the documentation.
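As a rough illustration of the client-library route, here is a minimal Python sketch that deletes a list of rows by key. It assumes the `google-cloud-bigtable` package and its `table.direct_row()`, `row.delete()` and `row.commit()` calls; the helper names `delete_rows` and `open_table` are hypothetical, and you would still need your own way of deciding which keys to delete.

```python
def delete_rows(table, row_keys):
    """Delete every row whose key is listed in row_keys.

    `table` is expected to behave like google.cloud.bigtable.table.Table.
    Returns the number of delete mutations that were committed.
    """
    deleted = 0
    for key in row_keys:
        row = table.direct_row(key)  # mutation handle for this row key
        row.delete()                 # mark the whole row for deletion
        row.commit()                 # send the mutation to Bigtable
        deleted += 1
    return deleted


def open_table(project_id, instance_id, table_id):
    """Hypothetical helper: open a table with the Python client library."""
    # Imported lazily so delete_rows() can be exercised without the package.
    from google.cloud import bigtable

    client = bigtable.Client(project=project_id)
    return client.instance(instance_id).table(table_id)
```

Usage would look something like `delete_rows(open_table("my-project", "my-instance", "your-table"), [b"key1"])`, where the project and instance IDs are placeholders.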


Also check out the documentation on Cloud Bigtable garbage collection, which is the automatic, ongoing process of removing expired and obsolete data from Cloud Bigtable tables.

Note that it can take up to a week for data to be garbage-collected, so you should never rely solely on garbage-collection policies to ensure that read requests return the desired data.

Edit1:

Garbage collect based on age

You can use the cbt command-line tool to set the maximum age (1 day) for data in a column family.

cbt createfamily your-table cf1

cbt setgcpolicy your-table cf1 maxage=1d
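If you have several column families, the same cbt call can be scripted. The following is only a sketch: it assumes cbt is installed and already configured with your project and instance, and the helper names `gc_policy_command` and `apply_gc_policies` are made up for illustration.

```python
import subprocess


def gc_policy_command(table_id, family, max_age_days):
    """Build the argument list for `cbt setgcpolicy ... maxage=<N>d`."""
    if max_age_days <= 0:
        raise ValueError("max_age_days must be positive")
    return ["cbt", "setgcpolicy", table_id, family, f"maxage={max_age_days}d"]


def apply_gc_policies(table_id, families, max_age_days, run=subprocess.run):
    """Run the setgcpolicy command once per column family.

    `run` can be swapped out (for example in tests) to avoid calling cbt.
    """
    for family in families:
        run(gc_policy_command(table_id, family, max_age_days), check=True)
```

For instance, `apply_gc_policies("your-table", ["cf1"], 1)` would run the same command as shown above.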

More information about configuring garbage collection can be found in the documentation.

sllopis
  • Thanks, I got to know about the different policies, but most of them are about columns. I just want to understand whether we can set a policy for the row key as well. For example: if no cell in a row has held any value for over a day, just garbage collect the row. – deep Apr 28 '20 at 10:24
  • Just updated my answer. Before configuring garbage collection, make sure that you are familiar with the [garbage collection overview](https://cloud.google.com/bigtable/docs/garbage-collection). Don't forget that you can also use the Cloud Bigtable Client Libraries to set and/or update garbage collection policies. If you don't need to keep old data, or old versions of your current data, using garbage collection can help you minimize the size of each row. – sllopis Apr 28 '20 at 13:45
  • No, you can't, because garbage-collection policies are set at the column-family level. – sllopis Apr 28 '20 at 13:53