From the Cloud Bigtable schema design documentation: Grouping data into column families allows you to retrieve data from a single family, or multiple families, rather than retrieving all of the data in each row. Group data as closely as you can to get just the information that you need, but no more, in your most frequent API calls.

In my use case, I can either put all the data into a single column family (the current access pattern retrieves all fields), or split it into, say, 3 logical column families and always specify those families when querying. Is there any performance difference between these two designs? Which design is recommended?

mmziyad

1 Answer

In your case, there isn't a performance advantage either way. I would use the 3 logical column families so that you have cleaner code.
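
As a rough illustration of why the two designs behave the same for this access pattern, here is a toy in-memory model of a single row. It is not the Bigtable client API: the `read_row` helper, the family names (`profile`, `stats`, `prefs`), and the qualifiers are all hypothetical and exist only to show which cells come back.

```python
# Toy in-memory model of one Bigtable row: family -> qualifier -> value.
# This is NOT the Bigtable client; all names below are hypothetical.
from typing import Dict, Iterable, Optional

Row = Dict[str, Dict[str, bytes]]

ROW: Row = {
    "profile": {"name": b"alice", "email": b"alice@example.com"},
    "stats": {"logins": b"42"},
    "prefs": {"theme": b"dark"},
}


def read_row(row: Row, families: Optional[Iterable[str]] = None) -> Row:
    """Return the cells of `row`, optionally restricted to some families."""
    if families is None:
        # No family restriction: every cell in the row comes back.
        return {fam: dict(cols) for fam, cols in row.items()}
    wanted = set(families)
    return {fam: dict(cols) for fam, cols in row.items() if fam in wanted}


everything = read_row(ROW)
all_three = read_row(ROW, ["profile", "stats", "prefs"])
only_stats = read_row(ROW, ["stats"])

# Naming every family returns the same cells as naming none.
print(everything == all_three)
# A narrower read returns fewer cells, which is where family grouping helps.
print(sorted(only_stats))
```

If you always ask for all three families, you read the same cells as an unfiltered read, so the grouping only starts to matter once some calls need a subset.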

Solomon Duskis
  • In the case of logical grouping of column families, is it a good practice to query only using the row-key, without specifying column families? As I mentioned the primary access pattern is to retrieve most of the fields. – mmziyad May 15 '17 at 12:08
  • It really depends on the situation. You're not likely to see a performance hit either way if the row is small. You can do some experimentation, and take a look at the metrics provided by https://cloud.google.com/bigtable/docs/hbase-metrics. – Solomon Duskis May 15 '17 at 14:15