Ways to organize row-keys for range scans in Cassandra

Question

I am trying to find a good way to organize my row-keys to perform range scans on them without creating my own index lists.

I have a MySQL server with currently about 15,000 databases of ~50 tables each, roughly 750,000 tables in total. Because 99% of the data is always read by a unique identifier, I plan to move that data into a Cassandra cluster.

For some maintenance cases (listing the contents of a complete table, removing a complete table, or dropping a database) I need to get the contents of a complete table or even a whole database. Range scans seem to be the perfect fit for that.

Currently I am planning to generate UUIDs for each part of the old structure and put them together separated by a | (DB + Table + Id = UUID1|UUID2|UUID3).

Example:

07424eaa-4761-11e1-ac67-12313c033ac4|0619a6ec-4525-11e1-906e-12313c033ac4|0619a6ec-4795-12e9-906e-78313c033ac4

The CF with the data should be sorted with org.apache.cassandra.db.marshal.AsciiType.

As the client I am using phpcassa.

For the range scans I want to use UUID| as the start key and, as the end of the range, the same key with chr(255) or z appended to it. The ASCII value of either character is greater than that of any UUID character that can follow in those keys.

Is this a solid approach that allows me to achieve the explained goals for the range scans?

score 5 · Accepted Answer · answered Jan 31 '12 at 20:49

Cassandra best practice is to use the RandomPartitioner: this gives you 'free' load balancing, as long as your tokens are evenly distributed. Unfortunately, with the random partitioner, row range queries (i.e. get_range_slices) return keys in a random order.

This is fine for paging through the entire column family (and if that is what you want to do, then your approach will work). But if you just want to page through a smaller, contiguous range of row keys, it will not work.
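To illustrate why a contiguous key range falls apart, here is a minimal Python sketch. It assumes the partitioner orders rows by the MD5 hash of the row key; the token helper, the prefix_runs helper and the shortened keys are made up for illustration and are not Cassandra code.

```python
import hashlib

def token(key: str) -> int:
    # Assumption: approximate the random partitioner by ordering rows
    # on the MD5 digest of the row key.
    return int(hashlib.md5(key.encode("ascii")).hexdigest(), 16)

# Hypothetical shortened keys: two "databases" with 20 rows each.
keys = [f"{db}|tbl|{i:04d}" for db in ("af-07-01", "b2-11-9c") for i in range(20)]

def prefix_runs(ordered):
    # Number of contiguous blocks that share the same database prefix.
    prefixes = [k.split("|")[0] for k in ordered]
    return 1 + sum(a != b for a, b in zip(prefixes, prefixes[1:]))

by_key = sorted(keys)               # order-preserving layout
by_token = sorted(keys, key=token)  # hashed layout

print(prefix_runs(by_key))    # 2: each database is one contiguous block
print(prefix_runs(by_token))  # more than 2: the databases are interleaved
```

In the hashed ordering, the rows between any two keys are an arbitrary mix of both prefixes, which is why a start/end key pair cannot select one database.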

One option to solve this is to use wide rows and composite columns. For example, a column family which looks like this:

{ 
  row1 -> {column1: value1, column2: value2},
  row2 -> {column3: value3, column4: value4},
  ... 
}

Would be transposed to look like this:

{
  row1-10 -> {
              [row1, column1]: value1, [row1, column2]: value2,
              [row2, column3]: value3, [row2, column4]: value4,
              ...
             }
  ...
}

And you can do a range query by doing a column slice (get_slice) on the right row, between the right columns, i.e.

get_range_slices(start=row1, end=row2)

becomes:

get_slice(row=row1-10, start=[row1, null], end=[row2, null])

Note the null second dimension on the column keys.
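As a rough illustration of the transposed layout, the following Python sketch models one wide row as a sorted list of composite column names and slices it with bisect. It is only an in-memory stand-in: the get_slice function here is a hypothetical helper, a one-element tuple plays the role of [row1, null], and the end bound is treated as exclusive, which may differ from how Cassandra handles it.

```python
from bisect import bisect_left

# One wide row ("bucket"); columns are kept sorted by their composite name.
bucket = sorted([
    (("row1", "column1"), "value1"),
    (("row1", "column2"), "value2"),
    (("row2", "column3"), "value3"),
    (("row2", "column4"), "value4"),
    (("row3", "column5"), "value5"),
])

def get_slice(columns, start, end):
    # Return the columns whose name lies in [start, end).
    names = [name for name, _ in columns]
    return columns[bisect_left(names, start):bisect_left(names, end)]

# ("row1",) sorts before every ("row1", <anything>) name, so it marks the
# beginning of that logical row, much like the null second dimension.
result = get_slice(bucket, ("row1",), ("row2",))
print(result)
# [(('row1', 'column1'), 'value1'), (('row1', 'column2'), 'value2')]
```

Because the columns inside a row are stored in sorted order, this kind of slice stays contiguous regardless of how the row keys themselves are partitioned.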

The trick is to pick your row ('bucket') keys such that your rows won't grow too wide (very wide rows perform badly on normal Cassandra), but that your queries won't need to fetch too many rows. This will depend on your average query size and the distribution of your UUIDs, but a good candidate might be to use UUID1 as the row key and [UUID2, UUID3] as the first dimensions of the column keys.
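A small Python sketch of that bucketing, assuming the UUID1|UUID2|UUID3 key format from the question. The store dictionary and the split_key, put and list_table helpers are hypothetical, and the short keys stand in for real UUIDs.

```python
from collections import defaultdict

def split_key(full_key: str):
    # "UUID1|UUID2|UUID3" -> row key UUID1, composite column name (UUID2, UUID3)
    db, table, item = full_key.split("|")
    return db, (table, item)

# In-memory stand-in: row key (database) -> {(table, id): value}
store = defaultdict(dict)

def put(full_key, value):
    db, column = split_key(full_key)
    store[db][column] = value

def list_table(db, table):
    # All columns of one row whose first component is the given table.
    return sorted((col, v) for col, v in store[db].items() if col[0] == table)

put("db1|t1|a", 1)
put("db1|t1|b", 2)
put("db1|t2|a", 3)
put("db2|t1|a", 4)

print(list_table("db1", "t1"))  # [(('t1', 'a'), 1), (('t1', 'b'), 2)]
del store["db1"]                # "dropping a database" removes one row
print(sorted(store))            # ['db2']
```

With this split, listing a table is a slice within one row and dropping a database is a single row deletion, at the cost of each database living in one potentially wide row.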

Thank you for confirming that the approach is so far solid. The sort ordering is of no concern for me, only having access to some ranges is important. Is using the chr(255) or z character as a little helper to find a range a good idea? (shortened example: af-07-01|39-ef-98|12-52-98 … to get all keys beginning with the first part, the start_key would be af-07-01| and the end_key af-07-01|z) — favo, Feb 01 '12 at 19:31
If you require access to a sub range (and not the entire CF) as you say then your approach will _not_ work. All the rows in the CF are in a random order, so picking a range between your two keys will not only return keys in a random order, but will return keys not "logically" in that range, as all the keys are in random order. — tom.wilkie, Feb 02 '12 at 19:48
Thank you for getting back to this question! So the approach still needs a manual index or splitting the data into chunks. What about working with more ColumnFamilies? Let's say I create 15,000 ColumnFamilies to provide an easier level of administration (one for each database). Can this be a workable alternative approach? From what I read this will likely be a big memory problem because of how Cassandra assigns memory for each CF. Is this still the case in the current version(s)? — favo, Feb 05 '12 at 15:42
