Are row counts (the number of rows in a column family) now supported in Cassandra for

a) RandomPartitioner?

b) OrderPreservingPartitioner?

http://www.datastax.com/dev/blog/whats-new-in-cassandra-0-8-part-2-counters seems to imply this is easily possible. Quote: "“counting,” we mean here to provide an atomic increment operation in a single column value, as opposed to counting the number of columns in a row, or rows in a column family, both of which were already supported."

Two years ago it was definitely not supported for RandomPartitioner; see "Row count of a column family in Cassandra".

Furthermore, even with OrderPreservingPartitioner it was (and perhaps still is?) a very heavy operation. As far as I understood, I have to retrieve all rows, so this is not a lightweight row-count operation but a read of all the data.

Update: I am fully aware that the new counters feature is completely different from row counts. But the text above implies that row counts are also easily possible and supported, quote: "...both of which were already supported...". Is this marketing language, meaning it is only possible as an extremely heavy operation using get_range_slice? Or is there something new that I am completely missing that does this cheaply for both partitioners?

Thanks

Markus

Markus

3 Answers

Counters and counting the number of rows / columns are two different topics.

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Count-rows-td5420889.html

I would suggest that, as you add new rows to a column family, you simply increment a counter (in a counter CF/row/key) by 1, so you won't have to page through all of the rows (as the link above says, what if you have billions?). This also means you don't have to care which partitioner you use.
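A minimal sketch of that idea, written against a toy in-memory stand-in rather than a real Cassandra client. The class `CountedColumnFamily`, its `insert` method, and the `check_exists` flag are hypothetical names invented for illustration; in a real cluster the increment would be a counter-column add issued alongside the row insert.

```python
class CountedColumnFamily:
    """Toy in-memory stand-in for a column family plus a separate row counter.

    Hypothetical illustration only; this is not a Cassandra client API.
    """

    def __init__(self):
        self.rows = {}       # row key -> dict of columns
        self.row_count = 0   # stands in for a counter column

    def insert(self, key, columns, check_exists=False):
        # With check_exists=True we read before writing. That keeps the
        # count exact but costs a read on every insert.
        if check_exists and key in self.rows:
            self.rows[key].update(columns)
            return
        self.rows.setdefault(key, {}).update(columns)
        self.row_count += 1  # blind increment, like adding +1 to a counter


cf = CountedColumnFamily()
cf.insert("user:1", {"name": "a"})
cf.insert("user:2", {"name": "b"})
cf.insert("user:1", {"name": "c"})  # blind re-insert of an existing key
print(cf.row_count, len(cf.rows))
```

Reading the count is then a single lookup instead of a scan. The trade-off shown by the last insert is that a blind increment counts a re-inserted key twice, so the counter is only exact if keys are never written more than once or if existence is checked first.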

sdolgy
  • To keep an accurate count, however, you would have to check for existence before adding keys and this would hinder a write-only insert process. – libjack Oct 17 '11 at 20:01
Sasha hit the important points. Just wanted to clear this up:

"The text above implies row-counts are also easily possible"

Yes, my answer from Dec 09 is outdated. Counting rows the brute-force way (sequential scan) has been supported on RandomPartitioner for a while now.

jbellis
Row counting in Cassandra is really an anti-pattern. In the rare event that you have to do this, you can use Range Slice Query to iterate through all rows and tally the total. An example using Hector is given here: http://randomizedsort.blogspot.com/2011/10/counting-all-rows-in-cassandra.html
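The paging logic behind such a scan can be sketched without any client library. The following is an illustration under stated assumptions, not the Hector code from the linked post: `ToyRangeStore` and `count_rows` are hypothetical names, and the toy store returns keys in sorted key order with an inclusive start key, whereas a real cluster returns rows in the order defined by its partitioner.

```python
import bisect


class ToyRangeStore:
    """Hypothetical in-memory stand-in for a range-slice style API."""

    def __init__(self, keys):
        self._keys = sorted(set(keys))

    def get_range(self, start_key, limit):
        """Return up to `limit` keys at or after start_key (inclusive)."""
        if start_key is None:
            i = 0
        else:
            i = bisect.bisect_left(self._keys, start_key)
        return self._keys[i:i + limit]


def count_rows(store, page_size=100):
    """Tally all rows by paging through the store page_size keys at a time."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")
    total = 0
    start = None
    while True:
        fetched = store.get_range(start, page_size)
        # After the first page, the first key repeats the previous page's
        # last key, so it must not be counted again.
        new_keys = fetched if start is None else fetched[1:]
        total += len(new_keys)
        if len(fetched) < page_size:
            return total  # a short page means the end was reached
        start = fetched[-1]


store = ToyRangeStore("key%03d" % i for i in range(250))
print(count_rows(store, page_size=100))
```

Every key is still read once, so the cost grows linearly with the number of rows; paging only bounds how much is held in memory per request.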

Nehc