
I am trying to store NetFlow packets that I receive from a NetFlow probe in Cassandra. I want to store each packet in a separate row for efficiency. Can someone suggest a rowkey that has enough precision for storing NetFlow packets? I was thinking of using some time function. Is that precise enough to avoid collisions between packets? I am using the libQtCassandra library to access Cassandra. Thanks.

bnsk

1 Answer


You can generate the timestamp outside Cassandra with whatever time function your platform offers, at whatever precision it supports, and simply insert the value as the rowkey. Most platforms provide functions to get the time at millisecond precision.

On Linux-based systems, you can use the Unix timestamp at millisecond precision as your rowkey. Your rowkey would then probably be a LongType.
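As a minimal sketch of getting such a value in C++ (the language libQtCassandra is written in), the standard `<chrono>` header is enough. The function name `unix_millis_now` is made up for illustration, and how the resulting 64-bit integer is handed to libQtCassandra is not shown here.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical helper: milliseconds since the Unix epoch as a signed 64-bit
// integer, which matches the width of a Cassandra LongType value.
// Note: system_clock is only guaranteed to count from the Unix epoch since
// C++20, although common implementations already did so before that.
std::int64_t unix_millis_now() {
    using namespace std::chrono;
    return duration_cast<milliseconds>(
               system_clock::now().time_since_epoch())
        .count();
}
```

Calling `unix_millis_now()` once per received packet gives a candidate rowkey; whether it is unique depends on how many packets arrive within the same millisecond.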

On a different note, will your model have "skinny rows" or "wide rows"? You don't want to distribute your data across too many rows, because you can't scan rows by key range (at least not with a hash-based partitioner such as the default one). Maybe you can consider a model where the time truncated to seconds is your rowkey, and the specific millisecond within that second is the column name, which then points to the actual value.

Something like:

unix_timestamp_in_seconds => [ { millisecond_count: value}, { millisecond_count: value}, ...]

Of course, here I am assuming millisecond precision is sufficient. If you need microsecond precision, then it really comes down to your platform.
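The seconds/milliseconds layout described above could be derived from a single millisecond timestamp roughly like this. This is only a sketch: the `FlowKey` struct and `split_timestamp` function are hypothetical names, and the code assumes a non-negative Unix timestamp in milliseconds.

```cpp
#include <cstdint>

// Hypothetical pair of values for the wide-row layout:
// the rowkey is the Unix time in whole seconds, and the column name
// is the millisecond offset (0..999) within that second.
struct FlowKey {
    std::int64_t row_key_seconds;
    std::int32_t column_millis;
};

// Split a non-negative Unix timestamp in milliseconds into the two parts.
constexpr FlowKey split_timestamp(std::int64_t unix_millis) {
    return FlowKey{unix_millis / 1000,
                   static_cast<std::int32_t>(unix_millis % 1000)};
}
```

For example, `split_timestamp(1700000000123)` yields the rowkey `1700000000` and the column name `123`. Keep in mind that two packets arriving in the same millisecond would map to the same row and column, so the later write would overwrite the earlier one; if that can happen with your probe, you would need finer precision or some extra tie-breaker in the column name.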

Nikhil