I have a Cassandra model defined as:

import uuid
from cassandra.cqlengine import columns
from cassandra.cqlengine.models import Model

class MyModel(Model):
    ...
    ...
    created_at = columns.TimeUUID(primary_key=True,
                         clustering_order='DESC',
                         default=uuid.uuid1)
    ...
    ...

Recently the app hit the issue where uuid1 creation doesn't close files, so it runs into the file descriptor limit. I tried to find a solution, but the options I can think of might not work:

  • Replace uuid1 in default with uuid4. But TimeUUID needs a time component, and only uuid1 provides that.
  • Replace uuid1 with cassandra.util.uuid_from_time(time.time()). When I checked the code for both uuid1 and uuid_from_time, they look the same, so that would not solve the problem either.

The last option is to replace TimeUUID with the Timestamp type, but the created_at column is a primary key with a clustering order, so I don't know whether I can do that.

My column family already has 1,000,000+ rows, so I can't just drop it.

I also want to know: what is the advantage of using TimeUUID instead of a timestamp?

Nilesh

1 Answer

Are you certain you're hitting the libuuid issue you linked? Your code snippet shows the standard library uuid, which probably doesn't have that issue. Is it possible there's a different file descriptor leak in your program?
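One way to check whether uuid1 itself is leaking is to count the process's open file descriptors before and after a batch of calls. The following is only a rough sketch: it assumes a Linux system where /proc/self/fd is available, and the helper names count_open_fds and fd_delta are made up for illustration.

```python
import os
import uuid

FD_DIR = "/proc/self/fd"  # Linux-specific path; an assumption of this sketch


def count_open_fds():
    """Return the number of open file descriptors, or None if unavailable."""
    if not os.path.isdir(FD_DIR):
        return None
    return len(os.listdir(FD_DIR))


def fd_delta(fn, calls=1000):
    """Call fn repeatedly and return how many descriptors stayed open."""
    fn()  # warm-up call so one-time setup is not counted as a leak
    before = count_open_fds()
    if before is None:
        return None
    for _ in range(calls):
        fn()
    return count_open_fds() - before


if __name__ == "__main__":
    print("descriptors left open by uuid.uuid1:", fd_delta(uuid.uuid1))
```

A result that grows with the number of calls would point at uuid1. A result of zero suggests the leak is somewhere else in the program.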

If it is libuuid, the easiest course would be to use the standard library implementation. If speed is a major concern for you, you might look into building a different version of libuuid to use with python-libuuid. I tried this one quickly and didn't notice any file descriptors leaking: http://www.ossp.org/pkg/lib/uuid/

I also want to know: what is the advantage of using TimeUUID instead of a timestamp?

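You won't be able to change the type of the column on your existing table, but to answer your question: TimeUUID is usually used to avoid collisions where multiple events could be written with the same timestamp value. Since created_at is part of the primary key, two writes to the same partition with an identical plain timestamp would overwrite each other.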
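To illustrate the collision point, here is a small sketch using only the standard library uuid module. It generates a batch of version 1 UUIDs and compares how many distinct UUIDs there are against how many distinct millisecond timestamps they carry. The helper unix_millis and the constant name are my own; the offset is the number of 100-nanosecond intervals between the UUID epoch (1582-10-15) and the Unix epoch.

```python
import uuid

# 100-ns intervals between 1582-10-15 (UUID epoch) and 1970-01-01 (Unix epoch)
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def unix_millis(u):
    """Extract the embedded timestamp of a version 1 UUID as Unix milliseconds."""
    return (u.time - UUID_EPOCH_OFFSET) // 10000


ids = [uuid.uuid1() for _ in range(1000)]
millis = [unix_millis(u) for u in ids]

# Every UUID is distinct, while many of them share a millisecond timestamp.
print("distinct uuids:     ", len(set(ids)))
print("distinct timestamps:", len(set(millis)))
```

With a millisecond-resolution timestamp as the clustering key, events that share a millisecond would map to the same row, whereas the TimeUUID values stay distinct.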

Adam Holmberg
  • Adam, I am hitting the same issue that I linked. I am now using https://datastax.github.io/python-driver/api/cassandra/util.html#cassandra.util.uuid_from_time to generate the new `uuid` – Nilesh May 19 '17 at 17:32
  • You are using the driver utility method and hitting [this](https://github.com/dln/python-libuuid/issues/1) issue? That would be very surprising unless you're somehow monkey patching the `uuid` standard library. Are you? What symptoms lead you to believe it's the same issue? – Adam Holmberg May 19 '17 at 20:27
  • Is there any way to check whether I am using `uuid` from `libuuid` or the `standard` uuid? – Nilesh May 19 '17 at 20:37
  • Unless you are monkey patching somewhere else, your code snippet indicates that you're using the standard lib version. You can verify by `import uuid; print uuid.__file__`. It should be something like `.../lib/python2.7/uuid.pyc` and not `.../lib/python2.7/site-packages/libuuid/__init__.pyc` – Adam Holmberg May 22 '17 at 16:15
  • Thanks Adam, will check this – Nilesh May 22 '17 at 16:27
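
The check suggested in the comments can be wrapped in a small helper that also looks for monkey patching. This is a sketch only: the function name uuid_module_origin is hypothetical, and it uses Python 3 syntax, whereas the comment above uses the Python 2 print statement.

```python
import uuid


def uuid_module_origin():
    """Report where the imported uuid module lives and who defines uuid1."""
    path = getattr(uuid, "__file__", "") or ""
    return {
        "path": path,
        # A path under site-packages means a third-party package shadows the stdlib.
        "third_party": "site-packages" in path,
        # If uuid1 was replaced at runtime, its defining module is usually not "uuid".
        "uuid1_defined_in": getattr(uuid.uuid1, "__module__", None),
    }


if __name__ == "__main__":
    for key, value in uuid_module_origin().items():
        print(key, "=", value)
```

If third_party is true, or uuid1_defined_in is something other than uuid, the standard library implementation is probably not the one being called.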