1

I have code like below.

from uuid import uuid4
from uuid import uuid1

from cassandra.cqlengine import columns, connection
from cassandra.cqlengine.models import Model
from cassandra.cqlengine.management import sync_table


class BaseModel(Model):
    __abstract__ = True

    id = columns.UUID(primary_key=True, default=uuid4)
    created_timestamp = columns.TimeUUID(primary_key=True,
                                         clustering_order='DESC',
                                         default=uuid1)
    deleted = columns.Boolean(required=True, default=False)

class OtherModel(BaseModel):
    __table_name__ = 'other_table'



if __name__ == '__main__':
    connection.setup(hosts=['localhost'],
                     default_keyspace='test')
    sync_table(OtherModel)

    OtherModel.create()

After first execution, I can see the record in db when run query as.

cqlsh> select * from test.other_table;

 id                                   | created_timestamp                    | deleted
--------------------------------------+--------------------------------------+---------
 febc7789-5806-44d8-bbd5-45321676def9 | 840e1b66-cc73-11e6-a66c-38c986054a88 |   False

(1 rows)

After this, I added new column name in OtherModel it and run same program.

class OtherModel(BaseModel):
    __table_name__ = 'other_table'
    name = columns.Text(required=True, default='')




if __name__ == '__main__':
    connection.setup(hosts=['localhost'],
                     default_keyspace='test')
    sync_table(OtherModel)

    OtherModel.create(name='test')

When check db entry

cqlsh> select * from test.other_table;

 id                                   | created_timestamp                    | deleted | name
--------------------------------------+--------------------------------------+---------+------
 936cfd6c-44a4-43d3-a3c0-fdd493144f4b | 4d7fd78c-cc74-11e6-bb49-38c986054a88 |   False | test
 febc7789-5806-44d8-bbd5-45321676def9 | 840e1b66-cc73-11e6-a66c-38c986054a88 |   False | null

(2 rows)

There is one row with name as null.

But I can't query on null value.

cqlsh> select * from test.other_table where name=null;
InvalidRequest: code=2200 [Invalid query] message="Unsupported null value for indexed column name"

I got reference How Can I Search for Records That Have A Null/Empty Field Using CQL?.

When I set default='' in the Model, why it not set for all the null value in table?

Is there any way to set null value in name to default value '' with query?

Community
  • 1
  • 1
Nilesh
  • 20,521
  • 16
  • 92
  • 148

1 Answers1

2

The null cell is actually it just not being set. And the absence of data isn't something you can query on, since its a filtering operation. Its not scalable or possible to do efficiently, so its not something C* will encourage (or in this case even allow).

Going back and retroactively setting values to all the previously created rows would be very expensive (has to read everything, then do as many writes). Its pretty easy in application side to just say if name is null its '' though.

Chris Lohfink
  • 16,150
  • 1
  • 29
  • 38
  • Thanks Chris, my problem is, when I make query with `name=XYZ` i also want to get data for `null` is there any way to get data with that ? – Nilesh Dec 28 '16 at 14:17
  • For adhoc things like this consider using Spark. Without it theres not even a way to query for name=XYZ with this schema unless you create a new table, materialized view, or secondary index. Note that MVs and especially 2is have negative impacts and I recommend you only use them once your fully aware of how they work to make sure you don't get burned. Id recommend a new table here fwiw. Cassandra is designed for huge datasets that are massively distributed so things like querying for any piece of data isnt supported because it wont work in many scenarios. The schema needs to support the query. – Chris Lohfink Dec 28 '16 at 16:45