I'm trying to insert rows into a Cassandra table after changing its schema, using the cqlengine Python library. Before the change, the model looked like this:
class MetricsByDevice(Model):
    device = columns.Text(primary_key=True, partition_key=True)
    datetime = columns.DateTime(primary_key=True, clustering_order="DESC")
    load_power = columns.Double()
    inverter_power = columns.Double()
I've changed the schema to this, adding four columns (DSO, node, park and commercializer):
class MetricsByDevice(Model):
    device = columns.Text(primary_key=True, partition_key=True)
    datetime = columns.DateTime(primary_key=True, clustering_order="DESC")
    DSO = columns.Text(index=True, default='DSO_1'),
    node = columns.Text(index=True, default='Node_1'),
    park = columns.Integer(index=True, default=6),
    commercializer = columns.Text(index=True, default='Commercializer_1'),
    load_power = columns.Double()
    inverter_power = columns.Double()
Then I've synced the table with a script containing the line:
sync_table(MetricsByDate)
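(For context, the sync script is essentially the sketch below; the contact point and keyspace names are placeholders for the real ones.)

import os
from cassandra.cqlengine import connection
from cassandra.cqlengine.management import sync_table

# Placeholder contact point and keyspace; the real script uses our cluster settings
connection.setup(['127.0.0.1'], 'metrics_keyspace', protocol_version=3)

# Depending on the driver version, cqlengine warns unless this is set before sync_table
os.environ['CQLENG_ALLOW_SCHEMA_MANAGEMENT'] = '1'

# MetricsByDate is the model shown above
sync_table(MetricsByDate)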
I've checked the database and the four columns have been created. The existing rows have these fields set to NULL (as expected).
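(The check was along these lines, run against the cluster directly; the contact point, keyspace and table name are placeholders.)

from cassandra.cluster import Cluster

# Placeholder contact point and keyspace
cluster = Cluster(['127.0.0.1'])
session = cluster.connect('metrics_keyspace')

# The four new columns show up, and the pre-existing rows have them as NULL (None)
row = session.execute('SELECT * FROM metrics_by_date LIMIT 1').one()
print(row)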
Then I've modified the script in charge of batch-inserting rows so that it includes the values for the new fields. It looks like this:
batch = BatchQuery()
for idx, message in enumerate(consumer):
    data = message.value
    ts_to_insert = dateutil.parser.parse(data['timestamp'])
    filters = get_filters(message.partition_key)
    MetricsByDate.batch(batch).create(
        device=device,
        date=str(ts_to_insert.date()),
        time=str(ts_to_insert.time()),
        created_at=now,
        DSO=str(filters['DSO']),
        node=str(filters['node']),
        park=int(filters['park']),
        commercializer=str(filters['commercializer']),
        load_power=data['loadPower'],
        inverter_power=data['inverterPower'],
    )
    if idx % 100 == 0:  # Insert every 100 messages
        batch.execute()
        # Reset batch
        batch = BatchQuery()
I've already checked that the values for the new fields aren't None and that they have the correct types. Nevertheless, every row is inserted correctly except for the new fields, which end up as NULL in Cassandra.
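The check is essentially this sketch, placed inside the loop right before the create() call (filters comes from get_filters() as in the snippet above); it never trips:

# Sanity check on the values for the new columns, just before create()
for name in ('DSO', 'node', 'park', 'commercializer'):
    value = filters[name]
    assert value is not None, f'{name} is None'
    print(name, type(value), value)  # no Nones, and the types are what the columns expect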
The batch insertion does not return any errors. I don't know if I'm missing something, or if I need to do an extra step to update the schema. I've been looking in the docs, but I can't find anything that helps.
Is there anything I'm doing wrong?
EDIT
After Alex Ott's suggestion, I've inserted the rows one by one, changing the code to:
for idx, message in enumerate(consumer):
    data = message.value
    ts_to_insert = dateutil.parser.parse(data['timestamp'])
    filters = get_filters(message.partition_key)
    metrics_by_date = MetricsByDate(
        device=device,
        date=str(ts_to_insert.date()),
        time=str(ts_to_insert.time()),
        created_at=now,
        DSO=str(filters['DSO']),
        node=str(filters['node']),
        park=int(filters['park']),
        commercializer=str(filters['commercializer']),
        load_power=data['loadPower'],
        inverter_power=data['inverterPower'],
    )
    metrics_by_date.save()
If, before executing the line metrics_by_date.save(), I add these print statements:
print(metrics_by_date.DSO)
print(metrics_by_date.park)
print(metrics_by_date.load_power)
print(metrics_by_date.device)
print(metrics_by_date.date)
The output is:
(<cassandra.cqlengine.columns.Text object at 0x7ff0b492a670>,)
(<cassandra.cqlengine.columns.Integer object at 0x7ff0b492d190>,)
256.99
SQ3-3.2.3.1-70-17444
2020-04-22
For the new fields I'm getting a cassandra column object, but for the others I get their actual values. That may be a clue, because it keeps inserting NULL into the new columns.