Try to put your CREATE TABLE
statement in a flat file (schema.cql for example) and then execute cqlsh -f schema.cql
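For example, schema.cql could look something like this (a minimal sketch; the keyspace name is a placeholder and only a few of the 56k columns are shown):

// schema.cql -- assumes an existing keyspace, here called ycsb
USE ycsb;
CREATE TABLE usertable (
    y_id varchar PRIMARY KEY,
    field1 varchar,
    field2 varchar,
    // ... field3 through field55999 omitted for brevity
    field56000 varchar
);
// load it with: cqlsh -f schema.cql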
By the way, 56k columns is HUGE and no sane developer will ever create a table with more than 1k columns ... What are you trying to test and assert with this scenario?
--- Answer to 1st comment ---
Schema is all about metadata, because raw data are written as byte[]
on disk anyway. The more columns you have in a table, the bigger the metadata held in memory will be.
So when retrieving, I will pass the specific column name in the SELECT query (keeping performance in mind), so it won't retrieve all the columns.
It's not that simple. All the 56k columns are stored contiguously on disk. When reading data, Cassandra has index structures it can use to skip ahead to a given partition key and clustering column, but for regular columns, as in your case, there is no index pointing at the exact column requested by the client. So, for example, if you're doing SELECT field1293 FROM usertable WHERE y_id = 'xxx', Cassandra will need to read the whole block from field1 up to field56000 into memory before picking out the right column, and that is horribly inefficient.
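If you want to see this cost for yourself, you can enable query tracing in cqlsh and run the SELECT again (a quick sketch; the y_id value and field name are placeholders):

TRACING ON;
SELECT field1293 FROM usertable WHERE y_id = 'xxx';
// cqlsh prints the request trace, showing how much data had to be read to return this single column
TRACING OFF;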
--- Answer to Nth comment ---
I do agree it would become very slow/inefficient, but I need to achieve this scenario to simulate genotype data.
I recommend that you try and test this schema:
create table usertable (
y_id varchar,
field_index int,
field_value varchar,
PRIMARY KEY(y_id, field_index)
);
//INSERT/UPDATE data into field N
INSERT INTO usertable(y_id, field_index, field_value)
VALUES('xxx', N, 'fieldN value');
//DELETE field N
DELETE FROM usertable WHERE y_id='xxx' AND field_index=N;
// Read EXACTLY field N
SELECT field_value FROM usertable WHERE y_id='xxx' AND field_index=N;
// Read field N to M, N <= M
SELECT field_value FROM usertable WHERE y_id='xxx'
AND field_index >=N
AND field_index <= M;
You'll see that it works wayyyyyyy better
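Since all the field values for a given y_id live in the same partition, you can also load many of them at once with a single-partition batch (a minimal sketch; the values are placeholders):

// Load several fields for one y_id in a single-partition batch
BEGIN BATCH
    INSERT INTO usertable(y_id, field_index, field_value) VALUES('xxx', 1, 'field1 value');
    INSERT INTO usertable(y_id, field_index, field_value) VALUES('xxx', 2, 'field2 value');
    INSERT INTO usertable(y_id, field_index, field_value) VALUES('xxx', 3, 'field3 value');
APPLY BATCH;

Because every statement in the batch targets the same partition, this stays cheap on the coordinator, unlike multi-partition batches.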