given i have the following Objects to persist in Cassandra:
Array of Foo:
{
"id":1,
"name": "this is a name",
"bundleFields" : [
"bundleByMe",
"me2",
"me also",
]
},
{
"id":2,
"name": "anotherName",
"bundleFields" : [
"bundleByMe",
"me2",
"me also",
]
},
{
"id":3,
"name": "thridName",
"bundleFields" : [
"differentBundleCriteria"
]
}
I wanna query something like SELECT * FROM FOO where bundleFields = ["...", "..."]
.
This obviously does not work, since queries by list<>
are not possible (no Primarykey).
This is the Schema i currently have:
CREATE TABLE IF NOT EXISTS Foo (
id int,
name varchar,
bundleFields list<varchar>,
PRIMARY KEY(id)
);
The only solution i can imagine is another table where the PRIMARY KEY contains the concatenated values of the bundleFields-Array, which would allow a lookup by the bundleString:
CREATE TABLE IF NOT EXISTS fooByBundleString (
bundleString varchar,
fooId int,
PRIMARY KEY(bundleString)
);
Is this the recomended approach to this problem in cassandra.
The idea of having to serialize/deserialize the bundleFields-array does not feel "right" to me.
Thanks for advice!
Edit: As @rs_atl suggested the correct DDL for table fooByBundleString should be (note additional fooId
in PRIMARY KEY):
CREATE TABLE IF NOT EXISTS fooByBundleString (
bundleString varchar,
fooId int,
PRIMARY KEY(bundleString, fooId)
);
to create a covering-Index, since otherwise it would not be possible to store the same bundleString for different fooId's.