I'm working with the movielens dataset and I have a column called 'genres' which has entries such as 'Action|War', 'Action|Adventure|Comedy|Sci-Fi'. I wish to count the number of rows that have the text 'Comedy' in them.
SELECT COUNT(*) FROM movielens.data_movies WHERE genres = 'Comedy' ALLOW FILTERING
But this counts only the exact instances of 'Comedy'. It does not count 'Action|Adventure|Comedy|Sci-Fi' which I want it to do. So I tried,
SELECT COUNT(*) FROM movielens.data_movies WHERE genres CONTAINS 'Comedy' ALLOW FILTERING
However, that gives me the error
Cannot use CONTAINS on non-collection column genres
From this it seems that there is no easy way to do what I'm asking. Does anyone know of a simpler solution?