We are modeling account table in cassandra with social logins, we chose email as primary key and skinny row implementation. Our cassandra is on version 2.1.6
. Here is table definition:
CREATE TABLE account_by_email (
email_address text,
account_password text,
first_name text,
last_name text,
registered_at timestamp,
roles set<text>,
facebook_id text,
twitter_id text,
linkedin_id text,
password_reset_token blob,
password_reset_token_valid_until timestamp,
profile_image_url text,
PRIMARY KEY (email_address) ) WITH COMMENT='Accounts in system by email.';
This workks fine for email access since we can access fast each account when we know email address which is situation after login.
User has, in addition to email login option to login / signup with social accounts. When social account login is used flow is go to social network, receive social id (facebook, twitter, linkedin) and maybe email and query account table by social id to get full account or just email and continue on using email on each API request.
We currently added indexes on facebook_id
, twitter_id
, linkedin_id
to support this since we are in MVP phase with one node and we chose fats implementation over performance.
What is proper way to model this? Here are couple of suggestions we are thinking of:
- leave index implementation since fetch by social id is happening only on login once (after that email is used)
- have one table for each social id which will hold social id email pair
- have one table for each social id which will hold full account (account can be edited so this will add complexity to update)
- something else?
And another question, is index implementation with high cardinality field (as social id) really that bad when you model access path which happens rarely?