I'm building a star schema to allow reporting against usage of an application, but have hit a stumbling block on two points:
- Some answers can be free text.
- I need to report on what questions were not answered by what users.
To elaborate:
Application has a section where user answers questions.
A question can have the following types of answers:
- Multiple choice (radio buttons), "choose 1 out of 4."
- List item.
- Multiple choice (tick box), "choose one or more of the following."
- free text (text box), "what are your thoughts on...?", or used if "other" radio button is chosen.
I was considering a questions dimension, and an answers dimension. The problem with this is the free text will pollute the answers dimension with lots of answers that are specific to only one user on one question.
This model also means that there is a row in the fact table for each answer for every user, which I don't think is ideal - but I can live with this if necessary. The fact table is mostly just to link the other tables together. Fact table will be roughly: Fact_Table_ID, UserID, QuestionID, AnswerID, Is_Answered (boolean, yes/no). This is all a way to see which questions are being answered, and which are not being answered.
I will have to report on questions that have not been answered by users as well as what has been answered, and I don't know how to address this in a star schema.
I'd appreciate some suggestions on how best to approach this.