I have some data, and am trying to keep track of various issues with paperwork for work. The data looks something like this:
ID | Paperwork 1 | Paperwork 2 | Paperwork 3
1 | No Signature | No Signature | Missing Date
2 | Completed | No Signature | Missing Signature
3 | Missing Date | Completed | Completed
4 | None | Completed | Missing Signature
Over N categories of paperwork, and M different ID's or rows, and X different possible responses in each field, I am trying to determine what response for which ID and which category of paperwork is where.
I've thought of two different table designs.
Two Column Design:
ID | Paperwork
1 | 1
2 | 4
3 | 10
4 | 257
With the corresponding code base
0 = No Paperwork 1
1 = No Paperwork 2
2 = No Paperwork 3
4 = Paperwork 1 Submitted Successfully
8 = Paperwork 2 Submitted Successfully
16 = Paperwork 3 Submitted Successfully
32 = Paperwork 1 Missing Date
64 = Paperwork 2 Missing Date
128 = Paperwork 3 Missing Date
256 = Paperwork 1 Missing Signature
512 = Paperwork 2 Missing Signature
1024 = Paperwork 3 Missing Signature
So that we know simply by looking at whatever number is in the paperwork column, what the issues are with the different paperwork for the period. In the two column design, the errors would then be:
Two Column Design:
ID | Paperwork
1 | No Paperwork 2
2 | Paperwork 1 Submitted Successfully, No Paperwork 2
3 | Paperwork 2 Submitted Successfully, No Paperwork 3
4 | Paperwork 1 Missing Signature, No Paperwork 2
The issues with this design are: 1) It is easy to get the numbers to put in the paperwork column - someone needs to look over all of the paperwork from that given period, and note the various issues or lack of issues with each piece of paperwork. Once those issues are found, a number is assigned to each issue, and then those numbers are summed up to find an overall number which encapsulates all of the issues in that given period. However, I don't know how to make the text outputs from those specific numbers, and I don't want to brute force a list of all the potential text outputs because there are way too many different possibilities. It can be narrowed somewhat, as you cannot have a completed piece of paperwork that's missing a signature, but that sounds like way too many possibilities So my question is: If I get the number 257, how can I tell that 257 is really 256+1? Do I need to, or am I making this unnecessarily complicated?
Alternatively, another way I have thought up was to include all errors in a more binary variable. So instead of the error codes listed above, it would look like:
Many Columns
0 = No Paperwork
1 = Paperwork Completed
2 = There is an error with Paperwork 1
3 = There is an error with Paperwork 2
4 = There is an error with Paperwork 3
So the table would then be:
ID | Paperwork 1 | Paperwork 2 | Paperwork 3
1 | 2 | 3 | 4
2 | 1 | 3 | 4
3 | 2 | 1 | 1
4 | 0 | 1 | 4
And then when I am trying to run a report on the various problems with the paperwork, all I need to do is have a lookup function that counts how many instances of the number 2 there are in paperwork 1 column, and find the corresponding ID with it. I would then give this information to my co-worker and boss, and then they would in turn know that there is an issue with this piece of paperwork for this person; but they would not know specifically what the issue is, and would need to be relying on firsthand knowledge of the books themselves.
The problem with this design is that it implicitly requires people to check with my co-worker who administers the books to find out what the issue is with the paperwork, and the whole point of my involvement in the exercise is to help her with her work and to use the power of computers to help her with her task. So the second design doesn't seemingly use the full potential of computer technology to help with tasks.
So beyond the question posed before, I wholly welcome any questions, points, or directions in this matter. If you have an alternative table design, I welcome that too.
I apologize in advance if this post has violated any stackoverflow policies about not being direct, I tried to make it as direct as I could.