7

I'm new to databases and web development but I'm trying my best to learn my way though making my own dynamic website. I'm taking it step by step and am currently designing the data model on paper. I'm wonder how does one structure a database for a website that allows voting like how stackoverflow does it? If there's a table that contains a list of questions, each question that a user creates gets added to this table. There can't simply be a field on this table that counts votes because that would allow a single person to have unlimited votes right? So there should be a key that connects to another table which counts votes and keeps track of users so they can't vote twice, correct? If this is true, this is the part where I get confused. Each answer given can also be voted on. So does that mean that when a user submits an answer, in addition to adding that answer to probably a separate table for answers per question asked, the model must also generate a new table for each answer dynamically during run-time to keep track of all these votes?

Note that I'm not specifically asking about how stackoverflow does it, but how the concept of what the user experiences works.

One thing I'd also like to do is query the activity of a single user, so if all these tables would have to be created dynamically for every piece of submitted data, creating a crap load of tables over time, wouldn't it be really slow to have to parse through every table to check if a particular user submitted any data or voted?

Is there a better way of doing this that someone could explain in laymen terms? No need for specific code... I can probably figure that out later when the time comes. I'm just theorizing right now and building a paper model to work off of later.

EDIT: Oh, I see. I think in excel-like spreadsheets when I'm thinking of database tables, so correct me if my understanding is wrong. So every vote site-wide is on a single table (listed vertically on a spreadsheet), each having a line of data (horizontally) that links the vote to a variety of owners (user and question OR answer)? Is that correct? I say question OR answer because I don't understand a scenario where it would make sense to have them both as a vote attribute(not sure if that's correct terminology) instead of creating two separate vote datas for an answer and a question which are both being voted on. Basically the way I see it, each line represents a vote and there are 3 fields, 1. Value (+1 or -1), 2. From whom (username), 3. To what (question or answer).

user1263500
  • 83
  • 2
  • 5
  • [Understanding the StackOverflow Database Schema](http://sqlserverpedia.com/wiki/Understanding_the_StackOverflow_Database_Schema) – DOK Mar 12 '12 at 13:07
  • Try google for many-to-many relationship because this is what you would have between question and user. – Sgoettschkes Mar 12 '12 at 13:07
  • First thing you need to do is throw out all your excel thinking when it comes to DB design. I've seen a lot of db's go down the "excel" path and end up a mess to maintain. Look up one to many relationships, many to many relationships and db normalization. Some will argue for total first nomral form, but I've always found that 3rd normal form is a good trade off between data integrity and usability. – Brian Mar 12 '12 at 14:16
  • Well to clarify when I say excel style, I mean visualizing a single table, where as a database is a collection of all these excel spreadsheets. I understand one to many and many to many, but I've never heard of db normalization before. The way I visualize one to many is a spreadsheet that has a key in one of it's columns that reference data in another spreadsheet. Is that a correct perspective? The way I visualize many to many is two one to many relationships between the two many to many and a third table. – user1263500 Mar 12 '12 at 14:41
  • normalization is basically data integrity related where data is never repeated in the database. It is only accessed through foreign keys. First normal form means roughly absolutely no repetition so if you update one value in a table, it propagates through all the other tables via keys. 3rd normal form means some data repetition but it is a little bit easier to query the tables in many cases. – Brian Mar 12 '12 at 14:52
  • db relationships are a different matter. you're correct in your excel analogy and one to many relationships. Many to many are a little trickier but generally are more rare. It is easy to say students to classes is a many to many relationship because many students will have a class and a student will have many classes but in reality IMO many to many relationships can often be broken in to a one to many relationship. If you think of Data as flowing one direction this makes more sense. You just have to find the starting point for your data diagrams and go from there. – Brian Mar 12 '12 at 14:56
  • Understanding the SO DB schema is a 404. Find it here: http://www.toadworld.com/platforms/sql-server/w/wiki/9470.understanding-the-stackoverflow-database-schema.aspx – noonand May 27 '15 at 08:20

3 Answers3

4

You have to look at all the elements. Basically you have

Questions
Users
Answers
Votes

Users are tied to Questions and answers and votes so you will either need adequate foreign keys to handle this or a child table that connects these. For instance you could have

tblQuestions
    questid
    question
    userid

then

tblAnswer
    Answer
    answerid
    userid
    questid 
    accepted (to flag as accepted answer)

and finally

tblVote
    vote (up or down)
    questid
    answerid
    userid

The user table is fairly straightforward and the fun part happens in the behind the scenes logic. this is obviously a very rough layout and a lot of other things need to be considered and there are dozens of ways to accomplish the table layout.

Brian
  • 2,229
  • 17
  • 24
  • Why would you need a questionID and an answerID on the vote table? Wouldn't it be better to have a field that functions as a catch-all "forID" that can hold either a questionID or answerID because there should never be an instance where a vote would count for both a question AND an answer right? – user1263500 Mar 12 '12 at 13:52
  • You could do it that way or have separate tables for question vote and answer vote or just have one table with a vote type to designate whether it is a vote on a question or an answer. You will need some why to tell what they user is voting on, but there are a number of ways to accomplish that. As they say, ask 10 developers a question and you'll get 12 answers. – Brian Mar 12 '12 at 14:12
  • Is there an advantage or disadvantage of doing it with a single vote table with a reference table to determine whether the vote is for a question or an answer as I've suggested (and you've added on to) vs doing it with two separate vote tables, one for answer and question? I can see both ways as viable but I don't think I know enough to see the pros and cons of each method. – user1263500 Mar 12 '12 at 14:33
  • either way is ok. it all will boil down to proper foreign keys and table indexes. Without these, either way will be a performance hit if the db gets large. – Brian Mar 12 '12 at 14:49
3
CREATE TABLE "QuestionVote" (
    "Question" INT NOT NULL, -- To identify the question being voted on
    "User" INT NOT NULL, -- To identify the user who is casting their vote
    "Vote" SMALLINT NOT NULL CHECK ("Vote" = 1 OR "Vote" = -1),
    PRIMARY KEY ("Question", "User"),
    CONSTRAINT "Constr_QuestionVote_Question"
        FOREIGN KEY "QuestionVote_Question" ("Question")
        REFERENCES "Question" ("ID"),
    CONSTRAINT "Constr_QuestionVote_User"
        FOREIGN KEY "QuestionVote_User" ("User")
        REFERENCES "User" ("ID")
)

NB. The answer to a problem in database design is never "Create a new table on-the-fly for every new (user/discussion/item of whatever sort)". If you think it is a good idea to create a new table for every user, you have made a mistake! Stop and work out how you can do what you want to do with a fixed set of tables.

Hammerite
  • 21,755
  • 6
  • 70
  • 91
0

Use A NOSQL Document Approach. (couchdb)

Database: Stacklike

Documents In Database Stacklike:

{
  "type": "question",
  "user": "<userid>"
  ...
}

{
  "type": "answer",
  "user": "<userid>",
  "question": "<questionid>"
  ...
}

{
  "type": "vote",
  "user": "<userid>",
  "question": "<questionid>",
  "weight": "<weight>"
  ...
}

{
  "type": "user",
  ...
}

Views:

To List Votes, Sorted By Question

map(doc){
  if (doc.type === 'vote'){
    emit(doc.questionid, doc)
}

To View Vote Counts and Totals (stats)

map(doc){
  if (doc.type === 'vote'){
    emit(doc.questionid, doc.weight)
}
reduce(keys,values, rereduce){
  _stats
}
Steven Spungin
  • 27,002
  • 5
  • 88
  • 78