13

I created some website which contain users,comments,videos,photos,messages and more.All of the data is in the one table which contain 100 column.I thought one table is better than more because user need just connect one table but I heard that some programmer doesnt like this method.And Can someone say me which one is better?One very large table or a lot of little tables. and Why I need use a lot tables?Why it is useful?Which one is fast for user? What is the advantages and disadvantages of large table and a lot of little tables?

Ibrahim Hasanov
  • 401
  • 1
  • 4
  • 10
  • 9
    Do NOT use just one table. Separate similar things into their own tables. (e.g.: User table, comments table, video table, photos table, etc...) – Siyual Apr 11 '16 at 12:58
  • @Siyual why a lot of tables is useful?A lot of tables means that a lot of connection for user? – Ibrahim Hasanov Apr 11 '16 at 13:43
  • @Ibrahim: no, it means a lot of joins for the database, it's still just 1 connection for user. If you're connecting once per table, you're code is flawed. (and Databases don't mind joins, they like them, they handle them just fine ... ) – Ditto Apr 11 '16 at 13:44
  • @IbrahimHasanov, you use joins. Databases are optimized to use joins as long as you remember to index the join fields. – HLGEM Apr 11 '16 at 13:44
  • @Ditto why large table is not useful?Why little more tables is useful? – Ibrahim Hasanov Apr 11 '16 at 13:45
  • Let's say we store data in boxes (ie tables), in the back room. I ask you to fetch something simple, like say User 123 comment for Apr 5. You have to go back, pull out the 1 HUGE BOX .. and bring the entire thing out ... then sift through it, find the information .. discard the rest .. put the box back, and then give me the comment. If you had smaller, more organized boxes, you could simply go to the appropriate box, retrieve the information ... easier, quicker ... less stress on your poor back ;) – Ditto Apr 11 '16 at 13:50
  • 1
    @IbrahimHasanov Consider this: what is going to happen when a user makes more than one message? Uploads more than one video? Does more than the number of fields you've allocated to your large table? You NEED to separate the elements into their own table, so that you can keep like things together and add all of the columns you need to support that data, with the appropriate keys back to the user that owns them. This way, Bobby and Susie are not constrained by a bad database design when using your application. – Siyual Apr 11 '16 at 13:50
  • 5
    @IbrahimHasanov You should look into database normalization. – Siyual Apr 11 '16 at 13:51
  • @Ditto I am just use user mail - wheretocomment - comment - checkedornot.It is easy to use and I want to understand that what will change if I do all of them for seperate tables?They just require some join issues and they will help me just get rid of some php codes – Ibrahim Hasanov Apr 11 '16 at 13:58
  • 1
    @IbrahimHasanov Easy doesnt mean its a good idea. My companies database has 300 tables with a max of about 20 columns per table. Should i create a single table with 6000 columns? How about you keep your program for the next 15 years and you gradually need to add more columns. At what point will you decide that you need a new table? – CathalMF Apr 11 '16 at 14:02
  • @Ibrahim: Why don't you answer me this: Why have many smaller rooms in a house vs 1 large room? or Why have many smaller boxes vs 1 large box ? – Ditto Apr 11 '16 at 14:03
  • @Siyual Actually now users can share photos videos messages comments whatever they want.Count doesnt matter.I want to understant with computer way(Not box:) Ditto ) what is the advantage of use a lot of tables instead of a lot of column and what is the advantage of use one large database?and disadvantages? – Ibrahim Hasanov Apr 11 '16 at 14:04
  • 3
    @IbrahimHasanov Your questions have already been answered in the answers below. – CathalMF Apr 11 '16 at 14:05
  • @Ditto I understood that you say but I cant understant that what is the advantages and disadvantages for user or server to use large table and more table – Ibrahim Hasanov Apr 11 '16 at 14:06
  • 1
    @IbrahimHasanov The advantage is performance and storage space. Create two databases. One with a single table and one with a split table design. Fill each database with a million records. Then check the performance. – CathalMF Apr 11 '16 at 14:07
  • I have created sistem that is connect each others.For example if someone change his profile image everyone will see him updatimg version I mean that I dont need the update another column it just update one column which contain user and his informations.And also comments photos and other things.And I want to learn that what will happen if I seperate them to the different tables and then I join them with more sql code like JOIN.What will be my advantage in this case? – Ibrahim Hasanov Apr 11 '16 at 14:17
  • @Ibrahim: It's been mentioned already: Performance, less locking/blocking ... less duplication of data ... read other's answers. I also tried to explain via analogy which you avoided - you say you understand, but you obviously do not, since you are still confused. Answer my question re: advantages of many small rooms in a house. You answer that, you answer your own questions ;) – Ditto Apr 11 '16 at 14:19
  • @Ditto for many rooms have a lot of walls and they block connection each others if you want to connect them you should break wall and create door (like Join sql) and also in the begining you need create a lot of wall for a lot of room but You actually need connect them.Why I need create and again break doors?It will take server time to break and my time to create – Ibrahim Hasanov Apr 11 '16 at 14:33
  • 1
    @Ibrahim: so you're ok with 1 large room then? What about when you want to go to the bathroom ? What about if your kids are trying to sleep and you and your spouse are watching tv ? Rooms provide a segregation of utility. They also provide privacy. Likewise, tables can provide a (logical) seperation of information, making the data make more sense. They can also provide security if needed (ie seperate schemas, grants, etc.). There's a reason your kitchen, living room and bedroom are seperated ... similar reason to seperate your data into sensible pieces. Organization, Performance and Security. – Ditto Apr 11 '16 at 15:18
  • @Ditto thank you I understand many thing about room.:)but what about real server ?is Information delay each other?and what is the secury issue in the one table? – Ibrahim Hasanov Apr 11 '16 at 15:26
  • 1
    @Ibrahim: Depends on the tables and their needs, but if you have only 1 table, it's very hard to control access to it. If 1 person can access it, they can access ALL of it. Where as, if you have 4 tables, you can give them access to 3 of them, but secure the last 1. Joins are very easy for databases, so joining 3 or 4 tables is nothing, simple .. easy. Keep in mind retrieving a single row from your big table requires the DB to pull back a LARGE amount of data .. all just because you wanted to check a single field ? That's costly. Disk IO is costly, avoid doing unnecessary IO. – Ditto Apr 11 '16 at 15:30
  • @Ditto thanks I will try to build rooms for children to sleep when I watching TV because I disturb them and they disturb me and also like datas – Ibrahim Hasanov Apr 11 '16 at 15:32
  • This question has received some good answers. Please accept the answer that best meets your needs. – trincot May 07 '16 at 08:13

3 Answers3

15

100 columns in a single table is bad design in most situations.

Read this page: http://www.tutorialspoint.com/sql/sql-rdbms-concepts.htm

Break your data up into related chunks and give each of them their own table.

You said you have this information (users,comments,videos,photos,messages) so you should have something like these tables.

  1. Users which contains (User ID, Name, Email etc)
  2. Comments which contains (Comment ID, User ID, Comment Text etc)
  3. Videos which contains (Video ID, User ID, Comment ID, Video Data etc)
  4. Photos which contains (Photo ID, User ID, Comment ID, Photo Data etc)
  5. Messages which contains (Message ID, User ID, Message Text etc)

Then when your writing your SQL you can write proper SQL to query based on exactly what information you need.

SELECT UserID, MessageID, MessageText
FROM Users as USR
    JOIN Messages as MSG
        on USR.UserID = MSG.UserID
WHERE USR.UserID = 1234567

With your current query your having to deal with rows containing data that you dont need or care about.

EDIT Just to give some further information to the OP as to why this is better design.

Lets take the "Users" as a starting example.

In a proper database design you would have a table called Users which has all the required columns that are required for a user to exist. Username, email, id number etc.

Now we want to create a new user so we want to insert Username, email and id number. But wait i still have to populate 97 other columns with totally unrelated information to our process of creating a new user! Even if you store NULL in all columns its going to use some space in the database.

Also imagine you have hundreds of users all trying to select, update and delete from a single database table. There is a high chance of the table being locked. But if you had one user updating the Users table, another user Inserting into the Messages table then the work is spread out.

And as other users have said, purely performance. The database needs to get all information and filter out what you want. If you have alot of columns this is unnecessary work.

Performance Example.

Lets say your database has been running for years. You have 5000 users, 2,000,000 comments, 300,000 pictures, 1,000,000 messages. Your single table now contains 3,305,000 records.

Now you want to find a User with the ID of 12345 who has more than 20 pictures. You need to search through all 3,305,000 records to get this result.

If you had a split table design then you would only need to search through 305,000 records.

Obvious performance gain!!

EDIT 2

Performance TEST.

I created a dummy table containing 2 million rows and 1 column. I ran the below query which took 120ms on average over 10 executions.

SELECT MyDate1 from dbo.DummyTable where MyDate1 BETWEEN '2015-02-15 16:59:00.000' and '2015-02-15 16:59:59.000'

I then truncated the table and created 6 more columns and populated them with 2 million rows of test data and ran the same query. It took 210ms on average over 10 executions.

So adding more columns decreases performance even though your not viewing the extra data.

CathalMF
  • 9,705
  • 6
  • 70
  • 106
  • 8
    100 columns in a table and even 300 columns are sometimes necessary! Its not crazy at all, so don't mislead people here. **In his case** its unnecessary and can be separated into more then 1 table. – sagi Apr 11 '16 at 13:28
  • 1
    Why I need use a lot of tables.Why it is useful? – Ibrahim Hasanov Apr 11 '16 at 13:41
  • 5
    Why does somebody need multiple rooms in a house. Why not just 1 really large room ? That would be simpler .. no walls, no doors, easy right ? O.o The problem is, "it depends", in most situations, you want a smaller breakdown of things, (ie rooms, or tables) ... in some really rare/specialty cases, you want just a single large "room" (ie gymnasium?) – Ditto Apr 11 '16 at 13:46
  • 1
    @CathalIMF ok I understant some parts but what is the different they connect tables in one database.If user is more for server it means that they are more for database.It doesnt matter tables.A lot of tables again depend on one database if something should locked it will be database.If tables is more or if they are large all of them is in one database one connection.For example 1million user connected to server.They will connect to one database and they will update delete or insert something to different tables but all of them happen is in the one database and database will locked if needed – Ibrahim Hasanov Apr 11 '16 at 14:26
  • 2
    @IbrahimHasanov Databases dont work that way. They lock tables, not the whole database. They are designed to have multiple tables which are connected by Primary and Foreign keys. They are called Relationship Database Management Systems (RDBMS) for a reason. http://www.tutorialspoint.com/sql/sql-rdbms-concepts.htm Check the last section on that page and read them "Database Normalization" – CathalMF Apr 11 '16 at 14:30
  • 1
    @CathaIMF Is there problem with database to search for user id of 12345?I just write select * from database where id = 12345 and more for say database dont choose messages or comments just it – Ibrahim Hasanov Apr 11 '16 at 14:46
  • 1
    @IbrahimHasanov The database still has to spend time and effort filtering out all the information that you dont want. Please spend some time reading about how relational databases work. – CathalMF Apr 11 '16 at 15:26
  • 1
    Hmm this was useful.I will research about it. – Ibrahim Hasanov Apr 11 '16 at 15:30
  • 1
    @CathalMF "RDBMS stands for **Relational** Database Management System.", where ["relational" is not about foreign keys](https://en.wikipedia.org/wiki/Relational_database), at least solely. – Ilja Everilä Jul 06 '17 at 06:43
  • 1
    Depending on how the database evolves over time, if new fields and tables are added, it could be difficult to maintain normalization and performance. Imagine you have to add a new media format much later - it could be difficult to normalize it all. It can also be difficult to avoid orphaned records. For example, deleting a User will also require appropriately handling their media records as well. While I'm not disagreeing with any of the answers that your single-table database is poorly designed, to have a perfectly normalized database may not be a viable solution based on project constraints. – zr00 Dec 13 '18 at 18:01
3

Wide tables can cause performance problems if they are wider than the database can store in one place.

You need to read about normalization as this type of structure is very bad and is not what the database is optimized for. In your case you will have many repeated records that you will have to use distinct (which is a performance killer) to get rid of when you want to only show the user name or the comments.

Additionally, you may have some fields that are repeats like comment1, comment2, etc. Those are very hard to query over time and if you need another one, then you have to change the table structure and potentially change the queries. That is a bad way to do business.

Further when you only have one table, it becomes a hot spot in your database and you will have more locking and blocking.

Now also suppose that one of those pieces of information is updated, now you have to make sure to update all the records not just one. This can also be also a performance killer and if you don't do it, then you will have data integrity problems which will make the data in your database essentially useless. Denormalizing is almost always a bad idea and always is a bad idea when done by someone who is not an expert in database design. There are many ramifications of denormalization that you probably haven't thought of. Overall your strategy is sure loser over time and needs to be fixed ASAP because the more records you have in a database, the harder it is to refactor.

HLGEM
  • 94,695
  • 15
  • 113
  • 186
  • I never used any distinct and I am just use user mail - wheretocomment - comment - checkedornot.It is easy to use and I want to understand that what will change if I do all of them for seperate tables?They just require some join issues and they will help me just get rid of some php codes – Ibrahim Hasanov Apr 11 '16 at 13:56
  • I would suggest you take a look at my reputation score and think about whether I probably s=know what I am talking about. I have worked in databases for over 20 years and have delat with hundreds of poorly designed databases and seen what happens when people do things like this. Your design is irretrievably flawed, it is going to create problems down the road as your database gets bigger and it is short-sighted and stupid not to fix it now while you still can without spending thousands of hours and hiring a very well paid specialist to do the fix. – HLGEM Apr 11 '16 at 14:02
1

For your situation it is better to have multiple tables. The reason for this is because if you put all your data into one table then you will have update anomalies. For example, if a user decides to update his username, you will have to update every single row in your big table that has that user's username. But if you split it into multiple tables then you will only need to update one row in your User table and all the rows in your other tables will reference that updated row.

As far as speed, having one table will be faster than multiple tables with SELECT statements because joining tables is slow. INSERT statements will be about the same speed in either situation because you will be inserting one row. However, updating someone's username with an UPDATE statement will be very slow with one table if they have a lot of data about them because it has to go through each row and update every one of them as opposed to only having to update one row in the User table.

So, you should create tables for everything you mentioned in your first sentence (users, comments, videos, photos, and messages) and connect them using Ids like this:

User
-Id
-Username

Video
-Id
-UploaderId references User.Id
-VideoUrl

Photo
-Id
-UploaderId references User.Id
-PhotoUrl

VideoComment
-CommenterId references User.Id
-VideoId references Video.Id
-CommentText

PhotoComment
-CommenterId reference User.Id
-PhotoId references Photo.Id
-CommentText

Message
-SenderId references User.Id
-ReceiverId references User.Id
-MessageText
Nathan Bierema
  • 1,813
  • 2
  • 14
  • 24
  • when user decide to update he update only one column'row he dont need to update whole thing that connected with him.And if I seperate columns to tables what will change?Just I will need to write more code for join tables – Ibrahim Hasanov Apr 11 '16 at 14:36
  • What columns do you currently have in your table? – Nathan Bierema Apr 11 '16 at 15:03
  • users ids comment towho messages photoes profileimgs likes shares workplace and more than 70 column – Ibrahim Hasanov Apr 11 '16 at 15:05
  • @IbrahimHasanov From what you're saying, I think your lookup will be slower because you will have so many rows in your big table. For example, if you're trying to select a username then you have to search through all the videos, photos, comments, and usernames. If you use multiple tables you will only have to search through the user table. – Nathan Bierema Apr 11 '16 at 15:29
  • Actually I dont search anything from my database sql did for codes.and I am just said choose user and it happen just it – Ibrahim Hasanov Apr 11 '16 at 15:29
  • @IbrahimHasanov Yes, but in the background the SQL will run slower because it has to check every single row in your table. – Nathan Bierema Apr 11 '16 at 15:30
  • Hmm Thanks I will create a lot of tables – Ibrahim Hasanov Apr 11 '16 at 15:33