8

I was asked to make some kind of reporting (logging) service. The employee has locally installed web application (just some dynamic website, written in PHP) in many companies. This web app is some kind of survey. All data is saved on local database, but now the requirement is that this data (result of survey) will also be sent to central server after every form submit.

There are four types of surveys. They have organised it this way, that there are many Projects, and each Project can have only one Survey of each type (STI here?) and Survey belongs to one Project. Each Survey will receive a report from local app, so it will have many Reports. The Rails 3 app that logs this reports should mimic somehow this logic. First question is: does this AR structure make sense for you?

  Project-1--------1-Survey-1-------*-Report

  Project
    has_one :survey
    has_many :reports, :through => :survey

  Survey
    belongs_to :project
    has_many :reports

  Report
    belongs_to :survey

Second question is about having multiple tables for one AR Model. If all data will be stored in reports table, the table will become huge very quickly, and efficient querying for reports that belong to specific survey might be a problem after some time. Maybe it would be better to have separate tables for each Survey? Like reports_<survey_id>. Is this possible?

Also, I am somehow forced to use MySQL, but if there is another, much better solution for this, I could try to push it through.

If you are still here, thank you for reading this :)

Ernest
  • 8,701
  • 5
  • 40
  • 51

2 Answers2

5

Put an index on each of your foreign keys (e.g. Reports.survey_id) and take a breather. You're worrying entirely too much about the performance right now. You will need at least millions of records in your Reports table before you will see any performance problems from MySQL.

coreyward
  • 77,547
  • 20
  • 137
  • 166
  • Yes, maybe I am worrying too much. See my last comment to Anton's answer. Thanks again for your tips. – Ernest May 21 '11 at 20:33
5

Second question is about having multiple tables for one AR Model. If all data will be stored in reports table, the table will become huge very quickly, and efficient querying for reports that belong to specific survey might be a problem after some time. Maybe it would be better to have separate tables for each Survey? Like reports_. Is this possible?

Yes, it is possible.

You can do it this way:

class Report < AR::Base
  table_name_suffix = ''
end

Report.table_name_suffix = 'foo'

UPDATE


# simple example of sharding model

class Survey < AR::Base
  has_many :reports

  def shard_reports
    Report.table_name_suffix = "_survey_#{self.id}"
    returning(reports){ Report.table_name_suffix = "" }
  end

end

Survey.first.reports_shard
Anton
  • 1,401
  • 8
  • 12
  • Thank you, looks easy. I hope this will not cause any problems. – Ernest May 17 '11 at 12:27
  • Problems will be ;) Because, you should control this manually. And create new report table when you create survey – Anton May 17 '11 at 14:13
  • @Anton What? This sounds like a mess. See http://stackoverflow.com/questions/44145/database-sharding-and-rails – coreyward May 17 '11 at 19:23
  • @Anton So this would require to mess with raw SQL? – Ernest May 19 '11 at 11:22
  • @Ernest No this don't require raw SQL. This require select table before you manipulate of data. See update. – Anton May 19 '11 at 13:49
  • Dynamically creating tables for each survey is a *really* bad idea, and wholly unnecessary. – coreyward May 21 '11 at 19:41
  • I appreciate both answers. Even if you don't need/should to create (dynamically) many tables for this case (as coreyward wrote), I am choosing Anton's reply because it answers the stated problem in a topic (multiple tables, one AR Model, is it possible?). But probably I won't go this way myself now. Thank you. – Ernest May 21 '11 at 20:31
  • @coreyward I'm not saying that the tables should be created dynamically. You must do this manually – Anton May 22 '11 at 10:18
  • @Ernest Of course, sharding - an extreme case, when really a lot of data and have possible to make optimal use of shards. But I answered the question – Anton May 22 '11 at 10:23