5

I'm looking at file upload gems and there seems to be a tendency to put all assets in single "Assets" table and using STI to subclass them. Like ImageAsset, VideoAsset, AudioAsset, etc.

I'm new to Rails and I've never used STI. Previously I would have just made images, videos, audios separate tables. Granted they may share a few columns but I'd say they'll also have some different ones (sample rate doesn't apply to image, e.g.)

Pros of stuffing all this in one "assets" table: easier to run queries on all assets. Cons: table is going to get bigger quicker. I guess I could always shard on the "type" column if that's a problem. Also I anticipate all the audio-only columns will be null on image rows, etc.

My whole app is going to be based around assets so I want to make sure I have the pros and cons straight before making a decision. Has anyone done STI assets and regretted it? Has anyone done it and not regretted it (for a large amount of assets)? Would CTI (class-table inheritance) be a better solution here?

aspect
  • 53
  • 3

1 Answers1

7

I'd say that one of the biggest cons of using STI is that if you ever add a column to the table that is not shared (and that it means the same thing) between ALL models of the STI, then you've just blown your data integrity right there.

Null fields in (relational) databases are something that generally causes more problems than they solve.

I've beene bitten by this a couple of times, and it gets especially frustrating when the classes in your STI relation start having subclasses of their own, which in turn add even more columns to the table.

I'd say if you want to make this structure as good as possible, I really think CTI is a much better alternative, even though it can be a bit tricky to get it to work with rails. It's a lot trickier than STI at least.

At the top of my head I can just come to think of one scenario where STI might actually be reasonable, and that's when you're dealing with transaction models (for example deposits and withdrawals from a bank account or so). In those cases, both models are essentially the same, apart from the "direction" of the transaction.

It can also be argued that STI is "good" for fast prototyping, and if you just want to put something together fast just to see whether it works at all, you could use STI, but as soon as you start adding columns that doesn't make sense to all models in the relation, you should probably refactor it into CTI or something else.

Frost
  • 11,121
  • 3
  • 37
  • 44
  • The problem is that prototypes have the annoying habit of turning into real "mission critical" applications without the recommended re-architecting and such. I tend to think that STI is almost always a mistake, you shouldn't build your foundations out of sand. – mu is too short Jan 16 '12 at 18:56
  • @muistooshort I know that story! Logistically this could be a tough change to make on a busy site with large tables. Otherwise, STI => CTI does seem like a pretty straight forward change though? I am leaning to using STI because there doesn't seem to be a good strong CTI solution out there. Also, my queries are mostly on the "base class", only specializing when rendering views (eg. use ` – aspect Jan 16 '12 at 19:56
  • I will say though. STI still bugs me. If these were not persisted objects I would avoid inheritance. I would just have them implement a common interface. For example they may all respond_to `#url`. And, to handle the "hand-sorted collection" of mixed types of assets, no problem, that would just be a matter of an array in Ruby. – aspect Jan 16 '12 at 20:11
  • Martin to avoid the NULL columns issues, what do you think about STI with a PostgreSQL hstore column for the subclass attributes? It wouldn't suit every case, but I think it could be OK for me. If you aren't familiar with hstore, it's a hashmap/key-value column type that you can index on and query on. I believe Rails even deserializes hstore columns too. – aspect Jan 16 '12 at 20:23
  • @aspect That does sound slightly better, but I would probably just avoid STI all together anyway. That might be because I've felt the pain it can cause (and my boss at the time told me we didn't have the time to refactor it properly). As mu is too short said, prototypes tend to turn mission critical, which is bad. When they contain STI, they're generally just bombs waiting to go off. – Frost Jan 16 '12 at 22:41
  • I remember http://mediumexposure.com/multiple-table-inheritance-active-record/ to be very interesting when I read it, while we're on the subject. – Frost Jan 16 '12 at 22:46
  • Would you still avoid STI if the main queries are: "Get all assets for user": `select * from assets where user_id = x;` and "Get a hand-positioned arrangement of assets": `select a.* from arrangements s join assets a on s.asset_id = a.id where s.id = ? order by s.position asc;`? It seems like breaking up into separate tables would make those queries difficult. I guess I could UNION them into an SQL view that looks like an STI table, then make the arrangements table use a polymorphic association `arrangeable_id, arrangeable_type` instead of `asset_id`? Is that right, and is it better than STI? – aspect Jan 16 '12 at 23:17
  • That does seem a bit tedious, but I would still try to avoid STI. I'd basically ask myself this question: Do they have the same data model, and are they that closely related? If so, then I *might consider* using STI. Otherwise, no, I would avoid it like the plague. – Frost Jan 16 '12 at 23:26
  • muchas gracias both for all of your input – aspect Jan 17 '12 at 00:00