14

I've been reading through some of guides on database optimization and best practices and a lot of them suggest not using boolean flags at all in the DB schema (ex http://forge.mysql.com/wiki/Top10SQLPerformanceTips). However, they never provide any reason as to why this is bad. Is it a peformance issue? is it hard to index or query properly?

Furthermore, if boolean flags are bad, what should you use to store boolean values in a database? Is it better to store boolean flags as an integer and use a bitmask? This seems like it would be less readable.

David Chanin
  • 533
  • 6
  • 17
  • 7
    Never take seriously anyone who will just tell you "Don't use boolean flags" or "Use Indexes" without any reason. – cherouvim Jun 15 '10 at 04:43
  • 1
    Seems like they don't provide reasons for *anything* on that page. – animuson Jun 15 '10 at 04:44
  • OK, to be fair these are notes from an event/camp. Still I don't know why true/false type of fields are bad. – cherouvim Jun 15 '10 at 04:45
  • I could see how having an index of a boolean field would be a really bad idea. – Omnifarious May 18 '11 at 07:35
  • @Omnifarious Why is that so bad idea? – Vladislav Rastrusny Sep 02 '11 at 18:22
  • 1
    @FractalizeR - It would be an index with two key values each mapping to hundreds of thousands of records. Or, if the distribution were lopsided, one key value mapping to a small number of records and the other to hundreds of thousands. The value of such an index is dubious at best. It also represents a degenerate case that I bet most indexing engines handle very poorly. I could see though, in the lopsided case, where being able to find all the records that have the unusual value might be helpful and faster with an index than a table scan. – Omnifarious Sep 02 '11 at 19:43
  • 2
    I am late to this posting, but I know one legitamate reason to not use a boolean would be cross-DB compatibility. For instance, I don't think MS SQL has a "boolean" type, only a BIT. There is also the indexing issue pointed out by @Omnifarious, but I am not sure if it matter if you are not indexing on the boolean. – CodeChimp Jan 11 '13 at 13:34

4 Answers4

7

The only reason I could think of would be cases where you should use ENUM instead. Sure, you only want true and false now, but if you'd want to add something else later than you'd need to do an ALTER TABLE operation, which could be very expensive.

Billy ONeal
  • 104,103
  • 58
  • 317
  • 552
  • 1
    Enums rock no matter how they are implemented (enum, varchar, int): http://www.mysqlperformanceblog.com/2008/01/24/enum-fields-vs-varchar-vs-int-joined-table-what-is-faster/ – cherouvim Jun 15 '10 at 04:49
  • 1
    @cherouvim: I don't see why BOOL types need be implemented any differently than ENUM types. They're just integers underneath. – Billy ONeal Jun 15 '10 at 04:52
  • @cherouvim: But it does have plain integers. Booleans and enums are simply wrappers around integers. Therefore there should be little if any performance difference between them. EDIT: This was in response to a now deleted comment. – Billy ONeal Jun 15 '10 at 04:57
  • @Billy: MySQL store ENUMs as bitfields (efficiently), but at the same time it stores BOOL fields as integers (inefficiently). BOOL in MySQL is just a synonym for TINYINT(1). – Vladislav Rastrusny Sep 01 '11 at 14:44
  • @FractalizeR: I don't see what you mean. At the end of the day they both end up as integers. (The smallest possible bitfield is an integer) – Billy ONeal Sep 01 '11 at 15:34
  • @Billy Ops, sorry, I misinterpreted documentation. – Vladislav Rastrusny Sep 01 '11 at 19:19
6

I don't think it is bad and I've never seen a reason stated for this either. Perhaps some old database engines couldn't store them efficiently, but modern ones do. As you say, it's a lot more readable to use booleans than bitmasks. See this question for a similar discussion: Is adding a bit mask to all tables in a database useful?

Community
  • 1
  • 1
EMP
  • 59,148
  • 53
  • 164
  • 220
1

My guess: portability of your design.

e.g.

  1. Microsoft Access treats boolean as -1 as true or 0 as false while other databases may treat boolean differently.

  2. In MySQL (version 4+) on the other hand, value of zero is considered false. Non-zero values are considered true.

Syd
  • 1,526
  • 1
  • 15
  • 16
0

Granted database practice has little to do with theory, I'll still attempt theoretical explanation. Tables are finite relations. Each relation is an extension of predicate. A Boolean attribute is a misnomer for a predicate.

Tegiri Nenashi
  • 3,066
  • 2
  • 19
  • 20
  • Well, may be my english or math is bad, but your post sounded like mumbo-jumbo to me ;) Also how boolean attribute differs from any other non-boolean attribute, that can have only two values? Also - boolean attributes does not form any relations in the same way as integers does not, I think. – Vladislav Rastrusny Sep 01 '11 at 14:49
  • The analogs of boolean TRUE and FALSE are TABLE_DEE and TABLE_DUM. There is nothing wrong with the idea of nested relations, of course, but the other point of view is that boolean values are already present in RDBMS, even if there is no explicit boolean domain with boolean values. – Tegiri Nenashi Sep 25 '12 at 00:56