14

I've been working mostly with Oracle for the past few years, and am quite used to seeing single character varchar columns used as boolean values.

I can also see (per stack overflow answers), that suggested type for MySQL is TINYINT.

Now I've taken on my little side project - using DerbyDB, and it supports BOOLEAN columns, but not until after version 10 or so.

So, the question is, why is it so hard to incorporate a BOOLEAN column while designing a relational database? Am I missing something, or is it just pushed down the to-do list as unimportant, since you can use another column type meanwhile?

pafau k.
  • 1,667
  • 12
  • 20
  • And to add to your list SQL Server only supports `bit` rather than the `ANSI` boolean datatype. – Martin Smith Jul 15 '12 at 13:33
  • yeah and with all that you made me pretty excited about the quantic bit implementation in databases... – Sebas Jul 15 '12 at 19:00
  • This debate was fought in C++ and other languages and won by `bool` proponents a long time ago and most DBMSes support it in one way or another too. Unfortunately, some are still lagging behind, most notably Oracle. Frankly, all explanations I heard so far as to why don't make a whole lot of sense to me... – Branko Dimitrijevic Jul 15 '12 at 19:16
  • 2
    +1; a very good question. I would like to see an authoritative answer to this. – Carl Manaster Jul 15 '12 at 19:17

4 Answers4

14

In the case of Derby, specifically, the answer is a bit of strange history: Derby, the open source database, was once called Cloudscape, and was a proprietary product. At that time, it fully supported BOOLEAN.

Subsequently, Cloudscape was purchased by Informix which was purchased by IBM, and IBM engineering decided to make Derby compatible with DB2. The reason for this was that, if the two databases were compatible, it would be easier for users to migrate their applications between Derby databases and DB2 databases. The engineering staff, however, did not remove the non-DB2-compatible features from Derby, they simply disabled them in the SQL grammar, leaving most of the implementation in place.

Subsequently, IBM open-sourced Cloudscape to the Apache Software Foundation, naming it Derby. The open source community, no longer bound by the requirement that Derby be completely compatible with DB2, decided to revive the BOOLEAN datatype support. And so Derby now has BOOLEAN datatype support.

Bryan Pendleton
  • 16,128
  • 3
  • 32
  • 56
7

Tom Kyte pretty much echoes your last sentence in this blog entry:

"It just isn't a type we have -- I can say no more and no less. ANSI doesn't have it -- many databases don't have it (we are certainly not alone). In the grand scheme of things -- I would say the priotization of this is pretty "low" (thats my opinion there)."

He's speaking from the Oracle perspective, but it applies to any relational RDBMS.

Priyank Doshi
  • 12,895
  • 18
  • 59
  • 82
DCookie
  • 42,630
  • 11
  • 83
  • 92
  • 4
    Which for me translates to: "We don't give a shit about developers, since they can't switch products anyway" – Jens Schauder Jul 15 '12 at 18:08
  • @JensSchauder: to me it also translates to "*We don't care what the ANSI standard says*". Tom's statement is from 2002 and the `BOOLEAN` type was introduced in SQL99. –  Jul 16 '12 at 07:41
3

PostgreSQL does have support for boolean for as long as I can think.

The oldest online doc I can find is for version 6.3 released 1998-03-01. They mention the boolean type:

http://www.postgresql.org/docs/6.3/static/c0805.htm

In later docs they mention SQL99 as the standard they follow.

Since SQL99 seems to mention this type I would assume, that many DBs did have support for that type quite well before 1999.

A.H.
  • 63,967
  • 15
  • 92
  • 126
0

I don't know as I haven't designed one, but my guess would be that since RDBMS's are about describing and storing sets of things, boolean fields aren't needed because they would also denote what is in a set, but they are extraneous as the membership of sets will be derived from the actual data or structure of the database.

As an example, take a boolean column for roles given to employees where they're either managers, or they're not. You could use a boolean column to describe this, but what you should do is either have a table for managers, and a table for non managers, or (and this would be the more flexible and probably more manageable way) create an extra "look up" table that gives roles (as a single text column) and and key that is then referred to (a foreign key) in the employees table.


I think I should add that most times you see a boolean field in a table it's a code smell, as it will may hit performance - to use a boolean in a where clause would invoke a table scan and make an indexes on the table fairly pointless (but see the comments for a further discussion of this). I'd hazard another guess that boolean data types have been added to most RDBMS's for use in their procedural language extensions (T-SQL, PLSQL) to help with the odd conditional statement that's required.

ian
  • 12,003
  • 9
  • 51
  • 107
  • 2
    A boolean in a where clause does *not* necessarily mean that a table scan will be used. That depends on how many rows the condition will return (as with any other condition) –  Jul 15 '12 at 14:58
  • @a_horse_with_no_name That's interesting, do you have useful link on that? – ian Jul 15 '12 at 15:03
  • @lain: not really but it's the general rule for index usage (non covering indexes): once the number of rows retrieved from a table exceeds a certain threshold the table scan is faster than the index scan. Consequently if the number of rows is small enough the index scan will be used. It doesn't matter what datatype the indexed column has. –  Jul 15 '12 at 16:49
  • 1
    In real world applications booleans or similar are pretty useful. Any user interface where there is a checkbox and you need to store the value the user entered for example. – Martin Smith Jul 15 '12 at 16:50
  • @Iain In the case of your "look up table" suggestion, the table with the roles would just have a boolean value anyway since the role is either manager or employee. – Michael Mior Jul 15 '12 at 19:04
  • @MichaelMior yes, but as I pointed out it would be more flexible as it can be extended easily. – ian Jul 15 '12 at 21:33
  • @Jens Schauder, that's interesting, but I'm unsure of your point, as according to the article it would produce a poor cardinality estimate. – ian Jul 15 '12 at 21:33
  • @MartinSmith Perhaps, but a checkbox should be a result or a null, as you can't tell if a user has actually made a choice when leaving a checkbox unticked. A radio button might be more appropriate if it only had two choices that must be selected. – ian Jul 15 '12 at 21:44
  • @lain it only produces a low cardinaltiy if it used alone. if used in combination with other conditions it will increase the cardinality by a factor of 2. Although I agree this might not be obvious from the article. – Jens Schauder Jul 16 '12 at 06:29
  • Due to histograms, an index might even get used with a single boolean condition: http://www.dba-oracle.com/art_otn_cbo_p4.htm – Jens Schauder Jul 16 '12 at 06:31
  • If a unchecked checkbox is actually a decision or the absence of a decision is a question of UI-design, psychology and philosophy. But it shouldn't cause you to introduce superfluent NULL values in your database – Jens Schauder Jul 16 '12 at 06:34
  • @a_horse_with_no_name thanks for clarifying, I updated the answer to take into account what you wrote. – ian Jul 16 '12 at 07:38
  • @JensSchauder thanks for adding that, although it was already covered by a_horse_with_no_name you provided some interesting (though Oracle centric) links. I do, however, disagree that a null would be superfluous in the case of absence of a decision if you've used a checkbox. Nulls are for unknowns. As my answer (and comment) points out, I would advocate a design where this isn't the case. – ian Jul 16 '12 at 07:42
  • @MichaelMior in case you felt I was saying that the boolean fields should be moved to a separate table (which was not my intention) I've clarified the answer. – ian Jul 16 '12 at 07:44