2

Recently a team I support encountered an issue with a value which they were storing in (and retrieving from) a MySQL database. They tell me that they were storing a comma delimited value in a (VARCHAR(255)?) column.

They reported that their value was being silently truncated (their code is mostly Ruby, some other processing is done in Scala) and asked us to double the size of that column.

Naturally I suggested that they normalize ... store each of these values (formerly comma-delimited) in separate rows and use a SELECT to fetch the set (which formerly would have been a single row). They pushed back and insist that they never want these separated values for any other sort of database query and that they'll never need the field to be wider than the new (512) width.

I then suggested that they at least add a sentinel marking the end of their string (a trailing semicolon in particular) and check for that on all fetches. They have implemented that and are happy with the approach.

My questions:

  • How did their data get silently truncated? Is there some setting in MySQL or in their DB drivers for Ruby that's suppressing the error?
  • What are the pros and cons of this sentinel terminated value vs. a more normalized approach?
  • Is there some term of art or nickname for this sentinel terminated value?
  • What would be a better way to address their issue?
Jim Dennis
  • 17,054
  • 13
  • 68
  • 116

2 Answers2

1

It's actually pretty easy to "miss" truncation in MySQL, but also pretty easy to get it to sternly notify you instead of trying to be helpful.

From the MySQL manual:

If strict SQL mode is not enabled and you assign a value to a CHAR or VARCHAR column that exceeds the column's maximum length, the value is truncated to fit and a warning is generated. For truncation of nonspace characters, you can cause an error to occur (rather than a warning) and suppress insertion of the value by using strict SQL mode. See Section 5.1.6, “Server SQL Modes”.

As the manual indicates, you can set one of MySQL's several "strict" modes to cause an error to be raised in this situation (and many others that would otherwise cause silent truncation or replacement values).

John Flatness
  • 32,469
  • 5
  • 79
  • 81
1

If you don't have strict mode enabled in your database, MySQL will truncate strings that don't fit in a varchar(n) column and produce only a warning about the truncation. If you enable strict mode, then you will get an error (and more peace of mind) in such cases.

The big problem with your sentinel value is that they're only checking it on reads. That means that they can (and certainly will) end up storing invalid/broken data but they won't know about the problem until the try to unpack that data; but, when they're unpacking the data, it will be too late to recover what's been lost.

There are various better approaches:

  1. Properly normalized tables.
  2. Enable strict mode in MySQL.
  3. Pre-insert or pre-update length validation.
  4. If they're actually using Rails, then perhaps they could use serialize to convert the data to/from (cringe) YAML automatically; they'd have to use a larger TEXT column type in this case and they'd still have a truncation problem.

So 2 and 3 should be done immediately, 1 would be best, 4 might be a middle-ground option if they're afraid of (or don't understand) normalization.

mu is too short
  • 426,620
  • 70
  • 833
  • 800
  • Is there a way to enable a strict (warning check mode?) in the Ruby drivers? How would one access the SHOW WARNINGS functionality of the MySQL shell from within Ruby (over a DB connection)? – Jim Dennis Apr 25 '12 at 02:28
  • @JimDennis: That depends on how they're connecting to MySQL; AFAIK you just need to get a `SET sql_mode = ...` to MySQL before you do anything with the connection. I think putting the setting in the MySQL config would be best, that way the "some other processing is done in Scala" couldn't cause problems. – mu is too short Apr 25 '12 at 02:34