-1

I am about to convert a series of MySQL databases to the utf8mb4 character set. Most of the databases are using the utf8 character set.

This article points out two potential problem areas that should be addressed prior to conversion:

  1. Character maximum lengths of string types are reduced (example: tinytext drops from 85 to 63 characters)
  2. Index key maximum character widths are reduced (191 down from 255) because maximum key lengths are 767 bytes. This maximum was raised to 3072 bytes in MySQL 5.7.7 according to this page but the situation would still occur if keys are larger than 768 characters.

Are there scripts or programs that exist that I can use to check the databases for these issues? I would like to find all of the potential problems prior to attempting the conversion.

Follow-up question: Are there any other scenarios that should be identified prior to a utf8mb4 character set conversion?

Alf47
  • 563
  • 1
  • 5
  • 12
  • Try it by yourself! – SaidbakR Jun 24 '20 at 18:07
  • 1
    find length of index in mysql: https://stackoverflow.com/questions/781873/how-to-figure-out-size-of-indexes-in-mysql – Luuk Jun 24 '20 at 18:28
  • for question 1, type something like "character max length mysql utf8mb4" in the beautiful search box on top of this page. – Luuk Jun 24 '20 at 18:30
  • @Luuk thanks for that. I did come across that post and the difference I believe is that is showing me the total size of an index on a table whereas I'd like to know what the maximum possible key size is for a row. I guess I can just join on information_schema.columns and sum up the total maximum character_octet_length for character columns and write some kind of case statement for other types then divide by 4...? – Alf47 Jun 24 '20 at 18:44

1 Answers1

1
  1. Dump the schema only via mysqldump --no-data
  2. Load that schema in another database or server.
  3. Write all the ALTERs that you will ultimately need.
  4. Performs the ALTERs. This will show you what problems you will have.

Other tips:

  • Don't use TINYTEXT for anything. Either use a VARCHAR or a bigger TEXT.
  • See this for 5 workarounds for the 767 problem you described: http://mysql.rjweb.org/doc.php/limits#767_limit_in_innodb_indexes
  • That link also provides a tip on creating the set of ALTER...CONVERT TO that you need.
  • When using CONVERT TO, keep in mind that it changes all the text columns; this may not be desired if you have some columns that are deliberately ascii such as postal_code, md5, uuid, etc.
Rick James
  • 135,179
  • 13
  • 127
  • 222