1

Info: I am using AWS RDS Mysql 5.6.34 (500GB) instance (Without replica, just the master)

Note: Binlog is enabled & set to Row

Target: Modify a column field_1 from enum to tinyint

Extra info: I am using Rails application. So everytime I wanted to add a value to the enum, I need to write a migration. So converting the enum field to tinyint so I can add or delete a enum value without writing a migration using Active Enum

Other info: I also tried LHM but the RDS instance went out of memory at 93%

Database info before running gh-ost:

mysql> select table_schema, sum((data_length+index_length)/1024/1024) AS MB from information_schema.tables group by 1;
+--------------------+-----------------+
| table_schema       | MB              |
+--------------------+-----------------+
| information_schema |      0.00976560 |
| mysql              |      5.96277428 |
| performance_schema |      0.00000000 |
| my_app_db          | 223941.79882818 |
+--------------------+-----------------+

Size of the original table before gh-ost: (showing only the table needed to be modified from the list)

mysql> SELECT table_name AS `Table`, round(((data_length + index_length) / 1024 / 1024), 2) `Size (MB)` FROM information_schema.TABLES WHERE table_schema = "my_app_db";
+----------------------------------------+-----------+
| Table                                  | Size (MB) |
+----------------------------------------+-----------+
| table_abc                              |     70.41 |
| my_table                               |  86058.73 |

Started the migration:

gh-ost \
--user="user" \
--password="password" \
--host="my-endpoint.rds.amazonaws.com" \
--database="my_app_db" \
--table="my_table" \
--alter="MODIFY field_1 TINYINT(2) DEFAULT 1 NOT NULL" \
--assume-rbr \
--allow-on-master \
--verbose \
--execute

When the migration was nearly 93% complete, the RDS free-memory went down to 20GB. So I stopped gh-ost.

Database info after stopping gh-ost:

mysql> select table_schema, sum((data_length+index_length)/1024/1024) AS MB from information_schema.tables group by 1;
+--------------------+-----------------+
| table_schema       | MB              |
+--------------------+-----------------+
| information_schema |      0.00976560 |
| mysql              |      5.96277428 |
| performance_schema |      0.00000000 |
| my_app_db          | 446299.17968756 |
+--------------------+-----------------+

Size of the original table after stopping gh-ost:

mysql> SELECT table_name AS `Table`, round(((data_length + index_length) / 1024 / 1024), 2) `Size (MB)` FROM information_schema.TABLES WHERE table_schema = "my_app_db";
+----------------------------------------+-----------+
| Table                                  | Size (MB) |
+----------------------------------------+-----------+
| _my_table_ghc                       |      0.41 |
| _my_table_gho                       | 273157.00 |
| my_table                            |  85011.62 |

Question:

Why is the gh-ost table many times larger than the original table?

If more info is required about the table, index or database, I can provide :)

This is the link to the issue created in gh-ost repo: https://github.com/github/gh-ost/issues/890

  • Have you checked tabke defintions and record counts in the two tables? – Shadow Nov 01 '20 at 09:59
  • What happens if you use `OPTIMIZE TABLE table_name`? – O. Jones Nov 01 '20 at 11:50
  • @O.Jones Optimizing locks the table. I have a table with 230 million records. So optimizing will take a lot of time & I can't do that. – Dineshkumar Raman Nov 02 '20 at 06:54
  • @Shadow Yes. The table definitions are identical. Records count in ghost/copy table is lesser because only 93% of the records were copied before I cancelled the copying process. Records count in original table: 229699180 Records count in copy table: 175215000 I have also checked the words count https://github.com/github/gh-ost/issues/890#issuecomment-720231952 – Dineshkumar Raman Nov 02 '20 at 07:10

1 Answers1

1

I created the mysql DB from a backup of production db.

Production had innodb_file_format parameter as Barracuda

The new environment had innodb_file_format parameter as Antelope

The ROW_FORMAT for the table in production was COMPRESSED

Unfortunately Antelope db doesn't support ROW_FORMAT as COMPRESSED

If I had looked more into the details of information_schema, I could have resolved it eatlier!