MySQL Best Practice: Forget ALTER, use RENAME instead

MySQL is notorious for a few bummer scenarios. One of them is when you have a table with a couple hundred thousand rows or more and an indexed column or two, and you’ve been asked to add a column to meet a data model requirement. You know this will cause an outage or at least a headache while MySQL copies the table contents into the new structure.

One way to handle it is to perform the atomic ALTER manually. Instead of using the ALTER TABLE command, follow these basic steps:

  1. Identify the objective of the ALTER and determine that it’s necessary.
  2. Get a picture of what current table structure is and how many rows you’re dealing with.
  3. Next, create a new table that includes the changes that would be implemented in your ALTER TABLE table_name statement.
  4. Next, write a script to loop through your new table and your old table and compare values row by row. This will come in handy later.
  5. Then, copy a few rows at a time from a set that are least likely to change to your new table. Tip: this is where those “last_updated DATETIME” columns come in handy.
  6. Once you’ve got all the old rows copied, take your service offline for maintenance, and copy the balance of rows.
  7. Since all the data has been copied, replacing your old structure is as simple as renaming your old table to table_BACKUP and your new table to table_name.
  8. Test, Test, Test and bring your service back online!

Now, let’s break it down:

1. Identify the objective of the ALTER and determine that it’s necessary. In this case, I want to repartition. Why do an alter???

#Adding partitions...
ALTER TABLE dim_datetime 
REORGANIZE PARTITION (P5) INTO 
PARTITION P5 VALUES LESS THAN (2011), 
PARTITION P6 VALUES LESS THAN (MAXVALUE);

2. Get a picture of what current table structure is and how many rows you’re dealing with.

mysql> SHOW CREATE TABLE dim_datetime\G
*************************** 1. row ***************************
       Table: dim_datetime
Create Table: CREATE TABLE `dim_datetime` (
  `date_key` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `date_datetime` datetime NOT NULL,
  `day_name` enum('SUNDAY','MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY','SATURDAY') DEFAULT NULL,
  `month_name` enum('JANUARY','FEBRUARY','MARCH','APRIL','MAY','JUNE','JULY','AUGUST','SEPTEMBER','OCTOBER','NOVEMBER','DECEMBER') DEFAULT NULL,
  `quarter` enum('Q1','Q2','Q3','Q4') DEFAULT NULL,
  `is_weekend` tinyint(3) unsigned DEFAULT NULL,
  `is_holiday` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`date_key`,`date_datetime`),
  UNIQUE KEY `date_datetime` (`date_datetime`)
) ENGINE=InnoDB AUTO_INCREMENT=219000 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (YEAR(date_datetime))
(PARTITION P1 VALUES LESS THAN (2007) ENGINE = InnoDB,
 PARTITION P2 VALUES LESS THAN (2008) ENGINE = InnoDB,
 PARTITION P3 VALUES LESS THAN (2009) ENGINE = InnoDB,
 PARTITION P4 VALUES LESS THAN (2010) ENGINE = InnoDB,
 PARTITION P5 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.01 sec)
mysql> SELECT COUNT(*) FROM dim_datetime;
+----------+
| COUNT(*) |
+----------+
|   218999 |
+----------+
1 row in set (0.46 sec)

3. Next, create a new table that includes the changes that would be implemented in your ALTER TABLE table_name statement.

CREATE TABLE `dim_datetime_NEW` (
  `date_key` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `date_datetime` datetime NOT NULL,
  `day_name` enum('SUNDAY','MONDAY','TUESDAY','WEDNESDAY','THURSDAY','FRIDAY','SATURDAY') DEFAULT NULL,
  `month_name` enum('JANUARY','FEBRUARY','MARCH','APRIL','MAY','JUNE','JULY','AUGUST','SEPTEMBER','OCTOBER','NOVEMBER','DECEMBER') DEFAULT NULL,
  `quarter` enum('Q1','Q2','Q3','Q4') DEFAULT NULL,
  `is_weekend` tinyint(3) unsigned DEFAULT NULL,
  `is_holiday` tinyint(3) unsigned DEFAULT NULL,
  PRIMARY KEY (`date_key`,`date_datetime`),
  UNIQUE KEY `date_datetime` (`date_datetime`)
) ENGINE=InnoDB AUTO_INCREMENT=219000 DEFAULT CHARSET=latin1
/*!50100 PARTITION BY RANGE (YEAR(date_datetime))
(PARTITION P1 VALUES LESS THAN (2007) ENGINE = InnoDB,
 PARTITION P2 VALUES LESS THAN (2008) ENGINE = InnoDB,
 PARTITION P3 VALUES LESS THAN (2009) ENGINE = InnoDB,
 PARTITION P4 VALUES LESS THAN (2010) ENGINE = InnoDB,
 PARTITION P5 VALUES LESS THAN (2011) ENGINE = InnoDB,
 PARTITION P6 VALUES LESS THAN (2012) ENGINE = InnoDB,
 PARTITION P7 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
;

4. Next, write a script to loop through your new table and your old table and compare values row by row. This will come in handy later.

SELECT d1.*, d2.* 
FROM dim_datetime d1  
LEFT JOIN dim_datetime_NEW d2 ON (d1.date_key = d2.date_key) 
WHERE (d1.date_datetime != d2.date_datetime) 
OR (d2.date_datetime IS NULL)
OR (d1.day_name != d2.day_name)  
OR (d2.day_name IS NULL)
OR (d1.month_name != d2.month_name) 
OR (d2.month_name IS NULL) 
OR (d1.quarter != d2.quarter)  
OR (d2.quarter IS NULL)
OR (d1.is_weekend != d2.is_weekend)  
OR (d2.is_weekend IS NULL)
OR (d1.is_holiday != d2.is_holiday) 
OR (d2.is_holiday IS NULL);

5. Then, copy a few rows at a time from a set that are least likely to change to your new table. Tip: this is where those “last_updated DATETIME” columns come in handy.

mysql> INSERT INTO dim_datetime_NEW SELECT * FROM dim_datetime WHERE date_datetime < '2009-01-01 00:00:00';
Query OK, 26303 rows affected (1.35 sec)
Records: 26303  Duplicates: 0  Warnings: 0
-- leave out a couple years for this example.....
mysql> INSERT INTO dim_datetime_NEW SELECT * FROM dim_datetime WHERE date_datetime > '2010-12-31 23:59:59';
Query OK, 175176 rows affected (8.11 sec)
Records: 175176  Duplicates: 0  Warnings: 0

6. Once you’ve got all the old rows copied, take your service offline for maintenance, and copy the balance of rows.

mysql> INSERT INTO dim_datetime_NEW SELECT * FROM dim_datetime WHERE date_datetime BETWEEN '2009-01-01 00:00:00' AND '2010-12-31 23:59:59';
Query OK, 17520 rows affected (2.22 sec)
Records: 17520  Duplicates: 0  Warnings: 0

Remember that script you wrote back in step 4?

mysql> SELECT d1.*, d2.* 
    -> FROM dim_datetime d1  
    -> LEFT JOIN dim_datetime_NEW d2 ON (d1.date_key = d2.date_key) 
    -> WHERE (d1.date_datetime != d2.date_datetime) 
    -> OR (d2.date_datetime IS NULL)
    -> OR (d1.day_name != d2.day_name)  
    -> OR (d2.day_name IS NULL)
    -> OR (d1.month_name != d2.month_name) 
    -> OR (d2.month_name IS NULL) 
    -> OR (d1.quarter != d2.quarter)  
    -> OR (d2.quarter IS NULL)
    -> OR (d1.is_weekend != d2.is_weekend)  
    -> OR (d2.is_weekend IS NULL)
    -> OR (d1.is_holiday != d2.is_holiday) 
    -> OR (d2.is_holiday IS NULL);
Empty set (10.94 sec)

That’s what you’re after.
7. Since all the data has been copied, replacing your old structure is as simple as renaming your old table to table_BACKUP and your new table to table_name.

mysql> RENAME TABLE dim_datetime TO dim_datetime_ORIG;
Query OK, 0 rows affected (0.06 sec)
mysql> RENAME TABLE dim_datetime_NEW TO dim_datetime;
Query OK, 0 rows affected (0.03 sec)
mysql> COMMIT;
Query OK, 0 rows affected (0.00 sec)

8. Test, Test, Test and bring your service back online!

Had we run the original ALTER, the minimum impact would have been:

SET AUTOCOMMIT = 0;
ALTER TABLE dim_datetime_ORIG 
REORGANIZE PARTITION P5 INTO (
PARTITION P5 VALUES LESS THAN (2011), 
PARTITION P6 VALUES LESS THAN (2012), 
PARTITION P7 VALUES LESS THAN (MAXVALUE)
);
Query OK, 183936 rows affected (9.75 sec)
Records: 183936  Duplicates: 0  Warnings: 0

Of course this example would have only caused a read lock for < 10 seconds, but even so running in a production environment, a table lock could cause an outage, or deadlock or whatever. Using RENAME is much quicker and easier to recover from than ALTER.