Sometimes, you’ll have to reprocess data. Either because source data was incomplete at process time or there was a change requiring the reprocess. In these cases, you may need to perform an INSERT INTO SELECT FROM statement. Often when you do this, you’ll run into duplicate key errors. To overcome this, you’ll need to add ON DUPLICATE KEY UPDATE syntax to the end of your insert. The twist is that you’ll have to summarize the column values between the original table and the updates table. Problem is, that MySQL hates to do this between tables with the same column names. I’ll show you how to stop pulling your hair out.
The Problem – Error 1052
It seems like a simple deal. According to MySQL’s documentation, you alias the table and column values. This doesn’t actually work as one would expect as mysql> will return:
ERROR 1052 (23000): Column 'xxxxxxxxx' in field list is ambiguous -- substitute your field name here...
I’ve purposely omitted the typical erroneous INSERT statement construction to prevent confusion.
The Solution
Given your tables are:
CREATE TABLE tbl_original ( source_id INT UNSIGNED NOT NULL, fname VARCHAR(10), total_values INT UNSIGNED NOT NULL, created_datetime DATETIME NOT NULL, PRIMARY KEY(source_id,created_datetime)); CREATE TABLE tbl_updates ( source_id INT UNSIGNED NOT NULL, fname VARCHAR(10), total_values INT UNSIGNED NOT NULL, created_datetime DATETIME NOT NULL, PRIMARY KEY(source_id,created_datetime));
Your insert statement would be:
INSERT INTO tbl_original SELECT source_id, fname, total_values, created_datetime FROM tbl_updates ON DUPLICATE KEY UPDATE tbl_original.total_values = tbl_original.total_values + VALUES(total_values);
Thus, your session should appear as….
mysql> CREATE TABLE tbl_original (
-> source_id INT UNSIGNED NOT NULL,
-> fname VARCHAR(10),
-> total_values INT UNSIGNED NOT NULL,
-> created_datetime DATETIME NOT NULL,
-> PRIMARY KEY(source_id,created_datetime));
Query OK, 0 rows affected (1.41 sec)
mysql> CREATE TABLE tbl_updates (
-> source_id INT UNSIGNED NOT NULL,
-> fname VARCHAR(10),
-> total_values INT UNSIGNED NOT NULL,
-> created_datetime DATETIME NOT NULL,
-> PRIMARY KEY(source_id,created_datetime));
Query OK, 0 rows affected (0.20 sec)
mysql> INSERT INTO tbl_original VALUES (25,'No Matter',100,'2010-10-27 17:00:00');
Query OK, 1 row affected (0.26 sec)
mysql> INSERT INTO tbl_updates VALUES (25,'No Matter',100,'2010-10-27 17:00:00');
Query OK, 1 row affected (0.03 sec)
mysql> SELECT * FROM tbl_original;
+-----------+-----------+--------------+---------------------+
| source_id | fname | total_values | created_datetime |
+-----------+-----------+--------------+---------------------+
| 25 | No Matter | 100 | 2010-10-27 17:00:00 |
+-----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM tbl_updates;
+-----------+-----------+--------------+---------------------+
| source_id | fname | total_values | created_datetime |
+-----------+-----------+--------------+---------------------+
| 25 | No Matter | 100 | 2010-10-27 17:00:00 |
+-----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)
mysql> INSERT INTO tbl_original
-> SELECT
-> source_id,
-> fname,
-> total_values,
-> created_datetime
-> FROM tbl_updates
-> ON DUPLICATE KEY UPDATE
-> tbl_original.total_values = tbl_original.total_values + VALUES(total_values);
Query OK, 2 rows affected (0.11 sec)
Records: 1 Duplicates: 1 Warnings: 0
mysql> SELECT * FROM tbl_original;
+-----------+-----------+--------------+---------------------+
| source_id | fname | total_values | created_datetime |
+-----------+-----------+--------------+---------------------+
| 25 | No Matter | 200 | 2010-10-27 17:00:00 |
+-----------+-----------+--------------+---------------------+
1 row in set (0.00 sec)
Tweet



Discussion
No comments for “MySQL Insert On Duplicate Update Madness”