I saw this question online reading “How do I identify duplicate rows in MySQL” and it’s been gnawing at me so here’s my approach:
Make a duplicate of your subject matter
CREATE TABLE a2 LIKE a1; INSERT INTO a2 SELECT * FROM a1;
Add a ‘checksum’ column to the new table
ALTER TABLE a2 ADD checksum VARCHAR(100) DEFAULT NULL;
Iterate your rows, and create a checksum of the data
You can use any language here, so the basic pseudo code is:
QUERY ALL ROWS FOR EACH ROW CREATE A STRING CONTAINING ALL VALUES GENERATE A CHECKSUM VALUE OF STRING UPDATE TABLE SETTING checksum COLUMN EQUAL TO CHECKSUM VALUE WHERE id MATCHES CURRENT ROW
Query for matching checksums
SELECT COUNT(*) AS qty, checksum FROM A2 GROUP BY checksum HAVING qty > 1;
That should produce a report of all duplicate rows.
Discussion
No comments for “Finding duplicate rows in MySQL”