// you’re reading...

MySQL

Finding duplicate rows in MySQL

I saw this question online reading “How do I identify duplicate rows in MySQL” and it’s been gnawing at me so here’s my approach:

Make a duplicate of your subject matter

CREATE TABLE a2 LIKE a1;
INSERT INTO a2 SELECT * FROM a1;

Add a ‘checksum’ column to the new table

ALTER TABLE a2 ADD checksum VARCHAR(100) DEFAULT NULL;

Iterate your rows, and create a checksum of the data

You can use any language here, so the basic pseudo code is:

QUERY ALL ROWS
FOR EACH ROW
    CREATE A STRING CONTAINING ALL VALUES
    GENERATE A CHECKSUM VALUE OF STRING
    UPDATE TABLE SETTING checksum COLUMN EQUAL TO CHECKSUM VALUE WHERE id MATCHES CURRENT ROW

Query for matching checksums

SELECT COUNT(*) AS qty, checksum 
FROM A2
GROUP BY checksum HAVING qty > 1; 

That should produce a report of all duplicate rows.

Discussion

No comments for “Finding duplicate rows in MySQL”

Post a comment

Help support my site and buy a domain name at http://domainsemailhosting.com/

%d bloggers like this: