Finding duplicate rows in MySQL

90% sarcasm, the other half is a learning experience

Search

// you’re reading...

MySQL

Finding duplicate rows in MySQL

By Randy ⋅ October 21, 2014 ⋅ Post a comment

Filed Under mysql, query, sql

I saw this question online reading “How do I identify duplicate rows in MySQL” and it’s been gnawing at me so here’s my approach:

Make a duplicate of your subject matter

CREATE TABLE a2 LIKE a1;
INSERT INTO a2 SELECT * FROM a1;

Add a ‘checksum’ column to the new table

ALTER TABLE a2 ADD checksum VARCHAR(100) DEFAULT NULL;

Iterate your rows, and create a checksum of the data

You can use any language here, so the basic pseudo code is:

QUERY ALL ROWS
FOR EACH ROW
    CREATE A STRING CONTAINING ALL VALUES
    GENERATE A CHECKSUM VALUE OF STRING
    UPDATE TABLE SETTING checksum COLUMN EQUAL TO CHECKSUM VALUE WHERE id MATCHES CURRENT ROW

Query for matching checksums

SELECT COUNT(*) AS qty, checksum 
FROM A2
GROUP BY checksum HAVING qty > 1;

That should produce a report of all duplicate rows.

Search

MySQL

Finding duplicate rows in MySQL

Make a duplicate of your subject matter

Add a ‘checksum’ column to the new table

Iterate your rows, and create a checksum of the data

Query for matching checksums

Like this:

Related

Discussion

No comments for “Finding duplicate rows in MySQL”

Post a comment

Categories

Pages

@randymelder

Search

MySQL

Finding duplicate rows in MySQL

Make a duplicate of your subject matter

Add a ‘checksum’ column to the new table

Iterate your rows, and create a checksum of the data

Query for matching checksums

Share:

Like this:

Related

Discussion

No comments for “Finding duplicate rows in MySQL”

Post a comment

Categories

Pages

Tags

@randymelder