Cacomania: Remove duplicate values from a mysql table


Remove duplicate values from a mysql table

Guido Krömer - 25. October 2013 - Tags:

I needed a way to clean up a MySQL table containing many duplicate rows because of a missing UNIQUE KEY.

One solution would be a script which searches and removes those dups, but depending on the number of rows this would be a heavy lifting job. A better solution is performing this task completely in the database without moving the data into an application and the sanitized data back to the database.

You do not need any PL/SQL…, five simple SQL statements can handle the task. A new table with the structure of the table containing the dubs, with a unique index preventing the dups, has to be created. The rows from the old table has to be inserted into the new one, by using the IGNORE keyword the query will not stop when reaching a duplicate entry, those dups gets just ignored. The last step is replacing the new table with the old one.

CREATE TABLE `table_without_dups` LIKE `my_table`;

ALTER TABLE `table_without_dups`
ADD UNIQUE `my_unique_key` (`col_1`, `col_n`);

INSERT IGNORE INTO `table_without_dups`
SELECT * FROM `my_table`;

DROP TABLE `my_table`;
RENAME TABLE `table_without_dups` TO `my_table`;

I hope my small posting helped you, feel free to leave a comment.