You have a table with duplicate rows – somehow a unique index didn’t get
created and a bug has added duplicate records to your table. :(
Create the table:
Let's add some data now:
On doing a simple select query we get the result as:
The highlighted columns shows that there is duplicacy in the table.
Now to remove the duplicates and fix the indexing issue. We will use ALTER TABLE which will help me add the UNIQUE INDEX along with IGNORE keyword, which will help me fix the duplicates.
The official documentation of MySQL says:
Now let's apply the UNIQUE INDEX using ALTER TABLE command:
We get the output something like:
On doing a simple select query we get the result as:
The highlighted columns shows that the duplicate column is no longer in tabler.
And we have achieved the removal of duplicates from the table and also fixed the UNIQUE INDEX in the table to save ourselves from any further future accidents like these. :)
NOTE:
For some versions of MySQL, this solution does not work. This is basically due to engine being used:
In order to fix the same (Thanks to Kethan for the update):
Let us create a scenario:
Create the table:
CREATE TABLE `t1` (`yearcol` varchar(255) default NULL, `firstname` varchar(255) NOT NULL);
Let's add some data now:
insert into t1 values ('2010', 'ABC'); insert into t1 values ('2010', 'ABC'); insert into t1 values ('2010', 'XYZ'); insert into t1 values ('2010', 'PQR');
On doing a simple select query we get the result as:
The highlighted columns shows that there is duplicacy in the table.
select * from t1;
The result generated is:
yearcol | firstname |
---|---|
2010 | ABC |
2010 | ABC |
2010 | XYZ |
2010 | PQR |
Now to remove the duplicates and fix the indexing issue. We will use ALTER TABLE which will help me add the UNIQUE INDEX along with IGNORE keyword, which will help me fix the duplicates.
The official documentation of MySQL says:
IGNORE
is a MySQL extension to standard SQL. It controls howALTER TABLE
works if there are duplicates on unique keys in the new table or if warnings occur when strict mode is enabled. IfIGNORE
is not specified, the copy is aborted and rolled back if duplicate-key errors occur. IfIGNORE
is specified, only the first row is used of rows with duplicates on a unique key. The other conflicting rows are deleted. Incorrect values are truncated to the closest matching acceptable value.
Now let's apply the UNIQUE INDEX using ALTER TABLE command:
ALTER IGNORE TABLE t1 ADD UNIQUE INDEX dup_idx (year_col, firstname);
We get the output something like:
mysql> ALTER IGNORE TABLE t1 ADD UNIQUE INDEX dup_idx (yearcol, firstname); Query OK, 4 rows affected (0.14 sec) Records: 4 Duplicates: 1 Warnings: 0
On doing a simple select query we get the result as:
The highlighted columns shows that the duplicate column is no longer in tabler.
select * from t1;
The result generated is:
yearcol | firstname |
---|---|
2010 | ABC |
2010 | XYZ |
2010 | PQR |
And we have achieved the removal of duplicates from the table and also fixed the UNIQUE INDEX in the table to save ourselves from any further future accidents like these. :)
NOTE:
For some versions of MySQL, this solution does not work. This is basically due to engine being used:
In order to fix the same (Thanks to Kethan for the update):
ALTER TABLE t1 ENGINE MyISAM; ALTER IGNORE TABLE t1 ADD UNIQUE INDEX dup_idx (yearcol, firstname); ALTER TABLE t1 ENGINE InnoDB;
Good one.
ReplyDeleteI am getting the following error
ReplyDelete0 16:48:58 select * from t1 ALTER Error Code: 1064. You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER' at line 3
Hi Kethan,
DeleteI think you are missing the delimiter here.
Try:
select * from t1;
And for altering use: (If you are following the sample)
ALTER IGNORE TABLE t1 ADD UNIQUE INDEX dup_idx (year_col, firstname);
Nope I found the answer after some googling :)
DeleteThis was due to a bug in Innodb for some versions of mysql. The workaround is to convert the engine and then back to original.
ALTER TABLE t1 ENGINE MyISAM;
ALTER IGNORE TABLE t1 ADD UNIQUE INDEX dupidx (field);
ALTER TABLE t1 ENGINE InnoDB;
Yes, this is a bug with MySQL. Glad you got the answer.
DeleteThanks for this tutorial :)
DeleteThanks for sharing the error scenario, I have updated the blog accordingly.
DeleteThis comment has been removed by the author.
ReplyDeleteGood one, thanks Himanshu,Kethan
ReplyDelete