These are very common questions for all of us. This post explains how to prevent duplicates in MySQL when creating tables, how to find duplicates if we forgot to prevent them at table creation, and how to delete duplicate records if they exist.
Sometimes duplicates are allowed in a table; sometimes they should be restricted, like username and email in a user information table. Here I'm going to give some steps and procedures for preventing, finding, and deleting duplicates.
Open the MySQL client in a terminal
mysql -u[username] -p[password]
Create a schema
CREATE SCHEMA test;
Use the created schema
USE test;
Preventing Duplicates When the Table Is Created
You can apply a PRIMARY KEY or UNIQUE constraint to the appropriate fields to prevent duplicates when the table is created. Let's take the 'user_info' table as an example; I am going to explain these questions with this example table (user_info).
CREATE TABLE user_info ( username CHAR(50), email CHAR(100) );
This table allows duplicate entries, because we didn't use any constraint such as PRIMARY KEY or UNIQUE to prevent them.
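To see this for yourself, here is a minimal sketch. It assumes the unconstrained user_info table from the statement above exists and is still empty; the values are only sample data.

```sql
-- Both inserts succeed, because nothing marks (username, email) as unique.
INSERT INTO user_info (username, email) VALUES ('Stephen', 'email@example.com');
INSERT INTO user_info (username, email) VALUES ('Stephen', 'email@example.com');

-- Counts the stored copies of that row; on an initially empty table this is 2.
SELECT COUNT(*) AS copies
FROM user_info
WHERE username = 'Stephen' AND email = 'email@example.com';
```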
Use the DROP TABLE command to drop the table if you created 'user_info' with the CREATE TABLE statement above.
DROP TABLE user_info;
Create Table with PRIMARY KEY
CREATE TABLE user_info ( username CHAR(50) NOT NULL, email CHAR(100) NOT NULL, PRIMARY KEY (username, email) );
Or Create Table with UNIQUE Constraint
CREATE TABLE user_info ( username CHAR(50) NOT NULL, email CHAR(100) NOT NULL, UNIQUE (username, email) );
Both CREATE TABLE statements above, with a PRIMARY KEY or a UNIQUE constraint, prevent duplicate (username, email) pairs in the user_info table. PRIMARY KEY columns must be NOT NULL, because a primary key never allows NULL values. A UNIQUE constraint does allow NULLs, and MySQL permits several rows with NULL in a unique column, so declare those columns NOT NULL as well if NULLs could otherwise slip in as duplicates.
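Note that a key on (username, email) only rejects a repeated pair: the same username with a different email is still accepted. If each field should be unique by itself, one possible sketch is a separate UNIQUE constraint per column. The table name user_info_strict and the key names uq_username and uq_email are made up for this illustration.

```sql
-- Hypothetical variant of user_info: each column is unique on its own,
-- so neither a username nor an email can be reused.
CREATE TABLE user_info_strict (
  username CHAR(50)  NOT NULL,
  email    CHAR(100) NOT NULL,
  UNIQUE KEY uq_username (username),
  UNIQUE KEY uq_email (email)
);

INSERT INTO user_info_strict (username, email)
VALUES ('Stephen', 'email@example.com');

-- Rejected with a duplicate-entry error (1062): the username is already
-- taken, even though the email differs.
INSERT INTO user_info_strict (username, email)
VALUES ('Stephen', 'other@example.com');
```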
Use INSERT IGNORE instead of INSERT. When you use an INSERT IGNORE query to insert data into the user_info table, the row is inserted in the normal way if it doesn't exist already. If the row already exists, INSERT IGNORE tells the MySQL database engine to skip it quietly: no error is raised, and the duplicate is reported only as a warning.
mysql> INSERT IGNORE INTO user_info (username, email)
    -> VALUES ('Stephen', 'email@example.com');
Query OK, 1 row affected (0.03 sec)
mysql> INSERT IGNORE INTO user_info (username, email)
    -> VALUES ('Stephen', 'email@example.com');
Query OK, 0 rows affected (0.00 sec)
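INSERT IGNORE does not throw the duplicate away without a trace; it records a warning that you can read back. A small sketch, assuming user_info has the PRIMARY KEY or UNIQUE constraint on (username, email) and already holds the row being inserted:

```sql
-- The row already exists, so this statement inserts nothing.
INSERT IGNORE INTO user_info (username, email)
VALUES ('Stephen', 'email@example.com');

-- Lists the warnings from the previous statement; the skipped row
-- appears here as a duplicate-entry warning.
SHOW WARNINGS;
```

SHOW WARNINGS only covers the most recent statement, so run it right after the insert.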
If you use a plain INSERT query to insert data into the user_info table, the MySQL database engine raises an error like the following when the record already exists. The key name shown in the message depends on which constraint was violated and on the MySQL version.
mysql> INSERT INTO user_info (username, email)
    -> VALUES ('Stephen', 'email@example.com');
ERROR 1062 (23000): Duplicate entry 'Stephen-email@example.com' for key 'username'
Use a REPLACE query instead of INSERT if you want to keep the record updated with the latest values. Keep in mind that INSERT IGNORE keeps the existing record as the original (and a plain INSERT fails, leaving it untouched), whereas REPLACE deletes the old record that conflicts on a PRIMARY KEY or UNIQUE index and inserts the new one. Use each where it is required.
mysql> REPLACE INTO user_info (username, email)
    -> VALUES ('Stephen Anto', 'email@example.com');
Query OK, 1 row affected (0.03 sec)
mysql> REPLACE INTO user_info (username, email)
    -> VALUES ('Stephen Anto', 'firstname.lastname@example.org');
Query OK, 1 row affected (0.05 sec)
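REPLACE only overwrites a row when the new row collides with an existing one on a PRIMARY KEY or UNIQUE index. With the key on (username, email), two rows that differ in email do not collide, so both are kept. The replacement is easier to see when the key is the username alone, as in this sketch. The user_contact table is a made-up example, not part of the steps in this post.

```sql
-- Hypothetical table: username alone is the key, email is ordinary data.
CREATE TABLE user_contact (
  username CHAR(50)  NOT NULL,
  email    CHAR(100) NOT NULL,
  PRIMARY KEY (username)
);

REPLACE INTO user_contact (username, email)
VALUES ('Stephen Anto', 'email@example.com');

-- Same key, new email: the old row is deleted and this one is inserted.
REPLACE INTO user_contact (username, email)
VALUES ('Stephen Anto', 'firstname.lastname@example.org');

-- Returns a single row for 'Stephen Anto', carrying the second email.
SELECT username, email FROM user_contact;
```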
Fine, all of the above prevents duplicates from the start. But suppose we somehow forgot to add the PRIMARY KEY or UNIQUE constraint in the CREATE TABLE statement. What do we do then? Again it is very simple, unless that particular table is referenced by other tables through a foreign key.
How to list out duplicate records in a table?
For this example I've used the first CREATE TABLE statement above, with no PRIMARY KEY or UNIQUE constraint, plus an id column (visible in the output below), and inserted records using plain INSERT queries. Use an INNER JOIN against a grouped subquery to get the duplicate records in the user_info table.
mysql> SELECT a.id, a.username, a.email FROM user_info a
    -> INNER JOIN (SELECT username FROM user_info GROUP BY username
    -> HAVING COUNT(*) > 1) b ON a.username = b.username;
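+----+----------+--------------------------------+
| id | username | email                          |
+----+----------+--------------------------------+
|  1 | stephen  | email@example.com              |
|  2 | stephen  | firstname.lastname@example.org |
|  3 | anto     | email@example.com              |
|  4 | anto     | firstname.lastname@example.org |
|  5 | joy      | email@example.com              |
|  6 | joy      | firstname.lastname@example.org |
+----+----------+--------------------------------+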
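If you only want a summary of which values are repeated and how often, rather than every duplicated row, a GROUP BY with HAVING is enough. This is a sketch against the same user_info table; the alias copies is just a name chosen for the example.

```sql
-- One row per repeated (username, email) pair, with the number of copies.
SELECT username, email, COUNT(*) AS copies
FROM user_info
GROUP BY username, email
HAVING COUNT(*) > 1;
```

Group by username alone instead if a repeated username counts as a duplicate no matter which email it carries.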
The records above clearly show duplicate entries. To remove them, we create a temp table and insert the distinct (username, email) records of the user_info table into it, then DROP the user_info table and RENAME the tmp table to user_info. (Here I assume that the user_info table is never referenced by any other table as a foreign key table.)
mysql> CREATE TABLE tmp SELECT username, email FROM user_info
    -> GROUP BY username, email;
mysql> DROP TABLE user_info;
mysql> ALTER TABLE tmp RENAME TO user_info;
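Dropping user_info right away leaves no way back if the copy turns out to be wrong. A more cautious variant, sketched here, swaps the two tables with a single RENAME TABLE statement and keeps the original until you have checked the result. The name user_info_old is made up for this example, and the same foreign key assumption applies.

```sql
-- Copy one row per distinct (username, email) pair into a new table.
CREATE TABLE tmp
SELECT DISTINCT username, email FROM user_info;

-- Swap the tables in one statement; the original survives as user_info_old.
RENAME TABLE user_info TO user_info_old, tmp TO user_info;

-- Only after checking the new user_info table:
DROP TABLE user_info_old;
```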
After the steps above, ALTER the table to add a PRIMARY KEY or UNIQUE constraint to prevent duplicate entries in the future. Since the duplicates are already gone, a plain ALTER TABLE is enough; the older ALTER IGNORE TABLE form was removed in MySQL 5.7.4 and fails on current versions. Always keep in mind: add a PRIMARY KEY or UNIQUE constraint on the appropriate fields when you first create the table. It will save lots of time.
mysql> ALTER TABLE user_info
    -> ADD PRIMARY KEY (username, email);
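If you would rather add a UNIQUE constraint than a PRIMARY KEY, the statement looks much the same. This is a sketch to use in place of the PRIMARY KEY statement, not in addition to it; the constraint name uq_user_info is made up for the example.

```sql
-- Adds a named UNIQUE constraint on the pair of columns.
-- Fails with a duplicate-entry error if duplicates are still present.
ALTER TABLE user_info
  ADD CONSTRAINT uq_user_info UNIQUE (username, email);

-- Lists the indexes on the table, so you can confirm the constraint exists.
SHOW INDEX FROM user_info;
```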
I hope this post is useful to you. Thank you.