SQL Query to Convert an Integer to Year Month and Days

SQL Query to Delete Duplicate Rows

Last Updated : 14 Apr, 2025

Duplicate rows in a database can cause inaccurate results, waste storage space, and slow down queries. Cleaning duplicate records from our database is an essential maintenance task for ensuring data accuracy and performance. Duplicate rows in a SQL table can lead to data inconsistencies and performance issues, making it crucial to identify and remove them effectively

In this article, we will explain the process of deleting duplicate rows from a SQL table step-by-step, using SQL Server, with examples and outputs. We'll cover techniques using GROUP BY, CTE, and more, incorporating best practices to help us effectively handle duplicates.

What Are Duplicate Rows?

Duplicate rows are records in a database that have identical values in one or more columns. These rows often arise due to issues like multiple imports, user errors, or missing constraints like primary keys or unique indexes. SQL query to delete duplicate rows typically involves identifying duplicates using functions like ROW_NUMBER() or COUNT() and making sure that only one copy of each record is kept in the table. If not handled properly, duplicates can lead to:

Inaccurate Data Reporting: Reports may contain false information.
Storage Waste: Redundant records consume unnecessary space.
Decreased Query Performance: Queries on large tables with duplicates may perform poorly.

Why You Should Remove Duplicate Rows

Data Integrity: Duplicates can distort reports and analyses, leading to incorrect insights.
Optimal Performance: Redundant data can slow down queries, especially when dealing with large datasets.
Efficient Storage: Removing duplicates helps optimize storage usage, keeping your database lean.

Creating the Sample Table and Inserting Data

To effectively remove duplicate rows in SQL, we can follow a structured approach. Let’s begin by creating a table called DETAILS and populating it with some sample data, including duplicate rows.

Step 1: Create the Sample Table

We will create a table named DETAILS to demonstrate how to identify and delete duplicate rows. This step helps in setting up the necessary structure to store sample data and perform operations like detecting duplicates and applying deletion techniques.

Query:

CREATE TABLE DETAILS (
    SN INT IDENTITY(1,1) PRIMARY KEY,
    EMPNAME VARCHAR(25) NOT NULL,
    DEPT VARCHAR(20) NOT NULL,
    CONTACTNO BIGINT NOT NULL,
    CITY VARCHAR(15) NOT NULL
);

Step 2: Insert Data into the Table

Let’s insert some data, including duplicates, into the DETAILS table. This step allows us to copy real-world scenarios where duplicate records might occur, enabling us to demonstrate how to identify and remove them effectively.

Query:

INSERT INTO DETAILS (EMPNAME, DEPT, CONTACTNO, CITY)
VALUES 
    ('VISHAL', 'SALES', 9193458625, 'GAZIABAD'),
    ('VIPIN', 'MANAGER', 7352158944, 'BAREILLY'),
    ('ROHIT', 'IT', 7830246946, 'KANPUR'),
    ('RAHUL', 'MARKETING', 9635688441, 'MEERUT'),
    ('SANJAY', 'SALES', 9149335694, 'MORADABAD'),
    ('VIPIN', 'MANAGER', 7352158944, 'BAREILLY'),
    ('VISHAL', 'SALES', 9193458625, 'GAZIABAD'),
    ('AMAN', 'IT', 78359941265, 'RAMPUR');

Output

How to Identify Duplicate Rows

We Use the GROUP BY clause with the COUNT(*) function to find rows with duplicate values. This step helps us group the records by specific columns and count how many times each combination occurs, making it easier to identify duplicates that appear more than once in the table.

Query:

SELECT EMPNAME, DEPT, CONTACTNO, CITY, 
COUNT(*) FROM DETAILS
GROUP BY EMPNAME, DEPT, CONTACTNO, CITY
HAVING COUNT(*)>1

Output

Explanation: This query will return the duplicate records based on the combination of EMPNAME, DEPT, CONTACTNO, and CITY.

Methods to Delete Duplicate Rows in SQL

There are several ways to delete duplicate rows in SQL. Here, we will explain five methods to handle this task effectively.

Method 1: Using GROUP BY and COUNT()

Use the GROUP BY clause along with MIN(SN) to retain one unique row for each duplicate group. This method identifies the first occurrence of each duplicate combination based on the SN (serial number) and deletes the other duplicate rows.

Query:

DELETE FROM DETAILS
WHERE SN NOT IN (
    SELECT MIN(SN)
    FROM DETAILS
    GROUP BY EMPNAME, DEPT, CONTACTNO, CITY
);
Select * FROM DETAILS;

Output

SN	EMPNAME	DEPT	CONTACTNO	CITY
1	VISHAL	SALES	9193458625	GAZIABAD
2	VIPIN	MANAGER	7352158944	BAREILLY
3	ROHIT	IT	7830246946	KANPUR
4	RAHUL	MARKETING	9635688441	MEERUT
5	SANJAY	SALES	9149335694	MORADABAD
8	AMAN	IT	78359941265	RAMPUR

Method 2: Using `ROW_NUMBER()`

The ROW_NUMBER() function provides a more elegant and flexible solution. This window function assigns a unique number to each row within a partition (group of duplicates). We can delete rows where the row number is greater than 1.

Query:

WITH CTE AS (
    SELECT SN, EMPNAME, DEPT, CONTACTNO, CITY,
           ROW_NUMBER() OVER (PARTITION BY EMPNAME, DEPT, CONTACTNO, CITY ORDER BY SN) AS RowNum
    FROM DETAILS
)
DELETE FROM CTE WHERE RowNum > 1;

Output

SN	EMPNAME	DEPT	CONTACTNO	CITY
1	VISHAL	SALES	9193458625	GAZIABAD
2	VIPIN	MANAGER	7352158944	BAREILLY
3	ROHIT	IT	7830246946	KANPUR
4	RAHUL	MARKETING	9635688441	MEERUT
5	SANJAY	SALES	9149335694	MORADABAD
8	AMAN	IT	78359941265	RAMPUR

Method 3: Using Common Table Expressions (CTEs)

Using a Common Table Expression (CTE), we can delete duplicates in a more structured way. CTEs provide a cleaner approach by allowing us to define a temporary result set that can be referenced within the DELETE statement. This method can be more readable and maintainable, especially when dealing with complex queries.

Query:

WITH CTE AS (
    SELECT SN, EMPNAME, DEPT, CONTACTNO, CITY,
           ROW_NUMBER() OVER (PARTITION BY EMPNAME, DEPT, CONTACTNO, CITY ORDER BY SN) AS RowNum
    FROM DETAILS
)
DELETE FROM CTE WHERE RowNum > 1;

Output

SN	EMPNAME	DEPT	CONTACTNO	CITY
1	VISHAL	SALES	9193458625	GAZIABAD
2	VIPIN	MANAGER	7352158944	BAREILLY
3	ROHIT	IT	7830246946	KANPUR
4	RAHUL	MARKETING	9635688441	MEERUT
5	SANJAY	SALES	9149335694	MORADABAD
8	AMAN	IT	78359941265	RAMPUR

Method 4: Using Temporary Tables

You can create a temporary table to hold unique records and then replace the original table with the new, clean data.

Steps:

Insert unique rows into a temporary table.
Truncate the original table.
Insert the unique rows back.

Query:

SELECT DISTINCT EMPNAME, DEPT, CONTACTNO, CITY
INTO #TempTable
FROM DETAILS;

TRUNCATE TABLE DETAILS;

INSERT INTO DETAILS (EMPNAME, DEPT, CONTACTNO, CITY)
SELECT EMPNAME, DEPT, CONTACTNO, CITY
FROM #TempTable;

DROP TABLE #TempTable;

Output

SN	EMPNAME	DEPT	CONTACTNO	CITY
1	VISHAL	SALES	9193458625	GAZIABAD
2	VIPIN	MANAGER	7352158944	BAREILLY
3	ROHIT	IT	7830246946	KANPUR
4	RAHUL	MARKETING	9635688441	MEERUT
5	SANJAY	SALES	9149335694	MORADABAD
8	AMAN	IT	78359941265	RAMPUR

Method 5: Using `DISTINCT` with `INSERT INTO`

You can use DISTINCT to select only unique rows and then insert them back into the original table, effectively deleting duplicates.

Query:

DELETE FROM DETAILS;

INSERT INTO DETAILS (EMPNAME, DEPT, CONTACTNO, CITY)
SELECT DISTINCT EMPNAME, DEPT, CONTACTNO, CITY
FROM DETAILS;

Output

SN	EMPNAME	DEPT	CONTACTNO	CITY
1	VISHAL	SALES	9193458625	GAZIABAD
2	VIPIN	MANAGER	7352158944	BAREILLY
3	ROHIT	IT	7830246946	KANPUR
4	RAHUL	MARKETING	9635688441	MEERUT
5	SANJAY	SALES	9149335694	MORADABAD
8	AMAN	IT	78359941265	RAMPUR

Best Practices to Prevent Duplicates

While identifying and removing duplicates is essential, preventing them is even better. Here are some best practices to ensure that duplicates don’t enter your database in the first place:

Use Primary Keys or Unique Constraints: These ensure that each record is unique, preventing accidental duplication.
Data Validation: Implement validation rules in your application to prevent duplicate entries.
Indexing: Create unique indexes on columns that must remain unique, like contact numbers or email addresses.
Regular Data Cleaning: Periodically run data-cleaning queries to identify and remove any newly inserted duplicates.

Conclusion

Duplicate rows in SQL databases can negatively impact performance and data accuracy. Using methods like GROUP BY, ROW_NUMBER(), and CTE, we can efficiently delete duplicate rows in SQL while retaining unique records. Always test your queries on a backup or development environment to ensure accuracy before applying them to production databases. By Using these methods, we can confidently remove duplicate rows in SQL, keeping our database clean and reliable.

SQL Query to Convert an Integer to Year Month and Days

M

Improve

Article Tags :

Similar Reads

Python MySQL Connector is a Python driver that helps to integrate Python and MySQL. This Python MySQL library allows the conversion between Python and MySQL data types. MySQL Connector API is implemented using pure Python and does not require any third-party library.Â This Python MySQL tutorial will

How to install MySQL Connector Package in Python

MySQL is a Relational Database Management System (RDBMS) whereas the structured Query Language (SQL) is the language used for handling the RDBMS using commands i.e Creating, Inserting, Updating and Deleting the data from the databases. A connector is employed when we have to use MySQL with other pro

Connect MySQL database using MySQL-Connector Python

While working with Python we need to work with databases, they may be of different types like MySQL, SQLite, NoSQL, etc. In this article, we will be looking forward to how to connect MySQL databases using MySQL Connector/Python.MySQL Connector module of Python is used to connect MySQL databases with

How to Install MySQLdb module for Python in Linux?

In this article, we are discussing how to connect to the MySQL database module for python in Linux. MySQLdb is an interface for connecting to a MySQL database server from Python. It implements the Python Database API v2.0 and is built on top of the MySQL C API. Installing MySQLdb module for Python o

Boutique Management System using Python-MySQL Connectivity

In this article, we are going to make a simple project on a boutique management system using Python MySql connectivity. Introduction This is a boutique management system made using MySQL connectivity with Python. It uses a MySQL database to store data in the form of tables and to maintain a proper r