This article contains information on how to delete duplicate rows in a table using a PL/SQL procedure or block.

The duplicate row problem occurs only when primary keys are not declared on table.

Procedures

Create a table test with two fields as follows:

SQL> CREATE TABLE test( col1 NUMBER, col2 VARCHAR2(10));

Insert the following records into the test table as follows:

INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(20, ‘zx’);
INSERT INTO test VALUES(30, ‘zx’);

When you select the data from test table it appears as follows:

SQL> select * from test;

COL1 COL2
——— ———-
10 asd
10 asd
10 asd
20 zx
30 zx

Example 1

This procedure deletes duplicate records with respect to
‘col1’ as follows:

SQL> @ dup1.sql

CREATE OR REPLACE PROCEDURE dup1
AS

BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col1);
END dup1;
/
SHOW ERRORS

SQL> exec DUP1 gives the following :

SQL> select * from test;

COL1 COL2
—– ———-
10 asd
20 zx
30 zx

Example 2

This procedure deletes duplicate records with respect to
‘col2’ as follows:

SQL> @ dup2.sql

CREATE OR REPLACE PROCEDURE dup2
AS

BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col2);
END dup2;
/
SHOW ERRORS

SQL> exec DUP2 gives the following:

SQL> select * from test;

COL1 COL2
——— ———-
10 asd
20 zx

Example 3

This procedure deletes duplicate records with respect to
both ‘col1’ and ‘col2’ as follows:

CREATE OR REPLACE PROCEDURE dup12
AS

BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col1, col2);
END dup12;
/
SHOW ERRORS

SQL> exec DUP12 gives the following:

SQL> select * from test;

COL1 COL2
——— ———-
10 asd
20 zx
30 zx

*************************************************
This is another example.

— This script is one SQL syntax to delete/select duplicate records
— from a table.

DELETE (or SELECT *) FROM TABLE1 TABLE1_ALIAS1
WHERE EXISTS (SELECT 1 FROM TABLE1 TABLE1_ALIAS2
WHERE TABLE1_ALIAS1.FIELD1 = TABLE1_ALIAS2.FIELD1
AND TABLE1_ALIAS1.ROWID < TABLE1_ALIAS2.ROWID)
ORDER BY FIELD1;

— This script simulates the DISTINCT function and
— can be used in Forms applications.

SELECT * FROM T1 T1_ALIAS1
WHERE NOT EXISTS (SELECT 1 FROM T1 T1_ALIAS2
WHERE T1_ALIAS1.rowid != T1_ALIAS2.rowid
AND T1_ALIAS1.rowid < T1_ALIAS2.rowid
AND T1_ALIAS1.field1 = T1_ALIAS2.field1)
ORDER BY field1;

Now a Faster Way

Using the previous example is OK with a table that does not have millions of records, but if you do have a table with lots and lots of records you may have a problem with rollback segments, so how about instead of just removing the duplicates, create a new table with just the records you need in it.

Here’s how to do it:

create table t2
as
select col1, col2, etc
from

(select t.*,
row_number() over
(partition by cust_seq_nbr
order by rowid) rn
from t
)

where rn = 1

How to delete duplicate rows

Leave a Reply

Your email address will not be published. Required fields are marked *

five + 4 =