This article contains information on how to delete duplicate rows in a table using a PL/SQL procedure or block.
The duplicate row problem occurs only when primary keys are not declared on table.
Procedures
Create a table test with two fields as follows:
SQL> CREATE TABLE test( col1 NUMBER, col2 VARCHAR2(10));
Insert the following records into the test table as follows:
INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(10, ‘asd’);
INSERT INTO test VALUES(20, ‘zx’);
INSERT INTO test VALUES(30, ‘zx’);
When you select the data from test table it appears as follows:
SQL> select * from test;
COL1 COL2
——— ———-
10 asd
10 asd
10 asd
20 zx
30 zx
Example 1
This procedure deletes duplicate records with respect to
‘col1’ as follows:
SQL> @ dup1.sql
CREATE OR REPLACE PROCEDURE dup1
AS
BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col1);
END dup1;
/
SHOW ERRORS
SQL> exec DUP1 gives the following :
SQL> select * from test;
COL1 COL2
—– ———-
10 asd
20 zx
30 zx
Example 2
This procedure deletes duplicate records with respect to
‘col2’ as follows:
SQL> @ dup2.sql
CREATE OR REPLACE PROCEDURE dup2
AS
BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col2);
END dup2;
/
SHOW ERRORS
SQL> exec DUP2 gives the following:
SQL> select * from test;
COL1 COL2
——— ———-
10 asd
20 zx
Example 3
This procedure deletes duplicate records with respect to
both ‘col1’ and ‘col2’ as follows:
CREATE OR REPLACE PROCEDURE dup12
AS
BEGIN
DELETE FROM test WHERE
ROWID NOT IN ( SELECT MIN(ROWID) FROM test
GROUP BY col1, col2);
END dup12;
/
SHOW ERRORS
SQL> exec DUP12 gives the following:
SQL> select * from test;
COL1 COL2
——— ———-
10 asd
20 zx
30 zx
*************************************************
This is another example.
— This script is one SQL syntax to delete/select duplicate records
— from a table.
DELETE (or SELECT *) FROM TABLE1 TABLE1_ALIAS1
WHERE EXISTS (SELECT 1 FROM TABLE1 TABLE1_ALIAS2
WHERE TABLE1_ALIAS1.FIELD1 = TABLE1_ALIAS2.FIELD1
AND TABLE1_ALIAS1.ROWID < TABLE1_ALIAS2.ROWID)
ORDER BY FIELD1;
— This script simulates the DISTINCT function and
— can be used in Forms applications.
SELECT * FROM T1 T1_ALIAS1
WHERE NOT EXISTS (SELECT 1 FROM T1 T1_ALIAS2
WHERE T1_ALIAS1.rowid != T1_ALIAS2.rowid
AND T1_ALIAS1.rowid < T1_ALIAS2.rowid
AND T1_ALIAS1.field1 = T1_ALIAS2.field1)
ORDER BY field1;
Now a Faster Way
Using the previous example is OK with a table that does not have millions of records, but if you do have a table with lots and lots of records you may have a problem with rollback segments, so how about instead of just removing the duplicates, create a new table with just the records you need in it.
Here’s how to do it:
create table t2
as
select col1, col2, etc
from
(select t.*,
row_number() over
(partition by cust_seq_nbr
order by rowid) rn
from t
)
where rn = 1