stormlolz

SQL delete duplicate rows

8 posts in this topic

#1 ·  Posted

Hello,

 

I have an Acces database who's alimented by queries in my script

 

But I have many duplicate rows, I need to delete the duplicate rows by a SQL querie

 

How can I do ?

 

table name  : "ParVoie"

Duplicates to delete are duplicates values on two Col : "NumVoie" & "NbPrises"

ex : my table here, two rows need to be delete

id NumVoie NbPrises
1 10 10
2 10 10
3 85 2
4 85 1
5 25BIS 2
6 25BIS 2

 

I have try something like that :

 

$adoConCOMPARE.Execute ("DELETE FROM ParVoie WHERE NumVoie IN (SELECT COUNT(*) AS NumVoie, NbPrises FROM ParVoie GROUP BY NumVoie, NbPrises HAVING COUNT(*) > 1)")

 

Thank you

Share this post


Link to post
Share on other sites



#3 ·  Posted

Try that (standard SQL):

delete
from
  nodups
where
  id in (
    select
      id
    from (
      select id,
        count(*)
      from
        nodups
      group by
        NumVoie, NbPrises
      having
        count(*) > 1
    ) 
  )
;

That will keep a random row in each subset of duplicates. For instance, SQLite will yield:

ID  NumVoie NbPrises
1   10      10
3   85      2
4   85      1
5   25BIS   2

Yet any engine is free to select rowids 1 and 5 for deletion, instead of 2 and 6.

1 person likes this

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

#4 ·  Posted

Thank you for your answer

 

There is nothing deleted with this query

 

$adoConCOMPARE.Execute ("delete from ParVoie where id in (select id from (select id, count(*) from ParVoie group by NumVoie, NbPrises having count(*) > 1))")

 

Share this post


Link to post
Share on other sites

#5 ·  Posted

That's weird. The statement will produce the posted result given the exemple content when submitted to most SQL RDBMS engines. Only catch is that certain engines refuse to yield a non-deterministic result.

select id, count(*) from ParVoie group by NumVoie, NbPrises having count(*) > 1

is not deterministic in that only one ID would result after the group by. Non-deterministically picking one ID among all those in a group is a no-no for some, go figure. That's why these engines forbid selecting any column not part of the group by clause. It looks like Access is one of them.

There are two kinds of workaround for that. First, you can create a temporary table, which is pretty doable if applied to a relatively small table; second, you can build a more complex "delete from select" statement not based on ID but on the couple of NumVoie, NbPrises values (you can regard this as a compound primary key). The syntax may use or not a CTE (With ...) but I don't know enough of Access specific SQL support to guide you further.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

#6 ·  Posted

Stormlolz,

If you can use a "results" table this works in SQLite...

#cs
    id  NumVoie  NbPrises
    1  10  10
    2  10  10
    3  85  2
    4  85  1
    5  25BIS  2
    6  25BIS  2
#ce

#include <array.au3>
#include <sqlite.au3>

Local $aTblData[6][3] = [ _
        [1, 10, 10], _
        [2, 10, 10], _
        [3, 85, 2], _
        [4, 85, 1], _
        [5, '"25BIS"', 2], _
        [6, '"25BIS"', 2] _
        ]

;_arraydisplay($aTblData)

Local $db = @ScriptDir & '\dropdups.db3'
_SQLite_Startup()
Local $hDB = _SQLite_Open($db)
If Not $hDB Then _fini()
OnAutoItExitRegister('_fini')
ConsoleWrite("_SQLite_LibVersion=" & _SQLite_LibVersion() & @CRLF)

$hDB = _SQLite_Open($db)
If Not $hDB Then _fini()

Local $ret = _SQLite_Exec(-1, 'drop table if exists withdups; CREATE TABLE if not exists [WithDups] (XID, NumVoie, NbPrises);')
If $ret <> $SQLITE_OK Then Exit (MsgBox(0, 'SQLITE ERROR', 'Create WithDups Table Failed'))

For $1 = 0 To UBound($aTblData) - 1
    _SQLite_Exec($hDB, 'insert into WithDups values(' & $aTblData[$1][0] & ', ' & $aTblData[$1][1] & ', ' & $aTblData[$1][2] & ');')
    If @error Then Exit MsgBox(0, '', _SQLite_ErrMsg())
Next

Local $ret = _SQLite_Exec(-1, 'drop table if exists NoDups; CREATE TABLE NoDups AS SELECT XID, NumVoie, NbPrises FROM WithDups group by NumVoie, NbPrises;')
If $ret <> $SQLITE_OK Then Exit (MsgBox(0, 'SQLITE ERROR', 'Create NoDups Table Failed'))

Func _fini()
    _SQLite_Close($hDB)
    _SQLite_Shutdown()
    Exit
EndFunc   ;==>_fini

Although I don't understand what JCHD is saying about "deterministic" and "non-deterministic" selection.

kylomas


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

#7 ·  Posted

@kylomas,

When you "group by" along some criterion (say using columns A, B & C), you treat as many rows that have been grouped as only one row. Refering to a column X not made part of the group by criterion, you'd need to non-determiniscally pick up some X among the various rows that have been grouped. Using the exemple given, you may select to delete rows 1 & 5, or 1 & 3, or 2 & 5, or 2 & 6 without jeopardizing the correctness of the result. This choice depends on deep guts (say "internal behavior") of a particular engine, so it isn't deterministic. Some DB engines forbid such queries, to satisfy their reluctance to non-determinism.

The same kind of non-determinism occurs when you request: select * from sometable limit 1

Without an order by clause, you have no clue which row will get returned, provided the table has more than one row, because SQL deals with sets and sets have no inherent order. SQL makes no specification at all about how to select the resulting row and, worse, an engine is fully entitled to return a different row at every invokation of that query, even if you issue the query repeatedly without any other external use of the DB. Of course engines generally avoid adding code to randomize such result but SQLite has a specific pragma to force returning rows in the reverse order of what it normally does, just to remind users that an order by clause is the only correct way to get results in the "expected order" (i.e. there is no "expected order" by default).


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

#8 ·  Posted

JCHD...thanks

 


Forum Rules         Procedure for posting code

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now