Iczer

SQLite - mark first file in the folder

5 posts in this topic

basically i need mark (set Mark_U1 = 5) first image file in the folder if this folder dose not contain any archives.

but it seems i cannot figure out right query for it:

Global $sSqlBuild_indexTable = '(ItemHash PRIMARY KEY,FullFilePath,FilePath,FileName,Extention,Mark_U1 INTEGER,Mark_U2 INTEGER)'
Global $sSqlFill_indexTable = '(ItemHash,FullFilePath,FilePath,FileName,Extention,Mark_U1,Mark_U2)'


_SQLite_Exec($hDB, "BEGIN TRANSACTION;UPDATE DataBase SET Mark_U1 = 0;COMMIT;" )


Local $sSQLquery1 = "UPDATE DataBase SET Mark_U1 = 3 WHERE  Extention LIKE 'jpeg' OR " & _
                                                                "Extention LIKE 'jpg' OR " & _
                                                                "Extention LIKE 'png' OR " & _
                                                                "Extention LIKE 'tiff' OR " & _
                                                                "Extention LIKE 'tif' OR " & _
                                                                "Extention LIKE 'gif' OR " & _
                                                                "Extention LIKE 'tga' OR " & _
                                                                "Extention LIKE 'bmp' OR " & _
                                                                "Extention LIKE 'pcx' OR " & _
                                                                "Extention LIKE 'jpe' OR " & _
                                                                "Extention LIKE 'jif' OR " & _
                                                                "Extention LIKE 'jfif' OR " & _
                                                                "Extention LIKE 'jfi' OR " & _
                                                                "Extention LIKE 'dib' OR " & _
                                                                "Extention LIKE 'tpic' AND Filepath NOT IN (SELECT Filepath FROM DataBase t2 WHERE t2.Filepath = DataBase.Filepath AND (" & _
                                                                                            "Extention LIKE 'zip' OR " & _
                                                                                            "Extention LIKE 'lha' OR " & _
                                                                                            "Extention LIKE 'lzh' OR " & _
                                                                                            "Extention LIKE 'rar' OR " & _
                                                                                            "Extention LIKE '7z'));"

Local $sSQLquery2 = "UPDATE DataBase SET Mark_U1 = 5 WHERE Mark_U1 = 3 AND FullFilePath = (SELECT FullFilePath FROM DataBase t2 WHERE t2.FilePath = DataBase.FilePath ORDER BY FileName ASC LIMIT 1);"





_SQLite_Exec($hDB, "BEGIN TRANSACTION;"&$sSQLquery1&";COMMIT;" )
_SQLite_Exec($hDB, "BEGIN TRANSACTION;"&$sSQLquery2&";COMMIT;" )

 

first query set Mark_U1 = 3 even if there a archives in the folder and it very time consuming...

 

Share this post


Link to post
Share on other sites



#2 ·  Posted (edited)

_SQLite_Exec($hDB, "BEGIN TRANSACTION;UPDATE DataBase SET Mark_U1 = 0;COMMIT;" )

This first update is a single statement, hence it doesn't need being wrapped inside an explicit transaction (this is the autocommit mode) :

_SQLite_Exec($hDB, "UPDATE DataBase SET Mark_U1 = 0" )

In the second update (used to set MARK_U1 = 3) you need parenthesis around the OR block. As it stands, the AND condition is only tied to Extention LIKE 'tpic', which isn't what you clearly intend.

The you can make your update much simpler. First, use datatypes in your schema and make the extension case-insensitive:

Create table Files (
    ItemHash text PRIMARY KEY,   -- if you store the hash as binary, declare it as BLOB instead
    FullFilePath text collate nocase,   -- this column is redondant, being the same as FilePath || '\' || FileName || '.' || Extention
    FilePath text collate nocase,
    FileName text collate nocase,
    Extention text collate nocase,
    Mark_U1 INTEGER,
    Mark_U2 INTEGER
)

Then create an index to speed up the subquery:

create index ixExtPath on database (extention, filepath)  -- collate nocase is implied from table DDL

The query could then be:

UPDATE DataBase
SET
   Mark_U1 = 3
WHERE
     Extention in ('jpeg','jpg', 'png', 'tiff', 'tif', 'gif', 'tga', 'bmp', 'pcx', 'jpe', 'jif', 'jfif', 'jfi', 'dib', 'tpic')
     AND
     Filepath NOT IN (
              SELECT
                    Filepath
              FROM DataBase t2
              WHERE
                   t2.Filepath = DataBase.Filepath
                   AND
                   Extention in ('zip', 'lha', 'lzh', 'rar', '7z')                   
     )
)

The condition can be simplified (without speed gain):

UPDATE DataBase
SET
   Mark_U1 = 3
WHERE
     Extention in ('jpeg','jpg', 'png', 'tiff', 'tif', 'gif', 'tga', 'bmp', 'pcx', 'jpe', 'jif', 'jfif', 'jfi', 'dib', 'tpic')
     AND
     not exists (
              SELECT
                    1
              FROM DataBase t2
              WHERE
                   t2.Filepath = DataBase.Filepath
                   AND
                   Extention in ('zip', 'lha', 'lzh', 'rar', '7z')                   
     )
)

Note that you can instead set MAR_U1 dynamically at insert/update/delete time by using triggers.

Final note: maybe you should/could change your schema to a tree-like table for storing folders and files just like the filesystem does. With the advent of SQLite CTEs, recursion is no more a problem.

Edited by jchd

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

thanks! the only concern is speed now

i'm reading about triggers and tree-like tables

Share this post


Link to post
Share on other sites

I don't know which are your complete requirements but FYI using this sample code:

Spoiler
#AutoIt3Wrapper_UseX64=n

#include <String.au3>
#include <File.au3>
#include <SQLite.au3>
;~ #include <SQLite.Dll.au3>
;~ #include "DbHelpers.au3"

Local $dll = _SQLite_Startup("c:\bin\sqlite3.dll")
Local $hDB

$hDB = _SQLite_Open("MyFiles.sq3")
_SQLite_Exec($hDB, "pragma page_size = 64536;")
_SQLite_Exec($hDB, "pragma cache_size = 10000;")
_SQLite_Exec($hDB, "pragma journal_mode = WAL;")

;~ ConsoleWrite(_SQLite_LibVersion() & @LF)
;~ Local $aRows, $iCols, $iRows
;~ _SQLite_GetTable2d($hDB, "pragma compile_options", $aRows, $iRows, $iCols)
;~ _SQLite_Display2DResult($aRows)

;~ _SQLite_Exec($hDB, "drop table if exists Files")
_SQLite_Exec($hDB, "create table if not exists Files (Id integer primary key, Parent text collate nocase, Filename text collate nocase, Extension text collate nocase default '', Mark_U1 int default 0, Mark_U2 int default 0, Hash text);")

Local $Time = TimerInit()
Local $sDir = "C:\Users\jc\"
Local $aFiles = _FileListToArrayRec($sDir, "*", $FLTAR_FILES, $FLTAR_RECUR, $FLTAR_FASTSORT, $FLTAR_FULLPATH)
If @error Then
    ConsoleWrite(@extended & @LF)
Else
    $Time = TimerDiff($Time)
    Local $iFiles = $aFiles[0]
    ConsoleWrite("Enumerating " & $iFiles & " files in array took " & Round($Time / 1000, 2) & "s" & " (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF)
    $Time = TimerInit()
    Local $sDrive, $sFolder, $sName, $sExt, $sParent
    _SQLite_Exec($hDB, "begin immediate")
    For $i = 1 To $iFiles
        _PathSplit($aFiles[$i], $sDrive, $sFolder, $sName, $sExt)
        _SQLite_Exec($hDB, "insert into " & _
                                "Files (" & _
                                    "Parent, " & _
                                    "Filename, " & _
                                    "Extension" & _
                                ") values (" & _
                                    X($sDrive & $sFolder) & _
                                    XX($sName) & _
                                    XX($sExt) & _
                                ")")
    Next
    _SQLite_Exec($hDB, "end")
    $Time = TimerDiff($Time)
    ConsoleWrite("Inserting " & $iFiles & " files in database took " & Round($Time / 1000, 2) & "s (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF)
    $Time = TimerInit()
    _SQLite_Exec($hDB, "create index if not exists ixParentExtension on Files (Parent, Extension)")
    _SQLite_Exec($hDB, _
        "update Files " & _
            "set " & _
                "Mark_U1 = 3 " & _
            "where " & _
                "Extension in ('.jpeg','.jpg', '.png', '.tiff', '.tif', '.gif', '.tga', '.bmp', '.pcx', '.jpe', '.jif', '.jfif', '.jfi', '.dib', '.tpic') " & _
                "and " & _
                "not exists (" & _
                    "select " & _
                        "1 " & _
                    "from " & _
                        "Files F " & _
                    "where " & _
                        "F.Parent = Files.Parent " & _
                        "and " & _
                        "F.Extension in ('.zip', '.lha', '.lzh', '.rar', '.7z')" & _
                ")" _
    )
    $Time = TimerDiff($Time)
    ConsoleWrite("Updating Mark_U1 to 3 among " & $iFiles & " files in database took " & Round($Time / 1000, 2) & "s (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF)
    $Time = TimerInit()
    _SQLite_Exec($hDB, "create index if not exists ixParentMark_U1Filename on Files (Parent, Mark_U1, Filename)")
    _SQLite_Exec($hDB, _
        "update Files " & _
            "set " & _
                "Mark_U1 = 5 " & _
            "where " & _
                "Id = (" & _
                    "select " & _
                        "Id " & _
                    "from " & _
                        "Files F " & _
                    "where " & _
                        "F.Parent = Files.Parent " & _
                        "and " & _
                        "Mark_U1 = 3 " & _
                    "order by " & _
                        "Filename " & _
                    "limit 1" & _
                ")" _
    )
EndIf
$Time = TimerDiff($Time)
ConsoleWrite("Updating Mark_U1 to 5 among " & $iFiles & " files in database took " & Round($Time / 1000, 2) & "s (" & Round($Time * 1000 / $iFiles, 1) & "µs per file)" & @LF)
_SQLite_Close($hDB)
_SQLite_Shutdown()

Func X($s)
    Return "'" & StringReplace($s, "'", "''") & "'"
EndFunc

Func XX($s)
    Return ", '" & StringReplace($s, "'", "''") & "'"
EndFunc

 

I get the following result on my old slow PC:

Enumerating 166627 files in array took 23.55s (141.3µs per file)
Inserting 166627 files in database took 64.42s (386.6µs per file)
Updating Mark_U1 to 3 among 166627 files in database took 3.27s (19.6µs per file)
Updating Mark_U1 to 5 among 166627 files in database took 2.1s (12.6µs per file)

Of course you'd have to adjust a couple pathes to try it.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Share this post


Link to post
Share on other sites

sorry for delay...

after some testing i get

Updating Mark_U1 to 3 among 480133 files in database took 11.14s (23.2µs per file)
Updating Mark_U1 to 5 among 480133 files in database took 1.2s (2.5µs per file)

it's only if i use:

_SQLite_Exec( $hIndexSSDB, "PRAGMA synchronous = OFF;" )
    _SQLite_Exec( $hIndexSSDB, "PRAGMA journal_mode = OFF;" )
    _SQLite_Exec( $hIndexSSDB, "BEGIN;" )
    ;--------------------
    _SQLite_Exec( $hIndexSSDB, "COMMIT;" )
    _SQLite_Exec( $hIndexSSDB, "PRAGMA synchronous = NORMAL;" )
    _SQLite_Exec( $hIndexSSDB, "PRAGMA journal_mode = MEMORY;" )

before/after all operations, otherwise i get

Updating Mark_U1 to 3 among 480133 files in database took 272.18s (566.9ֳ per file)
Updating Mark_U1 to 5 among 480133 files in database took 91.94s (191.5ֳ per file)

;Updating Mark_U1 to 3 among 84993 files in database took 13.39s (157.5µs per file)
;Updating Mark_U1 to 5 among 84993 files in database took 16.77s (197.3µs per file)

and so on

it seems hdd-io-operations too costly for my system

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now