qwert

Seeking advice on filename search strategy

14 posts in this topic

I have a directory of Clipart (snippet attached) ... 500 subdirectories ... 350,000 files. A few times a day, I use an AU3 script to search for filenames that match a wildcard entry (e.g., *Sale*.*). It takes several seconds as the script traverses the tree.

I’ve been considering how I might be able to reduce this to under a second, considering that the directories are static. No files are added or taken away (although individual files might, on rare occasions, be renamed to be more descriptive).

So far, my idea is to basically do what the operating system would do: pre-build an index file of the names and then search that one file.

Can anyone offer additional insights or proven methods?

Thanks for help with this.

 

[Attached image: ClipArt_Dirs.PNG]

 




#2 ·  Posted (edited)

A personal recommendation: run dir /b /s from the command line (it leaves no files behind and recurses through all directories). The second run of the command is faster than the first, so I would prefer that.

And if the directories are static and nothing is being added or removed, you could generate a list of the files on the drive once and search that stored list instead of searching the entire drive.

If you rely on the ordinary computer search instead, the indexing takes a long time.
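
A minimal sketch of the stored-list idea, written in Python rather than AutoIt so it can be run and checked on its own. The function names build_index and search_index and the one-path-per-line file layout are assumptions made for illustration, not something from this thread. The tree is walked once, and every later search reads only the list.

```python
import fnmatch
import os


def build_index(root, index_path):
    """Walk root once and write one relative file path per line to index_path."""
    count = 0
    with open(index_path, "w", encoding="utf-8") as out:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                out.write(rel + "\n")
                count += 1
    return count


def search_index(index_path, pattern):
    """Return indexed paths whose file name matches a wildcard, ignoring case."""
    pattern = pattern.lower()
    hits = []
    with open(index_path, encoding="utf-8") as src:
        for line in src:
            rel = line.rstrip("\n")
            # Match on the file name only, not on the folder part of the path.
            if fnmatch.fnmatchcase(os.path.basename(rel).lower(), pattern):
                hits.append(rel)
    return hits
```

Keep the index file outside the tree being indexed, and rebuild it after the occasional rename. Note that fnmatch treats the dot in *.* literally, so a name without an extension will not match *Sale*.* here.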

Edited by Surya

No matter whatever the challenge maybe control on the outcome its on you its always have been.

MY UDF: Transpond UDF (Sent variables to Programs), Utter UDF (Speech Recognition)


#3 ·  Posted (edited)

@qwert, with 800+ posts I thought by now you would have learnt where to post an AutoIt General Help & Support question. As you have posted zero code and a picture of something pretty much everyone is already familiar with (we all understand the contents of a directory), I can only assume you're not using the UDF functions in File.au3 to traverse the directory, because when I use those functions they are pretty quick.

Edited by guinness

_AdapterConnections()_AlwaysRun()_AppMon()_AppMonEx()_BinaryBin()_CheckMsgBox()_CmdLineRaw()_ContextMenu()_ConvertLHWebColor()/_ConvertSHWebColor()_DesktopDimensions()_DisplayPassword()_DotNet_Load()/_DotNet_Unload()_Fibonacci()_FileCompare()_FileCompareContents()_FileNameByHandle()_FilePrefix/SRE()_FindInFile()_GetBackgroundColor()/_SetBackgroundColor()_GetConrolID()_GetCtrlClass()_GetDirectoryFormat()_GetDriveMediaType()_GetFilename()/_GetFilenameExt()_GetHardwareID()_GetIP()_GetIP_Country()_GetOSLanguage()_GetSavedSource()_GetStringSize()_GetSystemPaths()_GetURLImage()_GIFImage()_GoogleWeather()_GUICtrlCreateGroup()_GUICtrlListBox_CreateArray()_GUICtrlListView_CreateArray()_GUICtrlListView_SaveCSV()_GUICtrlListView_SaveHTML()_GUICtrlListView_SaveTxt()_GUICtrlListView_SaveXML()_GUICtrlMenu_Recent()_GUICtrlMenu_SetItemImage()_GUICtrlTreeView_CreateArray()_GUIDisable()_GUIImageList_SetIconFromHandle()_GUIRegisterMsg()_GUISetIcon()_Icon_Clear()/_Icon_Set()_IdleTime()_InetGet()_InetGetGUI()_InetGetProgress()_IPDetails()_IsFileOlder()_IsGUID()_IsHex()_IsPalindrome()_IsRegKey()_IsStringRegExp()_IsSystemDrive()_IsUPX()_IsValidType()_IsWebColor()_Language()_Log()_MicrosoftInternetConnectivity()_MSDNDataType()_PathFull/GetRelative/Split()_PathSplitEx()_PrintFromArray()_ProgressSetMarquee()_ReDim()_RockPaperScissors()/_RockPaperScissorsLizardSpock()_ScrollingCredits_SelfDelete()_SelfRename()_SelfUpdate()_SendTo()_ShellAll()_ShellFile()_ShellFolder()_SingletonHWID()_SingletonPID()_Startup()_StringCompact()_StringIsValid()_StringRegExpMetaCharacters()_StringReplaceWholeWord()_StringStripChars()_Temperature()_TrialPeriod()_UKToUSDate()/_USToUKDate()_WinAPI_Create_CTL_CODE()_WinAPI_CreateGUID()_WMIDateStringToDate()/_DateToWMIDateString()Au3 script parsingAutoIt SearchAutoIt3 PortableAutoIt3WrapperToPragmaAutoItWinGetTitle()/AutoItWinSetTitle()CodingDirToHTML5FileInstallrFileReadLastChars()GeoIP databaseGUI - Only Close ButtonGUI 
ExamplesGUICtrlDeleteImage()GUICtrlGetBkColor()GUICtrlGetStyle()GUIEventsGUIGetBkColor()Int_Parse() & Int_TryParse()IsISBN()LockFile()Mapping CtrlIDsOOP in AutoItParseHeadersToSciTE()PasswordValidPasteBinPosts Per DayPreExpandProtect GlobalsQueue()Resource UpdateResourcesExSciTE JumpSettings INISHELLHOOKShunting-YardSignature CreatorStack()Stopwatch()StringAddLF()/StringStripLF()StringEOLToCRLF()VSCROLLWM_COPYDATAMore Examples...

Updated: 04/09/2015


Ouch.  I guess not.

Can anyone offer additional insights or proven methods?

I considered General Help and Support.  But I was hoping for perspective that might even include some file system or operating system uses that I might not be aware of.  And the picture (worth 1,000 words) was to illustrate the kinds of filenames in my clipart folder.

There seem to be a dozen ways to approach this commonly needed process.  Given the particulars of my need, I'm looking for a good path to follow.

Now that you have my reasoning, feel free to move the post if you feel it's appropriate to do so.

 

 

 

 


Now that you have my reasoning, feel free to move the post if you feel it's appropriate to do so.

Regardless of your additional reasoning, this is not a collaborative project, it's a question. Anyway, what you have posted is quite generic. Again, you don't specify what you have tried, just that you have tried something and want some help. So basically people will post, you will say you already tried that, and in the end we're back to square one. I see posts like this all too often, I'm afraid.



If subfolders and file names are static, there will not be anything faster than a solution based on a SQLite database. Store folder and file names in the database and search the database instead of the file system.


If subfolders and file names are static, there will not be anything faster than a solution based on a SQLite database. Store folder and file names in the database and search the database instead of the file system.

+1
And a db allows modifications (renaming, etc.).


What is wrong with using an application called Everything (search Google)?



If subfolders and file names are static, there will not be anything faster than a solution based on a SQLite database.

That sounds promising.  I've wanted to try SQL access from AU3 scripts, but have never had the right "beginner-level" application as a point to start from.  This sounds like it.  I'll look for a starter example.  Being able to maintain some "stats" for display on a GUI panel will be icing on the cake.

Thanks for the suggestion.

 


#12 ·  Posted (edited)

qwert, here is some simple code to start with. I assume that SQLite is already installed.

We'll create a database like this with two tables and a one-to-many relation between the tables:

[Image: database_zps3hh1kbb4.png (database diagram)]

Copy this code to a file called CreateDB.sql.

CREATE TABLE subfolders (
  subfolder_id    INTEGER PRIMARY KEY,
  subfolder_name  CHAR UNIQUE NOT NULL );

CREATE TABLE files (
  subfolder_id    INTEGER NOT NULL,
  file_name       CHAR NOT NULL,
  UNIQUE (subfolder_id,file_name),
  FOREIGN KEY (subfolder_id) REFERENCES subfolders(subfolder_id) );

-- Sample rows. String literals use single quotes; double quotes denote identifiers in SQL.
INSERT INTO subfolders(subfolder_id,subfolder_name) VALUES (1,'Test1');
INSERT INTO subfolders(subfolder_id,subfolder_name) VALUES (2,'Test2');

INSERT INTO files(subfolder_id,file_name) VALUES (1,'File11.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (1,'File12.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (1,'Sale11.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (1,'Sale12.wmf');

INSERT INTO files(subfolder_id,file_name) VALUES (2,'File21.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (2,'File22.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (2,'Sale21.wmf');
INSERT INTO files(subfolder_id,file_name) VALUES (2,'Sale22.wmf');

The code will create the tables and insert a few rows.

To create a database with the name "database.db" open a Command Prompt and start SQLite in this way:

D:\Programmering\AutoIt\Samples\SQLite\Test>sqlite3 database.db
SQLite version 3.7.11 2012-03-20 11:35:50
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .read 'CreateDB.sql'        <<<<<<<< Run CreateDB.sql
sqlite>

Verify that the tables are OK:

sqlite> .schema        <<<<<<<< Show the create statements
CREATE TABLE files (
  subfolder_id    INTEGER NOT NULL,
  file_name       CHAR NOT NULL,
  UNIQUE (subfolder_id,file_name),
  FOREIGN KEY (subfolder_id) REFERENCES subfolders(subfolder_id) );
CREATE TABLE subfolders (
  subfolder_id    INTEGER PRIMARY KEY,
  subfolder_name  CHAR UNIQUE NOT NULL );
sqlite>

Select all rows from subfolders:

sqlite> select * from subfolders;
1|Test1
2|Test2
sqlite>

Select all files that match *Sale*.*:

sqlite> SELECT subfolder_name || '\' || file_name FROM files
   ...> INNER JOIN subfolders ON files.subfolder_id = subfolders.subfolder_id
   ...> WHERE file_name LIKE '%Sale%.%';
Test1\Sale11.wmf
Test1\Sale12.wmf
Test2\Sale21.wmf
Test2\Sale22.wmf
sqlite>

Quit SQLite:

sqlite> .q
D:\Programmering\AutoIt\Samples\SQLite\Test>
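
The LIKE pattern above was translated from *Sale*.* by hand. A rough sketch of doing that translation automatically is below, in Python rather than AutoIt so it can be run and checked on its own. The helper names wildcard_to_like and find_files are hypothetical, and the choice of a backslash as the ESCAPE character is an assumption. The escaping matters because an underscore in a file name would otherwise act as a single-character wildcard in LIKE.

```python
import sqlite3


def wildcard_to_like(pattern, escape="\\"):
    """Translate a DOS-style wildcard (* and ?) into a SQL LIKE pattern."""
    out = []
    for ch in pattern:
        if ch == "*":
            out.append("%")
        elif ch == "?":
            out.append("_")
        elif ch in ("%", "_", escape):
            out.append(escape + ch)  # keep LIKE metacharacters literal
        else:
            out.append(ch)
    return "".join(out)


def find_files(conn, pattern):
    """Run the join query from the session above with the pattern bound as a parameter."""
    sql = (
        "SELECT subfolder_name || '\\' || file_name FROM files "
        "INNER JOIN subfolders ON files.subfolder_id = subfolders.subfolder_id "
        "WHERE file_name LIKE ? ESCAPE '\\' ORDER BY 1"
    )
    return [row[0] for row in conn.execute(sql, (wildcard_to_like(pattern),))]
```

SQLite's LIKE ignores case for ASCII letters by default, which is close to how wildcard matching behaves on Windows file names. A pattern with a leading % cannot use an index, so each search is a scan of the files table.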

To run the *Sale*.* query with AutoIt code do something like this:

#include <SQLite.au3>
#include <Array.au3>

Example()


Func Example()
  ; Load the SQLite DLL and open the database created above
  _SQLite_Startup()
  Local $hDb = _SQLite_Open( "database.db" )

  ; Same query as in the SQLite session: subfolder\file for every match
  Local $aResult, $iRows, $iColumns, $iRval
  Local $sql = "SELECT subfolder_name || '\' || file_name FROM files" & _
               "  INNER JOIN subfolders ON files.subfolder_id = subfolders.subfolder_id" & _
               "  WHERE file_name LIKE '%Sale%.%';"
  $iRval = _SQLite_GetTable2d( $hDb, $sql, $aResult, $iRows, $iColumns )
  ; Show the result array only if the query succeeded
  If $iRval = $SQLITE_OK Then _ArrayDisplay( $aResult )

  ; Release the database handle and unload the DLL
  _SQLite_Close( $hDb )
  _SQLite_Shutdown()
EndFunc
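
The INSERT statements in CreateDB.sql are hand-written samples. For the real clipart tree the rows would have to be generated by walking the directories once. The following is a rough sketch of that step under the same two-table schema, in Python rather than AutoIt so it can be run and checked on its own. The names populate and rename_file are hypothetical, and it assumes a fresh, empty database.

```python
import os
import sqlite3

# Same two tables as in CreateDB.sql, created only if they do not exist yet.
SCHEMA = """
CREATE TABLE IF NOT EXISTS subfolders (
  subfolder_id    INTEGER PRIMARY KEY,
  subfolder_name  CHAR UNIQUE NOT NULL );
CREATE TABLE IF NOT EXISTS files (
  subfolder_id    INTEGER NOT NULL,
  file_name       CHAR NOT NULL,
  UNIQUE (subfolder_id,file_name),
  FOREIGN KEY (subfolder_id) REFERENCES subfolders(subfolder_id) );
"""


def populate(conn, root):
    """Walk root once and insert every subfolder and file name; return the file count."""
    conn.executescript(SCHEMA)
    total = 0
    with conn:  # one transaction for all inserts, committed on success
        for dirpath, _dirnames, filenames in os.walk(root):
            if not filenames:
                continue
            rel = os.path.relpath(dirpath, root)
            cur = conn.execute(
                "INSERT INTO subfolders(subfolder_name) VALUES (?)", (rel,)
            )
            conn.executemany(
                "INSERT INTO files(subfolder_id, file_name) VALUES (?, ?)",
                [(cur.lastrowid, name) for name in filenames],
            )
            total += len(filenames)
    return total


def rename_file(conn, subfolder_name, old_name, new_name):
    """Record a rename in the index; return True if exactly one row was updated."""
    with conn:
        cur = conn.execute(
            "UPDATE files SET file_name = ? WHERE file_name = ? AND subfolder_id = "
            "(SELECT subfolder_id FROM subfolders WHERE subfolder_name = ?)",
            (new_name, old_name, subfolder_name),
        )
    return cur.rowcount == 1
```

Wrapping all inserts in one transaction is what keeps loading a few hundred thousand rows quick. rename_file only updates the index and compares the old name case-sensitively; renaming the file on disk is a separate step.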

 

Edited by LarsJ


LarsJ, thanks for these clear instructions.  I've downloaded SQLite and will work through them this week.


#14 ·  Posted (edited)

qwert,

Before re-inventing the wheel you might want to check out kafu's SMF (Search My Files).  He provided a link above.  It is SQLite based and the interface is really slick.

If nothing else you may get some ideas for a direction or technique.

kylomas

@ALL - SQLite Expert is a free SQLite manager.  It is excellent for rapid prototyping and testing.

Edited by kylomas
additional info

"I like pigs.  Dogs look up to us.  Cats look down on us.  Pigs treat us as equals."

- Sir Winston Churchill
