Improvement of included _FileListToArray function.


Tlem

Looks like Switch is a little slower - on my machine at least:

$fred = 0
Yup, you're right, I only remembered reading it somewhere. It's slower than 'if...then' but faster than 'if...then...endif'. I also added a 'select...case' to the benchmark. So 'if...then' should be the way to go for single-line statements, and 'switch...case' if multiple statements follow.

$fred = 0

$begin = TimerInit()
For $i = 1 To 1000000
    If $fred = 1 Then
        ConsoleWrite("fred" & @CRLF)
    EndIf
Next
ConsoleWrite(Timerdiff($begin) & @CRLF)

$begin = TimerInit()
For $i = 1 To 1000000

    If $fred = 1 Then ConsoleWrite("fred" & @CRLF)

Next
ConsoleWrite(Timerdiff($begin) & @CRLF)

$begin = TimerInit()
For $i = 1 To 1000000
    Switch $fred
        Case 1
            ConsoleWrite("fred" & @CRLF)
    EndSwitch
Next
ConsoleWrite(Timerdiff($begin) & @CRLF)

$begin = TimerInit()
For $i = 1 To 1000000
    Select
        Case $fred = 1
            ConsoleWrite("fred" & @CRLF)
    EndSelect
Next
ConsoleWrite(Timerdiff($begin) & @CRLF)

; Results:
;851.329474631567 if...then...endif
;641.703022462219 if...then
;702.605523680954 switch...case
;843.367480406495 select...case
Edited by KaFu

After doing a lot of tests on this function with 100000 files, I have not seen any gain from using:

If ... Then
     ...
EndIf
instead of
If ... Then ...

The same thing goes for 'If' versus 'Switch': 'Switch' is faster here.

On the other hand, big up to Spiff59, because his function is faster than all the others I tested.

I just made a small modification, because using @extended isn't a good choice for older versions of AutoIt and, as you can see in the results, there is no significant time gain from using @extended, so let's keep the compatibility.

These are the results of my tests on my laptop.

For 100000 files :

Original _FileListToArray : 2298

Modified _FileListToArray : 1349.6

Modified _FileListToArray : 1349.2 (Spiff59 version)

For 50000 files :

Original _FileListToArray : 1151

Modified _FileListToArray : 675

Modified _FileListToArray : 669 (Spiff59 version)

As we can see, this new version of _FileListToArray does the work in almost half the time, even with 50000 files or fewer. :D

This is the new version :

Func _FileListToArray($sPath, $sFilter = "*", $iFlag = 0)
    Local $hSearch, $sFile, $sFileList, $sDelim = "|"

    If Not FileExists($sPath) Then Return SetError(1, 1, "")
    If StringRegExp($sFilter, "[\\/:<>|]") Or (Not StringStripWS($sFilter, 8)) Then Return SetError(2, 2, "")
    If (StringMid($sPath, StringLen($sPath), 1) = "\") Then $sPath = StringTrimRight($sPath, 1) ; needed for Win98 for x:\  root dir

    If $iFlag > 3 Then
        $sDelim &= $sPath & "\"
        $iFlag -= 4
    EndIf

    $hSearch = FileFindFirstFile($sPath & "\" & $sFilter)
    If $hSearch = -1 Then
        Return SetError(4, 4, "")
    EndIf

    Switch $iFlag
        Case 0; Files and Folders
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 1; Files Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; bypass folder
                ;If @extended Then ContinueLoop ; This line for new beta instead of next line.
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") <> 0 Then ContinueLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 2; Folders Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; bypass file
                ;If @extended = 0 Then ContinueLoop ; This line for new beta instead of next line.
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") = 0 Then ContinueLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case Else
            FileClose($hSearch) ; close the search handle before returning on an invalid flag
            Return SetError(3, 3, "")
    EndSwitch

    FileClose($hSearch)
    Return StringSplit(StringTrimLeft($sFileList, 1), "|")
EndFunc   ;==>_FileListToArray

I am looking forward to reading your results and comments. :D

Edit :

For 50000 files over the network :

Original _FileListToArray : 20726

Modified _FileListToArray : 20259

Modified _FileListToArray : 20506 (Spiff59 version)

Edited by Tlem

Best Regards.Thierry


If you guys ran the tests in reverse order, where the "Beta" version or even the "Release" version of _FileListToArray() was the last timed call, you might see that your test results are not really accurate.

#include<file.au3>
#include<array.au3>
$iflag = 0

; ----------------------
$timer = TimerInit()
For $j = 1 to 100
    $x = _FileListToArray3(@SystemDir,"*",$iflag)
Next
$timer2 = TimerDiff ($timer)


; Beta Version -------------------------------
$timer = TimerInit()
For $j = 1 to 100
    $x = _FileListToArray(@SystemDir,"*",$iflag)
Next
$timer1 = TimerDiff ($timer)

MsgBox (0, "", $timer1 & @CRLF & $timer2)
;_ArrayDisplay($x)

Func _FileListToArray3($sPath, $sFilter = "*", $iFlag = 0)
    Local $hSearch, $sFile, $sFileList, $sDelim = "|"

    If Not FileExists($sPath) Then Return SetError(1, 1, "")
    If StringRegExp($sFilter, "[\\/:<>|]") Or (Not StringStripWS($sFilter, 8)) Then Return SetError(2, 2, "")
    If (StringMid($sPath, StringLen($sPath), 1) = "\") Then $sPath = StringTrimRight($sPath, 1) ; needed for Win98 for x:\  root dir

    If $iFlag > 3 Then
        $sDelim &= $sPath & "\"
        $iFlag -= 4
    EndIf

    $hSearch = FileFindFirstFile($sPath & "\" & $sFilter)
    If $hSearch = -1 Then
        Return SetError(4, 4, "")
    EndIf

    Switch $iFlag
        Case 0; Files and Folders
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 1; Files Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; bypass folder
                ;If @extended Then ContinueLoop ; This line for new beta instead of next line.
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") <> 0 Then ContinueLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 2; Folders Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; bypass file
                ;If @extended = 0 Then ContinueLoop ; This line for new beta instead of next line.
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") = 0 Then ContinueLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case Else
            FileClose($hSearch) ; close the search handle before returning on an invalid flag
            Return SetError(3, 3, "")
    EndSwitch

    FileClose($hSearch)
    Return StringSplit(StringTrimLeft($sFileList, 1), "|")
EndFunc   ;==>_FileListToArray3
I ran that on the 3.3 release only, and the release version of AutoIt was faster 4 out of 5 times. With the beta there was no contest: the beta was much faster.

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.

I'm not sure, but did you mix up the var names?

MsgBox (0, "", $timer1 & @CRLF & $timer2), where $timer1 is the org. UDF and $timer2 is the modified one

$timer2 (the modified one) is always faster for me (~2 times) than the org. UDF.

Edited by KaFu

I certainly did :D:D

Edit:

I always label the output with text identifying what was being run; I shouldn't have just run it out of the box.

Edited by SmOke_N


I still use AutoIt 3.2.12.1.

On my EEE with WinXP, the standard _FileListToArray takes more than 10x longer, no matter the calling order!

I made a modified version using

$sFileList &= $sFile & $sDelim
    Return StringSplit(StringTrimRight($sFileList, 1), "|")

instead of

$sFileList &= $sDelim & $sFile
    Return StringSplit(StringTrimLeft($sFileList, 1), "|")

because I thought that a right trim (and maybe also right-side concatenation) of a very long string should be simpler, faster and use less memory.

But the resulting time was the same as for the left-sided one.
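
A minimal sketch of that comparison, for anyone who wants to repeat it without a large directory at hand. The item count and the generated "file" names are assumptions made up for illustration; they are not the data behind the timings reported in this thread.

```autoit
; Sketch: left-sided vs. right-sided delimiter when building the list string.
; $iCount and the generated names are illustrative assumptions.
Global Const $iCount = 50000
Global $sList, $aList, $hTimer

; Left-sided: "|name1|name2...", then trim the leading delimiter.
$sList = ""
$hTimer = TimerInit()
For $i = 1 To $iCount
    $sList &= "|" & "file" & $i & ".txt"
Next
$aList = StringSplit(StringTrimLeft($sList, 1), "|")
ConsoleWrite("Left-sided : " & Round(TimerDiff($hTimer)) & " ms, " & $aList[0] & " items" & @CRLF)

; Right-sided: "name1|name2|...", then trim the trailing delimiter.
$sList = ""
$hTimer = TimerInit()
For $i = 1 To $iCount
    $sList &= "file" & $i & ".txt" & "|"
Next
$aList = StringSplit(StringTrimRight($sList, 1), "|")
ConsoleWrite("Right-sided: " & Round(TimerDiff($hTimer)) & " ms, " & $aList[0] & " items" & @CRLF)
```

Both variants should report the same item count; only the timings are of interest, and they will vary with the machine and the AutoIt version.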

I looked into AutoIt's sources and saw that right-side string operations (StringTrimRight) are implemented the same way as left-sided ones.

I think AutoIt could be better optimized for right-side string trim operations.

Currently both StringTrimLeft and StringTrimRight use AString::assign(), which I think is not efficient in the case of StringTrimRight.

But I'm not sure about this because I'm not a C++ expert.

Another idea is to use ASM-optimized code for the inner loop. The results could be amazing.

EDIT: See this post about ASM optimizations of AutoIt's loops:

http://www.autoitscript.com/forum/index.php?showtopic=96330

I think in this case it could be worthwhile to use ASM optimization.

Edited by Zedna

Because of fluctuations in the tests, and because of what SmOke_N said, I redid my test protocol.

In this one, I run a loop of 10 tests and put the original _FileListToArray at the end of each test (for SmOke_N :D ).

Before each test I do a Sleep(500), and I call the function once without timing it (I do this because I have noticed that, if the function has not been run once beforehand, the first run sometimes takes longer than normal :D ).

Then I start the timer with TimerInit and call the function.

Each timer result is accumulated in a variable so that, once the loop is done, it holds the average time over the 10 runs.

Just for information, putting the original _FileListToArray at the end of the test does not change the results for me (run your own test to see).

This is the code of my test protocol:

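#include <file.au3>
; _FileListToArray2() (modified version) and _FileListToArray3() (Spiff59 version)
; must be pasted into this script before running it.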

Dim $TestDir = @ScriptDir & "\50000_Files"
;Dim $TestDir = @ScriptDir & "\100000_Files"
;Dim $TestDir = "\\Diskstation_207\Share\50000_Files"

Dim $FileList[1], $dif1, $dif2, $dif3
Dim $Loop = 10

For $j = 1 To $Loop

    Sleep(500)
    _FileListToArray2($TestDir, "*")
    $begin = TimerInit()
    $FileList = _FileListToArray2($TestDir, "*")
    $dif2 += Round(TimerDiff($begin)) / $Loop
    ConsoleWrite("Test 2 - Loop " & $j & @CRLF)

    Sleep(500)
    _FileListToArray3($TestDir, "*")
    $begin = TimerInit()
    $FileList = _FileListToArray3($TestDir, "*")
    $dif3 += Round(TimerDiff($begin)) / $Loop
    ConsoleWrite("Test 3 - Loop " & $j & @CRLF)

    Sleep(500)
    _FileListToArray($TestDir, "*")
    $begin = TimerInit()
    $FileList = _FileListToArray($TestDir, "*") ;Original _FileListToArray function.
    $dif1 += Round(TimerDiff($begin)) / $Loop
    ConsoleWrite("Test 1 - Loop " & $j & @CRLF)

Next

MsgBox(0, "Time Difference for " & $FileList[0] & " files", "_FileListToArray Original = " & $dif1 & @CRLF & _
        "_FileListToArray Modified = " & $dif2 & @CRLF & "_FileListToArray Spiff59 = " & $dif3)

Exit

These are my results on my laptop for AutoIt 3.3.0.0 and AutoIt Beta 3.3.1.1.

Result for 100000 files (v3.3.0.0/v3.3.1.1) :

Original _FileListToArray : 2331 ms / 2392 ms

Modified _FileListToArray : 1346 ms / 1415 ms

Modified _FileListToArray : 1382 ms / 1418 ms (Spiff59 version)

Result for 50000 files (v3.3.0.0/v3.3.1.1) :

Original _FileListToArray : 1158 ms / 1180 ms

Modified _FileListToArray : 668 ms / 688 ms

Modified _FileListToArray : 679 ms / 698 ms (Spiff59 version)

Result for 50000 files over ethernet on a NAS (v3.3.0.0/v3.3.1.1) :

Original _FileListToArray : 20991 ms / 21183 ms

Modified _FileListToArray : 20415 ms / 20745 ms

Modified _FileListToArray : 21072 ms / 21290 ms (Spiff59 version)

Having said that, there is one thing that seemed rather strange to me.

If I am not wrong, @extended contains the result of FileFindNextFile() regarding the type of the entry (file or folder).

So testing:

If @extended  Then ...
should be faster than testing
If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") Then ...
Am I wrong?
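
One way to look into this is to time both folder checks on the same directory. The sketch below is only an illustration: _ListFilesOnly() and its $bUseExtended parameter are hypothetical names, @SystemDir is just a convenient test directory, and the @extended branch assumes a build in which FileFindNextFile() sets @extended to 1 for folders (the "new beta" mentioned in the code comments above).

```autoit
; Sketch: compare the FileGetAttrib() folder check with the @extended check.
; _ListFilesOnly() and $bUseExtended are hypothetical names used for illustration.
Global $hTimer, $aFiles

_ListFilesOnly(@SystemDir, False) ; warm-up run, not timed

$hTimer = TimerInit()
$aFiles = _ListFilesOnly(@SystemDir, False)
ConsoleWrite("FileGetAttrib check: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

$hTimer = TimerInit()
$aFiles = _ListFilesOnly(@SystemDir, True)
ConsoleWrite("Extended flag check: " & Round(TimerDiff($hTimer)) & " ms" & @CRLF)

Func _ListFilesOnly($sPath, $bUseExtended = False)
    Local $sFile, $sFileList = ""
    Local $hSearch = FileFindFirstFile($sPath & "\*")
    If $hSearch = -1 Then Return SetError(4, 4, "")

    While 1
        $sFile = FileFindNextFile($hSearch)
        If @error Then ExitLoop
        If $bUseExtended Then
            ; Assumes FileFindNextFile() sets @extended to 1 for folders.
            If @extended Then ContinueLoop
        Else
            ; Works on older versions: ask the file system for the attributes.
            If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") Then ContinueLoop
        EndIf
        $sFileList &= "|" & $sFile
    WEnd

    FileClose($hSearch)
    If $sFileList = "" Then Return SetError(4, 4, "")
    Return StringSplit(StringTrimLeft($sFileList, 1), "|")
EndFunc   ;==>_ListFilesOnly
```

On a version where FileFindNextFile() does not set @extended, the second run would not skip folders, so the two results are only comparable on a build that supports it.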

And why does the beta version seem to be slightly slower than the release version (only by a few ms, but all the same)?

I will be back tomorrow, because this evening I am going to a concert by a great French singer: Johnny Hallyday.

Good evening/day to all.

Edited by Tlem

Best Regards.Thierry

All of the String*() functions seem to have gotten much faster starting with 3.3.0.0 so perhaps it's time to bite the bullet and update.

George


Another idea is to use ASM optimized code for inner loop. This way results may be amazing.

EDIT: See this post about ASM optimizations of Autoit's loops

http://www.autoitscript.com/forum/index.php?showtopic=96330

I thought about that too, I even started to write the code (however it's tricky to copy the raw string to an AutoIt array in an efficient way). I might or might not finish it later tonight.

Broken link? PM me and I'll send you the file!

Don't use an array; instead, translate only the inner WHILE loop to ASM.

StringSplit at the end is executed only once, so I don't think it will be the main time consumer.

In the ASM-optimized version there could also be a single WHILE loop with the case inside it, because I think the penalty will not be so bad in an ASM loop.

Edited by Zedna

Well yes, I know; it's concatenating all the strings inside the ASM and allocating the storage before entering the loop that's the tricky part.


I guess if you're squeezing whatever you can out of this routine then the following line offers room for improvement:

If (StringMid($sPath, StringLen($sPath), 1) = "\") Then $sPath = StringTrimRight($sPath, 1)

I don't know if the line is really necessary, but dumping the StringMid and StringLen for a single StringRight seemed obvious, and using a StringRegExpReplace I assumed would be the fastest. Oddly, in my tests using this:

$sPath = StringRegExpReplace($sPath, "\\\z", "")

was slower than:

If StringRight($sPath, 1) = "\" Then $sPath = StringTrimRight($sPath, 1)
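
For anyone who wants to reproduce the comparison, here is a rough timing sketch of the three variants. The sample path and the loop count are assumptions chosen for illustration, not the values used in the tests described above.

```autoit
; Sketch: time three ways of dropping a trailing backslash.
; $iLoops and $sSample are illustrative assumptions.
Global Const $iLoops = 100000
Global Const $sSample = "C:\Some\Folder\"
Global $sPath, $hTimer

$hTimer = TimerInit()
For $i = 1 To $iLoops
    $sPath = $sSample
    If StringMid($sPath, StringLen($sPath), 1) = "\" Then $sPath = StringTrimRight($sPath, 1)
Next
ConsoleWrite("StringMid + StringLen : " & Round(TimerDiff($hTimer)) & " ms -> " & $sPath & @CRLF)

$hTimer = TimerInit()
For $i = 1 To $iLoops
    $sPath = $sSample
    If StringRight($sPath, 1) = "\" Then $sPath = StringTrimRight($sPath, 1)
Next
ConsoleWrite("StringRight           : " & Round(TimerDiff($hTimer)) & " ms -> " & $sPath & @CRLF)

$hTimer = TimerInit()
For $i = 1 To $iLoops
    $sPath = StringRegExpReplace($sSample, "\\\z", "")
Next
ConsoleWrite("StringRegExpReplace   : " & Round(TimerDiff($hTimer)) & " ms -> " & $sPath & @CRLF)
```

All three lines should end with the same trimmed path; since the statement runs only once per call of the real function, the absolute difference there will be far smaller than what this loop shows.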

So here's the version from Tlem's last post, but using the StringRight:

Func _FileListToArray($sPath, $sFilter = "*", $iFlag = 0)
    Local $hSearch, $sFile, $sFileList, $sDelim = "|"

    If Not FileExists($sPath) Then Return SetError(1, 1, "")
    If StringRegExp($sFilter, "[\\/:<>|]") Or (Not StringStripWS($sFilter, 8)) Then Return SetError(2, 2, "")
    If StringRight($sPath, 1) = "\" Then $sPath = StringTrimRight($sPath, 1) ; needed for Win98 for x:\ root dir

    If $iFlag > 3 Then
        $sDelim &= $sPath & "\"
        $iFlag -= 4
    EndIf

    $hSearch = FileFindFirstFile($sPath & "\" & $sFilter)
    If $hSearch = -1 Then Return SetError(4, 4, "")

    Switch $iFlag
        Case 0 ; Files and Folders
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 1 ; Files Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; If @extended Then ContinueLoop ; bypass folder (for beta version)
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") <> 0 Then ContinueLoop ; bypass folder
                $sFileList &= $sDelim & $sFile
            WEnd
        Case 2 ; Folders Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
                ; If @extended = 0 Then ContinueLoop ; bypass file (for beta version)
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") = 0 Then ContinueLoop ; bypass file
                $sFileList &= $sDelim & $sFile
            WEnd
        Case Else
            FileClose($hSearch) ; close the search handle before returning on an invalid flag
            Return SetError(3, 3, "")
    EndSwitch

    FileClose($hSearch)
    Return StringSplit(StringTrimLeft($sFileList, 1), "|")
EndFunc   ;==>_FileListToArray

Edited by Spiff59

It is not my version, but SolidSnake's, improved by several people (me and you in particular :D ).

Are you sure that your improvement of

If (StringMid($sPath, StringLen($sPath), 1) = "\") Then $sPath = StringTrimRight($sPath, 1)
works in all cases?

Did you test it under Win98 or with a UNC path?

This statement is executed only once, so the gain can only be about 2-5 ms ...

Is it necessary to modify this line for just that?

Edited by Tlem

Best Regards.Thierry

I don't consider this anyone's version, it's a team effort, I hope!

I am sure that StringRight($x,1) works identically to StringMid($x,StringLen($x),1), it's just faster.
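
A quick way to check the string logic for different kinds of paths is sketched below. The sample paths, including the UNC one, are made-up examples, and this only compares the two expressions; it says nothing about how Win98 itself behaves.

```autoit
; Sketch: check that both expressions return the same last character.
; The sample paths are made-up examples.
Global $aPaths[4] = ["C:\", "C:\Temp\", "C:\Temp", "\\Server\Share\"]
Global $sOld, $sNew

For $i = 0 To UBound($aPaths) - 1
    $sOld = StringMid($aPaths[$i], StringLen($aPaths[$i]), 1)
    $sNew = StringRight($aPaths[$i], 1)
    ; "==" is a case-sensitive string comparison.
    ConsoleWrite('"' & $aPaths[$i] & '": same result = ' & ($sOld == $sNew) & @CRLF)
Next
```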

If we're trying to reach a consensus version, one that, once proven stable, might get implemented, it might as well have all the improvements we can think of included. Otherwise, someone may come along and create a BugTrac saying "Hey, we can make this faster by changing this one line...". The fact that the current version generates so many BugTracs does indicate the routine needs some work. It would be nice to see a version go in that had such general approval that it might go a few years without creating a slew of BugTracs.


Just one word from me about If...Then...

I was talking generally and suggested that a test might show whether there would be a gain in efficiency.

I see that you tested and concluded that there isn't any. That's OK by me.

What I don't understand is the way the tests are done (Melba23 and others) ...and the codeless speculations of some.

No one tested the situation where something is actually done when the condition is met.

...I will edit this post with some code to show you what I mean, just for the sake of completeness of the tests, or for some to get the full picture.

edit:

Testing three different ways. When the condition is met, the $x variable is increased by one. If the test is successful, the result to verify should be the same for all three ways. The first number is the time in ms spent in the whole loop for each case. The smaller the number, the faster the way (less time passed). The real test starts after the initial warm-up.

$fred = 0

ConsoleWrite("Warming..." & @CRLF)
;warm up...
$x = 0
$begin = TimerInit()
For $i = 1 To 1000000
    If $fred = 0 Then
        $x = 1
    Else
        $x = -1
    EndIf
Next
;end warm up...




ConsoleWrite("Let's start:" & @CRLF)


;xxxxxxxxxxxxxxxxxxxxxxx Switch...EndSwitch xxxxxxxxxxxxxxxxxxxxxxxxx
$x = 0
$begin = TimerInit()
For $i = 1 To 1000000
    Switch $fred
        Case 0
            $x += 1
    EndSwitch
Next
ConsoleWrite("Switch... EndSwitch: " & Timerdiff($begin) & "  Result to verify: " & $x & @CRLF)
;xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


;xxxxxxxxxxxxxxxxxxxxxxxxx If... Then... EndIf xxxxxxxxxxxxxxxxxxxxxxxx
$x = 0
$begin = TimerInit()
For $i = 1 To 1000000
    If $fred = 0 Then
        $x += 1
    EndIf
Next
ConsoleWrite("If... Then... EndIf: " & Timerdiff($begin) & "  Result to verify: " & $x & @CRLF)
;xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx


;xxxxxxxxxxxxxxxxxxxxxxxxxxx If...Then xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
$x = 0
$begin = TimerInit()
For $i = 1 To 1000000
    If $fred = 0 Then $x += 1
Next
ConsoleWrite("If... Then.........: " & Timerdiff($begin) & "  Result to verify: " & $x & @CRLF)
;xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Edited by trancexx


In view of the results of this test, we can only agree with you.

But having redone the tests once again (loop of 10 with 100000 files) taking your suggestions into account, it turns out that we lose around 100 ms. :D

For my testing, I replaced:

Case 1; Files Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then ExitLoop
; bypass folder
;If @extended Then ContinueLoop; This version for new beta
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") <> 0 Then ContinueLoop
                $sFileList &= $sDelim & $sFile
            WEnd
with
Case 1; Files Only
            While 1
                $sFile = FileFindNextFile($hSearch)
                If @error Then
                    ExitLoop
                EndIf
; bypass folder
;If @extended Then ContinueLoop; This version for new beta
                If StringInStr(FileGetAttrib($sPath & "\" & $sFile), "D") <> 0 Then
                    ContinueLoop
                EndIf
                $sFileList &= $sDelim & $sFile
            WEnd

I made the same modification for Case 0 and Case 2. :D

I do not know why, but it is no doubt related to the statement that follows the 'If'.

Edited by Tlem

Best Regards.Thierry
