
Help make StringInStr loop Faster


CrypticKiwi

Hello,

I will be using code like the snippet below a lot, and it is very slow. Can anyone help me make it faster, perhaps with an alternative to StringInStr?

I would also appreciate it if someone could explain the relationship between the counter length and the loop speed.

 

#include <File.au3>
#include <String.au3>

$Handle = FileOpen("SampleDataFile.txt", 0)   ; 0 = open for reading
$Read = FileRead($Handle)
FileClose($Handle)                            ; release the handle once the data is in memory
$Counter = 10000

For $i = 1 To StringLen($Read) Step 5

    For $P = 1 To 100   ; 100 is an estimate of max number of occurrences
        $Location = StringInStr($Read, $Counter, 0, $P)
        If $Location = 0 Then   ;Exitloop if all occurrences are found
            ExitLoop
        EndIf
    Next
    ToolTip($i)
    $Counter += 1

Next

I need to loop through all occurrences, rank them, and do other things, but I have included only the slow part (StringInStr) for clarity.

SampleDataFile.txt

kylomas

CrypticKiwi,

Two possible methods (if I'm understanding your purpose).

#include <File.au3>
#include <String.au3>

$Handle = FileOpen(@scriptdir & "\SampleDataFile.txt", 0)   ; 0 = open for reading
$Read = FileRead($Handle)
FileClose($Handle)                                          ; release the handle once the data is in memory

ConsoleWrite('! ---------  Non regexp method' & @CRLF)

For $i = 10000 To 10099

    ConsoleWrite('searching for ' & string($i) & @CRLF)
    stringreplace($Read,string($i),'')
    ConsoleWrite($i & ' appears ' & @extended & ' times' & @CRLF)

Next

ConsoleWrite('! ---------  regexp method' & @CRLF)

For $i = 10000 To 10099

    ConsoleWrite('searching for ' & string($i) & @CRLF)
    ConsoleWrite($i & ' appears ' & ubound(stringregexp($Read,$i,3)) & ' times' & @CRLF)

Next

 

Incidentally, I believe there is a logic flaw in the code you posted.  "Step 5" may miss occurrences of the target string.

kylomas



AndyG

Hi,

Why do you use the casesense=0 flag?

Searching is much faster with casesense=1.

#include <array.au3>
#include <String.au3>

Dim $occur[100][2]

$Read = FileRead("SampleDataFile.txt")
$Counter = 10000
$Location = 0               ;start searching

For $i = 1 To 100           ;10001 to 10100
    $hits = -1
    Do
        $hits += 1          ;we found a string
        $Location = StringInStr($Read, $Counter + $i, 1, 1, $Location + 1)
    Until $Location = 0     ;no string found
    $occur[$i - 1][0] = $Counter + $i ;strings
    $occur[$i - 1][1] = $hits ;number strings found
Next

_ArrayDisplay($occur, "occurrence/hits")

It is becoming increasingly hard to beat the regex engine :D

 

//EDIT

Interesting: if I run the following script on my laptop (AMD A6-3400M APU, quad-core) at a clock speed of 700 MHz (the script is not able to pull the processor out of sleep mode), the RegEx part is slightly faster than the StringInStr() loop.

But if I force the processor to run at full speed (2300 MHz), the StringInStr() loop is twice as fast as the RegEx!?

//EDIT2

The RegEx timing is much more sensitive to other activity (such as HDD access) on my computer. The RegEx part often takes twice as long, while the StringInStr() timing stays the same every time.

 

#include <array.au3>
#include <String.au3>

Dim $occur[100][2]

$Read = FileRead("SampleDataFile.txt")
$Counter = 10000
$Location = 0               ;start searching

$t=timerinit()
For $i = 1 To 100           ;10001 to 10100
    $hits = -1
    Do
        $hits += 1          ;we found a string
        $Location = StringInStr($Read, $Counter + $i, 1, 1, $Location + 1)
    Until $Location = 0     ;no string found
    $occur[$i - 1][0] = $Counter + $i ;strings
    $occur[$i - 1][1] = $hits ;number strings found
Next

$m=timerdiff($t)
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : loop= ' & $m & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console


$t=timerinit()
For $i = 1 To 100

;~     ConsoleWrite('searching for ' & string($i) & @CRLF)
;~     ConsoleWrite($i & ' appears ' & ubound(stringregexp($Read,$i,3)) & ' times' & @CRLF)
    $occur[$i - 1][0] = $Counter + $i ;strings
    $occur[$i - 1][1] = ubound(stringregexp($Read,$Counter + $i,3)) ;number strings found

Next

$m=timerdiff($t)
ConsoleWrite('@@ Debug(' & @ScriptLineNumber & ') : regex= ' & $m & @CRLF & '>Error code: ' & @error & @CRLF) ;### Debug Console

_ArrayDisplay($occur, "occurrence/hits")

 

CrypticKiwi

 

I don't understand. casesense=1 does make the search faster, thank you for that, but why would case sensitivity make any difference when I am searching for numbers? That is why I chose casesense=0, the default (not case sensitive): again, I am only searching for numbers.

Come to think of it, numbers or not, why would it make a difference in speed at all?

 

@kylomas

Using RegEx and @extended to get the number of occurrences are good ideas, but unfortunately I also need the location (position) of each occurrence.

Step 5 is needed in another script; it is fine to change it in this example.
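
For reference, a minimal sketch of how every position could be collected, assuming the same resume-from-the-previous-hit approach as AndyG's loop (the fifth `start` parameter of StringInStr) rather than the `occurrence` parameter. The helper name `_FindAllPositions` and the sample string are made up for illustration and are not part of anyone's script.

```autoit
; Hypothetical helper (not from the thread): returns an array where element [0]
; is the number of hits and elements [1..n] are the 1-based positions of every
; case-sensitive occurrence of $sNeedle in $sText.
Func _FindAllPositions(Const ByRef $sText, $sNeedle)
    Local $aPos[16]      ; initial capacity, doubled whenever it fills up
    Local $iCount = 0
    Local $iLoc = 0
    While True
        ; casesense = 1, occurrence = 1, start one character past the previous hit
        $iLoc = StringInStr($sText, $sNeedle, 1, 1, $iLoc + 1)
        If $iLoc = 0 Then ExitLoop                       ; nothing further to find
        If $iCount + 1 >= UBound($aPos) Then ReDim $aPos[UBound($aPos) * 2]
        $iCount += 1
        $aPos[$iCount] = $iLoc
    WEnd
    $aPos[0] = $iCount
    ReDim $aPos[$iCount + 1]                             ; trim unused slots
    Return $aPos
EndFunc

; Usage with a made-up sample string: reports positions 2 and 14.
Local $sData = "x10001y10002z10001"
Local $aHits = _FindAllPositions($sData, "10001")
For $i = 1 To $aHits[0]
    ConsoleWrite("10001 found at position " & $aHits[$i] & @CRLF)
Next
```

Because each search resumes just after the previous hit, the text is walked roughly once per search string. Asking for occurrence 1, 2, 3, ... in turn makes StringInStr locate all the earlier occurrences again on every call.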

 

 

Anyway, thank you both for your help. It is much faster now with casesense=1!

 

Chimp

..... come to think of it numbers or no numbers why would it make a difference in speed in any case! .....

I think the higher speed with casesense=1 makes sense.
If you compare two characters with casesense=1, you need only one comparison to check whether they are equal. Example: is a = a?
With casesense=0 (upper and lower case both match), you need to check twice. Example: is a = a? or is a = A? Either is acceptable, so both must be checked (two comparisons instead of one).
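
A rough way to check this on your own machine is to time the same search under each casesense value. The sketch below is only an assumption-laden illustration: the haystack is synthetic, the repeat count is arbitrary, and the numbers will vary with hardware and background load. As far as I recall, the help file describes flag 0 as a comparison using the user's locale and flag 2 as a basic/faster case-insensitive comparison, so flag 2 is included for comparison.

```autoit
; Sketch only: synthetic data, not the SampleDataFile.txt from the thread.
Local $sText = ""
For $i = 1 To 20000
    $sText &= "12345 "
Next
$sText &= "10001"                 ; the only match sits at the very end

Local $hTimer, $iPos

; 0 = not case sensitive (default), 1 = case sensitive,
; 2 = not case sensitive, basic comparison
For $iCase = 0 To 2
    $hTimer = TimerInit()
    For $n = 1 To 200             ; repeat the search to get a measurable time
        $iPos = StringInStr($sText, "10001", $iCase)
    Next
    ConsoleWrite("casesense=" & $iCase & ": " & Round(TimerDiff($hTimer), 1) & " ms" & @CRLF)
Next
```

Run it a few times before drawing conclusions, since single timings can swing with other activity on the computer.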


