Slow String Processing


I know an AutoIt script isn't real compiled code, even after compiling, but I compared PHP and AutoIt on a simple string comparison task.

I need AutoIt for this because I want to build a small database of MP3s with their MPEG info and tags.

Why AutoIt? Because PHP is not very handy for scheduling, which works well in AutoIt.

I tested it with a folder which contains about 13,000 MP3 files.

I use dir to scan the folder (works fine and quickly).

I use StdoutRead() to read the dir output.

This also works fast.

The string reaches about 1 MB.

Saving the string is quick; it takes only half a second.

Then I tried comparing parts of the string:

scanning for CR and LF characters (to count the number of files).

I didn't think this would take much time,

but it takes about 5 minutes just to count.

I first tried it with StringInStr(); that took the same time.

I thought maybe StringInStr rereads the whole string every time,

so I used StringMid and a simple If: same result.

In short, string comparison takes really long.

I might think that's because it's not real compiled code, but I tested the same thing with PHP (OK, for some reason it cannot count the CR LF, but it works with normal letters). In PHP it takes 3 seconds to count occurrences of some string in the large string.

PHP isn't compiled code either.

I know 5 minutes is still workable, because it's only going to update every hour or so,

but it doesn't only need to count the files; it also has to read every file and read the ID3 tags and MPEG info.

That would mean more time spent comparing strings. This would take hours. I have an external binary (exe) to read the MP3s, but even then, comparing the 13,000 results from that program would also take hours.

Is there something which can increase the speed in AutoIt?

Maybe a different function?

Example code: AutoIt takes 5 minutes to count @CRLF in the result of "dir" with 13,000 lines of names.

#include <Constants.au3>

$foo = Run(@ComSpec & " /c dir *.mp3 /b /o /a /s", "i:\!=- mp3", @SW_HIDE, $STDERR_CHILD + $STDOUT_CHILD)
$line = ""
While 1
    $line &= StdoutRead($foo)
    If @error = -1 Then ExitLoop
Wend
$count1 = 0
$count2 = 0


;this takes incredibly long
while 1
    $count1 += 1
    if ($count1 + 1 ) > stringlen($line) then exitloop
    if stringmid($line,$count1,2) == @CRLF then $count2 +=1
wend
;until here


filewrite("test.txt",$line)
msgbox(0,"",$count2 & " " & $count1)

Example code: PHP takes 3 seconds to count CR LF or some other string in the result of "dir" with 13,000 lines of names.

<?php

$blaat = shell_exec("dir *.mp3 /b /o /a /s");


$count1 = 0;
$count2 = 0;
$string2 = chr(13).chr(10);

while (true == true) {
    $count1 += 1;
    if (($count1 + 1 ) > strlen($blaat)) break;
    if (!strcasecmp(substr($blaat,$count1,2),$string2)) $count2 +=1;
}



echo $count2;
?>
Maybe try StringRegExp() to return an array and then UBound() it? Sorry, I'm hoping I'm following what you're looking for correctly.


I don't expect AutoIt to be optimized for string handling like PHP is. So it is possible you have hit the worst-case scenario for AutoIt.

How about using something like this? (I have not tested this, nor have I used StringRegExp in AutoIt before.)

;StringRegExp ( "test", "pattern" [, flag ] )
dim $arr = StringRegExp($line, @CRLF, 3); 3 or 1.
$count = UBound($arr)

Too bad no reactions...

Well, give me a while :-)

$line = ""

For $i = 1 To 500
    $line = $line & "test data " & @CRLF
Next

;MsgBox(0,"",$line)

$count1 = 0
$count2 = 0
;this takes incredibly long
$start = TimerInit()
While 1
    $count1 = $count1 + 1
    if ($count1 + 1) > StringLen($line) Then ExitLoop
    If StringMid($line, $count1, 2) == @CRLF Then $count2 = $count2 + 1
WEnd
MsgBox(0, "", (TimerDiff($start) / 1000))
;until here
MsgBox(0, "", $count2 & " " & $count1)


$count1 = 0
$count2 = 0
$start = TimerInit()
$count1 = StringLen($line)
$count2 = StringLen(StringAddCR($line)) - StringLen($line)
MsgBox(0, "", (TimerDiff($start) / 1000))
MsgBox(0, "", $count2 & " " & $count1)
I'm not sure that I understood what you wanted with $count1

If this works for you, credit tylo and my selective memory:

http://www.autoitscript.com/forum/index.ph...indpost&p=44449

If it does not do what you wanted - forget that I ever posted to this thread :-)

Edit: 13000 lines in 0.016 seconds on a slow system
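
For context, the length trick above works because StringAddCR, per the AutoIt help file, prefixes each linefeed with a carriage return, so the string grows by one character per line. Another loop-free way to count, which nobody in this thread used and which is offered here only as an alternative, is to let StringReplace report how many replacements it made through @extended. A minimal sketch, assuming made-up sample data in place of the real dir output:

```autoit
; Sketch only: $sSample is made-up data standing in for the dir output.
Local $sSample = "a.mp3" & @CRLF & "b.mp3" & @CRLF & "c.mp3" & @CRLF

; Replace every @CRLF with itself; the returned string is discarded.
StringReplace($sSample, @CRLF, @CRLF)

; @extended holds the number of replacements made by the last call,
; so it has to be read before any other function is called.
Local $iLines = @extended
ConsoleWrite("Lines: " & $iLines & @CRLF)
```

Like the StringAddCR version, this hands the whole scan to a single built-in call instead of stepping through the string one character at a time in script code.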

Edited by herewasplato


Indeed, that's quick, but because I'll also need to filter and read the ID3 tags, I think none of this will really help.

But for reference, the StringAddCR function comes in at 0.04 secs (longer than in herewasplato's example because my lines are much longer, but still the winner in line counting)

and the regexp function with UBound takes 0.19 secs.

So StringRegExp could do some things for me, because I need more than just counting lines (which is still necessary).

Another thing: the StringRegExp example inspired me to test the same thing with StringSplit and UBound,

which came in at about 0.03 secs and seems to be the winner here.

regexp result: 0.202915733703585 secs

$start = TimerInit()
$arr = StringRegExp($line, "(" & @CRLF & ")", 3); 3 or 1.
$count2 = UBound($arr)
MsgBox(0, "", (TimerDiff($start) / 1000))

StringLen with StringAddCR: 0.0291662767195272 secs

$count3 = 0
$start = TimerInit()
$count3 = StringLen(StringAddCR($line)) - StringLen($line)
MsgBox(0, "", (TimerDiff($start) / 1000))

StringSplit: 0.029129959254598 secs

$count4 = 0
$start = TimerInit()
$arr = Stringsplit($line, @CRLF,1)
$count4 = UBound($arr)-2
MsgBox(0, "", (TimerDiff($start) / 1000))

After much testing, StringSplit is only a tad faster, and the margin varies (it looks random, maybe PC usage), but it is always quicker.

And StringRegExp can help me filter the strings which are returned by the external mp3info.exe (see Google; it's quite handy, quick, and open source, which I need because it lacks ID3 v2 support).

So to count the lines I can use the trick found by tylo which herewasplato remembered.

Correction: I would use the StringSplit approach.

And maybe the rest can be done with StringRegExp (thanks uten and smoke_N for reminding me), which I have often used already. But I figured regexp would be slower (because logically it does more complex calculations), yet it is faster than the internal compare functions which are used in If, While, etc.

Stupid of me to assume that regexp would be slower.

PS: I read the whole thread on line counting. Funny how very long functions which even looked a bit complex were beaten by a simple 3-line piece of code.
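
To illustrate the filtering idea mentioned above, here is a rough sketch of pulling fields out of tool output with StringRegExp. The "Key: value" layout, the sample text, and the variable names are all assumptions made up for the example; they are not the documented output of mp3info.exe. It also assumes a current AutoIt release, where flag 3 returns all captured groups in one flat array.

```autoit
; Sketch only: the "Key: value" layout below is an assumed format,
; not the documented output of mp3info.exe.
Local $sOutput = "Artist: Some Band" & @CRLF & _
        "Title: Some Song" & @CRLF & _
        "Bitrate: 192" & @CRLF

; Flag 3 returns every captured group from all matches in one flat array:
; [key, value, key, value, ...]
Local $aFields = StringRegExp($sOutput, "(\w+):[ \t]*([^\r\n]*)", 3)
If @error Then
    ConsoleWrite("No fields found" & @CRLF)
Else
    ; Walk the array two elements at a time: key, then value.
    For $i = 0 To UBound($aFields) - 2 Step 2
        ConsoleWrite($aFields[$i] & " = " & $aFields[$i + 1] & @CRLF)
    Next
EndIf
```

One call handles the whole string, which fits the timings reported above: a single built-in call was far cheaper than a character-by-character loop in script code.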

Edited by MrSpacely

UBound is not needed with StringSplit:

$line = ""

For $i = 1 To 500
    $line = $line & "test data " & @CRLF
Next

$start = TimerInit()
$arr = StringSplit($line, @CRLF, 1)
InputBox("", "", (TimerDiff($start) / 1000))
MsgBox(0, "", $arr[0] - 1)
0.00128899063987183 (P3, 1GHz, 512MB, .117 Beta AutoIt)

StringSplit

...the first element ($array[0]) contains the number of strings returned,...
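
Since the original goal was to process every file name and not just count them, the same array can be walked directly. A minimal sketch, assuming made-up sample data in place of the real dir output (the variable names are illustrative):

```autoit
; Sketch only: $sList is made-up data standing in for the dir output.
Local $sList = "a.mp3" & @CRLF & "b.mp3" & @CRLF & "c.mp3" & @CRLF

; Flag 1 treats @CRLF as one delimiter instead of splitting on CR and LF separately.
Local $aFiles = StringSplit($sList, @CRLF, 1)

; $aFiles[0] is the element count; the trailing @CRLF leaves one empty last element.
For $i = 1 To $aFiles[0]
    If $aFiles[$i] = "" Then ContinueLoop ; skip the empty trailing entry
    ConsoleWrite($i & ": " & $aFiles[$i] & @CRLF)
Next
```

The empty-string check is why the count earlier in the thread is taken as $arr[0] - 1: the final @CRLF produces one extra, empty element.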

Even though you stated, "I use the stdread thing to read the dir output" and showed the same in your code, my feeble mind was stuck on the DIR output being a file, so I went hunting for tylo's code for FileCountLines. (I also remembered Jon's comment about tylo's post, "...it's a whole different sort of cheating", which is what I searched on.) Once I found the post, I realized that FileCountLines is now part of File.au3 (as _FileCountLines).
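
If the dir output were redirected to a file after all, the File.au3 function could be used along these lines. This is only a sketch: the temp file name and its contents are invented for the example.

```autoit
#include <File.au3>

; Sketch only: the temp file and its contents are made up for illustration.
Local $sPath = @TempDir & "\dirlist_example.txt"
FileDelete($sPath)
FileWrite($sPath, "a.mp3" & @CRLF & "b.mp3" & @CRLF & "c.mp3" & @CRLF)

; _FileCountLines returns the number of lines in the file and sets @error on failure.
Local $iLines = _FileCountLines($sPath)
If @error Then
    ConsoleWrite("Could not read " & $sPath & @CRLF)
Else
    ConsoleWrite("Lines: " & $iLines & @CRLF)
EndIf
FileDelete($sPath)
```

For output that is already in memory from StdoutRead, the StringSplit approach above avoids the extra trip through the disk.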
