StringRegExp confirmation

Steveiwonder · December 1, 2009

Hello,

I'm new to AutoIt and enjoying it a lot at the moment. I've made some pretty cool stuff!

The only thing I'm struggling with is regular expressions :-/

However, after much confusion and many different patterns, I managed to get what I wanted working. What I'm trying to confirm is whether I have done it correctly.

To all of you this will be the simplest RegExp match you've ever seen, but I get truly stuck when it comes to these.

This is meant to scan through the HTML pulled from a web page, find "<TR" case-insensitively, and then show me how many it found.

The following works, but have I done it correctly?

$html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"

$isFound = StringRegExp($html, "(?i)\<TR", 3)



For $element IN $isFound
    ConsoleWrite($element & @CRLF)
Next

ConsoleWrite("Total Matches Found: " & UBound($isFound) & @CRLF)

Just looking for some advice, TBH.

Anything is appreciated.

Thanks

Edited December 1, 2009 by Steveiwonder
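
One case the script above does not cover is a page with no "<TR" at all. With flag 3, StringRegExp() sets @error and does not return an array when nothing matches, so the For...In loop has nothing to iterate. A minimal sketch of a guarded count follows; the variable names are illustrative, and the "<" is left unescaped because it is not a special character in the pattern.

```autoit
Local $sHtml = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"

; Flag 3 = return an array of all global matches
Local $aFound = StringRegExp($sHtml, "(?i)<tr", 3)

If @error Then
    ; @error is non-zero when there are no matches (or the pattern is bad)
    ConsoleWrite("Total Matches Found: 0" & @CRLF)
Else
    For $sMatch In $aFound
        ConsoleWrite($sMatch & @CRLF)
    Next
    ConsoleWrite("Total Matches Found: " & UBound($aFound) & @CRLF)
EndIf
```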

GEOSoft · December 1, 2009

Your pattern will work, but there is an easier way that you could use as long as all you really need is the count.

$html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
StringRegExpReplace($html, "(?i)<tr.*?>", "")
If @Extended Then MsgBox(0, "Result", "There are " & @Extended & " <tr> elements on the page")

PsaltyDS · December 1, 2009

StringRegExp() is nice (and very geeky, if you're into that), but not always the fastest way. If you only need to count instances, this might be quicker:

; Generate about 1K lines
$html = "<TR tes tesn .... yest >/\/\@?''' <tR 1111> <tr adawd>" & @CRLF
For $n = 1 To 10
    $html &= $html
Next

; With StringRegExp()
$iTimer = TimerInit()
For $n = 1 To 1000
    $isFound = StringRegExp($html, "(?i)\<TR", 3)
Next
$iCount = UBound($isFound)
$iTimer = TimerDiff($iTimer)
ConsoleWrite("Total StringRegExp() Matches Found: " & $iCount & "; In " & Round($iTimer/1000, 3) & "sec" & @CRLF)

; With StringReplace
$iTimer = TimerInit()
For $n = 1 To 1000
    $isFound = StringReplace($html, "<TR", "")
Next
$iCount = @extended
$iTimer = TimerDiff($iTimer)
ConsoleWrite("Total StringReplace() Matches Found: " & $iCount & "; In " & Round($iTimer/1000, 3) & "sec" & @CRLF)

Results on my CPU:

Total StringRegExp() Matches Found: 3072; In 28.271sec
Total StringReplace() Matches Found: 3072; In 8.33sec

About three times as fast.

You would still want to use StringRegExp() for more complicated matches (e.g. "TR tags that do not contain any TD tags").
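
A note on why the two counts agree: the pattern uses (?i) to ignore case, while StringReplace() is not case sensitive by default (its optional casesense parameter defaults to 0). The short sketch below, using an illustrative sample string, shows how the count changes when casesense is set to 1.

```autoit
Local $sSample = "<TR a> <tR b> <tr c>"

; Default casesense (0): "<TR", "<tR" and "<tr" are all replaced
StringReplace($sSample, "<TR", "")
Local $iAnyCase = @extended ; number of replacements performed

; casesense = 1: only the exact-case "<TR" is replaced
StringReplace($sSample, "<TR", "", 0, 1)
Local $iExactCase = @extended

ConsoleWrite("Case-insensitive count: " & $iAnyCase & @CRLF)
ConsoleWrite("Case-sensitive count: " & $iExactCase & @CRLF)
```

Copying @extended into a variable straight after each call keeps the count from being overwritten by a later function call.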

PsaltyDS · December 1, 2009

P.S. If you are working with an active instance of IE, you could also just do _IETagNameGetCollection() and check @extended for the count. I haven't timed that.
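
A rough sketch of that idea, assuming Internet Explorer is installed and the IE.au3 UDF is available. It uses the UDF's built-in _IE_Example("table") demo page in place of a real site, and like the post above it has not been timed.

```autoit
#include <IE.au3>

; Open the demo page with tables that ships with the IE.au3 UDF
Local $oIE = _IE_Example("table")

; Collect every <tr> element; @extended holds the number of items
Local $oRows = _IETagNameGetCollection($oIE, "tr")
Local $iCount = @extended

ConsoleWrite("Total <tr> elements found: " & $iCount & @CRLF)

_IEQuit($oIE)
```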

trancexx · December 1, 2009

Ok, ok Psalty. We read you.

Steveiwonder · December 1, 2009

@GEOSoft - Your code didn't seem to do anything.

Did it work for you?

@Psalty, I'll have a look at this and see how I get on, thanks. How come it's so fast?

Is there anywhere I can learn more about AutoIt regular expressions so I don't have to bug people on here?

GEOSoft · December 1, 2009

Change it to this:

$html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
StringRegExpReplace($html, "(?i)<tr.*?>", "")
$iCount = @Extended
MsgBox(0, "Result", "There are " & $iCount & " <tr> elements on the page.")
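
One thing to watch when trying this against the sample string: it contains no ">" at all, so "(?i)<tr.*?>" has nothing to match and the count comes back as 0. That is a likely reason the first version showed no message box. On real HTML, where each tag is closed, the pattern counts as intended. The sketch below compares it with a variant that counts only the opening "<tr"; the \b word boundary and the variable names are illustrative choices, not part of the original code.

```autoit
Local $sHtml = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"

; Needs a closing ">" somewhere after "<tr"; the sample has none
StringRegExpReplace($sHtml, "(?i)<tr.*?>", "")
Local $iWithBracket = @extended

; \b makes "tr" end at a word boundary, so "<tr " and "<tr>" count but "<track" does not
StringRegExpReplace($sHtml, "(?i)<tr\b", "")
Local $iOpeningOnly = @extended

ConsoleWrite("Pattern requiring '>': " & $iWithBracket & @CRLF)
ConsoleWrite("Pattern for opening '<tr' only: " & $iOpeningOnly & @CRLF)
```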

Steveiwonder · December 1, 2009

Thanks a lot, both of you. Both work as needed.

I'm going to use GEOSoft's version for one reason only: I have no idea how to use regular expressions yet and I need to learn, so I figure this is the best way to start. It also seems more flexible for future use? (Correct me if I'm wrong.)

But thanks again, both of you.

PsaltyDS · December 1, 2009

trancexx said: "Ok, ok Psalty. We read you."

Oops, sloppy mousing...
