Steveiwonder Posted December 1, 2009 (edited)

Hello, I'm new to AutoIt and enjoying it a lot at the moment; I've made some pretty cool stuff! The only thing I'm struggling with is regular expressions. After much confusion and many different patterns I managed to get what I wanted working, and I'd like to confirm that I've done it correctly. To all of you this is going to be the simplest RegExp match you've seen, but I get truly stuck when it comes to these.

This is meant to scan through the HTML pulled from a web page, find "<TR" case-insensitively, and then show me how many it found. The following works, but have I done it correctly?

    $html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
    $isFound = StringRegExp($html, "(?i)\<TR", 3)
    For $element In $isFound
        ConsoleWrite($element & @CRLF)
    Next
    ConsoleWrite("Total Matches Found: " & UBound($isFound) & @CRLF)

I'm just looking for some advice, to be honest. Anything is appreciated. Thanks

Edited December 1, 2009 by Steveiwonder
GEOSoft Posted December 1, 2009

Your pattern will work, but there is an easier way that can be used as long as all you really need is the count.

    $html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
    ; Note: this pattern needs a closing ">" after each tag, which the sample string above does not contain.
    ; @extended holds the number of replacements StringRegExpReplace() performed.
    StringRegExpReplace($html, "(?i)<tr.*?>", "")
    If @extended Then MsgBox(0, "Result", "There are " & @extended & " <tr> elements on the page")

George
PsaltyDS Posted December 1, 2009

StringRegExp() is nice (and very geeky, if you're into that), but it is not always the fastest way. This might be quicker if you just want to count instances:

    ; Generate about 1K lines (the line is doubled 10 times = 1024 lines, 3 tags each)
    $html = "<TR tes tesn .... yest >/\/\@?''' <tR 1111> <tr adawd>" & @CRLF
    For $n = 1 To 10
        $html &= $html
    Next

    ; With StringRegExp()
    $iTimer = TimerInit()
    For $n = 1 To 1000
        $isFound = StringRegExp($html, "(?i)\<TR", 3)
    Next
    $iCount = UBound($isFound)
    $iTimer = TimerDiff($iTimer)
    ConsoleWrite("Total StringRegExp() Matches Found: " & $iCount & "; In " & Round($iTimer/1000, 3) & "sec" & @CRLF)

    ; With StringReplace() (not case sensitive by default)
    $iTimer = TimerInit()
    For $n = 1 To 1000
        $isFound = StringReplace($html, "<TR", "")
    Next
    $iCount = @extended
    $iTimer = TimerDiff($iTimer)
    ConsoleWrite("Total StringReplace() Matches Found: " & $iCount & "; In " & Round($iTimer/1000, 3) & "sec" & @CRLF)

Results on my CPU:

    Total StringRegExp() Matches Found: 3072; In 28.271sec
    Total StringReplace() Matches Found: 3072; In 8.33sec

About three times as fast. You would still want to use StringRegExp() for more complicated matches (e.g. "TR tags that do not contain any TD tags").
PsaltyDS Posted December 1, 2009

P.S. If you are working with an active instance of IE, you could also just do _IETagNameGetCollection() and check @extended for the count. I haven't timed that.
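For reference, the _IETagNameGetCollection() idea from the P.S. might look roughly like the sketch below. It assumes the standard IE.au3 UDF and a working Internet Explorer installation; the URL is a placeholder and the variable names are made up for illustration. It has not been timed against the string-based approaches.

```autoit
#include <IE.au3>

; Assumption: placeholder URL; substitute the page you actually want to inspect.
Local $oIE = _IECreate("http://www.example.com/")
If @error Then Exit 1 ; could not create or attach to a browser instance

; With no index argument the whole collection is returned.
; On success, @extended holds the number of elements in it.
Local $oRows = _IETagNameGetCollection($oIE, "tr")
Local $iError = @error, $iCount = @extended ; capture both before another call resets them

If $iError Then
    ConsoleWrite("Could not read the <tr> collection" & @CRLF)
Else
    ConsoleWrite("Total <tr> elements: " & $iCount & @CRLF)
EndIf

_IEQuit($oIE)
```

Because this counts elements in the parsed DOM rather than text in the raw source, the number can differ from a string search when the markup is malformed or the page is modified by scripts.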
trancexx Posted December 1, 2009

Ok, ok Psalty. We read you.
Steveiwonder Posted December 1, 2009 (Author)

@GEOSoft - Your code didn't seem to do anything. Did it work for you?

@Psalty - I will have a look at this and see how I get on, thanks. How come it's so fast?

Is there anywhere I can learn more about AutoIt regular expressions so I don't have to bug people on here?
GEOSoft Posted December 1, 2009

Change it to this:

    $html = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
    StringRegExpReplace($html, "(?i)<tr.*?>", "")
    ; Save the replacement count right away, before another call resets @extended
    $iCount = @extended
    MsgBox(0, "Result", "There are " & $iCount & " <tr> elements on the page.")

George
Steveiwonder Posted December 1, 2009 (Author)

Thanks a lot, both of you. Both work as needed. I'm going to use GEOSoft's version for one reason only: I have no idea how to use regular expressions yet and I need to learn, so I figure this is the best way to start. It also seems more flexible for future use? (Correct me if I'm wrong.) But thanks again, both of you.
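On the "have I done it correctly" part of the regex route: one detail worth guarding against is the no-match case. With flag 3, StringRegExp() sets @error and does not return an array when nothing matches, so it seems safer to check @error before looping over or sizing the result. The helper below is only a sketch; _CountOpenTags and its parameter names are hypothetical, and it assumes the tag name passed in contains no regex metacharacters.

```autoit
; Hypothetical helper: count opening tags of a given name, ignoring case.
Func _CountOpenTags($sHtml, $sTag)
    ; \b keeps "<tr" from also matching longer tag names such as "<track"
    Local $aMatches = StringRegExp($sHtml, "(?i)<" & $sTag & "\b", 3)
    If @error Then Return 0 ; no matches (or a bad pattern): no array was returned
    Return UBound($aMatches)
EndFunc

Local $sHtml = "<TR tes tesn .... yest /\/\@?''' <tR 1111 <tr adawd"
ConsoleWrite("Total <tr> matches found: " & _CountOpenTags($sHtml, "tr") & @CRLF)
ConsoleWrite("Total <td> matches found: " & _CountOpenTags($sHtml, "td") & @CRLF)
```

The backslash before "<" in the original pattern is harmless but not needed, since "<" has no special meaning at that position in a PCRE pattern.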
PsaltyDS Posted December 1, 2009

"Ok, ok Psalty. We read you."

Oops, sloppy mousing...