PMac, posted July 22, 2008

Hi, I'm new to AutoIt, with some limited experience of Python, and I've been trying to figure out the text-handling side of it. I've written a script which opens Firefox, logs into my Travian (online game) account, copies the page source to the clipboard and assigns a variable to it, but I've gotten stuck when trying to parse the page source and extract the information I want. I followed the tutorial for StringRegExp in AutoIt Help and was able to pull out a string, but when I tried to process that string through StringRegExp a second time, it threw up an error.

Here's some code to illustrate the problem:

    ;Get page source and parse it for resource amounts and other info
    Sleep(3000)
    ;opens the page source viewer in firefox
    Send("^u")
    ;Give the browser time to respond
    Sleep(100)
    ;Highlight the contents
    Send("^a")
    ;Copy them to the clipboard
    Send("^c")
    $pagesource = Clipget()

The section of HTML source I'm interested in looks like this:

    <td><img class="res" src="img/un/r/1.gif" title="Wood"></td>
    <td id=l4 title=8>768/800</td>
    <td class="s7"> <img class="res" src="img/un/r/2.gif" title="Clay"></td>
    <td id=l3 title=8>768/800</td>
    <td class="s7"> <img class="res" src="img/un/r/3.gif" title="Iron"></td>
    <td id=l2 title=8>768/800</td><td class="s7"> <img class="res" src="img/un/r/4.gif" title="Wheat"></td>
    <td id=l1 title=10>773/800</td>

To process it, I tried to isolate each resource by including their unique id's in the search pattern, which worked fine.
This pulls out the stats for the wood resource, along with the HTML that identifies it as being the wood resource:

    $wood1 = StringRegExp($pagesource, "(id=l4 title=8>[0-9]{3,9}/[0-9]{3,9})", 1)
    MsgBox(0, "Wood1", $wood1[0])

This returns "id=l4 title=8>768/800"

The problem came when I tried to remove the extraneous HTML in a second processing step:

    $wood2 = StringRegExp($wood1, "([0-9]{3,9}/[0-9]{3,9})", 1)
    MsgBox(0, "Wood2", $wood2[0])

This returns the following error:

    MsgBox(0, "Wood2", $wood2[0])
    MsgBox(0, "Wood2", $wood2^ ERROR
    Error: Subscript used with non-Array variable.

Can someone please point out what I'm doing wrong?

Also, is there any good beginners material online? I've searched, but I've only found a handful of tutorials, and I've mostly been stumbling around in the dark trying to figure out how to do things from reading the official documentation.
Paulie, posted July 22, 2008 (edited July 22, 2008 by Paulie)

First of all, you are much better off with _INetGetSource().

Secondly, try this:

    #include <Array.au3>

    $String = '<td><img class="res" src="img/un/r/1.gif" title="Wood"></td>'&@CRLF& _
        '<td id=l4 title=8>768/800</td>'&@CRLF& _
        '<td class="s7"> <img class="res" src="img/un/r/2.gif" title="Clay"></td>'&@CRLF& _
        '<td id=l3 title=8>768/800</td>'&@CRLF& _
        '<td class="s7"> <img class="res" src="img/un/r/3.gif" title="Iron"></td>'&@CRLF& _
        '<td id=l2 title=8>768/800</td><td class="s7"> <img class="res" src="img/un/r/4.gif" title="Wheat"></td>'&@CRLF& _
        '<td id=l1 title=10>773/800</td>'

    $NumberPattern = "\d{3,9}/\d{3,9}"
    $TitlePattern = '(?: title=")(.{4,6})"'

    $Result2 = StringRegExp($String, $NumberPattern, 3)
    $Result1 = StringRegExp($String, $TitlePattern, 3)

    $Bound = Ubound($Result1)
    Dim $Combo[$Bound][2]
    For $x = 0 to $Bound-1
        $Combo[$x][0] = $Result1[$x]
        $Combo[$x][1] = $Result2[$x]
    Next
    _ArrayDisplay($Combo)

Or, if you want to do it your way and look up each value by its ID, what you need is a non-capturing group: (?: ...)
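The non-capturing group suggestion can be sketched as follows. This is a minimal illustration rather than Paulie's own code: the variable names $sSource and $aWood are made up for the example, and the pattern assumes the wood cell always has the form shown in the first post. The ID prefix sits inside (?: ...) so it must match but is not returned, and only the numbers land in the result array.

```autoit
; Sketch only: $sSource and $aWood are hypothetical names for this example.
$sSource = '<td id=l4 title=8>768/800</td>'

; (?: ...) matches the id prefix without capturing it;
; the second group captures just the "current/maximum" numbers.
$aWood = StringRegExp($sSource, '(?:id=l4 title=\d+>)(\d{3,9}/\d{3,9})', 1)

If @error Then
    ConsoleWrite("No match, @error = " & @error & @LF)
Else
    ; Flag 1 returns an array of captured groups, so index it.
    ConsoleWrite($aWood[0] & @LF)
EndIf
```

With this approach a second StringRegExp pass to strip the HTML prefix should not be needed.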
PsaltyDS, posted July 22, 2008

I don't have any problem running this:

    $wood1 = "id=l4 title=8>768/800"
    $wood2 = StringRegExp($wood1, "([0-9]{3,9}/[0-9]{3,9})", 1)
    If @error Then
        MsgBox(16, "Error", "StringRegExp() failed, @error = " & @error & ", @extended = " & @extended & @LF)
    Else
        MsgBox(0, "Wood2", $wood2[0])
    EndIf

It returns "768/800".
PMac (author), posted July 22, 2008 (edited July 22, 2008 by PMac)

Thanks. I tried _INetGetSource(), but the site requires cookies and throws up the source to the login page when I use it, and I've no idea at this point about where to begin with cookie handling to make it work. I've only been learning the language for the past couple of days, so I'm quite limited in what I can do, and learning how to navigate around with my browser seemed like a good place to start.

I'm not set on doing it any particular way, but I'll look up non-capturing groups and try to figure out how the code you posted works. Thanks for the pointers.
PMac (author), posted July 22, 2008

That works for me too, but when I change $wood1 to StringRegExp(<HTML source from clipboard>, <search pattern>), it causes an error when fed to $wood2, though I don't know why.
PsaltyDS, posted July 22, 2008

    $wood1 = '<td><img class="res" src="img/un/r/1.gif" title="Wood"></td>' & @CR & _
        '<td id=l4 title=8>768/800</td>' & @CR & _
        '<td class="s7"> <img class="res" src="img/un/r/2.gif" title="Clay"></td>' & @CR & _
        '<td id=l3 title=8>768/800</td>' & @CR & _
        '<td class="s7"> <img class="res" src="img/un/r/3.gif" title="Iron"></td>' & @CR & _
        '<td id=l2 title=8>768/800</td><td class="s7"> <img class="res" src="img/un/r/4.gif" title="Wheat"></td>' & @CR & _
        '<td id=l1 title=10>773/800</td>'
    $wood2 = StringRegExp($wood1, "([0-9]{3,9}/[0-9]{3,9})", 1)
    If @error Then
        MsgBox(16, "Error", "StringRegExp() failed, @error = " & @error & ", @extended = " & @extended & @LF)
    Else
        MsgBox(0, "Wood2", $wood2[0])
    EndIf

Still works fine...
PMac (author), posted July 22, 2008

However, this doesn't:
PsaltyDS, posted July 22, 2008

What version of AutoIt are you running?
PMac (author), posted July 23, 2008

Version 3.2.12.1. I just installed the beta (v3.2.13.4) and tried that, but it gives the same error.
PsaltyDS, posted July 23, 2008

Can't help you. I get no errors running the code I posted with the same version of AutoIt.
PMac (author), posted July 23, 2008

Strange. I was using MsgBox() because it was the closest I could find to Python's Print command to give me feedback on what I was doing as I tested my code, but I found ConsoleWrite() today, which is closer to what I wanted, and it works without any problems. The issue seems to be specific to MsgBox().

    ;This code works fine
    $pagesource = '<td><img class="res" src="img/un/r/1.gif" title="Wood"></td>' & @CR & _
        '<td id=l4 title=8>768/800</td>' & @CR & _
        '<td class="s7"> <img class="res" src="img/un/r/2.gif" title="Clay"></td>' & @CR & _
        '<td id=l3 title=8>768/800</td>' & @CR & _
        '<td class="s7"> <img class="res" src="img/un/r/3.gif" title="Iron"></td>' & @CR & _
        '<td id=l2 title=8>768/800</td><td class="s7"> <img class="res" src="img/un/r/4.gif" title="Wheat"></td>' & @CR & _
        '<td id=l1 title=10>773/800</td>'

    $NumberPattern = "\d{3,9}/\d{3,9}"
    $TitlePattern = '(?:title=")(.{4,6})"'

    $Result2 = StringRegExp($pagesource, $NumberPattern, 3)
    $Result1 = StringRegExp($pagesource, $TitlePattern, 3)

    ConsoleWrite($Result2[0] & @LF)
    ConsoleWrite($Result2[1] & @LF)
    ConsoleWrite($Result2[2] & @LF)
    ConsoleWrite($Result2[3] & @LF)
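A note on the original error, offered as a likely explanation that nobody in the thread confirmed: with flag 1, StringRegExp returns an array, so in the first post $wood1 is an array, not a string. Passing the whole array as the test string to the second StringRegExp call gives it nothing to match, which would leave $wood2 as a non-array and trigger "Subscript used with non-Array variable" on $wood2[0]. PsaltyDS's versions work because they assign a plain string to $wood1, and the ConsoleWrite version works because it runs StringRegExp directly on the $pagesource string, so MsgBox() is probably not the cause. A minimal sketch of the two-step approach, assuming that diagnosis is right:

```autoit
; Sketch under the assumption described above; the sample string stands in
; for the clipboard contents.
$pagesource = '<td id=l4 title=8>768/800</td>'

$wood1 = StringRegExp($pagesource, "(id=l4 title=8>[0-9]{3,9}/[0-9]{3,9})", 1)
If Not IsArray($wood1) Then
    MsgBox(16, "Error", "First StringRegExp() found no match, @error = " & @error)
    Exit
EndIf

; Pass the first element ($wood1[0]), not the array itself.
$wood2 = StringRegExp($wood1[0], "([0-9]{3,9}/[0-9]{3,9})", 1)
If IsArray($wood2) Then
    MsgBox(0, "Wood2", $wood2[0])
Else
    MsgBox(16, "Error", "Second StringRegExp() found no match")
EndIf
```

Checking IsArray() (or @error) before indexing avoids a hard script error whenever a pattern fails to match.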