xeroTechnologiesLLC Posted January 19, 2012 (edited)

Greetings,

I've been using VBScript, HTML and various other languages to build a script from SecureCRT that captures certain data off a webpage after a few irrelevant steps, and that works just fine. But it relies on VBScript to read the entire webpage, then uses a regular expression to capture the first id# and assign it to a variable.

I just started working with AutoIt and was curious whether there is a better way. Essentially: capture the first id# (or an id# with a predictable report name beside it: 12345 team42_1:08pm) and assign it to a variable to be used later in the program. I'm extremely new to AutoIt, so I'm not sure what the logic / syntax to do this would be.

The second part of the problem is reading a link's a href value, but I'm not quite at that stage yet and haven't looked through the forums for that. I only mention it here to give some idea of what I will need the report ID for later.

Thanks in advance for any time available to answer this inquiry.

-Nick

Edited April 7, 2012 by xeroTechnologiesLLC

somdcomputerguy Posted January 19, 2012

The _INetGetSource, StringInStr, StringSplit, and StringMid functions will probably be of use to you here. These are certainly not the only functions that could be used in a case like this, but I use them for a similar thing.

For instance, I use code like this to find a particular word on a webpage, then assign that word to a variable and pass it on to another function.

    Local $Text_Array[5] = ["Registering", "Creating", "Modifying", "Activating", "Viewing User Profile"]
    $Source = _INetGetSource($URL)
    For $a = 0 To UBound($Text_Array) - 1
        $KeyWord = StringInStr($Source, $Text_Array[$a], 1)
        If $KeyWord <> 0 Or $Debug Then ;If text does exist
            $FoundWord = StringSplit(StringMid($Source, $KeyWord), " ")
            Actions($FoundWord[1])
        EndIf
    Next

- Bruce

xeroTechnologiesLLC Posted January 19, 2012 Author

I get an invalid function error when using _INetGetSource. Is there an include that I need to use?

xeroTechnologiesLLC Posted January 19, 2012 Author

Never mind - include inet.au3 ftw.

So once I have the source, how can I get it into an array? I'm assuming that if I simply assign the source to a variable, the variable won't hold all of it due to character size. Or will it?

kylomas Posted January 19, 2012

xeroTechnologiesLLC,

Variable size is not an issue. The help file does not give a max size, so memory might be the only real restriction (not sure of that - perhaps an expert could advise).

The first part of your problem is easily handled using the solution that somdcomputerguy gave in post #2.

Parsing out href strings can be accomplished in a number of ways. If you can provide a URL for the website you are parsing, we can help further.

kylomas

xeroTechnologiesLLC Posted January 19, 2012 Author

So I figured out what my problem was. Apparently _INetGetSource doesn't read the active page / IE object. I'm writing this for a page that's behind a secure service and requires a login. I send the login at the beginning of the script and it logs in just fine, but when I try to capture the source, it doesn't work: I get the 'please login' error page.

Thoughts?

xeroTechnologiesLLC Posted January 19, 2012 Author

Unfortunately I am unable to provide the URL. As noted, it's an internal secure web tool : ( Good times, right?

kylomas Posted January 19, 2012

xeroTechnologiesLLC,

You have this working in VBS? VBScript is easily translated to AutoIt.

kylomas

xeroTechnologiesLLC Posted January 19, 2012 Author

Disregard - found a temporary solution:

    $pagesource = _IEDocReadHTML($oIE)

I wrote the output to a file to verify, and it is capturing the right data.

kylomas Posted January 19, 2012 (edited)

xeroTechnologiesLLC,

This will get you all the hrefs (assuming each one starts with "href='" and ends at the next "'"):

    $href_array = StringRegExp($Value, "href='([^#].*?)'", 3)

Where
$Value = the string from your web source
$href_array = an array of href elements

kylomas

edit: changed regex to not list "#" hrefs

Edited January 19, 2012 by kylomas

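The pattern in this post only matches hrefs wrapped in single quotes. As a minimal sketch, assuming the page may use either quote style, the variant below accepts both; the sample markup and the variable names `$sHtml` and `$aHrefs` are made up for illustration and are not taken from the real page.

```autoit
; Hypothetical sample markup standing in for the real page source.
Local $sHtml = "<a href='report.asp?id=12345'>12345</a>" & _
        '<a href="#">top</a>' & _
        '<a href="help.html">help</a>'

; (?i) ignores case, [\x22'] accepts a double or single quote,
; and [^\x22'#] skips hrefs that begin with "#".
Local $sPattern = "(?i)href\s*=\s*[\x22']([^\x22'#][^\x22']*)[\x22']"

; Flag 3 returns an array holding every captured href.
Local $aHrefs = StringRegExp($sHtml, $sPattern, 3)
If @error Then
    ConsoleWrite("No hrefs found" & @CRLF)
Else
    For $i = 0 To UBound($aHrefs) - 1
        ConsoleWrite($aHrefs[$i] & @CRLF)
    Next
EndIf
```

Since the page is already loaded in an IE object, the `_IELinkGetCollection` function from IE.au3 may be a sturdier alternative to a regex, because it returns the link objects themselves and each one exposes an `href` property.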
xeroTechnologiesLLC Posted January 19, 2012 Author

Apologies for continuing with a dumb question: how do I take that and capture the first ID# on the page? I don't necessarily need the link, I just need to capture the first 5-digit number that shows up on the page (the first one being a known, predictable inevitability).

I'm headed home for the night but will be back in tomorrow to continue working on this. I am not able to easily convert the VB version over to this either. I'm finding that a lot of the syntax is not the same, and that is what I am having trouble with. Thanks.

Below is what I had before that works:

    ' Below is to read the ID page so we can capture the ID# from the HTML
    ' It's assigned to an array because a string will not hold enough characters
    dim plusJunk
    set junk = ie.document
    plusJunk = junk.body.innerHTML
    dim data_array
    data_array = split(plusJunk, chr(13))

    ' Building the array in a way that we can read if we need to break down the report page for testing
    dim a, temp
    temp = ""
    for a = 0 to ubound(data_array)
        temp = temp & cstr(a) & " " & data_array(a) & chr(13)
    next

    ' This segment is creating a file to output data to. Used only in testing - commented (REM) out otherwise.
    Stuff = temp
    REM Set myFSO = CreateObject("Scripting.FileSystemObject")
    REM Set WriteStuff = myFSO.OpenTextFile("c:output.txt", 8, True)
    REM WriteStuff.WriteLine(Stuff)
    REM WriteStuff.Close
    REM SET WriteStuff = NOTHING
    REM SET myFSO = NOTHING

    ' Expression
    set arrayExp = new RegExp
    arrayexp.ignoreCase = true
    arrayexp.global = false
    arrayExp.pattern = ">[0-9][0-9][0-9][0-9][0-9]<" ' regular expression search pattern

    ' Begin cycling through the array for the regular expression pattern above to find the ID# used in pulling the report
    for a = lbound(data_array) to ubound(data_array)
        if arrayExp.test(data_array(a)) then
            exit for
        end if
    next

    ' Break down the ID# from the array line string found above
    for b = len(data_array(a)) to 1 step -1
        temp = mid(data_array(a), b, 1)
        if (asc(temp) >= 48) and (asc(temp) <= 57) then
            temp = mid(data_array(a), b - 4, 5)
            exit for
        end if
    next

xeroTechnologiesLLC Posted January 20, 2012 Author

    $idNumber=StringRegExp($pagesource, "[^0-9a-zA-z]([0-9][0-9][0-9][0-9][0-9])[^0-9a-zA-z]", 3)
    $keyword=stringinstr($pagesource, $idNumber, 0)
    MsgBox(0,"",$keyword)

...is what I have now to try and capture the ID number on the page.

$pagesource is the information read by using _IEDocReadHTML, and I've used a message box to verify that it is capturing everything I need it to, so it's good there.

But for some reason, the above code just returns blank information, not finding the 5-digit number that represents the ID number. In the source it reads like >12345<, so I've tried various expressions:

    [0-9][0-9][0-9][0-9][0-9]
    [0-9]{5}
    >[0-9][0-9][0-9][0-9][0-9]<
    >[0-9]{5}<

None of them seem to be capturing anything.

Thoughts?

xeroTechnologiesLLC Posted January 20, 2012 Author

    $pagesource = _IEDocReadHTML($oIE)
    $idNumber=StringRegExp($pagesource, ">[0-9][0-9][0-9][0-9][0-9]<", 0)
    MsgBox(0,"",$idNumber)
    $keyword=stringinstr($pagesource, $idNumber, 0, 1)
    MsgBox(0,"",$keyword)

...this is the code I'm working with as of 10:07am. The first message box returns a 1, the second a 40... not sure why.

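The 1 and the 40 are most likely explained by the last argument of StringRegExp, which chooses the kind of return value. With flag 0 the function only reports whether the pattern matched (1 or 0), so the StringInStr call that follows is presumably searching the page for the character "1" and finding it at position 40. The sketch below compares flags 0, 1 and 3 on a made-up string; `$sSource` and its contents are assumptions for illustration, not the real page.

```autoit
; Hypothetical sample markup standing in for the real page source.
Local $sSource = "<td>12345</td><td>team42_1:08pm</td><td>67890</td>"

; Flag 0: returns 1 (match) or 0 (no match), never the matched text.
Local $iFound = StringRegExp($sSource, ">[0-9]{5}<", 0)
ConsoleWrite("flag 0: " & $iFound & @CRLF)

; Flag 1: returns an array built from the first match only.
Local $aFirst = StringRegExp($sSource, ">([0-9]{5})<", 1)
If Not @error Then ConsoleWrite("flag 1: " & $aFirst[0] & @CRLF)

; Flag 3: returns an array holding every match in the string.
Local $aAll = StringRegExp($sSource, ">([0-9]{5})<", 3)
If Not @error Then
    For $i = 0 To UBound($aAll) - 1
        ConsoleWrite("flag 3 [" & $i & "]: " & $aAll[$i] & @CRLF)
    Next
EndIf
```

Flags 1 and 3 both return arrays, so the result has to be indexed (for example `$aFirst[0]`) before it is passed to a string function or a message box.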
xeroTechnologiesLLC Posted January 20, 2012 Author

    $idNumber=StringRegExp($pagesource, ">[0-9][0-9][0-9][0-9][0-9]<", 1)
    $idNumber[0]=StringReplace($idNumber[0], ">", "")
    $idNumber[0]=StringReplace($idNumber[0], "<", "")
    msgbox(0, "d", $idNumber[0])

Disregard - I was missing the whole "access the array item" part... noob 101.

[RESOLVED]
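The two StringReplace calls in the resolved code can probably be avoided by putting a capture group around the digits, so that only the number comes back. The following is a minimal sketch of that idea, not the poster's final script; the helper name `_GetFirstId` and the sample markup are hypothetical.

```autoit
; Hypothetical helper: returns the first 5-digit ID found between ">" and "<".
; On failure it returns an empty string and sets @error to 1.
Func _GetFirstId($sHtml)
    ; The parentheses capture only the digits, so ">" and "<" are left out.
    Local $aMatch = StringRegExp($sHtml, ">(\d{5})<", 1)
    If @error Then Return SetError(1, 0, "")
    Return $aMatch[0]
EndFunc

; Hypothetical sample markup standing in for the real page source.
Local $sSample = "<tr><td>12345</td><td>team42_1:08pm</td></tr>"
Local $sId = _GetFirstId($sSample)
If @error Then
    ConsoleWrite("No ID found" & @CRLF)
Else
    ConsoleWrite("ID: " & $sId & @CRLF)
EndIf
```

In the real script the sample string would be replaced by the page source, for example `_GetFirstId(_IEDocReadHTML($oIE))`, and checking @error guards against indexing an array that was never returned.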