SorryButImaNewbie

[SOLVED: basic COM help] Working on string returned from API, am I doing this okay?

5 posts in this topic

#1 ·  Posted (edited)

Hello, I am trying to pull some data from a webpage.

I need the value of the local currency against the euro. I can open the required API page for the required date interval. The dates are read in elsewhere; inside the script they are in the format 20161005, i.e. YYYYMMDD.

The returned string is simple when I view the source of the opened API page, but if I use _IEBodyReadHTML, _IEDocReadHTML or _IEBodyReadText, I get it back with a lot of HTML code about its color etc. (I guess; it looks like HTML, and one of them doesn't show any string in the MsgBox when I try to check it). I need the dates and the corresponding exchange rates. These can be found between <kozep>exchangerate</kozep>, but I only need the first one each time, because the second is the average exchange rate of the month (again, I'm guessing).

Now I have an approach which will work eventually, I guess, but I'm pretty sure it's not the standard approach or how the creators of AutoIt envisioned the usage of their functions :)

So I am posting my code here, hoping someone can tell me how to do this simply and intelligently.

Sorry for such a question, but I have only used regex for much, much simpler tasks. My approach is basically to identify everything I don't need, after getting rid of a few problematic key chars (like " ), and to repeat this until I'm left with only what I need. The problem with this is that if anything changes in the environment, the script has like a 99.9999% chance of not running properly, and I would like to handle this better, even if APIs usually don't change that much, to my knowledge.

Also, I am writing this as a separate function for now. I plan to call it from my other function, which does different things with the corresponding Excel files, among them calculating the local-currency values of the bills with data from the MNB (Hungarian National Bank or something).

Here is my code so far, and what it gives back. I will update this with a picture of the source code I see in Internet Explorer and of the webpage I see. Thank you for your help and insight!

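#include <IE.au3> ; needed for _IECreate and _IEDocReadHTML
#include <StringConstants.au3> ; needed for the $STR_STRIP* constants

Func InternetRead()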

    ;Create the URL for the napiarfolyam API
#cs
http://api.napiarfolyam.hu/?bank=mnb&valuta=eur&datum=20160901&datumend=20160926
The exchange rate used comes after </penznem>
Example return:
<item>
<bank>mnb</bank>
<datum>2016-09-06 11:25:18</datum>
<penznem>EUR</penznem>
<kozep>309.8500</kozep>
<kozep>310.1700</kozep>
</item>
#ce

;Global $MinTime ;20160601000000 these are example variables I read in, during the function that will call this one
;Global $MaxTime ;20160610000000

    Local $URLbase = "http://api.napiarfolyam.hu/?bank=mnb&valuta=eur" ;view-source:
    ;Local $MinTimeFormated = StringTrimRight($MinTime, 6)
    ;Local $MaxTimeFormated = StringTrimRight($MaxTime, 6)
    Local $URL = $URLbase & ("&datum=" & "20160601" & "&datumend=" & "20160603" & "")
;20160603 $MinTimeFormated, $MaxTimeFormated
    MsgBox(64, "Notification", "URL: " & $URL)

    Local $oIE = _IECreate($URL)
    Sleep(1000)
    Local $sHTML = _IEDocReadHTML($oIE)
    ;_IEBodyReadHTML - is a string, but MsgBox shows nothing
    ;_IEDocReadHTML - at least returns something (more than what I see in the source code, Ctrl+U)
    ;_IEBodyReadText - at least returns something (more than what I see in the source code, Ctrl+U)
    $sHTML = String($sHTML)
    If IsString($sHTML) Then
        MsgBox(64, "HTML String?", "The variable is a string")
    Else
        MsgBox(64, "HTML String?", "The variable is not a string")
    EndIf
    ;Variable is a String!
    ;StringSplit
;">datum</span>&gt;</a>" & "20160601"

    Local $Stuff = Chr(34) ;The " char
    ;Local $Stuff2 = "<a xmlns=http://www.w3.org/1999/xhtml class=collapse style=color: blue; marginleft: 2em; position: relative; href=#>&lt;<span style=color: rgb(153,0,0);>"
    Local $StringInput = $sHTML
    Local $sHTML = StringRegExpReplace($StringInput, "[-]", "")
    Local $StringInput = $sHTML
    Local $sHTML = StringStripWS($StringInput, $STR_STRIPLEADING + $STR_STRIPTRAILING + $STR_STRIPSPACES)
    Local $StringInput = $sHTML
    Local $sHTML = StringReplace($StringInput, $Stuff, "")
    ;Local $StringInput = $sHTML
    ;Local $sHTML = StringReplace($StringInput, $Stuff2, "")
    Local $StringInput = $sHTML
    Local $ValutaPosition = StringInStr($StringInput, "</valuta>")
    Local $sHTML = StringTrimLeft($StringInput, $ValutaPosition+8)
    Local $StringInput = $sHTML

    ;StringReplace($StringInput, "<a xmlns="http://"

    ;Local $StringInput = $sHTML
    ;StringInStr
    ;Local $sHTML = StringTrimLeft($StringInput, 1850)
    ;Local $aDays = StringSplit($sHTML, ">datum</span>&gt;</a>")
    ;_ArrayDisplay($aDays)
    ;If @error Then Exit MsgBox($MB_SYSTEMMODAL, "StringRegExpReplace Error", "Error listing:" & @CRLF & "@error = " & @error & ", @extended = " & @extended)
    MsgBox(64, "HTML String?", "$sHTML:" & $sHTML)

EndFunc ;==>InternetRead

 

APIHTMLString.JPG

Edit:

Sorry for the long post. I hope I was able to write down my problem in a way that others can understand; please ask if anything is unclear.

Edited by SorryButImaNewbie


#2 ·  Posted (edited)

Hi.

First of all, the return seems to be XML, not HTML. The current return you get is HTML because IE tries to show you the return visually with styling in HTML.

I would advise against using IE. It's slower than just fetching the response with something like InetGet, and, as mentioned, a direct request gives you the XML without IE's formatting.
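To illustrate the InetGet-style route, here is a minimal sketch using InetRead, which returns the response in memory instead of saving it to a file. It assumes the API answers in UTF-8, which I have not verified; the variable names are just for illustration.

```autoit
#include <InetConstants.au3>
#include <StringConstants.au3>

; Sketch only: fetch the raw API response without opening a browser.
Local $sURL = "http://api.napiarfolyam.hu/?bank=mnb&valuta=eur&datum=20160601&datumend=20160603"

; InetRead returns the response body as binary data and sets @error on failure.
Local $dData = InetRead($sURL, $INET_FORCERELOAD)
If @error Then
    ConsoleWrite("Download failed, @error = " & @error & @CRLF)
    Exit 1
EndIf

; Assumption: the response is UTF-8 encoded.
Local $sXML = BinaryToString($dData, $SB_UTF8)
ConsoleWrite($sXML & @CRLF)
```

The string in $sXML is the plain XML, so it can be parsed directly instead of being cleaned up piece by piece.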

I've made a small example, showing one way to do it :)

; Request the XML directly over HTTP (False = synchronous request)
$oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
$oHTTP.Open("GET", "http://api.napiarfolyam.hu/?bank=mnb&valuta=eur&datum=20160601&datumend=20160603", False)
$oHTTP.Send()
$sXML = $oHTTP.responseText
; Parse the response and select every <item> under <arfolyamok>/<deviza>
$oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadXML( $sXML )
$oXML_Nodes = $oXML.SelectNodes("./arfolyamok/deviza/item")
For $i=0 To $oXML_Nodes.Length-1
    $oXML_Node = $oXML_Nodes.Item($i)
    $oXML_Node_Bank = $oXML_Node.SelectNodes("./bank")
    $oXML_Node_Bank = $oXML_Node_Bank.Length>0?$oXML_Node_Bank.Item(0).text:""
    $oXML_Node_Datum = $oXML_Node.SelectNodes("./datum")
    $oXML_Node_Datum = $oXML_Node_Datum.Length>0?$oXML_Node_Datum.Item(0).text:""
    $oXML_Node_Penznem = $oXML_Node.SelectNodes("./penznem")
    $oXML_Node_Penznem = $oXML_Node_Penznem.Length>0?$oXML_Node_Penznem.Item(0).text:""
    $oXML_Node_Kozeps = $oXML_Node.SelectNodes("./kozep")
    $oXML_Node_Kozep01 = $oXML_Node_Kozeps.Length>0?$oXML_Node_Kozeps.Item(0).text:""
    $oXML_Node_Kozep02 = $oXML_Node_Kozeps.Length>1?$oXML_Node_Kozeps.Item(1).text:""
    ConsoleWrite( "Match [" & StringFormat("%02i", $i+1) & "]:"&@CRLF& _
        @TAB&"Bank: "&@TAB&$oXML_Node_Bank&@CRLF& _
        @TAB&"Datum: "&@TAB&$oXML_Node_Datum&@CRLF& _
        @TAB&"Penznem: "&@TAB&$oXML_Node_Penznem&@CRLF& _
        @TAB&"Kozep01: "&@TAB&$oXML_Node_Kozep01&@CRLF& _
        @TAB&"Kozep02: "&@TAB&$oXML_Node_Kozep02&@CRLF _
    )
Next
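The example above skips error handling to stay short. A rough sketch of the same request with some basic checks could look like this; the handler name _OnComError and the exact checks are my own choices for illustration, not something the API or AutoIt requires.

```autoit
; Sketch: the same request as above, with basic checks added.
; _OnComError is a hypothetical name; ObjEvent registers it as the COM error handler.
Global $oErrHandler = ObjEvent("AutoIt.Error", "_OnComError")

Local $sURL = "http://api.napiarfolyam.hu/?bank=mnb&valuta=eur&datum=20160601&datumend=20160603"

Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
If Not IsObj($oHTTP) Then
    ConsoleWrite("Could not create the WinHttpRequest object" & @CRLF)
    Exit 1
EndIf

$oHTTP.Open("GET", $sURL, False)
$oHTTP.Send()
If $oHTTP.Status <> 200 Then
    ConsoleWrite("Unexpected HTTP status: " & $oHTTP.Status & @CRLF)
    Exit 1
EndIf

Local $oXML = ObjCreate("Microsoft.XMLDOM")
If Not IsObj($oXML) Then
    ConsoleWrite("Could not create the XML DOM object" & @CRLF)
    Exit 1
EndIf

; loadXML returns False when the text is not well-formed XML.
If Not $oXML.loadXML($oHTTP.responseText) Then
    ConsoleWrite("XML parse error: " & $oXML.parseError.reason & @CRLF)
    Exit 1
EndIf

ConsoleWrite("Items found: " & $oXML.SelectNodes("./arfolyamok/deviza/item").Length & @CRLF)

; Called by AutoIt whenever a COM call fails.
Func _OnComError($oError)
    ConsoleWrite("COM error 0x" & Hex($oError.number) & ": " & $oError.description & @CRLF)
EndFunc   ;==>_OnComError
```

This way a network failure or a malformed response produces a message in the console instead of an empty result.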

Sources: DOM Reference, XPath Syntax and WinHttpRequest object

 

Let me know if you have questions, I'm happy to answer

Edited by genius257
Fixed code problems

#3 ·  Posted (edited)

Hmmm, this looks promising because it gets around IE, which is a big plus. I didn't know that I could do it this way. I also thought about my regex problem a bit, and I think I can gather all 8 numbers, plus usually the first char before the date, split them and order them into an array. The problem is that the exchange rate doesn't necessarily change every day, so I will have to keep that in mind.

Thank you genius257!

 

Edit: I'm really thankful, but I'm only going to check this out tomorrow :D

Edited by SorryButImaNewbie


#4 ·  Posted (edited)

Okay, so let me respond again.

I have never used COM objects before, and after seeing your sources and some of AutoIt's COM reference, I think it's time to delve into it a little more. Your code ran as intended, thank you for it (now that I have given it a little thought, my original idea of stripping away the useless bits was literally the longest and stupidest way to do it :D I'm happy I stopped doing that :) ). I can just modify it a bit and it will become part of the automation (with a comment referring to you, if that's okay), and I will start learning a bit about COM so I can use it in the future.

In this spirit I would like to ask: how do I refer to, for example, the Match [03] / Kozep01 value? Is there a variable I can call here?

 

Edited by SorryButImaNewbie


#5 ·  Posted

Happy to know it helped you :)

I'm always happy if I get referenced, but it's not something I require ;)

If you need help with the code, don't hesitate to PM me :D
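As for reading a single value such as Match [03] / Kozep01: in the example the variables are overwritten on every pass of the loop, so nothing is kept afterwards. One option, sketched here under my own naming ($aRates and its column layout are just an illustration), is to collect the values in a 2D array and index into it later.

```autoit
; Sketch: keep every item's values in a 2D array instead of only printing them.
Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
$oHTTP.Open("GET", "http://api.napiarfolyam.hu/?bank=mnb&valuta=eur&datum=20160601&datumend=20160603", False)
$oHTTP.Send()

Local $oXML = ObjCreate("Microsoft.XMLDOM")
$oXML.loadXML($oHTTP.responseText)
Local $oItems = $oXML.SelectNodes("./arfolyamok/deviza/item")

Local $iCount = $oItems.Length
If $iCount = 0 Then
    ConsoleWrite("No items returned" & @CRLF)
    Exit 1
EndIf

; One row per <item>; columns: 0 = datum, 1 = first kozep, 2 = second kozep
Local $aRates[$iCount][3]
For $i = 0 To $iCount - 1
    Local $oItem = $oItems.Item($i)
    Local $oDatums = $oItem.SelectNodes("./datum")
    Local $oKozeps = $oItem.SelectNodes("./kozep")
    $aRates[$i][0] = $oDatums.Length > 0 ? $oDatums.Item(0).text : ""
    ; Number() converts the text to a numeric value for later calculations.
    $aRates[$i][1] = $oKozeps.Length > 0 ? Number($oKozeps.Item(0).text) : ""
    $aRates[$i][2] = $oKozeps.Length > 1 ? Number($oKozeps.Item(1).text) : ""
Next

; "Match [03]" in the console output is row index 2, because the array is zero-based.
If $iCount >= 3 Then
    ConsoleWrite("Match [03] Kozep01: " & $aRates[2][1] & @CRLF)
EndIf
```

After the loop, $aRates[$n][1] holds the first kozep value of item $n + 1, so the array can be returned from a function and used in the Excel part of the script.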
