SueB

Looping through links on web page, clicking each


My ultimate goal is to take a page of fund reports and save each individually to post them for faculty members. I am practicing on my home page to understand the process and coding before working on the real page which has frames. With http://home.comcast.net/~seboggs/ open in IE the following code works, the message box showing me each link in turn.

CODE
#include <IE.au3>
#include <File.au3>
Opt("TrayIconDebug",1)

$oIE = _IEAttach ("Sue Boggs' Home Page")
$oLinks = _IELinkGetCollection ($oIE)

For $oLink In $oLinks
    $text=_IEPropertyGet($oLink, "innertext")
    MsgBox(0, "Link Info", "$text is " & $text)
Next

However, when I add the lines to click on the link and go back

CODE
For $oLink In $oLinks
    $text=_IEPropertyGet($oLink, "innertext")
    MsgBox(0, "Link Info", "$text is " & $text)
    _IELinkClickByText ($oLink, $text, 0, 0)
    Sleep (3000)
    ; _IEAction($oIE, "saveas")
    _IEAction($oIE, "back")
Next

it works fine for the first link on the page, clicking on it and going back to the main page (and also saving it, which I've commented out while debugging), but when it loops around the message box shows $text is 0 and I get the error

--> IE.au3 V2.3-1 Error from function _IEPropertyGet, $_IEStatus_InvalidObjectType
C:\Program Files\AutoIt3\Include\IE.au3 (828) : ==> The requested action with this object has failed.:
Local $found = 0, $link, $linktext, $links = $o_object.document.links
Local $found = 0, $link, $linktext, $links = $o_object.document^ ERROR

I suspect it has something to do with this advice from PsaltyDS that I found in the archives: "Your $oIE should still be valid after navigation, because it points to the browser instance, not the DOM (page) that is currently loaded." That is what caused me to change my "saveas" and "back" lines from $oLink to $oIE, which enabled the first click/return to work. Changing the loaded page seems to have lost what $oLink is. But that is as far as my knowledge takes me. Can someone give me a suggestion on how to proceed?

#2 ·  Posted (edited)

I recommend you collect all the links in the site and then use _InetGetSource() and write that to an .html file in a loop through all the links. I've done it and it works, but it won't save pictures or other files that the site might need. This works a dream if you're only interested in text. Although I don't think it'll work with a site that requires authentication.

Edited by Nahuel

I haven't read all of this, but if the general idea is to click each link, then I would add an array at the beginning and use the array for the text value when clicking.
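
A rough sketch of the array idea, in case it helps. It assumes the page from the first post is already open in IE under that window title, that every link has unique text (_IELinkClickByText with its default index clicks the first match only), and that each link opens a normal page in the same window so "back" returns to the list. The variable names $aText and $iCount are made up for illustration. The point is that plain strings survive navigation, while the $oLink objects belong to the page that was loaded when the collection was taken.

```autoit
#include <IE.au3>

; Assumption: the page is already open in IE under this title (from the first post).
Local $oIE = _IEAttach("Sue Boggs' Home Page")
If @error Then Exit MsgBox(16, "Error", "Could not attach to the browser window.")

; Pass 1: copy the link text into a plain array while the page is still loaded.
Local $oLinks = _IELinkGetCollection($oIE)
Local $iCount = @extended ; _IELinkGetCollection reports the link count in @extended
If $iCount = 0 Then Exit

Local $aText[$iCount]
Local $i = 0
For $oLink In $oLinks
    $aText[$i] = _IEPropertyGet($oLink, "innertext")
    $i += 1
Next

; Pass 2: click by text against $oIE, which stays valid across navigation.
For $i = 0 To $iCount - 1
    If $aText[$i] = "" Then ContinueLoop ; skip links with no text (e.g. image-only links)
    _IELinkClickByText($oIE, $aText[$i]) ; waits for the new page to load by default
    If @error Then ContinueLoop ; link not found, leave the page as it is
    ; _IEAction($oIE, "saveas")
    _IEAction($oIE, "back")
    _IELoadWait($oIE) ; wait for the original page before the next click
Next
```

On a framed page the same pattern should apply with the frame object in place of $oIE for the link calls, but the frame reference would probably need to be fetched again after each "back".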

Here's an example of what I meant. It's way faster:

#include <IE.au3>
#include <File.au3>
#include <inet.au3>
Opt("TrayIconDebug", 1)

$oIE = _IECreate("http://home.comcast.net/~seboggs/")
$oLinks = _IELinkGetCollection($oIE)

;~ For $oLink In $oLinks
;~  $text = _IEPropertyGet($oLink, "innertext")
;~  MsgBox(0, "Link Info", "$text is " & $text)
;~ Next

For $oLink In $oLinks
    $HtmlSource=_INetGetSource($oLink.href)
    
    $text = _IEPropertyGet($oLink, "innertext")
    FileWrite(@ScriptDir & "\" & $text & "(" & Random(0,1000,1) & ").html",$HtmlSource)
;~  MsgBox(0, "Link Info", "$text is " & $text)
;~  _IELinkClickByText($oLink, $text, 0, 0)
;~  Sleep(3000)
;~  ; _IEAction($oIE, "saveas")
;~  _IEAction($oIE, "back")
Next

Nahuel, it works great on my home page but as you suspected in your first post it doesn't work on the real page, which is passworded. It did save a file for each department but instead of the financial page it saved a web page with a prompt for username/password for each. But I certainly learned from your example so thank you. I'm having a lot of fun learning AutoIt :)

Hatcheda, I'll investigate using an array tomorrow.

Try wget for Windows; it will do that for you.

For completeness in the forum archive, this is what I finally got to work.

#include <IE.au3>
#include <File.au3>
#include <inet.au3>
Opt("TrayIconDebug",1)
Opt("WinTitleMatchMode", 2)

$oIE = _IEAttach ("Books/Media 2007/2008")
$oFrame = _IEFrameGetObjByName ($oIE, "data")
$oLinks = _IELinkGetCollection ($oFrame)
#cs ----------------------------------------------------------------------------------------- 
The following code, suggested by Nahuel, works if the page isn't passworded, like my home page. On the
fund reports it makes a file for each page but the contents are the prompt for the username/password.
But the way of getting the contents of a link, _INetGetSource($oLink.href), is worth remembering.
For $oLink In $oLinks
    $HtmlSource=_INetGetSource($oLink.href)
    $text = _IEPropertyGet($oLink, "innertext")
    FileWrite(@ScriptDir & "\" & $text & "(" & Random(0,1000,1) & ").html",$HtmlSource)
Next
#ce -----------------------------------------------------------------------------------------

$Links=FileOpen("departments.txt", 0) ;file listing text of each hyperlink (fund name with space after it)
; Check if file opened for reading OK
If $Links = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf

; Read in lines of text until the EOF is reached
While 1
    $oFrame = _IEFrameGetObjByName ($oIE, "data") ;this works to make it not lose the $oFrame
    $dept = FileReadLine($Links)
    If @error = -1 Then ExitLoop
    _IELinkClickByText ($oFrame, $dept, 0, 0)
    Sleep (3000)
    $hIE=_IEPropertyGet($oIE, "hwnd")
    WinActivate ($hIE)
    WinWaitActive($hIE)
    ConsoleWrite ("Debug: Frame data is active" & @LF)
    ControlSend($hIE, "", "", "!f") ; File  
    ControlSend($hIE, "", "", "a") ; SaveAs
    WinWait("Save Web Page", "", 10)
    $hSave = WinGetHandle("Save Web Page", "")
    WinActivate($hSave)
    ControlSend($hSave, "", "[CLASS:ComboBox; INSTANCE:3]", "{DOWN}{DOWN}{ENTER}") ;change file type to web archive, single file
    ControlClick($hSave, "","[CLASS:Button; Test:Save; INSTANCE:2]") ; click Save button
    Sleep (3000) ; need this or it gets to "go back" too fast and errors out
    $hSave2 = WinGetHandle("Save Web Page", "")
    ControlClick($hSave2, "","[CLASS:Button; Test:Yes; INSTANCE:1]") ;for "file already exists" alert; works even if no file exists
    Sleep (3000)
    _IEAction ($oIE, "back") ; not sure why $oIE worked when $hIE didn't. Obviously have a lot to learn
Wend
FileClose($Links)
Exit

Sue
