
How to export content into Text files?


Referring to the link below, I would like to know how to perform the following task using AutoIt:

1) Click Next

2) Export the HTML content to D:\Page1.txt

3) Click Next

4) Export the HTML content to D:\Page2.txt

5) Click Next

6) Keep exporting the HTML content in the same way until the last page is reached.

Does anyone have any suggestions?
Thanks in advance.

https://www.cmu.org.hk/cmupbb_ws/eng/page/wmp0100/wmp010001.aspx

 


I don't know how to locate the Next link and click it; I can handle the rest of the process without any problem.

Do you have any suggestions?
Thanks very much to everyone for any suggestions (^v^)
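
One way to find out how the Next link is exposed is to list every link on the page with its visible text. This is only a minimal sketch: it assumes the pager is built from ordinary <a> elements and that Internet Explorer can be automated through the standard IE.au3 UDF on your machine.

```autoit
#include <IE.au3>

; Open the page from the question; _IECreate waits for the load by default.
Local $oIE = _IECreate("https://www.cmu.org.hk/cmupbb_ws/eng/page/wmp0100/wmp010001.aspx")
_IELoadWait($oIE)

; Print the visible text and target of every link so the "Next" entry can be spotted.
Local $oLinks = _IELinkGetCollection($oIE)
For $oLink In $oLinks
    ConsoleWrite(StringStripWS($oLink.innerText, 3) & " -> " & $oLink.href & @CRLF)
Next
```

If a line whose text is exactly "Next" shows up in the console output, that link object can be clicked directly, for example with _IEAction($oLink, "click").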

 

Edited by oemript

#include <IE.au3>
While ProcessExists("iexplore.exe")     ;~ Close any Internet Explorer instances that are already running
    ProcessClose("iexplore.exe")
WEnd
Global $sHTMLSavePath = @ScriptDir & "\IEData"  ;~ Folder Path to Save HTML Documents
Global $oIE = _IECreate("https://www.cmu.org.hk/cmupbb_ws/eng/page/wmp0100/wmp010001.aspx")
Sleep(5000)     ;~ Wait while page loads
Global $i = 1   ;~ Page Number
_GetSiteData($oIE, "Next")

Func _GetSiteData($p_oIEObject, $p_sLinkText)
    Local $oAnchors, $sLinkText, $sBodyHTML, $hFileOpen, $bLinkText = False
    Local $oTableCells = _IETagNameGetCollection($p_oIEObject, "td")    ;~ Search for TD containing Navigation Bar Urls i.e. < Previous | Page 1 - 2 - 3 - 4 - 5 - 6 - 7 - 8 - 9 - 10 | Next >
    For $oTableCell In $oTableCells
        If $oTableCell.className = "blue01_12px" Then   ;~ TD ClassName matches known Navigational Bar ClassName
            $oAnchors = _IETagNameGetCollection($oTableCell, "a")
            For $oAnchor In $oAnchors   ;~ Check the InnerText matches 2nd Parameter of _GetSiteData() function
                $sLinkText = _IEPropertyGet($oAnchor, "innerText")
                If @error Then ContinueLoop ;~ InnerText not found, continue looping
                If $sLinkText = $p_sLinkText Then   ;~ InnerText matches 2nd Parameter of _GetSiteData() function
                    $bLinkText = True   ;~ Notification a Link was found
                    _IEAction($oAnchor, "click")    ;~ Click the matched anchor itself
                    _IELoadWait($p_oIEObject)   ;~ Wait for the browser to report the page as loaded
                    Sleep(3000) ;~ Wait a further 3 Seconds while page content settles
                    $sBodyHTML = _IEBodyReadHTML($p_oIEObject)  ;~ Read HTML Body
                    $hFileOpen = FileOpen($sHTMLSavePath & "\Page#" & $i & ".html", 10) ;~ 10 = overwrite (2) + create path (8): creates ...\IEData\Page#x.html
                    FileWrite($hFileOpen, $sBodyHTML)   ;~ Write HTML Body to ...\IEData\Page#x.html
                    FileClose($hFileOpen)   ;~ Close the file handle
                    $i += 1
                    ExitLoop 2
                EndIf
            Next
        EndIf
    Next
    If $bLinkText = False Then Exit ;~ Notification no further page numbers were found, exit script.
    _GetSiteData($p_oIEObject, $i)  ;~ Recurse, now looking for the link whose text is the next page number
EndFunc
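
For comparison, here is a loop-based sketch that follows the steps in the question more literally: click Next, save the page to D:\PageN.txt, and repeat until no Next link is found. It is not a drop-in replacement for the script above. _FindLinkByText and $iMaxPages are hypothetical names introduced for this example, and it assumes that the link text is exactly "Next", that the link disappears (or stops being a link) on the last page, and that D:\ exists and is writable.

```autoit
#include <IE.au3>
#include <FileConstants.au3>

; Hypothetical helper: returns the first link whose visible text equals $sText, or 0 if none.
Func _FindLinkByText($oBrowser, $sText)
    Local $oLinks = _IELinkGetCollection($oBrowser)
    If @error Then Return 0
    For $oLink In $oLinks
        If StringStripWS($oLink.innerText, 3) = $sText Then Return $oLink
    Next
    Return 0
EndFunc

Local $oIE = _IECreate("https://www.cmu.org.hk/cmupbb_ws/eng/page/wmp0100/wmp010001.aspx")
Local $iPage = 1, $oNext, $hFile
Local $iMaxPages = 500 ; Assumed safety cap in case "Next" never disappears

While $iPage <= $iMaxPages
    $oNext = _FindLinkByText($oIE, "Next")
    If Not IsObj($oNext) Then ExitLoop ; No "Next" link left: treat this as the last page

    _IEAction($oNext, "click")
    _IELoadWait($oIE) ; Wait for the new page to finish loading

    $hFile = FileOpen("D:\Page" & $iPage & ".txt", $FO_OVERWRITE)
    If $hFile = -1 Then ExitLoop ; Could not open the output file
    FileWrite($hFile, _IEBodyReadHTML($oIE))
    FileClose($hFile)

    $iPage += 1
WEnd
```

A plain loop avoids the recursion, and searching for "Next" every time means the script does not depend on the page-number links. If the site refreshes its content without a full page load, _IELoadWait may return too early and a short Sleep after it may still be needed.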

 


3 hours ago, Subz said: (script quoted in full from the post above)

Thanks very much, everyone, for the suggestions (^v^)

 


  • Moderators

As has been stated before, if you're not happy with a poster's level of effort - don't help them. Don't pollute the thread with unnecessary commentary as it inevitably leads to arguing back and forth.

