litlmike

IE.au3, How to Save WebPage, not SaveAs

13 posts in this topic

I would like to save a webpage as a single file (.mht), as is done through File -> Save As -> Save in IE; however, I would like to do it by working with an object. Using _IEAction($oIE, "saveas") brings up a Save As dialog box to save the current HTML document, which doesn't seem to allow saving as .mht. The script below accomplishes this with keystrokes and control clicks, but is it possible to do this by working directly with the object? Thanks

#include <IE.au3> ; provides _IECreate and _IENavigate

$oIE = _IECreate()
_IENavigate ($oIE, "http://www.hiddensoft.com/autoit3/")

Opt("WinWaitDelay",100)
Opt("WinTitleMatchMode",4)
Opt("WinDetectHiddenText",1)
Opt("MouseCoordMode",0)
Opt('SendKeyDelay', 250)            ; default is 5 milliseconds

Send("{ALTDOWN}{ALTUP}")
Send("f")
Send("a")

$sSaveAsWindow = "Save Webpage"
WinWait($sSaveAsWindow,"")
If Not WinActive($sSaveAsWindow,"") Then WinActivate($sSaveAsWindow,"")
WinWaitActive($sSaveAsWindow,"")

ControlSetText ($sSaveAsWindow, "", "Edit1", "Test")

ControlClick ($sSaveAsWindow, "", "Button2")


I've tried as well. It is an odd omission from the automation interface, but your solution is the best I know.

Dale


Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTE Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble


Darn, that is a bummer. You are like the Yellow Pages: "If it's not in there (Dale's Brain), it probably doesn't exist".

My solution will take forever, considering that I would like to do this on an array of links. Is there another way to accomplish the same "feat"? That is, to get a webpage saved as a file so that it behaves offline the same way it would online? I don't fully understand what makes an .mht unique. Meaning, InetGet for a .href just downloads source code, correct? That being the case, it seems that pages reliant on server-side scripting (ASP.NET, etc.) would first need to access the server to produce the correct output, so InetGet should not work, correct?

Hmm.. I am thinking out loud here... so then, the SaveAs just saves the HTML whereas the Save Webpage saves all the data in the browser?


A .mht is a webpage and all of its content housed in a single file (a MIME multipart archive; the content is encoded rather than compressed). A WebPage Complete saves a top-level .htm file and then creates a folder to hold all of the remaining files. Both rewrite URLs in the HTML to point to the now-local files.

The one main thing I would change in your script would be to use ControlSend instead of Send.

Dale
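
On the .mht format described above: one way to see what such a file contains is to print its first few lines as text. This is only a minimal sketch; the path C:\test.MHT is a hypothetical example, so point it at any .mht you have saved. A file saved by IE will typically begin with mail-style headers such as MIME-Version and a multipart/related Content-Type, though the exact lines depend on what produced the file.

```autoit
; Print the first lines of a saved .mht to look at its MIME headers.
; The path below is a hypothetical example, not taken from the thread.
Local $sFile = "C:\test.MHT"

Local $hFile = FileOpen($sFile, 0) ; 0 = open for reading
If $hFile = -1 Then
    ConsoleWrite("Could not open " & $sFile & @CRLF)
    Exit 1
EndIf

For $i = 1 To 15
    Local $sLine = FileReadLine($hFile) ; reads the next line on each call
    If @error Then ExitLoop             ; stop at end of file or on a read error
    ConsoleWrite($sLine & @CRLF)
Next

FileClose($hFile)
```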



Hmm, I tried using ControlSend originally, but it didn't work. I tried these:

Opt("WinWaitDelay", 100)
Opt("WinTitleMatchMode", 4)
Opt("WinDetectHiddenText", 1)
Opt("MouseCoordMode", 0)
Opt('SendKeyDelay', 250) ; default is 5 milliseconds

ControlSend ( $oIE, "", "ToolbarWindow32", "{ALTDOWN}{ALTUP}f")
If @error Then ConsoleWrite ("ControlSend @error: " & @error & @CRLF )
WinMenuSelectItem ( $oIE, "ToolbarWindow32", "&File", "Save &As...")
If @error Then ConsoleWrite ("WinMenuSelectItem: " & @error & @CRLF )
ControlCommand ($oIE, "", "ToolbarWindow32", "ShowDropDown", "")
If @error Then ConsoleWrite ("ControlCommand: " & @error & @CRLF )


#6 ·  Posted (edited)

EDIT: Better solution (all autoit) in next post.

You could use a free command line utility like SavePage.

Download SavePage

Information:

Syntax: SavePage <Title> <NavigateURL> <DestinationFolder>

Saves the specified URL as a .MHT web archive. You can use Internet Explorer to open the output file. File will be stored in the <DestinationFolder> folder under the name <Title>.MHT

Example: SavePage "Yahoo" "http://www.yahoo.com" "c:\"

Note: Remember to include the http:// prefix in your URL.

© Decision Point Solutions

Edited by danwilli


Kept searching and found this thanks to MHz, erifash, and Zedna:

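; Example call: save the page at the given URL as a single .mht file
_INetGetMHT( "http://www.yahoo.com", "C:\test.MHT" )

; Returns False if either COM object cannot be created, True otherwise
Func _INetGetMHT( $url, $file )
    Local $msg = ObjCreate("CDO.Message")
    If @error Then Return False
    Local $ado = ObjCreate("ADODB.Stream")
    If @error Then Return False

    With $ado
        .Type = 2                                   ; 2 = adTypeText
        .Charset = "US-ASCII"
        .Open
    EndWith
    $msg.CreateMHTMLBody($url, 0)                   ; fetch the page and its resources; 0 = cdoSuppressNone
    $msg.DataSource.SaveToObject($ado, "_Stream")   ; serialize the message into the stream
    FileDelete($file)                               ; remove any previous copy first
    $ado.SaveToFile($file, 1)                       ; 1 = adSaveCreateNotExist
    $ado.Close()                                    ; release the stream
    $msg = ""
    $ado = ""
    Return True
EndFunc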
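
Since the original question mentioned an array of links, the same CDO approach can be run in a loop. The following is a rough sketch rather than a tested solution: the URL list, the output folder and the helper name _SaveUrlAsMHT are all made up for illustration, and the helper simply repeats the steps of _INetGetMHT so the script runs on its own. The @error check after CreateMHTMLBody assumes a recent AutoIt version, where a failed COM call sets @error instead of stopping the script.

```autoit
; Sketch: save several links as .mht files with the CDO approach from this thread.
; The URLs, folder and helper name are illustrative assumptions.
Local $aUrls[2] = ["http://www.hiddensoft.com/autoit3/", "http://www.yahoo.com"]
Local $sFolder = @ScriptDir & "\mht" ; hypothetical output folder
DirCreate($sFolder)

For $i = 0 To UBound($aUrls) - 1
    Local $sFile = $sFolder & "\page" & $i & ".mht"
    If _SaveUrlAsMHT($aUrls[$i], $sFile) Then
        ConsoleWrite("Saved " & $aUrls[$i] & " -> " & $sFile & @CRLF)
    Else
        ConsoleWrite("Failed: " & $aUrls[$i] & @CRLF)
    EndIf
Next

; Hypothetical helper: same steps as _INetGetMHT, repeated so this sketch is self-contained.
Func _SaveUrlAsMHT($sUrl, $sFile)
    Local $oMsg = ObjCreate("CDO.Message")
    If Not IsObj($oMsg) Then Return False
    Local $oStream = ObjCreate("ADODB.Stream")
    If Not IsObj($oStream) Then Return False

    $oStream.Type = 2 ; 2 = adTypeText
    $oStream.Charset = "US-ASCII"
    $oStream.Open()

    $oMsg.CreateMHTMLBody($sUrl, 0) ; 0 = cdoSuppressNone
    If @error Then Return False     ; assumes COM failures set @error (recent AutoIt versions)

    $oMsg.DataSource.SaveToObject($oStream, "_Stream")
    FileDelete($sFile)              ; remove any previous copy
    $oStream.SaveToFile($sFile, 1)  ; 1 = adSaveCreateNotExist
    $oStream.Close()

    Return FileExists($sFile) = 1   ; True only if the file was actually written
EndFunc
```

Each page is fetched one after another, so a long list will still take a while, but no dialog boxes or keystrokes are involved.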


#9 ·  Posted (edited)


Thanks for this forum, and thanks for the code. This works great if you just want to get a webpage, but what if I want to navigate a website, submit a form, and then save the results to, say, a simple .txt file? Can anyone help me modify the above call to do that? Much appreciated, thanks.

Edited by Metaman

Oops, sorry. I thought I had responded to this already. This solution worked lovely.


Has a solution been found for saving generated output (not static pages) as MHT without explicitly selecting it in the Save As dialog box?


Thanks for the code. Will be using this a LOT. ^_^

No problem can withstand the assault of sustained thinking. Voltaire

_Array2HTMLTable() _IEClassNameGetCollection() _IEquerySelectorAll()


Actually, now that I dive deeper, would there be a way to save the web page you are currently on, rather than having AutoIt navigate and then save?
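
One possible direction for the question above, offered as a sketch rather than a confirmed answer: attach to an IE window that is already open, read its current address, and hand that address to the _INetGetMHT function from earlier in the thread. The choice of the first IE instance and the output path are assumptions made for illustration. Note that the CDO call requests the URL again on its own, so it will presumably not capture form results, login state, or anything else that exists only in the open browser session.

```autoit
#include <IE.au3>

; Attach to an already-open Internet Explorer window.
; "instance" mode ignores the first argument and returns the Nth browser found.
Local $oIE = _IEAttach("", "instance", 1)
If @error Then
    ConsoleWrite("No Internet Explorer window found" & @CRLF)
    Exit 1
EndIf

; Read the address of the page currently displayed.
Local $sUrl = _IEPropertyGet($oIE, "locationurl")
ConsoleWrite("Current page: " & $sUrl & @CRLF)

; With _INetGetMHT from earlier in this thread pasted into the same script,
; the address could then be saved like this (output path is hypothetical):
; _INetGetMHT($sUrl, "C:\current.MHT")
```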
