erifash

Convert a webpage to *.mht

14 posts in this topic




#2 ·  Posted (edited)

Is there some dllcall that IE uses to save a webpage in mht format? Thanks.

Hi erifash,

I don't think there is ... but I've been wrong before.

I've been looking for a program to convert a web page to .mht without opening Internet Explorer or using the Firefox/Mozilla plugin.

I came across MHTSaver.exe but could not get it to work fully. The homepage is in Russian, so no help there!

I've searched the forum and noticed there are other results besides yours, so I will check them out now!

This is the code I tried MHTSaver with (and variations):

Batch file (.bat)

@echo off
MHTSaver.exe "file:\\C:\MHTSaver\Test\Test.html" "Result.mht"
cls
exit

Script file (.au3)

Dim $cmdln, $dest, $fext, $file, $fnam, $lngcmd, $path, $pos, $slsh, $typ
Dim $auto, $def, $rgky, $var

; Expects the full path of an .htm/.html file as the first command-line argument
If $CmdLine[0] <> 0 Then
    $cmdln = $CmdLine[1]
    $typ = FileGetAttrib($cmdln)
    If StringInStr($typ, "D") < 1 Then ; skip directories
        $lngcmd = FileGetLongName($cmdln)
        If StringRight($lngcmd, 4) = ".htm" Or StringRight($lngcmd, 5) = ".html" Then
            SplashTextOn("", @CRLF & "Please Wait!", 220, 60, -1, -1, 1)
            $slsh = StringInStr($lngcmd, "\", 0, -1) ; position of the last backslash
            $path = StringLeft($lngcmd, $slsh)
            $file = StringMid($lngcmd, $slsh + 1)
            $pos = StringInStr($file, ".", 0, -1)
            $fnam = StringLeft($file, $pos - 1)
           ;$dest = '" "' & $path & $fnam & '.mht"'
            $dest = '" "' & $fnam & '.mht"'
            ; Same command line as the batch file, with the page's folder as working directory
            Run(@ScriptDir & '\MHTSaver.exe "file:\\' & $lngcmd & $dest, StringTrimRight($path, 1))
            WinWaitActive("Save As", "", 5)
            Send("{DEL}")
            Send($path & $fnam & ".mht")
            ; $auto is never assigned in this script, so this branch only runs if it is set to 1 elsewhere
            If $auto = 1 Then
                Send("!s")
                ProcessWaitClose("MHTSAVER.EXE", 50)
                SplashOff()
                If FileExists($path & $fnam & ".mht") Then
                   ;MsgBox(0, "Result", "The selected web page is now an .mht file!")
                    FileDelete($lngcmd)
                    DirRemove($path & $fnam & "_files", 1)
                    Sleep(500)
                    Send("{F5}")
                EndIf
            Else
                ProcessWaitClose("MHTSAVER.EXE", 50)
                SplashOff()
            EndIf
        EndIf
    EndIf
EndIf
Exit

Once the original "_files" folder is deleted, any extras like graphics no longer appear in the newly created .mht file ... so it appears to only half work?

Obviously I tried the script with a shortcut in the SendTo folder, and right-clicked on a .html file!

:lmao:;)


TheSaints' Robust Chat

Make sure brain is in gear before opening mouth!
Remember, what is not said, can be just as important as what is said.

Spoiler

If I put effort into communication, I expect you to read properly & fully, or just not comment.
Ignoring those who try to divert conversation with irrelevancies.
If I'm intent on insulting you or being rude, I will be obvious, not ambiguous about it.
I'm only big and bad, to those who have an over-active imagination.

I may have the Artistic Liesense ;) to disagree with you. TheSaint's Toolbox


Try this

Global $iMsg, $Flds, $iConf, $prueba, $cdoSuppressAll

$iMsg = ObjCreate("CDO.Message") ; Create Message object
$iConf = ObjCreate("CDO.Configuration") ; Create Message Configuration Object

$Flds = $iConf.Fields
$iMsg.CreateMHTMLBody("http://www.dbforums.com/archive/index.php/t-783832.html", 0) ;, "Username", "Password" 'If needed (user & pass)

Global $Stm
$Stm = ObjCreate("ADODB.Stream")
$Stm.Type = 2 ; 2 = adTypeText (1 would be adTypeBinary)
$Stm.Charset = "US-ASCII"
$Stm.Open
Global $iDsrc, $Filepath
$Filepath = "c:\archivename.mht" ; Path to save the file & filename
$iDsrc = $iMsg.DataSource ; the page content has now been loaded into the message
$iDsrc.SaveToObject($Stm, "_Stream")
FileDelete($Filepath) ; remove any existing file first, because option 1 below will not overwrite
$Stm.SaveToFile($Filepath, 1) ; 1 = adSaveCreateNotExist (2 = adSaveCreateOverWrite would overwrite)


Hi again erifash

Is there some dllcall that IE uses to save a webpage in mht format?

I found this on a Microsoft page titled "Cannot Save Web Page as a Web Archive File", so I was wrong.

The ability to save a Web page as a Web archive file is provided by the Inetcomm.dll file, which is installed by Outlook Express 5.

Sorry I couldn't give you the link, but it was on my new laptop and I forgot to add it to the text file I copied across ... just search for that title on the Microsoft website.

I also came across another freeware program, Html2Mhtml, on SourceForge.net. It seems to work sometimes, but I have not tested it extensively yet. It didn't save a Microsoft page properly (no graphics, etc.) ... but then IE wouldn't save that page properly either. In fact a lot of Microsoft pages don't save properly, and the save dialog just hangs at 0% for many minutes ... a very annoying issue with IE7, it seems. I'm afraid that, once a convert to Firefox/Mozilla, I may just stay one! ;)

Please let us know how you get on! :lmao:



Thanks for the reply but MHz's code above works perfectly for me. I actually made a little function with it:

Func _INetGetMHT( $url, $file )
    Local $msg = ObjCreate("CDO.Message"), $ado = ObjCreate("ADODB.Stream")
    If @error Then Return False
    With $ado
        .Type = 2
        .Charset = "US-ASCII"
        .Open
    EndWith
    $msg.CreateMHTMLBody($url, 0)
    $msg.DataSource.SaveToObject($ado, "_Stream")
    FileDelete($file)
    $ado.SaveToFile($file, 1)
    $msg = ""
    $ado = ""
    Return True
EndFunc

Thanks for the help! ;)


Thanks for the reply but MHz's code above works perfectly for me. I actually made a little function with it:

Nice function.

Here is a little correction:

Func _INetGetMHT( $url, $file )
    Local $msg = ObjCreate("CDO.Message")
    If @error Then Return False
    Local $ado = ObjCreate("ADODB.Stream")
    If @error Then Return False

    With $ado
        .Type = 2
        .Charset = "US-ASCII"
        .Open
    EndWith
    $msg.CreateMHTMLBody($url, 0)
    $msg.DataSource.SaveToObject($ado, "_Stream")
    FileDelete($file)
    $ado.SaveToFile($file, 1)
    $msg = ""
    $ado = ""
    Return True
EndFunc


Thanks for the help! :lmao:

No worries!

The code looks nice, neat and compact ... I like it, but haven't tried it yet ... been working on a big toolbar project! You people have done well! ;)



With this code

Func testINetGetMHT()
    _INetGetMHT("www.autoitscript.com", "c:\temp\autoitscript.mht")
        ;_INetGetMHT("http://www.autoitscript.com", "c:\temp\autoitscript.mht")
EndFunc

If StringInStr(@scriptname, "web.au3") Then 
    testINetGetMHT()
EndIf

I get this error:

C:\code\autoit\au3\Include\web.au3 (18) : ==> The requested action with this object has failed.: 
$msg.CreateMHTMLBody($url, 0) 
$msg.CreateMHTMLBody($url, 0)^ ERROR
+>AutoIT3.exe ended.rc:0
>Exit code: 0   Time: 1.714

Does anyone know what could be wrong? Should I check for any libraries that could be missing? The CDO library seems to be available, as it is registered in the registry.

"www.autoitscript.com" is not the address of an HTML file online. By default that address runs a PHP file, which may not work.


First check $msg and $ado with If Not IsObj($msg) Then instead of relying on @error. I suspect you will find $msg is not... you can then dig in the right place based on the result.
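
A minimal sketch of what that check could look like, for anyone following along. The function name _INetGetMHTChecked and the @error codes are made up for illustration, not part of any UDF. It also assumes the standard ADODB constants (2 = adTypeText for the stream type, 2 = adSaveCreateOverWrite for SaveToFile), which would make the separate FileDelete unnecessary.

```autoit
; Sketch only: _INetGetMHT with explicit IsObj checks.
; The name _INetGetMHTChecked and the @error values are illustrative assumptions.
Func _INetGetMHTChecked($sUrl, $sFile)
    Local $oMsg = ObjCreate("CDO.Message")
    If Not IsObj($oMsg) Then Return SetError(1, 0, False) ; CDO.Message was not created

    Local $oStream = ObjCreate("ADODB.Stream")
    If Not IsObj($oStream) Then Return SetError(2, 0, False) ; ADODB.Stream was not created

    $oStream.Type = 2 ; adTypeText
    $oStream.Charset = "US-ASCII"
    $oStream.Open()

    $oMsg.CreateMHTMLBody($sUrl, 0) ; 0 = cdoSuppressNone (fetch images, frames, etc.)
    $oMsg.DataSource.SaveToObject($oStream, "_Stream")
    $oStream.SaveToFile($sFile, 2) ; adSaveCreateOverWrite
    $oStream.Close()
    Return True
EndFunc
```

If it returns False, @error says which of the two objects is missing. If both objects are created and the script still stops at CreateMHTMLBody, the problem lies in the call itself rather than in a missing library.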

Dale


Free Internet Tools: DebugBar, AutoIt IE Builder, HTTP UDF, MODIV2, IE Developer Toolbar, IEDocMon, Fiddler, HTML Validator, WGet, curl

MSDN docs: InternetExplorer Object, Document Object, Overviews and Tutorials, DHTML Objects, DHTML Events, WinHttpRequest, XmlHttpRequest, Cross-Frame Scripting, Office object model

Automate input type=file (Related)

Alternative to _IECreateEmbedded? better: _IECreatePseudoEmbedded  Better Better?

IE.au3 issues with Vista - Workarounds

SciTe Debug mode - it's magic: #AutoIt3Wrapper_run_debug_mode=Y

"Doesn't work" needs to be ripped out of the troubleshooting lexicon. It means that what you tried did not produce the results you expected. It begs the questions 1) what did you try?, 2) what did you expect? and 3) what happened instead?

Reproducer: a small (the smallest?) piece of stand-alone code that demonstrates your trouble


Ok, thanks..;)

But isn't that a bit weird, given that it accepts a URL and has to download the content at that URL? For example, INetGet does not know whether www.autoitscript.com is a static or a dynamic page.

As you can see from my sample, I have tried with and without prepending the protocol.

I have also tried with the Google front page (I suppose it is static?):

_INetGetMHT("http://www.google.es/index.html", "c:\temp\google.mht")

It's not a big deal, I'm just curious. And I would rather use this code to download @katrijns joke threads than my own ugly little hack.:lmao:


#12 ·  Posted (edited)

Hi @DaleHohm, Thanks for your advice.

I added it to the UDF. It passes the IsObj test but I still get the "The requested action with this object has failed.:" error message. I called it with the google test.

_INetGetMHT("http://www.google.es/index.html", "c:\temp\google.mht")

Func _INetGetMHT($url, $file)
    Local $msg = ObjCreate("CDO.Message")
    If Not IsObj($msg) Then Return False
    Local $ado = ObjCreate("ADODB.Stream")
    If Not IsObj($ado) Then Return False
    With $ado
        .Type = 2
        .Charset = "US-ASCII"
        .Open
    EndWith
    $msg.CreateMHTMLBody($url, 0)
    $msg.DataSource.SaveToObject($ado, "_Stream")
    FileDelete($file)
    $ado.SaveToFile($file, 1)
    $msg = ""
    $ado = ""
    Return True
EndFunc

@MHz and @erifash

EDIT: IsObj test is passing.

EDIT2: Crap, Dale. I always write Home rather than Hohm. Sorry about that.

Share this post


Link to post
Share on other sites

To get more detail on the COM error, add this to your script:

#include <IE.au3>
_IEErrorHandlerRegister()
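
Along the same lines, a handler can also be registered directly with ObjEvent, without IE.au3. This is only a rough sketch: the names $oComErrorHandler and _ComErrorReport are made up for illustration, and it assumes an AutoIt version in which the handler function receives the error object as its parameter.

```autoit
; Sketch only: report COM errors to the console instead of letting the script abort.
; $oComErrorHandler and _ComErrorReport are hypothetical names.
Global $oComErrorHandler = ObjEvent("AutoIt.Error", "_ComErrorReport")

Func _ComErrorReport($oError)
    ConsoleWrite("COM error 0x" & Hex($oError.number, 8) & @CRLF & _
            "  Description: " & $oError.windescription & @CRLF & _
            "  Source:      " & $oError.source & @CRLF & _
            "  Script line: " & $oError.scriptline & @CRLF)
EndFunc
```

With something like this in place before the CreateMHTMLBody call, the error number and description are printed and the script carries on, which should make it easier to see what the library is actually complaining about.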

Dale

regarding "Home" -- I'm used to it ;-)



Thanks again Dale,

Your last tip provided some better error messages. It looks like there are some issues with the library that make it fail depending on the content received from the server. I guess this is what MHz was getting at, even if I did not understand why.

So, for my part, I won't spend any more time on it. Guess I will just wait it out..;)

Thanks for all your input.

Best Regards

Uten
