andybiochem

Convert HTML to MHTML

andybiochem

Hi,

I know this is a bit retro, but here is a simple way to embed images into an HTML file to produce an MHTML 'archive' file.

For me, this is useful because I am often asked to create reports that include images (graphs etc.). It's much easier to email someone a single file than the report HTML and its images separately.

This script prompts for an input HTML file, scans it for images, converts them to Base64, and writes everything out to a .MHT file. The MHT file contains both the HTML and the embedded images.

_HTML_To_MHTML(FileOpenDialog("", ".", "(*.html;*.htm)"))

Func _HTML_To_MHTML($sFileName)

    Local $i, $aFile

    ;----- Prepare MHT Header -----
    Local $sMHTHold = 'From:' & @CRLF & _
            'Subject:' & @CRLF & _
            'Date:' & @CRLF & _
            'MIME-Version: 1.0' & @CRLF & _
            'Content-Type: multipart/related; type="text/html"; boundary="----=_NextPart"' & @CRLF & _
            'X-MimeOLE:' & @CRLF & _
            @CRLF & _
            '------=_NextPart' & @CRLF & _
            'Content-Type: text/html; charset="Windows-1252"' & @CRLF & @CRLF
    ;
    ;----- Open HTML & Convert Images -----
    Local $sHTML = FileRead($sFileName)
    $sHTML = StringReplace($sHTML,"%20"," ")
    Local $sHTMLtemp = StringStripCR($sHTML)
    Local $aIMGImages = StringRegExp($sHTMLtemp, "<(?:img|IMG) [^>]*>", 3)
    Local $sMHTPost = "" ;MHT Footer
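    For $i = 0 To UBound($aIMGImages) - 1
        $aFile = StringRegExp($aIMGImages[$i], 'src="([^"]*)"', 1)
        If @error Then ContinueLoop ; skip <img> tags without a double-quoted src
        ; Skip images already embedded (same file used more than once in the page)
        If StringInStr($sMHTPost, 'Content-ID: <' & $aFile[0] & '>') Then ContinueLoop
        ; Read the image in binary mode (16) so its bytes are not altered by text decoding
        Local $hImage = FileOpen($aFile[0], 16)
        If $hImage = -1 Then ContinueLoop ; image file could not be opened
        Local $bImage = FileRead($hImage)
        FileClose($hImage)
        ; Rewrite only the src attribute, so other text matching the file name is left alone
        $sHTML = StringReplace($sHTML, 'src="' & $aFile[0] & '"', 'src="cid:' & $aFile[0] & '"')
        $sMHTPost &= '------=_NextPart' & @CRLF & _
                'Content-Type: image' & @CRLF & _
                'Content-ID: <' & $aFile[0] & '>' & @CRLF & _
                'Content-Transfer-Encoding: base64' & @CRLF & _
                @CRLF & _
                _Base64Encode($bImage) & @CRLF & _
                @CRLF
    Next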

    ;----- Combine Header, HTML, & Footer -----
    ; A multipart MIME body ends with the boundary followed by "--"
    $sMHTHold &= $sHTML & @CRLF & $sMHTPost & '------=_NextPart--' & @CRLF

    ;----- Prompt to save -----
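    ; StringInStr: 3rd parameter is case sensitivity, 4th is occurrence (-1 = search from the right)
    Local $sSaveFile = FileSaveDialog("Save File", ".", "(*.mht)", Default, StringMid($sFileName, 1, StringInStr($sFileName, ".", 0, -1) - 1))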
    If @error < 1 Then FileWrite($sSaveFile & ".mht", $sMHTHold)

EndFunc   ;==>_HTML_To_MHTML

Func _Base64Encode($sData)
    ;by TurboV21
    Local $oXml = ObjCreate("Msxml2.DOMDocument")
    Local $oElement = $oXml.createElement("b64")
    $oElement.dataType = "bin.base64"
    $oElement.nodeTypedValue = Binary($sData)
    Local $sReturn = $oElement.Text
    Return $sReturn
EndFunc   ;==>_Base64Encode
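
For reference, the file written by the script above has roughly the following layout. This is an illustration only: graph1.png is a made-up file name, and the header lines and Base64 data are abbreviated.

```text
(header lines: From, Subject, Date, MIME-Version, Content-Type with boundary, X-MimeOLE)

------=_NextPart
Content-Type: text/html; charset="Windows-1252"

<html> ... <img src="cid:graph1.png"> ... </html>
------=_NextPart
Content-Type: image
Content-ID: <graph1.png>
Content-Transfer-Encoding: base64

(Base64 text of graph1.png)

```

Each further image adds another part of the same form. The "cid:" value in the HTML has to match the Content-ID of the image part for the image to show up.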

Some caveats:

- The HTML and images must exist on your PC; it won't pull images straight from the net

- The saved file must be created in the SAME directory as the original HTML file

- If your HTML file contains the string "%20", it will be converted to a space

- ONLY images are embedded; any CSS, PHP, etc. will still be dependent on external files

- It won't pull images from JavaScript, only from HTML <img> tags

Please don't rely on this to successfully convert anything other than very simple HTML files; it is designed to put simple images into basic HTML files.

Credits:

TurboV21 - for the Base64 encoder, which I butchered.

Edited by andybiochem

- Table UDF - create simple data tables
- Line Graph UDF GDI+ - quickly create simple line graphs with x and y axes (uses GDI+ with double buffer)
- Line Graph UDF - quickly create simple line graphs with x and y axes (uses AI native graphic control)
- Barcode Generator Code 128 B C - Create the 1/0 code for barcodes.
- WebCam as BarCode Reader - use your webcam to read barcodes
- Stereograms!!! - make your own stereograms in AutoIT
- Ziggurat Gaussian Distribution RNG - generate random numbers based on normal/gaussian distribution
- Box-Muller Gaussian Distribution RNG - generate random numbers based on normal/gaussian distribution
- Elastic Radio Buttons - faux-gravity effects in AutoIT (from javascript)
- Morse Code Generator - Generate morse code by tapping your spacebar!

slayerz

andybiochem, this is great! I really like the idea of a single HTML file because I can easily email it rather than compressing everything into an archive.

For the time being, I'm using firefox screengrab to save the webpage as a single file.

Thanks again for your script!


AUTOIT I'm lovin' it!

Quinch

This might be something of a newbie question (I'm still trying to get a firm grip on objects), but if I were to rearrange this to work in reverse, basically separating the HTML and images out of an MHT file, would there be any pitfalls I should be wary of?
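
A rough, untested sketch of the reverse direction, not a full answer. It assumes the MHT file was written by _HTML_To_MHTML above, and that the file has already been split into parts with the boundary lines removed, so $sPart holds exactly one image part. _Base64Decode and _MHTML_SaveImagePart are made-up names; the decoder simply mirrors _Base64Encode using the same MSXML "bin.base64" data type.

```autoit
; Hypothetical helper: decode Base64 text back to binary data (mirror of _Base64Encode)
Func _Base64Decode($sBase64)
    Local $oXml = ObjCreate("Msxml2.DOMDocument")
    If Not IsObj($oXml) Then Return SetError(1, 0, Binary(""))
    Local $oElement = $oXml.createElement("b64")
    $oElement.dataType = "bin.base64"
    $oElement.Text = $sBase64
    Return $oElement.nodeTypedValue
EndFunc   ;==>_Base64Decode

; Hypothetical helper: write one image part back to disk.
; Assumes $sPart is a single part as produced by _HTML_To_MHTML, boundary lines removed.
Func _MHTML_SaveImagePart($sPart)
    ; The Content-ID holds the original file name
    Local $aID = StringRegExp($sPart, "Content-ID: <([^>]*)>", 1)
    If @error Then Return SetError(1, 0, False)
    ; The Base64 body starts after the first blank line of the part
    Local $iBody = StringInStr($sPart, @CRLF & @CRLF)
    If $iBody = 0 Then Return SetError(2, 0, False)
    Local $bData = _Base64Decode(StringMid($sPart, $iBody + 4))
    ; 2 = overwrite, 16 = binary mode
    Local $hFile = FileOpen($aID[0], 2 + 16)
    If $hFile = -1 Then Return SetError(3, 0, False)
    FileWrite($hFile, $bData)
    FileClose($hFile)
    Return True
EndFunc   ;==>_MHTML_SaveImagePart
```

Pitfalls to expect: the "cid:" prefixes have to be stripped from the src attributes in the extracted HTML again, the original "%20" sequences cannot be recovered, and MHT files saved by other programs may use a different boundary string or a different encoding (for example quoted-printable) for the HTML part, which this sketch does not handle.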
