Base64 Converter


mikeytown2
I was working on _INetSmtpMail so that it can use an SMTP server that requires a login. I got that working here:

http://www.autoitscript.com/forum/index.ph...52entry146052

But AutoIt didn't have a Base64 converter... I found a VBScript Base64 converter here:

http://minasi.com/64.htm

I grabbed the script and put it through the VBScript to AutoIt Converter:

http://www.autoitscript.com/forum/index.php?showtopic=12143

After some time I got the thing working; here it is...

-Mikeytown2

-EDIT-

Grab _Base64.au3 from the next post... it's better.

Edited by mikeytown2

Update: Get Base64 in machine code running in AutoIt. It's about 100 times faster!

Ok so I converted everything I got into UDF friendly code...

Here it is with examples...

You might want to change the SMTP server in the INetSmtpMail_Example to a useful one, because odds are you don't have a login at USC. And no, I won't let you use mine. You also need to populate

$s_FromName = ""

$s_FromAddress = ""

$s_ToAddress = ""

in the example file if you want this to work.

_INetSmtpMail requires _Base64 AND the latest AutoIt BETA!

_Base64 has

Function Name: _Base64Encode(), _Base64Decode(), _BinToDec(), _DecToBin()

Description: Convert: String to Base64, Base64 to String, Binary to Decimal, Decimal to Binary

Tell me if this is good or not; I'm going to submit the UDF to the beta next week.

EDIT

250% Speed Increase in _Base64! Thank you blindwig for telling me how to speed it up. I also want to thank sylvanie for posting his awesome _BinToDec() & _DecToBin() functions, they are here

http://www.autoitscript.com/forum/index.ph...ost&p=72664

In order to help with the speed increase, the only func that throws an error is _Base64Decode()

EDIT

7x speed increase in _Base64 since last time. Something that used to take around 40 seconds in my very first encoder now takes 2 seconds! The decoder is quicker, but it's still slow.

I also updated the example so you can encode files with it as well. Please let me know if it works. blindwig said that AutoIt can't handle null characters (binary zero), so try some files out and let us know if they don't work. I tried a couple and they did work, so we need to do some testing.

EDIT

Speed increase in _Base64 Encode again! The Decoder is a lot faster now (3 times faster), and it can handle junk data. Updated _Base64_Example files as well. There are optional parameters now. Thank You blindwig, couldn't have done it without your help!

EDIT

_Base64: Small speed boost to the Encoder. Decoder now assumes all data is mixed, with almost no speed penalty (thanks blindwig). Decoder also will not give an array size error when given bad data.

EDIT

_Base64: Global variable eliminated with no change in speed. One function gone as well.

EDIT

_Base64: Encode/Decode now has a progress bar option. Speed increase to decode function as well. Thanks blindwig!

_Base64_Example: Uses progress bar for one example now

EDIT

_Base64_Example: Does a file compare automatically for the encode/decode part. It also overwrites old output files, so you can run it again and again to test this UDF without having to delete the output files.

EDIT

_Base64: Encode/Decode speed increase! The larger the file, the more noticeable the difference. Example: 80 seconds to encode a 1.2MB file

EDIT

_Base64: Decode speed increase! Syntax has changed. Encoder is faster as well when using the line return option. Thanks for both speed increases blindwig! bin and dec functions removed. They can be grabbed here if anyone needs them.

_Base64_Example: Updated to reflect the new decoder syntax.

EDIT

_Base64: Decoder speed increase! Syntax has changed. It's now back to the old way... yeah sorry about that. Encoder is faster as well!

_Base64_Example: Updated to reflect the new decoder syntax.

EDIT

_Base64: 3 Global variables are used now. The reason for this is speed. If you use encode or decode multiple times in a script, it should be faster now.

EDIT

_Base64: 2 Global variables are used now. Speed increases as well.

EDIT

_INetSmtpMail: Syntax Changed to look more like the current UDF one in INet.au3

_INetSmtpMail_Example: Changed to reflect the new syntax. I also added an option to send the email directly to the MX server, but it doesn't seem to work too well. Make sure to read the message boxes carefully.

EDIT

_Base64: Encoder and Decoder are faster thanks to blindwig

_INetSmtpMail: Changed Syntax, Added support to send emails directly from your computer with no smtp server needed.

_INetSmtpMail_Example: Updated to deal with the new functionality & syntax of the UDF.

EDIT

_Base64: Fix because of http://www.autoitscript.com/forum/index.php?showtopic=43881 It only seemed to affect the decoder...?

_Base64_Example: Fixed because of above link, FileOpen() Function has changed.

EDIT

_Base64: Code Cleanup, I left in 2 ConsoleWrite functions and #include <Array.au3>. They are now removed. Updated documentation inside the file as well.

-Mikeytown2

OLD

_Base64.au3 _Base64_Example.au3

NEW

_Base64.au3 _Base64_Example.au3 _INetSmtpMail.au3 _INetSmtpMail_Example.au3

Edited by mikeytown2

I was looking around and I found this post:

http://www.autoitscript.com/forum/index.ph...topic=15640&hl=

Anyway, the _Base64 UDF will convert pictures into HTML... I think this UDF is good to go. It is very slow though, so if anyone knows how to speed it up, that would be cool. I wouldn't use any picture larger than 20k, otherwise you will be waiting all day long.

So in short, it encodes the picture in Base64 and saves it as an HTML file so the image is embedded in the page... cool stuff!
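For anyone wondering what the embedding looks like, here is a minimal sketch. It assumes the picture has already been encoded to Base64 without line breaks, and that the browser understands data: URIs. The function name _Sketch_ImgPage and the sample string (just the Base64 of the "GIF89a" header, not a complete image) are made up for illustration and are not taken from Convert_Img_to_Html.au3.

```autoit
; Hypothetical helper (not part of the posted UDF):
; wrap an already Base64-encoded image in a minimal HTML page.
Func _Sketch_ImgPage($sBase64, $sMimeType = "image/gif")
    ; A data: URI has the form  data:<mime type>;base64,<encoded bytes>
    Return '<html><body><img src="data:' & $sMimeType & ';base64,' & $sBase64 & '"></body></html>'
EndFunc   ;==>_Sketch_ImgPage

; "R0lGODlh" is only a placeholder payload, not a full picture.
ConsoleWrite(_Sketch_ImgPage("R0lGODlh") & @CRLF)
```

The encoded data has to be one unbroken string here, which is one reason to keep an encode option that leaves out line breaks.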

image_gif_.htm is an example of the output. *EDIT* Use Firefox to view it.

Convert_Img_to_Html.au3

image_gif_.htm

Edited by mikeytown2

  • 2 weeks later...

Here's my old base64 file. I never converted it to a UDF though. It's not really finished, kind of a work in progress that never got finished. It's more targeted at converting binary to base64, so it's probably slower on text files than one that is targeted at text files.

Anyway, learn from it, if you can :o

I just tried compiling it and it still seems to work.

Base64.au3


blindwig, you're using bit operations and you have the Base64 alphabet hardcoded. Your function is super fast! It takes me 18.5 seconds to encode a pic using my function, but 2.5 using yours! I'm going to combine the two if you don't mind.

I don't understand how bitwise operations work, although I can count in binary, so I guess I just need to learn.

I'll post back with the updated udf soon! :o

EDIT

Semi done merging the two together. I have the encoding part done, and it looks like it's in UDF format as well, but I need to check to make sure. I also sped up blindwig's encode function so it takes only 2.0 seconds instead of 2.5 for that pic. I got rid of the File encode function as it's kind of a slow function for small files. If you want to encode a large file then you're crazy to be using this UDF.

Edited by mikeytown2

Updated _Base64 again. This time it's 7 times faster! Grab it from the second post again. Thanks blindwig for your code, I used almost all of it. That also means the UDF doesn't contain any of the original VBScript conversion code, and I've changed the header of the file to reflect that.

Please test out converting to and from Base64 with files. The new example file makes it easy to do, but it is a slow process, so only use small files (less than 100k). Let us know if the original file and the decoded output file are not the same. (Don't worry, it will not overwrite the source file.)


blindwig, you're using bit operations and you have the Base64 alphabet hardcoded. Your function is super fast! It takes me 18.5 seconds to encode a pic using my function, but 2.5 using yours! I'm going to combine the two if you don't mind.

Yup. Like I was telling you in the PM, pure math functions are way faster than string operations.

I don't understand how bitwise operations work, although I can count in binary, so I guess I just need to learn.

It's good stuff to know if you want to learn how to optimize mathematical routines. And if you think that they're fast in a script language, wait until you try them in a lower-level language!
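As a rough illustration of what the bit operations do during encoding, here is a minimal sketch that turns three bytes into four Base64 characters with integer math only. It is not the code from Base64.au3 or the UDF; the name _Sketch_Encode3 is invented for this example, and it assumes a current AutoIt where BitShift() with a negative count shifts left.

```autoit
; Hypothetical helper (not from the posted UDF): encode three byte values (0-255)
; into four Base64 characters using only integer math.
Func _Sketch_Encode3($iB1, $iB2, $iB3)
    Local Const $sAlphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
    ; Pack the three bytes into one 24-bit integer (negative shift = shift left)
    Local $iTriple = BitOR(BitShift($iB1, -16), BitShift($iB2, -8), $iB3)
    Local $sOut = "", $iIndex
    ; Peel off four 6-bit groups, most significant group first
    For $iShift = 18 To 0 Step -6
        $iIndex = BitAND(BitShift($iTriple, $iShift), 63)
        $sOut &= StringMid($sAlphabet, $iIndex + 1, 1) ; StringMid is 1-based
    Next
    Return $sOut
EndFunc   ;==>_Sketch_Encode3

ConsoleWrite(_Sketch_Encode3(Asc("M"), Asc("a"), Asc("n")) & @CRLF) ; "Man" -> TWFu
```

The only string work left is the final alphabet lookup; everything else is shifts and masks, which is where the speed comes from.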

Semi done merging the two together. I have the encoding part done, and it looks like it's in UDF format as well, but I need to check to make sure. I also sped up blindwig's encode function so it takes only 2.0 seconds instead of 2.5 for that pic. I got rid of the File encode function as it's kind of a slow function for small files. If you want to encode a large file then you're crazy to be using this UDF.

I wrote that routine a long time ago, before AutoIt was able to handle binary zeros in files, so the routines read and write the files 1 byte at a time, which is very slow. Now that AutoIt beta can handle BinaryStrings, those functions can be greatly improved.

Updated _Base64 again. This time it's 7 times faster! Grab it from the second post again. Thanks blindwig for your code, I used almost all of it. That also means the UDF doesn't contain any of the original VBScript conversion code, and I've changed the header of the file to reflect that.

Please test out converting to and from Base64 with files. The new example file makes it easy to do, but it is a slow process, so only use small files (less than 100k). Let us know if the original file and the decoded output file are not the same. (Don't worry, it will not overwrite the source file.)

Your decoding is still very slow. Here is a new decoding core. Remember, mathematical routines are always much faster than strings!

;Takes a char (base64 encoded) and returns a numeric value (base64 decoded)
Func _Base64AlphaReverseChar($s_Char)
    Local $i_Asc = Asc($s_Char)
    Switch $i_Asc
        Case 65 to 90 ;A-Z = 0-25
            Return $i_Asc - 65
        Case 97 to 122;a-z = 26-51
            Return $i_Asc - 71
        Case 48 to 57 ;0-9 = 52-61
            Return $i_Asc + 4
        Case 43    ; +  = 62
            Return 62
        Case 47    ; /  = 63
            Return 63
    EndSwitch
    SetError(1)
    Return -1
EndFunc  ;==>_Base64AlphaReverseChar

Also I notice that your encoding routine doesn't add any line breaks. If you have a file that is more than just a few KB, then you will have to do miles of horizontal scrolling when you view it. Ugh. My original routine broke the lines at 76 characters per line. That's pretty standard for ASCII files.

I also notice that your decoding routine assumes that all bytes given to it are Base64 characters to be decoded. If I get a Base64-encoded file, it most likely came from an e-mail or a newsgroup. It will most likely have line breaks and other white space. If it was a message that was replied to or forwarded, it might have that ">>>" crap in it. To be useful, your decoding routine needs to be more robust, or you need to write a pre-decode scrubbing routine.
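A minimal sketch of that kind of line breaking, for reference. The function name _Sketch_Wrap76 is invented here and is not how the UDF does it; 76 is also the maximum line length MIME (RFC 2045) allows for Base64 text.

```autoit
; Hypothetical helper (not from the posted UDF): break a Base64 string into
; lines of at most $iWidth characters, each terminated by @CRLF.
Func _Sketch_Wrap76($sB64, $iWidth = 76)
    Local $sOut = ""
    For $iPos = 1 To StringLen($sB64) Step $iWidth
        ; StringMid returns whatever is left when fewer than $iWidth characters remain
        $sOut &= StringMid($sB64, $iPos, $iWidth) & @CRLF
    Next
    Return $sOut
EndFunc   ;==>_Sketch_Wrap76

ConsoleWrite(_Sketch_Wrap76("ABCDEFGH", 3)) ; three lines: ABC, DEF, GH
```

Repeated concatenation like this is simple but not the fastest approach on large inputs, so treat it as an illustration rather than a tuned routine.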

Anyway, hope I didn't sound too critical, just trying to give some feedback.


Here is a decoding routine that can work around "junk characters". It does this by moving only the valid characters to a buffer and then decoding that buffer.

;Given a base64 encoded string it decodes it while avoiding garbage characters
Func _Base64Decode($as_CypherText)
    Local $i_Count = 1, $a_Bytes[5], $s_Out = '', $iTemp
    
;Break the input up into bytes
    $as_CypherText = StringSplit($as_CypherText, "")
    
;This first loop goes through the input looking for base64 characters.
;When it finds one, it will add it to the buffer.
;When the buffer has 4 characters in it, they will be decoded and added to the output.
    $a_Bytes[0]=0
    For $i_Count = 1 To $as_CypherText[0]
    ;Try to Reverse a byte
        $iTemp = _Base64AlphaReverseChar($as_CypherText[$i_Count])
        If not @error Then;Byte reversal was successful
        ;Add this byte to the buffer
            $a_Bytes[0] += 1
            $a_Bytes[$a_Bytes[0]] = $iTemp
        ;Clear the buffer if it is full
            If $a_Bytes[0] = 4 Then;Buffer is full
            ;Decode the Buffer
                $s_Out = $s_Out & Chr(BitShift($a_Bytes[1], -2) + BitShift($a_Bytes[2], 4))
                $s_Out = $s_Out & Chr(BitAND(BitShift($a_Bytes[2], -4) + BitShift($a_Bytes[3], 2), 255))
                $s_Out = $s_Out & Chr(BitAND(BitShift($a_Bytes[3], -6) + $a_Bytes[4], 255))
            ;Reset the buffer
                $a_Bytes[0] = 0
            EndIf
        EndIf
    Next
    
;Deal with any extra bytes left in the buffer
    If $a_Bytes[0] >= 2 Then
        $s_Out = $s_Out & Chr(BitShift($a_Bytes[1], -2) + BitShift($a_Bytes[2], 4))
        If $a_Bytes[0] >= 3 Then
            $s_Out = $s_Out & Chr(BitAND(BitShift($a_Bytes[2], -4) + BitShift($a_Bytes[3], 2), 255))
            If $a_Bytes[0] >= 4 Then
                $s_Out = $s_Out & Chr(BitAND(BitShift($a_Bytes[3], -6) + $a_Bytes[4], 255))
            EndIf
        EndIf
    EndIf

    Return $s_Out

EndFunc  ;==>_Base64Decode

Speed Increase in _Base64 all around! Thank You blindwig for helping out. I should have looked more into the 2 different decoders. I'm glad you did!

I've made adding @CRLF to the output of _Base64Encode an option. Its default is True, so it will do that automatically. Adding the line breaks is a little slower (1.5 seconds with them versus 1.25 seconds without, for my test file). So if you don't need a @CRLF every 76 characters, call the function like this:

_Base64Encode($input, False)

I've also made mixed-data handling in the decoder an option, again for speed. Its default is to assume the data is mixed. If you know the data is all Base64 data, then do this:

_Base64Decode($input, False)

Let me know if this can be sped up any more. I think it's about as fast as it can get. Also, if you have more ways to handle odd data, let me know as well!

Thank You Everyone!


Okay, I did some testing and pre-scrubbing is much faster than parsing-as-you-go.

Try this: Read in a Base64 file, and compare the speed of the 2 lines below:

;Using the routine's built-in parsing:
$s_Out = _Base64Decode($s_In)

;Pre-scrubbing the data:
$s_Out = _Base64Decode(StringRegExpReplace($s_In,'[^0-9a-zA-Z/+]',''), false)

And here's a nice little testing structure I used:

#include "_Base64.au3"

Dim $SourceFileName = 'test.zip'

;Input
$hf_In = FileOpen($SourceFileName & '.base64',0)
$s_In = FileRead($hf_In)
FileClose($hf_In)

;Work
$i_StartTime = TimerInit()
;uncomment one of the following lines to test parsing versus pre-scrubbing
;$s_Out = _Base64Decode($s_In)
;$s_Out = _Base64Decode(StringRegExpReplace($s_In,'[^0-9a-zA-Z/+]',''), false)
$i_TotalTime = TimerDiff($i_StartTime)

;Output
$hf_Out = FileOpen($SourceFileName & '.original',2)
FileWrite($hf_Out, $s_Out)
FileClose($hf_Out)

;Results
;If you have WinDiff installed, uncomment this first line and comment the second one.
;Run('WinDiff "' & $SourceFileName & '" "' & $SourceFileName & '.original"')
Run('cmd /c "fc /b "' & $SourceFileName & '" "' & $SourceFileName & '.original" & PAUSE"')
MsgBox(0,'Results',Round($i_TotalTime / 1000,3) & ' seconds')

Also I notice that your encoding routine doesn't add any line breaks. My original routine broke the lines at 76 characters per line.

Line breaks should be placed so that lines fit easily on a screen. I've seen 70 and 76 as standard implementations. It used to be that some SMTP servers wouldn't process a line more than 78 chars long (the 79th and 80th being the CR/LF).

BTW, if anybody cares, RFC 821 states that the maximum line length for text is 1000 characters, including the CRLF, so you HAVE to break every 998 chars or risk blowing up an SMTP server somewhere along the message path.
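The 998-character ceiling is easy to check before handing text to an SMTP server. A minimal sketch, assuming lines are separated by @CRLF and a current AutoIt where StringSplit() accepts flag 1 (treat the whole delimiter string as one separator); the function name is made up for illustration:

```autoit
; Hypothetical helper: returns True when every @CRLF-separated line in $sText
; is at most $iMax characters long (998 = 1000 minus the CR and LF).
Func _Sketch_LinesWithinLimit($sText, $iMax = 998)
    Local $aLines = StringSplit($sText, @CRLF, 1) ; flag 1: split on the whole @CRLF pair
    For $i = 1 To $aLines[0]
        If StringLen($aLines[$i]) > $iMax Then Return False
    Next
    Return True
EndFunc   ;==>_Sketch_LinesWithinLimit

ConsoleWrite(_Sketch_LinesWithinLimit("short line" & @CRLF & "another one") & @CRLF)
```

Output wrapped at 76 characters passes this check with plenty of room, so the limit mainly matters when encoding with line breaks turned off.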

Reading the help file before you post... Not only will it make you look smarter, it will make you smarter.


Updates to _Base64: Encode (very minor speed increase) and Decode (better handling of errors).

Okay, I did some testing and pre-scrubbing is much faster than parsing-as-you-go.

....
;Pre-scrubbing the data:
$s_Out = _Base64Decode(StringRegExpReplace($s_In,'[^0-9a-zA-Z/+]',''), false)
The time difference between pre-scrubbing all data and not doing any error checking is almost nonexistent, so I took the extra option out of _Base64Decode and made it pre-scrub all data. It also handles bad input better (it will no longer error out on array sizes).

@flyingboz

Thank you for your input. I'm going to leave in the option of encoding the data without line breaks. Embedding an image into an HTML file is one example where line breaks are not necessary, so I will keep that option. If adding line breaks didn't slow the function down, I would probably take the option out. As of right now, the default is a line break every 76 characters.

Any other thoughts/ideas/bugs?


Made a few changes to the decode routine:

Eliminated unnecessary string concatenations - gained about 5% speed

Eliminated redundant math in the main loop - no noticeable speed gain

Rewrote the padding section, just because I really hate to see ReDim called inside a loop

Added an optional progress meter - about a 10% loss in speed, but definitely useful for any file that is more than just a few KB.

Note: for a better (multi-tiered stackable) progress meter, see the progress UDF listed in my sig.

Also a suggestion: You might want to add a comment in your code and/or a note in your original post that these routines will only work on binary files if you are using a version of AutoIt that supports BinaryStrings (i.e. v3.1.1.85 or newer)

Here's the code:

;Decodes a given base64 string and returns it.  Automatically filters illegal characters.  Optional progress meter.
Func _Base64Decode($as_CypherText, $s_ProgressTitle='')
    Local $i_Count = 0, $ai_Bytes[5], $s_Out = ''
    
    If $s_ProgressTitle Then
        ProgressOn($s_ProgressTitle, 'Base64 Decoding', 'Pre-processing input')
    EndIf

;Break the input up into bytes and takeout garbage characters
    $as_CypherText = StringSplit(StringRegExpReplace($as_CypherText, '[^0-9a-zA-Z/+]', ''), "")
    
;Pad the input to a multiple of four characters
    Local $i_CypMod = Mod($as_CypherText[0], 4)
    If $i_CypMod Then
        ReDim $as_CypherText[$as_CypherText[0] + 4 - $i_CypMod]
        For $i_Count = $as_CypherText[0] + 1 To UBound($as_CypherText) - 1
            $as_CypherText[$i_Count] = '='
        Next
        $as_CypherText[0] = UBound($as_CypherText) - 1
    EndIf
    
;Main decoding loop
    If $s_ProgressTitle Then
        ProgressSet(0, 'Main Decode...')
    EndIf
    Local $i_End = Int($as_CypherText[0] / 4 - 2), $i_Count4
    For $i_Count = 0 To $i_End Step + 1
    ;read 4 characters
        $i_Count4 = $i_Count * 4
        $ai_Bytes[0] = _Base64ReverseChar($as_CypherText[$i_Count4 + 1])
        $ai_Bytes[1] = _Base64ReverseChar($as_CypherText[$i_Count4 + 2])
        $ai_Bytes[2] = _Base64ReverseChar($as_CypherText[$i_Count4 + 3])
        $ai_Bytes[3] = _Base64ReverseChar($as_CypherText[$i_Count4 + 4])
        
        $s_Out = $s_Out & Chr(BitShift($ai_Bytes[0], -2) + BitShift($ai_Bytes[1], 4)) & Chr(BitAND(BitShift($ai_Bytes[1], -4) + BitShift($ai_Bytes[2], 2), 255)) & Chr(BitAND(BitShift($ai_Bytes[2], -6) + $ai_Bytes[3], 255))
        
        If $s_ProgressTitle Then
            ProgressSet(100 * $i_Count / $i_End)
        EndIf
    Next
    
;last run
    If $s_ProgressTitle Then
        ProgressOff()
    EndIf

    If $i_End = -2 Then;string has no char we can use, strlen of input is 0
        Return ""
    Else
        $ai_Bytes[0] = _Base64ReverseChar($as_CypherText[$i_Count * 4 + 1])
        $ai_Bytes[1] = _Base64ReverseChar($as_CypherText[$i_Count * 4 + 2])
        $ai_Bytes[2] = _Base64ReverseChar($as_CypherText[$i_Count * 4 + 3])
        $ai_Bytes[3] = _Base64ReverseChar($as_CypherText[$i_Count * 4 + 4])
        
        Select
            Case $ai_Bytes[0] = -1;File ended on a perfect octet
            Case $ai_Bytes[1] = -1;This should never happen
                SetError(-1)
            Case $ai_Bytes[2] = -1;Only the first 2 bytes to be considered
                $s_Out = $s_Out & Chr(BitShift($ai_Bytes[0], -2) + BitShift($ai_Bytes[1], 4))
            Case $ai_Bytes[3] = -1;Only the first 3 bytes to be considered
                $s_Out = $s_Out & Chr(BitShift($ai_Bytes[0], -2) + BitShift($ai_Bytes[1], 4)) & Chr(BitAND(BitShift($ai_Bytes[1], -4) + BitShift($ai_Bytes[2], 2), 255))
            Case Else;All 4 bytes to be considered
                $s_Out = $s_Out & Chr(BitShift($ai_Bytes[0], -2) + BitShift($ai_Bytes[1], 4)) & Chr(BitAND(BitShift($ai_Bytes[1], -4) + BitShift($ai_Bytes[2], 2), 255)) & Chr(BitAND(BitShift($ai_Bytes[2], -6) + $ai_Bytes[3], 255))
        EndSelect
        Return $s_Out
    EndIf

EndFunc  ;==>_Base64Decode

Made a few changes to the decode routine:

Eliminated unnecessary string concatenations - gained about 5% speed

Eliminated redundant math in the main loop - no noticeable speed gain

Rewrote the padding section, just because I really hate to see ReDim called inside a loop

Added an optional progress meter - about a 10% loss in speed, but definitely useful for any file that is more than just a few KB.

Decode

Cool, I added your changes while changing a couple of things. I sped up the decode function by passing the variable to the ReverseChar function ByRef. I also fixed a bug in the new padding code. I tried eliminating the array of size 4 ($ai_Bytes), but that slowed the function down.

Encode

Added a progress bar as well.

I changed the example file to reflect these changes, so some examples use the progress bar. As a result, you need to measure new times when testing. The file encode/decode example is slower; the HTML example is faster.

Note: for a better (multi-tiered stackable) progress meter, see the progress UDF listed in my sig.

I honestly don't see how we could use a multi-tiered stackable progress meter. I will use this for other projects though, thank you for pointing this out!

Also a suggestion: You might want to add a comment in your code and/or a note in your original post that these routines will only work on binary files if you are using a version of AutoIt that supports BinaryStrings (i.e. v3.1.1.85 or newer)

Thanks, that is taken care of now.

Grab the new _Base64.au3 and _Base64_Example.au3 from my second post.

Let me know if you have any thoughts/ideas/bugs.


I also fixed a bug in the new padding code.

Oops. It seems that the file I was using to test the routine just happened to be a multiple of 4, so I never saw that bug. Guess I need to do some more robust testing, eh?
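The thread doesn't say exactly what the padding bug was. One thing that is easy to get wrong with a StringSplit-style array is that element 0 holds the count, so the array needs one more slot than the number of characters. A minimal sketch of padding under that assumption follows; the name _Sketch_PadToFour is invented, and whether this matches the fix that went into the UDF is a guess.

```autoit
; Hypothetical helper (not from the posted UDF): pad a StringSplit-style array
; (character count in element 0) with '=' until the count is a multiple of four.
Func _Sketch_PadToFour(ByRef $aChars)
    Local $iMod = Mod($aChars[0], 4)
    If $iMod = 0 Then Return ; already a multiple of four
    Local $iNewCount = $aChars[0] + 4 - $iMod
    ReDim $aChars[$iNewCount + 1] ; +1 because element 0 holds the count
    For $i = $aChars[0] + 1 To $iNewCount
        $aChars[$i] = "="
    Next
    $aChars[0] = $iNewCount
EndFunc   ;==>_Sketch_PadToFour

Local $aDemo = StringSplit("TWFuTQ", "") ; 6 characters, so 2 pad characters are needed
_Sketch_PadToFour($aDemo)
ConsoleWrite($aDemo[0] & @CRLF) ; count is now 8
```

Testing with inputs of length 4n+1, 4n+2 and 4n+3 as well as 4n would catch this kind of off-by-one.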

I honestly don't see how we could use a multi-tiered stackable progress meter. I will use this for other projects though, thank you for pointing this out!

I thought it would be useful if for example you were encoding multiple JPGs into a web page - you'd want a total progress bar for the page, and then child progress bars to show the progress for each individual image encoding. You might even want a parent progress bar if you are encoding multiple pages. My Progress Bar UDF was designed for that type of nesting. I just added another post to try to explain that better.

Let me know if you have any thoughts/ideas/bugs?

A minor issue in your example is that you write to the files in append mode, so each time you run it they just get bigger. Also, you never compare the original file against the decoded file to prove that the encode/decode was flawless.