Sign in to follow this  
Followers 0
BoonPek

Huffman Trees - finding the prefix codes

4 posts in this topic

#1 ·  Posted (edited)

Hi all,

I understand the underlying concepts behind the generation of a Huffman Tree and can do it on paper; the problem is converting the concept into a routine/program.

This is what I'm hoping to accomplish: http://huffman.ooz.ie/?text=This%20is%20a%20test%20string

So far I've been able to create and organise the frequency table. I don't know what the next step is. I know, however, that I need to begin with the two lowest frequencies and sum them up then repeat. I'm uncertain as to how I can accomplish this and generate a tree. A 2D, 3D array? Some queer data type?

Any help would be highly appreciated, thanks!

#include <Array.au3>

$time = TimerInit()
$str = "This is a test string"
$var = StringSplit($str, "")

Dim $trie[1][2]
$trie[0][0] = $str
$trie[0][1] = $var[0]

For $i = 1 To $var[0]
    For $j = 1 To UBound($trie)-1
        If $trie[$j][0] == $var[$i] Then
            $trie[$j][1] += 1
            ContinueLoop 2
        EndIf
    Next

    ReDim $trie[$j + 1][2]
    $trie[$j][0] = $var[$i]
    $trie[$j][1] = 1
Next

ConsoleWrite("Execution time: " & TimerDiff($time) & "ms" & @LF)

_ArraySort($trie, 1, 0, 0, 1)
_ArrayDisplay($trie)

$trie[UBound($trie)-2][0] = $trie[UBound
Edited by BoonPek

Share this post


Link to post
Share on other sites



#2 ·  Posted (edited)

Update!

I managed to get the program to generate a tree-ish, but am struggling to get the prefix codes for the different symbols. The output, for example, of "this is a test string" is: "[0.[0. |1.t]|1.[0.[0.[0.h|1.g]|1.i]|1.[0.[0.[0.n|1.r]|1.[0.e|1.a]]|1.s]]]"

[t] = 01

[ ] = 00

= 111

= 101

etc.

I've tried StringSplit on the different delimiters such as [, |, . and ] but it doesn't seem to work for most of the symbols. Can anyone point me in the right direction? I'm doing this to better understand programming; I understand that other languages (supposedly) do it better but I'd like to try it out with AutoIt.

Thank you

#include <Array.au3>

$time = TimerInit()
$str = "this is a test string"
$var = StringSplit($str, "")

Dim $trie[1][2]
$trie[0][0] = $str
$trie[0][1] = $var[0]

For $i = 1 To $var[0]
    For $j = 1 To UBound($trie)-1
        If $trie[$j][0] == $var[$i] Then
            $trie[$j][1] += 1
            ContinueLoop 2
        EndIf
    Next

    ReDim $trie[$j + 1][2]
    $trie[$j][0] = $var[$i]
    $trie[$j][1] = 1
Next

Do
    $num = UBound($trie)
    _ArraySort($trie, 1, 0, 0, 1)

    $trie[$num-2][0] = "[0." & $trie[$num-1][0] & "|1." & $trie[$num-2][0] & "]"
    $trie[$num-2][1] = $trie[$num-1][1] + $trie[$num-2][1]

    ReDim $trie[$num-1][2]
Until UBound($trie) = 2

ConsoleWrite($trie[1][0] & @LF)
ConsoleWrite("Execution time: " & TimerDiff($time) & "ms" & @LF)
Edited by BoonPek

Share this post


Link to post
Share on other sites

#4 ·  Posted (edited)

Hi!

Thanks for your reply and pointer to that resource :)

I'm not learned in the various other programming languages having only stuck to AutoIt for casual scripting for the past 6 years (wow, I feel old now :/). Looking over the various codes above suggests that I restart the writing of my code and drop the old one, right?

I've spoken with my CS teacher earlier and he mentioned heaps and priority queues and lists, which I believe are not officially supported data types in AutoIt?

More pointers? ^_^

EDIT: It's easy to miss out in the mess of code above, but I managed to generate my own tree -- somewhat: ((  t) (((h g) i) (((n r) (e a)) s))) -- visually interpret-able but difficult to derive a codeword from :(

Edited by BoonPek

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0