Jump to content

This site uses cookies. By continuing to browse the site you are agreeing to our use of cookies. Find out more here. X
X


Photo

UDF: Binary Trees


  • Please log in to reply
16 replies to this topic

#1 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 01 July 2005 - 08:28 PM

A while ago I wrote a "tables" UDF, for doing look-up tables (unique key=value). As I've worked with them, I've wanted one that are faster, and will be sorted when displayed. I also wanted a quick way to search an array, without having to sort it everytime I add something to it.

If you don't know how Binary Trees work, don't worry about it. Just think of it this way:
instead of having a flat array where the indexes are 0-x, you can have whatever indexes you want. For example:
;Create a binary tree $btNew = _BTreeCreate() ;assign some values _BTreeSet($btNew, 1, 'The first item') _BTreeSet($btNew, 2, 'The second item') _BTreeSet($btNew, 'three', 'The third item') _BTreeSet($btNew, 'fourth', 'The fourth item') _BTreeSet($btNew, 'red', 'A warm color') _BTreeSet($btNew, 'august', 'The 8th month') ;retireve a stored value msgbox(0,'RED',_BTreeGet($btNew, 'RED'));keys are not case sensative


After putting a bunch of values into the tree, you'll want to optimize it, so it's as fast as it can be:
;Create the Binary Tree $btTemp = _BTreeCreate() ;Populate Binary Tree with random key/values ProgressOn('MWR_BTree UDF Test','Populating Keys...') for $i = 1 to 50     $j = Random(1, 1000, 1)     _BTreeSet($btTemp, $j, '');set a random key to a blank value     ProgressSet($i * 2, $j) Next ProgressOff() ;See how ugly your tree can be MsgBox(0,'MWR_BTree UDF Test: Raw Tree',_BTreePrintKeys($btTemp)) ;See how pretty your tree can be _BTreeOptimize($btTemp) MsgBox(0,'MWR_BTree UDF Test: Optimized Tree',_BTreePrintKeys($btTemp))

And here's what the unoptimized tree looks like:
CODE

,-------7
,-------24
| '-------25
| | ,-------54
| '-------116
| | ,-------121
| | | '-------162
| | | | ,-------174
| | | '-------180
| '-------189
,-------237
| | ,-------240
| | ,-------283
| | | | ,-------290
| | | | | '-------291
| | | | | '-------354
| | | | | '-------437
| | | '-------438
| '-------480
| | ,-------489
| '-------522
| | ,-------537
| | | '-------538
| | ,-------547
| '-------549
| | ,-------602
| | | '-------606
| | | '-------624
| '-------655
--------662
| ,-------666
| | '-------667
| ,-------673
| ,-------761
| | | ,-------778
| | '-------808
'-------839
| ,-------843
| ,-------851
| ,-------859
| | | ,-------862
| | '-------910
| | '-------915
| ,-------932
| ,-------938
| | '-------941
| | '-------948
'-------992
'-------998
And the optimized one:
CODE

,-------7
| '-------24
,-------25
| '-------54
| '-------116
,-------121
| | ,-------162
| | | '-------174
| '-------180
| '-------189
| '-------237
,-------240
| | ,-------283
| | | '-------290
| | ,-------291
| | | '-------354
| | | '-------437
| '-------438
| | ,-------480
| | | '-------489
| '-------522
| | ,-------537
| '-------538
| '-------547
--------549
| ,-------602
| | '-------606
| ,-------624
| | '-------655
| | '-------662
| ,-------666
| | | ,-------667
| | | | '-------673
| | '-------761
| | '-------778
| | '-------808
'-------839
| ,-------843
| | '-------851
| ,-------859
| | '-------862
| | '-------910
'-------915
| ,-------932
| | '-------938
'-------941
| ,-------948
'-------992
'-------998

Attached Files









#2 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 01 July 2005 - 08:34 PM

I forgot to say - you have to have the latest beta to run this. To optimize the tree (in the _BTreeOptimize() function), I use the _ArraySort routine in the Array.au3, and it has a bug on some older versions.

#3 Valuater

Valuater

    www.PayFreeWireless.com

  • MVPs
  • 11,203 posts

Posted 01 July 2005 - 08:55 PM

That is cool, blindwing
8)

Posted Image

Clic The Pic!!!


#4 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 04 July 2005 - 09:24 AM

I've added a new function to my library, called _BTreeToTable.
It takes a binary tree as input, as spits out a table (2d array) where:
[x][0]=key
[x][1]=value
[x][2]=original index
And it actually walks the tree, so the keys will be sorted in ascending order.
here's the code:
CODE

;Returns a table of [x][3], where [x][0]=key, [x][1]=value, [x][2]=original index
Func _BTreeToTable(ByRef $BTree)
Dim $aOut[$BTree[0][0] + 1][3]
If $BTree[0][1] <> 0 Then __BTreeToTable($BTree, $BTree[0][1], $aOut)
Return $aOut
EndFunc
;private
Func __BTreeToTable(ByRef $BTree, $Ptr, ByRef $aKey)
If $BTree[$Ptr][1] <> 0 Then __BTreeToTable($BTree, $BTree[$Ptr][1], $aKey)
$aKey[0][0]=$aKey[0][0]+1
$aKey[$aKey[0][0]][0]=$BTree[$Ptr][0]
$aKey[$aKey[0][0]][1]=$BTree[$Ptr][3]
$aKey[$aKey[0][0]][2]=$Ptr
If $BTree[$Ptr][2] <> 0 Then __BTreeToTable($BTree, $BTree[$Ptr][2], $aKey)
EndFunc


And using this new code, I've re-written my optimizing routine. Since the keys are returned in sorted order, I no longer need to rely on _ArraySort, so the routine is now about 4x faster. Here's the new code:
CODE

;This will optimize the binary tree (make sure all the branches are balanced)
Func _BTreeOptimize(ByRef $BTree, $ProgressTitle = @ScriptName)
If $ProgressTitle <> '' Then ProgressOn($ProgressTitle, 'Optimizing Binary Tree...', 'Gathering Keys...')
Local $aKeys[$BTree[0][0]+1][3]
$aKeys = _BTreeToTable($BTree)
If $ProgressTitle <> '' Then ProgressSet(50, 'Restructuring Tree...')
$BTree[0][1]=__BTreeOpt($BTree, 1, $BTree[0][0], $aKeys)
If $ProgressTitle <> '' Then ProgressSet(99)
If $ProgressTitle <> '' Then ProgressOff()
EndFunc

;private
Func __BTreeOpt(ByRef $BTree, $min, $max, ByRef $aKeys)
If $min = $Max Then
$BTree[$aKeys[$min][2]][1] = 0
$BTree[$aKeys[$min][2]][2] = 0
Return $aKeys[$min][2]
EndIf
$mid = int(($min + $max) / 2)
If $min = $mid Then
$BTree[$aKeys[$mid][2]][1]=0
Else
$BTree[$aKeys[$mid][2]][1]=__BTReeOpt($BTree, $Min, $Mid - 1, $aKeys)
EndIf
If $mid = $max Then
$BTree[$aKeys[$mid][2]][2]=0
Else
$BTree[$aKeys[$mid][2]][2]=__BTReeOpt($BTree, $Mid + 1, $Max, $aKeys)
EndIf
Return $aKeys[$mid][2]
EndFunc

The biggest difference is that it no longer "fixes" the tree - if you've tampered with the array manually, this routine won't find orphaned node like the old one would. The solution for that is simple - don't #%$@&* with the array outside of the _BTree* functions! :)

#5 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 07 July 2005 - 08:28 AM

I've just wrote a routine to remove a key/value pair from the tree.
CODE

;Find the key $Key in Binary Tree $BTree, returns it's parent value (or 0 if $key is at root)
;Also sets @Extended to point to the array index of the parent item.
Func _BTreeFindParentKey(ByRef $BTree, $Key)
Local $BTCur = $BTree[0][1], $BTLast = 0
While $BTCur <> 0 And $BTree[$BTCur][0] <> $Key
Select
Case (IsNumber($Key) = IsNumber($BTree[$BTCur][0]) And $Key < $BTree[$BTCur][0]) Or (IsNumber($Key) <> IsNumber($BTree[$BTCur][0]) And String($Key) < String($BTree[$BTCur][0]))
$BTLast = $BTCur
$BTCur = $BTree[$BTCur][1]
Case (IsNumber($Key) = IsNumber($BTree[$BTCur][0]) And $Key > $BTree[$BTCur][0]) Or (IsNumber($Key) <> IsNumber($BTree[$BTCur][0]) And String($Key) > String($BTree[$BTCur][0]))
$BTLast = $BTCur
$BTCur = $BTree[$BTCur][2]
EndSelect
WEnd
If $BTCur = 0 Then
SetError(1) ;$Key was not found in this tree
Return 'The key "' & $Key & '" was not found in the tree.'
Else
SetExtended($BTLast)
If $BTLast = 0 Then
Return 0
Else
Return $BTree[$BTLast][0]
EndIf
EndIf
EndFunc

;This function will remove a key from the binary tree.
;Note that it does not actually remove it, it only orpahns it by re-routing the links around it.
Func _BTreeDelete(ByRef $BTree, $Key)
;Locate $key in $BTree
Local $RetVal = _BTreeGet($BTree, $Key), $ptr = @extended
If Not @error Then
;Gather a list of sub-branches under this key
Local $aBranch[$BTree[0][0]][3]
If $BTree[$ptr][1]<>0 Then __BTreeToTable($BTree, $BTree[$ptr][1], $aBranch)
If $BTree[$ptr][2]<>0 Then __BTreeToTable($BTree, $BTree[$ptr][2], $aBranch)
_BTreeFindParentKey($BTree, $Key)
;graft the new sub-branch onto the parent branch
Local $ParentPtr = @extended
If $ParentPtr = 0 Then
$BTree[0][1] = __BTreeOpt($BTree, 1, $aBranch[0][0], $aBranch)
ElseIf $BTree[$ParentPtr][1] = $ptr Then
$BTree[$ParentPtr][1] = __BTreeOpt($BTree, 1, $aBranch[0][0], $aBranch)
ElseIf $BTree[$ParentPtr][2] = $ptr Then
$BTree[$ParentPtr][2] = __BTreeOpt($BTree, 1, $aBranch[0][0], $aBranch)
EndIf
;remove branches from the original item
$BTree[$ptr][1] = 0
$BTree[$ptr][2] = 0
Else
SetError(1)
Return $RetVal
EndIf

EndFunc


#6 condoman

condoman

    Polymath

  • Active Members
  • PipPipPipPip
  • 237 posts

Posted 07 July 2005 - 10:54 AM

Very nice. I am amazed what can be done woth AutoIt3. I wrote a word counting program with C when I was learning it in 1983. Same concept different language.

Can you update the zip file with the new changes?

Thanks for your effort. :)

#7 HyperChao

HyperChao

    Seeker

  • New Members
  • 8 posts

Posted 14 July 2005 - 01:28 AM

Since you don't use _ArraySort anymore, is v3.1.1 good enough? Also, it doesn't seem that the zip file attached in the first post is updated with your new functions...

As for your algorithm, how does it handle adding new elements? Greater than, on the right, less than to the left, but what about when it's equal? Does it leave the tree unaltered? Or am I just totally confused? I'm not sure about this key and value business.

Edited by HyperChao, 14 July 2005 - 01:39 AM.


#8 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 14 July 2005 - 04:03 AM

Since you don't use _ArraySort anymore, is v3.1.1 good enough? Also, it doesn't seem that the zip file attached in the first post is updated with your new functions...

Correct, I no longer need the lates Beta
no, the ZIP file is still old. Get it, then patch it with the functions I have posted.

As for your algorithm, how does it handle adding new elements? Greater than, on the right, less than to the left, but what about when it's equal? Does it leave the tree unaltered? Or am I just totally confused? I'm not sure about this key and value business.

Yes, higher and lower key values spawn new branches. Similiar key values always over-write older ones.
If you just want to use the tree to store unique values, then just add keys, and don't worry about their values. The purpose of this UDF is to quickly store and retreive information (values) based on a unique key, but if the key *IS* the information, then no additional data is required. You'll notice on the _BTreeSet() function, the $Value parameter is optional.

#9 Insolence

Insolence

    Not distastefully arrogant

  • Active Members
  • PipPipPipPipPipPip
  • 1,304 posts

Posted 14 July 2005 - 08:25 PM

I'm very impressed, I don't understand this stuff quite yet... good job :)
"I thoroughly disapprove of duels. If a man should challenge me, I would take him kindly and forgivingly by the hand and lead him to a quiet place and kill him." - Mark TwainPatient: "It hurts when I do $var_"Doctor: "Don't do $var_" - Lar.

#10 SlimShady

SlimShady

    AutoIt lover

  • Active Members
  • PipPipPipPipPipPip
  • 2,383 posts

Posted 14 July 2005 - 08:52 PM

Just the description of this makes me excited to use them :evil:
very nice :)

#11 layer

layer

    i love skateboarding

  • Active Members
  • PipPipPipPipPipPip
  • 2,470 posts

Posted 14 July 2005 - 10:05 PM

I wish I was good with arrays...
FootbaG

#12 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 15 July 2005 - 01:17 AM

I'm very impressed, I don't understand this stuff quite yet... good job :)

<{POST_SNAPBACK}>

It's not too complicated, my functions wrap up all the nitty-gritty stuff. It works like this:

First thing to do is create a binary tree variable:
$btMyData = _BTreeCreate()


The main purpose for these routines is to associate data (a number, string, or even an array) with a unique key (which is typically a number, but can be a string or even an array):
_BTreeSet($btMyData, 1, 'first bit of info')

Now the key '1' contains the data 'first bit of info'

To retreive the data using the key:
$vData = _BTreeGet($btMyData, 1)

And that will return 'first bit of info'

Now, key must be unique, so if you try to put the same key in with different data:
_BTreeSet($btMyData, 1, 'second bit of info')

Now the key '1' is associated with the data 'second bit of info', and the original data 'first bit of info' has been destroyed.

Now, that's if you want to have a unique set of keys and some associated data. The second way to use my functions is if you want to make a list of data, but want to make sure that you don't duplicate any data. In this case, you would use the keys as data (since they are forced to be unique) and just not associate any data with them:
$btMyData = _BTreeCreate() _BTreeSet($btMyData, 'some data') _BTreeSet($btMyData, 'some more data') _BTreeSet($btMyData, 'even more data') _BTreeSet($btMyData, 'some more data')

Now in the example above, the tree will only have 3 entries, since 'some more data' was keyed twice. So you can use my routine to store data that you want to keep it unique (for example if you want a list of files, but you want to make sure that each file is only listed once)

The function _BTreeGet() sets @error if the key was not found, so you can use that to check for the existence of keys:
$vRetVal = _BTreeGet($btMyData, 'crazy data') if @Error then MsgBox(0, 'not found', $vRetVal)




One of the big limitations of binary trees is that they are designed to take random values (keys). They don't do well if you give them values that are sorted in order (either ascending or descnding). For example:
For $i = 1 to 50     _BTreeSet($btMyData, $i) Next

This code would give you a very slow tree. Not even a tree really - more like a linked list. If you've worked with those before, you know that they can be very slow.
To fix this, either make sure that you put stuff in in a random order, or run _BTreeOptimize every once in a while.

I'm thinking of a way to write an Array-to-BTree routine, but haven't done it yet (mostly because I haven't needed one yet, and I write my stuff on a when-I-need-it basis :evil: )

Anyway, I hope that gives some insight into how to work with my BTree routines. Any questions, ask away...

#13 Insolence

Insolence

    Not distastefully arrogant

  • Active Members
  • PipPipPipPipPipPip
  • 1,304 posts

Posted 15 July 2005 - 09:33 AM

So it's sort of like a hashtable? :)

Thank you very much for your time in explaining it.
"I thoroughly disapprove of duels. If a man should challenge me, I would take him kindly and forgivingly by the hand and lead him to a quiet place and kill him." - Mark TwainPatient: "It hurts when I do $var_"Doctor: "Don't do $var_" - Lar.

#14 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 15 July 2005 - 05:45 PM

So it's sort of like a hashtable? :)

Thank you very much for your time in explaining it.

<{POST_SNAPBACK}>

Yes, I designed it to work very much like a hash table. infact, if you wanted a true hash table, just write a hash routine to hash the key before submitting it:
_BTreeSet($btMyData, Hash($Key), $Data) $RetVal = _BTreeGet($btMyData, Hash($Key), $Data)


And like I said, you can also use it to store a unique list (using the keys as values and not providing or checking the associated data values)

#15 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 27 July 2005 - 12:36 AM

I've written 2 more functions for creating a Binary Tree from an existing array.
Use _BTreeFromArray1() for 1-d arrays and _BTreeFromArray2() for 2-d arrays.

These functions use _ArraySort(), so make sure you are using the latest version of that (the 3.1.1 version has a bug in it)

CODE

;Returns a Binary Tree keyed with values found in 1-d array $aIn
Func _BTreeFromArray1(ByRef $aIn, $iBase = 1)
Local $i, $InTop, $Next = 0

;Create the output structure
If $iBase Then
$iBase = 1
$InTop = $aIn[0]
Else
$iBase = 0
$InTop = UBound($aIn, 1) - 1
EndIf
Dim $btOut[$InTop + 2 - $iBase][4]
For $i = $iBase To $InTop
$btOut[0][0] = $btOut[0][0] + 1
$btOut[$btOut[0][0]][0] = $aIn[$i]
Next

_ArraySort($btOut, 0, 1, $btOut[0][0], 2, 0)

;Remove duplicates
$i = $iBase
While $i <= $btOut[0][0] - $Next
While $i + $Next + 1 <= $btOut[0][0] And $btOut[$i][0] = $btOut[$i + $Next + 1][0]
$Next = $Next + 1
WEnd
$i = $i + 1
If $Next And $i + $Next + 1 <= $btOut[0][0] Then
$btOut[$i][0] = $btOut[$i + $Next][0]
EndIf
WEnd
$btOut[0][0] = $i - 1
ReDim $btOut[$i][4]

;Build Tree Structure
$btOut[0][1] = __BTreeFromArray($btOut, 1, $btOut[0][0])

Return $btOut
EndFunc

;Returns a Binary Tree keyed with values found in 2-d array $aIn
;keys are taken from $aIn[x][$iKey] and data is taken from $aIn[x][$iData]
Func _BTreeFromArray2(ByRef $aIn, $iBase = 1, $iKey = 0, $iData = 1)
Local $i, $InTop, $Next = 0

;Create the output structure
If $iBase Then
$iBase = 1
$InTop = $aIn[0][0]
Else
$iBase = 0
$InTop = UBound($aIn, 1) - 1
EndIf
Dim $btOut[$InTop + 2 - $iBase][4]
For $i = $iBase To $InTop
$btOut[0][0] = $btOut[0][0] + 1
$btOut[$btOut[0][0]][0] = $aIn[$i][$iKey]
$btOut[$btOut[0][0]][3] = $aIn[$i][$iData]
Next

_ArraySort($btOut, 0, 1, $btOut[0][0], 4, 0)

;Remove duplicates
$i = 1
While $i <= $btOut[0][0] - $Next
While $i + $Next + 1 <= $btOut[0][0] And $btOut[$i][0] = $btOut[$i + $Next + 1][0]
$Next = $Next + 1
WEnd
$i = $i + 1
If $Next And $i + $Next + 1 <= $btOut[0][0] Then
$btOut[$i][0] = $btOut[$i + $Next][0]
$btOut[$i][3] = $btOut[$i + $Next][3]
EndIf
WEnd
$btOut[0][0] = $i - 1
ReDim $btOut[$i][4]

;Build Tree Structure
$btOut[0][1] = __BTreeFromArray($btOut, 1, $btOut[0][0])

Return $btOut
EndFunc

;private
Func __BTreeFromArray(ByRef $BTree, $min, $max)
If $min = $Max Then
$BTree[$min][1] = 0
$BTree[$min][2] = 0
Return $min
EndIf
$mid = Round(($min + $max) / 2)
If $min = $mid Then
$BTree[$mid][1]=0
Else
$BTree[$mid][1]=__BTreeFromArray($BTree, $Min, $Mid - 1)
EndIf
If $mid = $max Then
$BTree[$mid][2]=0
Else
$BTree[$mid][2]=__BTreeFromArray($BTree, $Mid + 1, $Max)
EndIf
Return $mid
EndFunc

And some examples:
#include <MWR_BTree.au3> $aTemp = StringSplit('a,b,c,d,e,f,g,h,i,j,k,l,m',',') $btTemp = _BtreeFromArray1($aTemp) MsgBox(0,'From 1-d Array',_BTreePrintKeys($btTemp)) $aTemp = IniReadSection(@WindowsDir & '\System.Ini', '386enh') $btTemp = _BtreeFromArray2($aTemp) MsgBox(0,'From 2-d Array',_BTreePrintKeys($btTemp))


Edit: Code patch.

Edited by blindwig, 28 July 2005 - 12:51 AM.


#16 blindwig

blindwig

    Universalist

  • Active Members
  • PipPipPipPipPipPip
  • 772 posts

Posted 28 July 2005 - 12:52 AM

Fixed a bug with the above routines (the _BTreeFromArray() routines) and pasted the new code into the above post.

#17 ohgreat

ohgreat

    Seeker

  • Active Members
  • 15 posts

Posted 05 October 2005 - 03:27 AM

Thanks for the neat code, was about to write something similar myself.

I've merged your original version plus all changes to date into the attached file below to save hassle for others.

Cheers ;)

Attached Files






0 user(s) are reading this topic

0 members, 0 guests, 0 anonymous users