Jump to content

HashTable udf


eltorro
 Share

Recommended Posts

Here's a hash table udf in pure autoit, no COM object or dlls. It is not blazingly fast, but should be good enough to be useful.

It should be noted that Gary wrote a script which uses the scripting Dictionary object and can be found here.

Below the udf is a modified version of Gary's test which can be used as comparison. It simply shows that both scripts produce the same output.

If you have a situation where you can't or won't use the scripting Dictionary object, this may be your answer.

I believe that there is another similar script, but I did not find it. I must be losing my "search-foo" :)

Hash Table UDF:

[autoit] ;_HashTable.au3

; ====================================================================================================

===========================

; Here is a hash table/ dictionary in AutoIt. This was coded as of an exercise more than anything else.

; Gary posted a script some time ago that uses the scripiting dictionary object and it works fine (as always).

; See here: (http://www.autoitscript.com/forum/index.php?showtopic=47048)

; I just wanted to do it in pure AutoIt.;)

; I have tested it with the King James version of the Bible using the book chapter#:verse# as the key and the verse text as the value.

; Lookups are lightning quick, however, adding 32k line items to the table is not as fast as the scripting dictionary object.

; For a static object where there is just retrieval of data and no additions or changes, eg. indexing a file,

; Generate and write the hash table out to a file and then load the hash table from the file at program start.

; This could be quicker than hashing each time.

; ====================================================================================================

===========================

#Include-once

Global $b_HT_Init = 0

Local $i_HT_TableSize = 8 * 1024; 8k hash table size. I have tested this table size with 32k keys.

Local $i_HT_ItemCount = 0

Local $i_HT_ArrayCount = 0

Local $a_HT_HashTable[1] ; Holds the indexes associated with the hash value.

Local $a_HT_TableKeys[1] ; keeping keys and values separate makes it easy to return all the values as an array.

Local $a_HT_TableVals[1]

Local $sz_HT_Deleted = "" ; holds the indexes of deleted items unless all items are deleted.

; ====================================================================================================

===========================

; Hash Table functions

; ====================================================================================================

===========================

; #FUNCTION# ====================================================================================================

=================

; Description ...: Initializes the hash table settings

; Parameters ....: $i_HT_InitItems - Estimated number of items that will be stored.

; Return values .: None

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HashTableInit($i_HT_InitItems = 1000)

If $b_HT_Init = 1 Then

Dim $a_HT_HashTable[1]

Dim $a_HT_TableKeys[1]

Dim $a_HT_TableVals[1]

EndIf

$i_HT_ItemCount = 0

ReDim $a_HT_HashTable[$i_HT_TableSize + 1]

ReDim $a_HT_TableKeys[$i_HT_InitItems]

ReDim $a_HT_TableVals[$i_HT_InitItems]

For $x = 0 To UBound($a_HT_HashTable) - 1

$a_HT_HashTable[$x] = -1

Next

$b_HT_Init = 1

EndFunc ;==>_HashTableInit

; #FUNCTION# ====================================================================================================

=================

; Description ...: _HT_AddItem

; Parameters ....: $szKey - IN - The key to add to the table

; $szValue - IN - The value of the key

; Return values .: Success - The index of item added

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_AddItem($szKey, $szValue)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $hash, $iFound, $iNewIndex

$hash = _HT_GetHash($szKey)

$iFound = _HT_FindItem($szKey)

If $iFound = -1 Then

$iNewIndex = _HT_UpdateCollection($szKey, $szValue)

_HT_UpdateHashTable($hash, $iNewIndex)

Return $iNewIndex

EndIf

Return SetError(1, 0, 0)

EndFunc ;==>_HT_AddItem

; #FUNCTION# ====================================================================================================

=================

; Description ...: Updates the value for a given key

; Parameters ....: $szKey - IN - The key to change the value for

; $szValue - IN - The new value

; Return values .: Success - The index of the updated item

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_UpdateItem($szKey, $szValue)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $hash, $iFound

$hash = _HT_GetHash($szKey)

$iFound = _HT_FindItem($szKey)

If $iFound = -1 Then

$iFound = _HT_UpdateCollection($szKey, $szValue)

_HT_UpdateHashTable($hash, $iFound)

Else

$a_HT_TableVals[$iFound] = $szValue

EndIf

Return $iFound

EndFunc ;==>_HT_UpdateItem

; #FUNCTION# ====================================================================================================

=================

; Description ...: Get the value for the specified key

; Parameters ....: $szKey - IN - The key to get the value of

; Return values .: Success - The value associated with the key

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_GetItem($szKey)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $iFound = _HT_FindItem($szKey)

If $iFound > -1 Then

Return $a_HT_TableVals[$iFound]

EndIf

Return SetError(1, 0, 0)

EndFunc ;==>_HT_GetItem

; #FUNCTION# ====================================================================================================

=================

; Description ...: Returns the index of the specified key

; Parameters ....: $szKey - IN - The key to find

; Return values .: Success - The index of the key

; Failure - @error is set to 1 and -1 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_FindItem($szKey)

Local $hash, $iFound, $szIndexes, $aIndexes

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

$hash = _HT_GetHash($szKey)

$iFound = -1

If $a_HT_HashTable[$hash] <> -1 Then

$szIndexes = $a_HT_HashTable[$hash]

$aIndexes = _SplitKey($szIndexes, "|")

If IsArray($aIndexes) Then

For $i In $aIndexes

If $i <> "" Then

If $a_HT_TableKeys[$i] = $szKey Then

$iFound = $i

ExitLoop

EndIf

EndIf

Next

EndIf

EndIf

If $iFound = -1 Then Return SetError(1, 0, -1)

Return $iFound

EndFunc ;==>_HT_FindItem

; #FUNCTION# ====================================================================================================

=================

; Description ...: Returns an array of values

; Parameters ....: None

; Return values .: Success - An array of values

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_ToArray()

If $b_HT_Init <> 1 Then Return SetError(1, 0, 0)

;If $i_HT_ItemCount < 1 then Return SetError(1,0,0)

Local $aTemp[$i_HT_ItemCount]

Local $iCount = 0

For $i In $a_HT_TableKeys

If $i <> "" Then

If $iCount + 1 <= $i_HT_ItemCount Then

$aTemp[$iCount] = _HT_GetItem($i)

$iCount += 1

EndIf

EndIf

Next

ReDim $aTemp[$iCount]

Return $aTemp

EndFunc ;==>_HT_ToArray

; #FUNCTION# ====================================================================================================

=================

; Description ...: Returns the number of items in the table

; Parameters ....: None

; Return values .: Success - The number of items in the table

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_GetCount()

If $b_HT_Init = 1 Then

Return $i_HT_ItemCount

;Return $a_HT_TableKeys[0]

EndIf

Return SetError(1, 0, 0)

EndFunc ;==>_HT_GetCount

; #FUNCTION# ====================================================================================================

=================

; Description ...: Clears the table

; Parameters ....: None

; Return values .: None

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_RemoveAll()

Return _HashTableInit()

EndFunc ;==>_HT_RemoveAll

; #FUNCTION# ====================================================================================================

=================

; Description ...: Removes an item from the table

; Parameters ....: $szKey - IN - The key of the item to remove

; Return values .: Success - The index of the item removed

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_RemoveItem($szKey)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $hash, $iFound, $szIndexes, $szNewIndexes, $aIndexes

$hash = _HT_GetHash($szKey)

$iFound = _HT_FindItem($szKey)

If $iFound < 0 Then Return SetError(1, 0, 0)

$a_HT_TableKeys[$iFound] = ""

$a_HT_TableVals[$iFound] = ""

$i_HT_ItemCount -= 1

$sz_HT_Deleted &= $iFound & "|"

_HT_UpdateHashTable($hash, $iFound, 1)

Return $iFound

EndFunc ;==>_HT_RemoveItem

; #FUNCTION# ====================================================================================================

=================

; Description ...: Changes a key in the table leaving the value the same

; Parameters ....: $szKey - IN - The old key

; $szNewKey - IN - The new key

; Return values .: Success - 1

; Failure - @error is set to 1 and 0 is returned

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......:

; Related .......:

; ====================================================================================================

============================

Func _HT_ChangeKey($szKey, $szNewKey)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $iFound, $szValue, $hash, $hashNew

If _HT_FindItem($szNewKey) >= 0 Then Return SetError(1, 0, 0)

$iFound = _HT_FindItem($szKey)

If $iFound < 0 Then Return SetError(1, 0, 0)

$hash = _HT_GetHash($szKey)

$hashNew = _HT_GetHash($szNewKey)

_HT_UpdateHashTable($hash, $iFound, 1); remove old index from table

$a_HT_TableKeys[$iFound] = $szNewKey

_HT_UpdateHashTable($hashNew, $iFound); add new index to table

Return 1

EndFunc ;==>_HT_ChangeKey

; ====================================================================================================

===========================

; _HT_UpdateCollection: Internal function.

; Updates the arrays used for the key and value tables.

; ====================================================================================================

===========================

Func _HT_UpdateCollection($szKey, $szValue)

If $b_HT_Init = 0 Then Return SetError(1, 0, 0)

Local $hash, $iFound, $szDeleted, $aIndexes, $i

If $sz_HT_Deleted <> "" Then

$aIndexes = _SplitKey($sz_HT_Deleted, "|")

If IsArray($aIndexes) Then

For $i In $aIndexes

If $i < UBound($a_HT_TableKeys) Then

$a_HT_TableKeys[$i] = $szKey

$a_HT_TableVals[$i] = $szValue

$i_HT_ItemCount += 1

$szDeleted = ""

For $j In $aIndexes

If $i <> $j And $j <> "" Then

$szDeleted &= $j & "|"

EndIf

Next

$sz_HT_Deleted = $szDeleted

Return $i

EndIf

Next

Else

;hash table error, deleted list is corrupt.

Return SetError(1, 0, 0)

EndIf

EndIf

;no deleted items to find.

If $i_HT_ArrayCount + 1 >= UBound($a_HT_TableKeys) Then

ReDim $a_HT_TableKeys[uBound($a_HT_TableKeys) + 1000]

ReDim $a_HT_TableVals[uBound($a_HT_TableKeys) + 1000]

EndIf

$i_HT_ArrayCount += 1

$i_HT_ItemCount += 1

$a_HT_TableKeys[$i_HT_ArrayCount] = $szKey

$a_HT_TableVals[$i_HT_ArrayCount] = $szValue

Return $i_HT_ArrayCount

EndFunc ;==>_HT_UpdateCollection

; ====================================================================================================

===========================

; _HT_UpdateHashTable: Internal Function.

; Updates the index entries in the hash table array.

; ====================================================================================================

===========================

Func _HT_UpdateHashTable($hash, $iIndex, $iAction = 0)

Local $szIndexes = ""

If $iAction = 1 Then

;remove index for hash table

If $a_HT_HashTable[$hash] = -1 Then Return SetError(1, 0, 0)

$aIndexes = _SplitKey($a_HT_HashTable[$hash])

If @error Then Return SetError(1, 0, 0)

For $i In $aIndexes

If $i <> $iIndex And $i <> "" Then

$szIndexes &= $i & "|"

EndIf

Next

If $szIndexes = "" Then $szIndexes = -1

$a_HT_HashTable[$hash] = $szIndexes

EndIf

;add index to hash table

$szIndexes = $a_HT_HashTable[$hash]

If $a_HT_HashTable[$hash] = -1 Then $szIndexes = ""

$szIndexes &= $iIndex & "|"

$a_HT_HashTable[$hash] = $szIndexes

Return 1

EndFunc ;==>_HT_UpdateHashTable

; ====================================================================================================

===========================

; Hash functions.

; ====================================================================================================

===========================

; #FUNCTION# ====================================================================================================

=================

; Description ...: Get a hash value

; Parameters ....: $szKey - IN - The key to get a hash value for

; $iHashFunc - IN - The choice of has function.

; Return values .: Success - a hash value

; Author ........: Stephen Podhajecki {gehossafats at netmdc. com}

; Remarks .......: There are several one-way hash functions to choose from. Performance varies by the type of data being hashed.

; My test data was the King James Bible which contained 31102 keys. The results below are based on an 8k hash

; table. The hash that produced the least amount of empties is presumed to have the best distribution of keys.

; The second best distribution was choosen over the first based on speed.

; Related .......:

; ====================================================================================================

============================

Func _HT_GetHash($szKey,$iHashFunc = 0)

Local $hash = 0

Switch $iHashFunc

Case 1

$hash = __JBHash($szKey) ;481 empty table slots. fastest.

Case 2

$hash = __JSHash($szKey); 231 empties

Case 3

$hash = __FNV1aHash($szKey); 417 empties

Case 4

;this is the slowest hash.

$hash = __LKP3Hash($szKey);157 empty table slots. supposed to have a better distribution

Case Else

;not the slowest or fastest, but has the second best distribution amoung the hash function.

$hash = __JOAATHash($szKey); 186 empties

EndSwitch

$hash = Mod(Abs($hash), $i_HT_TableSize)

;ConsoleWrite($hash&@lf)

Return $hash

EndFunc ;==>_HT_GetHash

; ====================================================================================================

===========================

; __JBHash:

; ====================================================================================================

===========================

Func __JBHash($key)

Local $hash = 0

Local $j = 0

For $i In _SplitKey($key)

$j = BitXOR($j, 1)

If ($j) Then

$hash *= Asc($i)

Else

$hash += Asc($i)

EndIf

$hash += BitShift(BitAND($hash, 0xFFFF0000), 16);

$hash += BitShift(BitAND($hash, 0x0000FF00), 8);

Next

Return $hash

EndFunc ;==>__JBHash

; ====================================================================================================

===========================

; __JSHash: Justin Sobel hash algorithm

; ====================================================================================================

===========================

Func __JSHash($key)

Local $hash = 0x4e67c6a7

For $i In _SplitKey($key)

$hash = BitXOR($hash, _UBitShift($hash, -5) + Asc($i) + _UBitShift($hash, 2))

Next

Return $hash

EndFunc ;==>__JSHash

; ====================================================================================================

===========================

; __FNV1aHash: Fowler, Noll, and Vo hash. http://isthe.com/chongo/tech/comp/fnv/

; ====================================================================================================

===========================

Func __FNV1aHash($key)

If $key = "" Then Return SetError(1, 0, 0)

Local $hash = 2166136261;32bit offset basis

For $i In _SplitKey($key)

$hash = BitXOR($hash, Asc($i))

;$hash = $hash * 16777619; 32bit FNV_Prime

$hash += _UBitShift($hash, -1) + _UBitShift($hash, -4) + _UBitShift($hash, -7)

$hash += _UBitShift($hash, -8) + _UBitShift($hash, -24);

Next

Return $hash

EndFunc ;==>__FNV1aHash

; ====================================================================================================

===========================

; __LKP3Hash: Bob Jenkins lookup3 hash. http://burtleburtle.net/bob/c/lookup3.c

; adapted from http://www.vak.ru/doku.php/proj/hash/sources

; ====================================================================================================

===========================

Func __LKP3Hash($key, $c = 0)

Local $a, $b, $d, $iLen, $k

$iLen = StringLen($key)

$k = _SplitKey($key)

$a = 0x9e3779b9;the golden rule

$b = $a

$d = 0

While ($iLen > 3)

$a += Asc($k[$d])

$b += Asc($k[$d + 1])

$c += Asc($k[$d + 2])

__HASH_JEN_MIX($a, $b, $c)

$iLen -= 3;

$d += 3;

WEnd

Switch ($iLen)

Case 3

$c += Asc($k[$d + 2])

ContinueCase

Case 2

$b += Asc($k[$d + 1])

ContinueCase

Case 1

$a += Asc($k[$d])

__HASH_JEN_FINAL($a, $b, $c)

ContinueCase

Case 0

$c += Asc($k[uBound($k) - 1])

;;;

EndSwitch

Return $c

EndFunc ;==>__LKP3Hash

;part of lookup3

Func __HASH_JEN_ROT($x, $k)

Return BitOR(_UBitShift($x, -$k), (_UBitShift($x, (32 - $k))))

EndFunc ;==>__HASH_JEN_ROT

;part of lookup3

Func __HASH_JEN_MIX(ByRef $a, ByRef $b, ByRef $c)

$a = BitXOR($a, __HASH_JEN_ROT($c, 4))

$c += $b

$b -= $a

$b = BitXOR($b, __HASH_JEN_ROT($a, 6))

$a += $c

$c -= $b

$c = BitXOR($c, __HASH_JEN_ROT($b, 8))

$b += $a

$a -= $c

$a = BitXOR($a, __HASH_JEN_ROT($c, 16))

$c += $b

$b -= $a

$b = BitXOR($b, __HASH_JEN_ROT($a, 19))

$a += $c

$c -= $b

$c = BitXOR($c, __HASH_JEN_ROT($b, 4))

$b += $a

EndFunc ;==>__HASH_JEN_MIX

; part of lookup3

Func __HASH_JEN_FINAL(ByRef $a, ByRef $b, ByRef $c)

$c = BitXOR($c, $B)

$c -= __HASH_JEN_ROT($b, 14)

$a = BitXOR($c, $a)

$a -= __HASH_JEN_ROT($c, 11)

$b = BitXOR($a, $B)

$b -= __HASH_JEN_ROT($a, 25)

$c = BitXOR($b, $c)

$c -= __HASH_JEN_ROT($b, 16)

$a = BitXOR($c, $a)

$a -= __HASH_JEN_ROT($c, 4)

$b = BitXOR($a, $B)

$b -= __HASH_JEN_ROT($a, 14)

$c = BitXOR($b, $c)

$c -= __HASH_JEN_ROT($b, 24)

EndFunc ;==>__HASH_JEN_FINAL

; ====================================================================================================

===========================

; __JOAATHash: Jenkins One at a time hash http://en.wikipedia.org/wiki/Hash_table, http://www.burtleburtle.net/bob/hash/doobs.html

; ====================================================================================================

===========================

Func __JOAATHash($key)

Local $hash = 0;

For $i In _SplitKey($key)

$hash += ASC($i)

$hash += _UBitShift($hash, -10)

$hash = BitXOR($hash,_UBitShift($hash, 6))

Next

$hash += _UBitShift($hash, -3)

$hash = BitXOR($hash,_UBitShift($hash, 11))

$hash += _UBitShift($hash, -15)

Return $hash

EndFunc

; ====================================================================================================

===========================

; Helper functions

; ====================================================================================================

===========================

Func _UBitShift($value, $shift)

;Valik -- #192015

; Check for the sign bit.

Local $bSignBit

If BitAND($value, 0x80000000) Then

; Sign bit found, unset it.

$value = BitXOR($value, 0x80000000)

$bSignBit = True

EndIf

; Do a signed shift with the sign bit unset.

$value = BitShift($value, $shift)

; Check to see if the former sign bit needs set.

If $shift > 0 And $shift < 32 And $bSignBit Then $value = BitOR($value, 2 ^ (31 - $shift))

Return $value

EndFunc ;==>_UBitShift

; ====================================================================================================

===========================

; _SplitKey: Internal function

; splits a string into an array of characters and strip the count value from the first element.

; ====================================================================================================

===========================

Func _SplitKey($szKey, $szDelim = "")

If $szKey = "" Then Return SetError(1, 0, 0)

If IsArray($szKey) Then Return $szKey

Local $iCount, $szTemp = ""

Local $aTemp = StringSplit($szKey, $szDelim)

If Not @error Then

Local $iCount = $aTemp[0], $iTotal = 0

For $x = 1 To $iCount

$iTotal += 1

$aTemp[$x - 1] = $aTemp[$x]

Next

ReDim $aTemp[$iTotal]

Return $aTemp

EndIf

Return SetError(1, 0, 0)

EndFunc ;==>_SplitKey

Edited by GaryFrost
Link to comment
Share on other sites

where is the example script?????????????

:)

I had a problem with the autoit code tags, it would post empty so I had to switch it to code tags instead.

Thanks Gary for fixing the tags.

Edited by eltorro
Link to comment
Share on other sites

@Eltorro

Great UDF !!!

I used the scripting Dict Object in the Duplicate File Finder

So I could replace it with yours. But I also use the MD5 COM object for creating a MD5 checksum which is also a hash key.

I think the name of your UDF is not so good choosen in combination with other Hash functions.

Maybe concsider to change it to something like "DictionaryTable" or something.

Anyway great job.

regards

ptrex

Link to comment
Share on other sites

@Eltorro

Great UDF !!!

I used the scripting Dict Object in the Duplicate File Finder

So I could replace it with yours. But I also use the MD5 COM object for creating a MD5 checksum which is also a hash key.

I think the name of your UDF is not so good choosen in combination with other Hash functions.

Maybe concsider to change it to something like "DictionaryTable" or something.

Anyway great job.

regards

ptrex

Thanks for the thumbs up.

I guess people could save it as whatever they want. If you suspect that the function names might cause a conflict, I'll check it out. A simple find and replace should fix it either way.

Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...