Jump to content

Hash Table UDF


mrider
 Share

Recommended Posts

Attached is a hash table UDF - and my unit tests. Both are a bit too long to include in the post inline...

Many, many thanks to ezzetabi for the timely sorting post - without which I would have used recursion for my QuickSort.

The Hash UDF has the following functions:

_HashF_KillAll() Destroys the hash and releases the memory it consumes. Returns nothing.

_HashF_NewInstance() Creates a new hash holder. Returns a pointer to the new hash created. Use the pointer for all methods which call for "$ptr". $ptr will always be zero if only one hash exists, and so can be ignored in those circumstances.

_HashF_Put($key, $val[, $ptr]) Puts a key->val pair in the hash. Overwrites the current pair if it exists.

_HashF_Get($key[, $ptr]) Gets the value associated with the key passed. $ptr refers to a hash mapping instance.

_HashF_RemoveHash($ptr) Removes a hash mapping instance from the storage array. Silently fails if $ptr is not valid.

_HashF_RemovePair($key[, $ptr]) Removes a key->value mapping from the hash. Silently fails if key or pointer does not exist.

_HashF_Grow([$adder]) Grows the hash storage space by the amount supplied in $adder - or by the default amount. See _HashF_DefaultResizeAmt. This function is called internally, and is not normally needed, but can save a bit of time if called before adding a large number of items.

_HashF_Shrink([$shrinkType]) Shrinks the hash. Useful if a number of key->value pairs have been removed. Valid Shrink Types: 0 - (default) Shrinks the hash to minimum size plus the default size adder. 1 - Shrinks the hash to the absolute minimum size.

_HashF_DefaultResizeAmt($amt) Sets the default number of slots that will be added each time the hash grows. This is useful if items are added in a known quantity. For example, set this to ten if items are always added in groups of ten.

_HashF_KeyPairCount([$ptr]) Returns the number of key->value pairs in the hash pointed to by pointer.

_HashF_Keys([$ptr]) Returns an array of the keys in the hash pointed to by pointer. Element zero contains the number of elements in the array.

_HashF_Values([$ptr]) Returns an array of the values in the hash pointed to by pointer. Element zero contains the number of elements in the array.

_HashF_Entries([$ptr]) Returns a two-dimensional array of the keys and values in the hash. The first two values both contain the number of key->value pairs. The first array dimension will be sized to hold all elements, the second dimension has two elements - the first is the key, the second is the value. (The code comments has an excellent example)

_HashF_ValidatePtr($ptr) Ensures the pointer passed points to an actual hash.

_HashF_SortKeysAsc([$ptr]) Sorts the hash by key in ascending order.

_HashF_SortKeysDesc([$ptr]) Sorts the hash by key in descending order.

_HashF_SortValsAsc([$ptr]) Sorts the hash by value in ascending order.

_HashF_SortValsDesc([$ptr]) Sorts the hash by value in descending order.

How's my riding? Dial 1-800-Wait-There

Trying to use a computer with McAfee installed is like trying to read a book at a rock concert.

Link to comment
Share on other sites

I am too lazy to seek in your code 0_o ...

How did you managed collisions? And what hash functions you used (I am surprised you found a good one for a generic type like autoit variants without reinterpet casting...)?

Link to comment
Share on other sites

When all else fails, there's always brute force...

I create a three dimensional array:

The first dimension resolves to a hash instance - and starts out with one element. I store and return an identity integer for each hash instance. The identity number corresponds to the index when no hash instances have been deleted. When a hash instance is deleted, the last hash instance is copied to the hash instance to be deleted (unless it's the last instance), then the array is redimmed to be one element smaller.

The second dimension resolves to the last key-value pair in the hash in question (almost certainly there're a few empty slots - sometimes there are LOTS).

The third dimension has two elements: element 0 is the key, element 1 is the value.

When adding a key->value pair, I first search through the hash identified by the pointer passed using a simple == comparison. If the key is found, I update the value entry. If the key is not found, I add the key-value pair to the end of the array, and then add one to the "last entry" pointer. If adding an element would exceed the boundaries of the array, the whole thing is redimmed to be a bit larger.

Example array values:

[0][0][0] = number of elements in this hash

[0][0][1] = identity number (probably == 0)

Continues to

[n][0][0] = number of elements in this hash

[n][0][1] = identity number (probably == n)

[x][1][0] = first key in hash x

[x][1][1] = first value in hash x

Continues to

[x][n][0] = key n in hash x

[x][n][1] = value n in hash x

And

[EDIT]

[x][0][0] == n

[x][0][1] == identity number

Finding a hash instance by pointer means searching down through the array until [?][0][1] == pointer passed. From then on, the index number (?) is used to reference the actual array.

It's ugly, but it mostly works. It can waste a lot of space - and it'll probably fail if the key is something vague like a floating point number.

As an improvement, I'd like to use a series of anonymous arrays - where a two-dimensional hash array is dimmed inside a function. I would store a global array that holds references to those arrays. Originally I thought I wouldn't be able to resize the anonymous arrays - but I've since figured out that I could copy an anonymous array to a known array, redim the known array, then copy the reference back to the global array. This would have the advantage that each hash could be a different size. I need to test this in AutoIt though. I'm not sure if the memory would be released with my methodology.

I don't really know how to get past the vague key references.

Edited by mrider

How's my riding? Dial 1-800-Wait-There

Trying to use a computer with McAfee installed is like trying to read a book at a rock concert.

Link to comment
Share on other sites

I'm shocked that this topic hasn't had more replies. I read it, and thought, "I don't need it, but its cool." more than twice. I guess there aren't that many people used to programming in Perl here. I think its a great idea, but by the time I need it, I'll probably have already done it in another language.

601DisengageEnd Program

Link to comment
Share on other sites

#1 I would have loved it about a month ago. As it was, I needed this back then, so I wrote a whole set of hash functions in C++ STL and DLLCall() them all.

#2 I think most AutoIt scripters aren't classically trained programmers, and the idea of a hash is not intutive to them.

#3 It still great stuff!! Might use it in the future!

Link to comment
Share on other sites

Ooops #4, maybe you could add a pair of functions to load/save the hash to a file? I think that's the only thing my stuff did yours doesn't.

That's a capital idea! Sure, why not. Give me a day or two. I'd think that something like a Java Properties file would work nicely.

I'm shocked that this topic hasn't had more replies.

What makes me laugh is that my unit tests have been downloaded more frequently than the UDF.

I guess there aren't that many people used to programming in Perl here.

I suppose not. The funny thing is, I'm trying to NOT use Perl in my current project. See my original post and my follow up post for more info.

How's my riding? Dial 1-800-Wait-There

Trying to use a computer with McAfee installed is like trying to read a book at a rock concert.

Link to comment
Share on other sites

I think you are misinterpreting what I'm saying - and I'm definitely not trying to be insulting.

The point is that Perl is big on hashes, and those that use Perl regularly miss that functionality when using a language without that functionality.

The fact that only a few people have even looked at my hash UDF - and that NOBODY responded to my first post - and that I only got one response to my follow up post, suggests that not many AutoIt programmers are Perl programmers.

I personally use Perl more than AutoIt. However, I've had troubles with the freebie compilers when using the Win32 library - and my target machines don't have the Perl environment. I need AutoIt to control some functions associated with my latest project, and I don't want the hassles of Win32...

Edited by mrider

How's my riding? Dial 1-800-Wait-There

Trying to use a computer with McAfee installed is like trying to read a book at a rock concert.

Link to comment
Share on other sites

didn't misinterpret and not insulted, I also use Perl more than AutoIt, and you are correct Hashes are used alot in Perl, Use alot myself.

Fortunatly my company pays for komodo, dev kit etc...

so I don't have to worry about what is on the end-user's system.

SciTE for AutoItDirections for Submitting Standard UDFs

 

Don't argue with an idiot; people watching may not be able to tell the difference.

 

Link to comment
Share on other sites

  • 2 weeks later...

I'm writing a UDF library for dealing with 2-dimensional arrays that I am calling "tables". I'm trying to understand your hases library. I've never used perl (or, I think I have, but obviously whatever I was doing didn't require hashes).

Anyway, I don't get why you need 3 dimensions? Is it that in a hash such as [a][c], a is the hash table number and [c] is the actual table? So [0][1][0] would be unique in [0][x][0], but could be duplicated in [1][x][0]. Am I understanding right? If that's the case, then I really don't understand - that means all your tables would have to be the same size?

Well, anyway, the routines I'm writing would be suited to andling the output of the IniReadSection function, or, if I understand correctly, your _HashF_Entries function.

Link to comment
Share on other sites

I would like to understand more about "Hash Tables" and Hashing. I have just read a definition online, and it sounds to me it is basically a small database. Tell me if I am wrong in that.

I came upon the Hash Table posts when I wanted to just encrypt a string. I had heard of Hash Tables before and such. I just want to be sure I understand them properly.

JS

AutoIt Links

File-String Hash Plugin Updated! 04-02-2008 Plugins have been discontinued. I just found out.

ComputerGetInfo UDF's Updated! 11-23-2006

External Links

Vortex Revolutions Engineer / Inventor (Web, Desktop, and Mobile Applications, Hardware Gizmos, Consulting, and more)

Link to comment
Share on other sites

It would be *really* nice if hash tables were a built-in part of the AutoIt language. I have written these is C++ before, using double-hashing. It's been so long ago that I don't remember which hash function I used.

Would it be possible to incorporate the VB Scripting.Dictionary to do this?

-John

Link to comment
Share on other sites

Be nice if we had Stacks and arraylists too :(

edit - confused hashtables with stacks for a second

Edited by Insolence
"I thoroughly disapprove of duels. If a man should challenge me, I would take him kindly and forgivingly by the hand and lead him to a quiet place and kill him." - Mark TwainPatient: "It hurts when I do $var_"Doctor: "Don't do $var_" - Lar.
Link to comment
Share on other sites

Be nice if we had Stacks and arraylists too :(

edit - confused hashtables with stacks for a second

<{POST_SNAPBACK}>

Could you explain the difference between all three things you are wanting... and how a hash table differs from a database?

Thanks,

JS

AutoIt Links

File-String Hash Plugin Updated! 04-02-2008 Plugins have been discontinued. I just found out.

ComputerGetInfo UDF's Updated! 11-23-2006

External Links

Vortex Revolutions Engineer / Inventor (Web, Desktop, and Mobile Applications, Hardware Gizmos, Consulting, and more)

Link to comment
Share on other sites

I would love to see hash tables built into the language -- not a UDF.

So you could have code like this:

$myvar["Version"]="3.1.1"
$myvar[311]="vers."

Ideally, you could have any data type for the key and for the value. Then you could iterate over the collection like this:

for $key in $myvar
   MsgBox(0,"key,value", $key, $myvar[$key])
endfor

Since I have not looked at the AutoIt source code, I don't know if it uses STL or not. If it did, maybe it could use the HashMap container.

What do you think?

-John

Edited by jftuga
Link to comment
Share on other sites

I would love to see hash tables built into the language -- not a UDF.

So you could have code like this:

$myvar["Version"]="3.1.1"
$myvar[311]="vers."

Ideally, you could have any data type for the key and for the value.  Then you could iterate over the collection like this:

for $key in $myvar
   MsgBox(0,"key,value", $key, $myvar[$key])
endfor

Since I have not looked at the AutoIt source code, I don't know if it uses STL or not.  If it did, maybe it could use the HashMap container.

What do you think?

-John

<{POST_SNAPBACK}>

It doesn't use STL.
Link to comment
Share on other sites

I'm not entirely sure about any of this

A Hashtable is kind of like an array, with a lookup time of O(1). At least in C# it looks something like this:

Hashtable Table = new Hashtable();
Table.Add("key","value");
System.Windows.Forms.MessageBox.Show( Table["key"] ); //Would spit out 'value'

A Stack is a First In First Out collection, example:

Stack myStack = new Stack();
myStack.Push("value");
System.Windows.Forms.MessageBox.Show( myStack.Pop ); //Would spit out 'value'

An ArrayList is an array that doesn't have a defined 'end', so you can always add stuff to it:

ArrayList Array = new ArrayList();
Array.Add ("value");
foreach ( string value in Array) 
System.Windows.Forms.MessageBox.Show( value  ); //Would spit out 'value'

Wrote these all off the top of my head before going to work, don't be surprised if you see errors =\

"I thoroughly disapprove of duels. If a man should challenge me, I would take him kindly and forgivingly by the hand and lead him to a quiet place and kill him." - Mark TwainPatient: "It hurts when I do $var_"Doctor: "Don't do $var_" - Lar.
Link to comment
Share on other sites

A Stack is a First In First Out collection, example:

Correction, a stack is a last in first out container. The last element added to the container is the first element popped off. The very first element added to a stack will be the very last element to be popped off.

Also, a stack does not always return the value being popped. Pop() is supposed to remove the last element from the stack. In some situations, yes, it would be perfectly safe to return the value, but in other circumstances, it wouldn't be. The convention for this is probably language dependent, though.

Link to comment
Share on other sites

Ah yes my bad, I was writing that as fast as I could :( Thanks for checking those Valik.

Also, a stack does not always return the value being popped. Pop() is supposed to remove the last element from the stack. In some situations, yes, it would be perfectly safe to return the value, but in other circumstances, it wouldn't be. The convention for this is probably language dependent, though.

I think in C# it's pretty safe. You can use <stack>.Peek() to look at the top value without popping it, if that's what you mean.
"I thoroughly disapprove of duels. If a man should challenge me, I would take him kindly and forgivingly by the hand and lead him to a quiet place and kill him." - Mark TwainPatient: "It hurts when I do $var_"Doctor: "Don't do $var_" - Lar.
Link to comment
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
 Share

  • Recently Browsing   0 members

    • No registered users viewing this page.
×
×
  • Create New...