Unions in struct definitions - AutoIt UDF's


Mat

What is the correct way to deal with unions in a struct definition?

e.g. http://msdn.microsoft.com/en-us/library/ms682013(v=VS.85).aspx

I am doing this at the moment...

; #STRUCTURE# ===================================================================================================================
; Name...........: $tagCHAR_INFO_W
; Description ...: Specifies a Unicode character and its attributes. This structure is used by console functions to read from
;                  and write to a console screen buffer.
; Fields ........: UnicodeChar          - Unicode character of a screen buffer character cell.
;                  Attributes           - The character attributes. This member can be zero or any combination of the following values.
;                                       |FOREGROUND_BLUE - Text color contains blue.
;                                       |FOREGROUND_GREEN - Text color contains green.
;                                       |FOREGROUND_RED - Text color contains red.
;                                       |FOREGROUND_INTENSITY - Text color is intensified.
;                                       |BACKGROUND_BLUE - Background color contains blue.
;                                       |BACKGROUND_GREEN - Background color contains green.
;                                       |BACKGROUND_RED - Background color contains red.
;                                       |BACKGROUND_INTENSITY - Background color is intensified.
;                                       |COMMON_LVB_LEADING_BYTE - Leading byte.
;                                       |COMMON_LVB_TRAILING_BYTE - Trailing byte.
;                                       |COMMON_LVB_GRID_HORIZONTAL - Top horizontal.
;                                       |COMMON_LVB_GRID_LVERTICAL - Left vertical.
;                                       |COMMON_LVB_GRID_RVERTICAL - Right vertical.
;                                       |COMMON_LVB_REVERSE_VIDEO - Reverse foreground and background attribute.
;                                       |COMMON_LVB_UNDERSCORE - Underscore.
; Author ........: Mat
; Remarks .......:
; ===============================================================================================================================

Global Const $tagCHAR_INFO_W = "WCHAR UnicodeChar; WORD Attributes"

; #STRUCTURE# ===================================================================================================================
; Name...........: $tagCHAR_INFO_A
; Description ...: Specifies an ASCII character and its attributes. This structure is used by console functions to read from
;                  and write to a console screen buffer.
; Fields ........: AsciiChar            - ASCII character of a screen buffer character cell.
;                  Attributes           - The character attributes. This member can be zero or any combination of the following values.
;                                       |FOREGROUND_BLUE - Text color contains blue.
;                                       |FOREGROUND_GREEN - Text color contains green.
;                                       |FOREGROUND_RED - Text color contains red.
;                                       |FOREGROUND_INTENSITY - Text color is intensified.
;                                       |BACKGROUND_BLUE - Background color contains blue.
;                                       |BACKGROUND_GREEN - Background color contains green.
;                                       |BACKGROUND_RED - Background color contains red.
;                                       |BACKGROUND_INTENSITY - Background color is intensified.
;                                       |COMMON_LVB_LEADING_BYTE - Leading byte.
;                                       |COMMON_LVB_TRAILING_BYTE - Trailing byte.
;                                       |COMMON_LVB_GRID_HORIZONTAL - Top horizontal.
;                                       |COMMON_LVB_GRID_LVERTICAL - Left vertical.
;                                       |COMMON_LVB_GRID_RVERTICAL - Right vertical.
;                                       |COMMON_LVB_REVERSE_VIDEO - Reverse foreground and background attribute.
;                                       |COMMON_LVB_UNDERSCORE - Underscore.
; Author ........: Mat
; Remarks .......:
; ===============================================================================================================================

Global Const $tagCHAR_INFO_A = "CHAR  AsciiChar; WORD Attributes"

; #STRUCTURE# ===================================================================================================================
; Name...........: $tagCHAR_INFO
; Description ...: Specifies a Unicode character and its attributes. This structure is used by console functions to read from
;                  and write to a console screen buffer.
; Fields ........: UnicodeChar          - Unicode character of a screen buffer character cell.
;                  Attributes           - The character attributes. This member can be zero or any combination of the following values.
;                                       |FOREGROUND_BLUE - Text color contains blue.
;                                       |FOREGROUND_GREEN - Text color contains green.
;                                       |FOREGROUND_RED - Text color contains red.
;                                       |FOREGROUND_INTENSITY - Text color is intensified.
;                                       |BACKGROUND_BLUE - Background color contains blue.
;                                       |BACKGROUND_GREEN - Background color contains green.
;                                       |BACKGROUND_RED - Background color contains red.
;                                       |BACKGROUND_INTENSITY - Background color is intensified.
;                                       |COMMON_LVB_LEADING_BYTE - Leading byte.
;                                       |COMMON_LVB_TRAILING_BYTE - Trailing byte.
;                                       |COMMON_LVB_GRID_HORIZONTAL - Top horizontal.
;                                       |COMMON_LVB_GRID_LVERTICAL - Left vertical.
;                                       |COMMON_LVB_GRID_RVERTICAL - Right vertical.
;                                       |COMMON_LVB_REVERSE_VIDEO - Reverse foreground and background attribute.
;                                       |COMMON_LVB_UNDERSCORE - Underscore.
; Author ........: Mat
; Remarks .......:
; ===============================================================================================================================

Global Const $tagCHAR_INFO = $tagCHAR_INFO_W

Mat

Edit: And should I default to unicode or ascii?

You cannot use $tagCHAR_INFO_A, only $tagCHAR_INFO_W, because a union takes the size of its largest member, which is WCHAR in this case.
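To make the size argument concrete, here is a minimal C sketch of a CHAR_INFO-like layout. The type `char_info_mirror` and the helper functions are hypothetical names invented for this illustration, and `uint16_t` stands in for WCHAR and WORD; it is not the Windows header definition, only a stand-in with the same member widths.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for CHAR_INFO: uint16_t plays the role of WCHAR and WORD. */
typedef struct {
    union {
        uint16_t UnicodeChar; /* WCHAR on Windows */
        char     AsciiChar;   /* CHAR */
    } Char;
    uint16_t Attributes;      /* WORD */
} char_info_mirror;

/* Size of the whole structure in bytes. */
size_t char_info_size(void) {
    return sizeof(char_info_mirror);
}

/* Byte offset of Attributes: the union occupies everything before it. */
size_t attributes_offset(void) {
    return offsetof(char_info_mirror, Attributes);
}

/* Returns 1 if both union members start at the same address as the union itself. */
int members_overlap(void) {
    return offsetof(char_info_mirror, Char.UnicodeChar) == offsetof(char_info_mirror, Char)
        && offsetof(char_info_mirror, Char.AsciiChar) == offsetof(char_info_mirror, Char);
}
```

Whichever member is used, the union takes two bytes and Attributes starts at byte 2, so any AutoIt tag for this structure has to describe four bytes in total.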

UDFS & Apps:

Spoiler

DDEML.au3 - DDE Client + Server
Localization.au3 - localize your scripts
TLI.au3 - type information on COM objects (TLBINF emulation)
TLBAutoEnum.au3 - auto-import of COM constants (enums)
AU3Automation - export AU3 scripts via COM interfaces
TypeLibInspector - OleView was yesterday

Coder's last words before final release: WE APOLOGIZE FOR INCONVENIENCE 

I thought that the idea was that whichever you assigned to would then become larger, and so be used. A more complex example is here. There the union involves structs as well.

Nope. A union is a typical C construct, and in C the size of a type is always known at compile time (polymorphic classes being a special case). So when several variables occupy the same space, as in a union, the compiler must reserve enough of it to hold the largest one.

See http://en.wikipedia.org/wiki/Union_(computer_science)#C.2FC.2B.2B
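A small C sketch of that point, under the assumption of a C11 compiler: the size is fixed when the code is compiled, and assigning to the smaller member does not change it. The function name `size_after_writing_small_member` is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

union a {
    int32_t i;
    int64_t i64;
};

/* Checked by the compiler, before the program ever runs. */
_Static_assert(sizeof(union a) == sizeof(int64_t),
               "a union is as large as its largest member");

/* Writes only the 32-bit member, then reports the size of the union object. */
size_t size_after_writing_small_member(void) {
    union a u;
    u.i = 1;          /* only four of the eight bytes are written */
    return sizeof u;  /* still eight: the size never depends on usage */
}
```

So the union does not grow or shrink depending on which member was assigned; the space for the largest member is always there.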

OK, that explains it well in terms of space, but that doesn't mean much here. I haven't had any experience using them, but from an outsider's point of view it looks like an easy way of letting devs choose whether to use char or wchar (in this case) without needing lots of different structures.

So how do I then apply the theory to a function such as: http://msdn.microsoft.com/en-us/library/ms684344(VS.85).aspx

It wants me to give an array of INPUT_RECORD structs. Each of those has a more complex union of yet more structs... MSDN doesn't explain it well, but I suppose I need an element for each of the possible INPUT_RECORD structures (5).
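As a rough way to see what each array element has to look like, the sketch below mirrors INPUT_RECORD and its five event records in portable C. All `*_mirror` type names and the helper functions are hypothetical; the member lists follow the MSDN descriptions, with fixed-width integers standing in for BOOL, WORD, DWORD, UINT and WCHAR. It only measures the layout and does not call any console function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirrors of the Windows console types, using fixed-width integers. */
typedef struct { int16_t X; int16_t Y; } coord_mirror;        /* COORD */

typedef struct {                                              /* KEY_EVENT_RECORD */
    int32_t  bKeyDown;                                        /* BOOL */
    uint16_t wRepeatCount;
    uint16_t wVirtualKeyCode;
    uint16_t wVirtualScanCode;
    union { uint16_t UnicodeChar; char AsciiChar; } uChar;    /* nested union */
    uint32_t dwControlKeyState;
} key_event_mirror;

typedef struct {                                              /* MOUSE_EVENT_RECORD */
    coord_mirror dwMousePosition;
    uint32_t     dwButtonState;
    uint32_t     dwControlKeyState;
    uint32_t     dwEventFlags;
} mouse_event_mirror;

typedef struct { coord_mirror dwSize; } window_buffer_size_mirror; /* WINDOW_BUFFER_SIZE_RECORD */
typedef struct { uint32_t dwCommandId; } menu_event_mirror;        /* MENU_EVENT_RECORD */
typedef struct { int32_t bSetFocus; } focus_event_mirror;          /* FOCUS_EVENT_RECORD */

typedef struct {                                              /* INPUT_RECORD */
    uint16_t EventType;              /* tells which union member is valid */
    union {
        key_event_mirror          KeyEvent;
        mouse_event_mirror        MouseEvent;
        window_buffer_size_mirror WindowBufferSizeEvent;
        menu_event_mirror         MenuEvent;
        focus_event_mirror        FocusEvent;
    } Event;
} input_record_mirror;

/* Size of one array element, in bytes. */
size_t input_record_size(void) { return sizeof(input_record_mirror); }

/* Offset at which every event record starts, whichever one is in use. */
size_t event_offset(void) { return offsetof(input_record_mirror, Event); }

/* Size of the union, i.e. of its largest member. */
size_t event_union_size(void) { return sizeof(((input_record_mirror *)0)->Event); }
```

Every element is the same size no matter which event it carries, so one possible approach in AutoIt is one tag per event type, each starting with EventType and padded out to the full element size. That is a suggestion based on this layout, not something the thread has tested.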

A union is the size of its largest member. The following union is 64 bits:

union a
{
    int i;
    __int64 i64;
};

If you are using a union inside a structure then you must ensure you use a combination of appropriate types to produce the correct length. For example, take the following C structure:

struct has_union
{
    int before;
    union   /* anonymous, so i and i64 are members of has_union */
    {
        int i;
        __int64 i64;
    };
    int after;
};

The size of that structure is 128 bits if you ignore alignment issues, which are a whole new can of worms.
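To see what alignment does to that figure, a C compiler can be asked directly. This is a sketch assuming a C11 compiler (for the anonymous union), with `int32_t`/`int64_t` standing in for `int`/`__int64`; the helper function names are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>

struct has_union {
    int32_t before;
    union {              /* anonymous union, as in the structure above */
        int32_t i;
        int64_t i64;
    };
    int32_t after;
};

/* The figure you get by simply adding up the members: 4 + 8 + 4 bytes. */
size_t unpadded_size(void) {
    return sizeof(int32_t) + sizeof(int64_t) + sizeof(int32_t);
}

/* What the compiler actually allocates, padding included. */
size_t actual_size(void) { return sizeof(struct has_union); }

/* Where the union starts; i and i64 both live at this offset. */
size_t union_offset(void) { return offsetof(struct has_union, i64); }

/* Where the member following the union starts. */
size_t after_offset(void) { return offsetof(struct has_union, after); }
```

On compilers that align a 64-bit integer on 8 bytes (as far as I know that includes MSVC by default), the union moves to offset 8 and the whole structure grows to 24 bytes, i.e. 192 bits rather than 128. Where 64-bit integers are only 4-byte aligned, the unpadded figure holds.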

To represent that structure in AutoIt you have to know in advance whether you want to access i or i64 from the union. If you want to access i64 then it's really easy, you just do this:

Local $struct = DllStructCreate("int before; int64 i64; int after;")

If you want to access i, however, things get really really complicated really really fast. At the very least you would need to write the structure like this:

Local $struct = DllStructCreate("int before; int i; int padding; int after;")

Notice the addition of the "int padding". You must pad the length of the structure out so that enough space is taken up for the largest member of the union.

But that's not enough. That code will likely work just fine on 32-bit Windows but is probably going to fail on 64-bit Windows. A 64-bit union can be aligned differently than two separate 32-bit integers. You must ensure that the "int i" and "int padding" part of the structure is packed into 64 bits and not into 128 bits due to alignment. You can use the "align" feature of DllStructCreate() inline to change alignment on the fly to solve this very issue.
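A C analogue of the two declarations can show whether they really describe the same memory. This is only a sketch: the struct names and `layouts_match` are hypothetical, and it assumes AutoIt lays out a tag string with the same natural-alignment rules as the C compiler, which is not verified here.

```c
#include <stddef.h>
#include <stdint.h>

/* Analogue of "int before; int64 i64; int after;" */
struct with_i64 {
    int32_t before;
    int64_t i64;
    int32_t after;
};

/* Analogue of "int before; int i; int padding; int after;" */
struct with_two_ints {
    int32_t before;
    int32_t i;
    int32_t padding;
    int32_t after;
};

/* Returns 1 only if both declarations put every field at the same place. */
int layouts_match(void) {
    return sizeof(struct with_i64) == sizeof(struct with_two_ints)
        && offsetof(struct with_i64, i64) == offsetof(struct with_two_ints, i)
        && offsetof(struct with_i64, after) == offsetof(struct with_two_ints, after);
}
```

Where a 64-bit integer is aligned on 8 bytes, the first struct has a 4-byte gap after `before` that the second one lacks, so the two do not match and an extra padding field (or an explicit alignment setting) would be needed to line them up.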

If you're confused, you should be, it's confusing as hell. It requires intricate knowledge of things not even many C programmers will know or care about. I know that I very rarely - if ever - care about alignment and padding when I write C++ but I must know and care when using AutoIt to work with this sort of stuff because I have to tell AutoIt exactly how it needs to lay out the memory.

I strongly suggest you write a DLL with a thin wrapper around the union stuff to remove the need to specify a union in AutoIt. That's probably going to be less complicated than trying to learn the alignment rules and creating complex code that's compatible on 32-bit and 64-bit Windows.

As long as you stick with the standard Win API and MS types it is not that bad. MS guarantees (with rare and mostly well-documented exceptions) that standard types remain backward compatible on all Windows platforms, i.e. you can count on WORD always being 16 bits wide, DWORD always 32, INT (not int) always 32, etc. The same applies to structs defined in the PSDK.

If AutoIt follows this pattern in DllStructCreate (and to my knowledge it does), it shouldn't be that complicated to write platform-independent code with structs and unions.

doudou, I'm not talking about type size. I'm talking about alignment. Yes, you're guaranteed an int will be 32 bits, but that int may be aligned on 4-byte boundaries on x86 and 8-byte boundaries on x64. I'm a bit fuzzy on the rules as it's been a few weeks since I last looked at it. Some types have explicit alignment, some types use the Windows API default. The default is different between x86 and x64. A lot of stuff lines up on both platforms by luck, not by design. When it doesn't align, things go crazy and you get/pass weird garbage values.
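One way to settle which types actually move between platforms is to ask the compiler for each target. The sketch below assumes a C11 compiler (for `_Alignof`); the struct `int_then_pointer` and the helper functions are hypothetical names for this illustration. As far as I know, on the mainstream x86 and x64 ABIs a 32-bit integer stays 4-byte aligned on both, and it is the pointer-sized fields that change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: a 32-bit field followed by a pointer-sized field. */
struct int_then_pointer {
    int32_t count;
    void   *data;
};

/* Alignment requirements as seen by the compiler for the current target. */
size_t align_of_int32(void)   { return _Alignof(int32_t); }
size_t align_of_int64(void)   { return _Alignof(int64_t); }
size_t align_of_pointer(void) { return _Alignof(void *); }

/* Where the pointer lands: right after count on a 32-bit target,
   after 4 bytes of padding on a typical 64-bit target. */
size_t pointer_field_offset(void) { return offsetof(struct int_then_pointer, data); }

/* Total size of the struct, padding included. */
size_t struct_size(void) { return sizeof(struct int_then_pointer); }
```

Compiling this once as 32-bit and once as 64-bit and comparing the returned values shows which fields keep their offsets and which ones shift.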

Doesn't AutoIt follow the platform alignment rules depending on whether the script runs with AutoIt32 or AutoIt64?

If it doesn't, then it is indeed difficult, or rather impossible, to write portable DllStruct code in AU3... I have no 64-bit system at hand at the moment to test it.

Yes, AutoIt does follow the rules, but that's precisely the problem with a union. In the example above using an int and int64 in a union, the size of the resulting union is 64 bits. However, to access only the int portion of the union you must also declare a dummy padding variable to ensure the structure is the correct size. That's where platform alignment comes in. We don't want platform alignment for those two variables; we want those two variables to occupy exactly 64 contiguous bits accessible via two 32-bit integers. On x86 that's fine because that's how alignment works. On x64 alignment works totally differently and will almost always separate the two variables with some padding, which not only makes the structure too large but also shifts everything after it to the right by the amount of padding.

Remember, AutoIt does not natively support unions, so the user must know how the union will map into memory. Knowledge of endianness is important as well, and I think my example above may be wrong: I may have the padding and i values swapped.
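The ordering question can be checked with a few lines of C. This is a sketch with invented helper names (`is_little_endian`, `read_small_member`); it writes a known pattern through the 64-bit member and reads it back through the 32-bit one, which C permits for unions.

```c
#include <stdint.h>
#include <string.h>

union a {
    int32_t i;
    int64_t i64;
};

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void) {
    uint16_t probe = 1;
    unsigned char first_byte;
    memcpy(&first_byte, &probe, 1);   /* look at the lowest-addressed byte */
    return first_byte == 1;
}

/* Stores a known 64-bit pattern, then reads the first four bytes as i. */
int32_t read_small_member(void) {
    union a u;
    u.i64 = 0x1122334455667788LL;
    return u.i;   /* low half on little-endian, high half on big-endian */
}
```

x86 and x64 are little-endian, so there `i` overlaps the low 32 bits of `i64` and sits at the start of the union. That suggests the order in the earlier example, `i` first and the padding after it, is the right way round on Windows.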
