oceanwaves

char array in C/C++ and in DllStruct* function


Hi Guys,

  I have a small question I would like to understand:

  Take a char array, "char[100]". It is 100 bytes, one byte per element. In C/C++, if I put some Unicode characters into this array (assume each one needs 2 bytes), every 2 elements represent one character, and C/C++ can output them normally.

  But in AutoIt, when I use the DllStruct* functions as below:

$a = DllStructCreate("char text[2]") ; this array is only 2 bytes
DllStructSetData($a, "text", "你") ; a single Chinese character
$b = DllStructGetData($a, 1)
MsgBox(4096, "", $b) ; outputs the character normally, no problem

This code works, but if I change it as below:

$a = DllStructCreate("char text[10]") ; this array is 10 bytes
DllStructSetData($a, "text", "你好") ; 2 Chinese characters
$b = DllStructGetData($a, 1)
MsgBox(4096, "", $b) ; output is "你?"

Why?

Maybe I'm being a little obsessive about this.

Anyway, thanks in advance for your help.

You can't just assume every Unicode codepoint is 2 bytes and expect correct operation. For instance, the Euro sign is 3 bytes in UTF-8. A codepoint may need from 1 to 4 bytes in UTF-8.

To store a Unicode string in a DllStruct, you need to convert it to UTF-8, determine the actual length of the converted string (in bytes), and allocate the char[N] array in the structure accordingly. Of course, the API must expect a UTF-8 string of bytes.

Alternatively, for APIs which expect a UTF-16 string, just use wchar[N], but remember that AutoIt only handles UCS-2, the restriction of UTF-16 to the BMP (plane 0, roughly 64K codepoints). UTF-16 representation of codepoints in higher planes is not guaranteed to work correctly with all AutoIt built-in string functions.
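
As a rough illustration of the UTF-8 route described above, here is a minimal sketch. It assumes the script file is saved in an encoding AutoIt reads as Unicode (for example UTF-8 with BOM), and it uses a byte[N] element rather than char[N] so that the bytes are stored untouched. The variable names are only illustrative.

```autoit
; Convert the AutoIt string to UTF-8 bytes (flag 4 of StringToBinary selects UTF-8)
Local $sText = "你好"
Local $dUtf8 = StringToBinary($sText, 4)

; Measure the real length in bytes instead of guessing 2 bytes per character
Local $iBytes = BinaryLen($dUtf8)

; Reserve one extra byte intended as the terminating zero
; (assumption: memory of a newly created struct starts out zeroed)
Local $tBuf = DllStructCreate("byte text[" & ($iBytes + 1) & "]")
DllStructSetData($tBuf, "text", $dUtf8)

; Read the raw bytes back, drop the terminator, and decode them as UTF-8
Local $dBack = BinaryMid(DllStructGetData($tBuf, "text"), 1, $iBytes)
MsgBox(4096, "", BinaryToString($dBack, 4))
```

A pointer to such a buffer (DllStructGetPtr) would only make sense for an API that expects UTF-8 bytes.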





Hi Jchd,

  Thanks for your reply. Yes, you are right, not every Unicode character needs only 2 bytes. But in my example code, if I put each Unicode character into the array[2] separately, the output is OK, so I think that at least for these 2 characters, 2 bytes per character is enough. But when I put them together into the array[10] (note that I allocated 10 bytes for them), only the first character is output correctly and the second becomes "?". So I think this problem may not be related to how many bytes a Unicode character uses.


When you set an AutoIt string (Unicode) into a structure element defined as char or char*, AutoIt converts (read: emasculates) the Unicode string to ANSI, and this is what the callee will see. Any character that the ANSI code page cannot represent is lost in that conversion.

To pass a Unicode string verbatim, you must either pass the string to a wchar or wchar* element, or first convert your string to UTF-8 and pass that to a byte, char or char* element.
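
For the wchar option, a minimal sketch could look like the following. It assumes the script file is saved in a Unicode encoding and that the text stays within the BMP, so that StringLen counts one wchar per character. The names are illustrative.

```autoit
Local $sText = "你好"

; Each wchar element is 2 bytes; reserve one extra element for the terminating null
Local $tBuf = DllStructCreate("wchar text[" & (StringLen($sText) + 1) & "]")

; No ANSI conversion takes place for wchar elements
DllStructSetData($tBuf, "text", $sText)

; Read the string back and show it
MsgBox(4096, "", DllStructGetData($tBuf, "text"))

; Show how many bytes the structure actually occupies
ConsoleWrite(DllStructGetSize($tBuf) & @CRLF)
```

This only helps if the function being called expects a UTF-16 (wide) string. For an API that takes char*, the UTF-8 or ANSI route still applies.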


