[Solved] GB2312 to UTF8 - Character encoding



I want to convert text from the GB2312 (Chinese) character encoding to UTF8. I describe the problem below, with source code included. Please let me know where I am going wrong in my understanding of character encoding.

The Subject headers of an email contain the following

where the format is :

Displaying UTF8 or GB2312 *individually*, in separate emails, is not a problem. However, when I want to display both character encodings in the same email, only one of them is displayed.

Displaying a single encoding can be achieved by declaring the charset in the body of the email message.

Content-Type: text/html;
Content-Type: text/html;

If all goes well, you can view the Chinese characters.


MzE2Njg3OTU4o6zA67K7v6q1xM/6Y8rb = 316687958£¬Àë²»¿ªµÄÏúcÊÛ
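The relationship between the Base64 payload and the garbled text above can be checked with a short sketch. It is written in Python rather than AutoIt purely for illustration, and the variable names are my own. It assumes the payload really is GB2312, as the thread states: the same bytes read as Latin-1 give the garbled string, and read as GB2312 give the Chinese text.

```python
import base64

# Base64 payload quoted in the post above.
payload = "MzE2Njg3OTU4o6zA67K7v6q1xM/6Y8rb"
raw = base64.b64decode(payload)

# Reading the bytes as Latin-1 reproduces the garbled text from the post.
as_latin1 = raw.decode("latin-1")

# Reading the same bytes as GB2312 gives the intended Chinese text.
as_gb2312 = raw.decode("gb2312")

# Re-encoding that text as UTF-8 is the actual GB2312 -> UTF8 conversion.
utf8_bytes = as_gb2312.encode("utf-8")

print(as_latin1)
print(as_gb2312)
print(len(raw), "GB2312 bytes ->", len(utf8_bytes), "UTF-8 bytes")
```

The garbled string is therefore not corrupted data. It is the right bytes interpreted with the wrong code page, which is why the conversion must name the source encoding explicitly.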

Save the mentioned .7z, extract the .eml file, and open it in your favorite email client.

The code I am using is as follows. When its output is substituted into the .eml file, it doesn't give me any Chinese characters.

$sText = '316687958£¬Àë²»¿ªµÄÏúcÊÛ'
Func _ConvertAnsiToUtf8($sText)
    Local $tUnicode = _WinAPI_MultiByteToWideChar($sText)
    If @error Then Return SetError(@error, 0, "")
    Local $sUtf8 = _WinAPI_WideCharToMultiByte(DllStructGetPtr($tUnicode), 65001)
    If @error Then Return SetError(@error, 0, "")
    Return SetError(0, 0, $sUtf8)
EndFunc   ;==>_ConvertAnsiToUtf8

Thanks in advance.


After searching, I found this:


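#include <windows.h>
#include <cstring>

// Converts a NUL-terminated GB2312 string to UTF-8.
// utf8 receives a new[] buffer that the caller must delete[].
static int GB2312ToUtf8(const char* gb2312, char*& utf8)
{
    // 936 is the GB2312 code page; CP_ACP only works if the system ANSI code page is already 936.
    int len = MultiByteToWideChar(936, 0, gb2312, -1, NULL, 0);
    wchar_t* wstr = new wchar_t[len + 1];
    memset(wstr, 0, (len + 1) * sizeof(wchar_t));
    MultiByteToWideChar(936, 0, gb2312, -1, wstr, len);
    len = WideCharToMultiByte(CP_UTF8, 0, wstr, -1, NULL, 0, NULL, NULL);
    utf8 = new char[len + 1];
    memset(utf8, 0, len + 1);
    WideCharToMultiByte(CP_UTF8, 0, wstr, -1, utf8, len, NULL, NULL);
    delete[] wstr;
    return len;
}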
Edited by DeltaRocked


I partially solved this issue using the various resources available in the AutoIt forums. A big thanks to AZJIO for making encoding.au3 available in one of the posts.

_EncodingToUnicode_API() was picked up from encoding.au3, which is available in the post over here.

Another version by Arilvv can also be found over here.

Note to self: to get the conversion right, one needs to know the *correct* code page identifier, which is listed at the MSDN link in the code below.
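As a rough illustration of that note, the lookup from a charset name to a Windows code page identifier can be sketched as a small table. The table below is a hand-picked subset, and `CODE_PAGES` and `code_page_for` are hypothetical names, not part of encoding.au3. The MSDN list remains the authoritative source.

```python
# Hand-picked subset of charset name -> Windows code page identifier.
CODE_PAGES = {
    "gb2312": 936,
    "big5": 950,
    "shift_jis": 932,
    "windows-1252": 1252,
    "utf-8": 65001,
}


def code_page_for(charset):
    """Return the code page identifier for a charset name, or None if unknown."""
    return CODE_PAGES.get(charset.strip().lower())


print(code_page_for("GB2312"))
print(code_page_for("UTF-8"))
```

Returning None for an unknown charset keeps the caller from silently falling back to the system default code page, which is what produced the garbled output in the first post.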


Image of the conversion : http://imm.io/15phW

$sCodePage_Identifier = 936 ; GB2312
; Refer to http://msdn.microsoft.com/en-us/library/windows/desktop/dd317756%28v=vs.85%29.aspx
; for more information on the code page identifier.
; Base64 of the below mentioned string : MzE2Njg3OTU4o6zA67K7v6q1xM/6Y8rb

Func _EncodingToUnicode_API($sString, $sCodePage_Identifier)
    ; Worst case is one UTF-16 code unit (2 bytes) per input byte.
    Local $BufferSize = StringLen($sString) * 2
    Local $Buffer = DllStructCreate("byte[" & $BufferSize & "]")

    ; Convert from the given code page to UTF-16LE.
    Local $Return = DllCall("Kernel32.dll", "int", "MultiByteToWideChar", _
            "int", $sCodePage_Identifier, _
            "int", 0, _
            "str", $sString, _
            "int", StringLen($sString), _
            "ptr", DllStructGetPtr($Buffer), _
            "int", $BufferSize)

    Local $UnicodeBinary = DllStructGetData($Buffer, 1)
    Local $UnicodeHex1 = StringReplace($UnicodeBinary, "0x", "")
    Local $StrLen = StringLen($UnicodeHex1)
    Local $UnicodeString, $UnicodeHex2, $UnicodeHex3

    ; Each UTF-16LE code unit is 4 hex digits; swap the two bytes before calling ChrW().
    For $i = 1 To $StrLen Step 4
        $UnicodeHex2 = StringMid($UnicodeHex1, $i, 4)
        $UnicodeHex3 = StringMid($UnicodeHex2, 3, 2) & StringMid($UnicodeHex2, 1, 2)
        $UnicodeString &= ChrW(Dec($UnicodeHex3))
    Next
    $Buffer = 0
    Return $UnicodeString
EndFunc   ;==>_EncodingToUnicode_API


Remaining steps: MIME-decode the Subject header, correctly identify the character encoding of each part, and complete the conversion.
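Those steps can be sketched as follows, in Python rather than AutoIt and only as an illustration. The Subject value is hypothetical: it is assembled here from one UTF-8 encoded-word plus the Base64 payload quoted earlier in the thread, since the original header is not shown. `decode_subject` is my own helper name, and `decode_header` comes from Python's standard `email.header` module.

```python
import base64
from email.header import decode_header

# Hypothetical Subject value: a UTF-8 encoded-word followed by a GB2312
# encoded-word. The GB2312 payload is the Base64 string quoted in the thread.
utf8_payload = base64.b64encode("你好 ".encode("utf-8")).decode("ascii")
subject = ("=?UTF-8?B?" + utf8_payload + "?= "
           "=?gb2312?B?MzE2Njg3OTU4o6zA67K7v6q1xM/6Y8rb?=")


def decode_subject(value):
    """Decode each encoded-word with its own declared charset and join the text."""
    parts = []
    for data, charset in decode_header(value):
        if isinstance(data, bytes):
            # Unencoded fragments come back with charset None; treat them as ASCII.
            parts.append(data.decode(charset or "ascii", errors="replace"))
        else:
            parts.append(data)
    return "".join(parts)


text = decode_subject(subject)
utf8_subject = text.encode("utf-8")  # the whole subject as a single UTF-8 byte string
print(text)
```

Decoding each part with the charset named in its own encoded-word is what lets both encodings appear in the same subject. After that, the combined text can be written out in one encoding such as UTF-8.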

Edited by DeltaRocked
  • 1 year later...


I also want to convert GB2312 to UTF8 and I would like to try the script that is mentioned in the second post. However, when I try to run it, AutoIt says: "Cannot parse #include". Any idea what might be wrong?


