[solved] convert string into ANSI from UTF-8/Unicode


Myicq
Solved by jchd

I have a need to convert a unicode string into the appropriate ANSI encoding.

Purpose is to insert the ANSI string into an old application, which with proper encoding selected, displays the same character.

This seems like a simple job.

I have the following:

#include <WinAPI.au3>
$string = "שָׁלוֹם" ; shalom in Hebrew
$out = _WinAPI_WideCharToMultiByte($string)
ConsoleWrite($out & @CRLF )

My expected outcome is (using my default ANSI, Western Europe codepage):

ùÈÑìåÉí

which when encoded in Win1255 ("Hebrew") correctly displays שָׁלוֹם.

The problem is that I ONLY get question marks or an empty string, no matter what I pass as a parameter to WideCharToMultiByte...

What am I missing here ?

How can I tell the WideCharToMultiByte function which encoding I would like ?

So far I am looking for

UTF8 ==> GBK (or GB2312) for Chinese/Simplified

UTF8 ==> ANSI (Hebrew, Win1255)

UTF8 ==> Korean Johab

I can't be the only person ever to do this ?

The conversion works perfectly using software such as "Charco" (freeware), but I would love to have this natively in my script.

Edited by Myicq

I am just a hobby programmer, and nothing great to publish right now.


$out = _WinAPI_WideCharToMultiByte($string, 65001)

ConsoleWrite($out & @CRLF )

Well that results in same string I started with :)

I need the multi-byte string, so that שָׁלוֹם ==> bytes [F9 C8 D1 EC E5 C9 ED], which when decoded as Win1255 Hebrew give me "Shalom" again, but in an old application that does not understand Unicode.


I don't know much about this, but according to the Unicode support section of the help file, UTF-8 has to be read in using the appropriate flags, so perhaps your literal string does not fit that criterion.

Also, according to MSDN, WideCharToMultiByte is for UTF-16.
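To make the point above concrete, here is a minimal sketch of the first half of a "UTF-8 to code page" conversion: UTF-8 bytes have to be decoded into AutoIt's native UTF-16 string before WideCharToMultiByte can do anything useful with them. The byte literal is the UTF-8 encoding of "shalom"; the variable names are only illustrative.

```autoit
; The UTF-8 encoding of "shalom" (14 bytes), e.g. as read from a UTF-8 file.
Local $dUtf8 = Binary("0xD7A9D6B8D781D79CD795D6B9D79D")

; Flag 4 (UTF-8) decodes the bytes into AutoIt's native UTF-16 string.
Local $sNative = BinaryToString($dUtf8, 4)

; The string now holds 7 UTF-16 code units, one per Hebrew letter or vowel point.
ConsoleWrite("Characters      : " & StringLen($sNative) & @CRLF)

; Flag 2 (UTF-16 LE) shows the in-memory form that WideCharToMultiByte expects.
ConsoleWrite("UTF-16 LE bytes : " & StringToBinary($sNative, 2) & @CRLF)
```

A string literal typed directly into a script is already in this native form, provided the script file is saved in an encoding AutoIt reads correctly.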


What about this?

Func _ANSIToUnicode($sString)
    #cs
        Local Const $SF_ANSI = 1
        Local Const $SF_UTF16_LE = 2
        Local Const $SF_UTF16_BE = 3
        Local Const $SF_UTF8 = 4
    #ce
    Local Const $SF_ANSI = 1, $SF_UTF8 = 4
    Return BinaryToString(StringToBinary($sString, $SF_ANSI), $SF_UTF8)
EndFunc   ;==>_ANSIToUnicode

Func _UnicodeToANSI($sString)
    #cs
        Local Const $SF_ANSI = 1
        Local Const $SF_UTF16_LE = 2
        Local Const $SF_UTF16_BE = 3
        Local Const $SF_UTF8 = 4
    #ce
    Local Const $SF_ANSI = 1, $SF_UTF8 = 4
    Return BinaryToString(StringToBinary($sString, $SF_UTF8), $SF_ANSI)
EndFunc   ;==>_UnicodeToANSI
Edited by guinness


@AZJIO .. this looks like what I need. Just need to find out how to register, my Russian is kind of rusty... ;)

Could you perhaps upload the script to some temp filehost like http://ge.tt ?

If not I will try to create an account.

(Wonder how many other good resources are hiding in the Russian / French / German / Brazilian.. sites. )

Cheers.

myicq


Out of curiosity did you try the examples I posted or did you just skip my post?


guinness,

I did try that, and I'm sorry to say that it only returned the same string.

I had:

$string = "שָׁלוֹם" ; shalom..
$out = _UnicodeToANSI($string)
; rule out that ConsoleWrite plays tricks on me..
; (@ScriptDir has no trailing backslash, so one is added before the file name)
FileWrite(@ScriptDir & "\out.txt", $out)

Func _UnicodeToANSI($sString)
    #cs
        Local Const $SF_ANSI = 1
        Local Const $SF_UTF16_LE = 2
        Local Const $SF_UTF16_BE = 3
        Local Const $SF_UTF8 = 4
    #ce
    Local Const $SF_ANSI = 1, $SF_UTF8 = 4
    Return BinaryToString(StringToBinary($sString, $SF_UTF8), $SF_ANSI)
EndFunc   ;==>_UnicodeToANSI

Results are:

Expected: F9 C8 D1 EC E5 C9 ED                       ; 7 bytes:  Shalom in Win-1255
Actual:   D7 A9 D6 B8 D7 81 D7 9C D7 95 D6 B9 D7 9D  ; 14 bytes: Shalom in UTF-8

Same if I save the input file as UCS2-LE: the output is just the same as the input.
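A likely explanation for this result, sketched below under the assumption of a single-byte Western system code page: flag 1 in StringToBinary/BinaryToString always refers to the system's default ANSI code page, so the round trip only relabels the UTF-8 bytes and never involves Windows-1255. The code points are spelled out with ChrW() so the outcome does not depend on how the script file is saved.

```autoit
; "shalom" as explicit code points: shin, qamats, shin dot, lamed, vav, holam, final mem.
Local $sHebrew = ChrW(0x05E9) & ChrW(0x05B8) & ChrW(0x05C1) & ChrW(0x05DC) & _
        ChrW(0x05D5) & ChrW(0x05B9) & ChrW(0x05DD)

; Step 1: encode the string as UTF-8 (flag 4). These are the 14 bytes reported as "Actual".
Local $dUtf8 = StringToBinary($sHebrew, 4)
ConsoleWrite("UTF-8 bytes : " & $dUtf8 & @CRLF)

; Step 2: flag 1 reinterprets the bytes as characters of the *system* ANSI code page.
; Nothing is converted to Windows-1255 here.
Local $sRelabelled = BinaryToString($dUtf8, 1)
ConsoleWrite("Characters  : " & StringLen($sRelabelled) & @CRLF)

; Step 3: writing that string out as ANSI again maps each character back to the
; byte it came from (on a single-byte code page), so the data is still UTF-8.
ConsoleWrite("ANSI bytes  : " & StringToBinary($sRelabelled, 1) & @CRLF)
```

If the first and last lines print the same bytes, the function has not transcoded anything, which matches the 14-byte output above.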

The conversion done by the "Charco" program is correct, not sure how it's done (http://www.marblesoftware.com/Marble_Software/Charco.html)

If I convert to UCS2 and look at the mapping [http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1255.TXT] it is correct for each and every character / point. But I doubt it should be necessary to use a lookup table for this operation ?

Hope this gives some clues. Help still much appreciated !


Myicq,

Fair enough. I will look into it then.


@Wise Ones

I don't have the skills to dig into this myself, at least not if I have to know what's going on. So this is just a pointer to a possible solution.

I found the .NET class "Encoding" which seems to contain a conversion API.

From an example:

using System;
using System.Text;

public static class CodePageDemo
{
    public static void PrintCPBytes(string str, int codePage)
    {
        // Get the encoding for the specified code page.
        // (On .NET Core / .NET 5+, code pages such as 1255 need CodePagesEncodingProvider registered first.)
        Encoding targetEncoding = Encoding.GetEncoding(codePage);
        // Get the byte representation of the specified string.
        byte[] encodedChars = targetEncoding.GetBytes(str);
        // Print the bytes.
        Console.WriteLine("Byte representation of '{0}' in Code Page '{1}':", str, codePage);
        for (int i = 0; i < encodedChars.Length; i++)
            Console.WriteLine("Byte {0}: {1}", i, encodedChars[i]);
    }
}

Found here

There is a lot more information here

So the question basically is: how can Encoding.GetEncoding etc. be used from within AutoIt? The only thread I could find about this topic was closed because of game automation.

Hope this helps. It would be nice to have this turn into a UDF for all Windows Codepages ;)


  • Solution

You're all making this way more complex than it really is.

The UTF-16 LE (or rather UCS2-LE, which AutoIt uses) to ANSI conversion will use the codepage specified in the _WinAPI_WideCharToMultiByte call.

So the conversion you need is:

#include <WinAPI.au3>
$string = "שָׁלוֹם" ; shalom in Hebrew
$out = _WinAPI_WideCharToMultiByte($string, 1255) ; explicit Hebrew codepage

Pass $out to your old application.
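For anyone who wants to check the result byte by byte, here is a minimal sketch of the same conversion that returns raw bytes rather than a string. It calls the Win32 WideCharToMultiByte function directly through DllCall; the helper name _StringToCodePage and the output file name are made up for this illustration and are not part of the standard UDFs.

```autoit
; Hypothetical helper: encode an AutoIt (UTF-16) string in the given code page.
; Returns the encoded bytes as binary data, or sets @error on failure.
Func _StringToCodePage($sText, $iCodePage)
    ; First call with a null buffer asks Windows for the required size in bytes.
    ; cchWideChar = -1 means "null-terminated", so the size includes the terminator.
    Local $aRet = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodePage, "dword", 0, _
            "wstr", $sText, "int", -1, "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    If @error Or $aRet[0] = 0 Then Return SetError(1, 0, Binary(""))
    If $aRet[0] <= 1 Then Return Binary("") ; empty input: only the terminator

    Local $iSize = $aRet[0]
    Local $tBuffer = DllStructCreate("byte[" & $iSize & "]")

    ; Second call performs the conversion into the buffer.
    $aRet = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodePage, "dword", 0, _
            "wstr", $sText, "int", -1, "struct*", $tBuffer, "int", $iSize, "ptr", 0, "ptr", 0)
    If @error Or $aRet[0] = 0 Then Return SetError(2, 0, Binary(""))

    ; Drop the trailing null byte before returning.
    Return BinaryMid(DllStructGetData($tBuffer, 1), 1, $aRet[0] - 1)
EndFunc   ;==>_StringToCodePage

; "shalom" as explicit code points, so the script's own encoding does not matter.
Local $sHebrew = ChrW(0x05E9) & ChrW(0x05B8) & ChrW(0x05C1) & ChrW(0x05DC) & _
        ChrW(0x05D5) & ChrW(0x05B9) & ChrW(0x05DD)

Local $dBytes = _StringToCodePage($sHebrew, 1255)
If @error Then
    ConsoleWrite("Conversion failed" & @CRLF)
Else
    ; Should match the bytes expected earlier in the thread: 0xF9C8D1ECE5C9ED
    ConsoleWrite($dBytes & @CRLF)

    ; Write the bytes unchanged: 2 = overwrite, 16 = binary mode.
    Local $hFile = FileOpen(@ScriptDir & "\out.bin", 2 + 16)
    If $hFile <> -1 Then
        FileWrite($hFile, $dBytes)
        FileClose($hFile)
    EndIf
EndIf
```

Working with binary data avoids any second, implicit conversion through the system's own ANSI code page when the result is printed or written to a file.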


You're all making this way more complex than it really is.

The UTF-16 LE (or rather UCS2-LE, which AutoIt uses) to ANSI conversion will use the codepage specified in the _WinAPI_WideCharToMultiByte call.

So the conversion you need is:

#include <WinAPI.au3>
$string = "שָׁלוֹם" ; shalom in Hebrew
$out = _WinAPI_WideCharToMultiByte($string, 1255) ; explicit Hebrew codepage

$string = "Москва"  ; Russian known city
$out = _WinAPI_WideCharToMultiByte($string, 1251) ; explicit Cyrillic codepage
FileWrite(@ScriptDir & "\out2.txt", $out) ; @ScriptDir has no trailing backslash

Pass $out to your old application.

jchd,

indeed it seems like everyone, including me, was running around like headless chickens.

What confused me, I think, was the help file, which shows only a few possible values for the codepage (with vague descriptions). Indeed the MSDN documentation also does not mention that other parameters are allowed.

Where can I suggest examples and additions to the help in this matter ?

And for others, the list of Windows ANSI codepages is here:

http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/
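The question marks from the start of this thread are what WideCharToMultiByte substitutes when a character has no equivalent in the target code page. Below is a rough sketch of how a script could detect that for the code pages discussed here (936 for GBK, 1251, 1255, 1361 for Johab), using the function's lpUsedDefaultChar output. The helper name _FitsCodePage is hypothetical, and best-fit approximations are not reported by this check.

```autoit
; Hypothetical helper: True if $sText converts to $iCodePage without any
; character being replaced by the default character (usually "?").
; Intended for ANSI/OEM code pages only: for UTF-7/UTF-8 (65000/65001)
; the last parameter must be null, so this call would fail there.
Func _FitsCodePage($sText, $iCodePage)
    ; Size query with a null output buffer.
    Local $aRet = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodePage, "dword", 0, _
            "wstr", $sText, "int", -1, "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    If @error Or $aRet[0] = 0 Then Return SetError(1, 0, False)

    Local $tBuffer = DllStructCreate("byte[" & $aRet[0] & "]")

    ; Real conversion; the last parameter receives a nonzero value if a substitution happened.
    $aRet = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", $iCodePage, "dword", 0, _
            "wstr", $sText, "int", -1, "struct*", $tBuffer, "int", DllStructGetSize($tBuffer), _
            "ptr", 0, "int*", 0)
    If @error Or $aRet[0] = 0 Then Return SetError(2, 0, False)

    Return $aRet[8] = 0
EndFunc   ;==>_FitsCodePage

Local $sHebrew = ChrW(0x05E9) & ChrW(0x05B8) & ChrW(0x05C1) & ChrW(0x05DC) & _
        ChrW(0x05D5) & ChrW(0x05B9) & ChrW(0x05DD)
Local $aCodePages[4] = [936, 1251, 1255, 1361]

For $iCodePage In $aCodePages
    Local $bFits = _FitsCodePage($sHebrew, $iCodePage)
    If @error Then
        ConsoleWrite($iCodePage & ": conversion failed (code page not available?)" & @CRLF)
    ElseIf $bFits Then
        ConsoleWrite($iCodePage & ": converts cleanly" & @CRLF)
    Else
        ConsoleWrite($iCodePage & ": some characters were replaced" & @CRLF)
    EndIf
Next
```

Calling the default code page (0) on a Western European system would fall into the "replaced" case for Hebrew text, which is consistent with the original symptom.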

So thanks for your help, jchd.


Indeed the MSDN documentation also does not mention that other parameters are allowed.

Untrue. The page clearly mentions:

Parameters

CodePage [in]

Code page to use in performing the conversion. This parameter can be set to the value of any code page that is installed or available in the operating system. For a list of code pages, see Code Page Identifiers. Your application can also specify one of the values shown in the following table.
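Since the parameter accepts any code page that is installed or available, a script can ask Windows before converting. A small sketch using the Win32 IsValidCodePage function through DllCall; the wrapper name _IsCodePageAvailable is made up for this example.

```autoit
; Hypothetical wrapper around the Win32 IsValidCodePage function.
; Returns True if the code page can be used for conversions on this system.
Func _IsCodePageAvailable($iCodePage)
    Local $aRet = DllCall("kernel32.dll", "bool", "IsValidCodePage", "uint", $iCodePage)
    If @error Then Return SetError(1, 0, False)
    Return $aRet[0] <> 0
EndFunc   ;==>_IsCodePageAvailable

; Code pages mentioned in this thread: Hebrew, Cyrillic, GBK, Korean Johab.
Local $aCodePages[4] = [1255, 1251, 936, 1361]

For $iCodePage In $aCodePages
    If _IsCodePageAvailable($iCodePage) Then
        ConsoleWrite($iCodePage & ": available" & @CRLF)
    Else
        ConsoleWrite($iCodePage & ": not available on this system" & @CRLF)
    EndIf
Next
```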

Edited by jchd
