fubar99

How to convert Unicode characters to ANSI using AutoIt?

21 posts in this topic

I am reading a Java properties file with AutoIt that includes some Unicode characters in \uXXXX format (for example \u0393).
How can I convert them to ANSI with AutoIt?
 




Look at _WinAPI_WideCharToMultiByte in the help file.


I tried

_WinAPI_WideCharToMultiByte("\u0393\u03A1\u0397")

but nothing happens. It returns the same string!


You need to populate the code page parameter.
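
As a rough illustration of that advice: the function needs real Unicode characters plus a target code page. The literal text "\u0393" consists only of ASCII characters, so it comes back unchanged whatever code page is passed. A minimal sketch, assuming code page 1253 (Greek) is the wanted ANSI target, assuming a WinAPI.au3 version that accepts a string as the first argument (as the call in this thread does), and building the characters with ChrW() rather than decoding them:

```autoit
#include <WinAPI.au3>

; Build the three Greek letters as real Unicode characters.
Local $sGreek = ChrW(0x0393) & ChrW(0x03A1) & ChrW(0x0397)

; The second parameter is the target code page; 1253 (Greek) is an assumption here.
Local $sAnsi = _WinAPI_WideCharToMultiByte($sGreek, 1253)

; How the result is displayed depends on the system's own ANSI code page.
MsgBox(64, "Converted", $sAnsi)
```

Decoding the \uXXXX escapes into characters is a separate step that has to happen before this call.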


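Try:

#include <FF.au3>

; Firefox (with the MozRepl add-on that FF.au3 talks to) must already be running.
_FFConnect()
Local $sJavascript = 'try {'
; The JavaScript string literal decodes the \uXXXX escapes; alert() shows the result in Firefox.
$sJavascript &= 'var string = unescape("\u0393\u03A1\u0397");alert(string)'
$sJavascript &= ' }catch(e){}finally{string;}'
_FFCmd($sJavascript)
_FFDisConnect()
Exit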
Edited by Iczer


I downloaded FF.au3 and ran the script above, but it just hangs with this output:

_FFConnect: OS: WIN_7 WIN32_NT 7601 Service Pack 1
_FFConnect: AutoIt: 3.3.8.1
_FFConnect: FF.au3: 0.6.0.1b-3
_FFConnect: IP: 127.0.0.1
_FFConnect: Port:   4242
_FFConnect: Delay:  2ms

Still, I am not sure I want to do it like this...

I just cannot imagine there is no built-in, or almost built-in, way to do this!


Regarding "you need to populate the code page parameter": I tried all possible values; still the same result.


I forgot: you need to start Firefox before running the script; the message then shows up in Firefox.

Anyway, try this next:

;$string = "\u0393\u03A1\u0397"
$string = "1-st [ \u0393 ] second [ \u03A1 ] third [ \u0397 ]"

; Rewrite each \uXXXX as ' & ChrW(0xXXXX) & ' and let Execute() evaluate the resulting string expression.
$decodedString = Execute("'" & StringRegExpReplace($string, "(\\u([[:xdigit:]]{4}))","' & ChrW(0x$2) & '") & "'")
MsgBox(64,"",$decodedString)


Thanks, this actually works!

But since my regex sucks, could you tell me if this works properly for strings like these:

"\u0393\u03A1\u0397", "\u0393 \u03A1 \u0397", "my name is: \u0393\u03A1 lastname: \u0397"

Also, even though the MsgBox() looks OK, the output is not ANSI!

I use this output as an email subject for _INetSmtpMail().

Edited by fubar99


You can check for yourself:

$string = "\u0393\u03A1\u0397" & "  |  "& "\u0393 \u03A1 \u0397" & "  |  "& "my name is: \u0393\u03A1 lastname: \u0397"
$decodedString = Execute("'" & StringRegExpReplace($string, "(\\u([[:xdigit:]]{4}))","' & ChrW(0x$2) & '") & "'")
MsgBox(64,"",$decodedString)

As for the output not being ANSI: see the "Standardized subsets" table at http://en.wikipedia.org/wiki/Unicode

Row 03 is the Greek alphabet: http://en.wikipedia.org/wiki/Greek_characters_in_Unicode

The matching ANSI code page would be http://en.wikipedia.org/wiki/Windows-1253

So all you need to do is replace the hex Unicode digits with the hex ANSI digits. For Greek letters in the range U+0391 to U+03CE this can be done by adding 48 decimal (0x30 hex) to the last two digits of the four-digit Unicode code point; for example, \u0393 becomes 0x93 + 0x30 = 0xC3.

As I do not have a Greek Windows, try:

$string = "\u0393\u03A1\u0397" & "  |  "& "\u0393 \u03A1 \u0397" & "  |  "& "my name is: \u0393\u03A1 lastname: \u0397"
;$decodedString = Execute("'" & StringRegExpReplace($string, "(\\u([[:xdigit:]]{4}))","' & ChrW(0x$2) & '") & "'")
$decodedString = Execute("'" & StringRegExpReplace($string, "(\\u([[:xdigit:]]{2})([[:xdigit:]]{2}))","' & Chr(0x30  + 0x$3) & '") & "'")
MsgBox(64,"",$decodedString)
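
The offset rule can be checked by hand for the three sample letters. A minimal sketch, assuming the target code page is Windows-1253 and that the letters fall in the range U+0391 to U+03CE, where the 0x30 offset holds (the variable names are made up for illustration):

```autoit
; Check the 0x30 offset for the three sample code points (Windows-1253 assumed).
Local $aCodePoints[3] = [0x0393, 0x03A1, 0x0397]

For $i = 0 To UBound($aCodePoints) - 1
    ; Keep the low byte of the code point and add the 0x30 offset.
    Local $iAnsi = BitAND($aCodePoints[$i], 0xFF) + 0x30
    ConsoleWrite("U+" & Hex($aCodePoints[$i], 4) & " -> 0x" & Hex($iAnsi, 2) & @CRLF)
Next
; Prints:
; U+0393 -> 0xC3
; U+03A1 -> 0xD1
; U+0397 -> 0xC7
```

These are the Windows-1253 byte values for Γ, Ρ and Η. Characters outside that range, or any other code page, need a real conversion rather than a fixed offset.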
Edited by Iczer


I appreciate the time you have spent on this!

This actually produces the same output as before.

The MsgBox() shows the correct Greek characters, but my email subject shows the same mumbo-jumbo cr@p...

In Java I am able to send the same email as an HTML message with no problem.

Perhaps the problem lies with _INetSmtpMail()?

Is there some other way to send an HTML email with the correct content type?


You need to convert your string to UTF-8 and send the HTML as UTF-8. Using the help feature will ... help!


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Whoa! Now, why didn't I think to press the help button!? :sorcerer:

I apologize for not pressing the "Like This" button for your reply; the forum system won't allow me just yet! :censored:

Your reply is just sooooo much help! :thumbsup:


Try adding this at the beginning of your e-mail:

<meta charset="UTF-8" />

But remember: HTML e-mail is EVIL!  :shifty::D


Sorry, my Lord, I didn't have much time to write a full tutorial. In my hurry I meant "search", not "help".

Now, after seeing your attitude, I'm happy I didn't waste more time.



Whatever do you mean? I just expressed my gratitude!

Do you mean to say that your HELP reply was not helpful in the first place??? o:)


Sorry, I had a terribly busy and bad day and mistook your answer for criticism that my post was dry and not explicit enough.

I should have taken the time to dig up relevant posts and point you to them, but I was in a bit of a hurry.

If that laconic post of mine actually served any purpose, then I'm fully happy with that.



Listen, I am just trying to be polite when people mistake me for an idiot who does not even read the manual!

RTFM is always my motto, but if a one-day search comes up empty, you ought to post in a forum so that others with years of AutoIt expertise can help.

If you have a solid example that translates Unicode strings (e.g. \uXXXX) into ANSI strings for anyone's Windows locale, then your answer is welcome.

Believe me, the answer is not in the help section; I just found some functions to start with but came up dry.

This is a must-have question, and I just cannot believe no one else has already answered it for AutoIt.

Bear with me: I know enough about Unicode, ANSI, etc. and have a working implementation in Java, but I am just a novice with AutoIt!

Also, the years I spent programming in C and C++ will do me no good, since at that time the only Windows in existence was the almighty DOS and the only code page was ASCII...


Here's a sample of some posts of mine about this and similar subjects:

Once you have an AutoIt string with all your Greek Unicode characters decoded, you can convert it to UTF-8:

Local $U8String = BinaryToString(StringToBinary($sString, 4), 1)

or to your Windows code page (hoping it contains the needed characters):

Local $sAnsiString = BinaryToString(StringToBinary($sString, 1), 1)
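
Putting the two steps together, here is a minimal sketch that decodes the \uXXXX escapes without Execute() and then produces the UTF-8 bytes. The function name _DecodeUnicodeEscapes is hypothetical, and the sketch assumes every escape has exactly four hex digits, as in the samples in this thread:

```autoit
; Hypothetical helper: replace every \uXXXX escape with the character it names.
Func _DecodeUnicodeEscapes($sText)
    ; Flag 3 returns all matches of the capture group as an array.
    Local $aCodes = StringRegExp($sText, "\\u([[:xdigit:]]{4})", 3)
    If @error Then Return $sText ; no escapes found, nothing to do
    For $i = 0 To UBound($aCodes) - 1
        ; Dec() turns the hex digits into a number, ChrW() into a character.
        $sText = StringReplace($sText, "\u" & $aCodes[$i], ChrW(Dec($aCodes[$i])), 0, 1)
    Next
    Return $sText
EndFunc

Local $sDecoded = _DecodeUnicodeEscapes("\u0393\u03A1\u0397")

; Flag 4 asks StringToBinary() for the UTF-8 encoding of the string.
Local $dUtf8 = StringToBinary($sDecoded, 4)
ConsoleWrite(String($dUtf8) & @CRLF) ; prints 0xCE93CEA1CE97
```

Skipping Execute() means a stray single quote in the input cannot break the evaluated expression.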
Edited by jchd


I created this for a Non-Unicode (Delphi 6) program some time ago. 

Hope you can read through and understand, it's not well commented ;)  Ask if in doubt.

Thanks to the people that helped me do this!

#include <ButtonConstants.au3>
#include <EditConstants.au3>
#include <GUIConstantsEx.au3>
#include <GUIListBox.au3>
#include <StaticConstants.au3>
#include <WindowsConstants.au3>
#include <WinAPI.au3>

#cs
-----  there are others as well ...
57002 Devanagari (Hindi, Marathi, Sanskrit, Konkani)
57003 Bengali
57004 Tamil
57005 Telugu
57006 Assamese (same as Bengali)
57007 Oriya
57008 Kannada
57009 Malayalam
57010 Gujarati
57011 Punjabi (Gurmukhi)
#ce

#Region ### START Koda GUI section ### Form=C:\Programmer\AutoIt3\KODA\Forms\unicode-to-ansi-tool.kxf
$Form1 = GUICreate("UniCode->ANSI multitool", 1006, 378, 297, 359)
GUISetFont(8, 400, 0, "Arial Unicode MS")
$Label1 = GUICtrlCreateLabel("Unicode => ANSI conversion", 16, 16, 425, 48)
GUICtrlSetFont(-1, 25, 400, 0, "Arial Unicode MS")
$lstCP = GUICtrlCreateList("", 16, 80, 161, 246)
GUICtrlSetData(-1, "1250 - Central Europe|1251 - Cyrillic|1252 - Western Europe|1253 - Greek|1254 - Turkish|1255 - Hebrew|1256 - Arabic|1257 - Baltic|1258 - Vietnam|874 - Thai|932 - Japanese Shift-JIS|936 - Chinese Simplified|949 - Korean|950 - Chinese Traditional|57010 Gujarati")

$Label2 = GUICtrlCreateLabel("Unicode text paste here", 216, 80, 120, 19)
$inpOriginal = GUICtrlCreateInput("тест", 216, 104, 729, 23)

$Label3 = GUICtrlCreateLabel("ANSI conversion here", 216, 152, 500, 19)
$inpANSI = GUICtrlCreateInput("", 216, 176, 729, 23)

$Label4 = GUICtrlCreateLabel("ESC/HEX encoded UTF8 for barcodes", 216, 238, 500, 19)
$inpEscapeCoded = GUICtrlCreateInput("", 216, 250,729,23)



$btnToClip = GUICtrlCreateButton("To Clipboard", 216, 280, 187, 57)
$chkLogInput = GUICtrlCreateCheckbox("Log INPUT lines", 640, 16, 97, 17)
$chkLogOutput = GUICtrlCreateCheckbox("Log OUTPUT lines", 640, 40, 97, 17)
; $btnSetHotkey = GUICtrlCreateButton("Set CtrlF1 = paste clipboard", 784, 16, 185, 57)
GUICtrlSetBkColor(-1, 0xFF0000)
GUISetState(@SW_SHOW)
#EndRegion ### END Koda GUI section ###

_GUICtrlListBox_SetCurSel($lstCP, 1)
; Ctrl+F1 pastes the clipboard into the input field and converts it.
HotKeySet("^{F1}", "_convert_key")

While 1
    $nMsg = GUIGetMsg()
    Switch $nMsg
        Case $GUI_EVENT_CLOSE
            Exit
        case $btnToClip
            _convert()
    EndSwitch
WEnd

Func _convert_key()
    GUICtrlSetData($inpOriginal, ClipGet())
    _convert()
EndFunc

Func _convert()
    local $cp = _get_cp()
    ;MsgBox(0,"","My CP = " & $cp)
    local $outstr = _WinAPI_WideCharToMultiByte(GUICtrlRead($inpOriginal), $cp) ;
    GUICtrlSetData($inpANSI, $outstr)
    local $hexstring = StringToBinary(guictrlread($inpOriginal), 4)
        ; thanks for help with this one. jchd ?
    Local $sEscaped = StringRegExpReplace(StringMid($hexstring, 3), '([[:xdigit:]]{2})', '\\x$1') 
    guictrlsetdata($inpEscapeCoded, $sEscaped)
    ClipPut($outstr)
EndFunc

Func _get_cp()
    local $string = _GUICtrlListBox_GetText($lstCP, _GUICtrlListBox_GetCurSel($lstCP))
    ; Extract the leading code page number from the selected list entry.
    Local $cp = StringRegExp($string, "^([0-9]+)", 1)
    ;msgbox(0,"",$cp[0])
    return $cp[0]
EndFunc

I am just a hobby programmer, and nothing great to publish right now.
