Why does ChrW(AscW("🔊")) not produce 🔊?


Go to solution Solved by jchd,

I need to use a Unicode character (🔊) in my code and I find that it is impossible to use the ChrW() function. Is copy and paste the only option?

$str = "Unicode character I want to use: 🔊" & @CRLF
$str &= "AscW('🔊'): " & AscW('🔊') & @CRLF
$str &= "ChrW(" & AscW('🔊') & "): " & ChrW(AscW("🔊"))

MsgBox(0, '', $str)


16 minutes ago, AspirinJunkie said:

The character has the Unicode value (decimal) 128266.

Thanks for the reply. I also saw the code 128266 in the Unicode character table, but AscW("🔊") produced 55357, which is within the 0-65535 range, so I thought ChrW(55357) would give me 🔊.


AutoIt internally uses UCS-2 for strings, so its string functions are limited to the first 65536 characters of the Unicode set (the Basic Multilingual Plane).
In this concrete case AutoIt internally treats your character as two separate characters. If you pass them untouched to an output, as you did, the output may interpret them together again as one 4-byte UTF-16 surrogate pair and display the character correctly.

However, AutoIt itself never figures out internally that the two 16-bit units form a single character:

$sString = '🔊'

ConsoleWrite(StringLen($sString) & @CRLF)
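
To make the difference between 16-bit units and actual characters visible, here is a minimal sketch. The helper _CodePointCount is hypothetical (it is not part of AutoIt or its UDFs) and it assumes the string contains only well-formed surrogate pairs:

```autoit
; Hypothetical helper (not an AutoIt built-in): counts codepoints instead of
; 16-bit units by not counting the low half of each surrogate pair.
Func _CodePointCount($sText)
    Local $iCount = 0, $iUnit
    For $i = 1 To StringLen($sText)
        $iUnit = AscW(StringMid($sText, $i, 1))
        ; Low surrogates (0xDC00-0xDFFF) complete a pair that was already counted.
        If $iUnit < 0xDC00 Or $iUnit > 0xDFFF Then $iCount += 1
    Next
    Return $iCount
EndFunc   ;==>_CodePointCount

Local $sSpeaker = ChrW(0xD83D) & ChrW(0xDD0A)    ; U+1F50A written as a surrogate pair
ConsoleWrite(StringLen($sSpeaker) & @CRLF)       ; 2 (16-bit units)
ConsoleWrite(_CodePointCount($sSpeaker) & @CRLF) ; 1 (codepoint)
```

StringLen() reports 2 because it counts 16-bit units, while the sketch reports 1 because it skips the second half of the pair.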


10 hours ago, CYCho said:

I need to use a unicode character(🔊) in my code

I may have misunderstood your question or what you are actually trying to do, but if not...

If the code that you are referring to is setting the data of a control in a GUI, then you don't need to do anything special other than making sure that the control's font can display the character/symbol/emoji. As you can see below, I have used 2 fonts that can display the speaker, Segoe UI Emoji and Segoe UI Symbol. Each font represents the same symbol a little differently, so which font you choose makes a difference. If the code you are referring to is setting the data of some other output, like a log file, then the same is true. You need to make sure that the special characters are displayed using an appropriate font.

Example:

#include <GUIConstantsEx.au3>

; Define form
Global $Form1  = GUICreate("Form1", 196, 151, 402, 124)
Global $Label1 = GUICtrlCreateLabel("", 72, 32, 36, 20)
Global $Label2 = GUICtrlCreateLabel("", 72, 62, 36, 20)

; Set control fonts
GUICtrlSetFont($Label1, 12, 400, 0, "Segoe UI Emoji")
GUICtrlSetFont($Label2, 12, 400, 0, "Segoe UI Symbol")

GUICtrlSetData($Label1, "🔊")
GUICtrlSetData($Label2, "🔊")

GUISetState(@SW_SHOW)

Do
Until GUIGetMsg() = $GUI_EVENT_CLOSE

Displayed form:

image.png.c7c3541940733cd6820808427f2371a5.png


Edit:
In regards to the question in your topic's title and the example script in your initial post:
Even if ChrW()/AscW() would've worked, MsgBox() wouldn't have displayed the speaker symbol correctly. That's because MsgBox(), with default system fonts, does not use a font that will correctly display the speaker symbol.

Edited by TheXman

I think the MsgBox() font is OS dependent and can be configured, but that would change the font of all message boxes and is probably not worth it, since you can create a custom message box fairly easily.

When the words fail... music speaks.


2 hours ago, TheXman said:

If the code that you are referring to is setting the data of a control in a GUI

You are correct. I used it in my WINMM.DLL Media Player. I was wondering if I could use ChrW("unicode code") instead of copying and pasting the character. The fact that AscW("🔊") returned 55357, which is in the range covered by the ChrW() function, made me wonder why I can't use ChrW(55357).

As always, many thanks to every one of you for paying attention to my question.


  • Solution

While AutoIt uses UCS-2 (that is, the Unicode BMP, the first 64k codepoints of Unicode), you can still use the full Unicode range. Not all AutoIt string functions will work correctly above the BMP, but Windows knows how to handle UTF-16LE.

When you paste the 🔊 codepoint somewhere, Windows stores two 16-bit encoding units in UTF-16LE. In this case, the encoding requires a first 16-bit word (called a high surrogate) carrying the upper bits, then a second word (the low surrogate) with the remaining bits of the codepoint value of the speaker glyph, 0x1F50A.

You can do it either way: paste the character with codepoint > 0xFFFF in your source, or build a string containing the surrogate pair. The following code demonstrates this, where the function cw() is a Unicode-aware ConsoleWrite:

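Local $spk = "🔊"
Local $UTF16le = ChrW(0xD83D) & ChrW(0xDD0A)   ; high- then low-surrogates
cw($spk)
cw($UTF16le)

MsgBox(0, "", $spk & @LF & $UTF16le)

; Assumed definition of cw() (not shown in the post): write the text as UTF-8 bytes.
Func cw($s)
    ConsoleWrite(BinaryToString(StringToBinary($s & @LF, 4), 1))
EndFunc   ;==>cw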

My console output (verbatim):

+>Setting Hotkeys...--> Press Ctrl+Alt+Break to Restart or Ctrl+BREAK to Stop.
🔊
🔊
+>17:32:02 AutoIt3.exe ended.rc:0
+>17:32:02 AutoIt3Wrapper Finished.
>Exit code: 0    Time: 17.24

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


As an aside, AscW("🔊") returns the first 16-bit word of the string, that is the high surrogate, namely 0xD83D (= decimal 55357), which doesn't represent anything alone since it misses the low surrogate value.
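
To illustrate, here is a minimal sketch that reads both 16-bit words and recombines them into the codepoint. The helper name _SurrogatesToCodePoint is hypothetical, not an AutoIt built-in, and it assumes its arguments already form a valid high/low pair:

```autoit
; Hypothetical helper: rebuilds a codepoint from a surrogate pair.
; Assumes $iHigh is in 0xD800-0xDBFF and $iLow is in 0xDC00-0xDFFF.
Func _SurrogatesToCodePoint($iHigh, $iLow)
    ; BitShift() with a negative count shifts left.
    Return 0x10000 + BitShift($iHigh - 0xD800, -10) + ($iLow - 0xDC00)
EndFunc   ;==>_SurrogatesToCodePoint

Local $sSpeaker = ChrW(0xD83D) & ChrW(0xDD0A)
Local $iHigh = AscW(StringMid($sSpeaker, 1, 1)) ; same value AscW($sSpeaker) returns
Local $iLow = AscW(StringMid($sSpeaker, 2, 1))  ; the word AscW() alone never sees
ConsoleWrite(Hex(_SurrogatesToCodePoint($iHigh, $iLow), 6) & @CRLF) ; 01F50A
```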


Think of surrogates as values with a 6-bit prefix that identifies each 16-bit word of a UTF-16 string as a high or low surrogate; the remaining 20 bits are split into 10 bits for each word.

For this particular character 🔊 (U+1F50A) you get the encoding by subtracting 0x10000 (codepoints below that fit in a single word and need no surrogates), then the remaining part (F50A = 0000111101 0100001010) is padded to 20 bits if necessary and split with 10 bits in each word. So this is encoded in UTF-16 as D83D DD0A.

Quote

1101100000111101    1101110100001010
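
The arithmetic above can be sketched in a few lines. This is only an illustration for codepoints above 0xFFFF, and the variable names are made up for the example:

```autoit
Local $iCodePoint = 0x1F50A
Local $iOffset = $iCodePoint - 0x10000          ; 0x0F50A, at most 20 bits
Local $iHigh = 0xD800 + BitShift($iOffset, 10)  ; prefix 110110 + upper 10 bits
Local $iLow = 0xDC00 + BitAND($iOffset, 0x3FF)  ; prefix 110111 + lower 10 bits
ConsoleWrite(Hex($iHigh, 4) & " " & Hex($iLow, 4) & @CRLF) ; D83D DD0A
```

Adding 0xD7C0 to the unshifted codepoint's upper bits gives the same high surrogate, since 0xD7C0 is 0xD800 minus 0x40 (0x10000 shifted right by 10).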

Edited by Andreik


Found a function example by trancexx on how to calculate the high and low surrogate.

#include <GUIConstantsEx.au3>

Local $hGUI = GUICreate("Test", 200, 90)
Local $hLabel = GUICtrlCreateLabel("", 5, 5, 190, 130)
GUICtrlSetFont($hLabel, 48)
; https://www.gaijin.at/de/infos/unicode-zeichentabelle-piktogramme-3
GUICtrlSetData($hLabel, " " & _ChrW(0x1F50A) & " " & _ChrW(0x1F525))
GUISetState(@SW_SHOW)

While 1
    $msg = GUIGetMsg()
    If $msg = $GUI_EVENT_CLOSE Then ExitLoop
WEnd

Func _ChrW($iCodePoint) ; By trancexx
    ; https://www.autoitscript.com/forum/topic/149307-another-unicode-question/?do=findComment&comment=1064547
    If $iCodePoint <= 0xFFFF Then Return ChrW($iCodePoint)
    If $iCodePoint > 0x10FFFF Then Return SetError(1, 0, "")
    Local $tOut = DllStructCreate("word[2]")

    Local $high_surrogate = BitShift($iCodePoint, 10) + 0xD7C0
    Local $low_surrogate = BitAND($iCodePoint, 0x3FF) + 0xDC00

    ConsoleWrite("CodePoint = " & @TAB & @TAB & Hex($iCodePoint, 6) & @CRLF) ; 6 digits: codepoints above 0xFFFF need more than 4
    ConsoleWrite("High Surrogate = " & @tab & Hex($high_surrogate, 4) & @CRLF)
    ConsoleWrite("Low Surrogate = " & @tab & Hex($low_surrogate, 4) & @CRLF)
    ConsoleWrite(@CRLF)

    DllStructSetData($tOut, 1, $high_surrogate, 1)
    DllStructSetData($tOut, 1, $low_surrogate, 2)
    Return BinaryToString(DllStructGetData(DllStructCreate("byte[4]", DllStructGetPtr($tOut)), 1), 2)
EndFunc   ;==>_ChrW


Another example for fun 😉.

#include <GUIConstantsEx.au3>

GUICreate("Pictograms Listview", 420, 800)
Local $idListview = GUICtrlCreateListView("Pictograms-1|Pictograms-2|Pictograms-3|Pictograms-4", 10, 10, 400, 780)
GUICtrlSetFont(-1, 40)

; https://www.gaijin.at/de/infos/unicode-zeichentabelle-piktogramme-1
For $i = 0x1F300 To 0x1F3FF
    GUICtrlCreateListViewItem(_ChrW($i) & "|" & _ChrW($i + 256) & "|" & _ChrW($i + 256 + 256) & "|" & _ChrW($i + 256 + 256 + 256), $idListview)
Next

GUISetState(@SW_SHOW)

While 1
    Switch GUIGetMsg()
        Case $GUI_EVENT_CLOSE
            ExitLoop
    EndSwitch
WEnd


Func _ChrW($iCodePoint) ; By trancexx
    ; https://www.autoitscript.com/forum/topic/149307-another-unicode-question/?do=findComment&comment=1064547
    If $iCodePoint <= 0xFFFF Then Return ChrW($iCodePoint)
    If $iCodePoint > 0x10FFFF Then Return SetError(1, 0, "")
    Local $tOut = DllStructCreate("word[2]")

    Local $high_surrogate = BitShift($iCodePoint, 10) + 0xD7C0
    Local $low_surrogate = BitAND($iCodePoint, 0x3FF) + 0xDC00

    ConsoleWrite("CodePoint = " & @TAB & @TAB & Hex($iCodePoint, 6) & @CRLF) ; 6 digits: codepoints above 0xFFFF need more than 4
    ConsoleWrite("High Surrogate = " & @TAB & Hex($high_surrogate, 4) & @CRLF)
    ConsoleWrite("Low Surrogate = " & @TAB & Hex($low_surrogate, 4) & @CRLF)
    ConsoleWrite(@CRLF)

    DllStructSetData($tOut, 1, $high_surrogate, 1)
    DllStructSetData($tOut, 1, $low_surrogate, 2)
    Return BinaryToString(DllStructGetData(DllStructCreate("byte[4]", DllStructGetPtr($tOut)), 1), 2)
EndFunc   ;==>_ChrW

Edited by KaFu

Example of a single glyph (once rendered) using pre-computed surrogates:

; A family with different Fitzpatrick settings = only one glyph
$s = ChrW(0xD83D) & ChrW(0xDC68) & ChrW(0xD83C) & ChrW(0xDFFB) & ChrW(0x200D) & ChrW(0xD83D) & ChrW(0xDC69) & ChrW(0xD83C) & ChrW(0xDFFF) & ChrW(0x200D) & ChrW(0xD83D) & ChrW(0xDC66) & ChrW(0xD83C) & ChrW(0xDFFD)
MsgBox(0, "", $s)

If displayed on a Unicode-aware console (I'm using the DejaVu font), this is

👨🏻‍👩🏿‍👦🏽

whereas in the MsgBox the default font is much less pretty.

EDIT: I just noticed that the HTML-ed string is showing as 3 separate glyphs (ZWJ gets ignored), contrary to what I get displayed here.

2023-07-18_193948.jpg

Edited by jchd


  • 2 weeks later...

For those who prefer to see all the bits at work:

#include <String.au3> ; needed for _StringRepeat()

;~ Variable names
;~ $xyzt
;~  ||||______   M = mask    V = variable part
;~  |||_______   S = surrogate
;~  ||________   H = high    L = low           C = codepoint
;~  |_________   i = int     b = binary        h = hex


Local $iC = 0x1F50A     ; codepoint
ConsoleWrite(Hex($iC, 6) & @LF)
Local $bC = _IntToString($iC, 2)
$bC = StringRight(_StringRepeat("0", 32) & $bC, 32)
ConsoleWrite($bC & @LF)
ConsoleWrite("00000000000<---bHSV--><--bLSV-->" & @LF)
ConsoleWrite(@LF)
Local $iHSM = 0xD7C0
Local $bHSM = _IntToString($iHSM, 2)
Local $hHSM = Hex(_StringToInt($bHSM, 2), 4)
ConsoleWrite("high surrogate mask          = " & $bHSM & "   " & $hHSM & @LF)
Local $bHSV = "00000" & StringMid($bC, 12, 11)
Local $hHSV = Hex(_StringToInt($bHSV, 2), 4)
ConsoleWrite("high surrogate variable part = " & $bHSV & "   " & $hHSV & @LF)
Local $iHS = $iHSM + _StringToInt($bHSV, 2)
Local $hHS = Hex($iHS, 4)
ConsoleWrite("high surrogate               = " & _IntToString($iHS, 2) & "   " & $hHS & @LF)
ConsoleWrite(@LF)
Local $iLSM = 0xDC00
Local $bLSM = _IntToString($iLSM, 2)
Local $hLSM = Hex(_StringToInt($bLSM, 2), 4)
ConsoleWrite("low surrogate mask           = " & $bLSM & "   " & $hLSM & @LF)
Local $bLSV = "000000" & StringRight($bC, 10)
Local $hLSV = Hex(_StringToInt($bLSV, 2), 4)
ConsoleWrite("low surrogate variable part  = " & $bLSV & "   " & $hLSV & @LF)
Local $iLS = $iLSM + _StringToInt($bLSV, 2)
Local $hLS = Hex($iLS, 4)
ConsoleWrite("low surrogate                = " & _IntToString($iLS, 2) & "   " & $hLS & @LF)


Func _StringToInt($s, $base = 16)
    Return DllCall("msvcrt.dll", "int64:cdecl", "_wcstoi64", "wstr", $s, "ptr*", 0, "int", $base)[0]
EndFunc   ;==>_StringToInt

Func _IntToString($i, $base = 16)
    Return DllCall("msvcrt.dll", "wstr:cdecl", "_i64tow", "int64", $i, "wstr", "", "int", $base)[0]
EndFunc   ;==>_IntToString
