Why do some symbols display and others not?

qwert
Posted (edited)

I've been experimenting with symbol sets after I noticed that certain symbols could be placed in a text control, but others could not be.  Here's a small script I've used to test:

;
;       Font test of symbols
;
#include <GUIConstantsEx.au3>    ; provides $GUI_EVENT_CLOSE

Opt("GUIOnEventMode", 1)         ; use event callbacks instead of a GUIGetMsg loop

GUICreate("Font Test", 400, 200)
GUICtrlCreateLabel("Declared as Segoe UI Semibold:", 20, 25, 300, 20)
GUICtrlCreateInput("This is a test", 20, 50, 360, 30)
GUICtrlSetFont(-1, 14, 400, 0, "Segoe UI Semibold")    ; -1 = the control created last (the input)
GUISetOnEvent($GUI_EVENT_CLOSE, "CLOSE")
GUISetState()

; Idle until the window is closed
While 1
    Sleep(100)
WEnd

Func CLOSE()
    Exit
EndFunc

The captured screen below shows the result of pasting 3 symbols into the text control on the gui.  Two are displayed as "unknown", but the checkmark is accepted.  Yet the checkmark is not a defined character in Segoe UI Semibold (at least not according to the Windows Font feature in control panel).

What is allowing the checkmark to display when it's not in the character set?

Thanks in advance for any help.

 

[Attached image: SymbolTest.PNG]

 

 

Edited by qwert

jchd

I believe I have an explanation.

Both ChrW(0x274E) = ❎ and ChrW(0x2705) = ✅ were only defined in Unicode version 6.0, while ChrW(0x2714) = ✔ (HEAVY CHECK MARK) was already part of Unicode v1.1.

Unicode v6.0 was released in October 2010, whereas Win7 Pro x64 dates from late 2009.

My Win7 Pro does show ✔ and ❎ in the charset applet but classifies the latter as "not a character".

I suppose ❎ is part of the Segoe UI Symbol font file, but is tagged as "not a char" in Windows' internal tables, which the Unicode renderer uses.
I'm afraid you'd have to use a more recent version of Windows or apply some magic update unknown to me, but I have no other information on this point.
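
To reproduce the test without relying on the clipboard, the three characters can be built from their codepoints. This is only a minimal sketch, assuming the same Segoe UI Semibold input control as in the test script above; the window title and variable names are illustrative.

```autoit
; Sketch: place the three test symbols in an input control by codepoint.
#include <GUIConstantsEx.au3>

; U+2714 HEAVY CHECK MARK, U+274E and U+2705 (both added in Unicode 6.0)
Local $sSymbols = ChrW(0x2714) & " " & ChrW(0x274E) & " " & ChrW(0x2705)

GUICreate("Codepoint Test", 400, 120)
Local $idInput = GUICtrlCreateInput($sSymbols, 20, 40, 360, 30)
GUICtrlSetFont($idInput, 14, 400, 0, "Segoe UI Semibold")
GUISetState(@SW_SHOW)

; Message-loop mode: wait until the window is closed
While GUIGetMsg() <> $GUI_EVENT_CLOSE
WEnd
```

Running the same script on Win7 and Win10 should show whether the outcome depends on the Windows version rather than on how the text was pasted.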

Also note that I pasted black characters for both ❎ and ✅, yet the code editor here shows them in color, probably to flag a possible issue in displaying them in some browser setups.


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

qwert

Thanks for your response.  Those are good points to consider.

Given the updates to the character set definitions over the years, it sounds likely that the "interpreted" character set displayed by Control Panel differs from the actual font definition file ... and the differences were never reconciled.

And it's true that I've done my testing on Win7 Pro 64-bit, so there are probably other differences with Win10.  I'll try my demo script in Win10 this week.

Ultimately, I'd like to identify a set of "safe" symbols that can be used in both Win7 and Win10, but that don't require the excesses of fonts like Segoe UI Symbol.

I appreciate your help.

jchd
24 minutes ago, qwert said:

Given the updates to the character set definitions over the years, it sounds likely that the "interpreted" character set displayed by Control Panel differs from the actual font definition file ... and the differences were never reconciled.

Things work differently: font files define only the parameters for drawing the glyphs, along with other parameters about their rendering. They don't contain any Unicode property for the codepoints they support, other than the codepoint itself. Unicode properties are stored only once, in the renderer, since they are independent of fonts.
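
To see what a given font file itself covers, one can ask GDI for the glyph index of a codepoint. The following is a rough sketch only: `_FontHasGlyph` is a hypothetical helper name, the calls go straight to the Win32 functions `CreateFontW` and `GetGlyphIndicesW` through `DllCall`, and it only handles BMP codepoints. If the named face is not installed, GDI silently selects a substitute, so the answer then describes that substitute.

```autoit
; Sketch: ask GDI whether a font's own glyph table covers a BMP codepoint.
; _FontHasGlyph is a hypothetical helper name, not an AutoIt built-in.
Func _FontHasGlyph($sFontName, $iCodepoint)
    Local Const $GGI_MARK_NONEXISTING_GLYPHS = 1
    Local $bFound = False

    ; Device context of the screen, used only to select the font into
    Local $aDC = DllCall("user32.dll", "handle", "GetDC", "hwnd", 0)
    If @error Or Not $aDC[0] Then Return SetError(1, 0, False)

    ; CreateFontW takes 13 numeric parameters followed by the face name;
    ; all are left at 0 except the character set (1 = DEFAULT_CHARSET)
    Local $aFont = DllCall("gdi32.dll", "handle", "CreateFontW", _
            "int", 0, "int", 0, "int", 0, "int", 0, "int", 0, _
            "dword", 0, "dword", 0, "dword", 0, "dword", 1, _
            "dword", 0, "dword", 0, "dword", 0, "dword", 0, _
            "wstr", $sFontName)
    If Not @error And $aFont[0] Then
        Local $aOld = DllCall("gdi32.dll", "handle", "SelectObject", "handle", $aDC[0], "handle", $aFont[0])

        ; One WORD receives the glyph index; 0xFFFF marks a missing glyph
        Local $tIndex = DllStructCreate("word")
        Local $aRet = DllCall("gdi32.dll", "dword", "GetGlyphIndicesW", "handle", $aDC[0], _
                "wstr", ChrW($iCodepoint), "int", 1, "struct*", $tIndex, _
                "dword", $GGI_MARK_NONEXISTING_GLYPHS)
        ; On success the call returns the number of characters converted (1 here)
        If Not @error And $aRet[0] = 1 Then
            $bFound = (DllStructGetData($tIndex, 1) <> 0xFFFF)
        EndIf

        ; Restore the previous font and free the GDI objects
        DllCall("gdi32.dll", "handle", "SelectObject", "handle", $aDC[0], "handle", $aOld[0])
        DllCall("gdi32.dll", "bool", "DeleteObject", "handle", $aFont[0])
    EndIf
    DllCall("user32.dll", "int", "ReleaseDC", "hwnd", 0, "handle", $aDC[0])
    Return $bFound
EndFunc

ConsoleWrite("U+2714 in Segoe UI Semibold: " & _FontHasGlyph("Segoe UI Semibold", 0x2714) & @CRLF)
```

Keep in mind that Windows may draw a character with a glyph taken from another font when the requested one lacks it, so this check and what a control actually displays can differ.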

Discussions in a large consortium like Unicode take time, often a very long time. When definitions for a new block are being considered, some codepoints are quickly agreed upon while others are subject to much longer scrutiny. This is necessary because once something is published in an official release, it can never be changed. So it's common for fonts to include not-yet-official characters in anticipation, relying on the fact that they are already carved in stone even if the official release is still a work in progress.

You can browse the whole Unicode charset using the resources offered by this website: https://r12a.github.io/
You'll probably learn a lot about scripts and Unicode.

It's one of the most comprehensive sources of information about Unicode, along with the Unicode website itself, but the latter is often dry reading.

In particular use the Uniview applet to display most details about a block of characters, including the Unicode version which defined it (use the Filter tab and hit the "Show age" arrow when a block is displayed).


qwert

jchd, that's an overwhelming amount of information ... but thanks.  I've never looked into how Unicode was developed or managed.  But I now have an inkling, anyway, for the whys and hows of the character sets.

What I've concluded is that I'll have to use a basic empirical method to determine which symbols I can use.  If a symbol works in both Win7 and Win10, then I'll trust that it's there for the long term.  ("... it's common that fonts include not-yet-official characters ...")

The advantage I have is that I only need 6 or 8.  And if it turns out that one isn't truly supported, I'll switch to another.  But over the long term, I'll have to keep in mind that I'm operating "off the menu", so to speak.

However it works out for me, I appreciate the light you've shone on this subject.  Thanks.

 

 

jchd

What you can do to determine which characters are supported in Win7 (implying they are supported forever) is to write a simple program that displays successive blocks of characters in the range of interest (you can probably skip foreign scripts you don't use and exotic blocks like Egyptian hieroglyphs, ancient musical symbols, etc.). Use a MsgBox to display small blocks of a few lines of, say, 16 chars each. You'll notice which ones are supported if you restrict the ranges to what's useful for you. Don't force a font for the MsgBox; let Windows choose.

To select which blocks may be useful, pick the symbol and dingbat blocks from this list. Don't try anything outside the BMP (Basic Multilingual Plane), since very few fonts include anything beyond the BMP and those that do aren't supported in old Windows versions.
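
The approach described above could look roughly like this. It is a sketch, not a finished tool: the range U+2600 to U+27BF (the Miscellaneous Symbols and Dingbats blocks) and the page size of 8 lines are assumptions chosen for illustration, and no font is forced on the MsgBox.

```autoit
; Sketch: page through a codepoint range in MsgBox, 16 characters per line.
; The range below (Miscellaneous Symbols + Dingbats) is an assumed example.
Local Const $iFirst = 0x2600, $iLast = 0x27BF
Local Const $iPerLine = 16, $iLinesPerBox = 8

Local $sText = "", $iLines = 0
For $iCode = $iFirst To $iLast Step $iPerLine
    ; Row label such as "U+2600:" followed by the characters of that row
    $sText &= "U+" & Hex($iCode, 4) & ":"
    For $i = 0 To $iPerLine - 1
        If $iCode + $i <= $iLast Then $sText &= " " & ChrW($iCode + $i)
    Next
    $sText &= @CRLF
    $iLines += 1

    ; Show the page when it is full or when the range is exhausted
    If $iLines = $iLinesPerBox Or $iCode + $iPerLine > $iLast Then
        ; Flag 1 = OK/Cancel buttons; MsgBox returns 2 when Cancel is pressed
        If MsgBox(1, "Symbol test", $sText) = 2 Then ExitLoop
        $sText = ""
        $iLines = 0
    EndIf
Next
```

Changing `$iFirst` and `$iLast` is enough to inspect another block; ChrW only covers the BMP, which matches the restriction above.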

I attach an Excel file offering a limited list of symbols and dingbats, based on Unicode v5.2. Unsupported characters display as []

U_symbols.xlsx

 


qwert
Posted (edited)

jchd, thanks.  That's a very useful spreadsheet. (It's a welcome relief after dealing with the Win7 "find character" feature for so many days.)

About a third of those you've listed show only a white square in Segoe UI ... and far more in Arial.  (Column B can easily be set to any font ... and even replicated for side-by-side comparisons.) By zooming the sheet, it's very easy to see what the working ones actually look like.

Nice.

 

Edited by qwert

jchd

That was my goal in extracting a limited range from my own SQLite Unicode database: to make it easier for you to perform such comparisons.
I've restricted the table to codepoints having general category = 'SO' (symbol, other) and excluded a number of small ranges of no interest here.

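-- Extract symbols with their glyph and character name
select printf('U+%s', codepoint) "U_Codepoint", Glyph, CharacterName from unicodedata
where
generalcategory like 'so%' and              -- general category So (symbol, other)
codepoint like '02%' and                    -- codepoint text starting with '02'
bidiclass like 'on' and                     -- bidi class ON (other neutral)
charactername not like 'cjk%' and           -- the remaining conditions drop ranges of no interest here
charactername not like 'kan%' and
charactername not like 'ocr%' and
charactername not like 'coptic%' and
charactername not like 'ideographic%' and
charactername not like 'metrical%';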

My Unicode data is outdated, since it must come from version 5.2 or 6.0 (that is, from years ago), but it still fits my needs for now.

