Invisible red dot UNICODE character

Skysnake · March 5, 2019

'F‌inal' ; <========= problem between F and i 

'F‌inal' ; <========= problem between F and i 

'Final' ; <========= Notepad ANSI, no problem

[Image: final1.png]

F‌inal

[Image: final2.png]

PostgreSQL complains about that red dot. Notepad++ marks the first word.

Notepad warned that the text contains Unicode. I saved it as ANSI and pasted it back, and the red dot is gone.

The text is entered manually into an AutoIt Input box. It gets processed and ends up in a SQL database. The data seems fine, but when I start generating reports, all kinds of funny problems show up.

If I can ID that character, I can remove it.

Any ideas?

Edited March 5, 2019 by Skysnake
Typo

user4157124 · March 5, 2019

Possibly:

StringRegExpReplace('F‌inal', '[\x{200C}\x{200B}]', '')

Edited March 15, 2019 by user4157124

Subz · March 5, 2019

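You could also remove non-printable characters (note that without the (*UCP) option, [:print:] covers ASCII only, so this strips accented and other non-ASCII characters as well):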

StringRegExpReplace(GuiCtrlRead($sInput), "[^[:print:]]", "")

jchd · March 5, 2019

With a bit broader range of removal:

StringRegExpReplace('F‌inal', '[\x{200B}-\x{200D}]', '')

Note, however, that removing ZWJ (U+200D) may change the meaning of the Unicode text in some languages.

You may also discover other such special codepoints, depending on the nature of the input text.
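To see how often such codepoints actually arrive in the input, one option is to wrap the replacement and pass the replacement count back to the caller. This is only a sketch: `_StripZeroWidth` is a made-up name, and it relies on `StringRegExpReplace` reporting the number of replacements in `@extended`.

```autoit
; Hypothetical helper (not from the thread): strips U+200B..U+200D and
; returns the cleaned string, with the number of removed characters in @extended.
Func _StripZeroWidth($sText)
    Local $sClean = StringRegExpReplace($sText, "[\x{200B}-\x{200D}]", "")
    Local $iRemoved = @extended ; replacements performed by StringRegExpReplace
    Return SetExtended($iRemoved, $sClean)
EndFunc

; Build the sample with ChrW so the invisible character cannot get lost in copy/paste.
Local $sResult = _StripZeroWidth("F" & ChrW(0x200C) & "inal")
Local $iCount = @extended ; read immediately after the call
ConsoleWrite($sResult & " (" & $iCount & " removed)" & @CRLF)
```

A non-zero count on real input is a hint to look at where the text was copied from before it reached the Input box.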

TheDcoder · March 5, 2019

4 hours ago, Skysnake said:

'F‌inal' ; <========= problem between F and i 

'F‌inal' ; <========= problem between F and i 

'Final' ; <========= Notepad ANSI, no problem

Going by the character my browser shows between F and i, it is a zero-width non-joiner (ZWNJ) in Unicode, code point 8204 (U+200C). It is used in some languages with more complex writing systems.
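If the browser is not to be trusted, the character can also be identified from within AutoIt by listing the code point of everything outside printable ASCII. A minimal sketch, assuming the suspect text is already in a string; `_DumpNonAscii` is a hypothetical helper name, not an existing UDF.

```autoit
; Hypothetical helper (not from the thread): reports position and code point
; of every character outside the printable ASCII range 32..126.
Func _DumpNonAscii($sText)
    Local $sReport = "", $iCode
    For $i = 1 To StringLen($sText)
        $iCode = AscW(StringMid($sText, $i, 1))
        If $iCode < 32 Or $iCode > 126 Then
            $sReport &= "Position " & $i & ": U+" & Hex($iCode, 4) & " (" & $iCode & ")" & @CRLF
        EndIf
    Next
    Return $sReport
EndFunc

; Build the sample with ChrW so the invisible character cannot get lost in copy/paste.
ConsoleWrite(_DumpNonAscii("F" & ChrW(0x200C) & "inal"))
```

For the sample string this should report position 2 as U+200C (8204). In practice the argument would be the value read from the Input control instead of the hard-coded sample.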
