daywalkereg

help in unicodes or ALT codes

6 posts in this topic

I'm working on a program that reads some files and transforms them into a specific format, but I found that the files encode Arabic letters as Unicode codes that I can't convert back to the original Arabic text with AutoIt, and I cannot find any solution. They look something like this:

u0628, u0633, u0625 or u064a

The only thing I found is this page (u0628), so please, can anyone help me with this?


1 £0\\/3 |-|3® $0 |\\/|µ(|-|

I bet you are interpreting data captured from JSON format. Did you forget to mention that there is a backslash before the u (\u1111\u2222\u3333), or is your data really like u1111u2222u3333?

You can use this to get your Unicode characters correctly; it should work (I use \\ in the pattern to match a single backslash in your stream). Note ChrW rather than Chr, since Arabic code points are above 255.

$s = StringRegExpReplace($s, "\\u([[:xdigit:]]{2,4})", '" & ChrW(0x$1) & "')
$s = Execute('"' & $s & '"')

This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)

Thanks for your reply, but I have the chars in this format: "\u0628\u0633 \u0625\u064a\u0647", so I just need to translate each code into its original form.

Then, this will surely work:

Local $str = "Here is your example codes:\u0634\u0628\u0633where you can find other text as well\u0625\u064a\u0647\u0638"
Local $s = StringRegexpReplace($str, "\\u([[:xdigit:]]{2,4})", '"&chrw(0x$1)&"')
$s = '"' & $s & '"'
$s = Execute($s)
_ConsoleWrite($s & @LF)

It produces this (screenshot attachment: post-44800-12635150202834_thumb.jpg).
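A note for later readers: the Execute trick above wraps the whole text in a quoted AutoIt string, so input that itself contains double quotes (raw JSON usually does) would break the expression. Here is a minimal sketch of one way around that, assuming you double the quotes first. $sRaw and its sample content are made up for illustration, and the pattern here insists on exactly 4 hex digits.

```autoit
; Hypothetical sample input containing double quotes, as raw JSON text usually does
Local $sRaw = '{"name": "\u0628\u0633"}'

; Double every " so it survives being wrapped in a quoted AutoIt string literal
Local $sSafe = StringReplace($sRaw, '"', '""')

; Splice a ChrW() call into the expression for each \uXXXX escape
Local $sExpr = '"' & StringRegExpReplace($sSafe, "\\u([[:xdigit:]]{4})", '"&ChrW(0x$1)&"') & '"'

; Evaluate the expression; Execute sets @error if it cannot be evaluated
Local $sDecoded = Execute($sExpr)
If @error Then ConsoleWrite("Execute failed on the generated expression" & @LF)
```

With the quotes doubled, the text can no longer close the string literal early, so only the ChrW() calls generated by the regexp are evaluated.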

Notice that I use a special version of ConsoleWrite that is able to display Unicode. This also requires switching the code page of SciTE to Unicode rather than User.

Func _ConsoleWrite($sString)
    ; First call: no output buffer, so the API only returns the required size in bytes (65001 = UTF-8)
    Local $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sString, "int", -1, _
                                "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    If @error Then Return SetError(1, @error, 0)
    ; Second call: convert the UTF-16 string into a buffer of that size
    Local $tText = DllStructCreate("char[" & $aResult[0] & "]")
    $aResult = DllCall("kernel32.dll", "int", "WideCharToMultiByte", "uint", 65001, "dword", 0, "wstr", $sString, "int", -1, _
                            "ptr", DllStructGetPtr($tText), "int", $aResult[0], "ptr", 0, "ptr", 0)
    If @error Then Return SetError(2, @error, 0)
    ; Write the UTF-8 bytes to the console
    ConsoleWrite(DllStructGetData($tText, 1))
EndFunc
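
For readers who would rather avoid the DllCall, the same idea (hand ConsoleWrite the UTF-8 bytes) can probably be sketched with built-in functions only. This assumes an AutoIt version where StringToBinary and BinaryToString accept the encoding flags used below. The name _ConsoleWriteUtf8 is made up for this example, and I have not checked it on every ANSI code page.

```autoit
; Hypothetical helper name, not part of the original post
Func _ConsoleWriteUtf8($sString)
    ; StringToBinary flag 4: encode the string as UTF-8 bytes
    ; BinaryToString flag 1: reinterpret those bytes as an ANSI string,
    ; so that ConsoleWrite passes them through byte for byte
    ConsoleWrite(BinaryToString(StringToBinary($sString, 4), 1))
EndFunc
```

Usage is the same as for _ConsoleWrite, e.g. _ConsoleWriteUtf8($s & @LF), and the SciTE code page still has to be set to Unicode.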

Tell me if this works for you.



OMG, that worked very well, thank you very much :D



Glad it helped. BTW, say hello to your local tetrahedra :D

For your information, and for subsequent readers interested in this thread, there is another pseudo-standard for embedding characters in a JSON or JSON-like stream, where characters are coded as in "uuuuuu\xabvvvvv", with \xab the two-digit hexadecimal representation of the character. I have never encountered a 16-bit version of this (e.g. \xabcd), but it might be in use somewhere.

The regexp to use is very easy to deduce: $s = StringRegExpReplace($s, "\\(x[[:xdigit:]]{2})", '"&Chr(0$1)&"')

If you wish to handle both cases \u and \x with 2 or 4 hex digits, simply use: $s = StringRegExpReplace($s, "\\[xu]([[:xdigit:]]{2,4})", '"&ChrW(0x$1)&"') and then Execute the quoted result as before. (The x or u must stay outside the capture group here, because 0u0628 is not a valid hex literal, and ChrW is needed for code points above 255.)

It's a bit imprecise in that it will decode \uabc with only 3 hex digits without warning (I bet this is very unlikely to be used anywhere), but the \u or \x guard should make it relatively hard to confuse with an unexpected occurrence in random text, especially given that \u or \x sequences are expected in said text.
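
If the imprecision matters, a stricter decoder can be sketched without Execute at all, by walking the string once and converting each escape with Dec and ChrW. This is only a sketch under stated assumptions: _DecodeEscapes is a made-up name, it accepts exactly 4 hex digits after \u and exactly 2 after \x, and it does not treat a doubled backslash (\\u0628) specially.

```autoit
; Hypothetical helper: decodes \uXXXX and \xXX in a single pass, without Execute
Func _DecodeEscapes($sText)
    Local $sOut = "", $iPos = 1, $aMatch, $iErr, $iNext, $iStart
    While 1
        ; Flag 1 returns the captured group of the first match at or after $iPos
        $aMatch = StringRegExp($sText, "\\(u[[:xdigit:]]{4}|x[[:xdigit:]]{2})", 1, $iPos)
        $iErr = @error
        $iNext = @extended ; position just past the match
        If $iErr Then ExitLoop
        ; The match is the backslash plus the captured text, so it starts here
        $iStart = $iNext - StringLen($aMatch[0]) - 1
        ; Copy the literal text that precedes the escape, if any
        If $iStart > $iPos Then $sOut &= StringMid($sText, $iPos, $iStart - $iPos)
        ; Drop the leading u or x and convert the hex digits to a character
        $sOut &= ChrW(Dec(StringTrimLeft($aMatch[0], 1)))
        $iPos = $iNext
    WEnd
    ; Append whatever follows the last escape
    Return $sOut & StringMid($sText, $iPos)
EndFunc
```

Because the text is scanned only once, a decoded backslash is never re-read as the start of another escape, and quotes in the input need no special handling. A \uabc with only 3 hex digits is left untouched.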

