carl1905

Unicode stringToBinary for unsupported language

37 posts in this topic

#1 ·  Posted (edited)

Hi, I'm looking for a way to convert a string back to its original binary form.

The reverse direction has already been solved in the topic "BinaryToString for unsupported language".

How can I convert a Unicode string to its original binary format (Shift-JIS)?

$text(Unicode) = "データのダウンロードに失敗しました。" = "0xFFFEC730FC30BF306E30C030A630F330ED30FC30C9306B303159576557307E3057305F300230"

<Original form>
$stringToBinary(Shift-JIS) = "0x8366815B835E82CC835F83458393838D815B836882C98EB8947382B582DC82B582BD8142"

 

Edited by carl1905




Use _WinAPI_WideCharToMultiByte().


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


#3 ·  Posted (edited)

1 hour ago, jchd said:

Use _WinAPI_WideCharToMultiByte().

Hi. Do you mean something like this? It doesn't seem to work.

#include <WinAPI.au3>
Global Const $CP_SHIFT_JIS = 932

$String = 0xFFFEC730FC30BF306E30C030A630F330ED30FC30C9306B303159576557307E3057305F300230
$test = _WinAPI_MultiByteToWideChar($String, $CP_SHIFT_JIS)
MsgBox($MB_SYSTEMMODAL, "Title", $test)

 

Edited by carl1905


Much simpler:

$text = "データのダウンロードに失敗しました。" ; native AutoIt (UTF-16) string, no BOM
$test = _WinAPI_WideCharToMultiByte($text, $CP_SHIFT_JIS) ; Unicode -> Shift-JIS
MsgBox($MB_SYSTEMMODAL, "Title", $test)

But of course the MsgBox won't display JIS correctly.


#5 ·  Posted (edited)

1 hour ago, jchd said:

Much simpler:

$text = "データのダウンロードに失敗しました。" ; native AutoIt (UTF-16) string, no BOM
$test = _WinAPI_WideCharToMultiByte($text, $CP_SHIFT_JIS) ; Unicode -> Shift-JIS
MsgBox($MB_SYSTEMMODAL, "Title", $test)

But of course the MsgBox won't display JIS correctly.

I don't understand what you mean. What I want to do is read Unicode text from a txt file and convert it to Shift-JIS binary.

Let's assume that I read this Unicode data from the txt file:

"0xFFFEC730FC30BF306E30C030A630F330ED30FC30C9306B303159576557307E3057305F300230"

and store it in $text:

$text = "0xFFFEC730FC30BF306E30C030A630F330ED30FC30C9306B303159576557307E3057305F300230"

So the next step is:

$test = _WinAPI_WideCharToMultiByte($text, $CP_SHIFT_JIS)

And then $test will be:

"0x8366815B835E82CC835F83458393838D815B836882C98EB8947382B582DC82B582BD8142"

Is that right? I think I'm missing something.

Edited by carl1905


You are confusing the hex dump (read as ASCII text) of a binary UTF-16LE file with a UTF-16LE string (i.e. a native AutoIt string). Did you try to run my example code?

As you coded it, the variable $text contains exactly the characters inside the double quotes, i.e. the text "0xFFFE...", which is not what you want.

First, don't read the text as binary, and don't make the BOM part of the string. The simplest way is to FileRead the file into $text.
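To illustrate the difference between a quoted hex dump and a real string, here is a minimal sketch. It assumes a UTF-16LE file with a BOM named sample.txt next to the script (the path is only an example) and relies on FileRead detecting the BOM in text mode.

```autoit
; Assumption: sample.txt is a UTF-16LE text file that starts with a BOM (FF FE).
Local $sPath = @ScriptDir & "\sample.txt"

; Text mode: AutoIt uses the BOM to pick the encoding and returns a native string.
Local $sText = FileRead($sPath)
If @error = 1 Then Exit MsgBox(16, "Error", "Could not read " & $sPath)

; A quoted hex dump is only ASCII text: 6 characters here, not 2 bytes.
ConsoleWrite(StringLen("0xFFFE") & @CRLF)

; If the raw UTF-16LE bytes are ever needed, derive them from the string.
Local $bUtf16 = StringToBinary($sText, 2) ; 2 = UTF-16LE, written without a BOM
ConsoleWrite(BinaryLen($bUtf16) & " bytes" & @CRLF)
```

The string in $sText is what should be handed to the conversion function; the "0x..." text is only how a hex viewer displays the file.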


#7 ·  Posted (edited)

2 hours ago, jchd said:

You are confusing the hex dump (read as ASCII text) of a binary UTF-16LE file with a UTF-16LE string (i.e. a native AutoIt string). Did you try to run my example code?

As you coded it, the variable $text contains exactly the characters inside the double quotes, i.e. the text "0xFFFE...", which is not what you want.

First, don't read the text as binary, and don't make the BOM part of the string. The simplest way is to FileRead the file into $text.

Thank you. Here is a sample txt file and the code I wrote. It generates a Shift-JIS file from the text file, but it has some problems.

1) The original file contains 3 strings, not 2, but 'NEW_sample.txt' contains only 2 strings.

2) I want to append 0x00

$plus = 0x00

at the end of each string, but my code writes 0x30 instead.

#include <File.au3>
#include <Binary.au3>
#include <winapi.au3>
Global Const $CP_SHIFT_JIS = 932
Dim $NEWdata
$TxtPath = FileOpenDialog("Select the TXT file", @ScriptDir, "text files (*.txt)",1)
If @error = 1 Then Exit
_FileReadToArray($TxtPath,$NEWdata)

$strings = 3

$Newtext = ""
$plus = 0x00
For $i = 1 To $strings
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i],"<cf>",@CRLF)
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i],"<lf>",@LF)
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i],"<cr>",@CR)
    $bNewText = _WinAPI_WideCharToMultiByte ($NEWdata[$i],$CP_SHIFT_JIS)
    $Newtext &= $bNewText
    $Newtext &= $plus
Next
$Newfile = $Newtext
$hNewfile = FileOpen ("NEW_"&CompGetFileName($TxtPath), 2+16)
FileWrite ($hNewfile, $Newfile)
FileClose ($hNewfile)
TrayTip ("Import", "Finish!", 3)
sleep (3000)

Func CompGetFileName($Path)
If StringLen($Path) < 4 Then Return -1
$ret = StringSplit($Path,"\",2)
If IsArray($ret) Then
Return $ret[UBound($ret)-1]
EndIf
If @error Then Return -1
EndFunc

sample.txt

Edited by carl1905


#8 ·  Posted (edited)

[Attached screenshot: 2016-05-02_175417.png]

#include <File.au3>
#include <Winapi.au3>
;~ #include <Binary.au3>    ; I don't know what this is supposed to refer to, but move this outside std UDFs directory

Global Const $CP_SHIFT_JIS = 932
Local $TxtPath = FileOpenDialog("Select the TXT file", @ScriptDir, "text files (*.txt)", 1)
If @error = 1 Then Exit
Local $NEWdata = FileReadToArray($TxtPath)
Local $Newtext
For $i = 0 To UBound($NEWdata) - 1
    ; are those 3 replacements necessary?
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i], "<cf>", @CRLF)
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i], "<lf>", @LF)
    $NEWdata[$i] = StringRegExpReplace($NEWdata[$i], "<cr>", @CR)
    $Newtext &= _WinAPI_WideCharToMultiByte($NEWdata[$i], $CP_SHIFT_JIS) & Chr(0)
Next
$hNewfile = FileOpen("NEW_" & CompGetFileName($TxtPath), 2 + 16)
FileWrite($hNewfile, $Newtext)
FileClose($hNewfile)
TrayTip("Import", "Finish!", 3)
Sleep(3000)

Func CompGetFileName($Path)
    If StringLen($Path) < 4 Then Return -1
    $ret = StringSplit($Path, "\", 2)
    If IsArray($ret) Then
        Return $ret[UBound($ret) - 1]
    EndIf
    If @error Then Return -1
EndFunc   ;==>CompGetFileName

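This modified version should do what you want. Beware the difference between 0x00 and Chr(0): 0x00 is the number 0, which the & operator turns into the character "0" (byte 0x30), whereas Chr(0) is the NUL character itself.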
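A minimal sketch of why the original script wrote 0x30: it only uses built-in functions, and the variable names are illustrative.

```autoit
; "&" concatenates strings, so the number 0x00 is converted to the text "0" first.
Local $sWithDigit = "A" & 0x00   ; two characters: "A" and "0"
Local $sWithNul = "A" & Chr(0)   ; "A" followed by a NUL character

; String() shows a Binary value as hex text.
ConsoleWrite(String(StringToBinary($sWithDigit)) & @CRLF) ; 0x4130
ConsoleWrite(String(StringToBinary($sWithNul)) & @CRLF)
```

The second line of output should end in 00 rather than 30, which is the terminator wanted after each string.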

 

Edited by jchd

#9 ·  Posted (edited)

1 hour ago, jchd said:


This modified version should do what you want. Beware the difference between 0x00 and Chr(0).

 

Thank you. However, even with your modified code, the output file on my computer is different from yours.

NEW_sample.txt

Edited by carl1905


#10 ·  Posted

All I can say is that I obtain the result shown when running the exact code on the posted text file verbatim, under both release and beta, both x86 and x64.

Your setup must differ somehow.

 



#11 ·  Posted

4 hours ago, jchd said:

All I can say is that I obtain the result shown running the exact code on the posted text file verbatim, under both release and beta, both X86 and x64.

Your setup must be differing somehow.

 

Well, I tested it on another computer, but the result is still the same as mine. I will check what the problem is.


#12 ·  Posted

It seems that there is nothing wrong with the code, and I use version 3.3.14.0. Even after uninstalling and reinstalling AutoIt, I still have no idea why I can't get the same result.


#13 ·  Posted (edited)

OK?
 

#include <WinAPI.au3>
Global Const $CP_SHIFT_JIS = 932

Global $sFilePathIN = FileOpenDialog("Select the TXT file", @ScriptDir, "text files (*.txt)", 3)
If @error Then Exit
Global $sFilePathOUT = FileSaveDialog("Select local to save output", _SplitPath($sFilePathIN, 6), "Text (*.txt)", 16, "NEW_" & _SplitPath($sFilePathIN, 5))
If @error Then Exit
FileChangeDir(_SplitPath($sFilePathIN, 6))

Global $sHexAdd = "00"

Local $hOpen = FileOpen($sFilePathIN, 16384)
Local $sContent = FileRead($hOpen)
FileClose($hOpen)
$sContent = StringRegExpReplace($sContent, "<cf>", @CRLF)
$sContent = StringRegExpReplace($sContent, "<lf>", @LF)
$sContent = StringRegExpReplace($sContent, "<cr>", @CR)
$sContent = _WinAPI_WideCharToMultiByte($sContent, $CP_SHIFT_JIS)
$hOpen = FileOpen($sFilePathOUT, 2 + 8 + 16)
FileWrite($hOpen, Binary(StringToBinary($sContent) & $sHexAdd))
FileClose($hOpen)
TrayTip("Import", "Finish!", 3)
MsgBox(32, "", "Import Finish!", 3)


Func _SplitPath($sFilePath, $sType = 0)
    Local $sDrive, $sDir, $sFileName, $sExtension, $sReturn
    Local $aArray = StringRegExp($sFilePath, "^\h*((?:\\\\\?\\)*(\\\\[^\?\/\\]+|[A-Za-z]:)?(.*[\/\\]\h*)?((?:[^\.\/\\]|(?(?=\.[^\/\\]*\.)\.))*)?([^\/\\]*))$", 1)
    If @error Then
        ReDim $aArray[5]
        $aArray[0] = $sFilePath
    EndIf
    $sDrive = $aArray[1]
    If StringLeft($aArray[2], 1) == "/" Then
        $sDir = StringRegExpReplace($aArray[2], "\h*[\/\\]+\h*", "\/")
    Else
        $sDir = StringRegExpReplace($aArray[2], "\h*[\/\\]+\h*", "\\")
    EndIf
    $aArray[2] = $sDir
    $sFileName = $aArray[3]
    $sExtension = $aArray[4]
    If $sType = 1 Then Return $sDrive
    If $sType = 2 Then Return $sDir
    If $sType = 3 Then Return $sFileName
    If $sType = 4 Then Return $sExtension
    If $sType = 5 Then Return $sFileName & $sExtension
    If $sType = 6 Then Return $sDrive & $sDir
    If $sType = 7 Then Return $sDrive & $sDir & $sFileName
    Return $aArray
EndFunc   ;==>_SplitPath

 

Function:

#include <WinAPI.au3>

;~ _ConvertEncodingFile(@DesktopDir & "\sample.txt", @DesktopDir & "\NEW_sample.txt", 932) ;iCodepage: SHIFT_JIS = 932

Func _ConvertEncodingFile($sFilePathIN, $sFilePathOUT = Default, $iCodePage = 65001) ;iCodepage: UTF-8=65001
    ; iCodepage: https://msdn.microsoft.com/en-us/library/windows/desktop/dd317756(v=vs.85).aspx
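    If Not FileExists($sFilePathIN) Then Return SetError(1, 0, 0)
    If $sFilePathOUT = Default Then $sFilePathOUT = $sFilePathIN
    Local $hOpen = FileOpen($sFilePathIN, 16384) ; 16384 = $FO_FULLFILE_DETECT
    If $hOpen = -1 Then Return SetError(1, 0, 0)
    Local $sContent = FileRead($hOpen)
    Local $iReadError = @error
    FileClose($hOpen) ; close before any early return so the handle is not leaked
    If $iReadError Or StringStripWS($sContent, 8) = "" Then Return SetError(2, 0, 0)
    $sContent = _WinAPI_WideCharToMultiByte($sContent, $iCodePage, 1)
    If @error Then Return SetError(3, @error, 0)
    $hOpen = FileOpen($sFilePathOUT, 2 + 8 + 16) ; overwrite + create path + binary
    If $hOpen = -1 Then Return SetError(4, 0, 0)
    Local $iWritten = FileWrite($hOpen, $sContent) ; returns 0 on failure
    FileClose($hOpen)
    If Not $iWritten Then Return SetError(4, 0, 0)
    Return 1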
EndFunc   ;==>_ConvertEncodingFile

 

Edited by Trong

Regards,
 


#15 ·  Posted (edited)

On 5/2/2016 at 10:10 PM, carl1905 said:

2) I want to add 0x00

$plus = 0x00

 

 

You need to read and understand the script, then change it to suit your purpose!

#include <WinAPI.au3>
Global Const $CP_SHIFT_JIS = 932

Global $sFilePathIN = FileOpenDialog("Select the TXT file", @ScriptDir, "text files (*.txt)", 3)
If @error Then Exit
Global $sFilePathOUT = FileSaveDialog("Select local to save output", _SplitPath($sFilePathIN, 6), "Text (*.txt)", 16, "NEW_" & _SplitPath($sFilePathIN, 5))
If @error Then Exit
FileChangeDir(_SplitPath($sFilePathIN, 6))

Local $hOpen = FileOpen($sFilePathIN, 16384)
Local $sContent = FileRead($hOpen)
FileClose($hOpen)
$sContent = StringRegExpReplace($sContent, "<cf>", @CRLF)
$sContent = StringRegExpReplace($sContent, "<lf>", @LF)
$sContent = StringRegExpReplace($sContent, "<cr>", @CR)
$sContent = _WinAPI_WideCharToMultiByte($sContent, $CP_SHIFT_JIS, 1)
$hOpen = FileOpen($sFilePathOUT, 2 + 8 + 16)
FileWrite($hOpen, $sContent)
FileClose($hOpen)
TrayTip("Import", "Finish!", 3)
MsgBox(32, "", "Import Finish!", 3)


Func _SplitPath($sFilePath, $sType = 0)
    Local $sDrive, $sDir, $sFileName, $sExtension, $sReturn
    Local $aArray = StringRegExp($sFilePath, "^\h*((?:\\\\\?\\)*(\\\\[^\?\/\\]+|[A-Za-z]:)?(.*[\/\\]\h*)?((?:[^\.\/\\]|(?(?=\.[^\/\\]*\.)\.))*)?([^\/\\]*))$", 1)
    If @error Then
        ReDim $aArray[5]
        $aArray[0] = $sFilePath
    EndIf
    $sDrive = $aArray[1]
    If StringLeft($aArray[2], 1) == "/" Then
        $sDir = StringRegExpReplace($aArray[2], "\h*[\/\\]+\h*", "\/")
    Else
        $sDir = StringRegExpReplace($aArray[2], "\h*[\/\\]+\h*", "\\")
    EndIf
    $aArray[2] = $sDir
    $sFileName = $aArray[3]
    $sExtension = $aArray[4]
    If $sType = 1 Then Return $sDrive
    If $sType = 2 Then Return $sDir
    If $sType = 3 Then Return $sFileName
    If $sType = 4 Then Return $sExtension
    If $sType = 5 Then Return $sFileName & $sExtension
    If $sType = 6 Then Return $sDrive & $sDir
    If $sType = 7 Then Return $sDrive & $sDir & $sFileName
    Return $aArray
EndFunc   ;==>_SplitPath

 

Edited by Trong


#16 ·  Posted

5 minutes ago, VIP said:

 

You need to read and understand the script, then change it to suit your purpose!


 

Ok. I'll try it.


#17 ·  Posted (edited)

On 2016. 5. 2. at 6:59 AM, jchd said:

Your setup must be differing somehow.

Hi, jchd. I figured out what caused the difference. When I change the system locale to English, I get the same result as you. However, when I change the locale back to Korean, I get the wrong result. You can check this yourself.

Can AutoIt handle the locale as well? I mean, is there a way to make the modified code work on a system with a multibyte locale, or do I have to run it under an English locale only?

Edited by carl1905


#18 ·  Posted

Jeez, I don't see right now where the locale impacts processing. To find out, can you add debug statements (BinaryLen, MsgBox and StringLen, for instance) at various places of the script and compare between En and Kr locale settings. BTW I'm using French locale but that doesn't differ much from En as far as encoding is concerned.



#19 ·  Posted (edited)

47 minutes ago, jchd said:

Jeez, I don't see right now where the locale impacts processing. To find out, can you add debug statements (BinaryLen, MsgBox and StringLen, for instance) at various places of the script and compare between En and Kr locale settings. BTW I'm using French locale but that doesn't differ much from En as far as encoding is concerned.

Yes. For example, in this sample code, $output_len differs by locale: the English locale gives 16, but the Korean locale gives 8.

#include <WinAPI.au3>

Global Const $CP_SHIFT_JIS = 932

$string = "よ、未来の美人!"
$unicode_len = BinaryLen($string)
MsgBox(0,"title",$unicode_len) ;=> It says 8 under both the English and the Korean locale.

$output = _WinAPI_WideCharToMultiByte ($string , 932 , True )
$output_len = BinaryLen($output)
MsgBox(0,"title",$output_len) ;=> It says 16 under the English locale, but 8 under the Korean locale.

$hNewfile = FileOpen("NEW_test.txt", 2 + 16)
FileWrite($hNewfile, $output)
FileClose($hNewfile)
TrayTip("Import", "Finish!", 3)
Sleep(3000)

 

Edited by carl1905


#20 ·  Posted

3 hours ago, carl1905 said:

It says 16 under the English locale, but 8 under the Korean locale.

That is expected: only Asian (Korean, Japanese, ...) code pages use actual multibyte encoding, so a Korean system pairs the returned bytes into double-byte characters and counts 8. Western code pages only see a string of bytes, each of them being an individual character, hence 16.

What surprises me more is that you only get 2 strings in your output under the Korean locale, while in the hex dump of the result we (Westerners) can see 3 strings.
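One way to take the locale out of the picture is to keep the converted text as Binary from start to finish, so it is never read back through the system ANSI code page. The sketch below calls the Win32 WideCharToMultiByte function directly through DllCall and reads the result from a byte[] buffer. _StringToCodePageBinary and the output file name are hypothetical, the script is assumed to be saved in a Unicode encoding so the Japanese literal survives, and it should be treated as an untested sketch rather than a confirmed fix for the Korean-locale result.

```autoit
; Hypothetical helper (not part of the standard UDFs):
; converts a native AutoIt string to bytes in the given code page.
Func _StringToCodePageBinary($sText, $iCodePage = 932)
    If $sText = "" Then Return Binary("")
    ; First call: ask how many bytes are needed. A length of -1 means the
    ; input is NUL-terminated, so the count includes a trailing 0x00.
    Local $aCall = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "uint", $iCodePage, "dword", 0, "wstr", $sText, "int", -1, _
            "ptr", 0, "int", 0, "ptr", 0, "ptr", 0)
    If @error Or $aCall[0] = 0 Then Return SetError(1, 0, Binary(""))
    Local $iBytes = $aCall[0]
    ; A byte[] buffer (not char[]) keeps the data as raw bytes.
    Local $tBuffer = DllStructCreate("byte[" & $iBytes & "]")
    $aCall = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
            "uint", $iCodePage, "dword", 0, "wstr", $sText, "int", -1, _
            "struct*", $tBuffer, "int", $iBytes, "ptr", 0, "ptr", 0)
    If @error Or $aCall[0] = 0 Then Return SetError(2, 0, Binary(""))
    Return DllStructGetData($tBuffer, 1) ; Binary, ending with the 0x00 terminator
EndFunc   ;==>_StringToCodePageBinary

Local $bShiftJis = _StringToCodePageBinary("データのダウンロードに失敗しました。", 932)
If @error Then Exit MsgBox(16, "Error", "Conversion failed")
ConsoleWrite(String($bShiftJis) & @CRLF) ; compare with the hex in post #1, plus a final 00

Local $hOut = FileOpen(@ScriptDir & "\NEW_test.txt", 2 + 16) ; overwrite + binary
FileWrite($hOut, $bShiftJis)
FileClose($hOut)
```

Because the returned value already ends with 0x00, several strings can be written one after another to get the NUL-separated layout discussed earlier in the thread. If the terminator is not wanted, BinaryMid($bShiftJis, 1, BinaryLen($bShiftJis) - 1) drops it.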

 
