lsakizada

Please help me convert Unicode to UTF-8

3 posts in this topic

#1 ·  Posted (edited)

Can someone help me understand why my script does not convert Unicode to UTF-8?

I took the code of the Unicode2Utf8 function from a forum post somewhere.

if $CmdLine[0] <> 2 then 
    MsgBox(0,0,"Uses: U2UTF8 'Path to Source Unicode File' 'Path to Destination UTF-8 File' ")
    exit
EndIf
    
Dim $UnicodeFile    = $CmdLine[1]
Dim $UTF8FILE       = $CmdLine[2]   


$File1 = FileOpen($UnicodeFile, 4); 4 - raw read mode
$Unicode = FileRead($File1, FileGetSize($UnicodeFile))

consolewrite ($Unicode & @CRLF)
$UTF8String = Unicode2Utf8($Unicode)


consolewrite ($UTF8String & @CRLF)
FileClose ($File1)

$file = FileOpen($UTF8FILE,128+2)
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf
FileWrite($file, $UTF8String)
FileClose ($file)

Func Unicode2Utf8($UniString)
    
    If Not IsBinary($UniString) Then
        SetError(1)
        MsgBox(0,0,"not binary")
        Return $UniString
    EndIf

    Local $UniStringLen = StringLen($UniString)
    Local $BufferLen = $UniStringLen * 2
    Local $Input = DllStructCreate("byte[" & $BufferLen & "]")
    Local $Output = DllStructCreate("char[" & $BufferLen & "]")
    DllStructSetData($Input, 1, $UniString)
    Local $Return = DllCall("kernel32.dll", "int", "WideCharToMultiByte", _
        "int", 65001, _
        "int", 0 , _
        "ptr", DllStructGetPtr($Input), _
        "int", $UniStringLen / 2, _
        "ptr", DllStructGetPtr($Output), _
        "int", $BufferLen, _
        "int", 0, _
        "int", 0)   
    Local $Utf8String = DllStructGetData($Output, 1)
    $Output = 0
    $Input = 0
    Return $Utf8String
EndFunc
Edited by lsakizada

Be Green Now or Never (BGNN)!


I think this happens because you are working with a binary string but then try to write it in UTF-8 mode. Changing the FileOpen flags for writing to 16+2 (binary mode plus overwrite) should solve the problem.
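
To illustrate the point above, here is a minimal sketch of what encoding the same text to UTF-8 twice looks like. It is not part of the original script: the sample character and the variable names are made up for the example, and the bytes produced by the second encoding depend on the system's ANSI code page.

```autoit
; Illustration only: encoding text to UTF-8 twice changes the bytes.
Local $sText = ChrW(233)                     ; the single character U+00E9
Local $bOnce = StringToBinary($sText, 4)     ; flag 4 = UTF-8, gives 0xC3A9 (2 bytes)

; Treat the UTF-8 bytes as ANSI text (flag 1), as can happen when already
; encoded data is handled as an ordinary string, then encode to UTF-8 again.
Local $sMisread = BinaryToString($bOnce, 1)
Local $bTwice = StringToBinary($sMisread, 4)

ConsoleWrite("Encoded once:  " & $bOnce & @CRLF)
ConsoleWrite("Encoded twice: " & $bTwice & @CRLF)
```

Writing the already-encoded bytes to a file opened in binary mode skips that second step, which is presumably why 16+2 helps here.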

But if you are on a Unicode OS (2000/XP/Vista), you don't need Unicode2Utf8 at all. The Unicode build of AutoIt does the job, and your code can be just:

If $CmdLine[0] <> 2 Then
    MsgBox(0, 0, "Usage: U2UTF8 'Path to Source Unicode File' 'Path to Destination UTF-8 File'")
    Exit
EndIf

Dim $UnicodeFile = $CmdLine[1]
Dim $UTF8FILE    = $CmdLine[2]

; 0 = read mode (text), so FileRead returns a string, not raw bytes
$File1 = FileOpen($UnicodeFile, 0)
If $File1 = -1 Then
    MsgBox(0, "Error", "Unable to open source file.")
    Exit
EndIf
$Unicode = FileRead($File1, FileGetSize($UnicodeFile))
FileClose($File1)

; 128 = UTF-8 mode, 2 = write mode (erase previous contents)
$file = FileOpen($UTF8FILE, 128 + 2)
If $file = -1 Then
    MsgBox(0, "Error", "Unable to open file.")
    Exit
EndIf
FileWrite($file, $Unicode)
FileClose($file)
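
If you would rather make the conversion explicit than rely on the FileOpen modes, a sketch along the following lines may also work. It assumes the source file is UTF-16 little-endian (what Windows usually calls "Unicode") and an AutoIt version that provides BinaryToString and StringToBinary. The function name _Utf16LeFileToUtf8 and its parameters are invented for this example and are not part of AutoIt or of the scripts above.

```autoit
; Sketch: convert a UTF-16 LE file to UTF-8 with explicit decode/encode steps.
; _Utf16LeFileToUtf8 is a hypothetical helper name, not a built-in function.
Func _Utf16LeFileToUtf8($sSource, $sDest)
    Local $hIn = FileOpen($sSource, 16)          ; 16 = binary mode, reading
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $bRaw = FileRead($hIn)                 ; raw bytes, including any BOM
    FileClose($hIn)

    Local $sText = BinaryToString($bRaw, 2)      ; flag 2 = decode as UTF-16 LE
    Local $bUtf8 = StringToBinary($sText, 4)     ; flag 4 = encode as UTF-8

    Local $hOut = FileOpen($sDest, 16 + 2)       ; binary mode + erase previous contents
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, $bUtf8)                     ; binary data goes out byte for byte
    FileClose($hOut)
    Return True
EndFunc
```

It could be called as _Utf16LeFileToUtf8($CmdLine[1], $CmdLine[2]). Because the bytes are decoded as-is, a byte order mark at the start of the source should end up as a UTF-8 BOM in the output.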



Wow! Amazing solution. Thank you very much, LazyCAT!


