Is there any UDF to convert UTF-16 file to UTF-8 file or ANSI formatted file?

chenxu

Is there any UDF to convert UTF-16 file to UTF-8 file or ANSI formatted file?

ProgAndy

Use FileOpen with the different mode options: open the file for reading with its original charset, then create a new file with the desired charset :)

The only problem is finding out which charset the original file uses.
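
One common way to guess the charset of the original file is to look for a byte order mark (BOM) in its first bytes. The sketch below is in Java, the language used later in this thread. The class name `BomSniffer` and the method `detectBom` are made up for illustration, UTF-32 is deliberately ignored, and a file without a BOM still has to be identified some other way.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class BomSniffer {
    /** Returns a Java charset name guessed from a leading BOM, or null if none is found. */
    static String detectBom(byte[] head) {
        if (head.length >= 3 && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        return null; // no BOM: could be ANSI, UTF-8 without BOM, or anything else
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BomSniffer <file>");
            return;
        }
        byte[] head = new byte[3];
        int n = 0;
        try (InputStream in = new FileInputStream(args[0])) {
            // Read up to three bytes; the file may be shorter than that.
            int r;
            while (n < head.length && (r = in.read(head, n, head.length - n)) != -1) {
                n += r;
            }
        }
        String guess = detectBom(Arrays.copyOf(head, n));
        System.out.println(guess == null ? "no BOM found" : guess);
    }
}
```

If the result is null, the file has no BOM and the charset has to come from somewhere else, for example from knowing which program produced the file.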


Squirrely1

Using ProgAndy's idea, I wrote this, but I couldn't test it beyond finding out that it duplicates an ANSI file:

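; Open the source for reading as UTF-16 little endian (mode 32).
; Mode 16 would force binary mode and return the raw bytes instead of text.
Global $opFile = FileOpen("UTF_16_file.txt", 32)
Global $contents = FileRead($opFile)
FileClose($opFile)
; Open the target for overwrite (mode 2) without any Unicode flag.
$opFile = FileOpen("ANSI_version.txt", 2)
FileWrite($opFile, $contents)
FileClose($opFile)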

Das Häschen benutzt Radar

chenxu

This is the file to be converted:

And this is the file with the same content, in the encoding I need:

How do I convert it? Any method, including a third-party utility, is appreciated.

Squirrely1

I used some of Siao's recommendations and code to build your function - it doesn't convert foreign characters not in the target character set, of course ...

Opt("MustDeclareVars", 1)

Global $in_file = @MyDocumentsDir & "\file2beConverted.xml"

Global $out_file = @MyDocumentsDir & "\ConvertedUTF-8.xml"
Convert_EncodedFile($in_file, $out_file, "UTF-8")

$out_file = @MyDocumentsDir & "\ConvertedANSI.xml"
Convert_EncodedFile($in_file, $out_file, "ANSI")


;=================================================================================================
; Name:     Convert_EncodedFile
; Purpose:  Special file encoding conversion
; Notes:    Characters that do not exist in the target character set cannot be converted.
; Authors:  Siao - modified by Squirrely1
;=================================================================================================
Func Convert_EncodedFile($inFile, $outFile, $sEncoding = "ANSI")

    ; FileOpen mode: 2 = overwrite, plus an encoding flag for the out-file
    Local $opFile, $iMode = 2
    Switch $sEncoding; out-file encoding
        Case "Unicode"
            $iMode = 34; 2 + 32 (UTF-16 little endian)

        Case "Unicode big endian"
            $iMode = 66; 2 + 64 (UTF-16 big endian)

        Case "UTF-8"
            $iMode = 130; 2 + 128 (UTF-8)

    EndSwitch

; Read ... (given a file name, FileRead opens and closes the file itself)
    Local $contents = FileRead($inFile)

; Write ...
    $opFile = FileOpen($outFile, $iMode)
    If $opFile = -1 Then
        MsgBox(0, "Error", "Unable to open the file for writing.")
        Exit
    EndIf
; Write through the handle, not the file name, so the encoding in $iMode is applied
    Local $Ret = FileWrite($opFile, $contents)
    FileClose($opFile)
    If Not $Ret Then
        MsgBox(0, "Error", "Error writing to file.")
        Exit
    EndIf

EndFunc;==>Convert_EncodedFile

With the files you provided, the converted output was only missing a font-family and some link text, which does not break the page. You could add a few StringReplace calls to finish building this function.
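
The caveat about characters that do not exist in the target character set can be checked before converting. Here is a rough sketch in Java (the language used later in this thread) that lists the code points a given charset cannot encode. The class name `EncodableCheck` and the method `unmappable` are made up for illustration, and the sample string is only a placeholder.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.LinkedHashSet;
import java.util.Set;

class EncodableCheck {
    /** Collects the distinct code points in text that the named charset cannot encode. */
    static Set<Integer> unmappable(String text, String charsetName) {
        CharsetEncoder enc = Charset.forName(charsetName).newEncoder();
        Set<Integer> missing = new LinkedHashSet<>();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = Character.charCount(cp); // 2 for characters outside the BMP
            if (!enc.canEncode(text.subSequence(i, i + len))) {
                missing.add(cp);
            }
            i += len;
        }
        return missing;
    }

    public static void main(String[] args) {
        // Placeholder text: "Häschen" followed by one CJK character.
        String sample = "H\u00e4schen \u5154";
        for (int cp : unmappable(sample, "US-ASCII")) {
            System.out.printf("cannot encode U+%04X%n", cp);
        }
    }
}
```

An empty result means the whole text fits the target charset. Anything else is a list of the characters that would be lost or replaced during conversion.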

chenxu

Thanks a lot. I wrote a Java application myself and it works fine now. The code is:

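package com.cx.test;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

/**
 * Re-encodes the file named by the first argument from UTF-8 to GB2312, in place.
 *
 * @author Administrator
 */
public class FileConverter {
    public static void main(String args[]) throws Exception {
        if (args.length == 0) {
            System.exit(0);
        }
        File file = new File(args[0]);

        // The number of chars never exceeds the number of bytes, so this buffer is large enough.
        char[] ch = new char[(int) file.length()];
        int len = 0;
        InputStreamReader ir = new InputStreamReader(new FileInputStream(file), "UTF-8");
        try {
            // read() may return fewer chars than requested, so loop until the end of the stream.
            int n;
            while (len < ch.length && (n = ir.read(ch, len, ch.length - len)) != -1) {
                len += n;
            }
        } finally {
            ir.close();
        }

        // Overwrite the same file, writing only the chars that were actually read.
        BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(args[0]), "gb2312"));
        try {
            bw.write(ch, 0, len);
        } finally {
            bw.close();
        }
    }

}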
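
For the UTF-16 to UTF-8 case asked about at the top of the thread, a similar sketch could stream the text through a reader and a writer instead of loading the whole file into memory. The class name `Transcode` and the method `convert` are made up for illustration. It assumes the input and output paths are different files, and it relies on Java's `UTF-16` charset using the BOM to pick the byte order when decoding.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class Transcode {
    /** Copies the text in 'in' (encoded as 'from') to 'out' (encoded as 'to'). */
    static void convert(Path in, Charset from, Path out, Charset to) throws IOException {
        // Both streams are closed automatically, even if an exception is thrown.
        try (BufferedReader r = Files.newBufferedReader(in, from);
             BufferedWriter w = Files.newBufferedWriter(out, to)) {
            char[] buf = new char[8192];
            int n;
            while ((n = r.read(buf)) != -1) {
                w.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java Transcode <utf16-input> <utf8-output>");
            return;
        }
        convert(Paths.get(args[0]), StandardCharsets.UTF_16,
                Paths.get(args[1]), StandardCharsets.UTF_8);
    }
}
```

Readers and writers created through `Files` throw an exception on malformed input or on characters the target charset cannot represent, rather than silently substituting them, so a lossy conversion to something like gb2312 would fail loudly instead of quietly dropping text.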

Squirrely1

So, treasonously, you have decided to turn your back on AutoIt altogether, for this. Well, in your defence, the island of Java is closer to China than is the city where Jon is seeking asylum - Seattle.

