chenxu

Is there any UDF to convert UTF-16 file to UTF-8 file or ANSI formatted file?


Use FileOpen with different mode options: open the file for reading with its original charset, then create a new file with the desired charset :)

The only problem is finding out which charset the original file uses.
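
One common way to guess the original charset is to look for a byte order mark (BOM) at the start of the file. Here is a minimal sketch in Java, the language used later in this thread. The class name `BomSniffer` and the method `detectBom` are made up for illustration, and files without a BOM (plain ANSI, or UTF-8 saved without one) will simply report that nothing was found.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class BomSniffer {
    /**
     * Returns the charset indicated by a leading BOM, or null if none is found.
     * Limitation: FF FE is also the start of a UTF-32 LE BOM; that case is ignored here.
     */
    public static Charset detectBom(byte[] head) {
        int b0 = head.length > 0 ? head[0] & 0xFF : -1;
        int b1 = head.length > 1 ? head[1] & 0xFF : -1;
        int b2 = head.length > 2 ? head[2] & 0xFF : -1;
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return StandardCharsets.UTF_8;
        if (b0 == 0xFF && b1 == 0xFE) return StandardCharsets.UTF_16LE;
        if (b0 == 0xFE && b1 == 0xFF) return StandardCharsets.UTF_16BE;
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BomSniffer <file>");
            return;
        }
        // Only the first three bytes are needed to recognise these BOMs.
        byte[] head = new byte[3];
        int n;
        try (InputStream in = new FileInputStream(args[0])) {
            n = in.read(head);
        }
        Charset cs = detectBom(Arrays.copyOf(head, Math.max(n, 0)));
        System.out.println(cs == null ? "no BOM found" : cs.name());
    }
}
```

If a BOM is found, the returned charset can be used when opening the file for reading; if not, the encoding still has to be known or guessed some other way.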




Using progAndy's idea, I wrote this, but I couldn't test it except to find out that it duplicates an ANSI file:

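; Mode 32 = read as UTF-16 Little Endian (mode 16 would read raw binary instead).
Global $opFile = FileOpen("UTF_16_file.txt", 32)
Global $contents = FileRead($opFile)
FileClose($opFile)
; Mode 2 = overwrite, using AutoIt's default text encoding for writing.
$opFile = FileOpen("ANSI_version.txt", 2)
FileWrite($opFile, $contents)
FileClose($opFile)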


This is the file to be converted:

And this is the file with the same content that I need:

How can I convert it? Any method, including a third-party utility, is appreciated.


I used some of Siao's recommendations and code to build your function. Of course, it can't convert foreign characters that don't exist in the target character set ...

Opt("MustDeclareVars", 1)

Global $in_file = @MyDocumentsDir & "\file2beConverted.xml"

Global $out_file = @MyDocumentsDir & "\ConvertedUTF-8.xml"
Convert_EncodedFile($in_file, $out_file, "UTF-8")

$out_file = @MyDocumentsDir & "\ConvertedANSI.xml"
Convert_EncodedFile($in_file, $out_file, "ANSI")


;=================================================================================================
; Name:     Convert_EncodedFile
; Purpose:  Special file encoding conversion
; Notes:    Characters that don't exist in the target character set cannot be converted.
; Authors:  Siao - modified by Squirrely1
;=================================================================================================
Func Convert_EncodedFile($inFile, $outFile, $sEncoding = "ANSI")

    ; Mode 2 = overwrite; the added flags select the output encoding.
    Local $opFile, $iMode = 2
    Switch $sEncoding; out-file encoding
        Case "Unicode"
            $iMode = 34

        Case "Unicode big endian"
            $iMode = 66

        Case "UTF-8"
            $iMode = 130

    EndSwitch

; Read ... (FileRead with a file name opens and closes the file itself)
    Local $contents = FileRead($inFile)
    If @error = 1 Then
        MsgBox(0, "Error", "Unable to read the input file.")
        Exit
    EndIf

; Write ...
    $opFile = FileOpen($outFile, $iMode)
    If $opFile = -1 Then
        MsgBox(0, "Error", "Unable to open the file for writing.")
        Exit
    EndIf
    ; Write through the handle so the encoding chosen in FileOpen is used.
    Local $Ret = FileWrite($opFile, $contents)
    FileClose($opFile)
    If Not $Ret Then
        MsgBox(0, "Error", "Error writing to file.")
        Exit
    EndIf

EndFunc;==>Convert_EncodedFile

With the files you provided, the converted output was only missing a font-family and some link text, and the page itself did not break. You could add a few StringReplace commands to finish building this function.
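
To see in advance which characters would be lost, the text can be checked against the target charset before converting. This is a rough sketch in Java, not part of the function above. The class name `EncodableCheck` and the method `unmappable` are invented for the example, and US-ASCII is only a stand-in target because every JVM supports it; a Windows "ANSI" code page name would have to be substituted.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class EncodableCheck {
    /** Returns the characters of text that the named charset cannot represent. */
    public static String unmappable(String text, String charsetName) {
        CharsetEncoder enc = Charset.forName(charsetName).newEncoder();
        StringBuilder bad = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            // Step by code point so surrogate pairs are tested as one character.
            int cp = text.codePointAt(i);
            String s = new String(Character.toChars(cp));
            if (!enc.canEncode(s)) {
                bad.append(s);
            }
            i += Character.charCount(cp);
        }
        return bad.toString();
    }

    public static void main(String[] args) {
        // Sample text with one accented letter and two CJK characters (hypothetical input).
        String sample = "na\u00EFve \u6587\u5B57";
        System.out.println("Not encodable in US-ASCII: " + unmappable(sample, "US-ASCII"));
    }
}
```

An empty result means the conversion should be lossless for that text; anything else lists the characters that would need replacing first.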

Edited by Squirrely1


Thanks a lot. I wrote myself a Java application, and it works fine now. The code is:

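package com.cx.test;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.StringWriter;

/**
 * Re-encodes the file named by the first argument in place.
 *
 * @author Administrator
 */
public class FileConverter {
    public static void main(String args[]) throws Exception {
        if (args.length == 0) {
            System.exit(0);
        }
        File file = new File(args[0]);

        // Read the whole file first, because the output overwrites the input.
        // "UTF-8" is the source encoding; use "UTF-16" for a UTF-16 file with a BOM.
        StringWriter text = new StringWriter();
        try (InputStreamReader ir = new InputStreamReader(new FileInputStream(file), "UTF-8")) {
            char[] ch = new char[8192];
            int n;
            // read() may return fewer chars than requested, so loop until end of file.
            while ((n = ir.read(ch, 0, ch.length)) != -1) {
                text.write(ch, 0, n);
            }
        }

        // Write the text back in the target encoding; closing the writer flushes it.
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(file), "gb2312"))) {
            bw.write(text.toString());
        }
    }
}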
