chenxu Posted May 2, 2008
Is there any UDF to convert a UTF-16 file to a UTF-8 or ANSI-formatted file?
ProgAndy Posted May 2, 2008
Use FileOpen with the different options: open the file for reading with its original charset and create a new file with the desired charset. The only problem is finding out which charset the original file uses.
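The read-with-one-charset, write-with-another idea is not specific to AutoIt. Here is a minimal sketch of it in Java, assuming the source charset is already known. The class name `Transcode` and the method `transcode` are made up for illustration and do not come from the thread.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of any library discussed in this thread.
class Transcode {
    /** Decodes src with the 'from' charset and writes the same text to dst in the 'to' charset. */
    public static void transcode(Path src, Charset from, Path dst, Charset to) throws IOException {
        String text = new String(Files.readAllBytes(src), from);
        // Some decoders (for example UTF-16LE) leave the byte order mark in the text; drop it.
        if (text.startsWith("\uFEFF")) {
            text = text.substring(1);
        }
        // Characters that do not exist in the target charset are replaced, typically by '?'.
        Files.write(dst, text.getBytes(to));
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java Transcode <utf16-input> <utf8-output>");
            return;
        }
        transcode(Paths.get(args[0]), StandardCharsets.UTF_16,
                  Paths.get(args[1]), StandardCharsets.UTF_8);
    }
}
```

Note that this sketch writes UTF-8 without a byte order mark, and that `StandardCharsets.UTF_16` falls back to big-endian when the input has no BOM.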
Squirrely1 Posted May 2, 2008
Using ProgAndy's idea, I wrote this, but I couldn't test it except to find out that it duplicates an ANSI file:

Global $opFile = FileOpen("UTF_16_file.txt", 16) ; 16 = binary mode: bytes are read as-is, not decoded
Sleep(2000)
Global $contents = FileRead($opFile)
Sleep(2000)
FileClose($opFile)
Sleep(300)
$opFile = FileOpen("ANSI_version.txt", 2) ; 2 = write mode, erasing previous contents
FileWrite($opFile, $contents)
FileClose($opFile)
Siao Posted May 2, 2008
From the helpfile about FileRead: "Both ANSI and UTF16/UTF8 text formats can be read - AutoIt will automatically determine the type."
See example #2 here: http://www.autoitscript.com/forum/index.ph...mp;#entry477865
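One common way to "automatically determine the type" is to look for a byte order mark (BOM) at the start of the file. The sketch below shows that technique in Java. Whether AutoIt's FileRead works exactly this way is an assumption, and the class name `BomSniffer` is invented for this example. It ignores UTF-32, and a file without a BOM simply yields `null`.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper illustrating BOM-based detection; not AutoIt's actual implementation.
class BomSniffer {
    /** Returns the charset indicated by a leading byte order mark, or null if there is none. */
    public static Charset detect(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;      // EF BB BF
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;   // FF FE
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;   // FE FF
        }
        return null;                            // no BOM: could be ANSI or BOM-less UTF-8
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java BomSniffer <file>");
            return;
        }
        Charset cs = detect(Files.readAllBytes(Paths.get(args[0])));
        System.out.println(cs == null ? "no BOM found" : cs.name());
    }
}
```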
chenxu Posted May 2, 2008
This is the file to be converted, and this is the file with the same content that I need. How do I convert it? Any method, including a third-party utility, is appreciated.
Squirrely1 Posted May 2, 2008
I used some of Siao's recommendations and code to build your function. It doesn't convert foreign characters that are not in the target character set, of course...

Opt("MustDeclareVars", 1)

Global $in_file = @MyDocumentsDir & "\file2beConverted.xml"
Global $out_file = @MyDocumentsDir & "\ConvertedUTF-8.xml"
Convert_EncodedFile($in_file, $out_file, "UTF-8")

$out_file = @MyDocumentsDir & "\ConvertedANSI.xml"
Convert_EncodedFile($in_file, $out_file, "ANSI")

;=================================================================================================
; Name:     Convert_EncodedFile
; Purpose:  Special file encoding conversion
; Notes:    Doesn't convert all foreign characters into the target character set because
;           they sometimes don't exist there.
;           Takes a few seconds to run.
; Authors:  Siao - modified by Squirrely1
;=================================================================================================
Func Convert_EncodedFile($inFile, $outFile, $sEncoding = "ANSI")
    Local $opFile, $iMode = 2 ; 2 = write mode (ANSI), erasing previous contents
    Switch $sEncoding ; out-file encoding
        Case "Unicode"
            $iMode = 34 ; 2 + 32 = UTF-16 little endian
        Case "Unicode big endian"
            $iMode = 66 ; 2 + 64 = UTF-16 big endian
        Case "UTF-8"
            $iMode = 130 ; 2 + 128 = UTF-8
    EndSwitch

    ; Read ... (FileRead determines the input encoding automatically)
    Local $contents = FileRead($inFile)
    Sleep(2000)

    ; Write ...
    $opFile = FileOpen($outFile, $iMode)
    Sleep(1200)
    If $opFile = -1 Then
        MsgBox(0, "Error", "Unable to open the file for writing.")
        Exit
    EndIf
    ; Write through the handle so that the encoding chosen in FileOpen is actually used
    Local $Ret = FileWrite($opFile, $contents)
    If Not $Ret Then
        MsgBox(0, "Error", "Error writing to file.")
        FileClose($opFile)
        Exit
    EndIf
    Sleep(1000)
    FileClose($opFile)
    Sleep(300)
EndFunc   ;==>Convert_EncodedFile

In the case of the provided files, you were only missing a font-family and some link text, which doesn't break the page. You could write some StringReplace commands to finish building this function.
Edited May 2, 2008 by Squirrely1
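The caveat about characters that do not exist in the target character set can be checked before converting. The following Java sketch lists the characters of a text that a given target charset cannot encode. The class name `EncodableCheck` and the method `unmappable` are invented for illustration, and ISO-8859-1 is used in the demo only as an assumed stand-in for a Western "ANSI" code page.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; reports which characters would be lost in a conversion.
class EncodableCheck {
    /** Returns the distinct characters of 'text' that 'target' cannot encode, in order of appearance. */
    public static Set<String> unmappable(String text, Charset target) {
        CharsetEncoder encoder = target.newEncoder();
        Set<String> missing = new LinkedHashSet<>();
        // Walk by code point so that surrogate pairs are tested as one character.
        for (int i = 0; i < text.length(); ) {
            int codePoint = text.codePointAt(i);
            String ch = new String(Character.toChars(codePoint));
            if (!encoder.canEncode(ch)) {
                missing.add(ch);
            }
            i += Character.charCount(codePoint);
        }
        return missing;
    }

    public static void main(String[] args) {
        // ISO-8859-1 is an assumption standing in for "ANSI"; pick the real code page as needed.
        String sample = args.length > 0 ? args[0] : "H\u00e4schen \u4e2d\u6587";
        System.out.println(unmappable(sample, StandardCharsets.ISO_8859_1));
    }
}
```

An empty result means the text should survive the conversion unchanged. Anything listed would need a replacement, which is where the StringReplace calls mentioned above would come in.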
chenxu Posted May 3, 2008
Thanks a lot. I wrote myself a Java application and it works fine now. The code is:

package com.cx.test;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

/**
 * @author Administrator
 */
public class FileConverter {
    public static void main(String args[]) throws Exception {
        if (args.length == 0) {
            System.exit(0);
        }
        File file = new File(args[0]);
        // Read the whole file as UTF-8. The number of chars never exceeds the number of bytes.
        char[] ch = new char[(int) file.length()];
        int len = 0;
        InputStreamReader ir = new InputStreamReader(new FileInputStream(file), "UTF-8");
        try {
            int n;
            // A single read() call may return fewer chars than requested, so loop until EOF.
            while (len < ch.length && (n = ir.read(ch, len, ch.length - len)) != -1) {
                len += n;
            }
        } finally {
            ir.close();
        }
        // Overwrite the same file as GB2312, writing only the chars that were actually read.
        FileOutputStream fo = new FileOutputStream(args[0]);
        BufferedWriter bw = new BufferedWriter(new OutputStreamWriter(fo, "gb2312"));
        try {
            bw.write(ch, 0, len);
        } finally {
            bw.close();
        }
    }
}
Squirrely1 Posted May 4, 2008
So, treasonously, you have decided to turn your back on AutoIt altogether, for this. Well, in your defence, the island of Java is closer to China than is the city where Jon is seeking asylum - Seattle.