leuce Posted October 1, 2007

G'day everyone

I wrote a script that does StringRegExpReplace on a file... and the replace works beautifully, but somewhere in the whole process the file's encoding gets busted. I open the file using UTF8, so my question is... does StringRegExpReplace respect the Unicode encoding of a file?

Just for interest's sake, here's the script:

    #include <file.au3>

    ; The file filelist.txt contains full paths of files to be processed.
    $inputfile1 = FileOpen ("filelist.txt", 128)

    ; This is a horrible hack but I can't figure out how to do it cleanly
    HotKeySet("^!z", "compare")

    While 1
        $inputline1 = FileReadLine($inputfile1)
        If @error = -1 Then ExitLoop
        $a = 1
        Send ("^!z")
        While $a = 1
            Sleep ("250")
        WEnd
        $num = $num+1
    Wend

    FileClose($inputfile1)
    MsgBox (1, "All done!", "Processed " & $num & " files", "")
    Exit

    Func compare()
        $txt1 = FileOpen ($inputline1, 128)
        $txtcontent1 = FileRead ($txt1)
        ; Normally I would make backups, but this time I don't need/want them
        ; $backup = FileOpen ($inputline1 & "_backup", 130)
        ; FileWrite ($backup, $txtcontent1)
        $txtcontent2 = StringReplace ($txtcontent1, @TAB, " ")
        $txtcontent2 = StringRegExpReplace ($txtcontent1, "# \(pofilter\).+", @LF)
        $txtcontent3 = StringRegExpReplace ($txtcontent2, "# \(pofilter\).+", @LF)
        $txtcontent4 = StringRegExpReplace ($txtcontent3, "# \(pofilter\).+", @LF)
        $txtcontent5 = StringRegExpReplace ($txtcontent4, "# \(pofilter\).+", @LF)
        FileClose ($txt1)
        $txt2 = FileOpen ($inputline1, 130)
        FileWrite ($txt2, $txtcontent5)
        FileClose ($txt2)
        FileClose ($inputline1)
        $a = 2
    EndFunc

I've also included a file that I might use this script on. The letter "ê", for example, comes out disfigured in the output version, which indicates that the file was opened using a non-Unicode codepage somewhere in the process.

Your help is appreciated

leuce (Author) Posted October 4, 2007

I may have solved it. StringRegExpReplace handles Unicode fine if the file has a BOM, but it doesn't work if the file lacks a BOM, even when FileOpen opens it as Unicode.
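
For anyone hitting the same thing, here is a minimal sketch of one way to avoid depending on the BOM at all: read the file as raw bytes, strip a UTF-8 BOM if there is one, and decode explicitly as UTF-8. The function names `_ReadUtf8NoGuess` and `_WriteUtf8` are hypothetical helpers, not part of the script above, and the sketch assumes an AutoIt version where `BinaryToString` accepts flag 4 (UTF-8) and where `FileOpen` mode 128 writes UTF-8 with a BOM. It has not been tested against the attached file.

```autoit
; Hypothetical helper (not from the original script):
; read a file as raw bytes and decode it as UTF-8, with or without a BOM.
Func _ReadUtf8NoGuess($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary (raw byte) read mode
    If $hFile = -1 Then Return SetError(1, 0, "")
    Local $bData = FileRead($hFile)
    FileClose($hFile)

    ; Drop the 3-byte UTF-8 BOM (EF BB BF) if the file starts with one.
    If String(BinaryMid($bData, 1, 3)) = "0xEFBBBF" Then
        $bData = BinaryMid($bData, 4)
    EndIf

    Return BinaryToString($bData, 4) ; 4 = interpret the bytes as UTF-8
EndFunc

; Hypothetical helper: overwrite a file with UTF-8 text.
; Assumption: mode 2 + 128 erases the file and writes UTF-8 with a BOM.
Func _WriteUtf8($sPath, $sText)
    Local $hFile = FileOpen($sPath, 2 + 128)
    If $hFile = -1 Then Return SetError(1, 0, False)
    FileWrite($hFile, $sText)
    FileClose($hFile)
    Return True
EndFunc
```

In `compare()`, this would replace the `FileOpen`/`FileRead` pair with `$txtcontent1 = _ReadUtf8NoGuess($inputline1)` and the final write with `_WriteUtf8($inputline1, $txtcontent5)`. Note that the output would then carry a BOM even when the input did not.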