leuce
Posted October 1, 2007

G'day everyone

I wrote a script that runs StringRegExpReplace on a file. The replace works beautifully, but somewhere in the process the file's encoding gets busted. I open the file as UTF-8, so my question is: does StringRegExpReplace respect the Unicode encoding of a file?

Just for interest's sake, here's the script:

    #include <file.au3>

    ; The file filelist.txt contains full paths of files to be processed.
    $inputfile1 = FileOpen("filelist.txt", 128) ; 128 = UTF-8 mode
    $num = 0

    ; This is a horrible hack but I can't figure out how to do it cleanly
    HotKeySet("^!z", "compare")

    While 1
        $inputline1 = FileReadLine($inputfile1)
        If @error = -1 Then ExitLoop
        $a = 1
        Send("^!z") ; triggers compare(), which sets $a = 2 when it finishes
        While $a = 1
            Sleep(250)
        WEnd
        $num = $num + 1
    WEnd

    FileClose($inputfile1)
    MsgBox(1, "All done!", "Processed " & $num & " files")
    Exit

    Func compare()
        $txt1 = FileOpen($inputline1, 128)
        $txtcontent1 = FileRead($txt1)
        ; Normally I would make backups, but this time I don't need/want them
        ; $backup = FileOpen($inputline1 & "_backup", 130)
        ; FileWrite($backup, $txtcontent1)
        $txtcontent2 = StringReplace($txtcontent1, @TAB, " ")
        ; Run the regexp on the tab-replaced text so that replacement is not lost
        $txtcontent2 = StringRegExpReplace($txtcontent2, "# \(pofilter\).+", @LF)
        $txtcontent3 = StringRegExpReplace($txtcontent2, "# \(pofilter\).+", @LF)
        $txtcontent4 = StringRegExpReplace($txtcontent3, "# \(pofilter\).+", @LF)
        $txtcontent5 = StringRegExpReplace($txtcontent4, "# \(pofilter\).+", @LF)
        FileClose($txt1)
        $txt2 = FileOpen($inputline1, 130) ; 130 = 128 (UTF-8) + 2 (overwrite)
        FileWrite($txt2, $txtcontent5)
        FileClose($txt2)
        $a = 2
    EndFunc

I've also included a file that I might use this script on. The letter "ê", for example, comes out disfigured in the output version, which indicates the file was opened with a non-Unicode codepage somewhere in the process.

Your help is appreciated.

leuce (Author)
Posted October 4, 2007

I may have solved it. StringRegExpReplace handles Unicode fine if the file has a BOM, but not if the file lacks one, even when FileOpen opens it as Unicode.
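
A minimal sketch of one way to act on that finding: check each file for the UTF-8 BOM (bytes EF BB BF) and, if it is missing, rewrite the file with one before processing it. This assumes a current AutoIt release, where FileOpen mode 16 reads raw bytes, BinaryToString flag 4 decodes UTF-8, and FileOpen mode 128 writes UTF-8 with a BOM; older builds may behave differently. The helper names _HasUtf8Bom and _AddUtf8Bom are hypothetical and not part of AutoIt or of the script in the first post.

```autoit
; Hypothetical helper: True if the file starts with the UTF-8 BOM (EF BB BF).
Func _HasUtf8Bom($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary read mode
    If $hFile = -1 Then Return SetError(1, 0, False)
    Local $bHead = FileRead($hFile, 3) ; first three bytes only
    FileClose($hFile)
    ; String() renders binary data as a hex string such as "0xEFBBBF"
    Return String($bHead) = "0xEFBBBF"
EndFunc

; Hypothetical helper: rewrite a BOM-less file in place as UTF-8 with a BOM.
; Assumption: the existing bytes really are UTF-8.
Func _AddUtf8Bom($sPath)
    If _HasUtf8Bom($sPath) Then Return True ; nothing to do
    If @error Then Return SetError(1, 0, False) ; file could not be opened
    Local $hIn = FileOpen($sPath, 16)
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $sText = BinaryToString(FileRead($hIn), 4) ; 4 = decode as UTF-8
    FileClose($hIn)
    Local $hOut = FileOpen($sPath, 2 + 128) ; 2 = overwrite, 128 = UTF-8 with BOM
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, $sText)
    FileClose($hOut)
    Return True
EndFunc
```

Calling _AddUtf8Bom($inputline1) on each path before the file is opened for the replacements should mean every file carries a BOM by the time FileRead sees it. Test it on copies first, since it overwrites the file in place.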