leuce

Does StringRegExpReplace do Unicode?

G'day everyone

I wrote a script that runs StringRegExpReplace on a file. The replacement works beautifully, but somewhere in the process the file's encoding gets busted. I open the file in UTF-8 mode, so my question is: does StringRegExpReplace respect the Unicode encoding of a file?

Just for interest's sake, here's the script:

#include <file.au3>

; The file filelist.txt contains full paths of files to be processed.
$inputfile1 = FileOpen ("filelist.txt", 128)

; Counter for processed files; it must exist before the loop increments it
$num = 0

; This is a horrible hack but I can't figure out how to do it cleanly

HotKeySet("^!z", "compare")

While 1
$inputline1 = FileReadLine($inputfile1)
If @error = -1 Then ExitLoop
$a = 1
Send ("^!z")
While $a = 1
Sleep (250)
WEnd
$num = $num+1
Wend

FileClose($inputfile1)

MsgBox (1, "All done!", "Processed " & $num & " files")

Exit

Func compare()

$txt1 = FileOpen ($inputline1, 128)
$txtcontent1 = FileRead ($txt1)
; Normally I would make backups, but this time I don't need/want them
; $backup = FileOpen ($inputline1 & "_backup", 130)
; FileWrite ($backup, $txtcontent1)
$txtcontent2 = StringReplace ($txtcontent1, @TAB, " ")

; Run the regex on the tab-replaced text (not on the original);
; a single call already replaces every match in the string
$txtcontent5 = StringRegExpReplace ($txtcontent2, "# \(pofilter\).+", @LF)

FileClose ($txt1)
$txt2 = FileOpen ($inputline1, 130)
FileWrite ($txt2, $txtcontent5)
FileClose ($txt2)

$a = 2

EndFunc

I've also included a file that I might use this script on. For example, the letter "ê" comes out disfigured in the output version, which indicates the file was opened using a non-Unicode codepage somewhere in the process.

Your help is appreciated

I may have solved it. The Unicode StringRegExpReplace works fine if the file has a BOM, but it doesn't work if the file lacks one, even if FileOpen opens it as Unicode.
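
For anyone who hits the same thing, one way to avoid depending on the BOM might be to read the raw bytes and do the UTF-8 decoding and encoding explicitly. This is only a minimal sketch and has not been run against the attached file: _ReplaceInUtf8File is a made-up helper name, "example.po" is a placeholder path, and it assumes the input really is UTF-8.

```autoit
; Sketch only: decode and encode UTF-8 explicitly so the result does not
; depend on whether the file starts with a BOM.
; _ReplaceInUtf8File is a hypothetical helper, not part of the script above.
Func _ReplaceInUtf8File($sPath, $sPattern, $sReplacement)
    ; 16 = binary mode: read the raw bytes, no codepage guessing
    Local $hIn = FileOpen($sPath, 16)
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $bData = FileRead($hIn)
    FileClose($hIn)

    ; Flag 4 = interpret the bytes as UTF-8 (assumption: the file is UTF-8)
    Local $sText = BinaryToString($bData, 4)
    $sText = StringRegExpReplace($sText, $sPattern, $sReplacement)

    ; 2 + 16 = erase previous contents, binary mode: write UTF-8 bytes back
    Local $hOut = FileOpen($sPath, 2 + 16)
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, StringToBinary($sText, 4))
    FileClose($hOut)
    Return True
EndFunc

; Placeholder path; same pattern and replacement as in the script above
_ReplaceInUtf8File("example.po", "# \(pofilter\).+", @LF)
```

The idea is that FileOpen never gets a chance to guess the encoding, because the bytes go through BinaryToString and StringToBinary with the UTF-8 flag in both directions.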
