
Does StringRegExpReplace do Unicode?



G'day everyone

I wrote a script that runs StringRegExpReplace on a file. The replacement itself works beautifully, but somewhere in the process the file's encoding gets busted. I open the file as UTF-8, so my question is: does StringRegExpReplace respect the Unicode encoding of a file?

Just for interest's sake, here's the script:

#include <file.au3>

; The file filelist.txt contains full paths of files to be processed.
$inputfile1 = FileOpen ("filelist.txt", 128)

; This is a horrible hack but I can't figure out how to do it cleanly

HotKeySet("^!z", "compare")

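; Counter for processed files (must exist before it is incremented)
$num = 0

While 1
    $inputline1 = FileReadLine($inputfile1)
    If @error = -1 Then ExitLoop
    $a = 1
    Send ("^!z")
    ; Wait until compare() signals that it has finished by changing $a
    While $a = 1
        Sleep (250)
    WEnd
    $num = $num + 1
WEnd

FileClose ($inputfile1)

MsgBox (1, "All done!", "Processed " & $num & " files")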


Func compare()

    $txt1 = FileOpen ($inputline1, 128)
    $txtcontent1 = FileRead ($txt1)
    ; Normally I would make backups, but this time I don't need/want them
    ; $backup = FileOpen ($inputline1 & "_backup", 130)
    ; FileWrite ($backup, $txtcontent1)
    $txtcontent2 = StringReplace ($txtcontent1, @TAB, " ")

    ; Start from $txtcontent2 so the tab replacement above is not thrown away
    $txtcontent2 = StringRegExpReplace ($txtcontent2, "# \(pofilter\).+", @LF)
    $txtcontent3 = StringRegExpReplace ($txtcontent2, "# \(pofilter\).+", @LF)
    $txtcontent4 = StringRegExpReplace ($txtcontent3, "# \(pofilter\).+", @LF)
    $txtcontent5 = StringRegExpReplace ($txtcontent4, "# \(pofilter\).+", @LF)

    FileClose ($txt1)
    $txt2 = FileOpen ($inputline1, 130)
    FileWrite ($txt2, $txtcontent5)
    FileClose ($txt2)

    ; $inputline1 is a path, not a file handle, so there is nothing more to close

    ; Tell the main loop that this file is done
    $a = 2

EndFunc

I've also included a file that I might use this script on. For example, the letter "ê" comes out garbled in the output version, which suggests the file was opened using a non-Unicode codepage somewhere in the process.
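To illustrate the kind of damage described above, here is a small sketch (not part of the original script) that encodes "ê" as UTF-8 and then decodes the bytes with the ANSI codepage instead. The variable names are made up for the example. StringToBinary and BinaryToString with flags 4 (UTF-8) and 1 (ANSI) are standard AutoIt functions, and the exact garbled characters depend on the system codepage.

```autoit
; U+00EA (ê), built with ChrW so the script file's own encoding does not matter
Local $sOriginal = ChrW(0xEA)

; Encode as UTF-8 (flag 4): ê becomes the two bytes C3 AA
Local $dUtf8 = StringToBinary($sOriginal, 4)

; Decode the same bytes as ANSI (flag 1) instead of UTF-8: each byte
; is turned into its own character, so the single letter is lost
Local $sWrong = BinaryToString($dUtf8, 1)

MsgBox(0, "Encoding demo", $sOriginal & " -> " & String($dUtf8) & " -> " & $sWrong)
```

On a Western (Windows-1252) system the last part typically shows two characters in place of the one "ê", which matches the symptom in the output file.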

Your help is appreciated


I may have solved it. StringRegExpReplace handles Unicode fine if the file has a BOM, but the encoding breaks if the file has no BOM, even when FileOpen is told to open it as UTF-8 (mode 128).
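One possible way around the BOM dependence, offered as a sketch under assumptions rather than a confirmed fix: read the file as raw bytes and decode it explicitly as UTF-8, so the result does not depend on how FileOpen detects the encoding. The function name _ReplaceInUtf8File and its parameters are hypothetical. FileOpen mode 16 (binary), BinaryMid, BinaryToString and StringToBinary with flag 4 (UTF-8) are standard AutoIt calls.

```autoit
; Hypothetical helper: regex-replace inside a UTF-8 file, with or without a BOM
Func _ReplaceInUtf8File($sPath, $sPattern, $sReplace)
    ; Read the raw bytes (16 = binary mode) so no codepage guess is made
    Local $hIn = FileOpen($sPath, 16)
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $dBytes = FileRead($hIn)
    FileClose($hIn)

    ; Remember whether the file starts with the UTF-8 BOM (EF BB BF) and strip it
    Local $bHasBom = (String(BinaryMid($dBytes, 1, 3)) = "0xEFBBBF")
    If $bHasBom Then $dBytes = BinaryMid($dBytes, 4)

    ; Decode explicitly as UTF-8 (flag 4), then do the replacement
    Local $sText = BinaryToString($dBytes, 4)
    $sText = StringRegExpReplace($sText, $sPattern, $sReplace)

    ; Encode back to UTF-8 and overwrite the file (2 = write, 16 = binary)
    Local $hOut = FileOpen($sPath, 2 + 16)
    If $hOut = -1 Then Return SetError(2, 0, False)
    If $bHasBom Then FileWrite($hOut, Binary("0xEFBBBF"))
    FileWrite($hOut, StringToBinary($sText, 4))
    FileClose($hOut)
    Return True
EndFunc
```

In the script above it could be called as _ReplaceInUtf8File($inputline1, "# \(pofilter\).+", @LF). The sketch writes the BOM back only if the input had one, so files keep whichever form they started with.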

