HurleyShanabarger, posted October 26, 2017 (edited October 26, 2017)

Hi guys,

I have a huge number of files that are encoded in UTF-8, but for backwards compatibility I need to convert them to ANSI. No big deal, I thought, but when I do that using AutoIt it fails for me. I tested it with 3.3.14.2 and the current beta 3.3.15.0. Did I miss something?

```autoit
; Try every encoding flag from 2^5 (32) up to 2^11 (2048)
For $iMode = 5 To 11
    _FileWrite_Mode(2 ^ $iMode)
Next

Func _FileWrite_Mode($iMode)
    FileDelete(@DesktopDir & "\test.txt")
    ; + 2 opens the file in overwrite mode
    Local $_lv_hFile = FileOpen(@DesktopDir & "\test.txt", $iMode + 2)
    FileWrite($_lv_hFile, "test")
    FileClose($_lv_hFile)
    ConsoleWrite("----------------------------" & @CRLF)
    ConsoleWrite("Expected:" & @TAB & $iMode & @CRLF)
    ConsoleWrite("Detected:" & @TAB & FileGetEncoding(@DesktopDir & "\test.txt") & @CRLF)
EndFunc   ;==>_FileWrite_Mode
```

AspirinJunkie, posted October 26, 2017

If there is no BOM, FileGetEncoding has to estimate the encoding using heuristics. In your case the string "test" does not carry enough information to distinguish ANSI from UTF-8 without BOM, because a file containing the text "test" is byte-for-byte identical in both encodings. So FileGetEncoding has to guess in this case.

That doesn't mean the file isn't written in ANSI encoding. It only means that FileGetEncoding never had a chance to separate the correct encoding from the other one with certainty.

Add, for example, a "€" sign to the string you write and you will see that FileGetEncoding now estimates the encoding correctly.
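
To make the byte-level argument concrete, here is a minimal sketch that prints the bytes AutoIt produces for the same string in both encodings. It assumes the `StringToBinary` flags 1 (ANSI) and 4 (UTF-8); the variable names are only illustrative, and the euro sign is built with `ChrW(0x20AC)` so the result does not depend on how the script file itself is saved.

```autoit
; Compare the raw bytes for the same text in ANSI and in UTF-8.
; StringToBinary flags: 1 = ANSI, 4 = UTF-8.
Local $aSamples[2] = ["test", "test " & ChrW(0x20AC)] ; ChrW(0x20AC) is the euro sign

For $sText In $aSamples
    Local $dAnsi = StringToBinary($sText, 1)
    Local $dUtf8 = StringToBinary($sText, 4)
    ConsoleWrite("ANSI : " & $dAnsi & @CRLF)
    ConsoleWrite("UTF-8: " & $dUtf8 & @CRLF)
    ; Compare the hex representations of both binaries
    ConsoleWrite("Same bytes: " & (String($dAnsi) = String($dUtf8)) & @CRLF & @CRLF)
Next
```

For plain "test" both lines show the same four bytes (74 65 73 74), so no detector can tell the encodings apart. With the euro sign the byte sequences differ: UTF-8 encodes it as the three bytes E2 82 AC, while on a system whose ANSI code page is Windows-1252 it is the single byte 80. On other code pages the ANSI result may look different.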

HurleyShanabarger (author), posted October 26, 2017

Thank you. That basically means that if I convert the files using $FO_ANSI they will be created as ANSI, and it is just the detection that fails.

If I want to check whether a file has already been converted and the result comes back as $FO_UTF8_NOBOM, I can just append "€" to the file and recheck it. If the result is then $FO_ANSI, the file is already converted. Correct?

AspirinJunkie, posted October 26, 2017

HurleyShanabarger said:
> That basically means if I convert the files using $FO_ANSI it will be created as ANSI and it is just the detection that fails.

Yes.

HurleyShanabarger said:
> If I want to check a file if it has been converted and the result return as $FO_UTF8_NOBOM I can just append "€" to the file and recheck it - if it results as $FO_ANSI the file is already converted - correct?

No. If you mix encodings across different write operations, the result can be misleading. For example:

```autoit
$s_File = @ScriptDir & "\Test.txt"

; write the file in UTF-8 without BOM (2 = overwrite, 256 = $FO_UTF8_NOBOM)
$hFile = FileOpen($s_File, 2 + 256)
FileWriteLine($hFile, "This is a test with some special chars like € or @ or ÄÖÜ")
FileClose($hFile)
ConsoleWrite(FileGetEncoding($s_File) & @CRLF)

; append € in ANSI (1 = append, 512 = $FO_ANSI)
$hFile = FileOpen($s_File, 1 + 512)
FileWrite($hFile, "€")
FileClose($hFile)
ConsoleWrite(FileGetEncoding($s_File) & @CRLF)

ShellExecute($s_File)
```

In this case FileGetEncoding ends up reporting the encoding as ANSI, but the text written before is still UTF-8 encoded. This leads to display errors when the file is viewed in a text editor.
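
Instead of appending a marker character, a conversion can be checked by rewriting the whole file in one pass and then comparing the bytes on disk with the expected ANSI bytes. The following is only a sketch under a few assumptions: `_ConvertUtf8ToAnsi` is a hypothetical helper name (not a built-in function), the source file is assumed to be UTF-8 without BOM, and the whole file is assumed to fit in memory.

```autoit
#include <FileConstants.au3>

; Hypothetical helper: re-encode one UTF-8 (no BOM) file as ANSI.
; Returns True if the bytes on disk match the ANSI encoding of the text.
Func _ConvertUtf8ToAnsi($sSource, $sTarget)
    ; Decode the source explicitly as UTF-8 instead of relying on detection
    Local $hIn = FileOpen($sSource, $FO_READ + $FO_UTF8_NOBOM)
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $sText = FileRead($hIn)
    FileClose($hIn)

    ; Write the complete text in a single encoding, replacing any old content
    Local $hOut = FileOpen($sTarget, $FO_OVERWRITE + $FO_ANSI)
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, $sText)
    FileClose($hOut)

    ; Read the result back as raw bytes and compare with the expected ANSI bytes
    Local $hCheck = FileOpen($sTarget, $FO_READ + $FO_BINARY)
    If $hCheck = -1 Then Return SetError(3, 0, False)
    Local $dWritten = FileRead($hCheck)
    FileClose($hCheck)
    Return String($dWritten) = String(StringToBinary($sText, 1)) ; 1 = ANSI
EndFunc   ;==>_ConvertUtf8ToAnsi
```

A call might look like `_ConvertUtf8ToAnsi(@ScriptDir & "\in.txt", @ScriptDir & "\out.txt")`, with both paths being placeholders. Because the file is written in one operation with one encoding, the mixed-encoding situation from the example above cannot arise. Note that the check only confirms the file holds the ANSI bytes for the text; characters that do not exist in the system's ANSI code page are still lost in the conversion.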