
FileOpen in ANSI



Hi guys,

I have a huge number of files that are encoded in UTF-8, but for backwards compatibility I need to convert them to ANSI. No big deal, I thought, but when I do that using AutoIt it fails for me. I tested it with 3.3.14.2 and the current beta 3.3.15.0. Did I miss something?

; Try every encoding flag from 32 (UTF-16 LE) up to 2048 (UTF-16 BE without BOM)
For $iExp = 5 To 11
    _FileWrite_Mode(2 ^ $iExp)
Next

Func _FileWrite_Mode($iMode)
    Local $sFile = @DesktopDir & "\test.txt"
    FileDelete($sFile)
    ; $iMode selects the encoding; + 2 opens for writing and erases previous content
    Local $hFile = FileOpen($sFile, $iMode + 2)
    FileWrite($hFile, "test")
    FileClose($hFile)
    ; Compare the requested encoding flag with what FileGetEncoding reports
    ConsoleWrite("----------------------------" & @CRLF)
    ConsoleWrite("Expected:" & @TAB & $iMode & @CRLF)
    ConsoleWrite("Detected:" & @TAB & FileGetEncoding($sFile) & @CRLF)
EndFunc   ;==>_FileWrite_Mode

 

Edited by HurleyShanabarger

If there is no BOM, FileGetEncoding has to estimate the encoding with the help of heuristics.
In your case the string "test" does not contain enough information to distinguish ANSI from UTF-8 without BOM.
That is because a file containing the text "test" is byte-for-byte identical in both encodings.
So FileGetEncoding has to guess in this case.
But that does not mean the file was not written in ANSI encoding.
It only means that FileGetEncoding never had a chance to tell the correct encoding apart from the other one with certainty.

For example, add a "€" sign to your file string and you will see that FileGetEncoding now estimates the encoding correctly.
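
To see why "test" is ambiguous while "€" is not, it can help to look at the raw bytes. The following is only a small sketch using StringToBinary with the $SB_ANSI and $SB_UTF8 flags from StringConstants.au3; the euro sign is built with ChrW(0x20AC) so that the encoding of the script file itself does not influence the result.

```autoit
#include <StringConstants.au3>

; Plain ASCII text: both encodings should produce the same four bytes
ConsoleWrite("ANSI : " & StringToBinary("test", $SB_ANSI) & @CRLF)
ConsoleWrite("UTF-8: " & StringToBinary("test", $SB_UTF8) & @CRLF)

; The euro sign (U+20AC), built with ChrW so the script's own encoding does not matter
Local $sEuro = ChrW(0x20AC)
ConsoleWrite("ANSI : " & StringToBinary($sEuro, $SB_ANSI) & @CRLF)
ConsoleWrite("UTF-8: " & StringToBinary($sEuro, $SB_UTF8) & @CRLF)
```

The two "test" lines should be identical. For the euro sign, UTF-8 uses three bytes (E2 82 AC), while the ANSI result depends on the system code page; assuming Windows-1252 it is the single byte 80, which is not valid UTF-8 on its own and therefore gives the heuristic something to work with.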

 


Thank you. That basically means that if I convert the files using $FO_ANSI, they will be created as ANSI and it is just the detection that fails. If I want to check whether a file has been converted and the result is returned as $FO_UTF8_NOBOM, I can just append "€" to the file and recheck it. If it then results in $FO_ANSI, the file is already converted. Correct?


7 minutes ago, HurleyShanabarger said:

That basically means that if I convert the files using $FO_ANSI, they will be created as ANSI and it is just the detection that fails.

Yes

7 minutes ago, HurleyShanabarger said:

If I want to check whether a file has been converted and the result is returned as $FO_UTF8_NOBOM, I can just append "€" to the file and recheck it. If it then results in $FO_ANSI, the file is already converted. Correct?

No. If you mix up encodings in different write operations the result can be misleading.
For example:

$s_File = @ScriptDir & "\Test.txt"

; write the file as UTF-8 without BOM (2 = overwrite, 256 = UTF-8 without BOM)
$hFile = FileOpen($s_File, 2 + 256)
FileWriteLine($hFile, "This is a test with some special chars like € or @ or ÄÖÜ")
FileClose($hFile)
ConsoleWrite(FileGetEncoding($s_File) & @CRLF)

; append € in ANSI (1 = append, 512 = ANSI)
$hFile = FileOpen($s_File, 1 + 512)
FileWrite($hFile, "€")
FileClose($hFile)
ConsoleWrite(FileGetEncoding($s_File) & @CRLF)

ShellExecute($s_File)

So in this case FileGetEncoding ends up recognizing the encoding as ANSI, but the text written before is still UTF-8 encoded.
This leads to display errors when the file is viewed in a text editor.
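
To avoid ending up with such a mixed file, the whole content can be rewritten in one pass instead of appending to it. The following is only a sketch: _ConvertUtf8ToAnsi is a hypothetical helper name, not a built-in function, and it assumes the source really is UTF-8 and that every character in it exists in the system's ANSI code page (characters that do not are likely to be replaced on writing).

```autoit
#include <FileConstants.au3>

; Hypothetical helper: rewrite a whole UTF-8 file as ANSI in a single pass.
; Returns True on success; on failure sets @error and returns False.
Func _ConvertUtf8ToAnsi($sSource, $sTarget)
    ; Read the complete source, interpreting it as UTF-8 (a BOM, if present, is not required)
    Local $hIn = FileOpen($sSource, $FO_READ + $FO_UTF8_NOBOM)
    If $hIn = -1 Then Return SetError(1, 0, False)
    Local $sText = FileRead($hIn)
    FileClose($hIn)

    ; Write everything to the target in ANSI, replacing any previous content
    Local $hOut = FileOpen($sTarget, $FO_OVERWRITE + $FO_ANSI)
    If $hOut = -1 Then Return SetError(2, 0, False)
    FileWrite($hOut, $sText)
    FileClose($hOut)
    Return True
EndFunc   ;==>_ConvertUtf8ToAnsi
```

A call such as _ConvertUtf8ToAnsi(@ScriptDir & "\Test.txt", @ScriptDir & "\Test_ansi.txt") writes to a separate target file, so the original stays untouched until the result has been checked.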
