I need to read log files into an array to search for errors. However, when I display the array I get garbage or "Chinese characters". Our developers say they are using UTF-8, but FileGetEncoding returns 2048 for the logs, i.e. $FO_UTF16_BE_NOBOM (2048) = Use Unicode UTF16 Big Endian (without BOM), from the encoding codes listed under FileOpen().
There is an app called Detenc that detects the encoding used by files. You have to guess the encoding, and it reports a match when I set the encoder to UTF-8. I understand that encoding detection is not etched in stone, but HxD Hex Editor shows that the first character of the file is a capital B.
I even have another topic here about running PowerShell to reencode the file so AutoIt will store the file properly in the array - See:
So I am trying to figure out why AutoIt thinks my logs are not UTF-8.
Here is sample code:
#include <array.au3>
#include <File.au3>
Local $aRetArrayFile
_FileReadToArray("C:\Logs\Myplayer1.log", $aRetArrayFile)
_ArrayDisplay($aRetArrayFile)

I won't post the results as they are illegible, but I did attach a screenshot of the _ArrayDisplay results, and this is the first line of the log file:

BANNER 10/10/2017 15:56:00 ======================================================================

And the hex from the beginning of the file:

42 41 4E 4E 45 52 20 31 30 2F 31 30 2F 32 30 31 37 20 31 34 3A 33 31 3A 33 35 20 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 0D 0A 42 41 4E 4E 45 52 20

So I don't understand why AutoIt thinks the file is UTF16 BE.
If I can get the PowerShell script running, I have a workaround.
BTW none of my other arrays display as garbage, just the log files.
Rereading my post, what seems to be missing is the question. I guess my question is, does anyone know why these logs are being displayed incorrectly?
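One way to take the auto-detection out of the picture is to pass an explicit encoding flag to FileOpen instead of letting _FileReadToArray guess. The following is only a minimal sketch, assuming the logs really are UTF-8 without a BOM as the developers say. It uses the built-in FileGetEncoding, FileOpen and FileReadToArray functions, and the path is the one from the post.

```autoit
#include <Array.au3>
#include <FileConstants.au3>

Local $sLogPath = "C:\Logs\Myplayer1.log" ; path taken from the post

; Show what the auto-detection reports (the post saw 2048 = $FO_UTF16_BE_NOBOM).
ConsoleWrite("Detected encoding: " & FileGetEncoding($sLogPath) & @CRLF)

; Open the file with an explicit encoding flag instead of relying on detection.
; Assumption: the logs are UTF-8 without a BOM.
Local $hFile = FileOpen($sLogPath, BitOR($FO_READ, $FO_UTF8_NOBOM))
If $hFile = -1 Then
    MsgBox(16, "Error", "Could not open " & $sLogPath)
    Exit 1
EndIf

; The built-in FileReadToArray accepts a handle and returns a 0-based array of lines.
Local $aLines = FileReadToArray($hFile)
Local $iError = @error
FileClose($hFile)

If $iError Then
    MsgBox(16, "Error", "Could not read lines, @error = " & $iError)
    Exit 1
EndIf

_ArrayDisplay($aLines)
```

The bytes in the hex dump are all plain ASCII, so for those lines $FO_ANSI in place of $FO_UTF8_NOBOM should give the same text.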
I need to save files with ANSI encoding. Since AutoIt 126.96.36.199 it doesn't work, whatever I try.
I tried the following:
#include <FileConstants.au3>
FileDelete(@ScriptDir&"\Test.txt")
$o = FileOpen(@ScriptDir&"\Test.txt", BitOR($FO_BINARY,$FO_ANSI,$FO_OVERWRITE))
FileWrite($o, "Test")
FileClose($o)

Or

#include <FileConstants.au3>
FileDelete(@ScriptDir&"\Test.txt")
$o = FileOpen(@ScriptDir&"\Test.txt", 514)
FileWrite($o, "Test")
FileClose($o)

Both create UTF-8 encoded files.
What am I doing wrong?
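A possible explanation, offered as an assumption rather than a confirmed diagnosis: the string "Test" contains only ASCII characters, and those bytes are identical in ANSI and UTF-8, so an editor cannot tell the two apart and may simply report UTF-8. The sketch below writes a string containing a non-ASCII character with $FO_ANSI and then dumps the raw bytes, which shows what was actually written. The file name Test.txt comes from the post; the test string is made up for illustration.

```autoit
#include <FileConstants.au3>

Local $sPath = @ScriptDir & "\Test.txt"

; "Tést", built with ChrW so the result does not depend on how this script file is saved.
Local $sText = "T" & ChrW(0xE9) & "st"

; Write as ANSI text ($FO_BINARY is left out here because text is being written).
Local $hFile = FileOpen($sPath, BitOR($FO_OVERWRITE, $FO_ANSI))
If $hFile = -1 Then Exit 1
FileWrite($hFile, $sText)
FileClose($hFile)

; Read the file back as raw bytes and print them in hex.
$hFile = FileOpen($sPath, BitOR($FO_READ, $FO_BINARY))
If $hFile = -1 Then Exit 1
Local $dBytes = FileRead($hFile)
FileClose($hFile)

; A single byte for the accented letter suggests ANSI; two bytes (C3 A9) suggest UTF-8.
ConsoleWrite("Bytes on disk: " & $dBytes & @CRLF)
```

Checking the bytes this way avoids relying on an editor's encoding label, which is only a guess for files without a BOM.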
I'm using the code below to send mails using our internal relay server.
We have a helpdesk system named Remedy. Our users can send us mail using Outlook 2010, and we get a ticket.
The problem is that when I send a mail using the script, our ticket system can't display non-ASCII characters such as Æ Ø Å. It displays them as question marks: "? ? ?". In the Outlook inbox the mail looks fine and shows the characters correctly, but in our ticket system they are replaced by question marks.
The thing is, if they send a mail using Outlook, it works fine, but using the script it doesn't.
I tried to save my script with encoding: UTF-8 with BOM, but it didn't fix it.
All suggestions are very welcome
I've ported the toUTF8() function (in fact, the whole Encoding class) by Sebastián Grignoli to AutoIt. It offers useful functions that force a string into a specified charset in a really easy way.
From the readme file:
$utf8_string = toUTF8($utf8_or_latin1_or_mixed_string)
$latin1_string = toLatin1($utf8_or_latin1_or_mixed_string)

Also:

$utf8_string = fixUTF8($garbled_utf8_string)

fixUTF8() converts the string to UTF-8 repeatedly until it contains only valid UTF-8 characters (that is, until it really is UTF-8).

#include 'forceutf8.au3'
MsgBox(0, '', fixUTF8( 'Ã£Ã©' ) )

Will output:

ãé

Note that it's just a port. If you look at both source codes side by side (PHP and AutoIt), you'll see that they do exactly the same thing, but with different approaches (PHP arrays converted to Scripting.Dictionary objects, function renames, syntax porting, a few functions completely rewritten due to differences between PHP and AutoIt). Therefore, all credit goes to Sebastián Grignoli.
It seems to work only with the Latin/Roman alphabet (as used by English).
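For readers wondering what kind of damage fixUTF8() repairs: 'Ã£Ã©' is what UTF-8 bytes look like when they are decoded as a single-byte code page. The sketch below is not the ported library's code. It only reverses one round of that mis-decoding with the built-in StringToBinary and BinaryToString functions, assuming the system ANSI code page is Windows-1252.

```autoit
#include <StringConstants.au3>

; The garbled text from the example, built with ChrW so it does not
; depend on the encoding of this script file: Ã £ Ã ©
Local $sGarbled = ChrW(0xC3) & ChrW(0xA3) & ChrW(0xC3) & ChrW(0xA9)

; Step 1: turn the characters back into the bytes they were decoded from.
; Assumption: the ANSI code page is Windows-1252, which maps them to C3 A3 C3 A9.
Local $dBytes = StringToBinary($sGarbled, $SB_ANSI)

; Step 2: decode the same bytes as UTF-8, which is what they originally were.
Local $sFixed = BinaryToString($dBytes, $SB_UTF8)

MsgBox(0, "", $sFixed)
```

The ported fixUTF8() is described as repeating the conversion until the string is valid UTF-8; this single-pass sketch does not attempt that and does not handle mixed input.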
Download ZIP from Github
Fork me on Github
How can I read an SQLite database that was written in ANSI encoding? The current version reads and writes using UTF-8 encoding.
Is there a method to convert between Unicode and ANSI?