Sign in to follow this  
Followers 0
Jibberish

File Encoding for Arrays

2 posts in this topic

I need to read log files into an array to search for errors. However when I display the array I get garbage or "chinese characters". Our developers say they are using UTF-8, but FileGetEncoding says the logs are "2048" or $FO_UTF16_BE_NOBOM (2048) = Use Unicode UTF16 Big Endian (without BOM) from the Encoding codes in FileOpen().

There is an app called Detenc that detects the encoding used by files. You have to guess, but it returns correctly when I set the Encoder for UTF-8. I understand Encoding is not etched in stone, but the first character of the file is a capital B, using HxD Hex Editor.

I even have another  topic here about running PowerShell to reencode the file so AutoIt will store the file properly in the array - See:

So I am trying to figure out why AutoIt thinks my logs are not UTF-8.

Here is sample code:

#include <array.au3>
#include <File.au3>

Local $aRetArrayFile
    _FileReadToArray("C:\Logs\Myplayer1.log", $aRetArrayFile)
    _ArrayDisplay($aRetArrayFile)

I won't post the results as it is illegible, but I did attach a screenshot of the _ArrayDisplay results, and this is the first line of the Log file:

BANNER 10/10/2017 15:56:00 ======================================================================

And the Hex from the beginning of the file:

42 41 4E 4E 45 52 20 31 30 2F 31 30 2F 32 30 31 37 20 31 34 3A 33 31 3A 33 35 20 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 3D 0D 0A 42 41 4E 4E 45 52 20

So I don't understand why AutoIt thinks the file is UTF16 BE.

If I can get the Powershell script running, I have a workaround.

BTW none of my other arrays display as garbage, just the log files.

Weird.

Rereading my post, what seems to be missing is the question. I guess my question is, does anyone know why these logs are being displayed incorrectly?

Cheers

Jibs

logjunk.png

Share this post


Link to post
Share on other sites



#2 ·  Posted (edited)

Try using FileRead(), after opening the file with the correct encoding, and then use StringSplit() to create the array.

Edited by czardas
1 person likes this

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0

  • Similar Content

    • Jibberish
      By Jibberish
      Hi all,
      I need to read a log file into an array, but the log file is encoded as $FO_UTF16_BE_NOBOM (2048) = Use Unicode UTF16 Big Endian (without BOM) per FileGetEncoding (it returns 2048).
      I have searched how to convert these log files to UTF-8 and finally found a Powershell command. Since then I have been racking my brain trying to get the function to work. The command itself works from a Powerscript prompt:
      C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command Get-Content C:\Logs\Myplayer_10-10-17-02-31.log | Set-Content -Encoding utf8 C:\Logs\Myplayer1.log This is my sandbox;
      #include <array.au3> #include <File.au3> Local $aArrayLogFile Local $sLogDir = "C:\Logs\" Local $sLogFile = "Myplayer_10-10-17-02-31.log" Local $sConvertedLog = "ConvertedLog.log" Local $sLogDirFile = $sLogDir&$sLogFile RunWait("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command Get-Content "&$sLogDirFile&" | Set-Content -Encoding utf8 "&$sConvertedLog,$sLogDir) _FileReadToArray($sLogDirFile, $aArrayLogFile) _ArrayDisplay($aArrayLogFile) Also tried
      RunWait("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe -Command Get-Content "&$sLogDirFile&" | Set-Content -Encoding utf8 "&$sConvertedLog,$sLogDir) and
      ShellExecuteWait("C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"," -Command Get-Content "&$sLogDirFile&" | Set-Content -Encoding utf8 "&$sConvertedLog,$sLogDir) Tried without -Command and a bunch of other parameters that were sprinkled throughout the internet from people trying to get this to work.
      Thanks
      Jibs
    • rootx
      By rootx
      I need help with unicode char ü I get some text from online json but if try to read 4 example Zürich I heave  Zürich.
      How can I convert with autoit unicode to a clear character readable? thx
    • virhonestum
      By virhonestum
      Hey,
      I've coma accross a very odd problem. I want to download a CSV-File, and process the contents.
      This is the extremely simplyfied AutoIT-Code, given the file is already downloaded:
      $f= @ScriptDir & "\TestFile.csv" $file = FileOpen($f,0) Local $line = FileReadLine($file) MsgBox(0,"",$line) The downloaded CSV file I want to process contains something like this:
      Artikelnummer;EAN-Code;Artikelname;Artikelgewicht;Beschreibung;Kurzbeschreibung;Eigenschaften;Technische-Daten;Bild1;Bild2;Bild3;Bild4;Bild5;Bild6;Bild7;Bild8;Lieferbar;"Lieferbar Ab";Versandzeit;"UVP-Preis inkl. MwSt.";"Preis1";"Preis2";Hersteller L7335272;5420025602129; Mini Light XLR ;0.1000; JB Systems Schwanenhalsleuchte mit XLR Anschluss. ;;;;http://www.example.com/media/images/org/pic20070114153500a.jpg;;;;;;;;JA;; 1-3 Tage ;12,90;12,90;6,57; JB Systems L3320502;540207025601636; Mini Light LED BNC ;0.1000; JB Systems LED Schwanenhalsleuchte mit BNC Anschluss. ;;;;http://www.example.com/media/images/org/pic20061231171705a.jpg;;;;;;;;JA;; 1-3 Tage ;29,90;25,89;15,26; JB Systems L1332254;542002556023143; Mini Light LED XLR ;0.1000; JB Systems LED Schwanenhalsleuchte mit XLR Anschluss. ;;;;http://www.example.com/media/images/org/pic20061231171728a.jpg;;;;;;;;JA;; 1-3 Tage ;29,90;25,89;15,26; JB Systems L8302591;504200256280277; Spiegelkugel 10cm ;0.5000; JB Systems Spiegelkugel 10cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214825a.jpg;;;;;;;;JA;; 1-3 Tage ;5,50;4,90;2,81; JB Systems L7302932;542000256510222; Spiegelkugel 20cm ;0.8400; JB Systems Spiegelkugel 20cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214907a.jpg;;;;;;;;JA;; 1-3 Tage ;12,90;11,50;6,58; JB Systems L2350293;534200562064239; Spiegelkugel 30cm ;2.1300; JB Systems Spiegelkugel 30 cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214956a.jpg;;;;;;;;JA;; 1-3 Tage ;26,90;23,00;13,72; JB Systems L3302984;545200252024246; Spiegelkugel 40cm ;3.5000; JB Systems Spiegelkugel 40cm Durchmesser mit Sicherungsring und einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324215050a.jpg;;;;;;;;JA;; 1-3 Tage ;54,90;49,00;28,00; JB Systems L9302495;542205056225600; Spiegelkugel 50cm ;5.3900; JB Systems Spiegelkugel 50cm Durchmesser mit Sicherungsring und einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324215122a.jpg;;;;;;;;JA;; 1-3 Tage ;89,00;79,00;45,39; JB Systems But the message box that pops up after FileReadLine contains this:
      䅲瑩步汮畭浥爻䕁中䍯摥㭁牴楫敬湡浥㭁牴楫敬来睩捨琻䉥獣桲敩扵湧㭋畲穢敳捨牥楢畮朻䕩来湳捨慦瑥渻呥捨湩獣桥ⵄ慴敮㭂楬搱㭂楬搲㭂楬搳㭂楬搴㭂楬搵㭂楬搶㭂楬搷㭂楬搸㭌楥晥牢慲㬢䱩敦敲扡爠䅢∻噥牳慮摺敩琻≕噐ⵐ牥楳⁩湫氮⁍睓琮∻≐牥楳ㄢ㬢偲敩猲∻䡥牳瑥汬敲ੌ㜳㌵㈷㈻㔴㈰〲㔶〲ㄲ㤻M楮椠䱩杨琠塌刀㬰⸱〰〻J䈠卹獴敭猠卣桷慮敮桡汳汥畣桴攠浩琠塌删䅮獣桬畳献;㬻㭨瑴瀺⼯睷眮數慭灬攮捯洯浥摩愯業慧敳⽯牧⽰楣㈰〷〱ㄴㄵ㌵〰愮橰朻㬻㬻㬻㭊䄻㬀ㄭ㌠呡来;ㄲⰹ〻ㄲⰹ〻㘬㔷㬀䩂⁓祳瑥浳 I've attached both files I use. 
      My guess is, that there's something wrong with the encoding, but I'm not sure how to fix it.
       
      Thank you very much for your help
      - virhonestum
      Encodingtester.au3
      TestFile.csv
    • 4bst1n3nz
      By 4bst1n3nz
      Hello,
      i need to save files with ANSI-Encoding. Since 3.3.14.2 Auto-It it doesn't work in any direction.
      I tried the following:
      #include <FileConstants.au3> FileDelete(@ScriptDir&"\Test.txt") $o = FileOpen(@ScriptDir&"\Test.txt", BitOR($FO_BINARY,$FO_ANSI,$FO_OVERWRITE)) FileWrite($o, "Test") FileClose($o) Or
      #include <FileConstants.au3> FileDelete(@ScriptDir&"\Test.txt") $o = FileOpen(@ScriptDir&"\Test.txt", 514) FileWrite($o, "Test") FileClose($o) Both create UTF-8 encoded files.
      What am i doing wrong?
      Thank you!