virhonestum

FileReadLine outputs gibberish

9 posts in this topic

Hey,

I've come across a very odd problem. I want to download a CSV file and process its contents.

This is the heavily simplified AutoIt code, assuming the file is already downloaded:

$f= @ScriptDir & "\TestFile.csv"

$file = FileOpen($f,0)
Local $line = FileReadLine($file)
MsgBox(0,"",$line)

The downloaded CSV file I want to process contains something like this:

Artikelnummer;EAN-Code;Artikelname;Artikelgewicht;Beschreibung;Kurzbeschreibung;Eigenschaften;Technische-Daten;Bild1;Bild2;Bild3;Bild4;Bild5;Bild6;Bild7;Bild8;Lieferbar;"Lieferbar Ab";Versandzeit;"UVP-Preis inkl. MwSt.";"Preis1";"Preis2";Hersteller
L7335272;5420025602129; Mini Light XLR ;0.1000; JB Systems Schwanenhalsleuchte mit XLR Anschluss. ;;;;http://www.example.com/media/images/org/pic20070114153500a.jpg;;;;;;;;JA;; 1-3 Tage ;12,90;12,90;6,57; JB Systems 
L3320502;540207025601636; Mini Light LED BNC ;0.1000; JB Systems LED Schwanenhalsleuchte mit BNC Anschluss. ;;;;http://www.example.com/media/images/org/pic20061231171705a.jpg;;;;;;;;JA;; 1-3 Tage ;29,90;25,89;15,26; JB Systems 
L1332254;542002556023143; Mini Light LED XLR ;0.1000; JB Systems LED Schwanenhalsleuchte mit XLR Anschluss. ;;;;http://www.example.com/media/images/org/pic20061231171728a.jpg;;;;;;;;JA;; 1-3 Tage ;29,90;25,89;15,26; JB Systems 
L8302591;504200256280277; Spiegelkugel 10cm ;0.5000; JB Systems Spiegelkugel 10cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214825a.jpg;;;;;;;;JA;; 1-3 Tage ;5,50;4,90;2,81; JB Systems 
L7302932;542000256510222; Spiegelkugel 20cm ;0.8400; JB Systems Spiegelkugel 20cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214907a.jpg;;;;;;;;JA;; 1-3 Tage ;12,90;11,50;6,58; JB Systems 
L2350293;534200562064239; Spiegelkugel 30cm ;2.1300; JB Systems Spiegelkugel 30 cm Durchmesser mit einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324214956a.jpg;;;;;;;;JA;; 1-3 Tage ;26,90;23,00;13,72; JB Systems 
L3302984;545200252024246; Spiegelkugel 40cm ;3.5000; JB Systems Spiegelkugel 40cm Durchmesser mit Sicherungsring und einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324215050a.jpg;;;;;;;;JA;; 1-3 Tage ;54,90;49,00;28,00; JB Systems 
L9302495;542205056225600; Spiegelkugel 50cm ;5.3900; JB Systems Spiegelkugel 50cm Durchmesser mit Sicherungsring und einer hohen Dichte durch 10 x 10 mm Echtglasspiegel. ;;;;http://www.example.com/media/images/org/pic20060324215122a.jpg;;;;;;;;JA;; 1-3 Tage ;89,00;79,00;45,39; JB Systems

But the message box that pops up after FileReadLine contains this:

䅲瑩步汮畭浥爻䕁中䍯摥㭁牴楫敬湡浥㭁牴楫敬来睩捨琻䉥獣桲敩扵湧㭋畲穢敳捨牥楢畮朻䕩来湳捨慦瑥渻呥捨湩獣桥ⵄ慴敮㭂楬搱㭂楬搲㭂楬搳㭂楬搴㭂楬搵㭂楬搶㭂楬搷㭂楬搸㭌楥晥牢慲㬢䱩敦敲扡爠䅢∻噥牳慮摺敩琻≕噐ⵐ牥楳⁩湫氮⁍睓琮∻≐牥楳ㄢ㬢偲敩猲∻䡥牳瑥汬敲ੌ㜳㌵㈷㈻㔴㈰〲㔶〲ㄲ㤻M楮椠䱩杨琠塌刀㬰⸱〰〻J䈠卹獴敭猠卣桷慮敮桡汳汥畣桴攠浩琠塌删䅮獣桬畳献;㬻㭨瑴瀺⼯睷眮數慭灬攮捯洯浥摩愯業慧敳⽯牧⽰楣㈰〷〱ㄴㄵ㌵〰愮橰朻㬻㬻㬻㭊䄻㬀ㄭ㌠呡来;ㄲⰹ〻ㄲⰹ〻㘬㔷㬀䩂⁓祳瑥浳

I've attached both files I use.

My guess is that something is wrong with the encoding, but I'm not sure how to fix it.

Thank you very much for your help

- virhonestum

Encodingtester.au3

TestFile.csv

#3 ·  Posted (edited)

1 hour ago, Jos said:

There are NUL characters in your file so the file is opened with the wrong encoding.

Jos
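One way to check this diagnosis is to ask AutoIt which encoding it detects for the file. The gibberish above is consistent with each pair of single-byte characters being read as one 16-bit character (for example "Ar", bytes 0x41 0x72, becomes U+4172, which displays as 䅲). The following is a minimal sketch, assuming TestFile.csv sits next to the script; $sPath and $iEncoding are illustrative names, not taken from the original script.

```autoit
#include <FileConstants.au3>

; Assumption: the CSV file is in the same folder as the script.
Local $sPath = @ScriptDir & "\TestFile.csv"

; FileGetEncoding returns a value comparable to the FileOpen mode flags,
; or -1 if the file could not be examined.
Local $iEncoding = FileGetEncoding($sPath)

If $iEncoding = -1 Then
    ConsoleWrite("Could not determine the encoding of " & $sPath & @CRLF)
Else
    ConsoleWrite("Detected encoding value: " & $iEncoding & @CRLF)
EndIf
```

The printed value can be compared against the $FO_* constants in FileConstants.au3 (for instance $FO_UTF8 or $FO_UTF16_BE_NOBOM) to see what AutoIt would assume when the file is opened with mode 0.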

And how do I get rid of the NUL characters? Neither _ReplaceStringInFile($f, " ", "") nor _ReplaceStringInFile($f, Chr(0), "") seems to work.

Thank you

- virhonestum

Edited by virhonestum

Global Const $NULL = Chr(0)

;)



Or just use:

$file = FileOpen($f, 128) ;~ Opens the file using UTF8 encoding
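For context, a slightly fuller version of this suggestion might look like the sketch below. It assumes the file really is UTF-8 (or plain ASCII) and uses the named constants from FileConstants.au3 instead of the literal 128; the variable names and the error handling are illustrative additions rather than part of the original script.

```autoit
#include <FileConstants.au3>
#include <MsgBoxConstants.au3>

; Assumption: the CSV file is in the same folder as the script.
Local $sPath = @ScriptDir & "\TestFile.csv"

; Open read-only and force UTF-8 ($FO_UTF8 = 128) instead of letting
; AutoIt guess the encoding from the file contents.
Local $hFile = FileOpen($sPath, $FO_READ + $FO_UTF8)
If $hFile = -1 Then
    MsgBox($MB_ICONERROR, "Error", "Could not open " & $sPath)
    Exit 1
EndIf

; Read the first line (the CSV header) and release the handle.
Local $sLine = FileReadLine($hFile)
FileClose($hFile)

MsgBox($MB_OK, "First line", $sLine)
```

Forcing the encoding sidesteps the detection step entirely, so the NUL characters no longer influence how the text is decoded. If the file turned out to be ANSI rather than UTF-8 (an assumption, nothing in this thread confirms it), $FO_ANSI would be the analogous flag to try.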

15 minutes ago, TheDcoder said:
Global Const $NULL = Chr(0)

;)

Thanks, but how exactly do I implement it? Just adding this line of code does nothing, and using $NULL in the _ReplaceStringInFile() call doesn't work either, since that is essentially the same as what I already tried.

42 minutes ago, Subz said:

Or just use:

$file = FileOpen($f, 128) ;~ Opens the file using UTF8 encoding

Thank You!!!! This worked!


Sorry, I never checked the full output. I know the following works:

#include <Array.au3>
#include <Excel.au3>

; Start Excel (or connect to an already running instance).
Local $oExcel = _Excel_Open()
If @error Then Exit

; Let Excel open and parse the CSV file.
Local $sWorkbook = @ScriptDir & '\TestFile.csv'
Local $oWorkbook = _Excel_BookOpen($oExcel, $sWorkbook)
If @error Then Exit

; Read the used range of the active sheet into an array.
Local $aResult = _Excel_RangeRead($oWorkbook, Default, $oWorkbook.ActiveSheet.UsedRange)
If @error Then Exit

_ArrayDisplay($aResult)

 
