Opened 9 years ago

Closed 4 years ago

#3137 closed Bug (Fixed)

FileRead() treats count parameter as bytes instead of characters for UTF-8 files

Reported by: miraged
Owned by:
Milestone:
Component: AutoIt
Version: 3.3.14.2
Severity: None
Keywords:
Cc:

Description

When reading from UTF-8 files (with or without a BOM), the count parameter is treated as a number of bytes rather than a number of characters. UTF-16 and ANSI files work as expected. In the best case you get fewer characters than requested; in the worst case the returned string ends in a truncated multi-byte sequence. Tested on 3.3.10.2, 3.3.14.2 and 3.3.15.0.
Example is attached.
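
The kind of check described above can be sketched as follows. This is not the attached repro script; the scratch file name and variable names are hypothetical, and the numeric FileOpen flags (2 = overwrite, 128 = UTF-8 with BOM, 0 = read) are assumed to match the documented mode constants. No expected output is given, since the result is exactly what this ticket disputes and may differ between AutoIt versions.

```autoit
; Hypothetical scratch file; not the attached repro script.
Local $sPath = @TempDir & "\fileread_count_demo.txt"

; Write three euro signs (U+20AC, 3 bytes each in UTF-8).
; 2 = overwrite, 128 = UTF-8 with BOM.
Local $hOut = FileOpen($sPath, 2 + 128)
If $hOut = -1 Then Exit 1
FileWrite($hOut, ChrW(0x20AC) & ChrW(0x20AC) & ChrW(0x20AC))
FileClose($hOut)

; Reopen for reading as UTF-8 and request a count of 2.
Local $hIn = FileOpen($sPath, 0 + 128)
If $hIn = -1 Then Exit 1
Local $sChunk = FileRead($hIn, 2)
Local $iReported = @extended ; capture before any other function call resets it
FileClose($hIn)

; Compare what FileRead reports with the length of the returned string.
ConsoleWrite("@extended=" & $iReported & @TAB & "StringLen=" & StringLen($sChunk) & @LF)
FileDelete($sPath)
```

If count is taken as characters, StringLen should report 2; a smaller value or a damaged trailing character would point to byte-based counting.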

Attachments (1)

FileRead_UTF-8_Bug.au3 (1.7 KB) - added by miraged 9 years ago.
Repro

Change History (4)

Changed 9 years ago by miraged

Repro

comment:1 Changed 6 years ago by BrewManNH

Running that test file shows me something different.

It looks like StringLen is at fault here, not FileRead. If you add a ConsoleWrite right after the FileRead and output @extended, you will see that it always reads 7 characters/bytes, as it is supposed to, but StringLen reports the wrong value. Compare the starting and ending offsets: they are identical between the first and second tests.

comment:2 Changed 6 years ago by jchd18

Furthermore, the poster uses files with a BOM, which shifts the count of bytes read.

comment:3 Changed 4 years ago by jchd18

  • Resolution set to Fixed
  • Status changed from new to closed

Current release/beta versions of AutoIt work correctly; the "repro" code is wrong.

This simple code

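; Write three euro signs to a scratch file.
Local $f = "len.txt"
FileWrite($f, "€€€")
; Read the whole file back; FileRead sets @extended to the count it read.
Local $s = FileRead($f)
; Print that count, then the string length in characters.
ConsoleWrite(@extended & @TAB & StringLen($s) & @LF)
FileDelete($f)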

correctly yields
9 3
since '€' uses 3 bytes in UTF-8.
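
The same byte/character distinction can be checked without touching the file system. The sketch below assumes the built-in StringToBinary (flag 4 selects UTF-8), BinaryLen and StringLen behave as documented; the variable names are illustrative. ChrW(0x20AC) is used instead of a literal '€' so the result does not depend on how the script file itself is encoded.

```autoit
; Build a three-character string of euro signs (U+20AC) without a literal.
Local $sEuro = ChrW(0x20AC) & ChrW(0x20AC) & ChrW(0x20AC)

; Encode it as UTF-8; flag 4 corresponds to $SB_UTF8.
Local $dUtf8 = StringToBinary($sEuro, 4)

; Character count first, then UTF-8 byte count.
ConsoleWrite(StringLen($sEuro) & @TAB & BinaryLen($dUtf8) & @LF)
```

This should print 3 and 9: three characters, each encoded as the three bytes E2 82 AC.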
