Byte Order Marker (BOM) check for Unicode


daddydave

There are probably better ways to do this, but I couldn't find anything in the forums for "Byte Order Marker" or BOM, so here you go:

; Function Name:   _BOMCheck()
;
; Description:     Determines whether a given file is ANSI,
;                  UTF-16 Little Endian, UTF-16 Big Endian, or UTF-8
;
; Syntax:          _BOMCheck ( $filename )
;
; Parameter(s):    $filename   = The file to check
;
; Requirement(s):  Must be Unicode build of AutoIt v3.2.4.0 or later.
;
; Return Value(s): ANSI or Unsupported:    Returns 0
;                  UTF-16 Little Endian:   Returns 32
;                  UTF-16 Big Endian:      Returns 64
;                  UTF-8:                  Returns 128
;                  Problem opening file:   Returns -1
;                  Any return value other than -1 can be added to the
;                  FileOpen mode to open the file with the matching encoding.
;
; Author:          David Eason
;
; Sample usage:
;While 1
;    Local $file = FileOpenDialog("Choose a file to determine Ansi or Unicode", @DesktopDir & "\", "Text files (*.txt;*.inf)", 3)
;    If @error = 1 Then ExitLoop
;    MsgBox(0, "", _BOMCheck($file))
;WEnd


Func _BOMCheck(Const ByRef $filename)

    ; Supported Byte Order Markers for non-ANSI
    Local $BOMS[3]
    $BOMS[0] = Binary("0xFFFE") ;UTF-16 Little Endian
    $BOMS[1] = Binary("0xFEFF") ;UTF-16 Big Endian
    $BOMS[2] = Binary("0xEFBBBF") ;UTF-8

    ; Corresponding mode bit for FileOpen
    Local $FileModes[3]
    $FileModes[0] = 32
    $FileModes[1] = 64
    $FileModes[2] = 128

    Local $FH = FileOpen($filename, 4)

    If $FH = -1 Then
        Return -1
    EndIf
    Local $FirstBytes = FileRead($FH, 3)

    ; FileRead sets @error to -1 at end of file, so an empty file also returns -1
    If @error = -1 Then
        FileClose($FH)
        Return -1
    EndIf

    Local $I
    For $I = 0 To 2
        ; Compare only as many leading bytes as this BOM is long
        If BinaryMid($FirstBytes, 1, BinaryLen($BOMS[$I])) = $BOMS[$I] Then
            FileClose($FH)
            Return $FileModes[$I]
        EndIf
    Next

    ; If still here, then it is presumed ANSI
    FileClose($FH)
    Return 0

EndFunc   ;==>_BOMCheck
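
Since the return value is meant to feed into the FileOpen mode, here is a minimal sketch of how that might look. The helper name _OpenWithDetectedEncoding and the sample.txt path are made up for illustration and are not part of the function above; it assumes _BOMCheck() is defined in or included by the same script.

```autoit
; Hypothetical helper (not part of the original UDF): opens $sPath for reading
; with the encoding flag that _BOMCheck() reports.
Func _OpenWithDetectedEncoding($sPath)
    Local $iEncoding = _BOMCheck($sPath)
    If $iEncoding = -1 Then Return -1 ; _BOMCheck could not open or read the file

    ; 0 = read mode; adding 0, 32, 64 or 128 selects
    ; ANSI, UTF-16 LE, UTF-16 BE or UTF-8 respectively
    Return FileOpen($sPath, 0 + $iEncoding)
EndFunc   ;==>_OpenWithDetectedEncoding

Local $sPath = @ScriptDir & "\sample.txt" ; assumed path, for illustration only
Local $hFile = _OpenWithDetectedEncoding($sPath)
If $hFile <> -1 Then
    MsgBox(0, "First line", FileReadLine($hFile))
    FileClose($hFile)
EndIf
```

The -1 case is checked before the addition because -1 is an error code rather than a mode bit, so it should never be passed on to FileOpen.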