Problem with "FileReadLine" and "UCS-2 Little Endian" encoding


I'm trying to read the first line of a .reg file generated by Regedit on Win XP SP2.

The code I've used is:

$read = 'test2.reg'
$first_line = FileReadLine($read, 1)
$write = FileOpen('first_line.txt', 2)
FileWriteLine($write, $first_line)
FileClose($write)

The first line of "test2.reg" is: "Windows Registry Editor Version 5.00"

I get a file (first_line.txt) encoded in "UCS-2 Little Endian". Switching to "8-Bit" encoding, the result is: "àµ" :)

No problems if the .reg file is generated from Win 98 (the first line is "REGEDIT4").

Is there any way to solve this?

When you say it is "UCS-2 Little Endian", do you mean the first bytes are FF FE?

Byte Order Marks (BOM):

FF FE = UCS-2 (16bit) little endian

FE FF = UCS-2 (16bit) big endian

FF FE 00 00 = UCS-4 (32 bit) little endian

00 00 FE FF = UCS-4 (32 bit) big endian

How did you switch to "8-Bit"? :)
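A quick way to check this is to dump the first few bytes of the file. The following is only a sketch: it assumes a newer AutoIt 3 build that supports binary file mode (FileOpen flag 16) and Hex() on binary data, which may not match the version in use here. The file name is the one from the original post.

```autoit
; Sketch: show the first bytes of a file so the BOM can be checked by eye.
; Assumes an AutoIt 3 build with binary file mode (FileOpen flag 16).
Local $sFile = "test2.reg" ; file name taken from the original post
Local $hFile = FileOpen($sFile, 16) ; 16 = binary read
If $hFile = -1 Then
    MsgBox(16, "BOM check", "Cannot open " & $sFile)
    Exit
EndIf
Local $bHead = FileRead($hFile, 4) ; read at most the first four bytes
FileClose($hFile)
; Hex() on binary data returns the bytes as hex text,
; so a UCS-2 little endian file should start with "FFFE"
MsgBox(64, "BOM check", "First bytes: " & Hex($bHead))
```

If the message box shows FFFE followed by two more bytes, the file has the little endian 16-bit BOM from the list above.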

Try:

RunWait(@Comspec & " /c Type test2.reg > Test2b.reg")

Then read the Test2b.reg file to see if that works.

See http://www.autoitscript.com/forum/index.ph...findpost&p=7061

Are you using

REGEDIT /E pathname "RegPath"

or

REG EXPORT "RegPath" pathname

I wonder whether the choice of command makes a difference as to whether the output is Unicode or ASCII....

Edit: JdeB beat me to it....

While I'm editing, I'll add some links:

http://www.ss64.com/nt/regedit.html

and

http://www.microsoft.com/resources/documen...g.mspx?mfr=true

Edited by CyberSlug
Works fine!!!

Thanks a lot to everybody!

Explanation: the Type command converts a Unicode file to a regular 8-bit text file when its output is redirected. :)
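For what it's worth, the same conversion can probably be done inside AutoIt without shelling out. This is only a rough sketch: it assumes a newer AutoIt 3 build that has binary file mode, BinaryToString() and StringToBinary(), which the version used in this thread may not have. The file names are the ones from the Type example above.

```autoit
; Sketch: convert a UTF-16 LE .reg file to 8-bit text without cmd.exe.
; Assumes an AutoIt 3 build with binary file mode and BinaryToString().
Local $hIn = FileOpen("test2.reg", 16) ; 16 = binary read
If $hIn = -1 Then Exit MsgBox(16, "Convert", "Cannot open test2.reg")
Local $bData = FileRead($hIn) ; whole file as binary data
FileClose($hIn)

; Skip the FF FE byte order mark if the file starts with one
If Hex(BinaryMid($bData, 1, 2)) = "FFFE" Then $bData = BinaryMid($bData, 3)

; 2 = decode the bytes as UTF-16 little endian
Local $sText = BinaryToString($bData, 2)

; Write the text back out as ANSI bytes (2 = overwrite, 16 = binary)
Local $hOut = FileOpen("Test2b.reg", 2 + 16)
FileWrite($hOut, StringToBinary($sText, 1)) ; 1 = ANSI
FileClose($hOut)
```

Note that the sketch decodes the data as UTF-16 LE unconditionally, so it only makes sense for files that really are in that encoding.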

Well, that took all the fun out of it... but I was using it as an excuse to see how Chr() reacted to Unicode, and played with an exported reg key (SOFTWARE\Mozilla key exported to C:\Temp\Mozilla.reg) since it will be useful for another project:

; Test reading unicode file
#include<array.au3>
#include<string.au3>

$UniFile = "C:\Temp\Mozilla.reg"

$FileHandle = FileOpen($UniFile, 0)
$FileData = _StringToHex(FileRead($FileHandle))
FileClose($FileHandle)

$BOM4 = StringLeft($FileData, 4)
$BOM8 = StringLeft($FileData, 8)

Select
    Case $BOM8 = "0000FEFF"
        $UniCode = "UCS-4BE 32-bit Unicode - Big Endian"
        $BOM = $BOM8
    Case $BOM8 = "FFFE0000"
        $UniCode = "UCS-4LE 32-bit Unicode - Little Endian"
        $BOM = $BOM8
    Case $BOM4 = "FEFF"
        $UniCode = "UCS-2BE 16-bit Unicode - Big Endian"
        $BOM = $BOM4
    Case $BOM4 = "FFFE"
        $UniCode = "UCS-2LE 16-bit Unicode - Little Endian"
        $BOM = $BOM4
    Case Else
        MsgBox(64, "Unicode Test", "Unicode type unidentified!  First 8 hex characters = " & $BOM8)
        Exit
EndSelect
$BomLen = StringLen($BOM)   
$WorkingHex = StringTrimLeft($FileData, $BomLen)
$HexLen = StringLen($WorkingHex)
$CharLen = $HexLen / $BomLen

MsgBox(64, "Unicode Test", "First " & $BomLen & " hex characters = " & $BOM & " = " & $UniCode & @CRLF & @CRLF & _
        "This file is " & $HexLen & " hex characters long (minus the BOM), which is " & $CharLen & " Unicode characters.")

$WorkingHex = StringLeft($WorkingHex, 32 * $BomLen)
Dim $UniArray[1] = [0]
For $n = 1 To 32
    _ArrayAdd($UniArray, StringLeft($WorkingHex, $BomLen))
    $WorkingHex = StringTrimLeft($WorkingHex, $BomLen)
Next
$UniArray[0] = UBound($UniArray) - 1

_ArrayDisplay($UniArray, "First 32 hex char")

If StringInStr($UniCode, "Little") Then
    For $n = 1 To $UniArray[0]
        $NewChar = ""
        For $c = ($BomLen - 1) To 1 Step -2
            $NewChar = $NewChar & StringMid($UniArray[$n], $c, 2)
        Next
        $UniArray[$n] = $NewChar
    Next
    _ArrayDisplay($UniArray, "Endian Converted")
EndIf

; Dec() parses the hex text explicitly instead of relying on implicit string-to-number conversion
For $n = 1 To $UniArray[0]
    $UniArray[$n] = Chr(Dec($UniArray[$n]))
Next
_ArrayDisplay($UniArray, "ASCII Converted")

Not needed, but I learned something from it... :)

P.S. I didn't make this clear above: Chr() does not handle Unicode at all, hence the conversion in my script...
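As a side note, newer AutoIt 3 builds have ChrW(), which takes a 16-bit code directly. A minimal sketch, assuming ChrW() is available in your version; the sample value is hypothetical and stands for one already byte-swapped entry from the array above:

```autoit
; Sketch: turn one big endian 16-bit hex value into a character.
; Assumes an AutoIt 3 build that provides ChrW().
Local $sHex = "0057" ; hypothetical sample value (U+0057)
Local $sChar = ChrW(Dec($sHex)) ; Dec() parses the hex text, ChrW() maps the code to a character
MsgBox(64, "ChrW test", $sHex & " = " & $sChar)
```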

Edited by PsaltyDS
Here is a little function that converts a .reg file from Regedit5 format to Regedit4 format.

It uses the "type" command to convert Unicode to 8-bit, and then replaces the first line, "Windows Registry Editor Version 5.00", with "REGEDIT4".

Func _REGEDIT4($regfile5)
    Local $regfile4, $tempdir, $write, $read, $line
    ; Nothing to do if the file already has the REGEDIT4 header
    If FileReadLine($regfile5, 1) <> 'REGEDIT4' Then
        $tempdir = @TempDir & '\Regedit4'
        $regfile4 = $tempdir & '\regedit4.reg'
        DirCreate($tempdir)
        ; "type" writes an 8-bit copy of the Unicode file into the temp folder
        RunWait(@ComSpec & ' /c type "' & $regfile5 & '" > "' & $regfile4 & '"', '', @SW_HIDE)
        $read = FileOpen($regfile4, 0)
        $write = FileOpen($regfile5, 2) ; overwrites the original file
        FileWriteLine($write, 'REGEDIT4')
        FileReadLine($read) ; skip the old version 5.00 header line
        While 1
            $line = FileReadLine($read) ; next line from the handle
            If @error Then ExitLoop ; end of file or read error
            FileWriteLine($write, $line)
        WEnd
        FileClose($write)
        FileClose($read)
        DirRemove($tempdir, 1)
    EndIf
EndFunc

It seems to work... :)

One small doubt: does the "type" command work on Win 98?

Edited by tittoproject