tittoproject

Problem with "FileReadLine" and "UCS-2 Little Endian" encoding

I'm trying to read the first line of a .reg file generated by Regedit in Win XP SP2.

The code I've used is:

$read = 'test2.reg'
$first_line = FileReadLine($read, 1)
$write = FileOpen('first_line.txt', 2)
FileWriteLine($write, $first_line)
FileClose($write)

The first line of "test2.reg" is: "Windows Registry Editor Version 5.00"

I get a file (first_line.txt) encoded in "UCS-2 Little Endian". Switching to "8-Bit" encoding, the result is: "àµ" :)

No problems if the .reg file is generated on Win 98 (the first line is "REGEDIT4").

Any hope of solving this?

When you say it is "UCS-2 Little Endian", do you mean the first bytes are FF FE?

Byte Order Marks (BOM):

FF FE = UCS-2 (16bit) little endian

FE FF = UCS-2 (16bit) big endian

FF FE 00 00 = UCS-4 (32 bit) little endian

00 00 FE FF = UCS-4 (32 bit) big endian

How did you switch to "8-Bit"? :)
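
The BOM check above can be sketched in AutoIt itself. This is only a sketch, assuming a current AutoIt v3 release where FileOpen mode 16 reads raw bytes and Hex() accepts binary data; `_DetectBom` is a hypothetical helper name, not a built-in function, and test2.reg is the file from the first post.

```autoit
; Sketch: report which BOM (if any) a file starts with.
; _DetectBom is a hypothetical helper name, not a built-in function.
Func _DetectBom($sFile)
    Local $hFile = FileOpen($sFile, 16) ; 16 = binary (raw byte) read mode
    If $hFile = -1 Then Return SetError(1, 0, "")
    Local $sHex = Hex(FileRead($hFile, 4)) ; first four bytes as a hex string
    FileClose($hFile)

    ; Test the 4-byte marks first: FF FE 00 00 also begins with FF FE.
    Select
        Case $sHex = "0000FEFF"
            Return "UCS-4 big endian"
        Case $sHex = "FFFE0000"
            Return "UCS-4 little endian"
        Case StringLeft($sHex, 4) = "FEFF"
            Return "UCS-2 big endian"
        Case StringLeft($sHex, 4) = "FFFE"
            Return "UCS-2 little endian"
    EndSelect
    Return "no BOM found"
EndFunc

MsgBox(64, "BOM check", _DetectBom("test2.reg"))
```

The 32-bit marks are tested before the 16-bit ones because a UCS-4 little endian BOM would otherwise be reported as UCS-2 little endian.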



That looks like ASCII to me, have you tried converting it?



Try:

RunWait(@Comspec & " /c Type test2.reg > Test2b.reg")

And read the Test2b.reg file to see if that works.



#5 ·  Posted (edited)

See http://www.autoitscript.com/forum/index.ph...findpost&p=7061

Are you using

REGEDIT /E pathname "RegPath"

or

REG EXPORT pathname "RegPath"

I wonder whether that makes a difference as to whether the output is Unicode or ASCII...

Edit: JdeB beat me to it....

While I'm editing, I'll add some links:

http://www.ss64.com/nt/regedit.html

and

http://www.microsoft.com/resources/documen...g.mspx?mfr=true

Edited by CyberSlug


Works fine!!!

Thanks a lot to everybody!


Explanation: when its output is redirected to a file, the Type command converts a Unicode file to a regular 8-bit text file. :)
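
As an alternative to the temporary copy made with Type, the encoding can be stated when the file is opened. A minimal sketch, assuming a recent AutoIt v3 release in which FileOpen accepts mode 32 (UTF-16 little endian); older releases may not support this flag. The file name is the one from the first post.

```autoit
; Sketch: read the first line of a UTF-16 LE file without a temporary copy.
Local $hFile = FileOpen("test2.reg", 32) ; 32 = UTF-16 little endian, opened for reading
If $hFile = -1 Then
    MsgBox(16, "Error", "Could not open test2.reg")
    Exit
EndIf
Local $sFirstLine = FileReadLine($hFile, 1) ; line 1 of the decoded text
FileClose($hFile)
MsgBox(64, "First line", $sFirstLine)
```

Reading through a handle opened this way leaves the original file untouched, so no cleanup of a converted copy is needed.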


#8 ·  Posted (edited)

Well, that took all the fun out of it... but I was using it as an excuse to see how Chr() reacted to Unicode, so I played with an exported reg key (the SOFTWARE\Mozilla key exported to C:\Temp\Mozilla.reg), since it will be useful for another project:

; Test reading unicode file
#include<array.au3>
#include<string.au3>

$UniFile = "C:\Temp\Mozilla.reg"

$FileHandle = FileOpen($UniFile, 0)
$FileData = _StringToHex(FileRead($FileHandle))
FileClose($FileHandle)

$BOM4 = StringLeft($FileData, 4)
$BOM8 = StringLeft($FileData, 8)

Select
    Case $BOM8 = "0000FEFF"
        $UniCode = "UCS-4BE 32-bit Unicode - Big Endian"
        $BOM = $BOM8
    Case $BOM8 = "FFFE0000"
        $UniCode = "UCS-4LE 32-bit Unicode - Little Endian"
        $BOM = $BOM8
    Case $BOM4 = "FEFF"
        $UniCode = "UCS-2BE 16-bit Unicode - Big Endian"
        $BOM = $BOM4
    Case $BOM4 = "FFFE"
        $UniCode = "UCS-2LE 16-bit Unicode - Little Endian"
        $BOM = $BOM4
    Case Else
        MsgBox(64, "Unicode Test", "Unicode type unidentified!  First 8 hex characters = " & $BOM8)
        Exit
EndSelect
$BomLen = StringLen($BOM)   
$WorkingHex = StringTrimLeft($FileData, $BomLen)
$HexLen = StringLen($WorkingHex)
$CharLen = $HexLen / $BomLen

MsgBox(64, "Unicode Test", "First " & $BomLen & " hex characters = " & $BOM & " = " & $UniCode & @CRLF & @CRLF & _
        "This file is " & $HexLen & " hex characters long (minus the BOM), which is " & $CharLen & " Unicode characters.")

$WorkingHex = StringLeft($WorkingHex, 32 * $BomLen)
Dim $UniArray[1] = [0]
For $n = 1 To 32
    _ArrayAdd($UniArray, StringLeft($WorkingHex, $BomLen))
    $WorkingHex = StringTrimLeft($WorkingHex, $BomLen)
Next
$UniArray[0] = UBound($UniArray) - 1

_ArrayDisplay($UniArray, "First 32 hex char")

If StringInStr($UniCode, "Little") Then
    For $n = 1 To $UniArray[0]
        $NewChar = ""
        For $c = ($BomLen - 1) To 1 Step -2
            $NewChar = $NewChar & StringMid($UniArray[$n], $c, 2)
        Next
        $UniArray[$n] = $NewChar
    Next
    _ArrayDisplay($UniArray, "Endian Converted")
EndIf

For $n = 1 To $UniArray[0]
    ; Dec() turns the hex digits into a number; Chr() only handles codes 0-255
    $UniArray[$n] = Chr(Dec($UniArray[$n]))
Next
_ArrayDisplay($UniArray, "ASCII Converted")

Not needed, but I learned something from it... :)

P.S. I didn't make clear above: Chr() does not like unicode at all, hence the conversion in my script...

Edited by PsaltyDS


#9 ·  Posted (edited)

Here is a little function that converts a .reg file from Regedit5 format to Regedit4 format.

It uses the "type" command to convert Unicode to 8-bit, and then replaces the first line, "Windows Registry Editor Version 5.00", with "REGEDIT4".

Func _REGEDIT4($regfile5)
    Local $regfile4, $tempdir, $write, $read, $l, $line
    If FileReadLine($regfile5, 1) <> 'REGEDIT4' Then
        $tempdir = @TempDir & '\Regedit4'
        $regfile4 = $tempdir & '\regedit4.reg'
        DirCreate($tempdir)
        RunWait(@ComSpec & ' /c type "' & $regfile5 & '" > "' & $regfile4 & '"', '', @SW_HIDE)
        $write = FileOpen($regfile5, 2)
        $read = FileOpen($regfile4, 0)
        FileWriteLine($write, 'REGEDIT4')
        $l = 2
        While 1
            $line = FileReadLine($read, $l)
            If @error = -1 Then ExitLoop
            FileWriteLine($write, $line)
            $l = $l + 1
        WEnd
        FileClose($write)
        FileClose($read)
        DirRemove($tempdir, 1)
    EndIf
EndFunc

It seems to work... :)

One question: does the "type" command work on Win 98?

Edited by tittoproject
