
Problem with binary data




Hi all,

I have a very big problem!

I am writing data to a file. When I write to the file without using a variable for the filename, it works correctly :-)

But if I use a variable for the filename, the string encoding is wrong!

FileWrite(@HomeDrive & "TEST.xml", $Data_xml)

FileWrite($Path, $Data_xml)

With a variable, the result is: R걡ration d'ordinateurs et d'겵ipements p곩ph곩ques

Without a variable, the result is correct: Réparation d'ordinateurs et d'équipements périphériques

 


#include <IE.au3>
#include <File.au3>
#include <String.au3>

Global $Data_xml = ""
$Emplacement_enregistrement = "c:\data.xml"
$Enregistrement_XML = 0
$IE = _IECreate("http://www.pagesjaunes.fr/annuaire/chalon-sur-saone-71/informatique-conseils-services-maintenanc", 0, 0)
If @error Then
    Exit
EndIf
$SourceHTML = _IEPropertyGet($IE, "outerhtml")
Local $Div_resultats = _IEGetObjById($IE, "reformulationZone")
If Not @error Then
    $Texte_resultats = _IEPropertyGet($Div_resultats, "innertext")
    $Nombre_resultats = StringRegExp(StringTrimLeft($Texte_resultats, StringInStr($Texte_resultats, ':', 0, -1)), '(?s)(\d+)', 1)
    If @error = 0 Then
        FileWrite($Emplacement_enregistrement, '<?xml version="1.0" encoding="UTF-8"?>' & @CRLF & '<root>' & @CRLF)
        $Nombre_pages = Ceiling($Nombre_resultats[0] / 20)
        For $Page = 1 To $Nombre_pages
            For $Resultats = 1 To $Nombre_resultats[0]
                $Data_xml = ""
                $div = _IEGetObjById($IE, 'lrVisitCard-'& $Resultats)
                $Bloc_entreprise = _IEPropertyGet($div, "innerhtml")
                $Nom_societe = StringRegExp($Bloc_entreprise, '(?s)<h2.*?<span>(.*?)</span>.*?h2>', 1)
                If @error = 0 Then
                    $Data_xml &= "<enseigne><![CDATA[" & $Nom_societe[0] & "]]></enseigne>" & @CRLF
                EndIf
                $Container_adresse_div = _StringBetween($Bloc_entreprise, '<div class="localisationBlock">', '</div>')
                If @error = 0 Then
                    $Container_adresse_p = _StringBetween($Container_adresse_div[0], '<p>', '</p>')
                    If @error = 0 Then
                        $HTML_adresse = StringRegExpReplace($Container_adresse_p[0], '(<br>)', @CRLF)
                        If @error = 0 Then
                            $HTML_adresse = StringRegExpReplace($HTML_adresse, '(<.*?>)', '')
                            If @error = 0 Then
                                $Adresse = StringStripWS(StringRegExpReplace(StringRegExpReplace(StringRegExpReplace(StringRegExpReplace($HTML_adresse, "&nbsp;", " "), " r ", " rue "), " av ", " avenue "), " rte ", " route "), 3)
                                $Code_postal = StringRegExp($Adresse, '\d{5}', 1)
                                If @error = 0 Then
                                    $Data_xml &= "<CP><![CDATA[" & $Code_postal[0] & "]]></CP>" & @CRLF
                                    If @error = 0 Then
                                        $Position = StringInStr($Adresse, $Code_postal[0])
                                        $Ville = StringTrimLeft($Adresse, $Position + 5)
                                    EndIf
                                    $Adresse = StringStripWS(StringLeft($Adresse, $Position -2), 7)
                                    $Data_xml &= "<adresse><![CDATA[" & StringStripWS(StringLower($Adresse), 7) & "]]></adresse>" & @CRLF
                                EndIf
                            EndIf
                        EndIf
                    EndIf
                EndIf
                FileWrite(@HomeDrive & "\test.xml", $Data_xml)
                FileWrite($Emplacement_enregistrement, $Data_xml)
            Next
        Next
        _FileWriteToLine($Emplacement_enregistrement, 1, '<?xml version="1.0" encoding="UTF-8"?>' & @CRLF & '<root>' & @CRLF)
    EndIf
EndIf
_IEQuit($IE)


ATR,

Not sure what is happening. You are running two successive FileWrite calls referencing the same filename (presumably overwriting the file). When I change the filename for the second FileWrite, it appears to work.

#include <IE.au3>
#include <File.au3>
#include <String.au3>

Global $Data_xml = ""
$Emplacement_enregistrement = "c:\data.xml"
$Enregistrement_XML = 0
$IE = _IECreate("http://www.pagesjaunes.fr/annuaire/chalon-sur-saone-71/informatique-conseils-services-maintenanc", 0, 0)
If @error Then
    Exit
EndIf
$SourceHTML = _IEPropertyGet($IE, "outerhtml")
Local $Div_resultats = _IEGetObjById($IE, "reformulationZone")
If Not @error Then
    $Texte_resultats = _IEPropertyGet($Div_resultats, "innertext")
    $Nombre_resultats = StringRegExp(StringTrimLeft($Texte_resultats, StringInStr($Texte_resultats, ':', 0, -1)), '(?s)(\d+)', 1)
    If @error = 0 Then
        FileWrite($Emplacement_enregistrement, '<?xml version="1.0" encoding="UTF-8"?>' & @CRLF & '<root>' & @CRLF)
        $Nombre_pages = Ceiling($Nombre_resultats[0] / 20)
        For $Page = 1 To $Nombre_pages
            For $Resultats = 1 To $Nombre_resultats[0]
                $Data_xml = ""
                $div = _IEGetObjById($IE, 'lrVisitCard-'& $Resultats)
                $Bloc_entreprise = _IEPropertyGet($div, "innerhtml")
                $Nom_societe = StringRegExp($Bloc_entreprise, '(?s)<h2.*?<span>(.*?)</span>.*?h2>', 1)
                If @error = 0 Then
                    $Data_xml &= "<enseigne><![CDATA[" & $Nom_societe[0] & "]]></enseigne>" & @CRLF
                EndIf
                $Container_adresse_div = _StringBetween($Bloc_entreprise, '<div class="localisationBlock">', '</div>')
                If @error = 0 Then
                    $Container_adresse_p = _StringBetween($Container_adresse_div[0], '<p>', '</p>')
                    If @error = 0 Then
                        $HTML_adresse = StringRegExpReplace($Container_adresse_p[0], '(<br>)', @CRLF)
                        If @error = 0 Then
                            $HTML_adresse = StringRegExpReplace($HTML_adresse, '(<.*?>)', '')
                            If @error = 0 Then
                                $Adresse = StringStripWS(StringRegExpReplace(StringRegExpReplace(StringRegExpReplace(StringRegExpReplace($HTML_adresse, "&nbsp;", " "), " r ", " rue "), " av ", " avenue "), " rte ", " route "), 3)
                                $Code_postal = StringRegExp($Adresse, '\d{5}', 1)
                                If @error = 0 Then
                                    $Data_xml &= "<CP><![CDATA[" & $Code_postal[0] & "]]></CP>" & @CRLF
                                    If @error = 0 Then
                                        $Position = StringInStr($Adresse, $Code_postal[0])
                                        $Ville = StringTrimLeft($Adresse, $Position + 5)
                                    EndIf
                                    $Adresse = StringStripWS(StringLeft($Adresse, $Position -2), 7)
                                    $Data_xml &= "<adresse><![CDATA[" & StringStripWS(StringLower($Adresse), 7) & "]]></adresse>" & @CRLF
                                EndIf
                            EndIf
                        EndIf
                    EndIf
                EndIf
                FileWrite(@HomeDrive & "\test1.xml", $Data_xml)     ; <<<------- changed file name so next stmt doesn't overwrite
                FileWrite($Emplacement_enregistrement, $Data_xml)
            Next
        Next
        _FileWriteToLine($Emplacement_enregistrement, 1, '<?xml version="1.0" encoding="UTF-8"?>' & @CRLF & '<root>' & @CRLF)
    EndIf
EndIf
_IEQuit($IE)

kylomas


You should always open a file with FileOpen() before writing to it.

[optional] Mode to open the file in.
Can be a combination of the following:
  0 = Read mode (default)
  1 = Write mode (append to end of file)
  2 = Write mode (erase previous contents)
  8 = Create directory structure if it doesn't exist (See Remarks).
  16 = Force binary mode (See Remarks).
  32 = Use Unicode UTF16 Little Endian reading and writing mode. Reading does not override existing BOM.
  64 = Use Unicode UTF16 Big Endian reading and writing mode. Reading does not override existing BOM.
  128 = Use Unicode UTF8 (with BOM) reading and writing mode. Reading does not override existing BOM.
  256 = Use Unicode UTF8 (without BOM) reading and writing mode.
  16384 = When opening for reading and no BOM is present, use full file UTF8 detection. If this is not used then only the initial part of the file is checked for UTF8.
The folder path must already exist (except using mode '8' - See Remarks).

That is from the FileOpen help file. You need to specify the encoding (probably UTF-16 Little Endian).
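
A minimal sketch of that advice, using only the mode values quoted above. The path and sample text are made up for illustration. It also assumes UTF-8 (mode 128) rather than UTF-16, only because the XML header in the script declares encoding="UTF-8"; swap in 32 if UTF-16 Little Endian turns out to be what you need.

```autoit
; Hypothetical path and sample text, for illustration only
Local $sPath = @ScriptDir & "\data.xml"
Local $sData = "<adresse><![CDATA[R" & ChrW(233) & "paration d'ordinateurs]]></adresse>" & @CRLF

; 2 = write mode (erase previous contents), 128 = UTF-8 with BOM
Local $hFile = FileOpen($sPath, 2 + 128)
If $hFile = -1 Then
    MsgBox(16, "Error", "Unable to open " & $sPath)
    Exit 1
EndIf

; Every write goes through the same handle, so the encoding stays the same
FileWrite($hFile, '<?xml version="1.0" encoding="UTF-8"?>' & @CRLF & '<root>' & @CRLF)
FileWrite($hFile, $sData)
FileWrite($hFile, '</root>' & @CRLF)
FileClose($hFile)
```

Opening the file once and passing the handle to FileWrite() means the encoding is chosen explicitly at FileOpen() time instead of being left to whatever FileWrite() picks when it is given a filename.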

cheers E.

Edited by Edano
