ATR

_INetGetSource blocks?

ATR

Hi all,

I have a problem when my script fetches web links.

It works fine for a while, but then it hangs at some point... :(

Sometimes it happens on the 50th pass, and other times on the 2500th pass.

I don't know what to do.

#include <Inet.au3>
#include <String.au3> ; needed for _StringBetween

#RequireAdmin

Opt("TrayAutoPause", 0)

TraySetState()

$i = 1
$x = 1
$Dossier = @WorkingDir & "\SIRENs\"
$Recherche = FileFindFirstFile($Dossier & "*.txt")
; Open the first output file once; 138 = UTF-8 (128) + create path (8) + overwrite (2)
$Fichiers = FileOpen(@WorkingDir & "\Sortie\SIRENS_" & $x & ".txt", 138)
While 1
    $Fichier = FileFindNextFile($Recherche)
    If @error Then ExitLoop
    $Lecture_fichier = FileRead($Dossier & $Fichier)
    $Sirens = StringSplit($Lecture_fichier, ';', 2)
    For $Siren In $Sirens
        TraySetToolTip("Siren n° : " & $i)
        FileWrite($Fichiers, RNCS_INSEE($Siren))
        $i += 1
        If $i = 15000 Then
            ; Rotate to the next output file; increment $x first so the current file is not overwritten
            FileClose($Fichiers)
            $x += 1
            $Fichiers = FileOpen(@WorkingDir & "\Sortie\SIRENS_" & $x & ".txt", 138)
            $i = 1
        EndIf
    Next
WEnd
FileClose($Fichiers)
FileClose($Recherche)
;ConsoleWrite(RNCS_INSEE("356000000") & @LF)

Func RNCS_INSEE($Siren)
    Local $Data_xml = ""
    Local $URL, $Source_HTML, $Lien_societe, $Balises, $Gerants, $Dirigeants
    $URL = "http://www.bilansgratuits.fr/recherche/entreprise?rcs=" & $Siren
    Sleep(2000) ; throttle requests to the site
    $Source_HTML = _INetGetSource($URL, True)
    If Not StringInStr($Source_HTML, "dataNotFound") Then
        $Lien_societe = StringRegExp($Source_HTML, '(?s)tbodyEntreprise(.*?)</tbody>', 1)
        If @error = 0 Then
            $Lien_societe = StringRegExp($Lien_societe[0], '(?s)alternColumn.*?href="(.*?)"', 1)
            If @error = 0 Then
                $URL = "http://www.bilansgratuits.fr" & $Lien_societe[0]
                $Source_HTML = _INetGetSource($URL, True)
                $Balises = _StringBetween($Source_HTML, '<tr>', '</tr>')
                If @error = 0 Then
                    For $Balise = 0 To UBound($Balises) - 1
                        If StringInStr($Balises[$Balise], 'juridique') Then
                            $Data_xml &= RechercheRNCS($Balises[$Balise], 'forme_juridique')
                        ElseIf StringInStr($Balises[$Balise], 'Date de création') Then
                            $Data_xml &= RechercheRNCS($Balises[$Balise], 'date_creation')
                        EndIf
                    Next
                    $Source_HTML = _INetGetSource(StringTrimRight($URL, 4) & '/dirigeants.htm', True)
                    Local $Occurence = 1
                    While 1
                        $Gerants = StringRegExp($Source_HTML, '(?s)linkDirName.*?div.*?>(.*?)</div>', 1, $Occurence)
                        If @error = 0 Then
                            $Occurence = @extended
                        Else
                            ExitLoop
                        EndIf
                        ; $k rather than $i, so the loop index is not confused with the global counter $i
                        For $k = 0 To UBound($Gerants) - 1
                            $Dirigeants = StringRegExpReplace($Gerants[$k], '\((.*?)\)', '')
                            If @error = 0 Then
                                $Dirigeants = StringLower(StringStripWS($Dirigeants, 7))
                                $Data_xml &= '<item><![CDATA[' & $Dirigeants & ']]></item>'
                            EndIf
                        Next
                    WEnd
                EndIf
                Return $Data_xml
            EndIf
        EndIf
    Else
        Return ""
    EndIf
    ; No company link was found: return an empty string rather than the default 0
    Return ""
EndFunc
dragan

I've had some problems with InetGet and InetRead in the past, so I decided to switch to:

 

Local $oHTTP
Local $URL = 'http://www.bilansgratuits.fr/recherche/dirigeant.htm'
Local $Source_HTML

$oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
$oHTTP.Open("GET", $URL)
$oHTTP.Send()
$oHTTP.WaitForResponse()
$Source_HTML = $oHTTP.ResponseText ; <-- this is what you need
$oHTTP = 0

ConsoleWrite('!!!!============= START ==============' & @CRLF & _
            @CRLF & _
            $Source_HTML & @CRLF & _
            @CRLF & _
            '!!!!!============= END ================' & @CRLF)
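
Since the original symptom was a call that never returns, it may help to wrap the same WinHTTP request in a function with explicit timeouts. The following is only a sketch under assumptions: `_HttpGetSource`, `_HttpComError` and the 30-second default are hypothetical names and values, not something from the script above, and it assumes a reasonably recent AutoIt version in which a registered COM error handler makes a failed COM call set @error.

```autoit
; Hypothetical helper, not from the thread: GET a page with explicit timeouts.
Func _HttpGetSource($sURL, $iTimeoutMs = 30000)
    ; Register a COM error handler so a failed request sets @error instead of halting the script.
    Local $oComErr = ObjEvent("AutoIt.Error", "_HttpComError")
    Local $oHTTP = ObjCreate("WinHttp.WinHttpRequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(1, 0, "")
    ; Resolve, connect, send and receive timeouts, in milliseconds.
    $oHTTP.SetTimeouts($iTimeoutMs, $iTimeoutMs, $iTimeoutMs, $iTimeoutMs)
    $oHTTP.Open("GET", $sURL, False) ; False = synchronous request
    $oHTTP.Send()
    If @error Then Return SetError(2, 0, "") ; timeout or network failure
    ; On a non-200 answer, pass the HTTP status back in @extended.
    If $oHTTP.Status <> 200 Then Return SetError(3, $oHTTP.Status, "")
    Return $oHTTP.ResponseText
EndFunc

; Hypothetical COM error handler: log the error and let the caller check @error.
Func _HttpComError($oError)
    ConsoleWrite("COM error 0x" & Hex($oError.number) & ": " & $oError.description & @CRLF)
EndFunc
```

In RNCS_INSEE, each `_INetGetSource($URL, True)` call could then be replaced with `_HttpGetSource($URL)`, checking @error afterwards to skip or retry that SIREN instead of waiting indefinitely.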

ATR

Thanks a lot! It works fine :)
