ATR Posted August 26, 2013 (edited)

Hi all,

I have a problem when I browse web links. My script works fine for a while, but then it hangs at some point: sometimes on the 50th pass, other times on the 2500th. I don't know what to do.

    #include <Inet.au3>
    #RequireAdmin

    Opt("TrayAutoPause", 0)
    TraySetState()

    $i = 1
    $x = 1
    $Dossier = @WorkingDir & "\SIRENs\"
    $Recherche = FileFindFirstFile($Dossier & "*.txt")

    While 1
        $Fichier = FileFindNextFile($Recherche)
        If @error Then ExitLoop
        $Lecture_fichier = FileRead($Dossier & $Fichier)
        $Sirens = StringSplit($Lecture_fichier, ';', 2)
        $Fichiers = FileOpen(@WorkingDir & "\Sortie\SIRENS_" & $x & ".txt", 138)
        For $Siren In $Sirens
            TraySetToolTip("Siren n° : " & $i)
            FileWrite($Fichiers, RNCS_INSEE($Siren))
            $i += 1
            If $i = 15000 Then
                FileClose($Fichiers)
                $Fichiers = FileOpen(@WorkingDir & "\Sortie\SIRENS_" & $x & ".txt", 138)
                $i = 1
                $x += 1
            EndIf
        Next
    WEnd
    FileClose($Recherche)

    ;ConsoleWrite(RNCS_INSEE("356000000") & @LF)

    Func RNCS_INSEE($Siren)
        Local $Data_xml = ""
        Local $Commune = ""
        Local $Validation = 0
        Local $Donnees_financieres = 0
        $URL = "http://www.bilansgratuits.fr/recherche/entreprise?rcs=" & $Siren
        Sleep(2000)
        $Source_HTML = _INetGetSource($URL, True)
        If Not StringInStr($Source_HTML, "dataNotFound") Then
            $Lien_societe = StringRegExp($Source_HTML, '(?s)tbodyEntreprise(.*?)</tbody>', 1)
            If @error = 0 Then
                $Lien_societe = StringRegExp($Lien_societe[0], '(?s)alternColumn.*?href="(.*?)"', 1)
                If @error = 0 Then
                    $URL = "http://www.bilansgratuits.fr" & $Lien_societe[0]
                    $Source_HTML = _INetGetSource($URL, True)
                    $Balises = _StringBetween($Source_HTML, '<tr>', '</tr>')
                    If @error = 0 Then
                        For $Balise = 0 To UBound($Balises) - 1
                            If StringInStr($Balises[$Balise], 'juridique') Then
                                $Data_xml &= RechercheRNCS($Balises[$Balise], 'forme_juridique')
                            ElseIf StringInStr($Balises[$Balise], 'Date de création') Then
                                $Data_xml &= RechercheRNCS($Balises[$Balise], 'date_creation')
                            EndIf
                        Next
                        $Source_HTML = _INetGetSource(StringTrimRight($URL, 4) & '/dirigeants.htm', True)
                        Local $Occurence = 1
                        While 1
                            $Gerants = StringRegExp($Source_HTML, '(?s)linkDirName.*?div.*?>(.*?)</div>', 1, $Occurence)
                            If @error = 0 Then
                                $Occurence = @extended
                            Else
                                ExitLoop
                            EndIf
                            For $i = 0 To UBound($Gerants) - 1
                                $Dirigeants = StringRegExpReplace($Gerants[$i], '\((.*?)\)', '')
                                If @error = 0 Then
                                    $Dirigeants = StringLower(StringStripWS($Dirigeants, 7))
                                    $Data_xml &= '<item><![CDATA[' & $Dirigeants & ']]></item>'
                                EndIf
                            Next
                        WEnd
                    EndIf
                    Return $Data_xml
                EndIf
            EndIf
        Else
            Return ""
        EndIf
        $Source_HTML = ""
    EndFunc

Edited August 26, 2013 by ATR

dragan Posted August 26, 2013

I've had some problems with InetGet and InetRead in the past, and thus I decided to switch to:

    Local $oHTTP
    Local $URL = 'http://www.bilansgratuits.fr/recherche/dirigeant.htm'
    Local $Source_HTML

    $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    $oHTTP.Open("GET", $URL)
    $oHTTP.Send()
    $oHTTP.WaitForResponse()
    $Source_HTML = $oHTTP.Responsetext ;<---- this is what you need
    $oHTTP = 0

    ConsoleWrite('!!!!============= START ==============' & @CRLF & _
            @CRLF & _
            $Source_HTML & @CRLF & _
            @CRLF & _
            '!!!!!============= END ================' & @CRLF)
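
As a rough sketch only, the WinHTTP request could be wrapped in a small function that stands in for the `_INetGetSource` calls in `RNCS_INSEE`. The names `_HttpGetSource`, `_ComErrorIgnore` and `$g_oComErr` are made up for this example, the 30-second timeout is an arbitrary assumption, and whether this actually cures the random hang has not been tested against the site.

```autoit
; Hypothetical names throughout: _HttpGetSource, _ComErrorIgnore and $g_oComErr
; are not part of the original script or of any standard UDF.

; Register a COM error handler so that a failed request does not abort the script.
Global $g_oComErr = ObjEvent("AutoIt.Error", "_ComErrorIgnore")

Func _ComErrorIgnore($oError)
    ; Intentionally empty: the caller checks @error after the COM call.
EndFunc

; Returns the page source, or "" with @error set (1 = no COM object,
; 2 = request failed or timed out, 3 = HTTP status other than 200).
Func _HttpGetSource($sURL, $iTimeoutMs = 30000)
    Local $oHTTP = ObjCreate("winhttp.winhttprequest.5.1")
    If Not IsObj($oHTTP) Then Return SetError(1, 0, "")

    ; Resolve, connect, send and receive timeouts in milliseconds; set before Open.
    $oHTTP.SetTimeouts($iTimeoutMs, $iTimeoutMs, $iTimeoutMs, $iTimeoutMs)
    $oHTTP.Open("GET", $sURL, False) ; False = synchronous request
    $oHTTP.Send()
    If @error Then Return SetError(2, 0, "")

    If $oHTTP.Status <> 200 Then Return SetError(3, $oHTTP.Status, "")
    Return $oHTTP.ResponseText
EndFunc
```

Inside `RNCS_INSEE`, each `$Source_HTML = _INetGetSource($URL, True)` would then become `$Source_HTML = _HttpGetSource($URL)`, followed by an `@error` check (for example `If @error Then Return ""`) so that one failed or timed-out request skips the current SIREN instead of blocking the whole run.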