faustf

regexp and array problems


Hi guys, I am trying to grab a list of states from this site: www.mapanet.es/IT

I wrote a simple script,

but when I try to run it, it does not work. Can someone help me? I think the problem is the regexp.

 

#cs ----------------------------------------------------------------------------

 AutoIt Version: 3.3.12.0
 Author:         myName

 Script Function:
    Template AutoIt script.

    MAPANET GRABBER
    BY faustf
#ce ----------------------------------------------------------------------------


; Script Start - Add your code below here
#include <IE.au3>
#include <MsgBoxConstants.au3>
#include <StringConstants.au3>
#include <InetConstants.au3>
#include <WinAPI.au3>
#include <WinAPIsysinfoConstants.au3>
#include <WindowsConstants.au3>
#include <GDIPlus.au3>
#include <Misc.au3>
#include <INet.au3>
#include <Excel.au3>
#include <File.au3>
#include <Array.au3>

; # GLOBAL VARIABLES ================================================================================================
Global $oIE
; ===================================================================================================================

If ProcessExists("iexplore.exe") Then ; Check if the Internet Explorer process is running.
    ProcessClose("iexplore.exe")
EndIf

If ProcessExists("EXCEL.exe") Then ; Check if the Excel process is running.
    ProcessClose("EXCEL.exe")
EndIf


_entra_dentro() ; enters the website and goes as far as "fantaricerca" and the first-level groups, e.g. audio, bags, CD, DVD, appliances
_excel_Crea()
_1_step()

Func _1_step() ; enter the site and list the states
    ConsoleWrite('@@ (79) :(' & @MIN & ':' & @SEC & ') _1_step()' & @CR) ;### Function Trace

    Local $stati_grab = _IEBodyReadHTML($oIE)
    MsgBox(0, '', $stati_grab)
    Local $aArray_stati = StringRegExp($stati_grab, '(?s)<a href="Postal_Codes/\?C=(.*?)" itemprop="url">(.*?)</a></span></td></tr>', $STR_REGEXPARRAYGLOBALMATCH)
    _ArrayDisplay($aArray_stati, "de ma de")
    For $i = 0 To UBound($aArray_stati) - 1
        MsgBox(0, '', $aArray_stati[$i])
    Next

EndFunc   ;==>_1_step







;************************************************************************************************************************************************
;--------------------------  FUNCTIONS THAT COULD GO IN AN ADDITIONAL UDF -----------------------------------------------------------------------
;************************************************************************************************************************************************

; #FUNCTION# ====================================================================================================================
; Author ........: faustf
; Modified.......:
; What do........: opens an Excel sheet
; ===============================================================================================================================


Func _excel_Crea()
    ConsoleWrite('@@ (1144) :(' & @MIN & ':' & @SEC & ') _excel_Crea()' & @CR) ;### Function Trace
    Local $oAppl = _Excel_Open()
    If @error Then Exit ;MsgBox(16, "Excel UDF: _Excel_BookOpen Example", "Error creating the Excel application object." & @CRLF & "@error = " & @error & ", @extended = " & @extended)
    $oWorkbook = _Excel_BookNew($oAppl)
    If @error Then Exit ;MsgBox($MB_SYSTEMMODAL, "Excel UDF: _Excel_BookOpen ", "Error opening '" & $oWorkbook & "'." & @CRLF & "@error = " & @error & ", @extended = " & @extended)
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Stato", "A1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Sigla", "B1")
    #cs
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Sotto Sotto Categoria", "C1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Codice", "D1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Descrizione", "E1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "", "F1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Disponibilita", "G1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Arrivi", "H1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Listino", "I1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Margine", "J1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Prezzo c&c", "K1")
    _Excel_RangeWrite($oWorkbook, $oWorkbook.Activesheet, "Prezzo Web", "L1")
    #ce
EndFunc   ;==>_excel_Crea

; #FUNCTION# ====================================================================================================================
; Author ........: faustf
; Modified.......:
; What do........: opens the website
; ===============================================================================================================================

Func _entra_dentro()
    ConsoleWrite('@@ (1116) :(' & @MIN & ':' & @SEC & ') _entra_dentro()' & @CR) ;### Function Trace

    $oIE = _IECreate("http://www.mapanet.es/IT/", 0, 1, 1, 1)

    ;Local $oForm = _IEFormGetObjByName($oIE, "arearis")
    
    _IELoadWait($oIE)
EndFunc   ;==>_entra_dentro

 

Edited by faustf


I don't really know much about regex, but you can do something like this:

#include <Array.au3>

Local $sData=InetRead("http://www.mapanet.es/IT/")
Local $aData=StringRegExp(BinaryToString($sData,4),'itemprop="url">(.*?)</a>',3)
_ArrayDisplay($aData)

Regards


Oh, thanks so much, it works well.

But what is the difference between

Local $sData=InetRead("http://www.mapanet.es/IT/")
Local $aData=StringRegExp(BinaryToString($sData,4),'itemprop="url">(.*?)</a>',3)

and mine:

Local $stati_grab = _IEBodyReadHTML($oIE)
Local $aArray_stati = StringRegExp($stati_grab, '(?s)<a href="Postal_Codes/\?C=(.*?)" itemprop="url">(.*?)</a></span></td></tr>', $STR_REGEXPARRAYGLOBALMATCH)


In:

'(?s)<a href="Postal_Codes/\?C=(.*?)" itemprop="url">(.*?)</a></span></td></tr>'

there are 2 capturing groups,
1: C=(.*?)
2: >(.*?)</a>

and in:

'itemprop="url">(.*?)</a>'

there is only 1 capturing group.
Also, Internet Explorer can change the HTML formatting, whereas InetRead returns the exact HTML received from the server.
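
To make the point about capturing groups concrete, here is a minimal sketch that runs both kinds of pattern against a small hard-coded string. The sample markup in $sHtml is hypothetical, invented for illustration and not copied from the site, and the two-group pattern is shortened to fit it. With the global-match flag ($STR_REGEXPARRAYGLOBALMATCH, i.e. 3), StringRegExp returns a flat array, so each extra group adds one more element per match.

```autoit
#include <Array.au3>
#include <StringConstants.au3>

; Hypothetical sample markup, invented for illustration (not copied from the site).
Local $sHtml = '<a href="Postal_Codes/?C=IT" itemprop="url">Italia</a>' & @CRLF & _
        '<a href="Postal_Codes/?C=FR" itemprop="url">Francia</a>'

; One capturing group: one array element per match.
Local $aOne = StringRegExp($sHtml, 'itemprop="url">(.*?)</a>', $STR_REGEXPARRAYGLOBALMATCH)
If @error Then ConsoleWrite("One-group pattern: no match, @error = " & @error & @CRLF)

; Two capturing groups: two consecutive elements per match (code first, then name).
Local $aTwo = StringRegExp($sHtml, 'Postal_Codes/\?C=(.*?)" itemprop="url">(.*?)</a>', $STR_REGEXPARRAYGLOBALMATCH)
If @error Then ConsoleWrite("Two-group pattern: no match, @error = " & @error & @CRLF)

; Compare the element counts of the two results.
ConsoleWrite("One group:  " & UBound($aOne) & " elements" & @CRLF)
ConsoleWrite("Two groups: " & UBound($aTwo) & " elements" & @CRLF)
_ArrayDisplay($aTwo, "Two groups, flat array")
```

With this sample, the one-group array should hold the two names, and the two-group array should hold four elements in the order code, name, code, name. Checking @error right after StringRegExp also shows quickly whether a pattern matched at all, which helps when the HTML returned by _IEBodyReadHTML differs from what the pattern expects.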
 


73 108 111 118 101 65 117 116 111 105 116


The 2 capturing groups were probably intended to do this:

#include <Array.au3>

Local $sData = InetRead("http://www.mapanet.es/IT/")
Local $aData = StringRegExp(BinaryToString($sData, 4),'Postal_Codes/\?C=(\w+).*?>([^<]+)', 3)

Local $aRes[UBound($aData)/2][2]
For $i = 0 To UBound($aData) - 1 Step 2
    For $j = 0 To 1
        $aRes[$i / 2][$j] = $aData[$i + $j]
    Next
Next
_ArrayDisplay($aRes)

But be careful, there is a bug in the source code of the web page for Macao  :)
