faustf

the hell of regexp :(


Hi guys, I have a web page that contains this piece of code:

<td class="F13" colspan="4" align="right">
                                <B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;

                            </td>

 

I want to extract only the link for Successiva (this one: href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0 )

I use this expression, but it captures everything from the first link onward:

(?s)</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/\?page=(.*?)">Successiva

Can someone help me, please? :)

 


Like this?

#Include <Array.au3>
$sHTML = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _ 
'</td>'

$aLinks = StringRegExp($sHTML, '<a href=[^>]+?page=([^>]+)">Successiva</a>', 3)
_ArrayDisplay($aLinks)
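
To see why the original pattern grabbed too much, here is a minimal sketch. It assumes a shortened stand-in for the page source: the /x/ paths and the $sSample variable are made up for illustration and are not from the real page. A lazy (.*?) still begins at the first place where the text in front of it matches, so it runs across the intermediate links until it reaches Successiva. A negated character class cannot cross the closing quote of the href, so only the last link can match.

```autoit
; Shortened, hypothetical sample: three links, the last one labelled Successiva.
Local $sSample = '<B>1</B>&nbsp;&nbsp;<a href="/x/?page=2&n=2">2</a>&nbsp;&nbsp;' & _
        '<a href="/x/?page=3&n=2">3</a>&nbsp;&nbsp;' & _
        '<a href="/x/?page=2&n=2">Successiva</a>&nbsp;&nbsp;'

; Lazy version: the match starts at the first "</a>&nbsp;&nbsp;<a href=..." and the
; group keeps growing until '">Successiva', so it also swallows the "3" link.
Local $aLazy = StringRegExp($sSample, '</a>&nbsp;&nbsp;<a href="/x/\?page=(.*?)">Successiva', 3)

; Strict version: [^"]* stops at the closing quote of the href, so the only
; position where the whole pattern can match is the Successiva link itself.
Local $aStrict = StringRegExp($sSample, '<a href="([^"]*)">Successiva</a>', 3)

; StringRegExp returns an array only on success, so check before indexing.
If IsArray($aLazy) Then ConsoleWrite("Lazy:   " & $aLazy[0] & @CRLF)
If IsArray($aStrict) Then ConsoleWrite("Strict: " & $aStrict[0] & @CRLF)
```

The strict pattern here captures the whole href value rather than only the part after page=, which may be closer to what the question asks for.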

 


An alternative without regexp.

 

MsgBox(0,"",StringMid($sData,StringInStr($sData,'href="',2,-1,StringInStr($sData,'">Successiva',2)),StringInStr($sData,'">Successiva',2)-StringInStr($sData,'href="',2,-1,StringInStr($sData,'">Successiva',2))))

The same as above, but cleaner:

Local $iPos2=StringInStr($sData,'">Successiva',2)
Local $iPos1=StringInStr($sData,'href="',2,-1,$iPos2)
MsgBox(0,"",StringMid($sData,$iPos1,$iPos2-$iPos1))

where $sData is the source HTML.
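
If the page has no Successiva link (for example, the last page of results), the positions above come back as 0 and the result is meaningless. A minimal sketch of a guarded version follows; the function name _GetNextLink is hypothetical, and it returns only the URL without the leading href=" text, which is an assumption about what is wanted.

```autoit
; Hypothetical helper: returns the href value of the "Successiva" link,
; or an empty string if the link is not present in $sData.
Func _GetNextLink($sData)
    ; Position of the closing quote right before the link text.
    Local $iEnd = StringInStr($sData, '">Successiva', 2)
    If $iEnd = 0 Then Return ""

    ; Search backwards (occurrence -1) from $iEnd for the nearest href=".
    Local $iStart = StringInStr($sData, 'href="', 2, -1, $iEnd)
    If $iStart = 0 Then Return ""

    ; Skip past href=" so that only the URL itself is returned.
    $iStart += StringLen('href="')
    Return StringMid($sData, $iStart, $iEnd - $iStart)
EndFunc   ;==>_GetNextLink

; Usage, assuming $sData already holds the page source:
; Local $sNext = _GetNextLink($sData)
; If $sNext <> "" Then ConsoleWrite($sNext & @CRLF)
```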

 

Regards

Edited by Danyfirex

#Include <Array.au3>
$sHTML = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _
'</td>'

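; Split on the whole "a href" string (flag 1) and skip the count element (flag 2)
$aLinks = StringSplit($sHTML, "a href", 3)

For $Link In $aLinks
    ; Trim the trailing markup after the closing quote (34 characters in this sample)
    If StringInStr($Link, "Successiva") Then MsgBox(0, '', StringTrimRight($Link, 34))
Next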

 




This extracts the last href before Successiva:
 

#include <Array.au3>
Local $data = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _
'</td>'
Local $aRet = StringRegExp($data,'(?i)(?!href).*href="(.*?)">Successiva',3)
_ArrayDisplay($aRet)

 


