faustf

the hell of regexp :(


Hi guys, I have a web page with this piece of code:

<td class="F13" colspan="4" align="right">
                                <B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;

                            </td>

 

I want to extract only the link for "Successiva" (this one: href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0").

I use this expression, but it grabs everything from the first link onward:

(?s)</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/\?page=(.*?)">Successiva

Can someone help me, please? :)

 




This?

#Include <Array.au3>
$sHTML = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _ 
'</td>'

$aLinks = StringRegExp($sHTML, '<a href=[^>]+?page=([^>]+)">Successiva</a>', 3)
_ArrayDisplay($aLinks)
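
Note that the pattern above captures only what follows page=. If the whole href value is wanted, as in the question, a minimal sketch could look like the one below. The shortened sample string is made up for illustration, and the check relies on StringRegExp setting @error when nothing matches.

```autoit
#include <Array.au3>

; Hypothetical, shortened sample; in practice $sHTML would hold the real page source.
Local $sHTML = '<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&L=0">2</a>&nbsp;&nbsp;' & _
        '<a href="/IT/Postal_Codes/?page=2&c=AD&L=0">Successiva</a>'

; [^">]+ cannot cross a quote or a tag boundary, so the capture stays inside one anchor tag.
Local $aLinks = StringRegExp($sHTML, '<a href="([^">]+)">Successiva</a>', 3)
If @error Then
    MsgBox(0, "", "No 'Successiva' link found")
Else
    _ArrayDisplay($aLinks) ; each element holds a full href value
EndIf
```

Excluding both the quote and > in the character class is what keeps the match from starting at an earlier link.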

 



An alternative without regexp.

 

MsgBox(0,"",StringMid($sData,StringInStr($sData,'href="',2,-1,StringInStr($sData,'">Successiva',2)),StringInStr($sData,'">Successiva',2)-StringInStr($sData,'href="',2,-1,StringInStr($sData,'">Successiva',2))))

The same as above, but cleaner:

Local $iPos2=StringInStr($sData,'">Successiva',2)
Local $iPos1=StringInStr($sData,'href="',2,-1,$iPos2)
MsgBox(0,"",StringMid($sData,$iPos1,$iPos2-$iPos1))

where $sData is the source HTML.

 

Regards

Edited by Danyfirex


If only the last link is needed, a StringRegExpReplace (SRER) does the job:

$sLink = StringRegExpReplace($sHTML, '(?s).*"([^"]+)">Successiva.*', "$1")
Msgbox(0,"", $sLink)
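
One caveat with the replace approach: when the pattern does not match (for example on the last page, which presumably has no "Successiva" link), StringRegExpReplace returns the input unchanged. A hedged sketch of a guard, using @extended (the number of replacements performed) and a made-up, shortened sample string:

```autoit
; Hypothetical, shortened sample; in practice $sHTML would hold the real page source.
Local $sHTML = '<a href="/IT/Postal_Codes/?page=3&c=AD&L=0">3</a>&nbsp;&nbsp;' & _
        '<a href="/IT/Postal_Codes/?page=2&c=AD&L=0">Successiva</a>'

Local $sLink = StringRegExpReplace($sHTML, '(?s).*"([^"]+)">Successiva.*', "$1")
; Read @extended right away: later function calls overwrite it.
If @extended = 0 Then
    ; Nothing was replaced, so $sLink is just the unchanged input.
    MsgBox(0, "", "No 'Successiva' link found")
Else
    MsgBox(0, "", $sLink)
EndIf
```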

 

#Include <Array.au3>
$sHTML = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _
'</td>'

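; Flag 3 = split on the whole delimiter string (1) + no count in element 0 (2)
$aLinks = StringSplit($sHTML, "a href", 3)

For $Link In $aLinks
    ; 34 is hard-coded for this sample: it strips the trailing >Successiva</a>&nbsp;&nbsp;, the line break and </td>
    If StringInStr($Link, "Successiva") Then MsgBox(0, '', StringTrimRight($Link, 34))
Next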

 




This can extract the last href:
 

#include <Array.au3>
Local $data = '<td class="F13" colspan="4" align="right">' & @CRLF & _
'<B>1</B>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">2</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=3&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">3</a>&nbsp;&nbsp;<a href="/IT/Postal_Codes/?page=2&c=AD&n=2&r0=00&r1=04&r2=00&r3=00&r4=00&o=&L=0">Successiva</a>&nbsp;&nbsp;' & @CRLF & _
'</td>'
Local $aRet = StringRegExp($data,'(?i)(?!href).*href="(.*?)">Successiva',3)
_ArrayDisplay($aRet)

 


