
Troubles with INetGetSource & StringRegExp


Func _INetGetSource($s_URL)
    ; add 'http://' if 'http(s)://' is not present
    If StringLeft($s_URL, 7) <> 'http://' And StringLeft($s_URL, 8) <> 'https://' Then $s_URL = 'http://' & $s_URL
    Local $o_HTTP, $i_ERR
    $o_HTTP = ObjCreate("winhttp.winhttprequest.5.1")
    ; error handling: return 0 if the COM object could not be created
    If @error Then Return 0
    ; send the request
    $o_HTTP.open("GET", $s_URL)
    $o_HTTP.send()
    ; return the response
    Return $o_HTTP.Responsetext
EndFunc   ;==>_INetGetSource

$Source = _INetGetSource('http://brianhare.com/Test/test.htm')
$Links = StringRegExp($Source, "<A(.*?)>", 3)

I want $Source to be the HTML source code of the site, and $Links to be an array of each link.

However, I don't want just any old link; I want only the links to Members' profiles.

I don't want any of the useless links to be in $Links.
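
One way to keep only the profile links is to put the 'profile.php?user=' text (quoted later in this post) into the pattern itself. This is only a sketch: the sample markup in $sSample is made up for illustration, and the real page may be laid out differently. The pattern accepts either quote style around the href value.

```autoit
; Hypothetical sample markup standing in for the downloaded page source.
Local $sSample = '<A href="index.php">Home</A>' & @CRLF & _
        '<A href="profile.php?user=alice">alice</A>' & @CRLF & _
        "<a href='profile.php?user=bob'>bob</a>"

; (?i) makes the match case-insensitive, so <A ...> and <a ...> both qualify.
; [""'] accepts a double or a single quote ("" is a literal " inside an AutoIt string).
; The capture group holds only the href value, and only when it contains profile.php?user=
Local $sPattern = "(?i)<a\s[^>]*href=[""']?([^""'\s>]*profile\.php\?user=[^""'\s>]*)"

; Flag 3 returns an array with every match of the capture group.
Local $aLinks = StringRegExp($sSample, $sPattern, 3)

If @error Then
    ConsoleWrite("No profile links found" & @CRLF)
Else
    For $i = 0 To UBound($aLinks) - 1
        ConsoleWrite($aLinks[$i] & @CRLF)
    Next
EndIf
```

With the real page, $sSample would be replaced by the string returned from _INetGetSource.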

After I have found all the Member profile links on the page, I want another function to keep only the online users. I was able to do this, in theory, with another au3 script:

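$LINE = 0

While 1
    $TEXT = FileReadLine($FILESCAN, $LINE)
    If @error = -1 Then ExitLoop
    ; $POS and $FILTERPOS are set in a part of the script not shown here
    If $POS <> 0 Then
        $TEXT = StringTrimLeft($TEXT, $POS - $FILTERPOS)
        $ENDPOS = StringInStr($TEXT, '</a><br>[online]')
        If $ENDPOS <> 0 Then $TEXT = StringMid($TEXT, 1, $ENDPOS - 1)
        ; (rest of the loop not shown: store $TEXT, then move on to the next line)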

What this code did was download the page to a file, then look through each line, extracting the text between 'profile.php?user=' and '</a><br>[online]'.

So the variable $TEXT would be equal to an online member's name. Then I stored $TEXT to a text file and went to the next line.

How can I accomplish this search with StringRegExp and have it send each online member name to an array, without ever having to download to a file?
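
A rough sketch of how that could look with a single StringRegExp call on the source string, so no file is needed. The markup in $sSource is invented for illustration, and the pattern assumes the two markers 'profile.php?user=' and '</a><br>[online]' appear exactly as described above with no other tag in between.

```autoit
; Hypothetical sample standing in for the string returned by _INetGetSource.
Local $sSource = "<a href='profile.php?user=alice'>alice</a><br>[online]" & @CRLF & _
        "<a href='profile.php?user=bob'>bob</a><br>[offline]" & @CRLF & _
        "<a href='profile.php?user=carol'>carol</a><br>[online]"

; Capture the text between 'profile.php?user=' and '</a><br>[online]'.
; [^<]* cannot cross a tag, so an offline entry is never merged into a later online one.
Local $sPattern = "(?i)profile\.php\?user=([^<]*)</a><br>\[online\]"

; Flag 3 returns every capture as one array element; @error is set when nothing matches.
Local $aOnline = StringRegExp($sSource, $sPattern, 3)

If @error Then
    ConsoleWrite("No online members found" & @CRLF)
Else
    For $i = 0 To UBound($aOnline) - 1
        ConsoleWrite($aOnline[$i] & @CRLF)
    Next
EndIf
```

Each array element holds the same text the file-based loop extracted, so any further trimming done there would still apply to the elements of $aOnline.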

Using this does not work:

$o_IE = ObjCreate("InternetExplorer.Application")
$o_IE.Navigate('http://brianhare.com/Test/test.htm')
; wait until the page has finished loading
While $o_IE.busy
    Sleep(50)
WEnd
$source = $o_IE.document.body.innerHTML

The innerHTML does not show all of the HTML. I don't know why.


I found my problem.

_INetGetSource reads quotes as '

$o_IE.document.body.innerHTML reads quotes as "
