
Troubles with INetGetSource & StringRegExp



Func _INetGetSource($s_URL)
;add 'http://' if 'http(s)://' is not present
    If StringLeft($s_URL, 7) <> 'http://' AND StringLeft($s_URL, 8) <> 'https://' Then $s_URL = 'http://' & $s_URL
    Local $o_HTTP, $i_ERR
    $o_HTTP = ObjCreate ("winhttp.winhttprequest.5.1")
;error handling: ObjCreate sets @error if the object could not be created
    If @error Then
        Return 0
    EndIf
;send the request
    $o_HTTP.open ("GET", $s_URL)
    $o_HTTP.send ()
;return the response
    Return $o_HTTP.Responsetext
EndFunc  ;==>_INetGetSource

$Source = _INetGetSource('http://brianhare.com/Test/test.htm')
$Links = StringRegExp($Source, "<A(.*?)>", 3)

I want $Source to hold the HTML source of the page, and $Links to be an array containing each link.

However, I don't want just any link; I only want the links to members' profiles.

None of the other links should end up in $Links.

After I have found all the member profile links on the page, I want another function to keep only the online users. I was able to do this in theory with another au3 script:

$LINE = 0

   While 1
      $TEXT=FileReadLine($FILESCAN,$LINE )
      If @error=-1 Then ExitLoop
      ; $FILESCAN, $POS and $FILTERPOS are set in a part of the script that is not shown here
      If $POS <> 0 Then
         $TEXT=StringTrimLeft( $TEXT, $POS-$FILTERPOS)
         $ENDPOS=StringInStr( $TEXT,'</a><br>[online]')
         If $ENDPOS <> 0 Then $TEXT=StringMid($TEXT,1,$ENDPOS-1)
      EndIf
   WEnd

What this code did was download the page to a file, then look through each line, extracting the text between 'profile.php?user=' and '</a><br>[online]'.

So the variable $TEXT would be equal to an online member's name. Then I stored $TEXT in a text file and went on to the next line.

How can I accomplish this search with StringRegExp, so that each online member's name goes into an array and I never have to download the page to a file?
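One way this could look is sketched below. It is only a rough sketch: the sample markup in $sSource is made up for illustration (the real page may be laid out differently), and the variable names are hypothetical. It relies on StringRegExp with flag 3, which returns every capture group match in one array, so no temporary file is needed.

; Sample markup, assumed for illustration only (not taken from the real page).
Local $sSource = '<a href="profile.php?user=Alice">Alice</a><br>[online]' & @CRLF & _
        '<a href="profile.php?user=Bob">Bob</a><br>[offline]' & @CRLF & _
        '<a href="profile.php?user=Carol">Carol</a><br>[online]'

; [^>]* skips the rest of the opening tag, ([^<]*) captures the link text.
; Neither class can cross a tag boundary, so an offline entry cannot
; "borrow" the [online] marker of a later entry.
Local $aOnline = StringRegExp($sSource, '(?i)profile\.php\?user=[^>]*>([^<]*)</a><br>\[online\]', 3)

If @error Then
    ConsoleWrite('No online members found' & @CRLF)
Else
    For $i = 0 To UBound($aOnline) - 1
        ConsoleWrite($aOnline[$i] & @CRLF)
    Next
EndIf

With the sample string above, the array holds Alice and Carol. In the real script, $sSource would be replaced by the string returned from _INetGetSource.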

Using this does not work:

$o_IE = ObjCreate ("InternetExplorer.Application")
    $o_IE.Navigate ('http://brianhare.com/Test/test.htm')
    ; wait until the page has finished loading
    While $o_IE.busy
        Sleep(100)
    WEnd
    $source = $o_IE.document.body.innerHTML

The innerHTML does not show all of the HTML. I don't know why.


I found my problem.

_INetGetSource reads quotes as '

$o_IE.document.body.innerHTML reads quotes as "
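If the two sources really do return different quote characters, one option is a pattern that accepts either style, so the same StringRegExp call works on both strings. The snippet below is a minimal sketch with made-up sample strings and hypothetical variable names; it only pulls out profile links and ignores every other link.

; The same hypothetical link as each source might return it.
Local $sSingle = "<A href='profile.php?user=Alice'>Alice</A>"
Local $sDouble = '<A href="profile.php?user=Alice">Alice</A>'

; ["']? accepts a single quote, a double quote, or no quote at all.
; The capture stops at the first quote, whitespace or > character.
; Inside an AutoIt single-quoted string, '' stands for one literal '.
Local $sPattern = '(?i)<a\s[^>]*href=["'']?(profile\.php\?user=[^"''\s>]+)'

Local $aSingle = StringRegExp($sSingle, $sPattern, 3)
Local $aDouble = StringRegExp($sDouble, $sPattern, 3)

ConsoleWrite($aSingle[0] & @CRLF) ; profile.php?user=Alice
ConsoleWrite($aDouble[0] & @CRLF) ; profile.php?user=Alice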
