
Troubles with INetGetSource & StringRegExp



Func _INetGetSource($s_URL)

    ; Add 'http://' if neither 'http://' nor 'https://' is present
    If StringLeft($s_URL, 7) <> 'http://' And StringLeft($s_URL, 8) <> 'https://' Then $s_URL = 'http://' & $s_URL

    ; Locals
    Local $o_HTTP

    ; Create the WinHTTP request object
    $o_HTTP = ObjCreate("winhttp.winhttprequest.5.1")

    ; Error handling: ObjCreate sets @error when the object cannot be created
    If @error Then
        SetError(1)
        Return 0
    EndIf

    ; Send the request
    $o_HTTP.open("GET", $s_URL)
    $o_HTTP.send()

    ; Return the response
    Return $o_HTTP.Responsetext

EndFunc   ;==>_INetGetSource

$Source = _INetGetSource('http://brianhare.com/Test/test.htm')
$Links = StringRegExp($Source, "<A(.*?)>", 3)

I want $Source to hold the HTML source of the page, and $Links to be an array of the links on it.

However, I don't want just any old link; I only want the links to members' profiles.

I don't want any of the useless links to end up in $Links.

After I have found all the member profile links on the page, I want another function that keeps only the users who are online. I was able to do this, in theory, with another au3 script:

$FILTER = "profile.php?user="
$FILTERPOS = -16
$LINE = 0
$ENDPOS = 0
; $FILESCAN (path of the temp file) and StoreTEXTtoFile() are not shown in this snippet
URLDownloadToFile("http://brianhare.com/Test/test.htm", $FILESCAN)

FileOpen($FILESCAN, 0)
While 1
    $TEXT = FileReadLine($FILESCAN, $LINE)
    If @error = -1 Then ExitLoop
    $POS = StringInStr($TEXT, $FILTER)
    If $POS <> 0 Then
        ; Drop everything up to and including the filter string
        $TEXT = StringTrimLeft($TEXT, $POS - $FILTERPOS)
        $ENDPOS = StringInStr($TEXT, '</a><br>[online]')
        ; Cut off the trailing '</a><br>[online]' marker when it is present
        If $ENDPOS <> 0 Then $TEXT = StringMid($TEXT, 1, $ENDPOS - 1)
        StoreTEXTtoFile()
    EndIf
    $LINE = $LINE + 1
WEnd
SplashOff()
FileClose($FILESCAN)
FileDelete($FILESCAN)
$LINE = 0

What this code did was download the page to a file, then look through each line and extract the text between 'profile.php?user=' and '</a><br>[online]'.

So the variable $TEXT would be equal to an online member's name. Then I stored $TEXT in a text file and went on to the next line.

How can I accomplish this search with StringRegExp, have it put each online member's name into an array, and never have to download the page to a file?
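
A minimal sketch of what such a search might look like, shown here only to make the goal concrete. The sample markup in $sSample is hypothetical (the real page may be laid out differently), and the pattern assumes each profile link is followed directly by '</a><br>[online]' as in the file-based script above. With flag 3, StringRegExp returns an array of all captured groups and sets @error when nothing matches.

```autoit
; Hypothetical sample markup standing in for the page source; the real HTML may differ.
Local $sSample = '<a href="profile.php?user=alice">alice</a><br>[online]' & @CRLF & _
        '<a href="profile.php?user=bob">bob</a><br>[offline]'

; Assumed pattern: capture the link text of profile links that are followed by the [online] marker.
; (?i) makes the match case-insensitive; [^<]* keeps the capture from running past the link text.
Local $sPattern = '(?i)profile\.php\?user=[^>]*>([^<]*)</a><br>\[online\]'

; Flag 3 = return an array of all global matches
Local $aOnline = StringRegExp($sSample, $sPattern, 3)

If @error Then
    ConsoleWrite('No online members found' & @CRLF)
Else
    For $i = 0 To UBound($aOnline) - 1
        ConsoleWrite($aOnline[$i] & @CRLF)
    Next
EndIf
```

In a real script, $sSample would be replaced by the string returned from _INetGetSource(), so nothing has to be written to disk.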

Using this does not work:

$o_IE = ObjCreate("InternetExplorer.Application")
$o_IE.Navigate('http://brianhare.com/Test/test.htm')
While $o_IE.busy
    Sleep(10)
WEnd
$source = $o_IE.document.body.innerHTML

The innerHTML does not contain all of the page's HTML, and I don't know why.
