Bi0haZarD

Stupid question about Inetget and getting all links on the page.


ok, i'm trying to make a script for my unattended windows CD that goes to my FTP, downloads the page source from the FTP (the HTML), looks in the $source string for every "<a href=" and "</a>", then downloads all the exe files.

i want to do this so that if i want to update a file on my unattended CD, i just update the file on my FTP and it's all done.

here's what i have thus far. note that i have it pointed to www.google.com just as a test, and i have the "If InetGet($url" and "FileDelete(" lines commented out so i'm not hammering google's servers.

HotKeySet("{ESC}", "Terminate")

$url = "http://www.google.com/"
$source = _data()

msgbox(0,"",$source)

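; Drop everything up to and including the first <a href=" (8 chars plus the opening quote)
$urla = StringTrimLeft($source,StringInStr($source,"<a href=")+8)
; Keep everything before the closing quote and ">" (assumes href is the last attribute in the tag)
$urla = StringLeft($urla, StringInStr($urla, ">") - 2)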

msgbox(0,"",$urla)

Func _data()
    Local $source
    If InetGet($url, "C:\data.txt") Then
        $source = FileRead("C:\data.txt", FileGetSize("C:\data.txt"))
        FileDelete("C:\data.txt")
        Return $source
    Else
        SetError(1)
        Return -1
    EndIf
EndFunc

Func Terminate()
    Exit 0
EndFunc

the problem i'm having is...

i want it to get ALL of the "<a href" links. right now it just shows a messagebox with the first one, so how could i get it to search the whole source for every <a href?

thanks.


Hello there, I've had to deal with this problem too, it's quite annoying.

If you want to get rid of all <a href thingys in there you just do the following, or this is what I did at least.

You can modify it to remove as many things as you want. It's probably inefficient code, but it works for me.

#include <file.au3>
#include <Process.au3>

;----------------------Read INETGet data
InetGet("http://www.google.com/", "C:\data.txt")
$googleget = FileOpen ( "C:\data.txt", 0 )

;----------------------Create file to write to with corrections
_FileCreate ( "C:\_Stripped.htm" )
$googlestrip = FileOpen ( "C:\_Stripped.htm", 2 )

;----------------------Replace selected things with nothing, line by line
While 1
    $line = FileReadLine ( $googleget )
    If @error = -1 Then ExitLoop ; end of file reached
    $rep = StringReplace ( $line, "<", "" )
    $rep1 = StringReplace ( $rep, ">", "" )
    $rep2 = StringReplace ( $rep1, "=", "" )
    FileWriteLine ( $googlestrip, $rep2 )
Wend

FileClose($googleget)
FileClose($googlestrip)
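
If the line-by-line loop feels slow, the same stripping can probably be done in one pass over the whole file. This is only a rough sketch: `_StripChars` and `_StripFile` are made-up names, and it assumes a reasonably recent AutoIt v3 where `FileRead` can take a file name without a count and `StringRegExpReplace` is available.

```autoit
; Hypothetical helper: removes every "<", ">" and "=" from $sText in one pass.
Func _StripChars($sText)
    Return StringRegExpReplace($sText, "[<>=]", "")
EndFunc

; Hypothetical helper: reads $sIn whole, strips it, and writes the result to $sOut.
; Assumes $sIn was already fetched with InetGet, as in the script above.
Func _StripFile($sIn, $sOut)
    Local $sPage = FileRead($sIn)
    If @error Then Return SetError(1, 0, 0) ; could not read the input file
    Local $hOut = FileOpen($sOut, 2) ; 2 = overwrite mode, creates the file if needed
    If $hOut = -1 Then Return SetError(2, 0, 0)
    FileWrite($hOut, _StripChars($sPage))
    FileClose($hOut)
    Return 1
EndFunc
```

Calling `_StripFile("C:\data.txt", "C:\_Stripped.htm")` should then give the same output file as the loop version, just without reading it one line at a time.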

Hope this helps B)


#3 ·  Posted (edited)

Here are a couple of my UDFs that separate out all the http:// links from HTML source code. Just give it a variable with the source code, and it returns a CRLF-delimited list of the links it finds.

$sBigListOLinks = _GetLinks($source)
$aLinks = StringSplit($sBigListOLinks,@CRLF,1)
;Do something with this array of links....

Func _GetLinks($sData)
    Dim $iCounter,$iPointer,$iPointer2,$sLinks

    for $iCounter = 1 to StringLen($sData)
        $iPointer = _SearchTarget($sData,"href=" & chr(34) & "http://",$iCounter,1) - 8; Looks for any href="http://
        if @error = 0 Then
            $iCounter = $iPointer
            $iPointer2 = _SearchTarget($sData,chr(34),$iPointer,0) - 1; Looks for the next "
            if @error = 0 Then
                $iCounter = $iPointer2
                $sLinks = $sLinks & StringMid($sData,$iPointer,$iPointer2-$iPointer) & @CRLF
            endif
        else
            exitloop
        endif
    next; Loops until it finds all matches in $sData

    if $sLinks = "" Then
        SetError(1)
    endif
    Return $sLinks; Returns Variable with all found links.
EndFunc

Func _SearchTarget($sString,$sTarget,$iPointerIn,$iAfter)
    Dim $iPointerTemp,$iPointerOut
    $iPointerTemp = StringInStr(StringMid($sString,$iPointerIn),$sTarget,0)
    if $iPointerTemp > 0 then
        $iPointerOut = $iPointerTemp
        $iPointerOut = $iPointerOut + $iPointerIn; Adds given optional offset to Pointer location
        if $iAfter = 1 then
            $iPointerOut = $iPointerOut + StringLen($sTarget); Sets Pointer location AFTER Target string
        endif
    else
        ;Target not found
        SetError(1)
    endif
    Return $iPointerOut
EndFunc
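
As for the "do something with this array" part, here is one possible way to tie it back to the original goal of downloading every exe. Treat it as a sketch, not a drop-in: `_ExtractHrefs` and `_DownloadExeLinks` are hypothetical names, it uses `StringRegExp` instead of `_GetLinks` so it stands on its own, and it assumes a reasonably recent AutoIt v3 plus absolute URLs in double-quoted href attributes.

```autoit
; Hypothetical helper: returns an array of every double-quoted href value in $sHtml,
; or sets @error = 1 when there are none.
Func _ExtractHrefs($sHtml)
    ; (?i) = case-insensitive; flag 3 = return all matches of the capture group
    Local $aHrefs = StringRegExp($sHtml, '(?i)href\s*=\s*"([^"]*)"', 3)
    If @error Then Return SetError(1, 0, 0)
    Return $aHrefs
EndFunc

; Hypothetical helper: downloads every link ending in ".exe" into $sDestDir
; and returns how many downloads succeeded.
Func _DownloadExeLinks($sHtml, $sDestDir)
    Local $aHrefs = _ExtractHrefs($sHtml)
    If @error Then Return SetError(1, 0, 0)
    Local $iCount = 0
    For $i = 0 To UBound($aHrefs) - 1
        ; "=" compares strings case-insensitively, so ".EXE" matches too
        If StringRight($aHrefs[$i], 4) = ".exe" Then
            ; File name = everything after the last "/" (occurrence -1 searches from the right)
            Local $sName = StringTrimLeft($aHrefs[$i], StringInStr($aHrefs[$i], "/", 0, -1))
            If InetGet($aHrefs[$i], $sDestDir & "\" & $sName) Then $iCount += 1
        EndIf
    Next
    Return $iCount
EndFunc
```

With the page source already in `$source`, something like `_DownloadExeLinks($source, "C:\Install")` would be the call, where `C:\Install` is just a placeholder folder. Relative links would need the site address prepended first, which this sketch does not attempt.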

Hope this helps.

-Trystian

Edited by TrystianSky


thanks guys =) helped me out A LOT! B)
