mcfr1es

String And Url Functions

4 posts in this topic

OK, I'm trying to extract only the usernames from a forum-style webpage on which many usernames are scattered. Please see the code below.

;  Determining the link on a webpage.
;------------------------------------------------------------------------------

;this url will change and the process will be done again, when i am sure the script
;works, an embedded for loop will be used to change the url
  $sURL = "http://www.neopets.com/neoboards/boardlist.phtml?board=1"
  $UserList = _GetUsers($sURL)
  MsgBox ( 1, "users", $Userlist[3])
;just checking to see if script works, these names will be written to a text file later 

  
Func _GetUsers($psURL)
;Returns an array of usernames from a webpage
;------------------------------------------------------------------------------

;Download the HTML to a temporary file
   $sTempFile  = "$page.htm"
   URLDownloadToFile($psURL, $sTempFile)
   $sHTML = FileRead($sTempFile, FileGetSize($sTempFile))
   FileDelete($sTempFile)
   
;Cleanup the HTML for better consumption    
   $sHTML = StringReplace($sHTML, @CR, "")
   $sHTML = StringReplace($sHTML, @LF, "")
   $sHTML = StringReplace($sHTML, @TAB, " ")

;Break it into chewable bytes
   $sHTML = StringReplace($sHTML, '<span class="blistSmall">', @LF & '<span class="blistSmall">')
   
   $asHTML = StringSplit($sHTML, @LF)
   
;Spit out the bones
   $sUserlist = ""
   For $nX = 1 to $asHTML[0]
     ;Process only "<span class="blistSmall">" lines
       If StringLeft($asHTML[$nX],25) = '<span class="blistSmall">' then
           $asUserlist = StringSplit($asHTML[$nX], ">")
           $sUserlist = $sUserlist & @LF & $asUserlist[1]
       Endif
   Next 

;Return the juicy usernames
   Return StringSplit(StringTrimLeft($sUserlist,1), @LF)
   
EndFunc

When viewing the source of that webpage, I see that all of the usernames are contained within this tag: <span class="blistSmall">"username"</span>

The script above was therefore written to pull the text from inside this tag, but it does not work. Every username comes back as <span class="blistSmall" instead of the actual username that follows the tag. I think the cause is the line $asUserlist = StringSplit($asHTML[$nX], ">"), but I am not sure how to fix it.


Roger! You son of a big pile o' Monkey Nuts.


I would use this logic to filter out the username; see if that works for you (untested):

If StringLeft($asHTML[$nX],25) = '<span class="blistSmall">' then
           $asUserlist = Stringtrimleft($asHTML[$nX], StringInStr($asHTML[$nX],">")+1)
           $asUserlist = Stringleft($asUserlist, StringInStr($asUserlist,"<")-1)
           $sUserlist = $sUserlist & @LF & $asUserlist
       Endif

But I believe this should work as well (untested):

If StringLeft($asHTML[$nX],25) = '<span class="blistSmall">' then
           $asUserlist = StringSplit($asHTML[$nX], "><")
           $sUserlist = $sUserlist & @LF & $asUserlist[3]
       Endif
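
For what it's worth, here is a small standalone sketch of why the original line returned the tag instead of the name. With the default flag, StringSplit treats every character of the delimiter string as a separate delimiter, and element [0] of the result holds the count. The sample line is made up for illustration: the username some_user and the trailing markup are assumptions, not taken from the real page.

```autoit
; Hypothetical sample line, shaped like one chunk produced by the script above
Local $sLine = '<span class="blistSmall">some_user</span></td>'

; Splitting on ">" only: element [1] is everything before the first ">"
Local $aOne = StringSplit($sLine, ">")
ConsoleWrite($aOne[1] & @CRLF) ; the opening tag without its closing ">"
ConsoleWrite($aOne[2] & @CRLF) ; the username followed by "</span"

; Splitting on ">" and "<": the line starts with "<", so [1] is empty,
; [2] is the tag contents and [3] is the text between the tags
Local $aTwo = StringSplit($sLine, "><")
ConsoleWrite($aTwo[3] & @CRLF)
```

So with ">" alone the name is in element [2] with "</span" still attached, while with "><" it sits cleanly in element [3].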

Visit the SciTE4AutoIt3 Download page for the latest versions        Beta files                                                          Forum Rules
 
Live for the present,
Dream of the future,
Learn from the past.
  :)


The first script works perfectly except for one thing: it deletes the first character of every username. Can anyone help?

Also, how would I make a For loop that runs once for however many usernames there are? Or would I have to use a different loop altogether?



Try this version:

If StringLeft($asHTML[$nX],25) = '<span class="blistSmall">' then
           $asUserlist = Stringtrimleft($asHTML[$nX], StringInStr($asHTML[$nX],">"))
           $asUserlist = Stringleft($asUserlist, StringInStr($asUserlist,"<")-1)
           $sUserlist = $sUserlist & @LF & $asUserlist
       Endif
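
To spell out the difference: StringInStr returns the 1-based position of the first ">", which is exactly the number of characters StringTrimLeft has to remove, so the extra +1 in the earlier version ate the first letter of each name. As for the loop question, the array returned by StringSplit keeps its element count in [0], so a plain For loop can run up to that value. A rough sketch, assuming the array has the same shape as the one _GetUsers returns; the three names and the file name users.txt are made up for the example:

```autoit
; Stand-in for the array returned by _GetUsers($sURL); [0] holds the count
Local $UserList = StringSplit("alice|bob|carol", "|")

; Mode 2 opens the file for writing and erases any previous contents
Local $hFile = FileOpen("users.txt", 2)
If $hFile = -1 Then
    MsgBox(16, "Error", "Could not open users.txt for writing")
    Exit
EndIf

; Loop once per username, however many there are
For $i = 1 To $UserList[0]
    FileWriteLine($hFile, $UserList[$i])
Next

FileClose($hFile)
```

In the real script, drop the stand-in line and use the array that _GetUsers gives back.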
