
Better way to do this?

cuShane

I can do what I am trying to do, but it's ugly. If anyone can give me some tips on a better way to do this I'd love to hear them.

The purpose of this section of the script is to retrieve and parse some HTML source code. The script should locate every place in the source where a number of views is listed (e.g. "2018 views") and then identify which of those is the highest. Once it has the position in the string of the highest number of views, the script should parse the name of the file listed before, but closest to, that position (which can be identified by finding "Filename="). The file name should be stored as $imgname.

CODE
$HTMLSource = _IEBodyReadHTML($o_IE) ;get html source code
$output = StringSplit($HTMLSource, @CRLF) ;split the code into lines
$hit = 0 ;used to denote if any results located (0 = false, 1 = true)

;gather results by searching for the word "views"
For $i = 1 To $output[0]
    If StringInStr($output[$i], "views") Then
        $hit = 1
        $resultPos = StringInStr($output[$i], "views") ;get position of the word "views"

        ;get the number of views by isolating the number before the word "views"
        $count = StringMid($output[$i], $resultPos - 10, 10)
        $count = StringRegExpReplace($count, "[^0-9]", "")

        ;add resulting number of views to the appropriate arrays
        _ArrayAdd($viewsarray, $count)
    EndIf
Next

;______________________________sect5.3________________________________
;If any results were found identify the one with the most views, locate the filename in the html source, and generate the image URL
If $hit Then
    $mostPOS = StringInStr($HTMLSource, _ArrayMax($viewsarray, 1) & " views", 0, -1) ;get position of most views
    $smallerStr = StringLeft($HTMLSource, $mostPOS) ;drop all text after the position with the most views
    $result1 = StringInStr($smallerStr, "Filename", 0, -1) ;position of the last instance of the word "Filename"
    $result2 = StringInStr($smallerStr, ".jpg", 0, 1, $result1) ;position of the end of the filename as identified by the .jpg extension
    $start = $result1 + 9
    $count = $result2 - $result1 - 5
    $imgname = StringMid($smallerStr, $start, $count) ;parse the filename
EndIf

Here is an example of the HTML source that I am trying to parse:

CODE
<td valign="top" class="thumbnails" width ="25%" align="center">
<table width="100%" cellpadding="0" cellspacing="0">
<tr>
<td align="center">
<a href="displayimage.php?album=search&amp;cat=0&amp;pos=0"><img src="albums/userpics/thumb_Cobain.jpg" class="image" width="100" height="57" border="0" alt="Cobain.jpg" title="Filename=Cobain.jpg Filesize=155KB Dimensions=1280x720 Date added=Jan 02, 2009"/><br /></a>
<span class="thumb_title">Curt Cobain</span><span class="thumb_title">19 views</span><span class="thumb_caption">Who needs to find high-rez images when you'e got the cutout tool?</span>
<span class="thumb_title"><a href ="profile.php?uid=6">digitalhigh</a></span>
</td>
</tr>
</table>
</td>
<td valign="top" class="thumbnails" width ="25%" align="center">
<table width="100%" cellpadding="0" cellspacing="0">
<tr>
<td align="center">
<a href="displayimage.php?album=search&amp;cat=0&amp;pos=1"><img src="albums/userpics/thumb_nirvana.jpg" class="image" width="100" height="57" border="0" alt="nirvana.jpg" title="Filename=nirvana.jpg Filesize=184KB Dimensions=1280x720 Date added=Jul 30, 2008"/><br /></a>
<span class="thumb_title">Nirvana</span><span class="thumb_title">223 views</span>
<span class="thumb_title"><a href ="profile.php?uid=1">nuzecast</a></span>
</td>
</tr>
</table>
</td>

Given this example HTML source, the script should function as follows:

1. Identify that there are two target items: one with 19 views and one with 223 views

2. Determine that 223 is the highest number of views

3. Set $imgname to the filename before, but closest to, "223 views", which in this example is "nirvana.jpg"

Spiff59

You don't need StringInStr in there twice.

Call StringInStr once, keep the position it returns, then check that value in your If/EndIf statement.

Zap the second StringInStr.

Ought to save your CPU a bit o work.
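
For illustration, a minimal sketch of that single-call pattern. The sample lines and the names $aLines, $iPos, $iFrom, $sDigits and $iMax are made up here and are not from the original script; it also searches for " views" with a leading space, which is a small assumption about the page text.

```autoit
; Sketch only: sample data and variable names are illustrative assumptions.
Local $aLines[3] = ['<span>19 views</span>', '<span>no match here</span>', '<span>223 views</span>']
Local $iMax = -1
Local $iPos, $iFrom, $sDigits

For $i = 0 To UBound($aLines) - 1
    ; One StringInStr call per line; it returns 0 when nothing is found.
    $iPos = StringInStr($aLines[$i], " views")
    If $iPos Then
        ; Take up to 10 characters before " views" and keep only the digits.
        $iFrom = $iPos - 10
        If $iFrom < 1 Then $iFrom = 1
        $sDigits = StringRegExpReplace(StringMid($aLines[$i], $iFrom, $iPos - $iFrom), "[^0-9]", "")
        If Int($sDigits) > $iMax Then $iMax = Int($sDigits)
    EndIf
Next

ConsoleWrite($iMax & @CRLF)
```

Keeping the position in a variable also makes it possible to track the maximum while looping, so a separate results array may not be needed.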

Authenticity

#include <Array.au3>

Dim $sStr = '<td valign="top" class="thumbnails" width ="25%" align="center">' & @CRLF & _
            '<table width="100%" cellpadding="0" cellspacing="0">' & @CRLF & _
            '<tr>' & @CRLF & _
            '<td align="center">' & @CRLF & _
            '<a href="displayimage.php?album=search&amp;cat=0&amp;pos=0"><img src="albums/userpics/thumb_Cobain.jpg" class="image" width="100" height="57" border="0" alt="Cobain.jpg" title="Filename=Cobain.jpg Filesize=155KB Dimensions=1280x720 Date added=Jan 02, 2009"/><br /></a>' & @CRLF & _
            '<span class="thumb_title">Curt Cobain</span><span class="thumb_title">19 views</span><span class="thumb_caption">Who needs to find high-rez images when you"e got the cutout tool?</span>' & @CRLF & _
            '<span class="thumb_title"><a href ="profile.php?uid=6">digitalhigh</a></span>' & @CRLF & _
            '</td>' & @CRLF & _
            '</tr>' & @CRLF & _
            '</table>' & @CRLF & _
            '</td>' & @CRLF & _
            '<td valign="top" class="thumbnails" width ="25%" align="center">' & @CRLF & _
            '<table width="100%" cellpadding="0" cellspacing="0">' & @CRLF & _
            '<tr>' & @CRLF & _
            '<td align="center">' & @CRLF & _
            '<a href="displayimage.php?album=search&amp;cat=0&amp;pos=1"><img src="albums/userpics/thumb_nirvana.jpg" class="image" width="100" height="57" border="0" alt="nirvana.jpg" title="Filename=nirvana.jpg Filesize=184KB Dimensions=1280x720 Date added=Jul 30, 2008"/><br /></a>' & @CRLF & _
            '<span class="thumb_title">Nirvana</span><span class="thumb_title">223 views</span>' & @CRLF & _
            '<span class="thumb_title"><a href ="profile.php?uid=1">nuzecast</a></span>' & @CRLF & _
            '</td>' & @CRLF & _
            '</tr>' & @CRLF & _
            '</table>' & @CRLF & _
            '</td>'
            
$Arr = StringRegExp($sStr, '(?i)([0-9]*) views', 3)

If Not @error Then
    Dim $Rank = Int($Arr[0])
    
    For $i = 1 To UBound($Arr)-1
        If Int($Arr[$i]) > $Rank Then $Rank = Int($Arr[$i])
    Next
EndIf

$Arr = StringRegExp($sStr, '(?i)(?s)<img.*"Filename=([0-9a-z\.]*) .*' & $Rank & ' views', 3)

If IsArray($Arr) Then _ArrayDisplay($Arr)

Maybe something as silly as this?

cuShane

Perfect, thanks! I made a couple of changes to the final code so that it handles a view count containing a comma and/or a filename containing a space:

CODE
$Arr = StringRegExp($sStr, '(?i)(\d*,?\d*) views', 3)

If Not @error Then
    Dim $Rank = Int(StringReplace($Arr[0], ",", ""))
    Dim $RankIndex = 0

    For $i = 1 To UBound($Arr) - 1
        If Int(StringReplace($Arr[$i], ",", "")) > $Rank Then
            $Rank = Int(StringReplace($Arr[$i], ",", ""))
            $RankIndex = $i
        EndIf
    Next

    ;look up the filename only when at least one view count was found
    $Arr = StringRegExp($sStr, '(?i)(?s)<img.*"Filename=([0-9a-z \.]*) .*' & $Arr[$RankIndex] & ' views', 3)
    If IsArray($Arr) Then _ArrayDisplay($Arr)
EndIf
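
One possible variation, offered only as a sketch: capture each filename together with its own view count in a single pass, so the second search for "N views" is not needed and a count such as "23 views" cannot accidentally match inside "223 views". This assumes every "Filename=" is followed by " Filesize=" and by its own "N views" span before the next thumbnail starts. The shortened sample string and the names $sHtml, $aPairs, $iBest and $iViews are hypothetical.

```autoit
; Sketch only: shortened sample markup and variable names are assumptions.
Local $sHtml = 'title="Filename=Cobain.jpg Filesize=155KB"/>' & @CRLF & _
        '<span class="thumb_title">19 views</span>' & @CRLF & _
        'title="Filename=my pic.jpg Filesize=184KB"/>' & @CRLF & _
        '<span class="thumb_title">1,223 views</span>' & @CRLF & _
        'title="Filename=other.jpg Filesize=90KB"/>' & @CRLF & _
        '<span class="thumb_title">23 views</span>'

Local $imgname = ""
Local $iBest = -1
Local $iViews

; Flag 3 returns every capture group of every match in one flat array:
; [name, count, name, count, ...]
Local $aPairs = StringRegExp($sHtml, '(?is)Filename=(.+?)\s+Filesize=.*?>([\d,]+) views<', 3)

If IsArray($aPairs) Then
    For $i = 0 To UBound($aPairs) - 2 Step 2
        ; Strip thousands separators before comparing numerically.
        $iViews = Int(StringReplace($aPairs[$i + 1], ",", ""))
        If $iViews > $iBest Then
            $iBest = $iViews
            $imgname = $aPairs[$i]
        EndIf
    Next
EndIf

ConsoleWrite($imgname & " (" & $iBest & " views)" & @CRLF)
```

If a thumbnail can appear without a view count, the lazy `.*?` would run on into the next thumbnail, so the pattern would need tightening for that case.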
