Troubles with _ArrayBinarySearch


#include <file.au3>
#include <Array.au3>

Dim $Links[1]
Dim $FileLinks[1]
Global $LastLink = 0
$LOGFILE="Links.txt"
$LOGFILE=@ScriptDir&"\"&$LOGFILE

ExtractLinks('http://newslink.org/')
_ArrayDelete($Links,0)

_ArrayDisplay($Links, "Links Array")
DeleteDuplicates()
LogLinks()

;==========================================================================================
Func ExtractLinks($Site)
    Local $Source, $PageLinks = 0
    $Source = _INetGetSource($Site)
    $PageLinks = StringRegExp($Source, '(?i)<A href="(.*?)">', 3)
    For  $NUM = 0 to UBound($PageLinks) - 1
        _ArrayAdd($Links,$PageLinks[$NUM])
    Next
    _ArraySort($Links)
    $LastLink = UBound($Links) - 1
EndFunc
;==========================================================================================
Func DeleteDuplicates() 
    LoadLOG()
    Local $Link = 0     
    While $Link < $LastLink
; Check for duplicates
        if $Links[$Link] = $Links[$Link+1] Then
            _ArrayDelete ($Links, $Link+1)
        EndIf
; Load the LOG and check for duplicates
        msgbox(64,"Searching...",$Links[$Link])
        _ArrayBinarySearch ($FileLinks, $Links[$Link])
        If Not @error Then;If Found
            _ArrayDelete ($Links, $Link)
            msgbox(64,"Result:","FOUND")
        Else
            msgbox(64,"Result:","NOT FOUND")
        EndIf
;Update the Link and LastLink
        $Link = $Link +1
        $LastLink = UBound($Links) - 1
    Wend
EndFunc
;==========================================================================================
Func LoadLOG()
    FileOpen($LOGFILE,1)
    _FileReadToArray($LOGFILE,$FileLinks)
    
    _ArrayDelete ($FileLinks, $FileLinks[0])
    _ArrayDelete ($FileLinks, 0)
    
    _ArrayDisplay($FileLinks,"FileLinks Array")
EndFunc
;==========================================================================================
Func LogLinks()
    FileOpen($LOGFILE,1)
    For $Link = 0 to $LastLink
        FileWriteLine($LOGFILE,$Links[$Link]) 
    Next
    FileClose($LOGFILE)
EndFunc 
;==========================================================================================
Func _INetGetSource($s_URL)
    $o_HTTP = ObjCreate ("winhttp.winhttprequest.5.1")
    $o_HTTP.open ("GET", $s_URL)
    $o_HTTP.send ()
    return $o_HTTP.Responsetext
EndFunc

"; Load the LOG and check for duplicates" is where my trouble is.

I use _ArrayDisplay to debug and view the arrays. The $Links and $FileLinks work as they should.

I can't seem to get the binary search to find any matches between $Links and $FileLinks. The first run sets up the log, so the second run shouldn't log any links, because they are already in the log.

Any feedback is welcome


What's the problem? I ran the program and I don't see any duplicate links in the output file.

Edit: Oh, I get it. If you run it more than once, the file contains another copy of the list because you append to it. Hmm... Let me look into this...

Edited by blindwig

OK, 2 problems I found:

First, the _FileReadToArray() leaves the EOL markers on the end of the strings when it puts them in the array. You need to run through the array and run StringStripCR() on each element after you read the file.
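
Something along these lines might do for the cleanup. It is only a sketch: the helper name `_LoadCleanLines` and its parameters are made up for illustration, and it assumes `_FileReadToArray()` is used in its default mode, where element 0 holds the line count. The sort at the end is an extra step beyond the EOL fix; it is there because `_ArrayBinarySearch()` expects sorted data, and a log that is appended to on every run is not guaranteed to be in order.

```autoit
#include <File.au3>
#include <Array.au3>

; Hypothetical helper: reads a text file into $aLines and cleans each line.
; Returns True on success, False if the file could not be read.
Func _LoadCleanLines($sPath, ByRef $aLines)
    If Not _FileReadToArray($sPath, $aLines) Then Return False
    ; Element 0 holds the line count, so the real lines start at index 1.
    For $i = 1 To $aLines[0]
        ; Remove any carriage returns left on the line
        $aLines[$i] = StringStripCR($aLines[$i])
    Next
    ; _ArrayBinarySearch expects sorted data; sort only the lines (index 1 onward).
    _ArraySort($aLines, 0, 1)
    Return True
EndFunc
```

With the count left in element 0, a search would then start at index 1, for example `_ArrayBinarySearch($aLines, $sLink, 1)`, where `$sLink` stands for the link being checked.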

Second, your searching logic is flawed:

If you find a duplicate link, you delete the current link and then move on to the next index. That skips a link: when you delete the current element, the following element shifts down into its place, and incrementing the index steps right past it. The solution is either to advance the index only when nothing was deleted, or to move through the array backwards (from end to beginning) so that deleted elements don't affect where you are in the array.
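
Here is a rough sketch of the backwards approach. The function name `_RemoveKnownLinks` and its parameters are hypothetical, and it assumes `$aKnown` is already sorted and contains only links (no count in element 0).

```autoit
#include <Array.au3>

; Hypothetical helper: removes from $aLinks every entry that also appears in
; the sorted array $aKnown. Walking from the last index down to 0 means a
; deletion only shifts elements that have already been checked.
Func _RemoveKnownLinks(ByRef $aLinks, ByRef $aKnown)
    For $i = UBound($aLinks) - 1 To 0 Step -1
        ; _ArrayBinarySearch returns the index of the match, or -1 if not found
        If _ArrayBinarySearch($aKnown, $aLinks[$i]) <> -1 Then
            _ArrayDelete($aLinks, $i)
        EndIf
    Next
EndFunc
```

The forward-moving alternative would be a While loop that increments the index only in the branch where nothing was deleted.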


First, the _FileReadToArray() leaves the EOL markers on the end of the strings when it puts them in the array.  You need to run through the array and run StringStripCR() on each element after you read the file.

Will be fixed in the next UDF version... tnx
