ParoXsitiC

Troubles with _ArrayBinarySearch

6 posts in this topic

#include <file.au3>
#include <Array.au3>

Dim $Links[1]
Dim $FileLinks[1]
Global $LastLink = 0
$LOGFILE="Links.txt"
$LOGFILE=@ScriptDir&"\"&$LOGFILE

ExtractLinks('http://newslink.org/')
_ArrayDelete($Links,0)

_ArrayDisplay($Links, "Links Array")
DeleteDuplicates()
LogLinks()

;==========================================================================================
Func ExtractLinks($Site)
    Local $Source, $PageLinks = 0
    $Source = _INetGetSource($Site)
    $PageLinks = StringRegExp($Source, '(?i)<A href="(.*?)">', 3)
    For  $NUM = 0 to UBound($PageLinks) - 1
        _ArrayAdd($Links,$PageLinks[$NUM])
    Next
    _ArraySort($Links)
    $LastLink = UBound($Links) - 1
EndFunc
;==========================================================================================
Func DeleteDuplicates() 
    LoadLOG()
    Local $Link = 0     
    While $Link < $LastLink
        ; Check for duplicates
        If $Links[$Link] = $Links[$Link+1] Then
            _ArrayDelete($Links, $Link+1)
        EndIf
        ; Load the LOG and check for duplicates
        msgbox(64,"Searching...",$Links[$Link])
        _ArrayBinarySearch ($FileLinks, $Links[$Link])
        If Not @error Then;If Found
            _ArrayDelete ($Links, $Link)
            msgbox(64,"Result:","FOUND")
        Else
            msgbox(64,"Result:","NOT FOUND")
        EndIf
        ; Update the Link and LastLink
        $Link = $Link +1
        $LastLink = UBound($Links) - 1
    Wend
EndFunc
;==========================================================================================
Func LoadLOG()
    FileOpen($LOGFILE,1)
    _FileReadToArray($LOGFILE,$FileLinks)
    
    _ArrayDelete ($FileLinks, $FileLinks[0])
    _ArrayDelete ($FileLinks, 0)
    
    _ArrayDisplay($FileLinks,"FileLinks Array")
EndFunc
;==========================================================================================
Func LogLinks()
    FileOpen($LOGFILE,1)
    For $Link = 0 to $LastLink
        FileWriteLine($LOGFILE,$Links[$Link]) 
    Next
    FileClose($LOGFILE)
EndFunc 
;==========================================================================================
Func _INetGetSource($s_URL)
    $o_HTTP = ObjCreate ("winhttp.winhttprequest.5.1")
    $o_HTTP.open ("GET", $s_URL)
    $o_HTTP.send ()
    return $o_HTTP.Responsetext
EndFunc

"; Load the LOG and check for duplicates" is where my trouble is.

I use _ArrayDisplay to debug and view the arrays. Both $Links and $FileLinks are populated as they should be.

I can't get the binary search to find any matches between $Links and $FileLinks. The first run sets up the log, so the second run shouldn't log any links, because they are already in the log.

Any feedback is welcome


Please help.


#3 ·  Posted (edited)

What's the problem? I ran the program and I don't see any duplicate links in the output file.

Edit: Oh, I get it. If you run it more than once, the file contains a second copy of the list because you append to it. Hmm... Let me look into this...

Edited by blindwig


OK, I found two problems:

First, _FileReadToArray() leaves the EOL markers on the end of the strings when it puts them in the array, so the entries read from the log never match the freshly extracted links. You need to loop through the array and run StringStripCR() on each element after you read the file.
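The clean-up step described above might look like the following minimal sketch. The helper name `_StripArrayCR` is hypothetical, and the sketch assumes the default `_FileReadToArray()` layout in which element 0 holds the line count, so that element is skipped. On UDF versions that already strip the line endings, the loop should simply leave the strings unchanged.

```autoit
#include <File.au3>

; Hypothetical helper (not part of any UDF): strip carriage returns from each
; line of an array produced by _FileReadToArray().
; Assumes element 0 holds the line count, so the loop starts at index 1.
Func _StripArrayCR(ByRef $aLines)
    For $i = 1 To UBound($aLines) - 1
        $aLines[$i] = StringStripCR($aLines[$i])
    Next
EndFunc

; Usage: read the log, then clean every line before searching the array.
Local $aFileLinks
If _FileReadToArray(@ScriptDir & "\Links.txt", $aFileLinks) Then
    _StripArrayCR($aFileLinks)
EndIf
```

After this step a stored link and a freshly extracted link should compare equal, which is what the binary search relies on.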

Second, your searching logic is flawed:

When you find a duplicate link, you delete the current link and then advance to the next index. But deleting the current element shifts the following link into its slot, so advancing skips that link without ever checking it. The fix is either to stay on the same index until the current link is not found, or to walk the array backwards (from end to beginning) so that deleted elements don't affect your position in the array.
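The backwards walk suggested above could be sketched roughly as follows. The function name `_RemoveKnownLinks` is hypothetical, and the sketch assumes both arrays are 0-based (no count in element 0), already sorted with `_ArraySort()`, and free of stray line endings. It checks `@error` immediately after `_ArrayBinarySearch()`, the same way the original script does.

```autoit
#include <Array.au3>

; Hypothetical helper: remove from $aLinks every adjacent duplicate and every
; entry that already appears in $aLogged.
; Assumes both arrays are sorted and 0-based.
Func _RemoveKnownLinks(ByRef $aLinks, ByRef $aLogged)
    Local $bRemove
    ; Walk from the last index to the first, so deleting element $i never
    ; shifts an element that has not been visited yet.
    For $i = UBound($aLinks) - 1 To 0 Step -1
        $bRemove = False
        ; Adjacent duplicate: same as the previous entry in the sorted array.
        If $i > 0 Then
            If $aLinks[$i] = $aLinks[$i - 1] Then $bRemove = True
        EndIf
        ; Already logged: found by binary search in the sorted log array.
        If Not $bRemove And UBound($aLogged) > 0 Then
            _ArrayBinarySearch($aLogged, $aLinks[$i])
            If Not @error Then $bRemove = True
        EndIf
        If $bRemove Then _ArrayDelete($aLinks, $i)
    Next
EndFunc
```

Because the loop bounds are fixed when the loop starts and the index only decreases, there is no need to recompute the last index after each deletion.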


First, the _FileReadToArray() leaves the EOL markers on the end of the strings when it puts them in the array.  You need to run through the array and run StringStripCR() on each element after you read the file.


This will be fixed in the next UDF version. Thanks.



Yes, I found the problems before I looked here. Thanks anyhow! I noticed the search was flawed too. Everything is working now.
