phatzilla

Someone help me make an email saver...

Recommended Posts

phatzilla

I already have this script, which saves a Craigslist search page for each of my keywords.

E.g. with the keyword "Toyota", it searches Craigslist for Toyota and saves a brand new .html file with all the Craigslist ads that have 'toyota' in the title.

Free for everyone to use

#include <Date.au3>
#include <IE.au3>

While 1
    $SITE = IniRead(@ScriptDir & "\Settings.ini", "Site", "Site", "default")
    $DATE = _DateTimeFormat(_NowCalc(), 1)
    $DATE1 = _DateTimeFormat(_NowCalc(), 3)
    $OIE = _IECreate($SITE, 0, 0, 1, -1)
    $OPS = _IETagNameGetCollection($OIE, "p")
    $CNTPS = @extended
    Local $ALINKINFO[$CNTPS + 1][3]
    $ALINKINFO[0][0] = "Index"
    $ALINKINFO[0][1] = "Link Text"
    $ALINKINFO[0][2] = "href"
    $CNT = 1
    For $OP In $OPS
        $OLINK = _IETagNameGetCollection($OP, "a", 0)
        If IsObj($OLINK) Then ; skip paragraphs that contain no anchor
            $ALINKINFO[$CNT][0] = $CNT - 1
            $ALINKINFO[$CNT][1] = $OLINK.innerText
            $ALINKINFO[$CNT][2] = $OLINK.href
            $CNT += 1
        EndIf
    Next
    _IEQuit($OIE) ; close the hidden IE instance so one isn't leaked every cycle
    $Y = 0
    $TERMS = IniReadSection(@ScriptDir & "\Settings.ini", "Terms")
    For $I = 1 To $TERMS[0][0]
        For $X = 1 To $CNT - 1
            If StringInStr($ALINKINFO[$X][1], $TERMS[$I][1]) <> 0 Then
                $Y += 1
                IniWrite(@ScriptDir & "\Results.ini", "Results", $ALINKINFO[$X][1], $ALINKINFO[$X][2])
            EndIf
        Next
    Next
    $RESULTS = IniReadSection(@ScriptDir & "\Results.ini", "Results")
    If $Y <> 0 Then
        $FILE = FileOpen(@ScriptDir & "\Results.htm", 9) ; 9 = append + create directory
        FileWriteLine($FILE, "<h2>" & $DATE & "---------" & $DATE1 & "</h2> " & @CRLF)
        For $U = 1 To $RESULTS[0][0]
            FileWriteLine($FILE, '<p><a href="' & $RESULTS[$U][1] & '">' & $RESULTS[$U][0] & "</a></p>" & @CRLF)
        Next
        FileClose($FILE)
        TrayTip("Craigslist Results", "Search found " & $Y & " items!", 5000)
        Sleep(5000)
        TrayTip("", "", 0)
    Else
        ; no file was opened in this branch, so there is nothing to close
        TrayTip("Craigslist Results", "Inconclusive search (0 results)", 5000)
        Sleep(5000)
        TrayTip("", "", 0)
    EndIf
    Sleep(1000 * 60 * 60) ; wait one hour between searches
WEnd

Now here's what I want.

Let's say it searches for Toyota. I now want it to go into every page that matches 'toyota' and save the email address that's in the ad (usually in @craigslist.org format). I have no idea how to go about doing this, and I'd love some help so I can finally finish my little Craigslist auction indexer :D

Authenticity

#include <Array.au3>
#include <HTTP.au3>

Dim $sHost = 'allentown.craigslist.org'
Dim $sPage = '/search/sss?query=toyota'
Dim $sSource, $sHTTP, $aMatches

$sHTTP = _HTTPConnect($sHost)
If @error Then Exit

_HTTPGet($sHost, $sPage, $sHTTP)
If @error Then
    ConsoleWrite(@error & @TAB & @extended & @LF)
    _HTTPClose($sHTTP)
    Exit
EndIf

$sSource = _HTTPRead($sHTTP)
$aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)

If IsArray($aMatches) Then _ArrayDisplay($aMatches)


_HTTPClose($sHTTP)

I believe you know how to continue.

phatzilla

Sorry, I'm quite lost. Can you please explain?

Authenticity

It's not as fast as you may want it to be, but at least it's faster than you or me, for that matter ;]

If anything is not clear first check the help file.

Example:

#include <Array.au3>
#include <HTTP.au3>
#include <INet.au3>

Dim $sHost = 'allentown.craigslist.org'
Dim $sPage = '/search/sss?query=toyota'
Dim $sSource, $sHTTP, $aMatches

$sHTTP = _HTTPConnect($sHost)
If @error Then Exit

_HTTPGet($sHost, $sPage, $sHTTP)
If @error Then
    ConsoleWrite(@error & @TAB & @extended & @LF)
    _HTTPClose($sHTTP)
    Exit
EndIf

$sSource = _HTTPRead($sHTTP)
$aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)

If IsArray($aMatches) Then
    Local $aSrc, $sMailto
    
    For $i = 0 To UBound($aMatches)-1
        $aSrc = StringRegExp(_INetGetSource($sHost & $aMatches[$i]), '(?i)mailto:([^?]++)\?subject',  1)
        If IsArray($aSrc) Then 
            $sMailto = _ParseASCII($aSrc[0])
            ConsoleWrite($sMailto & @LF)
        EndIf
    Next
EndIf
_HTTPClose($sHTTP)


Func _ParseASCII($sString)
    Local $aMatch = StringRegExp($sString, '&#(\d++);', 3)
    Local $sTmp = ''
    
    For $i = 0 To UBound($aMatch)-1
        $sTmp &= Chr($aMatch[$i])
    Next
    
    Return $sTmp
EndFunc
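The _ParseASCII function above decodes the numeric character references (`&#110;` and friends) that Craigslist uses to obfuscate the mailto address in the ad page. A quick illustration of what it does; the encoded string below is made up for demonstration, and the function is repeated so the snippet runs on its own:

```autoit
; decode a run of HTML numeric character references into plain text
Func _ParseASCII($sString)
    Local $aMatch = StringRegExp($sString, '&#(\d++);', 3) ; flag 3 = array of capture groups
    Local $sTmp = ''

    For $i = 0 To UBound($aMatch) - 1
        $sTmp &= Chr($aMatch[$i])
    Next

    Return $sTmp
EndFunc

; "sale" encoded as numeric character references (hypothetical sample input)
Dim $sEncoded = '&#115;&#97;&#108;&#101;'
ConsoleWrite(_ParseASCII($sEncoded) & @LF) ; prints "sale"
```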

Share this post


Link to post
Share on other sites
phatzilla

Thanks, Authenticity! You are a huge help!

I am just wondering, however: why is it not giving all of the results?

Edit: Never mind, it's because some of the postings don't have the email.

OK, this is a great help; I will go about finishing it off. Thank you again.

Authenticity

It's because this fetches only the first page, and each page holds a maximum of 100 links. If you want to expand it to fish all of the links, you'll have to call _HTTPGet with an increasing s=xxx offset, so in this case the next page is:

http://allentown.craigslist.org/search/sss?query=toyota&s=100

and the third is:

http://allentown.craigslist.org/search/sss?query=toyota&s=200

et cetera.
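The paging described above can be sketched like this (a minimal sketch assuming the same HTTP.au3 UDF as the earlier snippets; it stops once a page yields no more matching links):

```autoit
#include <HTTP.au3>

Dim $sHost = 'allentown.craigslist.org'
Dim $sQuery = '/search/sss?query=toyota'
Dim $sHTTP = _HTTPConnect($sHost)
If @error Then Exit

Dim $iOffset = 0, $sSource, $aMatches
While 1
    ; page 1 is s=0, page 2 is s=100, page 3 is s=200, ...
    _HTTPGet($sHost, $sQuery & '&s=' & $iOffset, $sHTTP)
    If @error Then ExitLoop

    $sSource = _HTTPRead($sHTTP)
    $aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)
    If Not IsArray($aMatches) Then ExitLoop ; no more results

    For $i = 0 To UBound($aMatches) - 1
        ConsoleWrite($aMatches[$i] & @LF) ; fetch each ad page here, as in the earlier example
    Next

    $iOffset += 100
WEnd
_HTTPClose($sHTTP)
```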

phatzilla

Nevermind
