phatzilla Posted April 9, 2009

I already have this one, which saves each Craigslist results page as defined by my keywords. E.g. keyword = Toyota: it searches for Toyota on Craigslist and saves a brand-new .html file with all Craigslist ads that have 'toyota' in the title. Free for everyone to use:

```autoit
; Required includes (missing in the original post):
#include <Date.au3>
#include <IE.au3>

While 1
    $SITE = IniRead(@ScriptDir & "\Settings.ini", "Site", "Site", "default")
    $DATE = _DateTimeFormat(_NowCalc(), 1)
    $DATE1 = _DateTimeFormat(_NowCalc(), 3)
    $OIE = _IECreate($SITE, 0, 0, 1, -1)
    $OPS = _IETagNameGetCollection($OIE, "p")
    $CNTPS = @extended

    Local $ALINKINFO[$CNTPS + 1][3]
    $ALINKINFO[0][0] = "Index"
    $ALINKINFO[0][1] = "Link Text"
    $ALINKINFO[0][2] = "href"

    $CNT = 1
    For $OP In $OPS
        $OLINK = _IETagNameGetCollection($OP, "a", 0)
        $ALINKINFO[$CNT][0] = $CNT - 1
        $ALINKINFO[$CNT][1] = $OLINK.innerText
        $ALINKINFO[$CNT][2] = $OLINK.href
        $CNT += 1
    Next

    $Y = 0
    $TERMS = IniReadSection(@ScriptDir & "\Settings.ini", "Terms")
    For $I = 1 To $TERMS[0][0] Step 1
        For $X = 1 To $CNT - 1 Step 1
            If StringInStr($ALINKINFO[$X][1], $TERMS[$I][1]) <> 0 Then
                $Y = $Y + 1
                IniWrite(@ScriptDir & "\Results.ini", "Results", $ALINKINFO[$X][1], $ALINKINFO[$X][2])
            EndIf
        Next
    Next

    $RESULTS = IniReadSection(@ScriptDir & "\Results.ini", "Results")
    If $Y <> 0 Then
        $FILE = FileOpen(@ScriptDir & "\Results.htm", 9)
        FileWriteLine($FILE, "<h2>" & $DATE & "---------" & $DATE1 & "</h2> " & @CRLF)
        For $U = 1 To $RESULTS[0][0] Step 1
            FileWriteLine($FILE, '<p><a href="' & $RESULTS[$U][1] & '">' & $RESULTS[$U][0] & "</a></p>" & @CRLF)
        Next
        FileClose($FILE)
        TrayTip("Craigs List Results", "Search Found " & $Y & " items!", 5000)
        Sleep(5000)
        TrayTip("", "", 0)
    Else
        ; No FileClose here: $FILE is only opened in the other branch.
        TrayTip("Craigs List Results", "Inconclusive Search (0 Results)", 5000)
        Sleep(5000)
        TrayTip("", "", 0)
    EndIf

    Sleep((1000 * 60) * 60) ; wait one hour between runs
WEnd
```

Now here's what I want.
Let's say it searches for Toyota; now I want it to go into every page that matches 'toyota' and save the email address in the ad (usually in @craigslist.org format). I have no idea how to go about doing this, and I'd love some help so I can finally finish my little Craigslist auction indexer.
Authenticity Posted April 9, 2009 Share Posted April 9, 2009 (edited) #include <Array.au3> #include <HTTP.au3> Dim $sHost = 'allentown.craigslist.org' Dim $sPage = '/search/sss?query=toyota' Dim $sSource = '' Dim $sHTTP Dim $sHTTP = _HTTPConnect($sHost) If @error Then Exit _HTTPGet($sHost, $sPage, $sHTTP) If @error Then ConsoleWrite(@error & @TAB & @extended & @LF) _HTTPClose($sHTTP) Exit EndIf Dim $sSource = _HTTPRead($sHTTP) Dim $aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3) If IsArray($aMatches) Then _ArrayDisplay($aMatches) _HTTPClose($sHTTP) I believe you know how to continue. Edited April 9, 2009 by Authenticity Link to comment Share on other sites More sharing options...
phatzilla Posted April 9, 2009

Sorry, I'm quite lost. Can you please explain?
Authenticity Posted April 9, 2009

It's not as fast as you may want it to be, but at least it's faster than me or you, for that matter ;] If anything is not clear, first check the help file. Example:

```autoit
#include <Array.au3>
#include <HTTP.au3>
#include <INet.au3>

Dim $sHost = 'allentown.craigslist.org'
Dim $sPage = '/search/sss?query=toyota'

Dim $sHTTP = _HTTPConnect($sHost)
If @error Then Exit

_HTTPGet($sHost, $sPage, $sHTTP)
If @error Then
    ConsoleWrite(@error & @TAB & @extended & @LF)
    _HTTPClose($sHTTP)
    Exit
EndIf

Dim $sSource = _HTTPRead($sHTTP)
Dim $aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)
If IsArray($aMatches) Then
    Local $aSrc, $sMailto

    ; Fetch each matching ad page and pull the obfuscated mailto: address.
    For $i = 0 To UBound($aMatches) - 1
        $aSrc = StringRegExp(_INetGetSource($sHost & $aMatches[$i]), '(?i)mailto:([^?]++)\?subject', 1)
        If IsArray($aSrc) Then
            $sMailto = _ParseASCII($aSrc[0])
            ConsoleWrite($sMailto & @LF)
        EndIf
    Next
EndIf

_HTTPClose($sHTTP)

; Decode a run of HTML numeric character references (&#NN;) into plain text.
Func _ParseASCII($sString)
    Local $aMatch = StringRegExp($sString, '&#(\d++);', 3)
    Local $sTmp = ''

    For $i = 0 To UBound($aMatch) - 1
        $sTmp &= Chr($aMatch[$i])
    Next

    Return $sTmp
EndFunc
```

[A second snippet followed here but was garbled by the forum software's conversion; judging by the follow-up post below, it showed calling _HTTPGet again with an increasing s=xxx offset to fetch the remaining result pages.]
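The `_ParseASCII` helper exists because Craigslist obfuscated the address in its `mailto:` links as HTML numeric character references (`&#NN;`); rebuilding the email just means converting each decimal code point back to a character. A minimal Python sketch of the same decoding (the encoded address below is made up for illustration):

```python
import re

def parse_ascii(s):
    """Decode a string of HTML numeric character references (&#NN;)
    back into plain text, mirroring the AutoIt _ParseASCII helper."""
    return ''.join(chr(int(n)) for n in re.findall(r'&#(\d+);', s))

# Hypothetical obfuscated address of the kind Craigslist embedded
# in its mailto: links.
encoded = ('&#115;&#97;&#108;&#101;&#45;&#97;&#98;&#99;&#64;&#99;&#114;'
           '&#97;&#105;&#103;&#115;&#108;&#105;&#115;&#116;&#46;&#111;'
           '&#114;&#103;')
print(parse_ascii(encoded))  # sale-abc@craigslist.org
```

Any character not written as a `&#NN;` reference is simply dropped here, which matches the helper's behavior on a fully encoded address.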
phatzilla Posted April 9, 2009 (edited)

Thanks, Authenticity! You are a huge help! I am just wondering, however: why is it not giving all of the results?

Edit: Never mind, it's because some of the postings don't have the email. OK, this is a great help; I will go about finishing it off. Thank you again.

Edited April 9, 2009 by phatzilla
Authenticity Posted April 9, 2009 Share Posted April 9, 2009 It's because this is the first page and the page has a maximum of 100 links. Read the last comment on my last post - if you want to expand it to fish all of the links you'll have to use _HTTPGet with increasing s=xxx so in this case the next page is: http://allentown.craigslist.org/search/sss?query=toyota&s=100 and the third is: http://allentown.craigslist.org/search/sss?query=toyota&s=200 et cetera. Link to comment Share on other sites More sharing options...
phatzilla Posted April 9, 2009 Author Share Posted April 9, 2009 (edited) Nevermind Edited April 9, 2009 by phatzilla Link to comment Share on other sites More sharing options...