phatzilla Posted September 14, 2009 (edited)

Hey there guys, I have a craigslist email saver script from a while ago. I tried to fire her up and it's not working:

    #include <Array.au3>
    #include <HTTP.au3>
    #include <INet.au3>

    Func _ParseASCII($sString)
        Local $aMatch = StringRegExp($sString, '&#(\d++);', 3)
        Local $sTmp = ''
        For $i = 0 To UBound($aMatch)-1
            $sTmp &= Chr($aMatch[$i])
        Next
        Return $sTmp
    EndFunc

    Dim $sHost = 'losangeles.craigslist.org'
    Dim $sPage = '/search/sss?query=toyota'
    Dim $sSource = ''
    Dim $sHTTP

    Dim $sHTTP = _HTTPConnect($sHost)
    If @error Then Exit

    _HTTPGet($sHost, $sPage, $sHTTP)
    If @error Then
        ConsoleWrite(@error & @TAB & @extended & @LF)
        _HTTPClose($sHTTP)
        Exit
    EndIf

    Dim $sSource = _HTTPRead($sHTTP)
    Dim $aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)

    If IsArray($aMatches) Then
        Local $aSrc, $sMailto
        For $i = 0 To UBound($aMatches)-1
            $aSrc = StringRegExp(_INetGetSource($sHost & $aMatches[$i]), '(?i)mailto:([^?]++)\?subject', 1)
            If IsArray($aSrc) Then
                $sMailto = _ParseASCII($aSrc[0])
                ConsoleWrite($sMailto & @LF)
            EndIf
        Next
    EndIf

    _HTTPClose($sHTTP)

It's supposed to write every Toyota ad's email to the console, but when I run it, it just writes blank lines. It's worked before and I haven't really changed anything, so I don't know exactly what's up. I think it may have something to do with the regex (the one which checks the e-mail link). Can somebody check?

Edited September 14, 2009 by phatzilla
PsaltyDS Posted September 14, 2009

Is that kind of botting consistent with the owner's terms of use? Does the site post a robots.txt that you are ignoring? Did they perhaps implement some anti-bot defensive techniques?

Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law
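The robots.txt question can be checked directly before running anything else. A minimal sketch, assuming the file sits at the conventional /robots.txt path on the host used in the original script, and that `_INetGetSource` from INet.au3 is available as it is in that script. The `http://` prefix is also an assumption:

```autoit
#include <INet.au3>

; Assumption: robots.txt is served from the host root, as is conventional.
Local $sHost = 'losangeles.craigslist.org'
Local $sRobots = _INetGetSource('http://' & $sHost & '/robots.txt')

If @error Then
    ; The request failed, so no conclusion can be drawn about the site's policy.
    ConsoleWrite('Could not fetch robots.txt, @error = ' & @error & @LF)
Else
    ; Print the file so the Disallow rules can be read by hand.
    ConsoleWrite($sRobots & @LF)
EndIf
```

This only prints the file. Whether the site permits automated access is still a matter of reading its rules and its terms of use.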
phatzilla (Author) Posted September 14, 2009

This is for educational purposes. Am I doing something wrong? I'm serious.
PsaltyDS Posted September 14, 2009

"... am i doing something wrong?"

Not necessarily, you just want to check on the web site owner's policy before botting their site. And one possible reason for such a working script to STOP working would be a defensive response from the site admin.
phatzilla (Author) Posted September 14, 2009

Well, I am not going to make any of the information available. It will be discarded, and I'm quite sure that falls within the ToS.

Is the regex incorrect? For example, would

    mailto:sale-2bbuk-1963163780@craigslist.org?subject=%204X4%20TOYOTA%2013xxxxxxxxx0&body=%0A%0Ahttp%3A%2F%2Flosangeles.craigslist.org%2Flac%2Fcto%2F1374263780.html%0A

match

    (?i)mailto:([^?]++)\?subject

(The address was changed so it doesn't reflect a 'real' one.)
PsaltyDS Posted September 14, 2009

Well, you could just test it and see...

    #include <Array.au3>

    ; Sample link from the question, split across lines for readability
    $sAdx = "mailto:sale-2bbuk-1963163780@craigslist.org?subject=%204X4%20TOYOTA" & _
            "%2013xxxxxxxxx0&body=%0A%0Ahttp%3A%2F%2Flosangeles.craigslist.org" & _
            "%2Flac%2Fcto%2F1374263780.html%0A"
    $sPatt = "(?i)mailto:([^?]++)\?subject"

    ; Flag 3 returns an array of all captured groups; @error is set if nothing matched
    $avRET = StringRegExp($sAdx, $sPatt, 3)
    If @error Then
        MsgBox(16, "Error", "RegExp failed, @error = " & @error)
    Else
        _ArrayDisplay($avRET, "$avRET")
    EndIf
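A test like that only tells you whether the mailto pattern captures the address. A separate possibility, offered as a guess from the posted script and not confirmed anywhere in this thread: `_ParseASCII` keeps only `&#NNN;` entities, so if the listing pages now serve the address as plain text (as in the sample link in the question), it would return an empty string, and the script would print a blank line. The sketch below uses a hypothetical helper, `_DecodeEntities`, that decodes entities where present and leaves plain characters untouched. Both sample inputs are made up for illustration.

```autoit
; Hypothetical helper (not part of the original script): decode &#NNN;
; entities in place and return everything else unchanged.
Func _DecodeEntities($sString)
    Local $aMatch = StringRegExp($sString, '&#(\d+);', 3)
    If @error Then Return $sString ; no entities found: input is already plain text

    For $i = 0 To UBound($aMatch) - 1
        ; Replace every occurrence of this entity with its character.
        $sString = StringReplace($sString, '&#' & $aMatch[$i] & ';', Chr(Number($aMatch[$i])))
    Next
    Return $sString
EndFunc

; Plain-text address: comes back unchanged.
ConsoleWrite(_DecodeEntities('sale-2bbuk-1963163780@craigslist.org') & @LF)

; Entity-encoded prefix (115, 97, 108, 101 = "sale"): prints sale@craigslist.org
ConsoleWrite(_DecodeEntities('&#115;&#97;&#108;&#101;@craigslist.org') & @LF)
```

Printing the raw capture (`$aSrc[0]` in the original script) before decoding it would show which of the two cases actually applies.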
jvanegmond Posted September 14, 2009

Excerpt from craigslist terms of service:

You agree not to: ...
u) use automated means, including spiders, robots, crawlers, data mining tools, or the like to download data from the Service - unless expressly permitted by craigslist;

github.com/jvanegmond