phatzilla

Need some help with my E-mail Saver Script

7 posts in this topic

#1 ·  Posted (edited)

Hey there guys, I have a Craigslist email saver script from a while ago. I tried to fire her up and it's not working:

#include <Array.au3>
#include <HTTP.au3>
#include <INet.au3>
Func _ParseASCII($sString)
    Local $aMatch = StringRegExp($sString, '&#(\d++);', 3)
    Local $sTmp = ''
   
    For $i = 0 To UBound($aMatch)-1
        $sTmp &= Chr($aMatch[$i])
    Next
   
    Return $sTmp
EndFunc

Dim $sHost = 'losangeles.craigslist.org'
Dim $sPage = '/search/sss?query=toyota'
Dim $sSource = ''
Dim $sHTTP


Dim $sHTTP = _HTTPConnect($sHost)
    If @error Then Exit
   
_HTTPGet($sHost, $sPage, $sHTTP)
    If @error Then
        ConsoleWrite(@error & @TAB & @extended & @LF)
        _HTTPClose($sHTTP)
        Exit
    EndIf

Dim $sSource = _HTTPRead($sHTTP)
Dim $aMatches = StringRegExp($sSource, '(?i)<p>[^<]*+<a\s++href="([^"]++)"', 3)

If IsArray($aMatches) Then
    Local $aSrc, $sMailto
   
    For $i = 0 To UBound($aMatches)-1
        $aSrc = StringRegExp(_INetGetSource($sHost & $aMatches[$i]), '(?i)mailto:([^?]++)\?subject',  1)
        If IsArray($aSrc) Then
            $sMailto = _ParseASCII($aSrc[0])
            ConsoleWrite($sMailto & @LF)
        EndIf
    Next
EndIf

_HTTPClose($sHTTP)

It's supposed to write every Toyota ad's email address to the console, but when I run it, it just writes blank lines. It worked before and I haven't really changed anything, so I don't know exactly what's up. I think it may have something to do with the regex (the one that checks the email link). Can somebody check?

Edited by phatzilla

Hey there guys, I have a Craigslist email saver script from a while ago. I tried to fire her up and it's not working:

<snip>

It's supposed to write every Toyota ad's email address to the console, but when I run it, it just writes blank lines. It worked before and I haven't really changed anything, so I don't know exactly what's up. I think it may have something to do with the regex (the one that checks the email link). Can somebody check?

Is that kind of botting consistent with the owner's terms of use? Does the site post a robots.txt that you are ignoring? Did they perhaps implement some anti-bot defensive techniques?

:D


Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law

This is for educational purposes. Am I doing something wrong? I'm serious.

... am i doing something wrong?

Not necessarily, you just want to check on the web site owner's policy before botting their site. And one possible reason for such a working script to STOP working would be a defensive response from the site admin.
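One way to check the robots.txt part is simply to download the file and read it. This is only a minimal sketch: the URL is an assumption (plain HTTP, same host as in the original script), and it uses `_INetGetSource` from `INet.au3`, which the original script already includes. It only prints the file; interpreting the rules, and reading the site's terms of use, is still up to you.

```autoit
#include <INet.au3>

; Assumption: robots.txt is served over plain HTTP at the root of the host
; used in the original script. Adjust the scheme or host if that is not the case.
Local $sRobotsUrl = 'http://losangeles.craigslist.org/robots.txt'

; _INetGetSource returns the page body as a string and sets @error on failure.
Local $sRobots = _INetGetSource($sRobotsUrl)
If @error Or $sRobots = '' Then
    ConsoleWrite('Could not retrieve ' & $sRobotsUrl & @LF)
    Exit
EndIf

; Print the file so the Disallow rules can be read before any crawling is attempted.
ConsoleWrite($sRobots & @LF)
```

If the request itself fails or comes back empty for a URL that loads fine in a browser, that may also hint at the kind of defensive response mentioned above.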

:D


Well, I am not going to make any of the information available. It will be discarded, and I'm quite sure that falls within the ToS.

Is the regex incorrect?

For example, would

mailto:sale-2bbuk-1963163780@craigslist.org?subject=%204X4%20TOYOTA%2013xxxxxxxxx0&body=%0A%0Ahttp%3A%2F%2Flosangeles.craigslist.org%2Flac%2Fcto%2F1374263780.html%0A

fall under

(?i)mailto:([^?]++)\?subject

(The address was changed so it doesn't reflect a 'real' one.)

Well, I am not going to make any of the information available. It will be discarded, and I'm quite sure that falls within the ToS.

Is the regex incorrect?

For example, would

mailto:sale-2bbuk-1963163780@craigslist.org?subject=%204X4%20TOYOTA%2013xxxxxxxxx0&body=%0A%0Ahttp%3A%2F%2Flosangeles.craigslist.org%2Flac%2Fcto%2F1374263780.html%0A

fall under

(?i)mailto:([^?]++)\?subject

(The address was changed so it doesn't reflect a 'real' one.)

Well, you could just test it and see...
#include <Array.au3>

$sAdx = "mailto:sale-2bbuk-1963163780@craigslist.org?subject=%204X4%20TOYOTA" & _
        "%2013xxxxxxxxx0&body=%0A%0Ahttp%3A%2F%2Flosangeles.craigslist.org" & _
        "%2Flac%2Fcto%2F1374263780.html%0A"
$sPatt = "(?i)mailto:([^?]++)\?subject"
$avRET = StringRegExp($sAdx, $sPatt, 3)

If @error Then
    MsgBox(16, "Error", "RegExp failed, @error = " & @error)
Else
    _ArrayDisplay($avRET, "$avRET")
EndIf

:D


Excerpt from craigslist terms of service:

You agree not to:

...

u) use automated means, including spiders, robots, crawlers, data mining tools, or the like to download data from the Service - unless expressly permitted by craigslist;
