ParoXsitiC Posted September 9, 2007

I am making a script that will take a craigslist posting, parse the car's brand, model, year, etc., and then do an automatic lookup of edmunds.com's True Market Value. This way I can instantly get an idea of how good a deal it is.

I am having a problem parsing out the years because some people put 88 or 94 instead of 1988 and 1994. Right now I am just working on making it check the start of the listing for \d\d. However, because of compatibility problems with older scripts I use, I have to resort to version 3.1.1.132. I think this is how it matches the start and end of a word:

    \<  Match beginning of word.
    \>  Match end of word.

Here is my script so far:

    #include <INet.au3>

    For $index = 1 To 1
        $Source = _INetGetSource('http://detroit.craigslist.org/car/index' & $index & '00.html')
        $Listings = StringRegExp($Source, '<p>(.*?)</p>', 3)

        Dim $CarDatabase[100][2]
        $CarDatabase[0][1] = "ford"
        $CarDatabase[0][0] = "mustang"

        Dim $Cars[UBound($Listings)][9]
        For $i = 0 To UBound($Listings) - 1
            $Link = ""
            $City = ""
            $Year = ""
            $Brand = ""
            $Model = ""
            $Price = ""
            $TMV_Low = ""
            $TMV_High = ""
            $Bargin = ""

            $LinkSearch = StringRegExp($Listings[$i], '<a href="(.*?)">', 3)
            If @extended Then $Link = $LinkSearch[0]

            $CitySearch = StringRegExp($Listings[$i], '<font size="-1"> (.*?)</font>', 3)
            If @extended Then $City = $CitySearch[0]

            $YearSearch = StringRegExp($Listings[$i], '(19\d\d)|(200\d)\D', 3)
            If @extended Then
                $Year = $YearSearch[0]
            Else
                $YearSearch = StringRegExp($Listings[$i], '(\d\d)\D', 3)
                If @extended Then
                    If StringRegExp($YearSearch[0], '(0\d)', 0) Then
                        $Year = '20' & $YearSearch[0]
                    Else
                        $Year = '19' & $YearSearch[0]
                    EndIf
                EndIf
            EndIf

            For $ii = 0 To UBound($CarDatabase, 1) - 1
                If StringInStr($Listings[$i], $CarDatabase[$ii][0]) Then
                    $Brand = $CarDatabase[$ii][1]
                    $Model = $CarDatabase[$ii][0]
                    $Source = _INetGetSource('http://www.edmunds.com/used/' & $Year & '/' & $Brand & '/' & $Model & '/')
                    $TMV_LowSearch = StringRegExp($Source, 'Dealer Retail:<b> $(.*?) - ', 3)
                    If @extended Then $TMV_Low = $TMV_LowSearch[0]
                    $TMV_HighSearch = StringRegExp($Source, ' - $(.*?)</b></font>', 3)
                    If @extended Then $TMV_High = $TMV_HighSearch[0]
                    ExitLoop
                EndIf
            Next

            $PriceSearch = StringRegExp($Listings[$i], ' - $(\d*)', 3)
            If @extended Then $Price = $PriceSearch[0]

            $Cars[$i][0] = $Link
            $Cars[$i][1] = $City
            $Cars[$i][2] = $Year
            $Cars[$i][3] = $Brand
            $Cars[$i][4] = $Model
            $Cars[$i][5] = $Price
            $Cars[$i][6] = $TMV_Low
            $Cars[$i][7] = $TMV_High
            $Cars[$i][8] = $Bargin
        Next
    Next

    $file = FileOpen("Output.html", 2)
    FileWriteLine($file, '<table>')
    FileWriteLine($file, '<tr><td>Listing</td><td>Year</td><td>Car Brand</td><td>Car Model</td><td>Price</td><td>TMV Low</td><td>TMV High</td></tr>')
    For $i = 0 To UBound($Listings) - 1
        FileWriteLine($file, '<tr><td><a href="' & $Cars[$i][0] & '">' & $Listings[$i] & '</a></td><td>' & $Cars[$i][2] & '</td><td>' & $Cars[$i][3] & '</td><td>' & $Cars[$i][4] & '</td><td>' & $Cars[$i][5] & '</td><td>' & $Cars[$i][6] & '</td><td>' & $Cars[$i][7] & '</td></tr>')
    Next
    FileWriteLine($file, '</table>')
    FileClose($file)

It's at a very early stage, as you can see. The problem lies here:

    $YearSearch = StringRegExp($Listings[$i], '(19\d\d)|(200\d)\D', 3)
    If @extended Then
        $Year = $YearSearch[0]
    Else
        $YearSearch = StringRegExp($Listings[$i], '(\d\d)\D', 3)
        If @extended Then
            If StringRegExp($YearSearch[0], '(0\d)', 0) Then
                $Year = '20' & $YearSearch[0]
            Else
                $Year = '19' & $YearSearch[0]
            EndIf
        EndIf
    EndIf

Basically it searches for a 19xx or 200x, and if it doesn't find that, it tries to look for just \d\d\D (digit, digit, non-digit). Obviously this messes up because the price is in the listing as well, so I need it to search only the very beginning of the string. I just had an idea: I could remove the price from the search string first, and then the only numbers left should be years.
Nonetheless, I would still like to know how to search the beginning of a string in AutoIt 3.1.1.132. I know newer versions use \A and \z, I think.
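The price-stripping idea from the post can also be sketched without any regular expression, which sidesteps the question of which anchors 3.1.1.132 supports. This is only a rough sketch: `_GuessYear` is a hypothetical helper name, the sample title is made up, and it assumes the input is the plain title text with the price written as " - $1500", as the price pattern in the script suggests.

```autoit
; Hypothetical helper: guess a model year from the start of a listing title.
; Assumes $sTitle is plain text such as "88 Ford Mustang - $1500" (made-up sample).
Func _GuessYear($sTitle)
    ; Cut the title at the price so its digits cannot be mistaken for a year.
    Local $iPricePos = StringInStr($sTitle, ' - $')
    If $iPricePos > 0 Then $sTitle = StringLeft($sTitle, $iPricePos - 1)

    ; A four-digit year starting with 19 or 20 at the very beginning wins.
    Local $sFour = StringLeft($sTitle, 4)
    If StringLen($sFour) = 4 And StringIsDigit($sFour) Then
        If StringLeft($sFour, 2) = '19' Or StringLeft($sFour, 2) = '20' Then Return $sFour
    EndIf

    ; Otherwise accept exactly two leading digits followed by a non-digit.
    Local $sTwo = StringLeft($sTitle, 2)
    If StringLen($sTwo) = 2 And StringIsDigit($sTwo) And Not StringIsDigit(StringMid($sTitle, 3, 1)) Then
        ; Same century rule as the script: 0x becomes 200x, anything else 19xx.
        If StringLeft($sTwo, 1) = '0' Then Return '20' & $sTwo
        Return '19' & $sTwo
    EndIf

    ; No year found at the start of the title.
    Return ''
EndFunc

MsgBox(0, 'Year', _GuessYear('88 Ford Mustang - $1500'))
```

Because it only looks at the first few characters, a number later in the title (mileage, price, phone number) cannot be picked up as the year.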
ParoXsitiC (Author) Posted September 9, 2007

Never mind, I forgot that I changed the listing from just the text to the whole line, including the HTML, so >\d\d works.
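For anyone landing here later, the `>\d\d` trick might look roughly like the sketch below. The listing line is a made-up sample, not real craigslist markup, and the `IsArray` check is a substitution for the `@extended` test used in the script, chosen because it does not depend on how a given AutoIt version sets `@extended`. The pattern itself uses only constructs the script already relies on.

```autoit
; Made-up sample of a whole listing line, HTML included.
$sLine = '<a href="/car/123456789.html">88 Ford Mustang - $1500</a>'

; The '>' pins the match to the text right after a tag, i.e. the start of the title.
; The trailing \D rejects longer digit runs such as a four-digit year.
$aYear = StringRegExp($sLine, '>(\d\d)\D', 3)

If IsArray($aYear) Then
    ; Same century rule as the script: 0x becomes 200x, anything else 19xx.
    If StringLeft($aYear[0], 1) = '0' Then
        $sYear = '20' & $aYear[0]
    Else
        $sYear = '19' & $aYear[0]
    EndIf
    MsgBox(0, 'Year', $sYear)
EndIf
```

Since the price sits in the middle of the title rather than right after a `>`, it no longer competes with the year for the match.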