Pandemic
Posted December 19, 2009

I'm entirely confused... I've been looking at this for a bit, did a lot of debugging, and I'm still baffled. The goal of this script is to go through and download all the xkcd comics, as well as the alt text. It gets the image for the comic perfectly fine, but then it goes to grab the alt text, and it fails horribly. Sometimes it gets too much, sometimes it doesn't get enough. Funny thing is, I used the same method for getting the alt-text as I did for getting the comic URL, and the comic URL hasn't broken yet.

    $num = 1
    $max = 677
    while $num < $max
        ToolTip("" & $num & '/' & $max, 0,0)
        sleep(2000) ;KEEP THIS LINE. DoS attacks are not win.
        If $num <> 404 Then
            InetGet("http://xkcd.com/" & $num & "/", "htm") ;Gets the page code
            $str = FileReadLine("htm", 77) ;Line 77 is the line with the comic image (well, it's actually 75 in Chrome, but AutoIt is 2 lines off for who knows what reason...)
            $str = StringTrimLeft($str, 10) ;Trimming 10 characters eliminates the <img="
            $f = FileOpen("htm", 2) ;write
            FileWriteLine($f, $str)
            FileClose($f)
            $f = FileOpen("htm", 0) ;read
            $k = 0
            $c = " "
            while $c <> '"' ;Ends the URL for the comic
                $c = FileRead($f, 1)
                $k += 1
            WEnd
            ;CODE BREAKS SOMEWHERE AFTER THIS POINT
            $addr = StringLeft($str, $k-1) ;Comic address
            $str = fileread($f)
            $str = StringTrimLeft($str, 8) ;Trims off the title="
            FileWrite($f, $str)
            FileClose($f)
            $f = FileOpen("htm", 0) ;read
            $k = 0
            $c = ""
            while $c <> '"' ;Gets the alt text
                $c = FileRead($f, 1)
                $k += 1
            WEnd
            $alt = StringLeft($str, $k-1)
            FileClose($f)
            $f = fileopen($num & ".txt", 2) ;write
            FileWrite($f, $alt) ;write the alt-text
            FileClose($f)
            MsgBox(0, "Alt Text", "" & $alt & @LF & "Length: " & StringLen($alt))
            If $num > 116 Then ;117 changes from .jpg to .png
                InetGet($addr, $num & ".png")
            Else
                InetGet($addr, $num & ".jpg")
            EndIf
        EndIf
        $num += 1
    WEnd
    FileDelete("htm")

On a sidenote, if you know of a string function that scans the string for a character, please let me know. I looked for a bit, but then figured that I need to download the comic's HTML code anyway, so I might as well keep using files.

-Pandemic
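On the side question: AutoIt's StringInStr returns the 1-based position of a substring within a string (0 if it is not found), which would allow the whole extraction to happen in memory instead of through the "htm" file. That may also sidestep the likely cause of the symptom, judging only from the script as posted: the FileWrite($f, $str) after the first loop goes to a handle opened with FileOpen("htm", 0), i.e. read mode, so the write fails and the file still starts with the comic URL. The second loop would then count the length of the URL again, and $alt would be cut to that length, which is too long for some comics and too short for others. Below is a minimal sketch of the in-memory approach. _Between() is a hypothetical helper (not a built-in), and the sample line is made up, so the real markup on xkcd.com may differ.

```autoit
; Sketch only: pull the image URL and title text out of one line, entirely in memory.
; _Between() is a hypothetical helper, not part of AutoIt itself.
Func _Between($sText, $sStart, $sEnd)
    Local $iFrom = StringInStr($sText, $sStart) ; 1-based position, 0 if not found
    If $iFrom = 0 Then Return SetError(1, 0, "")
    ; Everything after the start marker (StringMid without a count returns the rest)
    Local $sRest = StringMid($sText, $iFrom + StringLen($sStart))
    Local $iEnd = StringInStr($sRest, $sEnd) ; position of the closing marker
    If $iEnd = 0 Then Return SetError(2, 0, "")
    Return StringLeft($sRest, $iEnd - 1) ; text between the two markers
EndFunc

; Made-up sample line, standing in for the result of FileReadLine("htm", 77)
Local $sLine = '<img src="http://example.com/comic.png" title="Some alt text" alt="Comic" />'

Local $sAddr = _Between($sLine, 'src="', '"')   ; comic address
Local $sAlt = _Between($sLine, 'title="', '"')  ; alt/title text
ConsoleWrite($sAddr & @CRLF & $sAlt & @CRLF)
```

In the original script, $sLine would presumably be the untrimmed result of FileReadLine("htm", 77), and $sAddr and $sAlt would take the place of $addr and $alt. Searching for the src=" and title=" markers also removes the dependence on fixed StringTrimLeft counts, though it still assumes the image tag sits on a known line.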
PsaltyDS
Posted December 19, 2009

Ya'know, Randall Munroe releases his comics under Copyleft (CC Attribution-NonCommercial), so there may not be any issue with downloading the comics, but he supports the site by selling XKCD collections 'n stuff, too... and it is close to Christmas. I'm just say'n.

Valuater's AutoIt 1-2-3, Class... Is now in Session!
For those who want somebody to write the script for them: RentACoder
"Any technology distinguishable from magic is insufficiently advanced." -- Geek's corollary to Clarke's law