Alterego Posted February 25, 2005

This is my [current] method of downloading a web page and publishing an RSS feed for it. A lot of it is reusable, mostly the parts that were tedious and took the longest, for example the layout of the RSS feed. I used the Harvard copy of the spec.

It seems to me that, with a GUI, it would be possible to have AutoIt go in and try to automatically generate RSS and Atom feeds from a given web page by searching for overall patterns. There could be a text input box for any given field (see the tags below) where the user could copy/paste "hints" for the program to base its search for patterns on. I'll ruminate on this for a while. Any pointers?

    ;;; suggest viewing in autoit3 so it looks pretty :)
    #include <File.au3>
    #include <Array.au3>
    #include <Date.au3>
    #NoTrayIcon

    Do
        ;;; how many days since the sql dump the last time we checked?
        ;;; reads the last line of the log file, which has the format
        ;;;   xx MO-DA-YEAR HOUR:MIN:SEC   where xx is the number of days since the last update
        ;;; (FileReadLine opens and closes the file itself when given a filename)
        $lines = _FileCountLines(@HomeDrive & "\Qwikly\wikipediadownload.txt")
        $lastRead = FileReadLine(@HomeDrive & "\Qwikly\wikipediadownload.txt", $lines)
        $lastReportedDaysSinceUpdate = StringLeft($lastRead, StringInStr($lastRead, " ", 0, 1))

        ;;; how many days since the sql dump now?
        FileDelete(@TempDir & "\index.html") ;; delete the file from last time
        InetGet("http://download.wikimedia.org/", @TempDir & "\index.html", 1, 0)

        ;;; sticks the web page on the clipboard for easy processing
        $a = ""
        _FileReadToArray(@TempDir & "\index.html", $a)
        _ArrayToClip($a)

        ;;; searches the clipboard for the occurrence of "Last dump made: " and isolates
        ;;; the 10-character date that follows it
        $h = StringTrimRight(StringTrimLeft(ClipGet(), StringInStr(ClipGet(), "Last dump made: ", 1) + 15), _
                StringLen(StringTrimLeft(ClipGet(), StringInStr(ClipGet(), "Last dump made: ", 1) + 15)) - 10)

        ;;; some string manipulation to convert the date to the format _DateDiff asks for
        $lastDump = _DateDiff('D', StringLeft($h, 4) & "/" & StringLeft(StringTrimLeft($h, 5), 2) & "/" & StringTrimLeft($h, 8), _
                @YEAR & "/" & @MON & "/" & @MDAY)

        ;;; appends to our log file: xx MO-DA-YEAR HOUR:MIN:SEC
        FileWrite(@HomeDrive & "\Qwikly\wikipediadownload.txt", _
                $lastDump & " " & @MON & "-" & @MDAY & "-" & @YEAR & " " & @HOUR & ":" & @MIN & ":" & @SEC & @CRLF)

        ;;; compares the number of days since a dump found in our log file with the current
        ;;; number found on the webpage. if the webpage count is smaller than the log file
        ;;; count, a new dump has appeared, so publish a new rss feed
        If $lastDump < $lastReportedDaysSinceUpdate Then
            ;;; the timestamp portion of the last log line becomes lastBuildDate
            $lastBuildDate = StringTrimLeft($lastRead, StringInStr($lastRead, " ", 0, -2))
            $hRss = FileOpen(@HomeDrive & "\Qwikly\wikipediadownload.rss", 2) ;; mode 2 = overwrite
            FileWrite($hRss, _
                '<?xml version="1.0"?>' & @CRLF & _
                '<rss version="2.0">' & @CRLF & _
                '  <channel>' & @CRLF & _
                '    <title>Wikipedia database download</title>' & @CRLF & _
                '    <link>http://download.wikimedia.org</link>' & @CRLF & _
                '    <description>SQL database dumps on download.wikimedia.org have historically updated approximately twice weekly, but updates are currently biweekly to monthly.</description>' & @CRLF & _
                '    <language>en-us</language>' & @CRLF & _
                '    <copyright>All text is available under the terms of the GNU Free Documentation License</copyright>' & @CRLF & _
                '    <ttl>150</ttl>' & @CRLF & _
                '    <pubDate>' & @YEAR & "/" & @MON & "/" & @MDAY & " " & @HOUR & ":" & @MIN & ":" & @SEC & " Mountain Time" & '</pubDate>' & @CRLF & _
                '    <lastBuildDate>' & $lastBuildDate & '</lastBuildDate>' & @CRLF & _
                '    <docs>http://www.wikipedia.org/wiki/Wikipedia:Database_download</docs>' & @CRLF & _
                '    <generator>Qwikly.com</generator>' & @CRLF & _
                '    <managingEditor>reflection+qwikly@gmail.com</managingEditor>' & @CRLF & _
                '    <webMaster>simple@qwikly.com</webMaster>' & @CRLF & _
                '    <item>' & @CRLF & _
                '      <title>New SQL dump detected at ' & @HOUR & ":" & @MIN & " on " & @MON & "-" & @MDAY & "-" & @YEAR & '</title>' & @CRLF & _
                '      <link>http://download.wikimedia.org</link>' & @CRLF & _
                '      <description>All original textual content is licensed under the GNU Free Documentation License. Text written by some authors may be released under additional licenses or into the public domain. Some text (including quotations) may be used under fair use, usually where it is believed that the use will also be fair dealing outside the USA. Note that material used as "fair use" under United States law may not be legal to reproduce outside the US. See Fair use for more information.</description>' & @CRLF & _
                '      <pubDate>' & @YEAR & "/" & @MON & "/" & @MDAY & " " & @HOUR & ":" & @MIN & ":" & @SEC & " Mountain Time" & '</pubDate>' & @CRLF & _
                '    </item>' & @CRLF & _
                '  </channel>' & @CRLF & _
                '</rss>')
            FileClose($hRss)
            ;;; curl seems to be a good option for uploading via ftp...
            RunWait(@ComSpec & ' /c curl -T ' & @HomeDrive & '\Qwikly\wikipediadownload.rss -u user:pass ftp://blah.com/public_html/', @HomeDrive & '\Qwikly\', @SW_HIDE)
        EndIf

        ;;; sleep for 2 1/2 hours before we download the web page and calculate again
        Sleep(9000000)
    Until 1 = 0
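[Editor's note] The "hints" idea in the post — let the user paste a marker string like "Last dump made: " and have the program locate the text that follows it, then wrap the result in an RSS item — can be sketched roughly as follows. This is Python rather than AutoIt, purely for illustration; the function names and the sample page are hypothetical, and the slicing mirrors the script's StringInStr/StringTrimLeft approach.

```python
import xml.sax.saxutils as su

def extract_after_hint(page_text, hint, length=10):
    """Find the user-supplied hint string and return the `length`
    characters that follow it (None if the hint is absent)."""
    idx = page_text.find(hint)
    if idx == -1:
        return None
    start = idx + len(hint)
    return page_text[start:start + length]

def rss_item(title, link, description):
    """Build a minimal RSS 2.0 <item>, escaping XML-reserved characters."""
    return ("  <item>\n"
            "    <title>" + su.escape(title) + "</title>\n"
            "    <link>" + su.escape(link) + "</link>\n"
            "    <description>" + su.escape(description) + "</description>\n"
            "  </item>")

# Hypothetical page snippet standing in for download.wikimedia.org's index
page = "<html><body>Last dump made: 2005-02-20 (SQL)</body></html>"
date = extract_after_hint(page, "Last dump made: ")
print(date)  # 2005-02-20
print(rss_item("New SQL dump on " + date,
               "http://download.wikimedia.org",
               "A new dump was detected."))
```

A GUI would only need one hint box per feed tag; each hint feeds `extract_after_hint`, and the results are assembled into the channel.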
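[Editor's note] The script's update check — compare the day count stored on the last log line with a freshly computed one, and publish when the fresh count is smaller — can be sketched like this (Python for illustration; the `xx MO-DA-YEAR HOUR:MIN:SEC` layout follows the log format described in the script's comments, and the dates are made up):

```python
from datetime import date

def parse_log_line(line):
    """Split 'xx MO-DA-YEAR HOUR:MIN:SEC' into (days_since_update, timestamp)."""
    days, _, stamp = line.partition(" ")
    return int(days), stamp

def days_since(dump_date, today):
    """Whole days between the dump date and today (the _DateDiff('D', ...) step)."""
    return (today - dump_date).days

last_days, last_stamp = parse_log_line("9 02-16-2005 04:30:00")
current_days = days_since(date(2005, 2, 24), date(2005, 2, 25))
# A smaller current count means a fresher dump appeared since the last check.
if current_days < last_days:
    print("new dump detected")  # prints: new dump detected
```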
Alterego (Author) Posted February 25, 2005

Thanks Insolence and Mhz for helping with readability.
Insolence Posted February 25, 2005

Great script, and thanks for mentioning me.
DirtyBanditos Posted February 25, 2005

Quoting Alterego: "Thanks Insolence and Mhz for helping with readability"

Hi Alterego, good job, thank you.
steveR Posted February 26, 2005

Nice work!