ronburgandy Posted August 21, 2006 Share Posted August 21, 2006 I am just learning about this scripting tool and was wondering if some of you veterans here could enlighten me. I was wondering if it would be possible to take several (possibly thousands) of text files that all have pretty much the same structure and reading each file, pulling the data (text) out of the file and writing it to a spreadsheet, then going to the next text file, writing that data to row #2 of the spreadsheet and so on into a loop until all text files in that folder have been read? Link to comment Share on other sites More sharing options...
blakel Posted August 21, 2006 Share Posted August 21, 2006 Yes. You could write them to a .csv file, XML file, or one of several other formats that excel supports. Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 How does that work? do you tell the script to count x-number of spaces on line#1 then copy the text up to such and such tag or character and paste that in cell A, etc? Link to comment Share on other sites More sharing options...
Valuater Posted August 21, 2006 Share Posted August 21, 2006 i think its better to search all the files you desire for the "info' you want, then when that info is found .... write it to the .csv file.... then open the excell with that file to be opened in a "run" command 8) Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 The info is contained in each file, I already know I want the info to be extracted out of each file. For example, I need all of the information out of file #1 to be copied into row#1 of the spreadsheet, then I need all of the info in file#2 to be inserted/copied into row#2 of the spreadsheet, and so on. My question is more centered around how do I get a script to do this if the text files have some text I don't want inserted such as html tags. All of the files however have the same layout and internal structure, they just have different data. The html files are basically "View, source, save" of web pages of business listings. I have saved all of these as txt files and now want a script to extract the business listings out of the html code from the text files and save them all in a spreadsheet or MDB or some sort of database I can use to sort the listings and build mailing lists from Link to comment Share on other sites More sharing options...
blakel Posted August 21, 2006 Share Posted August 21, 2006 So that we can better anwser your question: 1. Are you new to scripting / programming in general? 2. Are you new to Auto-It? 3. Just don't know how to do this particular task in Auto-It? Link to comment Share on other sites More sharing options...
Valuater Posted August 21, 2006 Share Posted August 21, 2006 (edited) i think you might understand it like this... there is a hard way and an easy way to do this 1 hard way we siffer through the downloaded files and find the info you want... then place it in the correct cell in the correct row and in the correct column of the excell program that is already open 2 easy way we siffer through the downloaded files and find the info you want... then place in a file named "address.csv".. then run excell with that file Since you seem to be new and would need guidence, i strongly suggest the easy way 8) Edited August 21, 2006 by Valuater Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 So that we can better anwser your question:1. Are you new to scripting / programming in general?No2. Are you new to Auto-It?Yes3. Just don't know how to do this particular task in Auto-It?Yes Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 i think you might understand it like this...there is a hard way and an easy way to do this1 hard waywe siffer through the downloaded files and find the info you want... then place it in the correct cell in the correct row and in the correct column of the excell program that is already open2 easy way we siffer through the downloaded files and find the info you want... then place in a file named "address.csv"..then run excell with that fileSince you seem to be new and would need guidence, i strongly suggest the easy way8)Yes, my goal here is to try to get the data into a managable format. i don't care if it is tab-delimited, CSV, excel, mySQL, anything can be converted at that point. Link to comment Share on other sites More sharing options...
Valuater Posted August 21, 2006 Share Posted August 21, 2006 to start ( post a file ) 1 we need to see the data from a file that was downloaded 2 tell us what you want to get ( info ) from that file ( since all files are of the same format ) 8) Link to comment Share on other sites More sharing options...
vim Posted August 21, 2006 Share Posted August 21, 2006 So that we can better anwser your question: 1. Are you new to scripting / programming in general? No 2. Are you new to Auto-It? Yes 3. Just don't know how to do this particular task in Auto-It? Yes On #1 above. Have you done this task in another language, say VBS? If so can you show us that code logic? ViM Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 So that we can better anwser your question:1. Are you new to scripting / programming in general?No2. Are you new to Auto-It?Yes3. Just don't know how to do this particular task in Auto-It?YesOn #1 above. Have you done this task in another language, say VBS?If so can you show us that code logic?ViMNo, unfortunately, I have not - most of my scripting experience is with installation setups and registry editing Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 to start ( post a file )1we need to see the data from a file that was downloaded2tell us what you want to get ( info ) from that file( since all files are of the same format )8)here is the file, the information I want to extract is found under <!-- Business Data Detail -->. I would like to extract as much of these details as possible if I couldexample.txt Link to comment Share on other sites More sharing options...
vim Posted August 21, 2006 Share Posted August 21, 2006 here is the file, the information I want to extract is found under <!-- Business Data Detail -->. I would like to extract as much of these details as possible if I couldIn the original post you said test files, and this is labeled .txt and yet the first line says its an HTML.In any rate, if I rename it from .txt to .htm I see what looks like five sections:* * * Business Information * * ** * * License Status * * ** * * Classifications * * ** * * Bonding Information * * ** * * Workers Compensation Information * * *If correct, is there one section in paticular that your trying to capture, say * * * Business Information * * *?ViM Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 In the original post you said test files, and this is labeled .txt and yet the first line says its an HTML.In any rate, if I rename it from .txt to .htm I see what looks like five sections:* * * Business Information * * ** * * License Status * * ** * * Classifications * * ** * * Bonding Information * * ** * * Workers Compensation Information * * *If correct, is there one section in paticular that your trying to capture, say * * * Business Information * * *?ViMwhen saving the html output from my browser (view, source) I intentionally saved the file as txt instead of htm as I thought raw text would be easier to process. As far as the sections, I want to try to capture all of the information. Each section has entries I would like to be able to save in it's own individual cell - such as under * * * Business Information * * *, I want to save, name, street, city, zip, etc in their own field or cell. Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 (edited) here is the info I want to pull from these files:<!-- Business Data Detail --> <table> <tr><td> <table bgcolor="#FFF8DC" border="0" cellpadding="2" cellspacing="1" width="100%"><tr><td width="100%"> <CENTER>Extract Date: <B>08/20/2006</B><br /></CENTER><CENTER><H3>* * * Business Information * * *</H3>ROB B SMITH<br />509 DIVOT DRIVE<br />CITRUS HEIGHTS, CA 95610<br />Business Phone Number: (916) 555-5555<br /><br />Entity: <B>Sole Ownership</B><br />Issue Date: <B>01/23/1968</B>Â Expire Date: <B>03/31/2007</B></CENTER><H3><CENTER>* * * License Status * * *</CENTER></H3>This license is expired at this time.<H3><CENTER>* * * Classifications * * *</CENTER></H3><CENTER><TABLE BORDER=1><TR><TH id=Class>Class</TH><TH id=Desc>Description</TH></TR><TR><TD headers=Class><B>B</B></TD></TR></TABLE></CENTER><H3><CENTER>* * * Bonding Information * * *</CENTER></H3><CENTER><B>BOND: </B>This license filed Bond number <B>555570</B> in the amount of <B>$5,000</B> with the bonding company <br /><B>BONDING COMPANY OF ATLANTA</A>.</B><br />Effective Date: <B>01/01/1980 </B> Cancellation Date: <B>06/07/2007</B><br /><br /></CENTER><H3><CENTER>* * * Workers Compensation Information * * *</CENTER></H3><CENTER>There was no workers comp information found for this license.<br /></CENTER><CENTER></CENTER><br /><br /> Edited August 21, 2006 by ronburgandy Link to comment Share on other sites More sharing options...
ronburgandy Posted August 21, 2006 Author Share Posted August 21, 2006 sorry so many posts, I'm just trying to illustrate what I would like to see accomplished with this... Link to comment Share on other sites More sharing options...
Valuater Posted August 21, 2006 Share Posted August 21, 2006 (edited) sorry so many posts, I'm just trying to illustrate what I would like to see accomplished with this...can you post the website(s) ( or pm me... you can also give me an IM name )i will try to get this going8) Edited August 21, 2006 by Valuater Link to comment Share on other sites More sharing options...
Valuater Posted August 21, 2006 Share Posted August 21, 2006 I am so glad he chickened out and left the building... I was going to do all of it for him .... what a load off of myself...wehw 8) i wont a a guilt-trip if he returns either..lol Link to comment Share on other sites More sharing options...
bluebearr Posted August 22, 2006 Share Posted August 22, 2006 Well, this mess of a function will take the file given and extract out the desired data: expandcollapse popup; $Modes is text that labels the following text as data that we want $Modes = "License #|Extract Date:|* * * Business Information * * *|Business Phone Number:|Entity:|Issue Date:|Expire Date:" _ & "|* * * License Status * * *|Description|" _ & "This license filed Contractor's Bond number|in the amount of|with the bonding company|" _ & "Effective Date:|Cancellation Date:|* * * Workers Compensation Information * * *" $ModesA = StringSplit($Modes, "|") $Myfile = @DesktopDir & "\example.txt" $theData = _StripData($Myfile) ; write data out to file FileWrite(StringReplace($Myfile, ".txt", "-2.txt"), $theData) MsgBox(0, "Debug", $theData) Func _StripData($file) $data = FileRead($file, FileGetSize($file)) ; Preprocess the data a little to make it easier to deal with. we will delimit all data with a @TAB $data = StringReplace($data, "<TD>License #", "License #" & @TAB) ; - so we can find the license number $data = StringReplace($data, "Business Phone Number:", "Business Phone Number:" & @TAB) ; - so we can find the phone number $tags = "<TD>,</TD>,<TR>,</TR>,</font>,<B>,</B>,<H3>,</H3>,<CENTER>,</CENTER>,</A>," _ & "<br />,</TABLE>,<TH id=Class>,</TH>,<TH id=Desc>,<TD headers=Class>, ," & @LF & "," & @CR $tagsA = StringSplit($tags, ",") For $i = 1 to $tagsA[0] $data = StringReplace($data, $tagsA[$i], @TAB) ; - replace all the tags we can easily identify with @TAB Next $dataA = StringSplit($data, @TAB) ; - make into an array Dim $Mode, $line = 1, $PrettyData, $Address, $Class For $j = 1 to $dataA[0] ; "$Mode" lets us know that we have found a tag that identifies significant data coming up. ; We need to treat some data special, so we use a Select-Case statement If $dataA[$j] <> '' And $Mode <> '' Then ; Ignore blank data Select Case $Mode = "* * * Business Information * * *" If $dataA[$j] <> "Business Phone Number:" Then $Address = $Address & $dataA[$j] & @TAB ; - stack up the address. Assume that all addresses have the ; same number of lines in each file. Otherwise, might want to put commas between lines and have the ; full address go into a single cell in Excel. EndIf Case $Mode = "Business Phone Number:" ; Now that we are at the Phone number, add the address to the saved data $PrettyData = $PrettyData & $Address & $dataA[$j] & @TAB $Mode = '' Case $Mode = "Description" If StringStripWS($dataA[$j], 3) = "* * * Bonding Information * * *" Then ; don't record anymore when we get to this string $Mode = '' Else $Class = $Class & $dataA[$j] & @TAB EndIf Case $Mode = "* * * License Status * * *" $PrettyData = $PrettyData & $dataA[$j] & @TAB $Mode = '' Case $Mode = "This license filed Contractor's Bond number" $TmpA = StringSplit($Class, @TAB) If $TmpA[2] = '' Then ; handle the case where the Description is blank, like in the example, by adding extra tab. $Class = $Class & @TAB EndIf $PrettyData = $PrettyData & $Class & $dataA[$j] & @TAB $Mode = '' Case Else ; All other modes ; $Mode = "NO_MORE" means that we are not looking for any more data. If $Mode <> "NO_MORE" And StringInStr($Modes, $Mode) And Not(StringInStr($Modes, $dataA[$j])) Then $PrettyData = $PrettyData & $dataA[$j] & @TAB If $Mode = "* * * Workers Compensation Information * * *" Then $Mode = "NO_MORE" Else $Mode = '' EndIf EndIf EndSelect EndIf ; See if the mode changes If $Mode <> "NO_MORE" Then For $k = 1 to $ModesA[0] If StringStripWS($dataA[$j], 3) = $ModesA[$k] Then ; We found a data tag. $Mode = $ModesA[$k] EndIf Next EndIf ;ConsoleWrite("Data = " & $dataA[$j] & " " & "$Mode = " & $Mode & @LF) Next Return $PrettyData EndFunc BlueBearrOddly enough, this is what I do for fun. Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now