Kate Posted June 13, 2007 Share Posted June 13, 2007 Hello all, Ive been using AutoIT for a few years now and Ive run into a project where Im not sure AutoIT is the right solution. It could be that Im not using AutoIT as effectively as I possible. Anyway, let me tell you a bit about what Im trying to do and how Ive approached it with AutoIT: I work at a hospital and my project involves a daily report from our patient information system. A Long Time Ago, this report was printed on a line printer. Now, of course, the report must be handled electronically. If you open the report (its a .txt file) it each page is delineated by an ASCII form feed. Each page has a nice header with the patients information. My mission is to take the original report and divide it up into two files based on a piece of information called Financial Class, which is included in the header. Heres how Ive approached it with AutoIT: I have a loop that (using FileReadLine) looks for the ASCII form feeds. I use the location of the form feeds to determine a page. Once I get a range of lines defined (e.g. lines 59 thru 117=one page), I pass that information to a function. This function (another loop using FileReadLine) finds the header line within the page and using StringMid, extracts the Financial Class. Then using a case statement, I output lines 59 through 117 using FileWriteLine to the appropriate file. Then its back to the first loop to find the next page and so on. Ive successfully written a script that does this, but its performance is not ideal. Many of these reports I need to process are lengthy the one Im working with for testing is 350 pages. It takes my script an hour to process it. My questions are: 1) Am I using AutoIT optimally here in other words, can I write a better script that doesnt have the performance issues (maybe one that doesnt need to read every single line)? 2) Should I be looking at another programming language (perl maybe?) or program to do this? Thanks in advance, Kate Link to comment Share on other sites More sharing options...
Moderators SmOke_N Posted June 13, 2007 Moderators Share Posted June 13, 2007 If you can provide a "dummy" txt file, and show the exact output you are looking for, myself/someone will more than likely show you how to effectively do it with StringRegExp(). Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer. Link to comment Share on other sites More sharing options...
ptrex Posted June 13, 2007 Share Posted June 13, 2007 @All Look at my signature and see LOG PARSER examples. This might be a better way of dealing with this. regards ptrex Contributions :Firewall Log Analyzer for XP - Creating COM objects without a need of DLL's - UPnP support in AU3Crystal Reports Viewer - PDFCreator in AutoIT - Duplicate File FinderSQLite3 Database functionality - USB Monitoring - Reading Excel using SQLRun Au3 as a Windows Service - File Monitor - Embedded Flash PlayerDynamic Functions - Control Panel Applets - Digital Signing Code - Excel Grid In AutoIT - Constants for Special Folders in WindowsRead data from Any Windows Edit Control - SOAP and Web Services in AutoIT - Barcode Printing Using PS - AU3 on LightTD WebserverMS LogParser SQL Engine in AutoIT - ImageMagick Image Processing - Converter @ Dec - Hex - Bin -Email Address Encoder - MSI Editor - SNMP - MIB ProtocolFinancial Functions UDF - Set ACL Permissions - Syntax HighLighter for AU3ADOR.RecordSet approach - Real OCR - HTTP Disk - PDF Reader Personal Worldclock - MS Indexing Engine - Printing ControlsGuiListView - Navigation (break the 4000 Limit barrier) - Registration Free COM DLL Distribution - Update - WinRM SMART Analysis - COM Object Browser - Excel PivotTable Object - VLC Media Player - Windows LogOnOff Gui -Extract Data from Outlook to Word & Excel - Analyze Event ID 4226 - DotNet Compiler Wrapper - Powershell_COM - New Link to comment Share on other sites More sharing options...
Kate Posted June 13, 2007 Author Share Posted June 13, 2007 If you can provide a "dummy" txt file, and show the exact output you are looking for, myself/someone will more than likely show you how to effectively do it with StringRegExp(). Sure...at the bottom of this post are two dummy pages of the original - the only change I made was to label the form feeds since they didn't show up with the paste. As for output? The output formatting will be the same. What I need my script to do is look at the header: PT NAME PATIENT NO PT FC REP RC REASON PRODUCED HOSP PAGE ...and look at the number under FC (in my code example below, the FC on the first page is 7 and on the second page it's 6.) Any page with FC=7 will need to be output to one file and any page with FC=6 will need to be output to another file. Again, these reports can be lengthly, so the script will need to append to the output files as it goes. Does that make sense? My script does this already, but it just isn't very efficient. :| Thanks, Kate expandcollapse popupPT NAME PATIENT NO PT FC REP RC REASON PRODUCED HOSP PAGE XXXXXXXXXXXXXXXXXXXXXXXXX 99999999999-9 X 7 999 X XXXXXXXX 99/99/99 XXXX 9999 ADM 99/99/99 MR# XXXXXXXXXXXX BD 99/99/99 SEX X SS# 999-99-9999 DISP XXX DSCH 99/99/99 DR XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX MAR ST XXXXXXXXXXXX _______________________________________________________________________________ GUARANTOR# 9999999999 PH1 99999999 999-9999 X99999 PH2 99999999 999-9999 X99999 XXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XX 99999-9999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX _______________________________________________________________________________ EMP NAME XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX PHONE 99999999 999-9999 X99999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XX 99999-9999 CONTACT XXXXXXXXXXXXXXXXXXXXXXX _______________________________________________________________________________ FBL 99/99/99 CR X CA X DIST XXX STMTS 999 ST CD X CTRCT AMT 9999.00 PER X XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX 9999,999.00 9999,999.00 9999,999.00 9999,999.00 XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXX 9999,999.00 9999,999.00 9999,999.00 9999,999.00 999,999.00 _______________________________________________________________________________ XX UNBL9999,999.00 XX UNBL9999,999.00 XX UNBL9999,999.00 XX UNBL9999,999.00 PT UNBL9999,999.00 TOT CHGS UNBILLED 9999,999.00 CURRENT BALANCE PT CURR BALANCE 9999,999.00 I1 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I2 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I3 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I4 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 TOT CURR ACCOUNT BAL: 9999,999.00 _______________________________________________________________________________ POSTDATE SVC DATE SVC CD DESCRIPTION AMOUNT BALANCE 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 <THERE IS A FORM FEED HERE> PT NAME PATIENT NO PT FC REP RC REASON PRODUCED HOSP PAGE XXXXXXXXXXXXXXXXXXXXXXXXX 99999999999-9 X 6 999 X XXXXXXXX 99/99/99 XXXX 9999 ADM 99/99/99 MR# XXXXXXXXXXXX BD 99/99/99 SEX X SS# 999-99-9999 DISP XXX DSCH 99/99/99 DR XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX MAR ST XXXXXXXXXXXX _______________________________________________________________________________ GUARANTOR# 9999999999 PH1 99999999 999-9999 X99999 PH2 99999999 999-9999 X99999 XXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XX 99999-9999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX _______________________________________________________________________________ EMP NAME XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX PHONE 99999999 999-9999 X99999 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XX 99999-9999 CONTACT XXXXXXXXXXXXXXXXXXXXXXX _______________________________________________________________________________ FBL 99/99/99 CR X CA X DIST XXX STMTS 999 ST CD X CTRCT AMT 9999.00 PER X XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX 9999,999.00 9999,999.00 9999,999.00 9999,999.00 XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXXX XXXXXXXXXXXXXX 9999,999.00 9999,999.00 9999,999.00 9999,999.00 999,999.00 _______________________________________________________________________________ XX UNBL9999,999.00 XX UNBL9999,999.00 XX UNBL9999,999.00 XX UNBL9999,999.00 PT UNBL9999,999.00 TOT CHGS UNBILLED 9999,999.00 CURRENT BALANCE PT CURR BALANCE 9999,999.00 I1 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I2 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I3 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 I4 X XXX XXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXX XXXXXXXX 9999,999.00 TOT CURR ACCOUNT BAL: 9999,999.00 _______________________________________________________________________________ POSTDATE SVC DATE SVC CD DESCRIPTION AMOUNT BALANCE 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 9999999 XXXXXXXXXXXXXXXXXXXXXXXXXX 9999,999.00 9999,999.00 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 99/99/99 99/99/99 * XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 999999 <THERE IS A FORM FEED HERE> Link to comment Share on other sites More sharing options...
Moderators SmOke_N Posted June 13, 2007 Moderators Share Posted June 13, 2007 Fun times:;Param 1 = File to read and parse for info ;Param 2 = Output file if 6 is found ;Param 3 = Output file if 7 is found _OutPutFiles('MainFileToRead.log', 'If_6_Is_Found_Out_FileWrite_To.log', 'If_7_Is_Found_Out_FileWrite_To.log') Func _OutPutFiles($sString, $sSixOutFile, $sSevenOutFile) $sString = FileRead($sString) ;Parse Main File with Form Feed Local $aGroupHeaders = StringRegExp($sString, '(?s)(?i)(\s{4}PT NAME\s+PATIENT NO.*?)\f', 3) For $iCC = 0 To UBound($aGroupHeaders) -1 $aFC = StringRegExp($aGroupHeaders[$iCC], '(?s)(?i)\s+PT NAME\s+PATIENT NO.*?\r\n.*?\s\d+-*.*?\s{2}(\d+)\s', 3) If IsArray($aFC) Then If $aFC[0] = '6' Then FileWrite($sSixOutFile, $aGroupHeaders[$iCC] & @CRLF & Chr(12)) ElseIf $aFC[0] = '7' Then FileWrite($sSevenOutFile, $aGroupHeaders[$iCC] & @CRLF & Chr(12)) Else FileWrite('OutFileErrorLog.log', '--------- Start Error (Wrong Number Found) ----------- ' & _ @MON & '/' & @MDAY & '/' & @YEAR & ':' & @HOUR & ':' & @MIN & ':' & @SEC & @CRLF & _ $aGroupHeaders[$iCC] & @CRLF & '--------- End Error -----------' & @CRLF) EndIf Else FileWrite('OutFileErrorLog.log', '--------- Start Error (No Number Found) ----------- ' & _ @MON & '/' & @MDAY & '/' & @YEAR & ':' & @HOUR & ':' & @MIN & ':' & @SEC & @CRLF & _ $aGroupHeaders[$iCC] & @CRLF & '--------- End Error -----------' & @CRLF) EndIf Next Return '' EndFunc Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer. Link to comment Share on other sites More sharing options...
Kate Posted June 14, 2007 Author Share Posted June 14, 2007 Wow, that's awesome, thank you. I've only just begun reading about regular expressions and am quite glad to have help using them with this file...it's not safe for regular expression beginners...If I may summarize what your script does to make sure I understand:1. The input and two output files are defined and passed to a function.2. The function begins by storing the whole input file as one big string in a variable ($sString)3. Then StringRegExp goes hunting for the headers in $sString, storing the array of matches in the $aGroupHeaders variable.4. The For loop begins processing all the entries in the $aGroupHeaders array one at a time5. StringRegExp then goes hunting for the FC number and stores the result in $aFC6. An If/Then statement handles output to each file depending on the value of $aFC and error handling ensues.The only part I don't understand is how does the script know to output the entire page and not just the header line?Thank you again,Kate Link to comment Share on other sites More sharing options...
Moderators SmOke_N Posted June 14, 2007 Moderators Share Posted June 14, 2007 Wow, that's awesome, thank you. I've only just begun reading about regular expressions and am quite glad to have help using them with this file...it's not safe for regular expression beginners...If I may summarize what your script does to make sure I understand:1. The input and two output files are defined and passed to a function.2. The function begins by storing the whole input file as one big string in a variable ($sString)3. Then StringRegExp goes hunting for the headers in $sString, storing the array of matches in the $aGroupHeaders variable.4. The For loop begins processing all the entries in the $aGroupHeaders array one at a time5. StringRegExp then goes hunting for the FC number and stores the result in $aFC6. An If/Then statement handles output to each file depending on the value of $aFC and error handling ensues.The only part I don't understand is how does the script know to output the entire page and not just the header line?Thank you again,Kate$aGroupHeaders doesn't just store the header, it stores from the header to the feed. If you look at when I write the group to it's chosen file, I add Chr(12) back in it (the feed). Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer. Link to comment Share on other sites More sharing options...
Recommended Posts
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now