trademaid (Author) · Posted October 1, 2009 (edited)

I'm working on cleaning the data, which has corruptions. FileMove doesn't work, and FileCopy followed by a delete of the source copies the file but won't delete the source. Any idea why? Also, how do I frame my scripts so they appear in au3 boxes?

```autoit
#include <Array.au3>

FileChangeDir("C:\Program Files\TheGrailGGO\Reports\in")
$search = FileFindFirstFile("*.txt")
If $search = -1 Then
    MsgBox(0, "Error", "No files/directories matched the search pattern1111")
    Exit
EndIf

While 1; $file <> "@ES.D_15min_BuildCASBTemplate4_Fitness1.t.xt"
    $file = FileFindNextFile($search)
    If @error Then ExitLoop
    ; $file = FileFindNextFile($search)
    If @error Then Exit ;Loop
    MsgBox(4096, "File:", $file,2)
    Global $sInFile = "C:\Program Files\TheGrailGGO\Reports\in\" & $file
    Global $outFile = "C:\Program Files\TheGrailGGO\Reports\clean\" & $file
    DirCreate("C:\Program Files\TheGrailGGO\Reports\clean\")
    DirCreate("C:\Program Files\TheGrailGGO\Reports\in\cleaned\")
    Global $aStr, $aSort, $aSS1, $sTmp, $sRet
    ;read file to array
    $fileread = FileOpen($file, 0)
    If FileExists($file) Then
        $counter = 0
        $cleanop=""
        While 1
            $line = FileReadLine($fileread)
            If @error = -1 Then ExitLoop
            ;if ((StringLen($line) <> 633) and (StringLen($line) <320) )then MsgBox(0, "Line read:" & StringLen($line), $line)
            ;if number(StringLeft($line,7)) < 100000 then msgbox(1, StringLeft($line,7), $line)
            $counter = $counter + 1
            If ($counter < 5) Then $cleanop = $cleanop & $line & @CRLF
            If (($counter > 4) And (Number(StringLeft($line, 6)) > 99999))Then $cleanop = $cleanop & $line & @CRLF; if first number less than 100,000 then its bad
        WEnd
        ;msgbox(1,"op",$cleanop)
        FileDelete($outFile)
        FileWrite($outFile,$cleanop)
        fileclose($outFile)
        fileclose($sInFile)
        ;msgbox(1,$sInFile,"C:\Program Files\TheGrailGGO\Reports\in\cleaned\" & $file)
        filecopy($sInFile,"C:\Program Files\TheGrailGGO\Reports\in\cleaned\" & $file,1); file move didnt work
        FileDelete($sInFile)
    EndIf
WEnd
```

Edited October 2, 2009 by trademaid
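One likely cause of the move/delete failure in the script above is that the handle returned by FileOpen (stored in $fileread) is never closed: FileClose is called with the path strings $outFile and $sInFile rather than with the handle, so the input file may still be open when FileMove or FileDelete runs. The following is a minimal sketch of the open, read, close, then move sequence. The file names under @TempDir and the variable names are hypothetical and only there to make the example self-contained.

```autoit
; Hypothetical paths, for illustration only.
Local $sInFile = @TempDir & "\example_in.txt"
Local $sDoneFile = @TempDir & "\example_done.txt"
FileWrite($sInFile, "header" & @CRLF & "123456  data" & @CRLF)

; FileOpen returns a handle (or -1 on failure); mode 0 = read.
Local $hFile = FileOpen($sInFile, 0)
If $hFile = -1 Then
    MsgBox(0, "Error", "Unable to open " & $sInFile)
    Exit
EndIf

Local $sClean = ""
While 1
    Local $sLine = FileReadLine($hFile)
    If @error Then ExitLoop ; end of file or read error
    $sClean &= $sLine & @CRLF
WEnd

; Close the handle returned by FileOpen, not the file name.
FileClose($hFile)

; With the handle released, the move should no longer be blocked by this script.
If Not FileMove($sInFile, $sDoneFile, 1) Then ; 1 = overwrite an existing target
    MsgBox(0, "Error", "FileMove failed for " & $sInFile)
EndIf
```

Checking the return value of FileMove, as done here, also makes it visible when the move fails for some other reason, such as missing permissions on the target folder.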
trademaid (Author) · Posted October 2, 2009 (edited)

The following code fails on the second file. Does anyone know what I've done wrong?

(47) : ==> Array variable has incorrect number of subscripts or subscript dimension range exceeded.:

```autoit
#include <Array.au3>

FileChangeDir("C:\Program Files\TheGrailGGO\Reports\clean\")
DirCreate("C:\Program Files\TheGrailGGO\Reports\out2\")
$search = FileFindFirstFile("*.txt")
If $search = -1 Then
    MsgBox(0, "Error", "No files/directories matched the search pattern1111")
    Exit
EndIf
;While 1
;$file2=fileopen("d:\5\autoit\ftpnew5.txt",1)
$file = FileFindNextFile($search)
If @error Then Exit ;Loop
MsgBox(4096, "File:", $file,3)
While $file <> "@ES.D_15min_BuildCASBTemplate4_Fitness1.t.xt"
    Global $sInFile = "C:\Program Files\TheGrailGGO\Reports\clean\" & $file
    Global $outFile = "C:\Program Files\TheGrailGGO\Reports\out2\" & $file
    Global $aStr, $aSort, $aSS1, $sTmp, $sRet
    ;read file to array
    If FileExists($sInFile) Then
        $aStr = StringSplit(StringStripCR(StringStripWS(FileRead($sInFile), 3)), @LF, 2)
    Else
        MsgBox(4096, "Error", " Error reading file to Array error:" & @error, 1)
        ;ExitLoop
    EndIf
    ;_ArrayDisplay($aStr, "$aStr before sorting")
    ;Filter out a line if it doesn't have a double space.
    For $i = 0 To UBound($aStr) - 1
        If StringInStr($aStr[$i], "  ") Or $i < 4 Then $sTmp &= $aStr[$i] & @LF
    Next
    $aStr = StringSplit(StringStripWS($sTmp, 2), @CRLF, 2) ; CR LF ADDED 2 OCT
    $sTmp = ''
    ;create new 2d array for sorting
    Dim $aSort[UBound($aStr) - 4][2]
    For $i = 0 To UBound($aSort, 1) - 1
        ;ConsoleWrite($i & " ^ " & StringLen($aStr[$i]) & " ^ " & $aStr[$i] & @CRLF)
        If $i < 4 Then $sRet &= $aStr[$i] & @CRLF
        $aSS1 = StringSplit($aStr[$i + 4], "  ", 3)
        ;msgbox(1,"ass",Number($aSS1[1]))
        ;ConsoleWrite(Number($aSS1[1]) & "~" & StringLen( Number($aSS1[1])) & @crlf)
        $aSort[$i][0] = Number($aSS1[1])
        $aSort[$i][1] = $aStr[$i + 4]
    Next
    ;_ArrayDisplay($aSort, "$aSort before sorting")
    _ArraySort($aSort)
    ;_ArrayDisplay($aSort, "$aSort after sorting, dups not removed yet")
    ;convert the sorted 2d array back to the original 1D array and remove duplicate lines based on matching sorting numbers
    For $i = 0 To UBound($aSort) - 1
        If Not StringInStr($sTmp, $aSort[$i][0]) Then
            $sTmp &= $aSort[$i][0] & @CRLF
            $sRet &= $aSort[$i][1] & @CRLF
            ; MsgBox(1, "ret", $sRet)
        EndIf
    Next
    $aStr = StringSplit(StringStripWS($sRet, 2), @CRLF, 2)
    ;_ArrayDisplay($aStr, "$aStr after sorting and removing duplicate lines")
    FileDelete($outFile)
    FileWriteLine($outFile, $sRet)
    FileClose("$outfile")
    FileMove($sInFile, $sInFile & ".old", 1)
    FILECLOSE($search)
WEnd
;WEnd
```

Edited October 2, 2009 by trademaid
smashly · Posted October 2, 2009

It means the line you're reading doesn't contain a double space. When you StringSplit on a double space and the line doesn't have one, the resulting array has no index [1], so it fails.

This is why I asked/said a few posts ago: "Are the first numbers of a line always 6 digits long or does that length vary? It would probably be better if you could check if a line has a double space at an exact position in the line, this way there'd be less chance of writing dud lines to the new file."

If I were writing something like this for myself, I would start from the beginning of the code and add proper error handling for every step.

As it is, you asked a simple question in your first post, which was answered by 3 people over the length of your thread. But each time an answer is given, the rules of what you're after change, so in turn everyone else is writing your code for you. Not to mention Malkey giving you some good examples along with a complete standalone function, and not even a gesture of thanks or recognition from you for him taking the time to post it for you. Nope, just more questions on others' working code you've butchered because your needs are changing every post.

I honestly think you've been given enough to work out what you need. Also, the way you're posting the code makes it damn hard to see what's what. Post in a code box, maybe?

Sorry if my attitude comes off as rude, but it's just a general observation.

Cheers
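The failure described here can be guarded against by checking how many elements StringSplit returned before reading index [1]. The following is a minimal sketch, not the thread's final script: the sample lines in $aLines are hypothetical, and flag 3 is the same flag used in the posted code (treat the whole delimiter string as one delimiter, and return a 0-based array without a count element).

```autoit
; Hypothetical sample data: the second line has no double space.
Local $aLines[2] = ["A  123456  payload", "dud line"]

For $i = 0 To UBound($aLines) - 1
    ; Split on a double space; flag 3 = entire delimiter string + no count in element [0].
    Local $aParts = StringSplit($aLines[$i], "  ", 3)

    ; When the delimiter is not found, the array holds only the whole line,
    ; so index [1] does not exist. StringSplit also sets @error in that case.
    If UBound($aParts) < 2 Then
        ConsoleWrite("Skipping line " & $i & ": no double space found" & @CRLF)
        ContinueLoop
    EndIf

    ConsoleWrite("Sort key for line " & $i & ": " & Number($aParts[1]) & @CRLF)
Next
```

Skipping a dud line is only one possible policy; logging it to a separate file or stopping the script would fit the same check.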
trademaid (Author) · Posted October 2, 2009 (edited)

I'm grateful for the help received on this forum. The rules change because this is a learning process. It had not initially occurred to me that there must always be a six-digit number in the input file, which is why I didn't comment on it at the time, but I did include it in my code days later. It has taken me a long time to sort this out due to its complexity. I'm also trying to code the remaining sections without the help of people on forums; I only posted when I was stuck. The array stuff, however, was beyond me, as it uses parts of AutoIt that I was very unfamiliar with.

PS: twice in this thread I asked how to put code in a box, and no one commented.

Again, I thank you for your support.

Edited October 2, 2009 by trademaid
trademaid (Author) · Posted October 2, 2009 (edited)

Hi Malkey, that looks quite elaborate. Thanks for your work. I will test on Monday.

I assume that StringRegExpReplace($aArray[$iStartSortLine] & " ", "(\S+)\s+", "(\\S+)\\s+") filters out lines with no double space. The docs don't explain the \s+ functions, so it's hard for me to understand.

Later note: this is explained better in your "Posted 30 September 2009 - 03:21 AM".

Have a good weekend.

I've now put all the comments in text boxes. Thank you, Melba23.

Edited October 2, 2009 by trademaid
Melba23 (Moderator) · Posted October 2, 2009

trademaid,

Put [autoit ] before and [/autoit ] after your posted code (but omit the trailing space - it is only there so the tags display here).

M23
Malkey · Posted October 4, 2009

trademaid wrote: "Hi Malkey .... I assume that StringRegExpReplace($aArray[$iStartSortLine] & " ", "(\S+)\s+", "(\\S+)\\s+") filters out lines with no double space. The docs don't explain the \s+ functions, so it's hard for me to understand. ...."

You assumed incorrectly.

StringRegExpReplace( ........ & " ", "(\S+)\s+", "(\\S+)\\s+") appears three times in the script. The first and second occurrences were copied from the third occurrence. Notice that the first two occurrences do not assign the return value of StringRegExpReplace() to a variable. The value generated in @extended when the function is called is the number of words in the string stored in the variable $aArray[$iStartSortLine], that is, the first line of the file to be sorted.

The second occurrence of the function checks whether the number of words in each subsequent line equals the number of words in the first line to be sorted. The "(\\S+)\\s+" parameter was left over from the copy. It does not matter what that is; "" would be neater. The important bit is the search pattern and how many times the match of what is between "(" and ")" is made, because that is what is recorded in @extended: the number of words.

The third use of the function returns a search pattern which consists of a string of "(\S+)\s+"'s. Each "(\S+)\s+" represents one word in a line. This allows any word in the line to be retrieved by specifying its back-reference number in the replace parameter of the StringRegExpReplace function.

The + is used to repeat the previous set of whitespace characters, \s, one or more times. It is equivalent to {1,}.

The part of the search pattern "(\S+)\s+" that matches the word is \S+, meaning one or more non-whitespace characters, as opposed to whitespace characters, \s, such as a space, a tab, @CR (carriage return), and more.

The brackets, (), group the matching non-whitespace characters, store them, and allow back-referencing them in the replace parameter by the use of "\n" or "${n}", where "n" is a number automatically assigned based on the number of () groups in the search pattern. The groups are numbered left to right, starting at one.

To use this example, hopefully you are using SciTE. The ConsoleWrite output appears in the bottom, resizable Output window of SciTE, together with all those useful error messages.

```autoit
;
$sStr = "One two 3 four"

; ======= Number of words ===================
StringRegExpReplace($sStr & " ", "(\S+)\s+", "")
$iReplacements = @extended ; No. of words on line $iStartSortLine
ConsoleWrite("Number of words: " & $iReplacements & @CRLF & _
        "When using StringRegExpReplace(), the number of replacements of the (...) in the search pattern is in @extended." & @CRLF & _
        "In the above case, (\S+) is replaced " & $iReplacements & ' times with "", which is nothing.' & @CRLF & @CRLF)

; ======= Created search pattern ===================
$sPattern = StringRegExpReplace($sStr & " ", "(\S+)\s+", "(\\S+)\\s+")
ConsoleWrite("Created search pattern: " & $sPattern & @CRLF & _
        " If there is no space at end of test string, the above pattern will not match. The trailing \s+ needs to be removed." & @CRLF & @CRLF)

; ======= Word number ===================
$iWord = 2
Local $sRetWord = StringRegExpReplace($sStr & " ", StringTrimRight($sPattern, 3), "\" & $iWord) ; \n
ConsoleWrite("Word number " & $iWord & " = " & $sRetWord & @CRLF & @CRLF)

; ======= Last word ===================
$sRetWord = StringRegExpReplace($sStr & " ", StringTrimRight($sPattern, 3), "\" & $iReplacements)
ConsoleWrite("Last word" & " = " & $sRetWord & @CRLF & @CRLF)

; ======= Manipulate return string ===================
$sRetWord = StringRegExpReplace($sStr & " ", StringTrimRight($sPattern, 3), "\1+${2}@\4% \3 <> \4 The End")
ConsoleWrite("Manipulated return string" & " = " & $sRetWord & @CRLF & _
        "If the replacement needs to be ${1}9, then using \19 will not work," & @CRLF & _
        "because \19 is interpreted as the nineteenth replacement and not the first replacement followed by nine." & @CRLF & @CRLF)
;
```
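The word-count check described for the second occurrence (comparing each line's word count with that of the first line) can be sketched as a small helper. This is only an illustration under stated assumptions: the function name _WordCount and the sample lines in $aData are hypothetical and do not come from the script discussed in this thread.

```autoit
; Hypothetical helper: counts whitespace-separated words in a line.
Func _WordCount($sLine)
    ; The return value is discarded; only the replacement count matters.
    StringRegExpReplace($sLine & " ", "(\S+)\s+", "")
    Local $iCount = @extended ; number of replacements = number of words
    Return $iCount
EndFunc   ;==>_WordCount

; Hypothetical sample data: the last line has fewer words than the first.
Local $aData[3] = ["100001 abc 1.5", "100002 def 2.5", "100003 broken"]
Local $iExpected = _WordCount($aData[0])

For $i = 1 To UBound($aData) - 1
    If _WordCount($aData[$i]) <> $iExpected Then
        ConsoleWrite("Line " & $i & " has a different word count: " & $aData[$i] & @CRLF)
    EndIf
Next
```

Copying @extended into a local variable straight after the call keeps the count from being overwritten by any later function call.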
trademaid (Author) · Posted October 5, 2009

Thank you for taking the time to educate and help me. This project is now completed, thanks to you all.