myspacee Posted November 25, 2008

Hello all,
I'm trying to create a script that reads a text file and returns a list of all the words in it. The final goal is to create a word list, sort it, and delete the dupes (I have code for this).
Can anyone help?
m.

GEOSoft Posted November 25, 2008

Just writing on the fly without testing, but this should do it.

    #include <Array.au3>
    $sFile = FileRead("C:\Path\somefile.txt")
    $sStr = StringReplace(StringStripCR($sFile), @LF, Chr(32))
    $sStr = StringRegExpReplace($sStr, ",|\.|\?|!|:|;", "")
    $aWords = StringSplit($sStr, Chr(32), 2)
    $aWords = _ArrayUnique($aWords)
    _ArrayDisplay($aWords, "Returned Word List")

You might want to throw _ArraySort() in after the _ArrayUnique().

George

Question about decompiling code? Read the decompiling FAQ and don't bother posting the question in the forums.
Be sure to read and follow the forum rules. -AKA the AutoIt Reading and Comprehension Skills test.
*** The PCRE (Regular Expression) ToolKit for AutoIT - (Updated Oct 20, 2011 ver:3.0.1.13) - Please update your current version before filing any bug reports. The installer now includes both 32 and 64 bit versions. No change in version number.
Visit my Blog .. currently not active but it will soon be resplendent with news and views. Also please remove any links you may have to my website. It is soon to be closed and replaced with something else.
"Old age and treachery will always overcome youth and skill!"

myspacee Posted November 25, 2008

Thank you for the reply, but I'm using AutoIt 3.2.12.1 and can't find the _ArrayUnique() function.
Am I doing something wrong?
m.

GEOSoft Posted November 25, 2008

I guess it wasn't in 3.2.12. Here it is. Just add it to your script for now. The next release version of AutoIt will include it.

    Func _ArrayUnique($aArray, $iDimension = 1, $iBase = 0, $iCase = 0, $vDelim = "|")
        Local $iUboundDim
        ;$aArray used to be ByRef, but litlmike altered it to allow for the choosing of 1 Array Dimension, without altering the original array
        If $vDelim = "|" Then $vDelim = Chr(01) ; by SmOke_N, modified by litlmike
        If Not IsArray($aArray) Then Return SetError(1, 0, 0) ;Check to see if it is valid array

        ;Checks that the given Dimension is Valid
        If Not $iDimension > 0 Then
            Return SetError(3, 0, 0) ;Check to see if it is valid array dimension, Should be greater than 0
        Else
            ;If Dimension Exists, then get the number of "Rows"
            $iUboundDim = UBound($aArray, 1) ;Get Number of "Rows"
            If @error Then Return SetError(3, 0, 0) ;2 = Array dimension is invalid.

            ;If $iDimension Exists, And the number of "Rows" is Valid:
            If $iDimension > 1 Then ;Makes sure the Array dimension desired is more than 1-dimensional
                Local $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
                For $i = 0 To $iUboundDim - 1 ;Loop through "Rows"
                    _ArrayAdd($aArrayTmp, $aArray[$i][$iDimension - 1]) ;$iDimension-1 to match Dimension
                Next
                _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
            Else ;Makes sure the Array dimension desired is 1-dimensional
                ;If Dimension Exists, And the number of "Rows" is Valid, and the Dimension desired is not > 1, then:
                ;For the Case that the array is 1-Dimensional
                If UBound($aArray, 0) = 1 Then ;Makes sure the Array is only 1-Dimensional
                    Dim $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
                    For $i = 0 To $iUboundDim - 1
                        _ArrayAdd($aArrayTmp, $aArray[$i])
                    Next
                    _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
                Else ;For the Case that the array is 2-Dimensional
                    Dim $aArrayTmp[1] ;Declare blank array, which will hold the dimension declared by user
                    For $i = 0 To $iUboundDim - 1
                        _ArrayAdd($aArrayTmp, $aArray[$i][$iDimension - 1]) ;$iDimension-1 to match Dimension
                    Next
                    _ArrayDelete($aArrayTmp, 0) ;Get rid of 1st-element which is blank
                EndIf
            EndIf
        EndIf

        Local $sHold ;String that holds the Unique array info
        For $iCC = $iBase To UBound($aArrayTmp) - 1 ;Loop Through array
            ;If Not the case that the element is already in $sHold, then add it
            If Not StringInStr($vDelim & $sHold, $vDelim & $aArrayTmp[$iCC] & $vDelim, $iCase) Then _
                    $sHold &= $aArrayTmp[$iCC] & $vDelim
        Next
        If $sHold Then
            $aArrayTmp = StringSplit(StringTrimRight($sHold, StringLen($vDelim)), $vDelim, 1) ;Split the string into an array
            Return $aArrayTmp ;SmOke_N's version used to Return SetError(0, 0, 0)
        EndIf
        Return SetError(2, 0, 0) ;If the script gets this far, it has failed
    EndFunc   ;==>_ArrayUnique

George

myspacee Posted November 25, 2008

Thank you! In the meantime I found some code and wrote this:

    #include <GuiConstantsEx.au3>
    #include <String.au3>
    #include <Array.au3>

    $file = FileOpen("test.txt", 0)

    ; Check if file opened for reading OK
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    ElseIf $file <> -1 Then
        ; Read the whole file into a string
        $chars = FileRead($file)
        $chars = StringReplace($chars, " ", " ")
        $chars = StringReplace($chars, ",", "")
        $chars = StringReplace($chars, ".", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "!", "")
        $chars = StringReplace($chars, "£", "")
        $chars = StringReplace($chars, "$", "")
        $chars = StringReplace($chars, "%", "")
        $chars = StringReplace($chars, "&", "")
        $chars = StringReplace($chars, "/", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, ")", "")
        $chars = StringReplace($chars, "=", "")
        $chars = StringReplace($chars, "?", "")
        $chars = StringReplace($chars, "^", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, "[", "")
        $chars = StringReplace($chars, "]", "")
        $chars = StringReplace($chars, "@", "")
        $chars = StringReplace($chars, "#", "")
        $chars = StringReplace($chars, "§", "")
        $chars = StringReplace($chars, ";", "")
        $chars = StringReplace($chars, ":", "")
        $chars = StringReplace($chars, "_", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "+", "")
        $chars = StringReplace($chars, "*", "")
        $chars = StringReplace($chars, Chr(34), "")
        $chars = StringReplace($chars, "'", "")
        ;~ $chars = StringReplace($chars, "", "")
        ;~ $chars = StringReplace($chars, "+", "")
        $chars = StringReplace($chars, "0", "")
        $chars = StringReplace($chars, "1", "")
        $chars = StringReplace($chars, "2", "")
        $chars = StringReplace($chars, "3", "")
        $chars = StringReplace($chars, "4", "")
        $chars = StringReplace($chars, "5", "")
        $chars = StringReplace($chars, "6", "")
        $chars = StringReplace($chars, "7", "")
        $chars = StringReplace($chars, "8", "")
        $chars = StringReplace($chars, "9", "")

        $array = StringSplit($chars, " ")
        _ArraySort($array, 0, 0, 0, 0)
        _ArrayDisplay($array, "BEFORE : with dupes")
        $mynewarray = dupecheckerthingy($array)
        _ArrayDisplay($mynewarray, "AFTER : without dupes")
    EndIf
    FileClose($file)

    Func dupecheckerthingy($showmethearray)
        Local $tmparray[1] = ['']
        For $i = 1 To UBound($showmethearray) - 1
            _ArraySearch($tmparray, $showmethearray[$i])
            If @error Then _ArrayAdd($tmparray, $showmethearray[$i])
        Next
        $tmparray[0] = 'Number of elements = ' & UBound($tmparray) - 1
        Return $tmparray
    EndFunc

Now I need help writing the array to a text file as a word list.
Can anyone help?
m.

GEOSoft Posted November 25, 2008 (edited)

That's essentially the long way of doing what I gave you, except that it includes digits and more punctuation, which I will correct in this version. The one StringRegExpReplace() line takes care of all those $chars = StringReplace() lines that you have.

    #include <Array.au3>
    $sFile = FileRead("C:\Path\somefile.txt")
    ;; Strip the @CRs and replace @LFs with spaces
    $sStr = StringReplace(StringStripCR($sFile), @LF, Chr(32))
    ;; Remove punctuation (except ') and digits.
    ;; This one line does the same as all the StringReplace() lines in the other code.
    ;; Note that * and + are regex metacharacters and have to be escaped.
    $sStr = StringRegExpReplace($sStr, "\d|\x22|~|`|@|#|%|\^|&|\*|\(|\)|=|/|\[|\]|{|}|<|\\|>|\+|,|\.|\?|!|:|;|\|", "")
    ;; Split the text on the spaces to a 0 based array
    $aWords = StringSplit($sStr, Chr(32), 2)
    ;; Remove all but one occurrence of each word
    $aWords = _ArrayUnique($aWords)
    ;; Sort the list (array)
    _ArraySort($aWords, 0, 0, 0, 0)
    ;; Write the words to a file
    $hOut = FileOpen(@DesktopDir & "\Word_List.txt", 2)
    For $i = 0 To UBound($aWords) - 1
        FileWriteLine($hOut, $aWords[$i])
    Next
    FileClose($hOut)
    ;; View the file
    ShellExecute(@DesktopDir & "\Word_List.txt")

Edited November 25, 2008 by GEOSoft

George

myspacee Posted November 26, 2008

I'm using the _ArrayUnique() function but not your code, because it returns this as output (using the first page of the Bible as a test):

    1
    2

Then I applied the function to my code, with a nice result:

    #include <GuiConstantsEx.au3>
    #include <String.au3>
    #include <Array.au3>

    $file = FileOpen("test.txt", 0)

    ; Check if file opened for reading OK
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    ElseIf $file <> -1 Then
        ; Read the whole file into a string
        $chars = FileRead($file)
        ;~ $chars = StringRegExpReplace($chars, ",|.|-|!|£|$|%|&|/|(|)|=|?|^|[|]|@|#|§|;|:|_|\|-|+|*|~|0|1|2|3|4|5|6|7|8|9|\.|\?", "")
        ;~ $chars = StringReplace($chars, " ", " ")
        $chars = StringReplace($chars, ",", "")
        $chars = StringReplace($chars, ".", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "!", "")
        $chars = StringReplace($chars, "£", "")
        $chars = StringReplace($chars, "$", "")
        $chars = StringReplace($chars, "%", "")
        $chars = StringReplace($chars, "&", "")
        $chars = StringReplace($chars, "/", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, ")", "")
        $chars = StringReplace($chars, "=", "")
        $chars = StringReplace($chars, "?", "")
        $chars = StringReplace($chars, "^", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, "[", "")
        $chars = StringReplace($chars, "]", "")
        $chars = StringReplace($chars, "@", "")
        $chars = StringReplace($chars, "#", "")
        $chars = StringReplace($chars, "§", "")
        $chars = StringReplace($chars, ";", "")
        $chars = StringReplace($chars, ":", "")
        $chars = StringReplace($chars, "_", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "+", "")
        $chars = StringReplace($chars, "*", "")
        $chars = StringReplace($chars, Chr(34), "")
        $chars = StringReplace($chars, "'", "")
        ;~ $chars = StringReplace($chars, "", "")
        ;~ $chars = StringReplace($chars, "+", "")
        $chars = StringReplace($chars, "0", "")
        $chars = StringReplace($chars, "1", "")
        $chars = StringReplace($chars, "2", "")
        $chars = StringReplace($chars, "3", "")
        $chars = StringReplace($chars, "4", "")
        $chars = StringReplace($chars, "5", "")
        $chars = StringReplace($chars, "6", "")
        $chars = StringReplace($chars, "7", "")
        $chars = StringReplace($chars, "8", "")
        $chars = StringReplace($chars, "9", "")

        ;-------------------------------------------------------- order in array
        $array = StringSplit($chars, " ")
        _ArraySort($array, 0, 0, 0, 0)
        ;~ _ArrayDisplay($array, "BEFORE : with dupes")

        ;-------------------------------------------------------- delete dupes
        $array = _ArrayUnique($array)
        _ArraySort($array, 0, 0, 0, 0)
        ;~ _ArrayRemoveBlanks($Array)
        _ArrayDisplay($array, "Returned Word List")

        ;-------------------------------------------------------- write wordlist
        FileDelete("test_sorted.txt")
        $file_sorted = FileOpen("test_sorted.txt", 2)
        For $i = 1 To UBound($array) - 1
            ;~ if $array[$i] <> "" then;or $array[$i] <> @CRLF then;or $array[$i] <> @CR or $array[$i] <> @LF Then
            FileWrite($file_sorted, $array[$i] & @CRLF)
            ;~ EndIf
        Next
    EndIf
    FileClose($file)
    FileClose($file_sorted)

    ; Removes elements that contain only whitespace characters and returns the new array.
    ; The count of the return is at $aRet[0].
    Func _ArrayRemoveBlanks($aID)
        Local $sTmp = ''
        For $i = 0 To UBound($aID) - 1
            If StringRegExpReplace($aID[$i], "\s", "") Then $sTmp &= $aID[$i] & Chr(0)
        Next
        Return StringSplit(StringTrimRight($sTmp, 1), Chr(0))
    EndFunc

    ; Func _ArrayUnique() follows here, unchanged from the version GEOSoft posted earlier in the thread.

But I have a lot of blank lines in the text file. Is there any way to delete the empty lines in a text file?

Thank you for the help,
m.
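
One possible way to keep the blank lines out in the first place, sketched under the assumption that the words are already in an array: test each element before writing it, so empty or whitespace-only entries never reach the file. The sample array and the output path under @TempDir are made up for illustration and are not part of the scripts posted in this thread.

```autoit
; Sketch only: the sample data and the output path are illustrative assumptions.
Local $aWords[6] = ["alpha", "", "beta", "   ", "gamma", ""]
Local $sOutPath = @TempDir & "\wordlist_sketch.txt"

Local $hOut = FileOpen($sOutPath, 2) ; 2 = open for writing, erase previous contents
If $hOut = -1 Then
    MsgBox(0, "Error", "Unable to open output file.")
    Exit
EndIf

For $i = 0 To UBound($aWords) - 1
    ; StringStripWS(..., 8) strips all whitespace, so whitespace-only entries become ""
    If StringStripWS($aWords[$i], 8) <> "" Then FileWriteLine($hOut, $aWords[$i])
Next
FileClose($hOut)
```

Filtering while writing avoids a second pass over an intermediate file. If the array comes from StringSplit() with the default flag, start the loop at 1 so the count in element 0 is not written out.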

myspacee Posted November 27, 2008

Updated the script, now with a blank-line remover. Copy and paste a big text into the file "test.txt", run the script, and create your word list.

I found some imprecision in the new _ArrayUnique() function planned for the next AutoIt release: dupes at the head and bottom.

Can someone teach me to use StringRegExpReplace()? I can't figure out how to manage a big list of bad chars...

    #include <GuiConstantsEx.au3>
    #include <String.au3>
    #include <Array.au3>

    $file = FileOpen("test.txt", 0)

    ; Check if file opened for reading OK
    If $file = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    ElseIf $file <> -1 Then
        ; Read the whole file into a string
        $chars = FileRead($file)
        $chars = StringReplace($chars, ",", "")
        $chars = StringReplace($chars, ".", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "!", "")
        $chars = StringReplace($chars, "£", "")
        $chars = StringReplace($chars, "$", "")
        $chars = StringReplace($chars, "%", "")
        $chars = StringReplace($chars, "&", "")
        $chars = StringReplace($chars, "/", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, ")", "")
        $chars = StringReplace($chars, "=", "")
        $chars = StringReplace($chars, "?", "")
        $chars = StringReplace($chars, "^", "")
        $chars = StringReplace($chars, "(", "")
        $chars = StringReplace($chars, "[", "")
        $chars = StringReplace($chars, "]", "")
        $chars = StringReplace($chars, "@", "")
        $chars = StringReplace($chars, "#", "")
        $chars = StringReplace($chars, "§", "")
        $chars = StringReplace($chars, ";", "")
        $chars = StringReplace($chars, ":", "")
        $chars = StringReplace($chars, "_", "")
        $chars = StringReplace($chars, "-", "")
        $chars = StringReplace($chars, "+", "")
        $chars = StringReplace($chars, "*", "")
        $chars = StringReplace($chars, Chr(34), "")
        $chars = StringReplace($chars, "'", "")
        $chars = StringReplace($chars, "\", "")
        $chars = StringReplace($chars, "\", "")
        $chars = StringReplace($chars, "<", "")
        $chars = StringReplace($chars, ">", "")
        $chars = StringReplace($chars, "»", "")
        $chars = StringReplace($chars, "»", "")
        $chars = StringReplace($chars, "©", "")
        ; [a run of further StringReplace() lines stood here; their search characters did not survive in the post]
        $chars = StringReplace($chars, "¡", "")
        $chars = StringReplace($chars, "¢", "")
        $chars = StringReplace($chars, "£", "")
        $chars = StringReplace($chars, "¤", "")
        $chars = StringReplace($chars, "¥", "")
        $chars = StringReplace($chars, "¦", "")
        $chars = StringReplace($chars, "§", "")
        $chars = StringReplace($chars, "¨", "")
        $chars = StringReplace($chars, "©", "")
        $chars = StringReplace($chars, "ª", "")
        $chars = StringReplace($chars, "«", "")
        $chars = StringReplace($chars, "¬", "")
        $chars = StringReplace($chars, Chr(173), "") ; soft hyphen
        $chars = StringReplace($chars, "®", "")
        $chars = StringReplace($chars, "¯", "")
        $chars = StringReplace($chars, "°", "")
        $chars = StringReplace($chars, "±", "")
        $chars = StringReplace($chars, "²", "")
        $chars = StringReplace($chars, "³", "")
        $chars = StringReplace($chars, "´", "")
        $chars = StringReplace($chars, "µ", "")
        $chars = StringReplace($chars, "¶", "")
        $chars = StringReplace($chars, "·", "")
        $chars = StringReplace($chars, "¸", "")
        $chars = StringReplace($chars, "¹", "")
        $chars = StringReplace($chars, "º", "")
        $chars = StringReplace($chars, "»", "")
        $chars = StringReplace($chars, "¼", "")
        $chars = StringReplace($chars, "½", "")
        $chars = StringReplace($chars, "¾", "")
        $chars = StringReplace($chars, "¿", "")
        $chars = StringReplace($chars, "×", "")
        $chars = StringReplace($chars, "÷", "")
        $chars = StringReplace($chars, "0", "")
        $chars = StringReplace($chars, "1", "")
        $chars = StringReplace($chars, "2", "")
        $chars = StringReplace($chars, "3", "")
        $chars = StringReplace($chars, "4", "")
        $chars = StringReplace($chars, "5", "")
        $chars = StringReplace($chars, "6", "")
        $chars = StringReplace($chars, "7", "")
        $chars = StringReplace($chars, "8", "")
        $chars = StringReplace($chars, "9", "")

        ;-------------------------------------------------------- create array
        ToolTip("Create array")
        $array = StringSplit($chars, " ")
        ;~ _ArraySort($Array, 0, 0, 0, 0)
        ;~ _ArrayDisplay($array, "BEFORE : with dupes")

        ;-------------------------------------------------------- order in array - delete dupes
        ToolTip("delete dupes - sorting...")
        _ArraySort($array, 0, 0, 0, 0)
        ToolTip("delete dupes")
        $array = _ArrayUnique($array)
        ToolTip("delete dupes - sorting...")
        _ArraySort($array, 0, 0, 0, 0)
        ;~ _ArrayRemoveBlanks($Array)
        ;~ _ArrayDisplay($array, "Returned Word List")

        ;-------------------------------------------------------- write wordlist
        ToolTip("write wordlist")
        FileDelete("test_sorted.txt")
        $file_sorted = FileOpen("test_sorted.txt", 2)
        For $i = 1 To UBound($array) - 1
            FileWrite($file_sorted, $array[$i] & @CRLF)
        Next
        FileClose($file)
        FileClose($file_sorted)

        ;-------------------------------------------------------- clean blank lines
        ToolTip("clean blank lines")
        $file = FileOpen("test_sorted.txt", 0)
        FileDelete("Wordlist_from_text.txt")
        $file_sorted = FileOpen("Wordlist_from_text.txt", 2)
        ; Read in lines of text until the EOF is reached
        While 1
            $line = FileReadLine($file)
            If @error = -1 Then ExitLoop
            $line1 = StringStripWS($line, 1)
            $line1 = StringStripCR($line1)
            $line1 = StringReplace($line1, "|", "")
            $line1 = StringReplace($line1, "0", "")
            $line1 = StringReplace($line1, "1", "")
            $line1 = StringReplace($line1, "2", "")
            $line1 = StringReplace($line1, "3", "")
            $line1 = StringReplace($line1, "4", "")
            $line1 = StringReplace($line1, "5", "")
            $line1 = StringReplace($line1, "6", "")
            $line1 = StringReplace($line1, "7", "")
            $line1 = StringReplace($line1, "8", "")
            $line1 = StringReplace($line1, "9", "")
            If $line1 <> "" Then
                FileWrite($file_sorted, $line1 & @CRLF)
            EndIf
        WEnd
    EndIf
    ToolTip("")
    FileClose($file)
    FileClose($file_sorted)
    FileDelete("test_sorted.txt")

    ;~ $mynewarray = dupecheckerthingy($array)
    ;~ _ArrayDisplay($mynewarray, "AFTER : without dupes")
    ;~ Func dupecheckerthingy($showmethearray)
    ;~     Local $tmparray[1] = ['']
    ;~     For $i = 1 To UBound($showmethearray) - 1
    ;~         _ArraySearch($tmparray, $showmethearray[$i])
    ;~         If @error Then _ArrayAdd($tmparray, $showmethearray[$i])
    ;~     Next
    ;~     $tmparray[0] = 'Number of elements = ' & UBound($tmparray) - 1
    ;~     Return $tmparray
    ;~ EndFunc

    ; Func _ArrayUnique() follows here, unchanged from the version GEOSoft posted earlier in the thread.

GEOSoft Posted November 27, 2008

I already gave you one that gets most of them. Give me a list of the characters that you need replaced and I'll work on it some more.

George

myspacee Posted November 27, 2008

I ask because I can't figure out how this function works:

    $sStr = FileRead($file)
    $chars = StringRegExpReplace($sStr, "\d|\x22|,|.|-|!|£|$|%|&|/|(|)|=|?|^|(|[|]|@|#|§|;|:|_|-|+|*|chr(34)|'||\|\|<|>|»|||»|©||||?|||||||||||?|?|?|||||||||||||?|¡|¢|£|¤|¥|¦|§|¨|©|ª|«|¬|®|¯|°|±|²|³|´|µ|¶|·|¸|¹|º|»|¼|½|¾|¿|×|÷|", "")

Thank you,
m.
(sorry for the dupes)

GEOSoft Posted November 27, 2008

When using a RegExp, the "|" symbol means OR, \d means any digit, and I used \x22 in place of Chr(34) (double quote). It's not a good idea to replace Chr(39) (single quote) because of words like I'm, you're, etc. So what it's doing is replacing any of the characters in that RegExp with a blank string.

When using a RegExp you cannot use the literal string "Chr(34)". Like I said, I did that with \x22, although "" should also have worked. Here it is with your Chr(34) and the single quote removed:

    $chars = StringRegExpReplace($sStr, "\d|\x22|,|.|-|!|£|$|%|&|/|(|)|=|?|^|(|[|]|@|#|§|;|:|_|-|+|*||\|\|<|>|»|||»|©||||?|||||||||||?|?|?|||||||||||||?|¡|¢|£|¤|¥|¦|§|¨|©|ª|«|¬|®|¯|°|±|²|³|´|µ|¶|·|¸|¹|º|»|¼|½|¾|¿|×|÷|", "")

If you must remove the single quotes then I'll write another short one that you can use to pre-process it.

George
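
In PCRE, which StringRegExpReplace() uses, the characters . ( ) ? * + [ ] ^ $ \ and | have special meanings outside a character class, so a long alternation of raw punctuation is easy to get wrong: an unescaped "." matches any character, and a bare "*" or "+" after "|" is a pattern error. A minimal sketch of an alternative, assuming that only unaccented ASCII letters and the apostrophe should survive: list what to keep in a negated character class and remove everything else. The sample text is made up for illustration.

```autoit
; Sketch only: the sample text is an illustrative assumption.
; Single-quoted AutoIt string; '' inside it stands for one literal apostrophe.
Local $sText = 'Hello, world! It''s 2008 (really) - "quoted" text.'

; [^A-Za-z'\s] matches any single character that is NOT an ASCII letter,
; an apostrophe, or whitespace; each match is replaced with an empty string.
Local $sClean = StringRegExpReplace($sText, "[^A-Za-z'\s]", "")
If @error Then
    ConsoleWrite("Pattern error at offset " & @extended & @CRLF)
    Exit
EndIf

; Flag 7 = 1 + 2 + 4: strip leading, trailing, and doubled whitespace.
$sClean = StringStripWS($sClean, 7)
ConsoleWrite($sClean & @CRLF)
```

Inside a character class most metacharacters lose their special meaning, which is why this form needs almost no escaping. Accented letters would be stripped by this class as written, so texts in other languages need the class extended.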

Malkey Posted November 27, 2008 (edited)

This seems to generate a word list OK.

    #include <Array.au3>

    $hFile = "C:\Path\somefile.txt" ; <== Enter Text file here =====
    $sFile = FileRead($hFile)
    $sStr = StringReplace(StringStripCR($sFile), @LF, Chr(32))
    ;ConsoleWrite($sStr & @CRLF)
    FileClose($hFile)

    ; Select anything that is NOT "a-z, A-Z, 0-9 or underscore (_)", Or any digit (0-9), Or underscores, to be replaced.
    $sStr = StringRegExpReplace($sStr, "[^\w]|\d|_", Chr(32))
    $sStr = StringStripWS($sStr, 7) ; remove all double white spaces.
    ;ConsoleWrite($sStr & @CRLF)
    $aWords = StringSplit($sStr, Chr(32), 0)

    $Numbers = "," ; Add comma to start of string for RegExp to get 1st word
    For $i = 1 To UBound($aWords) - 1
        If StringRegExp($Numbers, "," & $aWords[$i] & ",") = 0 And StringStripWS($aWords[$i], 8) <> "" And _
                StringLen($aWords[$i]) > 2 Then $Numbers &= $aWords[$i] & ","
    Next
    $Numbers = StringTrimLeft(StringTrimRight($Numbers, 1), 1) ; Remove 1st and last commas.
    $Array = StringSplit($Numbers, ",", 0)
    ConsoleWrite("Number of words = " & $Array[0] & @CRLF)
    _ArrayDelete($Array, 0)
    ; _ArraySort($Array, 0, 0)
    _ArrayDisplay($Array, "Returned Word List")

It took under 20 secs to produce a 2080-word list from a 258 KB text file.

Edit: Using this,

    If StringRegExp($Numbers, "," & StringLower($aWords[$i]) & ",") = 0 And StringStripWS($aWords[$i], 8) <> "" And _
            StringLen($aWords[$i]) > 2 Then $Numbers &= StringLower($aWords[$i]) & ","

in the For...Next loop results in all lower case being returned. It reduced the word count to 1760. Noticed that New Zealand was split into two words, both lower case.

Edited November 27, 2008 by Malkey

Malkey Posted November 28, 2008

In case the variable $Numbers in the previous post caused confusion: I copied and modified the script from http://www.autoitscript.com/forum/index.ph...st&p=609527

Here is the same script as above, but modified to allow AutoIt variable names (with the leading $) to be added to the word list.

    #include <Array.au3>

    $hFile = "C:\Path\somefile.au3" ; <== Enter .au3 file here =====
    $sFile = FileRead($hFile)
    $sStr = StringReplace(StringStripCR($sFile), @LF, Chr(32))
    ConsoleWrite($sStr & @CRLF)
    FileClose($hFile)

    ; Select anything that is NOT "a-z, A-Z, 0-9, underscore (_) or $", Or any digit (0-9), Or underscores, to be replaced.
    $sStr = StringRegExpReplace($sStr, "[^\w|$]|\d|_", Chr(32))
    $sStr = StringStripWS($sStr, 7) ; remove all double white spaces.
    ConsoleWrite($sStr & @CRLF)
    $aWords = StringSplit($sStr, Chr(32), 0)

    $sString = "," ; Add comma to start of string for RegExp to get 1st word
    For $i = 1 To UBound($aWords) - 1
        If StringLeft($aWords[$i], 1) = "$" And StringRegExp($sString, ",\" & $aWords[$i] & ",") = 0 And _ ; Allows preceding $ sign.
                StringStripWS($aWords[$i], 8) <> "" And StringLen($aWords[$i]) > 2 Then
            $sString &= $aWords[$i] & ","
        ElseIf StringLeft($aWords[$i], 1) <> "$" And StringRegExp($sString, "," & StringLower($aWords[$i]) & ",") = 0 And _
                StringStripWS($aWords[$i], 8) <> "" And StringLen($aWords[$i]) > 2 Then
            $sString &= StringLower($aWords[$i]) & ","
        EndIf
    Next
    $sString = StringTrimLeft(StringTrimRight($sString, 1), 1) ; Remove 1st and last commas.
    ConsoleWrite(StringLower("$ASD)" & @CRLF))
    $Array = StringSplit($sString, ",", 0)
    ConsoleWrite("Number of words = " & $Array[0] & @CRLF)
    _ArrayDelete($Array, 0)
    ; _ArraySort($Array, 0, 0)
    _ArrayDisplay($Array, "Returned Word List")
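
The duplicate checks in the scripts above search a growing delimiter-separated string once per word, which gets slower as the list grows. A different technique, not the one used by the posters, is to let a hash table do the lookup. The sketch below assumes the Windows Scripting.Dictionary COM object is available on the machine; the sample text is made up for illustration.

```autoit
#include <Array.au3>

; Sketch only: the sample text is an illustrative assumption.
Local $sText = "the cat and The dog and the bird"
Local $aWords = StringSplit($sText, " ") ; default flag: count is stored in $aWords[0]

Local $oDict = ObjCreate("Scripting.Dictionary")
If Not IsObj($oDict) Then
    MsgBox(0, "Error", "Scripting.Dictionary is not available.")
    Exit
EndIf
$oDict.CompareMode = 1 ; 1 = text compare, so "The" and "the" count as the same word

For $i = 1 To $aWords[0]
    ; Add each word only the first time it is seen
    If Not $oDict.Exists($aWords[$i]) Then $oDict.Add($aWords[$i], 1)
Next

Local $aUnique = $oDict.Keys() ; 0-based array of the unique words
_ArraySort($aUnique)
_ArrayDisplay($aUnique, "Unique words")
```

CompareMode has to be set before the first key is added. Because the words are used as dictionary keys and never placed inside a pattern, words that contain regex metacharacters such as $ need no special handling.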