stg68 Posted April 21, 2008
Hello,
I am trying to identify the fastest way to do the following. Your help will be appreciated:
1. I have a large txt file. I want to read this file, split it into words on spaces, and write the words to a second file, one word per line.
2. This is optional for now, but would be nice to have if it is easy to achieve: if a word already exists in the second file, skip writing it.
Regards
someone Posted April 21, 2008
Check out a couple of links; the second is the link SmOke_N posted in the first one.
http://www.autoitscript.com/forum/index.php?showtopic=69115
http://www.autoitscript.com/forum/index.ph...=7821&st=15
herewasplato Posted April 21, 2008
...I have a large txt file...
"Large" is a relative term. The best solution would depend on how large "large" is.
Here is a modified version of a remove-dup UDF:
http://www.autoitscript.com/forum/index.ph...st&p=499222
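File size matters mostly because FileRead() pulls the whole file into memory at once. For a very large file, a line-at-a-time loop is one alternative. This is only a minimal sketch: input.txt and output.txt are placeholder names, and I have not measured whether it is faster than reading the file in one go.

```autoit
; Placeholder file names; adjust to your own paths.
Local $hIn = FileOpen("input.txt", 0)   ; 0 = read mode
Local $hOut = FileOpen("output.txt", 2) ; 2 = write mode, erases previous contents
If $hIn = -1 Or $hOut = -1 Then
    MsgBox(0, "Error", "Unable to open input or output file.")
    Exit
EndIf

Local $sLine, $aWords
While 1
    $sLine = FileReadLine($hIn)
    If @error Then ExitLoop ; non-zero @error: end of file or read error
    $aWords = StringSplit($sLine, " ")
    For $i = 1 To $aWords[0]
        ; Skip empty elements produced by blank lines or repeated spaces.
        If StringLen($aWords[$i]) > 0 Then FileWriteLine($hOut, $aWords[$i])
    Next
WEnd

FileClose($hIn)
FileClose($hOut)
```

Only one line is held in memory at a time, so the input size is limited by disk rather than RAM. Duplicate removal is not handled here.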
stg68 Posted April 21, 2008
Thank you all for your help. Now I can use the _ArrayRemoveDuplicates function to remove duplicates. Thanks!
My next step is to build an array from a multi-line txt file, where each word is an element in the array.
Thanks!
weaponx Posted April 21, 2008
_FileReadToArray() will create an array with each line being an element.
FileRead() and StringSplit($string, " ") will break all words into elements.
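The FileRead() plus StringSplit() route could look roughly like the sketch below. The file name input.txt is a placeholder, and the whitespace clean-up step is my own addition, not something from the posts above: it turns line breaks and tabs into spaces so that a single delimiter is enough.

```autoit
; Minimal sketch: read the whole file, then split it on spaces.
; "input.txt" is a placeholder file name.
Local $sText = FileRead("input.txt")
If @error = 1 Then
    MsgBox(0, "Error", "Unable to read input.txt")
    Exit
EndIf

; Turn line breaks and tabs into spaces, then trim both ends and
; squeeze repeated spaces (StringStripWS flags 1 + 2 + 4 = 7).
$sText = StringStripWS(StringRegExpReplace($sText, "[\r\n\t]+", " "), 7)

; Default StringSplit mode: $aWords[0] holds the count, words start at index 1.
Local $aWords = StringSplit($sText, " ")
For $i = 1 To $aWords[0]
    ConsoleWrite($aWords[$i] & @CRLF)
Next
```

Because the words start at index 1, any loop over the result should run from 1 to $aWords[0], not from 0.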
stg68 Posted April 21, 2008
The question is: if I use _FileReadToArray(), why do I need to use FileRead()?
I do understand that _FileReadToArray() will create an array with each line being an element. So how can I split it afterwards?
Thanks
weaponx Posted April 21, 2008
I guess I was showing you 2 separate solutions. You can just do:

    #include <File.au3>
    Dim $array
    ; _FileReadToArray() fills the array passed as its second parameter
    _FileReadToArray("myfile.txt", $array)
    For $X = 1 To $array[0]
        $tempArray = StringSplit($array[$X], " ")
    Next
stg68 Posted April 21, 2008
Please tell me what I am doing wrong here. I just want to write each split element of the array to a file.

    #include<file.au3>
    #include<array.au3>
    Dim $array
    _FileReadToArray("c:\temp\test\book.txt",$array)
    _ArrayDisplay($array, " ")
    For $X = 1 to $array[0]
        $tempArray = StringSplit($array[$X], " ")
        FileWriteLine("c:\temp\test\BookResults.txt",$tempArray[$x] &@CRLF)
    Next

Thank you!
herewasplato Posted April 21, 2008
I'll look at your code in a bit. For now, try this:

    #include <Array.au3>
    #include <File.au3>

    ;OutputFileHandle
    $OFH = FileOpen("output.txt", 2)

    ; Check if file opened for writing OK
    If $OFH = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    EndIf

    $var = StringReplace(FileRead("input.txt"), @CRLF, " ")
    $varArray = StringSplit($var, " ")
    _ArrayDisplay($varArray)
    _ArrayRemoveDuplicates($varArray)
    _ArrayDisplay($varArray)
    _FileWriteFromArray($OFH, $varArray)

    ;==================================================================
    ; Function Name:   _ArrayRemoveDuplicates()
    ;
    ; Description:     Removes duplicate elements from an Array
    ; Parameter(s):    $avArray
    ;                  $iBase
    ;                  $iCaseSense
    ;                  $sDelimter
    ; Requirement(s):  None
    ; Return Value(s): On Success - Returns 1 and the cleaned up Array is set
    ;                  On Failure - Returns -1 and sets @Error
    ;                   @Error=1 $avArray is not an array
    ;                   @Error=2 $iBase is different from 0 or 1
    ;                   @Error=3 $iCaseSense is different from 0 or 1
    ; Author:          uteotw, but ALL the credits go to nitro322 and SmOke_N, see link below
    ; Note(s):         None
    ; Link:            http://www.autoitscript.com/forum/index.php?showtopic=7821
    ; Example:         Yes
    ;==================================================================
    Func _ArrayRemoveDuplicates(ByRef $avArray, $iBase = 0, $iCaseSense = 0, $sDelimter = "")
        Local $sHold
        If Not IsArray($avArray) Then
            SetError(1)
            Return -1
        EndIf
        If Not ($iBase = 0 Or $iBase = 1) Then
            SetError(2)
            Return -1
        EndIf
        If $iBase = 1 And $avArray[0] = 0 Then
            SetError(0)
            Return 0
        EndIf
        If Not ($iCaseSense = 0 Or $iCaseSense = 1) Then
            SetError(3)
            Return -1
        EndIf
        If $sDelimter = "" Then
            $sDelimter = Chr(01) & Chr(01)
        EndIf
        If $iBase = 0 Then
            For $i = $iBase To UBound($avArray) - 1
                If Not StringInStr($sDelimter & $sHold, $sDelimter & $avArray[$i] & $sDelimter, $iCaseSense) Then
                    $sHold &= $avArray[$i] & $sDelimter
                EndIf
            Next
            $avNewArray = StringSplit(StringTrimRight($sHold, StringLen($sDelimter)), $sDelimter, 1)
            ReDim $avArray[$avNewArray[0]]
            For $i = 1 To $avNewArray[0]
                $avArray[$i - 1] = $avNewArray[$i]
            Next
        ElseIf $iBase = 1 Then
            For $i = $iBase To UBound($avArray) - 1
                If Not StringInStr($sDelimter & $sHold, $sDelimter & $avArray[$i] & $sDelimter, $iCaseSense) Then
                    $sHold &= $avArray[$i] & $sDelimter
                EndIf
            Next
            $avArray = StringSplit(StringTrimRight($sHold, StringLen($sDelimter)), $sDelimter, 1)
        EndIf
        Return 1
    EndFunc   ;==>_ArrayRemoveDuplicates
herewasplato Posted April 21, 2008
Please tell me what I am doing wrong here...
Try this for your code:

    #include<file.au3>
    #include<array.au3>

    ;OutputFileHandle
    $OFH = FileOpen("output.txt", 2)

    ; Check if file opened for writing OK
    If $OFH = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    EndIf

    Dim $array
    _FileReadToArray("input.txt", $array)
    _ArrayDisplay($array, " ")
    For $X = 1 To $array[0]
        $tempArray = StringSplit($array[$X], " ")
        For $Y = 1 To $tempArray[0]
            FileWriteLine($OFH, $tempArray[$Y])
        Next
    Next
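The nested loop above can also take care of the optional "skip words already written" requirement while it writes. The sketch below is one way to do that, and it rests on an assumption that nobody in this thread has confirmed: that the Windows Scripting.Dictionary COM object is available on the machine. The file names are placeholders, and whether this beats _ArrayRemoveDuplicates() on a given file would need to be timed.

```autoit
#include <File.au3>

Local $aLines
If Not _FileReadToArray("input.txt", $aLines) Then
    MsgBox(0, "Error", "Unable to read input.txt")
    Exit
EndIf

Local $hOut = FileOpen("output.txt", 2) ; 2 = write mode, erases previous contents
If $hOut = -1 Then
    MsgBox(0, "Error", "Unable to open output.txt")
    Exit
EndIf

; Assumption: the Scripting.Dictionary COM object exists on this system.
Local $oSeen = ObjCreate("Scripting.Dictionary")
If Not IsObj($oSeen) Then
    MsgBox(0, "Error", "Scripting.Dictionary is not available.")
    Exit
EndIf
$oSeen.CompareMode = 1 ; 1 = text compare, so keys are case-insensitive

Local $aWords
For $i = 1 To $aLines[0]
    $aWords = StringSplit($aLines[$i], " ")
    For $j = 1 To $aWords[0]
        If StringLen($aWords[$j]) = 0 Then ContinueLoop ; blank from repeated spaces
        If Not $oSeen.Exists($aWords[$j]) Then
            $oSeen.Add($aWords[$j], 1) ; remember the word
            FileWriteLine($hOut, $aWords[$j]) ; write it only the first time it appears
        EndIf
    Next
Next

FileClose($hOut)
```

The dictionary lookup replaces the StringInStr() search over a growing string, which is the part of the UDF that is likely to slow down as the number of unique words grows.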
stg68 Posted April 21, 2008 (edited)
Thank you! It works!
Is there a way to make some cosmetic changes? When it writes from the array, it inserts an empty line, and the second line holds the total number of elements. Is there a way to avoid that?
Thanks!
Edited April 21, 2008 by stg68
herewasplato Posted April 21, 2008 (edited)
...When it writes from array it inserting an empty line and second line calculates total elements. Is there a way to avoid it?...
See the code below. I could not edit the code in this post without a forum barf.
Edited April 21, 2008 by herewasplato
herewasplato Posted April 21, 2008
New code after PM:

    ;OutputFileHandle
    $OFH = FileOpen("output.txt", 2)

    ; Check if file opened for writing OK
    If $OFH = -1 Then
        MsgBox(0, "Error", "Unable to open file.")
        Exit
    EndIf

    $var = StringReplace(FileRead("input.txt"), @CRLF, " ")
    $varArray = StringSplit($var, " ")
    _ArrayRemoveDuplicates($varArray, 1)

    ;write unique element count to the output file
    FileWriteLine($OFH, $varArray[0])

    ;start at 2 to avoid extra line at the beginning???
    For $i = 2 To $varArray[0]
        FileWriteLine($OFH, $varArray[$i])
    Next

    ;==================================================================
    ; Function Name:   _ArrayRemoveDuplicates()
    ;
    ; Description:     Removes duplicate elements from an Array
    ; Parameter(s):    $avArray
    ;                  $iBase
    ;                  $iCaseSense
    ;                  $sDelimter
    ; Requirement(s):  None
    ; Return Value(s): On Success - Returns 1 and the cleaned up Array is set
    ;                  On Failure - Returns -1 and sets @Error
    ;                   @Error=1 $avArray is not an array
    ;                   @Error=2 $iBase is different from 0 or 1
    ;                   @Error=3 $iCaseSense is different from 0 or 1
    ; Author:          uteotw, but ALL the credits go to nitro322 and SmOke_N, see link below
    ; Note(s):         None
    ; Link:            http://www.autoitscript.com/forum/index.php?showtopic=7821
    ; Example:         Yes
    ;==================================================================
    Func _ArrayRemoveDuplicates(ByRef $avArray, $iBase = 0, $iCaseSense = 0, $sDelimter = "")
        Local $sHold
        If Not IsArray($avArray) Then
            SetError(1)
            Return -1
        EndIf
        If Not ($iBase = 0 Or $iBase = 1) Then
            SetError(2)
            Return -1
        EndIf
        If $iBase = 1 And $avArray[0] = 0 Then
            SetError(0)
            Return 0
        EndIf
        If Not ($iCaseSense = 0 Or $iCaseSense = 1) Then
            SetError(3)
            Return -1
        EndIf
        If $sDelimter = "" Then
            $sDelimter = Chr(01) & Chr(01)
        EndIf
        If $iBase = 0 Then
            For $i = $iBase To UBound($avArray) - 1
                If Not StringInStr($sDelimter & $sHold, $sDelimter & $avArray[$i] & $sDelimter, $iCaseSense) Then
                    $sHold &= $avArray[$i] & $sDelimter
                EndIf
            Next
            $avNewArray = StringSplit(StringTrimRight($sHold, StringLen($sDelimter)), $sDelimter, 1)
            ReDim $avArray[$avNewArray[0]]
            For $i = 1 To $avNewArray[0]
                $avArray[$i - 1] = $avNewArray[$i]
            Next
        ElseIf $iBase = 1 Then
            For $i = $iBase To UBound($avArray) - 1
                If Not StringInStr($sDelimter & $sHold, $sDelimter & $avArray[$i] & $sDelimter, $iCaseSense) Then
                    $sHold &= $avArray[$i] & $sDelimter
                EndIf
            Next
            $avArray = StringSplit(StringTrimRight($sHold, StringLen($sDelimter)), $sDelimter, 1)
        EndIf
        Return 1
    EndFunc   ;==>_ArrayRemoveDuplicates
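For anyone puzzled by the count line and the blank entries discussed above, the small sketch below shows how StringSplit() lays out its result in default mode. The sample strings are made up for illustration. Repeated spaces are one possible source of empty elements, though I cannot say for sure that they caused the blank line in stg68's output.

```autoit
; Default mode: element 0 holds the number of pieces, the pieces start at index 1.
Local $aWords = StringSplit("one two three", " ")
ConsoleWrite("Count: " & $aWords[0] & @CRLF) ; Count: 3
For $i = 1 To $aWords[0]
    ConsoleWrite($i & ": " & $aWords[$i] & @CRLF)
Next

; Two consecutive delimiters produce an empty element between them.
Local $aGap = StringSplit("one  two", " ")
ConsoleWrite("Count: " & $aGap[0] & @CRLF) ; Count: 3 ("one", "", "two")
```

Writing such an array from index 0 puts the count into the output file, so looping from 1 to $aWords[0] and skipping empty elements avoids both cosmetic problems.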