yton Posted May 6, 2010

Greetings,
I have a big text file with data as follows:

phrase1; phrase2; phrase3
phrase4; phrase 5
phrase6; phrase 7; phrase8
etc...

I need to find phrases of 2 or more words and remove them, so the desired output is the text file with only 1-word phrases.
Please help.
Thank you.
99ojo Posted May 6, 2010 (edited)

Hi,

    #include <file.au3>
    #include <array.au3>

    Global $arphrases

    ;read text file into array
    _FileReadToArray ("c:\mybigtextfile.txt", $arphrases)
    ;loop over array from last to 1st item
    For $i = UBound ($arphrases) - 1 To 1 Step -1
        ;performing a StringSplit
        $temp = StringSplit ($arphrases [$i], ";")
        ;more than 1 element -> delete item in array
        If $temp [0] > 1 Then
            _ArrayDelete ($arphrases, $i)
        EndIf
    Next
    ;Display result
    _ArrayDisplay ($arphrases)
    ;if you want, write into origin file or to another
    ;_FileWriteFromArray ("c:\result.txt", $arphrases, 1)

;-))
Stefan

@edit: Missed the [$i] at line with StringSplit -> corrected
@edit: rereading thread: if you want phrase1; phrase2; phrase3 to become only phrase1:

    #include <file.au3>
    ;only needed for array display then...
    #include <array.au3>

    Global $arphrases

    ;read text file into array
    _FileReadToArray ("c:\mybigtextfile.txt", $arphrases)
    ;loop over array from last to 1st item
    For $i = UBound ($arphrases) - 1 To 1 Step -1
        ;performing a StringSplit
        $temp = StringSplit ($arphrases [$i], ";")
        ;more than 1 element -> keep only the first phrase
        If $temp [0] > 1 Then
            $arphrases [$i] = $temp [1]
        EndIf
    Next
    ;Display result
    _ArrayDisplay ($arphrases)
    ;if you want, write into origin file or to another
    ;_FileWriteFromArray ("c:\result.txt", $arphrases, 1)

Edited May 6, 2010 by 99ojo
yton Posted May 6, 2010

Well, the question is how to find these 2+ word phrases?
99ojo Posted May 6, 2010

Hi,
please post an example of the input file and the output you expect. I don't have any idea what you want.

;-))
Stefan
yton Posted May 6, 2010

input txt file:

phrase1; phrase2; phrase3
phrase4; phrase 5
phrase6; phrase 7; phrase8
etc...

output txt file:

phrase1; phrase2
phrase4; phrase 5
phrase8
etc...

where the deleted phrase3, 6 and 7 consist of 2+ words (e.g. "amazing blue car", "fine french wine", etc.; the quotation marks are only for the example).
99ojo Posted May 6, 2010 (edited)

Hi,
now it's a little bit clearer. I think this does what you want:

    #include <file.au3>
    #include <array.au3>

    Global $arphrases

    ;read text file into array
    _FileReadToArray ("c:\mybigtextfile.txt", $arphrases)
    ;loop over array from last to 1st item
    For $i = UBound ($arphrases) - 1 To 1 Step -1
        ;StringSplit to get separate phrases
        $temp = StringSplit ($arphrases [$i], ";")
        $string = ""
        ;loop over return array from StringSplit
        For $j = 1 To $temp [0]
            ;strip leading and trailing whitespace so the rebuilt line has no double blanks
            $phrase = StringStripWS ($temp [$j], 3)
            ;StringSplit to get separate words in phrase
            $temp1 = StringSplit ($phrase, " ")
            ConsoleWrite ($phrase & @CRLF)
            ;if the phrase has fewer than three words, save it into string and 'rebuild' phrases
            If $temp1 [0] < 3 Then
                $string &= $phrase & "; "
            EndIf
        Next
        ;there is at least one phrase with fewer than 3 words
        If $string <> "" Then
            ;save string into array, get rid of last blank and ;
            $arphrases [$i] = StringTrimRight ($string, 2)
        Else
            ;all phrases have 3 or more words -> delete item in array
            _ArrayDelete ($arphrases, $i)
        EndIf
    Next
    ;Display result
    _ArrayDisplay ($arphrases)
    ;if you want, write into origin file or to another
    ;_FileWriteFromArray ("c:\result.txt", $arphrases, 1)

;-))
Stefan

@Edit: Did some code corrections...

Edited May 6, 2010 by 99ojo
yton Posted May 6, 2010

Not really. I need to browse for $arphrases, as I do not know them all because the file is very big.
yton Posted May 6, 2010 (edited)

It's clear to me now, thanks! :)

Edited May 6, 2010 by yton
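A note for later readers: 99ojo's last script keeps phrases of up to two words (the "< 3" test), while the first post asked for 1-word phrases only. Below is a minimal sketch of the same nested StringSplit idea wrapped in a function with an adjustable word limit. The function name _KeepShortPhrases and its $iMaxWords parameter are made up for illustration and are not part of any AutoIt UDF; only the built-in StringSplit, StringStripWS and StringTrimRight are used.

```autoit
; Hypothetical helper (not from the thread, not a standard UDF):
; returns $sLine with every phrase of more than $iMaxWords words removed.
; Phrases are assumed to be separated by ";" as in the sample data.
Func _KeepShortPhrases($sLine, $iMaxWords = 1)
    Local $aPhrases = StringSplit($sLine, ";")
    Local $sResult = ""
    For $i = 1 To $aPhrases[0]
        ; flag 7 = strip leading (1) + trailing (2) whitespace and
        ; collapse repeated whitespace between words (4)
        Local $sPhrase = StringStripWS($aPhrases[$i], 7)
        ; skip empty pieces, e.g. after a trailing ";"
        If $sPhrase = "" Then ContinueLoop
        ; count the words of the phrase
        Local $aWords = StringSplit($sPhrase, " ")
        If $aWords[0] <= $iMaxWords Then $sResult &= $sPhrase & "; "
    Next
    ; drop the trailing "; " (returns "" if no phrase survived)
    Return StringTrimRight($sResult, 2)
EndFunc

; Example call with made-up data
ConsoleWrite(_KeepShortPhrases("red; amazing blue car; wine") & @CRLF)
; prints: red; wine
```

Inside the For $i loop of the script above, this could replace the inner For $j loop: assign the return value to $arphrases [$i], and call _ArrayDelete when it comes back empty. Passing 2 as the second argument reproduces the "< 3" behaviour.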