Carm01 Posted October 4, 2014

All,

Part of my code is below. It takes a specifically formatted text file and searches it for specific words or phrases to include or exclude (the code below is the exclude part). The include and exclude lists are kept in separate txt files, one for each. I finally understood how to make the match case-insensitive, but after several hours of failed attempts and looking up examples I am still having difficulty including multiple words or phrases that contain spaces. There is a plethora of examples that strip spaces, but I need the spaces, as they are part of what I am searching for.

For example, if my include file has:

    red
    sugar

and my exclude file has:

    sugar coated

then the output should be the sentences (or file lines) that contain the included words red or sugar. But when I exclude the phrase 'sugar coated', the match appears to see only the word 'sugar' and to ignore the rest of the phrase 'sugar coated', so the whole line is handled on the word 'sugar' alone.

That is a perfect example. Here is a piece of code I threw together as a subroutine for the exclude portion:

    Func verify1()
        For $k = 1 To $d
            $line66 = "(?i) " & FileReadLine($file6, $k)
            If StringRegExp($line, $line66) = 1 Then
                $Sno = 1
                ExitLoop
            EndIf
        Next
        $Sno = 0
    EndFunc   ;==>verify1

The lines in question are:

    $line66 = "(?i) " & FileReadLine($file6, $k)
    If StringRegExp($line, $line66) = 1 Then

This is what I am using to detect the exclude text. I have something similar for the include text. I need to understand better how to allow for phrases, numbers, spaces and any special character, regardless of how many words or spaces there are. This is not as straightforward as I thought.

Thank you in advance.

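Two details in verify1() as posted look like they could cause trouble on their own: if the space after `(?i)` is really in the script, it becomes part of the pattern and must be matched too, and `$Sno = 0` after the loop runs even when the loop was left through ExitLoop, so the flag always ends up 0. A minimal sketch of the same check that returns its result instead of setting a global is shown below. The function name `_IsExcluded`, its parameters and the sample phrases are assumptions made for illustration, not part of the original script.

```autoit
; Hypothetical rewrite of verify1(): returns True when $sLine contains any
; phrase from the exclude list. $aPhrases is assumed to be a 0-based array
; of phrases, for example the result of FileReadToArray("exclude.txt").
Func _IsExcluded($sLine, $aPhrases)
    For $i = 0 To UBound($aPhrases) - 1
        ; Skip blank lines: an empty pattern would match every line.
        If $aPhrases[$i] = "" Then ContinueLoop
        ; (?i) = case-insensitive. No space after it, otherwise the space
        ; would have to be present in front of the phrase as well.
        If StringRegExp($sLine, "(?i)" & $aPhrases[$i]) Then Return True
    Next
    Return False ; reached only when no phrase matched
EndFunc   ;==>_IsExcluded

; Made-up sample data.
Local $aExclude[2] = ["sugar coated", "grill"]
ConsoleWrite(_IsExcluded("I have a Sugar Coated brown fox.", $aExclude) & @CRLF) ; True
ConsoleWrite(_IsExcluded("I have a tar coated brown fox.", $aExclude) & @CRLF) ; False
```

The phrases are still used as raw regular expressions here, so a phrase containing characters such as `^`, `.` or `(` would need quoting before it matches literally.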
Jury Posted October 5, 2014

So, for example, you want to exclude the phrase 'sugar coated' from all lines in the file, and then find every line in the file that contains red or sugar?

Kyan Posted October 5, 2014

It is somewhat hard to understand what you're trying to achieve. You just want to capture phrases containing words present in your include.txt? And if a line has some word/sentence from exclude.txt, you simply ignore it?

I would say you cannot do it that way with regex. For example, your include file will match all 'sugar' phrases, even if 'coated' comes next. You will need to collect all the lines that match the include list, and then subtract the matches of the exclude list.

Here's one with a single regex that looks for 'sugar' without 'coated' following it, or for the word 'red'...

    Local $hFile, $sLine, $l = 1
    $hFile = FileOpen(@DesktopDir & "\MyText.txt")
    ; Note: \s requires a whitespace character right after "sugar"
    $xIncludeExp = 'sugar\s(?!coated)|red'
    While 1
        $sLine = FileReadLine($hFile, $l)
        If @error Then ExitLoop ; end of file (or read error)
        If StringRegExp($sLine, '(?i)' & $xIncludeExp) = 1 Then ConsoleWrite($l & " OK" & @LF)
        $l += 1
    WEnd
    FileClose($hFile)
    Exit

If you want to export matched lines, just replace

    If StringRegExp($sLine, '(?i)' & $xIncludeExp) = 1 Then ConsoleWrite($l & " OK" & @LF)

with

    ; @DesktopDir has no trailing backslash, so the file name needs a leading one
    If StringRegExp($sLine, '(?i)' & $xIncludeExp) = 1 Then FileWriteLine(@DesktopDir & "\YourCapturedPhrases.txt", $sLine)

jguinch Posted October 6, 2014

Carm01, I don't really understand what you want to do... Could you post some small examples of the include/exclude files and sentences, plus the result you expect? It will be clearer for everyone.

kylomas Posted October 7, 2014 (edited October 7, 2014)

Carm01,

Something like this?

    Local $exclude, $text, $pattern = '(?i)'

    ; simulate text file
    $text &= 'I have a sugar coated brown fox.' & @CRLF
    $text &= 'Now I want to grill it. No, not ask questions of it, but cook it.' & @LF
    $text &= 'Does anyone have a good brown fox^ recipe?'

    ; simulate exclude file
    $exclude &= 'sugar' & @LF
    $exclude &= 'grill' & @CRLF
    $exclude &= 'fox^ recipe' & @CRLF

    ; handle SRE special chars, if needed
    $exclude = StringRegExpReplace($exclude, '[\^]', '\\^')

    ; one array element per line of the exclude list
    Local $aExclude = StringRegExp($exclude, '(.*)\R', 3)

    ; join the entries into one alternation
    For $i = 0 To UBound($aExclude) - 1
        $pattern &= $i < UBound($aExclude) - 1 ? $aExclude[$i] & ' |' : ' ' & $aExclude[$i]
    Next

    ConsoleWrite('' & @CRLF)
    ConsoleWrite('! --- Pattern = [' & $pattern & ']' & @CRLF)
    ConsoleWrite('' & @CRLF)
    ; remove every excluded word or phrase from the text
    ConsoleWrite(StringRegExpReplace($text, $pattern, '') & @CRLF)
    ConsoleWrite('' & @CRLF)

kylomas

edit: added SRE char handling for "^"

jchd Posted October 7, 2014

kylomas,

You can put \Q...\E to good use to ignore special characters or subpatterns. Of course you could also guard against \E occurring by itself inside the search word, but that may be unlikely enough to ignore.

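The \Q...\E suggestion can be sketched as follows. Everything between \Q and \E is taken literally by PCRE, so a phrase read from a list file can be matched as plain text. The helper name `_QuoteLiteral` and the sample phrase are assumptions made for illustration; the helper also guards against a literal `\E` inside the phrase, the unlikely case mentioned above.

```autoit
; Hypothetical helper: wrap $sText in \Q...\E so PCRE treats it literally.
; A literal "\E" inside the text would end the quoting early, so it is
; replaced by: close quote, escaped backslash followed by E, reopen quote.
Func _QuoteLiteral($sText)
    ; Last argument 1 = case-sensitive, so a lowercase "\e" is left alone.
    Return "\Q" & StringReplace($sText, "\E", "\E\\E\Q", 0, 1) & "\E"
EndFunc   ;==>_QuoteLiteral

Local $sPhrase = "fox^ recipe (v1.0)" ; made-up phrase with metacharacters
Local $sText = "Does anyone have a good brown FOX^ recipe (v1.0)?"

; Unquoted: ^ . ( ) are read as regex syntax, so the phrase does not match itself.
ConsoleWrite(StringRegExp($sText, "(?i)" & $sPhrase) & @CRLF) ; 0
; Quoted: the phrase is matched character for character, ignoring case.
ConsoleWrite(StringRegExp($sText, "(?i)" & _QuoteLiteral($sPhrase)) & @CRLF) ; 1
```

Quoting the whole phrase avoids having to list and escape each metacharacter separately, which is what the `[\^]` replacement in the earlier example does for a single character.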
kylomas Posted October 7, 2014

@jchd - Thanks...

Carm01 (Author) Posted October 9, 2014 (edited October 9, 2014)

@kylomas, this is closer, but... Instead of removing the word from the sentence, the entire sentence should be ignored, so that nothing is written.

If the sentence reads 'I have a sugar coated brown fox.' then that entire sentence would be passed over and no line written.

If the sentence reads 'I have a tar coated brown fox.' then the entire sentence would be captured and exported as a line to a new file.

I also have an include list of what I am searching for. If the include list has the word/phrase 'tar coated', it would find sentences with those words or phrases, BUT if a sentence also had anything from the exclude list, nothing would be written.

If the sentence was 'I have a sugar coated brown fox and a tar coated brown fox', nothing would be written, because a word or phrase in the sentence is on the ban list.

I should have included that part. Upper and lower case should be treated equally. Sorry for not making that clear, and thank you all for your assistance with this.

I also included some example txt files: the lists, the raw text and the results. I am really looking for how to use StringRegExp so that it treats upper and lower case equally and recognizes phrases, numbers, symbols, etc.

Attachments: exclude.txt, include.txt, rawtext.txt, results.txt

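The rule described in this post (keep a line only if it contains something from the include list and nothing from the exclude list) can be written down without any regular expression. The sketch below uses StringInStr, which is case-insensitive by default and compares text literally, purely to pin down the expected behaviour on the three sentences above. It is not the StringRegExp approach asked for, and the helper name `_ContainsAny` and the in-memory lists are assumptions made for illustration.

```autoit
; Returns True when $sLine contains any entry of $aList
; (literal comparison, case-insensitive by default in StringInStr).
Func _ContainsAny($sLine, $aList)
    For $i = 0 To UBound($aList) - 1
        If $aList[$i] <> "" And StringInStr($sLine, $aList[$i]) Then Return True
    Next
    Return False
EndFunc   ;==>_ContainsAny

; Stand-ins for include.txt, exclude.txt and the raw text.
Local $aInclude[1] = ["tar coated"]
Local $aExclude[1] = ["sugar coated"]
Local $aLines[3] = ["I have a sugar coated brown fox.", _
        "I have a tar coated brown fox.", _
        "I have a sugar coated brown fox and a tar coated brown fox"]

For $i = 0 To UBound($aLines) - 1
    ; Keep = something from the include list AND nothing from the exclude list.
    Local $bKeep = _ContainsAny($aLines[$i], $aInclude) And Not _ContainsAny($aLines[$i], $aExclude)
    ConsoleWrite(($bKeep ? "keep: " : "skip: ") & $aLines[$i] & @CRLF)
Next
```

With these lists only the second sentence is kept; the first has no include phrase and the third contains a banned phrase.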
jchd Posted October 9, 2014 (Solution, edited October 9, 2014)

Carm01,

I have a hard time figuring out what you want exactly. The keyword here is "exactly". So let me ask some questions to make things clearer:

A/ Should input be treated literally?
   Input: "I'm digging for diamond, silver and ---- gold"
   Should that match "silver and gold" from the include list?

B/ Should include/exclude lists be treated literally?
   Input: "I'm digging for diamond, silver and gold"
   Should that match "silver */and/* gold" from the include list?

C/ What is the typical size (in characters) of the include/exclude lists?

If the answers are:

A/ Yes. No.
B/ Yes. No.
C/ Small enough/small enough

then this should work for you:

    #include <Array.au3>

    ; Lookahead: the line must contain at least one include entry, each quoted with \Q...\E
    Local $aText = FileReadToArray("include.txt")
    Local $sInclude = "(?=.*(?:\Q" & _ArrayToString($aText, "\E|\Q") & "\E))"

    ; Negative lookahead: the line must not contain any exclude entry
    $aText = FileReadToArray("exclude.txt")
    Local $sExclude = "(?!.*(?:\Q" & _ArrayToString($aText, "\E|\Q") & "\E))"
    $aText = 0 ; release the array

    Local $sInput = FileRead("rawtext.txt")
    ; (?i) case-insensitive, (?m) ^ and $ match at every line; flag 3 returns all matching lines
    Local $aResult = StringRegExp($sInput, "(?im)" & $sExclude & $sInclude & "^.*(?:$|\R)", 3)
    _ArrayDisplay($aResult)

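To see what the solution's pattern looks like once the lists are joined, the sketch below builds it from small in-memory arrays instead of files and prints the pieces. The sample phrases and input lines are made up for illustration; the pattern construction itself follows the code in the post above.

```autoit
#include <Array.au3>

; Stand-ins for include.txt, exclude.txt and rawtext.txt (made-up content).
Local $aInclude[2] = ["red", "sugar"]
Local $aExclude[1] = ["sugar coated"]
Local $sInput = "Red apples are sweet." & @CRLF & _
        "I have a Sugar Coated brown fox." & @CRLF & _
        "Brown sugar melts." & @CRLF & _
        "Nothing to see here."

; Each entry is quoted with \Q...\E and the entries are joined with |.
Local $sInclude = "(?=.*(?:\Q" & _ArrayToString($aInclude, "\E|\Q") & "\E))"
Local $sExclude = "(?!.*(?:\Q" & _ArrayToString($aExclude, "\E|\Q") & "\E))"
ConsoleWrite($sInclude & @CRLF) ; (?=.*(?:\Qred\E|\Qsugar\E))
ConsoleWrite($sExclude & @CRLF) ; (?!.*(?:\Qsugar coated\E))

; Both lookaheads are tested at the start of each line; .* then captures the line.
Local $aResult = StringRegExp($sInput, "(?im)" & $sExclude & $sInclude & "^.*(?:$|\R)", 3)
For $i = 0 To UBound($aResult) - 1
    ConsoleWrite($aResult[$i] & @CRLF)
Next
```

With this sample data the loop should print "Red apples are sweet." and "Brown sugar melts."; the second line is rejected by the exclude lookahead and the fourth has no include phrase. A blank line in either list file would produce an empty `\Q\E` alternative that matches every line, so blank entries are worth removing before joining.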
Carm01 (Author) Posted October 11, 2014

@jchd, I like how easily this was done. It is 100% different from my method of reading the file line by line and then writing line by line, which was easy for me, but this little thing is stupid fast.

Technically this is what I am looking for in the searching part, but I would like to output the result to a file instead of an array display box.

guinness Posted October 11, 2014

Which is trivial to do. Look at the For...To loop in the help file and learn about arrays.

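A minimal sketch of the For...To loop approach, writing each array element as one line of a text file. The array content and the output file name are stand-ins chosen for illustration; in the real script `$aResult` would be the array returned by StringRegExp.

```autoit
; Stand-in for the array returned by StringRegExp(..., 3), which is 0-based.
Local $aResult[2] = ["Red apples are sweet.", "Brown sugar melts."]
Local $sFilePath = @ScriptDir & "\results.txt" ; hypothetical output file

Local $hFile = FileOpen($sFilePath, 2) ; 2 = write mode, erases previous contents
If $hFile = -1 Then
    ConsoleWrite("Cannot open " & $sFilePath & @CRLF)
    Exit 1
EndIf

; One array element per line; FileWriteLine adds the line ending when needed.
For $i = 0 To UBound($aResult) - 1
    FileWriteLine($hFile, $aResult[$i])
Next
FileClose($hFile)
```

Opening the file once and passing the handle avoids reopening it for every line, which matters when the result array is large.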
mikell Posted October 11, 2014

Or look at _FileWriteFromArray in the help file, under UDFs / File Management.

jchd Posted October 11, 2014

Or reverse the idea and use StringRegExpReplace to void any line which does NOT satisfy the requirements. Exercise left to the readers.

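One possible take on that exercise, offered as a sketch rather than as the intended answer: a line fails the requirements when it contains an exclude phrase or lacks every include phrase, so a pattern that matches exactly those lines can be replaced with an empty string. The lists are assumed to be already quoted with \Q...\E and joined with `|`, and the sample data is made up.

```autoit
; Hypothetical lists, already quoted and joined into alternations.
Local $sInclude = "\Qred\E|\Qsugar\E"
Local $sExclude = "\Qsugar coated\E"
Local $sInput = "Red apples are sweet." & @CRLF & _
        "I have a Sugar Coated brown fox." & @CRLF & _
        "Brown sugar melts." & @CRLF & _
        "Nothing to see here."

; At the start of each line: match when an exclude phrase is present,
; OR when no include phrase is present; then consume the line and its line break.
Local $sPattern = "(?im)^(?:(?=.*(?:" & $sExclude & "))|(?!.*(?:" & $sInclude & "))).*\R?"

; Lines that fail the requirements are replaced by nothing; the rest remain.
Local $sKept = StringRegExpReplace($sInput, $sPattern, "")
ConsoleWrite($sKept)
```

With this sample data the remaining text should be the "Red apples" and "Brown sugar" lines. The result is a single string, so it can be handed to FileWrite directly without looping over an array.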
Carm01 (Author) Posted October 11, 2014

@mikell, at the end:

    ;_ArrayDisplay($aResult)
    ; _FileWriteFromArray needs #include <File.au3>
    Local $sFilePath = @ScriptDir & "\Examples.txt" ; @ScriptDir has no trailing backslash
    ; Base 0: the array returned by StringRegExp with flag 3 starts at index 0
    _FileWriteFromArray($sFilePath, $aResult, 0)

I guess it was too late/early for my brain...