vinnyMS Posted April 19, 2021

I need a script that extracts every sentence containing a word from a list. The result should be a text file holding the extracted sentences, with the period as the sentence delimiter: a sentence is whatever lies between one period and the next.

Word list text file:
word 1
word 2
word 3

Extracted sentences, written to the "sentence" text file:
this is word 1 sentence.
this is word 2 sentence.
this is word 3 sentence.

Musashi Posted April 19, 2021

Just to understand better: is this what you want?

Source text:
Sentence 1 without the searched term.Sentence 2 is word 1 sentence.Sentence 3 without the searched term.Sentence 4 without the searched term.Sentence 5 is word 2 sentence.Sentence 6 is word 3 sentence.Sentence 7 without the searched term.

Word list (as text file):
word 1 , word 2 , word 3

Result text:
Sentence 2 is word 1 sentence.Sentence 5 is word 2 sentence.Sentence 6 is word 3 sentence.

By the way: it would be helpful if you could provide a source text and the word list as text files. Only a few helpers have the time and passion to create the files themselves.

"In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move."

mikell Posted April 19, 2021 (edited)

On 4/19/2021 at 6:13 AM, Musashi said: Only a few helpers have time and passion to create the files themselves

... but some are passionate guys who create something similar themselves, so this allows a first try:

    #Include <Array.au3>

    ; search terms, joined with "|" so they form a regex alternation
    $p = "word 1|word 2|word 3"
    $txt = " Sentence 1 without the searched term. Sentence 2 is word 1 sentence. " & @crlf & _
        "Sentence 3 without the searched term. Sentence 4 without the searched term. Sentence 5, is word 2 sentence. " & @crlf & _
        "Sentence 6 is word 3 sentence. Sentence 7 is otherword 3 sentence. Sentence 8 without the searched term. "
    ; flag 3 returns an array of all matches; each match runs up to and including the closing period
    $res = StringRegExp($txt, '(?s)\s*([^.]+\b(?|' & $p & ')\b[^.]+\.)', 3)
    _ArrayDisplay($res)

Edit: waiting now for new requirements to come.

Edited April 19, 2021 by mikell

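To see what the \b anchors in the pattern above contribute, here is a minimal sketch using two of the sample sentences from the post. It relies only on the default mode of StringRegExp (flag 0), which returns 1 when the pattern matches and 0 when it does not.

```autoit
; "word 3" stands alone here, so \b matches on both sides of it
ConsoleWrite(StringRegExp("Sentence 6 is word 3 sentence.", "\bword 3\b") & @CRLF)      ; prints 1

; in "otherword 3" the term is preceded by a letter, so the leading \b fails
ConsoleWrite(StringRegExp("Sentence 7 is otherword 3 sentence.", "\bword 3\b") & @CRLF) ; prints 0
```

This is why "Sentence 7" is in the sample text: it checks that a term embedded in a longer word is not picked up.
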
Alecsis1 Posted April 19, 2021

Hello! Try something like this. Btw, sorry for my bad English…

vimmyMS.zip

JockoDundee Posted April 19, 2021

On 4/19/2021 at 7:11 AM, Alecsis1 said: Btw, sorry for my bad English…

I doubt vimmy even cares about such things.

Code hard, but don’t hard code...

Musashi Posted April 19, 2021 (edited)

On 4/19/2021 at 8:16 AM, JockoDundee said: I doubt vimmy even cares about such things

I doubt that too.

@Alecsis1: As far as I could tell from a quick test, your script also delivers the desired result. However, the RegEx variant from @mikell is much shorter (as usual).

BTW: I would remove the following directive:

    [...]
    #pragma compile(UPX, True)
    [...]

AV scanners react badly to UPX-compressed executables. Use #AutoIt3Wrapper_UseUpx = N or #pragma compile(UPX, False) (which is the default) instead.

Edited April 19, 2021 by Musashi (typo)

Nine Posted April 19, 2021

An hybrid solution maybe ?

    #include <Constants.au3>

    $p = "word 1|word 2|word 3"
    $txt = "Sentence 1 without the searched term. Sentence 2 is word 1 sentence. " & @crlf & _
        "Sentence 3 without the searched term. Sentence 4 without the searched term. Sentence 5, is word 2 sentence. " & @crlf & _
        "Sentence 6 is word 3 sentence. Sentence 7 is otherword 3 sentence. Sentence 8 without the searched term."
    ; split at every period, then keep only the pieces that contain a listed term
    $aSentence = StringSplit($txt, ".", $STR_NOCOUNT)
    ; UBound - 2 skips the empty element that follows the final period
    For $i = 0 to UBound($aSentence) - 2
        If StringRegExp($aSentence[$i], "\b(" & $p & ")\b") Then FileWriteLine("Result.txt", StringStripWS($aSentence[$i], $STR_STRIPLEADING+$STR_STRIPTRAILING))
    Next

“They did not know it was impossible, so they did it” ― Mark Twain

mikell Posted April 19, 2021

On 4/19/2021 at 11:27 AM, Musashi said: delivers the desired result

I confess I omitted some details because it sounded a bit like spoon-feeding.

    #Include <Array.au3>

    #cs
    1.txt :
    word 1
    word 2
    word 3
    #ce
    ; one term per line in 1.txt; the line breaks become "|" to build the alternation
    $p = StringReplace(StringStripWS(FileRead("1.txt"), 3), @crlf, "|")
    ;$p = "word 1|word 2|word 3"

    #cs
    2.txt :
    Sentence 1 without the searched term. Sentence 2 is word 1 sentence.
    Sentence 3 without the searched term. Sentence 4 without the searched term. Sentence 5, is word 2 sentence.
    Sentence 6 is word 3 sentence. Sentence 7 is otherword 3 sentence. Sentence 8 without the searched term.
    #ce
    $txt = FileRead("2.txt")
    ;$txt = " Sentence 1 without the searched term. Sentence 2 is word 1 sentence. " & @crlf & "Sentence 3 without the searched term. Sentence 4 without the searched term. Sentence 5, is word 2 sentence. " & @crlf & "Sentence 6 is word 3 sentence. Sentence 7 is otherword 3 sentence. Sentence 8 without the searched term. "

    $res = StringRegExp($txt, '(?s)\s*([^.]+\b(?|' & $p & ')\b[^.]+\.)', 3)
    ;_ArrayDisplay($res)
    ; write the matching sentences to result.txt, one per line
    FileWrite("result.txt", _ArrayToString($res, @crlf))

vinnyMS (Author) Posted April 19, 2021 (edited)

On 4/19/2021 at 3:49 PM, mikell said: I confess I omitted some details because it sounded a bit like spoon-feeding [...]

Thank you, it works, except that it adds the words from text file 1 at the end of result.txt.

Edited April 22, 2021 by vinnyMS

vinnyMS (Author) Posted April 19, 2021

Can you make it extract only 3 sentences each time?

Nine Posted April 19, 2021 (edited)

This ?

    #include <Constants.au3>

    ; \Q...\E quotes each term so that characters such as "/" or "(" are taken literally
    $p = "\Q" & StringReplace(StringStripWS(FileRead("1.txt"), 3), @CRLF, "\E|\Q") & "\E"
    $NUMBER_OF_LINES = 3
    $txt = "Sentence 1 without the searched term? Sentence 2 is (TCP/IP) sentence. " & @crlf & _
        "Sentence 3 without the searched term! Sentence 4 without the searched term? Sentence 5, is word 2 sentence. " & @crlf & _
        "Sentence 6 is word 3 sentence. Sentence 7 is otherword 3 sentence. Sentence 8 without the searched term ?"
    ;$txt = FileRead("2.txt")
    ; split at every ".", "?" or "!"
    $aSentence = StringSplit($txt, ".?!", $STR_NOCOUNT)
    ; only the first $NUMBER_OF_LINES sentences are checked
    For $i = 0 to $NUMBER_OF_LINES - 1
        ; (?:...) groups the alternatives so that \W applies to every term, not just the first and last
        If StringRegExp($aSentence[$i], "\W(?:" & $p & ")\W") Then FileWriteLine("Result.txt", StringStripWS($aSentence[$i], $STR_STRIPLEADING+$STR_STRIPTRAILING))
    Next

Edited April 19, 2021 by Nine

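The loop above inspects only the first $NUMBER_OF_LINES sentences of the text. If the request was instead to stop once three matching sentences have been found, a sketch along the following lines might do. The function name _FirstMatches and its parameters are made up for illustration, $sTerms is assumed to be a ready-made alternation such as "word 1|word 2|word 3", and sentences are assumed to end at a period.

```autoit
#include <StringConstants.au3>

; Hypothetical helper (not from the thread): returns up to $iMax sentences of
; $sText that contain one of the terms in $sTerms, one sentence per line.
Func _FirstMatches($sText, $sTerms, $iMax)
    Local $aSentence = StringSplit($sText, ".", $STR_NOCOUNT)
    Local $sOut = "", $iFound = 0
    For $i = 0 To UBound($aSentence) - 1
        ; stop as soon as enough matching sentences were collected
        If $iFound >= $iMax Then ExitLoop
        If StringRegExp($aSentence[$i], "\b(?:" & $sTerms & ")\b") Then
            ; StringSplit dropped the period, so it is added back here
            $sOut &= StringStripWS($aSentence[$i], $STR_STRIPLEADING + $STR_STRIPTRAILING) & "." & @CRLF
            $iFound += 1
        EndIf
    Next
    Return $sOut
EndFunc

Local $sText = "Sentence 1 without the searched term. Sentence 2 is word 1 sentence. " & _
    "Sentence 5, is word 2 sentence. Sentence 6 is word 3 sentence."
ConsoleWrite(_FirstMatches($sText, "word 1|word 2|word 3", 2))
```

With a limit of 2, the call should stop after "Sentence 2" and "Sentence 5" and never look at "Sentence 6". Writing the returned string with FileWrite instead of ConsoleWrite would produce the result file.
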
vinnyMS (Author) Posted April 19, 2021

This extracts a sentence up to the period, removes the period, and extracts the next sentence that does have the word in it. It then saves a result with all the extracted sentences, including what I don't need.

2.txt

Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications. The answer to the question What is a protocol? must begin with the question What is a network? Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications. The answer to the question What is a protocol? must begin with the question What is a network? Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications. The answer to the question What is a protocol? must begin with the question What is a network?

result.txt

Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications The answer to the question What is a protocol? must begin with the question What is a network? Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications The answer to the question What is a protocol? must begin with the question What is a network? Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system—a collection of protocols that supports network communications

Nine Posted April 19, 2021

I think I gave you enough tools to work with (as well as the others). Adjust the code to fit your needs now.

JockoDundee Posted April 19, 2021

On 4/19/2021 at 1:32 PM, Nine said: An hybrid solution maybe ?

Did you use “An” because H is silent in French?

Nine Posted April 19, 2021

On 4/19/2021 at 5:12 PM, JockoDundee said: Did you use “An” because H is silent in French?

So you are telling I should have used a hybrid ?

JockoDundee Posted April 19, 2021

On 4/19/2021 at 5:54 PM, Nine said: So you are telling I should have used a hybrid

Yes. We say “An hour”, but “A history”. Or “An unsigned integer”, but “A Ulimit”. It depends on whether the word after the article starts with a consonant sound or not.

Nine Posted April 19, 2021

Ahhh. Always had a hard time with languages. One of my profs told me once that I speak better Fortran than I speak French.

JockoDundee Posted April 19, 2021

On 4/19/2021 at 6:21 PM, Nine said: Always had a hard time with languages.

No, you’re actually correct. Because of your Demain comme jamais tag, whenever I read your posts, I can’t help but hear them (in my mind’s ear) in a thick French accent. So I heard “An eye-brid solution”, which is perfect.

Nine Posted April 19, 2021

vinnyMS (Author) Posted April 19, 2021

I tried to fix it, but I can't find how to modify it to make it work.

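One possible way to adapt the posted code to the TCP/IP sample, offered as a sketch rather than as the thread's solution: extract whole sentences with StringRegExp instead of StringSplit, so the closing punctuation is kept, and treat ".", "?" or "!" as a sentence end only when it is followed by whitespace and a capital letter (or by the end of the text). That rule is a heuristic assumed here so that "What is a protocol? must begin ..." stays in one piece; it is not something the thread specifies. The function name _SentencesWithTerms is hypothetical, and terms are assumed not to contain the sequence \E.

```autoit
#include <StringConstants.au3>

; Hypothetical helper (not from the thread): returns the sentences of $sText that
; contain one of the literal terms in $aTerms, each with its end mark kept.
Func _SentencesWithTerms($sText, $aTerms)
    ; quote every term with \Q...\E so characters like "/" or "(" are literal
    Local $sAlt = ""
    For $i = 0 To UBound($aTerms) - 1
        If $i > 0 Then $sAlt &= "|"
        $sAlt &= "\Q" & $aTerms[$i] & "\E"
    Next
    ; assumed heuristic: a sentence ends at . ? or ! only when followed by
    ; whitespace plus an ASCII capital letter, or by the end of the text
    Local $aSentence = StringRegExp($sText, "(?s)(\S.*?[.?!])(?=\s+[A-Z]|\s*\z)", $STR_REGEXPARRAYGLOBALMATCH)
    If @error Then Return "" ; no sentence found
    Local $sOut = ""
    For $i = 0 To UBound($aSentence) - 1
        ; the lookarounds require that the term is not glued to a letter, digit or underscore
        If StringRegExp($aSentence[$i], "(?<!\w)(?:" & $sAlt & ")(?!\w)") Then $sOut &= $aSentence[$i] & @CRLF
    Next
    Return $sOut
EndFunc

Local $aTerms[1] = ["TCP/IP"]
Local $sText = "Transmission Control Protocol/Internet Protocol (TCP/IP) is a protocol system. " & _
    "The answer to the question What is a protocol? must begin with the question What is a network? " & _
    "TCP/IP supports network communications."
ConsoleWrite(_SentencesWithTerms($sText, $aTerms))
```

On this shortened sample the sketch should return the first and the last sentence, periods included, and skip the one about protocols and networks. For real files, $aTerms could come from FileReadToArray("1.txt") and $sText from FileRead("2.txt"), with the returned string passed to FileWrite.
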