qwert 43 Posted June 17, 2019
I'm parsing text to determine whether each paragraph (terminated with a CRLF) ends with a proper punctuation mark. As I've added to the set of what's "proper", my method (parse statement) has become unwieldy (and still doesn't cover all cases):
$proper = StringInStr($paragraph, '.' & @CRLF) + StringInStr($paragraph, '?' & @CRLF) + StringInStr($paragraph, '!' & @CRLF) + StringInStr($paragraph, '"' & @CRLF) + StringInStr($paragraph, ';' & @CRLF)
Can someone suggest a better approach? Thanks in advance for any help.
JLogan3o13 1,639 Posted June 17, 2019
I'm a bit confused, are you checking only the punctuation mark at the very end of the paragraph, or the punctuation at the end of each sentence within? And when you say "proper punctuation" are you saying that it just needs to be some punctuation mark? Or do you expect it to know if that mark should be a "?" or a "!"?
qwert 43 Posted June 17, 2019
JLogan3o13 said: "only the punctuation mark at the very end of the paragraph"
Yes. And "yes", it needs to be some punctuation mark (not just a symbol like ^ for example). There will need to be 8 or 10 in the "dictionary" of proper marks.
Nine 993 Posted June 17, 2019
Local $Proper = StringInStr ('.!?";',StringMid ($paragraph, StringInStr ($paragraph, @CRLF)-1,1))
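For readers who want the one-liner above in reusable form, here is a minimal sketch. The function name _EndsWithProperMark, the parameter names, and the extra marks in the default set (the colon and closing parenthesis) are assumptions for illustration, not part of the thread. It also guards the case where no @CRLF is found.

```autoit
; Hypothetical helper (not from the thread): returns True when the character
; just before the first @CRLF in $sParagraph is one of the marks in $sMarks.
Func _EndsWithProperMark($sParagraph, $sMarks = '.!?";:)')
    Local $iPos = StringInStr($sParagraph, @CRLF)
    ; $iPos = 0 means no CRLF was found; $iPos = 1 means nothing precedes it.
    If $iPos < 2 Then Return False
    ; Take the single character immediately before the CRLF.
    Local $sLast = StringMid($sParagraph, $iPos - 1, 1)
    ; Look that character up in the "dictionary" of accepted marks.
    Return StringInStr($sMarks, $sLast) > 0
EndFunc

ConsoleWrite(_EndsWithProperMark('Is this fine?' & @CRLF) & @CRLF)
ConsoleWrite(_EndsWithProperMark('This ends with a comma,' & @CRLF) & @CRLF)
```

Extending the "dictionary" then only means adding characters to the $sMarks string. Trailing spaces before the CRLF are not handled here and would need to be stripped first.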
qwert 43 Posted June 17, 2019
@Nine: Well, that's cool: turn the algorithm around and look for a single isolated character in a string of choices. I like it! Thanks very much.
mikell 1,024 Posted June 17, 2019
Since I don't know the exact final purpose, this is a simple attempt (just for the concept):
#Include <Array.au3>
$s = "AutoIt v3 is a freeware BASIC-like scripting language designed for automating the Windows GUI and general scripting" & @crlf & _
    "It uses a combination of simulated keystrokes, mouse movement and window/control manipulation in order to automate tasks in a way not possible or reliable with other languages (e.g. VBScript and SendKeys); " & @crlf & _
    "AutoIt is also very small, self-contained and will run on all versions of Windows out-of-the-box with no annoying runtimes required"
; Normalize line ends: drop trailing horizontal whitespace and make every paragraph, including the last, end with @CRLF
$s = StringRegExpReplace($s, '\h*\R|$', @crlf)
; Collect each character that sits right before a CR and is not one of . ? ! ; "
$res = StringRegExp($s, '([^\.\?!;"])\r', 3)
_ArrayDisplay($res)
pixelsearch 227 Posted June 17, 2019
Hi all,
I tried a regexp approach, using a "negative lookbehind" assertion to search for the character before any newline sequence (\R matches @CRLF or a lone @CR or @LF) that is not one of those indicated by qwert. I typed ERROR HERE in the Replace Pattern tab. Here are the results, where a comma ending a line is detected:
[screenshot not included]
Thanks to our regexp gurus for commenting if the expression can be improved.
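The pattern from this post is not preserved in the text, so the following is only a sketch of what a negative-lookbehind version might look like, not pixelsearch's actual expression. The sample text and variable names are assumptions. One detail worth noting: \r is added to the lookbehind set, because otherwise the engine could match the lone @LF inside a @CRLF (whose preceding character is @CR) and flag a correct line.

```autoit
; Sample text (hypothetical): the first line ends with a comma, the second with a period.
Local $sText = 'First paragraph ends badly,' & @CRLF & 'Second paragraph is fine.' & @CRLF

; (?<![.?!;"\r]) is a negative lookbehind: it succeeds only when the character
; before the current position is NOT one of the listed marks (or a CR).
; \R then matches the newline sequence itself; it is captured as group 1.
Local $sPattern = '(?<![.?!;"\r])(\R)'

; Flag the offending line ends, putting the original newline back via $1.
Local $sMarked = StringRegExpReplace($sText, $sPattern, ' ERROR HERE$1')
ConsoleWrite($sMarked)
```

With this set, a line ending in a comma gets the ERROR HERE marker appended before its newline, while lines ending in one of the accepted marks are left alone. A completely empty paragraph would also be flagged, which may or may not be wanted.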