Servant

Delete the 2nd until the last sentence of a set paragraph range on a Word document

12 posts in this topic

#1 ·  Posted (edited)

I have tried many techniques but still have no luck.

How can I delete everything from the second sentence through the last sentence of a set paragraph range in a Microsoft Word document?

#include <Word.au3>

Global $oWord, $oDoc

$oWord = _Word_Create()
$oDoc = _Word_DocGet($oWord, 1)

Global Const $Count = $oDoc.Paragraphs.Count

For $i = 0 To $Count - 1
   $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

   ; Here will be placed the missing code
Next

Sample of the beginning of a Word document:

This is a sentence 1. This is a sentence 2. This is a sentence 3.

This is a sentence 4. This is a sentence 5. This is a sentence 6.

This is a sentence 7. This is a sentence 8. This is a sentence 9.

Sample of the final result:

This is a sentence 1.

This is a sentence 4.

This is a sentence 7.

#2 ·  Posted (edited)

Hi,

I don't use the Word UDF, so this may not be the best method:

;$s is the text of your paragraph
;$s2 is the replacement text
$s2 = StringRegExpReplace($s, "(?m)(.*?\.)(?:.*?)$", "$1")

Br, FireFox.


 

OS : Win XP SP2 (32 bits) / Win 7 SP1 (64 bits) / Win 8 (64 bits) | Autoit version: latest stable / beta.
Hardware : Intel(R) Core(TM) i5-2400 CPU @ 3.10Ghz / 8 GiB RAM DDR3.

My UDFs : Skype UDF | TrayIconEx UDF | GUI Panel UDF | Excel XML UDF | Is_Pressed_UDF

My Projects : YouTube Multi-downloader | FTP Easy-UP | Lock'n | WinKill | AVICapture | Skype TM | Tap Maker | ShellNew | Scriptner | Const Replacer | FT_Pocket | Chrome theme maker

My Examples : Capture toolIP Camera | Crosshair | Draw Captured Region | Picture Screensaver | Jscreenfix | Drivetemp | Picture viewer

My Snippets : Basic TCP | Systray_GetIconIndex | Intercept End task | Winpcap various | Advanced HotKeySet | Transparent Edit control

 


This deletes from the first "." in each paragraph to the end of the paragraph.

#include <Word.au3>

Global $oWord = _Word_Create()
If @error <> 0 Then Exit MsgBox(16, "Word UDF: _Word_DocFind Example", "Error creating a new Word application object." & @CRLF & "@error = " & @error & ", @extended = " & @extended)
Global $oDoc = _Word_DocOpen($oWord, "Test Leerzeichen.docx", Default, Default, True)
If @error <> 0 Then Exit MsgBox(16, "Word UDF: _Word_DocFind Example", "Error opening 'Test Leerzeichen.docx'." & @CRLF & "@error = " & @error & ", @extended = " & @extended)

Local $oRangeFound, $oRangeText
$oRangeFound = _Word_DocFind($oDoc, ".", 0) ; Search the whole document
If @error Then Exit MsgBox(16, "Word UDF: _Word_DocFind Example 3", "Error locating the specified text in the document." & @CRLF & "@error = " & @error & ", @extended = " & @extended)
; Create a new range (a duplicate, so the result of the find operation is not altered)
$oRangeText = $oRangeFound.Duplicate
$oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, $WdCharacter, 1, $wdParagraph, 1) ; Move the start of the range past the "." and the end of range to the end of the paragraph
$oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, Default, Default, $wdCharacter, -1) ; Move the end of the range one character to the left to not delete the new line character
$oRangeText.Text = ""
While 1
    $oRangeFound = _Word_DocFind($oDoc, ".", 0, $oRangeFound) ; Search the next "."
        If @error Then ExitLoop
    $oRangeText = $oRangeFound.Duplicate
    $oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, $WdCharacter, 1, $wdParagraph, 1)
    $oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, Default, Default, $wdCharacter, -1)
    $oRangeText.Text = ""
WEnd

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2017-04-18 - Version 1.4.8.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (NEW 2017-02-27 - Version 1.3.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 


#4 ·  Posted (edited)

This is my better solution:

#include <MsgBoxConstants.au3> ; defines $MB_SYSTEMMODAL used below
#include <Word.au3>

Global $oWord = _Word_Create()
If @error <> 0 Then Exit MsgBox(16, "Word UDF: _Word_DocFind Example", "Error creating a new Word application object." & @CRLF & "@error = " & @error & ", @extended = " & @extended)

Global $oDoc = _Word_DocGet($oWord, 1)
If @error <> 0 Then Exit MsgBox($MB_SYSTEMMODAL, "Word UDF: _Word_DocGet Example", _
"Error accessing collection of documents." & @CRLF & "@error = " & @error & ", @extended = " & @extended)

Global $pCount = $oDoc.Paragraphs.Count
Local $oRange, $oRange2, $sCount, $1st, $p
Local $sFindText, $oFind, $oRangeFound, $oRangeText

For $i = 0 To $pCount - 1
  $oRange = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)

  $sCount = $oRange.Sentences.Count

  While 1
    If $sCount <= 1 Then ExitLoop ; stop once only the first sentence remains, otherwise this loop never ends
    If $sCount > 1 Then
      $oRange2 = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdSentence, 1)
      $1st = $oRange2.Text
      $oRange2 = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)
      $p = $oRange2.Text

      $sFindText = StringReplace($p, $1st, "")
      $sFindText = StringReplace($sFindText, @CR, "")
      $oFind = _Word_DocFind($oDoc, $sFindText, $oRange2, Default, Default, False, False, False)

      If @error <> 0 Then
        $oRangeFound = _Word_DocFind($oDoc, ".", $oRange2) ; Search the $oRange2

        If @error Then Exit MsgBox(16, "Word UDF: _Word_DocFind Example 3", "Error locating the specified text in the range." & @CRLF & "@error = " & @error & ", @extended = " & @extended)

        ; Create a new range (a duplicate, so the result of the find operation is not altered)
        $oRangeText = $oRangeFound.Duplicate
        $oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, $WdCharacter, 1, $wdParagraph, 1) ; Move the start of the range past the "." and the end of range to the end of the paragraph
        $oRangeText = _Word_DocRangeSet($oDoc, $oRangeText, Default, Default, $wdCharacter, -1) ; Move the end of the range one character to the left to not delete the new line character
        $oRangeText.Text = ""
      Else
        $oFind.Delete
      EndIf

    EndIf

    $oRange2 = _Word_DocRangeSet($oDoc, -1, $wdParagraph, $i, $wdParagraph, 1)
    $sCount = $oRange2.Sentences.Count
  WEnd
Next

But please review the sentence below:

The FYE 2012 Transfer Pricing Report shows that the set of comparable companies chosen to benchmark the O&M services has a three year period weighted average (“PWAVG”) interquartile range (“IQR”) of 3.3 percent to 19.3 percent and a one year IQR of 4.0 percent to 21.8 percent.

After that code was run, the sentence in the new document was:

The FYE 2012 Transfer Pricing Report shows that the set of comparable companies chosen to benchmark the O&M services has a three year period weighted average (“PWAVG”) interquartile range (“IQR”) of 3.

This does not seem to happen consistently; I think it is caused by my new code.

For example, the following sentence did not have this problem. It is identical before and after.

Before:

For transaction 4, the mark-up on total cost (OI/TC) KPMG calculated is -23.8 percent, while the mark-up of total cost PwC presented in the report is -31.3 percent, which happened to be mark-up on total revenue (OI/Revenue);

And After:

For transaction 4, the mark-up on total cost (OI/TC) KPMG calculated is -23.8 percent, while the mark-up of total cost PwC presented in the report is -31.3 percent, which happened to be mark-up on total revenue (OI/Revenue);

Is it possible to fix this issue?

I think that when this code executes:

$oFind = _Word_DocFind($oDoc, $sFindText, $oRange2, Default, Default, False, False, False)

  If @error <> 0 Then

...and produces error 4 ("$sFindText could not be found"), it then falls through to your code, which treats the decimal point in a number as the end of the sentence.
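One possible way to avoid the text search altogether is to let Word decide where the first sentence ends, using the same Sentences collection the script above already reads for its count. This is only a sketch under assumptions: it assumes Word's own sentence detection does not break at a decimal point such as "3.3" (worth checking on the real document), that Word is already running, and that document 1 is the one to edit. The variable names are made up for illustration.

```autoit
#include <Word.au3>

; Sketch only: attach to the running Word instance and take its first document (assumption).
Global $oWord = _Word_Create()
If @error Then Exit MsgBox(16, "Sketch", "Error creating/attaching to Word. @error = " & @error)
Global $oDoc = _Word_DocGet($oWord, 1)
If @error Then Exit MsgBox(16, "Sketch", "Error accessing document 1. @error = " & @error)

Local $oParaRange, $oFirst, $oTail
For $i = 1 To $oDoc.Paragraphs.Count ; Word collections are 1-based
    $oParaRange = $oDoc.Paragraphs($i).Range
    ; Nothing to delete if the paragraph has at most one sentence
    If $oParaRange.Sentences.Count < 2 Then ContinueLoop
    $oFirst = $oParaRange.Sentences(1)
    ; The paragraph range ends just after the paragraph mark,
    ; so stop one character earlier to keep the mark itself
    $oTail = $oDoc.Range($oFirst.End, $oParaRange.End - 1)
    $oTail.Delete()
Next
```

Word usually counts the space after the period as part of the sentence, so the remaining first sentence may keep a trailing space. Paragraphs inside tables end with a cell marker rather than a paragraph mark and are not handled here.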


 

How can I delete everything from the second sentence through the last sentence of a set paragraph range in a Microsoft Word document?

The solution depends on how you define a "sentence". You need to search for ". " (a period followed by a space) to find the end of "sentences" within a paragraph, or for ".P", where "P" stands for the control character that marks a new paragraph.
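The difference between searching for "." and for ". " can be sketched outside Word with a regular expression on the paragraph text. This is a minimal sketch, not the Word UDF's way of doing it: _FirstSentence is a hypothetical helper, and it assumes a sentence ends at the first period followed by whitespace or by the end of the text. Abbreviations such as "e.g. " would still cut the sentence short.

```autoit
; Hypothetical helper: return only the first "sentence" of a paragraph's text.
Func _FirstSentence($sParagraph)
    ; Lazily match up to the first period that is followed by whitespace
    ; or by the end of the text; a period inside "3.3" does not qualify.
    Local $aMatch = StringRegExp($sParagraph, "(?s)^(.*?\.)(?=\s|$)", 1)
    If @error Then Return $sParagraph ; no sentence end found: return the text unchanged
    Return $aMatch[0]
EndFunc   ;==>_FirstSentence

ConsoleWrite(_FirstSentence("The IQR is 3.3 percent to 19.3 percent. Next sentence.") & @CRLF)
```

With the sample text, the periods in "3.3" and "19.3" are skipped because a digit follows them, so the match ends at "percent." and the second sentence is dropped.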



Would that be ".¶" aka '.' & ChrW(0xB6) ?


This wonderful site allows debugging and testing regular expressions (many flavors available). An absolute must have in your bookmarks.
Another excellent RegExp tutorial. Don't forget downloading your copy of up-to-date pcretest.exe and pcregrep.exe here
RegExp tutorial: enough to get started
PCRE v8.33 regexp documentation latest available release and currently implemented in AutoIt beta.

SQLitespeed is another feature-rich premier SQLite manager (includes import/export). Well worth a try.
SQLite Expert (freeware Personal Edition or payware Pro version) is a very useful SQLite database manager.
An excellent eBook covering almost every aspect of SQLite3: a must-read for anyone doing serious work.
SQL tutorial (covers "generic" SQL, but most of it applies to SQLite as well)
A work-in-progress SQLite3 tutorial. Don't miss other LxyzTHW pages!
SQLite official website with full documentation (may be newer than the SQLite library that comes standard with AutoIt)


Yes.



 



Beware that a correctly written paragraph may, in the general case, also end in various other characters like ! ? ) … » ” „ ’ ‼ ⁇ ⁈ ⁉ ❩ ❫ ¿ ¡ and most probably another set of quotes and exotic punctuation marks when, for instance, the paragraph ends with a citation from some non-English language. Incorrect punctuation only adds more difficulty.

So why not rely on paragraph marks only? Sounds more reliable.



Because in post 2 the OP showed an example with only a period as the ending character.

But you are correct, sentences can end with a lot of characters.

Let's see what the OP needs ;)



I was just saying, both for the OP and/or for future reference.



I see. So I think SRE (a string regular expression) would be needed to properly identify "sentences" within a paragraph?



I don't have Office installed here and I don't remember how powerful/painful regexps are in Word.


