
Attempting to read each line of an office document (.doc, .docx, .xls, .xlsx) - Similarly to a FileReadLine()



I'm attempting to read each line of a Word document and assign the line to a variable, similar to how you can read a line from a text file (.txt or .csv) using FileReadLine(). So far I have been unsuccessful in reading from a .doc/.docx file, and I have not found any documentation that has helped.

In searching for a solution I did find a function to convert the Word document to a text file; however, my script is for (PCI) auditing purposes and I do not want to create a new file on the HDD. I have also read through the _Word UDF help files. Unless I'm misunderstanding the _Word UDF, I did not see anything that functions similarly to FileReadLine().

Any help/advice is greatly appreciated!  

 

Here is what I have been attempting to do (it doesn't work):
 

#include <file.au3>
#include <Array.au3>
#include <LuhnCheck.au3>
#include <Excel.au3>
#include <Word.au3>

Global $sPath = 'C:\Users\'
Global $filePath
Global $pii = @ScriptDir & '\pii_CreditCard.csv'

Global $filesArray = _FileListToArrayRec($sPath , '*.txt;*.csv;*.doc;*.docx;*.xls;*.xlsx',1,1,0,2)

For $i = 1 to $filesArray[0]  ;Loop through file extensions and add files to the fileArray
    ;Assign the position in the filesArray to filePath (filePath is set to full path in FileListToArrayRec)
    $filePath = $filesArray[$i]
    readFile($filePath)
Next

Func readFile($file)
    If StringInStr($file, '.txt') Or StringInStr($file, '.csv') Then ; .txt file
        readTxtFile($file)
    ElseIf StringInStr($file, '.doc') Then ; .doc & .docx files
        ;============================================== part that does not work=========================
        Local $oWord = _Word_Create()
        ;$openFile = FileOpen($file, 0);
            While 1
                Local $line = FileReadLine(_Word_DocOpen($oWord, $file, Default, Default, True))
                If @error = -1 Then ExitLoop
                ;lookForCreditCardNumbers($line)
                MsgBox(0,0, $line)
            WEnd
    FileClose($openFile)
        ;============================================== part that does not work==========================
    EndIf
EndFunc

Func readTxtFile($fileToOpen)
    $openFile = FileOpen($fileToOpen, 0) ; open the file for reading and assign the handle to $openFile
    While 1
        Local $line = FileReadLine($openFile)
        If @error = -1 Then ExitLoop
        lookForCreditCardNumbers($line)
    WEnd
    FileClose($openFile)
EndFunc

Func lookForCreditCardNumbers($evaluateString)

    $aResult = StringRegExp($evaluateString, '[4|5|3|6][0-9]{15}|[4|5|3|6][0-9]{3}[-| ][0-9]{4}[-| ][0-9]{4}[-| ][0-9]{4}', $STR_REGEXPARRAYMATCH)
    If Not @error Then
        Local $newString1 = StringReplace($aResult[0], ' ', '') ;remove spaces
        Local $newString2 = StringReplace($newString1, '-', '') ;remove dashes

        Local $bool = _LuhnValidate($newString2) ; Check possible CC number against the Luhn algorithm

        If $bool = 'True' Then
            Local $piiCSV = FileOpen($pii, 1) ;open text file for appending/writing, 1
            FileWriteLine($piiCSV, $filePath & ', ' & $newString2)
            FileClose($piiCSV)
        EndIf
    EndIf

EndFunc
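
A side note on the pattern used above: inside a character class, `|` is a literal pipe rather than "or", so `[4|5|3|6]` also accepts `|` as the first character and `[-| ]` accepts `|` as a separator. Below is a minimal sketch of a tighter pattern. It assumes the intent is 16 digits starting with 3, 4, 5 or 6, optionally grouped in fours by dashes or spaces. The variable names and sample strings are made up for illustration, and this variant also tolerates mixed separators, which the original does not.

```autoit
#include <StringConstants.au3>

; Hypothetical tightened pattern: [3-6] instead of [4|5|3|6], [- ] instead of [-| ]
Local $sPattern = '\b[3-6][0-9]{3}(?:[- ]?[0-9]{4}){3}\b'

; Made-up sample lines (not real card numbers)
Local $aSamples[3] = ['card: 4111 1111 1111 1111', _
        'card: 4111-1111-1111-1111', _
        'id: |111|1111|1111|1111']

For $i = 0 To UBound($aSamples) - 1
    Local $aMatch = StringRegExp($aSamples[$i], $sPattern, $STR_REGEXPARRAYMATCH)
    If @error Then
        ConsoleWrite('no match: ' & $aSamples[$i] & @CRLF)
    Else
        ConsoleWrite('match:    ' & $aMatch[0] & @CRLF)
    EndIf
Next
```

The third sample is the kind of string the original pattern would flag (the pipes satisfy both character classes) but this one should not. Any hit would still need to go through the Luhn check.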

 

Edited by Duck
Changed title to match the content of thread

About how many lines are we talking about? Is it just simple text, or does the document contain tables, heading lines ...?

My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2021-11-10 - Version 1.6.0.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2017-07-21 - Version 0.4.0.1) - Download - General Help & Support - Example Scripts
OutlookEX (NEW 2021-11-16 - Version 1.7.0.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX_GUI (2021-04-13 - Version 1.4.0.0) - Download
Outlook Tools (2019-07-22 - Version 0.6.0.0) - Download - General Help & Support - Wiki
PowerPoint (NEW 2021-08-31 - Version 1.5.0.0) - Download - General Help & Support - Example Scripts - Wiki
Task Scheduler (2019-12-03 - Version 1.5.1.0) - Download - General Help & Support - Wiki

Standard UDFs:
Excel - Example Scripts - Wiki
Word - Wiki

Tutorials:
ADO - Wiki
WebDriver - Wiki

 

5 hours ago, water said:

About how many lines are we talking about? Is it just simple text, or does the document contain tables, heading lines ...?

The number of lines in the Word document is unknown, since the script needs to read multiple Word documents. I should only need to read simple text for now.


My goal with this project is to make something that falls in between the PowerShell scripts (another PS pii script) that do this type of PII auditing and OpenDLP. PowerShell tends to get messy with the reporting (at least in my testing), and it has to be installed on the PC you are running the script on. OpenDLP is on the other end of the spectrum: it has to be installed on a LAMP server and uses agents or scans over the network, and it's a bit excessive for the audits I need to run. So I chose to create AutoIt scripts to run my audits. I thought I was doing rather well until I ran into MS Office docs. lol.


The following script reads the whole Word document and splits it line by line into an array:

#include <Word.au3>
#include <Array.au3>

Global $sPath = "C:\temp\test.docx"
Global $oWord = _Word_Create()
Global $oDoc = _Word_DocOpen($oWord, $sPath)
Global $oRange = $oDoc.Range
Global $sText = $oRange.Text
Global $aLines = StringSplit($sText, @CR)
_ArrayDisplay($aLines)
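
For use inside a scanning loop, the same idea can be wrapped in a function that opens the document read-only and shuts Word down again afterwards. This is only a sketch: `_ReadWordParagraphs` is a hypothetical helper name, the path is a placeholder, and the error codes are arbitrary. Note that Word ends each paragraph (not each visual line) with @CR, so the array holds one entry per paragraph.

```autoit
#include <Word.au3>

; Hypothetical helper: returns a 1-based array of paragraphs ([0] = count).
; Sets @error to 1 (Word could not be started) or 2 (document could not be opened).
Func _ReadWordParagraphs($sDocPath)
    ; Hidden Word; second parameter True forces a separate instance,
    ; so quitting it later does not close a Word window the user has open
    Local $oWord = _Word_Create(False, True)
    If @error Then Return SetError(1, @error, 0)

    ; Fifth parameter True = open read-only so the audited file is not modified
    Local $oDoc = _Word_DocOpen($oWord, $sDocPath, Default, Default, True)
    If @error Then
        Local $iErr = @error
        _Word_Quit($oWord)
        Return SetError(2, $iErr, 0)
    EndIf

    Local $oRange = $oDoc.Range ; whole document body
    Local $sText = $oRange.Text
    _Word_DocClose($oDoc)
    _Word_Quit($oWord)

    ; Split at Word's paragraph mark
    Local $aParagraphs = StringSplit($sText, @CR)
    Return SetError(0, 0, $aParagraphs)
EndFunc

; Usage with a placeholder path
Local $aLines = _ReadWordParagraphs("C:\temp\test.docx")
If Not @error Then
    For $i = 1 To $aLines[0]
        ConsoleWrite($aLines[$i] & @CRLF)
    Next
EndIf
```

Starting and quitting Word for every file is slow. When scanning many documents, it may be worth creating the Word object once and reusing it.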

 


Thank you very much water!! This works perfectly! 


:)


Water, might I be able to get your assistance once again? This time with Excel. I attempted to modify your code to work with Excel the same way it does with Word, but I keep getting an object error on the $oWorkbook.Range portion. Also, for future reference, where are you finding the documentation for the $object.Range (etc.) portion of the above code?

 

;Read text from an excel document
        Local $sPath = $file
        Local $oExcel = _Excel_Open(False) ; create application object
        Local $oWorkbook = _Excel_BookOpen($oExcel, $sPath)
        Local $oRange = $oWorkbook.Range
        Local $sText = $oRange.Text
        Local $aLines = StringSplit($sText, @CR)
        Local $aResult = _ArrayDisplay($aLines)
        _Excel_Close($oExcel) ;close excel so it does not lock the file in read only mode

 


To retrieve the value of all used cells in Excel simply use:

Local $aRange = _Excel_RangeRead($oWorkbook)
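
Put together with opening and closing the workbook, that could look like the sketch below. The path is a placeholder. With no sheet or range given, `_Excel_RangeRead` reads the used range of the active worksheet only, so other sheets would need their own call. The code also allows for the result not being a 2D array (for example a single used cell), which is a defensive assumption rather than something stated above.

```autoit
#include <Excel.au3>

Local $sPath = "C:\temp\test.xlsx" ; placeholder path

; Hidden Excel; last parameter True forces a separate instance
Local $oExcel = _Excel_Open(False, Default, Default, Default, True)
If @error Then
    ConsoleWrite("Could not start Excel" & @CRLF)
    Exit
EndIf

Local $oWorkbook = _Excel_BookOpen($oExcel, $sPath, True) ; True = read-only
If @error Then
    ConsoleWrite("Could not open " & $sPath & @CRLF)
    _Excel_Close($oExcel, False)
    Exit
EndIf

; No sheet/range given: used range of the active worksheet
Local $vCells = _Excel_RangeRead($oWorkbook)

If IsArray($vCells) And UBound($vCells, 0) = 2 Then
    ; Usual case: zero-based 2D array [row][column]
    For $iRow = 0 To UBound($vCells, 1) - 1
        For $iCol = 0 To UBound($vCells, 2) - 1
            ConsoleWrite($vCells[$iRow][$iCol] & @TAB)
        Next
        ConsoleWrite(@CRLF)
    Next
ElseIf IsArray($vCells) Then
    ; Defensive: 1D array
    For $iRow = 0 To UBound($vCells) - 1
        ConsoleWrite($vCells[$iRow] & @CRLF)
    Next
Else
    ; Defensive: a single value instead of an array
    ConsoleWrite($vCells & @CRLF)
EndIf

_Excel_BookClose($oWorkbook, False) ; close without saving
_Excel_Close($oExcel, False)
```

The object error in the attempt above comes from the Excel object model: Range belongs to a worksheet (or the application), not to the workbook object.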

 


The Word 2010 Object Model can be found here: http://msdn.microsoft.com/en-us/library/ff841702
The Excel 2010 Object Model can be found here: http://msdn.microsoft.com/en-us/library/ff846392.aspx


:)

  • 2 weeks later...

I'm currently experiencing some odd behavior from the Word and Excel UDFs when executing my script on a remote computer. If I move my script (via SMB) to a remote PC and execute it (via PSEXEC), the script will not read any Word or Excel documents. (It reads all the text-based files without issue when executed remotely.) If I RDP to that same computer and double-click the script to run it, the script reads the Word and Excel documents without issue. I have attempted to execute my script by calling batch files, VBScripts, and even scheduling a task on the remote PC, none of which corrected the issue. So far I have yet to find a reason why the script works fine when started from the GUI (via RDP, or locally on a PC) but fails to read Word/Excel documents when executed on a remote PC via PSEXEC.

I can also execute the script via the command line locally on any PC, and it will read Word/Excel documents without issue. So I do not believe PSEXEC has any influence on how the script performs compared with the local command line or even the Windows GUI (Explorer). Please correct me if I'm wrong in this assumption.

Any insight into this issue is greatly appreciated!    

 

Here is my code. The readFile() function contains the Word and Excel code I'm experiencing issues with:

Edit:  here is the link to the LuhnCheck.au3 im using:

#include <File.au3>
#include <Array.au3>
#include <LuhnCheck.au3>
#include <Excel.au3>
#include <Word.au3>
#include <Date.au3>

Global $filePath ;Holds the absolute path of the current file being scanned
Global $logsFolder = @ScriptDir & '\' & @ComputerName & '.Logs' ;Holds the path for the folder created to store reporting information
Global $piiCC = @ScriptDir & '\' & @ComputerName &'.Logs\pii_CreditCard.csv' ;Holds the path to the output file which reports of possible credit card information found
Global $piiPassword = @ScriptDir & '\' & @ComputerName &'.Logs\pii_Password.csv' ;Holds the path to the output file which reports of possible password information found
Global $piiPasswordFile = @ScriptDir & '\' & @ComputerName &'.Logs\pii_PasswordFile.csv'
Global $piiSSN = @ScriptDir & '\' & @ComputerName &'.Logs\pii_SSN.csv' ;Holds the path to the output file which reports of possible Social Security Numbers information found
Global $systemLog = @ScriptDir& '\' & @ComputerName &'.Logs\AgentLog.txt' ;Holds the path to the output file which reports on Agent settings and performance
Global $sPath  ;Holds the path for the root directory to scan - Scan will start at the root directory, scanning all files and subfolders
Global $sFileExtensionTypes  ;Holds the values of file extensions to be read. Assigned value from the .ini file
Global $failedCount = 0  ;Holds the count of files this application failed to read
Global $totalCount = 0   ;Holds the count of total files this application attempted to read
Global $regex_CC ;Hold the value of the regex statement used to look for credit card information
Global $regex_SSN ;Hold the value of the regex statement used to look for Social Security Numbers
Global $lookForPasswords ;Holds the boolean value of True or False - tells the lookForPii() function if it should evaluate strings for passwords
Global $verboseLogging ;Holds "Yes" or "No" - used to determine the verbosity of reporting pii
Global $reportSensitiveData ;Holds "Yes" or "No"  - Used to determine if sensitive data should be displayed in the reporting
Global $removeTempFiles ;Holds "Yes" or "No" - Used to determine if the agent should delete temp files prior to searching the system
Global $lastFlaggedFileCC
Global $lastFlaggedFileSSN
Global $lastFlaggedFilePasswd

Global $showAllFilePaths ; For testing


startAgent();Start the scan for pii

Func startAgent()
    ;Create a Folder at script dir to hold Agent Logs & PII Reports. Name of Folder = ComputerName.
    If Not FileExists($logsFolder) Then DirCreate($logsFolder)

    ;Get Agent Settings from .ini file
    $sFileExtensionTypes = IniRead(@ScriptDir & '\pii_Agent.ini', 'AgentSettings', 'fileType', '*.doc;*.docx;*.xls;*.xlsx;*.txt;*.csv;*.htm;*.html;*.xml;')
    $sPath = IniRead(@ScriptDir & '\pii_Agent.ini', 'AgentSettings', 'rootDir', 'C:\')
    $verboseLogging = IniRead(@ScriptDir & '\pii_Agent.ini','AgentSettings', 'verboseLogging', 'No')
    $reportSensitiveData = IniRead(@ScriptDir & '\pii_Agent.ini','AgentSettings', 'reportSensitiveData', 'No')
    $removeTempFiles = IniRead(@ScriptDir & '\pii_Agent.ini', 'AgentSettings', 'removeTempFiles', 'No')
    $regex_CC = IniRead(@ScriptDir & '\pii_Agent.ini', 'ScanSettings', 'regexCC', '[4|5|3|6][0-9]{15}|[4|5|3|6][0-9]{3}[-| ][0-9]{4}[-| ][0-9]{4}[-| ][0-9]{4}')
    $regex_SSN = IniRead(@ScriptDir & '\pii_Agent.ini', 'ScanSettings', 'regexSSN', '[0-9]{3}[-| ][0-9]{2}[-| ][0-9]{4}')
    $lookForPasswords = IniRead(@ScriptDir & '\pii_Agent.ini', 'ScanSettings', 'passwd', 'No')
    $showAllFilePaths = IniRead(@ScriptDir & '\pii_Agent.ini', 'AgentSettings', 'allFiles', 'No')

    ;Create Agent Settings Log File with System's information.
    startSystemLog()
    removeTempFilesPreScan()

    ;Get all files in the root directory and in subfolders that match the desired file extension type and assign the array to the filesArray variable.
    Global $filesArray = _FileListToArrayRec($sPath , $sFileExtensionTypes,1,1,0,2)

    For $i = 1 To $filesArray[0]  ;Loop through all files in the fileArray
        ;Assign the position in the filesArray to filePath (filePath is set to absolute file paths in FileListToArrayRec())
        $filePath = $filesArray[$i]
        checkBlackList($filePath) ;Send file's path to the checkBlackList function to remove undesired directories
        $totalCount = $totalCount + 1 ; Keep count of the total number of files this application attempts to read
    Next

    finishedSystemLog(); Alert when the scan completes
EndFunc

Func checkBlackList($dirPath)
    If StringInStr($dirPath, 'Application Data') Or StringInStr($dirPath, 'AppData') Or StringInStr($dirPath, 'Temporary Internet Files') Or StringInStr($dirPath, 'System32') Or StringInStr($dirPath, '\~$') Then
        ;Do Nothing - Used to skip over undesirable directories
    Else
        If $lookForPasswords <> 'No' Then
            checkFilePath();Look for key words in the file's path
        EndIf
        readFile($dirPath)
        logAllFilePaths()
    EndIf
EndFunc

Func readFile($file)
    Local $sFileExtension = StringRight($file, 6) ; Get the last 6 characters from the string $file
    ConsoleWrite($file & @LF) ;For testing purposes only
    If StringInStr($file, 'pii_CreditCard.csv') Or StringInStr($file, 'pii_Password.csv') Or StringInStr($file, 'pii_PasswordFile.csv') Or StringInStr($file, 'pii_SSN.csv') Or StringInStr($file, 'AgentLog.txt') Or StringInStr($file, 'pii_Agent.ini') Then
        ;Do nothing
    ElseIf StringInStr($sFileExtension, '.txt') Or StringInStr($sFileExtension, '.csv') Or StringInStr($sFileExtension, '.htm') Or StringInStr($sFileExtension, '.xml') Then ; .txt, .csv, .htm, .html, .xml file
            readTextFile($file)
    ElseIf StringInStr($sFileExtension, '.doc') Then ; .doc & .docx files
        Local $oWord = _Word_Create(False) ; Get a hidden Word application object
        Local $oDoc = _Word_DocOpen($oWord, $file)
        If IsObj($oDoc) Then
            ; Read the whole document body and split it at Word's paragraph marks (@CR)
            Local $oRange = $oDoc.Range
            Local $sText = $oRange.Text
            Local $aLine = StringSplit($sText, @CR)
            For $i = 1 To $aLine[0]
                lookForPii($aLine[$i])
            Next
        Else
            failedToReadFile($file)
        EndIf
        _Word_Quit($oWord)
    ElseIf StringInStr($sFileExtension, '.xls') Then ; .xls & xlsx files
        Local $sPath = $file
        Local $oExcel = _Excel_Open(False)
        Local $oWorkbook = _Excel_BookOpen($oExcel, $sPath)
        If IsObj($oWorkbook) Then
            Local $aRange = _Excel_RangeRead($oWorkbook)
            For $i = 0 to UBound($aRange, 1) -1
                For $j = 0 to UBound($aRange, 2) -1
                    lookForPii($aRange[$i][$j])
                Next
            Next
        Else
            failedToReadFile($file)
        EndIf
        _Excel_Close($oExcel)
    EndIf
EndFunc

Func readTextFile($file)
    Local $openFile = FileOpen($file, 0) ; Open the file for reading and assign the handle to $openFile
    If $openFile <> -1 Then ; Check that the file was opened successfully
        While 1
            Local $line = FileReadLine($openFile)
            If @error Then ExitLoop ; -1 = end of file, 1 = read error; stop in both cases
            lookForPii($line)
        WEnd
        FileClose($openFile) ; Only close a handle that was actually opened
    Else
        failedToReadFile($file)
    EndIf
EndFunc

Func lookForPii($sEvaluate)
    If $verboseLogging = 'No' Then
        If $regex_CC <> 'none' And $lastFlaggedFileCC <> $filePath Then
            lookForCreditCardNumbers($sEvaluate)
        EndIf
        If $regex_SSN <> 'none' And $lastFlaggedFileSSN <> $filePath Then
                lookForSSN($sEvaluate)
        EndIf
        If  $lookForPasswords <> 'No' And $lastFlaggedFilePasswd <> $filePath Then
            lookForPasswordsInFiles($sEvaluate)
        EndIf
    Else
        If $regex_CC <> 'none' Then
            lookForCreditCardNumbers($sEvaluate)
        EndIf
        If $regex_SSN <> 'none' Then
                lookForSSN($sEvaluate)
        EndIf
        If  $lookForPasswords <> 'No' Then
            lookForPasswordsInFiles($sEvaluate)
        EndIf
    EndIf
EndFunc

Func lookForCreditCardNumbers($evaluateString)
    $aResult = StringRegExp($evaluateString, $regex_CC, $STR_REGEXPARRAYMATCH) ;Checks the value of the string against the $regex_CC value
    If Not @error Then
        Local $newString1 = StringReplace($aResult[0], ' ', '') ;remove spaces
        Local $newString2 = StringReplace($newString1, '-', '') ;remove dashes

        ;Check possible CC number against the Luhn algorithm. This helps cut down on false positives
        Local $bool = _LuhnValidate($newString2); Returns a boolean (True/False) statement > True means the number could be a CC number

        If $bool = 'True' Then
            Local $piiCSV = FileOpen($piiCC, 1) ;open text file for appending/writing, 1
            If $reportSensitiveData = 'Yes' Then
                FileWriteLine($piiCSV, $filePath & ', ' & $newString2 & ', ' & StringReplace($evaluateString, ',', '*') )
            Else
                FileWriteLine($piiCSV, $filePath )
            EndIf
            FileClose($piiCSV)
            $lastFlaggedFileCC = $filePath
        EndIf
    EndIf
EndFunc

Func lookForSSN($evaluateSSN)
    $aResult = StringRegExp($evaluateSSN, $regex_SSN, $STR_REGEXPARRAYMATCH) ; Checks the value of the string against the $regex_SSN value
    If Not @error Then
        Local $ssn = FileOpen($piiSSN, 1) ;open text file for appending/writing, 1
        If $reportSensitiveData = 'Yes' Then
            FileWriteLine($ssn, $filePath & ', ' & $aResult[0] & ',' & StringReplace($evaluateSSN, ',', '*') )
        Else
            FileWriteLine($ssn, $filePath )
        EndIf
        FileClose($ssn)
        $lastFlaggedFileSSN = $filePath
    EndIf
EndFunc

Func lookForPasswordsInFiles($sLookForPasswd)
    If StringInStr($sLookForPasswd, 'password') Or StringInStr($sLookForPasswd, 'passwd') Or StringInStr($sLookForPasswd, 'passphrase') Or StringInStr($sLookForPasswd, 'secret') Then
        Local $pw = FileOpen($piiPassword, 1) ;open text file for appending/writing, 1
        If $reportSensitiveData = 'Yes' Then
            FileWriteLine($pw, $filePath & ', ' & StringReplace($sLookForPasswd, ',', '*') )
        Else
            FileWriteLine($pw, $filePath )
        EndIf
        FileClose($pw)
        $lastFlaggedFilePasswd = $filePath
    EndIf
EndFunc

Func checkFilePath()
    If StringInStr($filePath, 'password') Or StringInStr($filePath, 'passwd') Or StringInStr($filePath, 'pwd') Or StringInStr($filePath, 'secret') Then
        Local $pw = FileOpen($piiPasswordFile, 1) ;open text file for appending/writing, 1
        FileWriteLine($pw, $filePath )
        FileClose($pw)
    EndIf
EndFunc

Func startSystemLog()
    Local $sysLog = FileOpen($systemLog, 1)
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'Computer Name: ' & @ComputerName)
    FileWriteLine($sysLog,'Search Started at: ' & _Now() )
    FileWriteLine($sysLog,'Search Started at the Root Directory of: ' & $sPath)
    FileWriteLine($sysLog,'Searched for the Following File Extensions: ' & $sFileExtensionTypes)
    FileWriteLine($sysLog,'RegEx Expression Used to Look for Credit Card Information: ' & $regex_CC)
    FileWriteLine($sysLog,'RegEx Expression Used to Look for Social Security Numbers: ' & $regex_SSN)
    FileWriteLine($sysLog,'Searched File Paths and File''s Content for Possible Passwords: ' & $lookForPasswords)
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'  ')
    FileClose($sysLog)
EndFunc

Func failedToReadFile($file)
    Local $sysLog = FileOpen($systemLog, 1)
    $failedCount = $failedCount + 1
    FileWriteLine($sysLog, $failedCount & ': Failed to Read: ' & $file)
    FileClose($sysLog)
EndFunc

Func finishedSystemLog()
    Local $sysLog = FileOpen($systemLog, 1)
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'Scan Finished at: ' & _Now() )
    FileWriteLine($sysLog,'Total Files Scanned: ' & $totalCount )
    FileWriteLine($sysLog,'Failed to Read: ' & $failedCount )
    FileWriteLine($sysLog,'Successfully Read: ' & $totalCount - $failedCount )
    FileWriteLine($sysLog,'  ')
    FileWriteLine($sysLog,'**********************************************')
    FileWriteLine($sysLog,'  ')
    FileClose($sysLog)

    Local $finishedFlag = FileOpen(@ScriptDir & '\' & @ComputerName &'.Logs\FINISHED.SCAN.txt', 1) ;Used to flag a completed scan
    FileClose($finishedFlag)
EndFunc

Func removeTempFilesPreScan()
    IF $removeTempFiles = 'Yes' Then
        RunWait(@ComSpec & " /c " & 'del /q/f/s %TEMP%\*', @SystemDir, @SW_HIDE)
        RunWait(@ComSpec & " /c " & 'del /q/f/s C:\temp\*', @SystemDir, @SW_HIDE)
        RunWait(@ComSpec & " /c " & 'del /q/f/s C:\Windows\temp\*', @SystemDir, @SW_HIDE)
        RunWait(@ComSpec & " /c " & 'rmdir /q/s %TEMP%\*', @SystemDir, @SW_HIDE)
        ;RunWait(@ComSpec & " /c " & 'mkdir %temp%', @SystemDir, @SW_HIDE)
        RunWait(@ComSpec & " /c " & 'rmdir /q/s C:\temp\*', @SystemDir, @SW_HIDE)
        ;RunWait(@ComSpec & " /c " & 'mkdir C:\temp', @SystemDir, @SW_HIDE)
    EndIf
EndFunc

Func logAllFilePaths() ;For testing
    Local Static $fileCount = 0 ; Static so the counter keeps its value between calls
    If $showAllFilePaths = 'Yes' Then
        Local $fileLog = FileOpen(@ScriptDir & '\AllFiles.Log.txt', 1)
        $fileCount = $fileCount + 1
        FileWriteLine($fileLog, $fileCount & ': ' & $filePath)
        FileClose($fileLog)
    EndIf
EndFunc

 

Edited by Duck

My assumptions about PSEXEC were wrong and I've corrected the issue. I've discovered that if I run the script via PSEXEC using the -i -d switches (-i lets the program interact with the desktop; -d tells PSEXEC not to wait for the process to terminate), the Word and Excel functions work without issue. If anyone is able to shed some light on why the Word and Excel functions require interaction with the desktop, I would appreciate it, just for a better understanding of how/why this works.

 

This brings me to my next question. Is it possible to continue past the dialogs (which hang my script) that Office shows when a document/workbook is in read-only mode, or when Office reports "the last time the document was opened it caused an error. Would you like to continue?"

Example of one of the errors I'm referring to: https://support.microsoft.com/en-us/kb/286017
 

Edited by Duck

We had a third party closed source system which used Word to print letters, and we had to handle the above sorts of messages which stopped the process.  You may have more control over things since you yourself are starting the Word process up, but I ended up writing a set of rules to handle these errors.  Have a look here first I think

 

https://msdn.microsoft.com/en-us/library/office/ff822397.aspx

 

Otherwise you might need to create a sort of rules engine for these notifications.  Happy to post mine up but you may have more control with the Word options object
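
As a starting point before building a rules engine, one setting that may help is the application-level DisplayAlerts switch, combined with opening every file read-only. The sketch below assumes that approach. The paths are placeholders, and it is not guaranteed to suppress every prompt Office can raise (recovery or repair dialogs may still appear), so treat it as something to test rather than a fix.

```autoit
#include <Word.au3>
#include <Excel.au3>

; Placeholder paths, for illustration only
Local $sDocPath = "C:\temp\test.docx"
Local $sBookPath = "C:\temp\test.xlsx"

; --- Word ---
Local $oWord = _Word_Create(False, True) ; hidden, separate instance
If Not @error Then
    $oWord.DisplayAlerts = 0 ; 0 = wdAlertsNone: ask Word not to show alerts
    ; Fifth parameter True = read-only, which avoids asking for write access
    Local $oDoc = _Word_DocOpen($oWord, $sDocPath, Default, Default, True)
    If @error Then
        ConsoleWrite("Skipped (could not open): " & $sDocPath & @CRLF)
    Else
        ; ... read $oDoc.Range.Text and scan it here ...
        _Word_DocClose($oDoc)
    EndIf
    _Word_Quit($oWord)
EndIf

; --- Excel ---
; Second parameter False = DisplayAlerts off; last parameter True = separate instance
Local $oExcel = _Excel_Open(False, False, Default, Default, True)
If Not @error Then
    Local $oWorkbook = _Excel_BookOpen($oExcel, $sBookPath, True) ; True = read-only
    If @error Then
        ConsoleWrite("Skipped (could not open): " & $sBookPath & @CRLF)
    Else
        ; ... _Excel_RangeRead($oWorkbook) and scan the cells here ...
        _Excel_BookClose($oWorkbook, False) ; close without saving
    EndIf
    _Excel_Close($oExcel, False)
EndIf
```

Checking @error after each open call lets the script log the file and carry on instead of stopping on it.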

  • 2 weeks later...

_Word_DocOpen sets @error = 4 when AutoIt could not obtain read/write access.


Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
  • Recently Browsing   0 members

    No registered users viewing this page.
