gius

search text in .xls .ppt .pdf


Melba23,

This UDF for ZIP is really interesting.

I read through its functions;
in my case I should use

_Zip_SearchInFile()

right?

But I do not understand how to use it...
Can you help me?

Thanks


gius,

"can you help me?"

Certainly. :)

This works for me:

#include <Array.au3>

#include "_Zip.au3"

; Copy the file to give it a standard .zip extension - the UDF needs this extension type
FileCopy("Full_Path.docx", "Full_Path.zip") ; Of course you can save the file to Temp if required

; Search in the file for the text
$aRet = _Zip_SearchInFile("Full_Path.zip", "Text_To_Find")

; Display the return - no return means text not found
_ArrayDisplay($aRet, "", Default, 8)

; Delete the copied file
FileDelete("Full_Path.zip")
So if you get an array returned from the _Zip_SearchInFile function you know that the docx file contains the text string for which you are searching. :)
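
If you want to call this for many files, the same steps can be wrapped in a function. This is only a rough sketch: the function name _DocxContainsText and the temp file name are made up for illustration, and it assumes _Zip_SearchInFile behaves as in the snippet above (array returned when the text is found). Note that Word can split a phrase across several XML elements inside the docx, so a plain string search may miss text that is visibly there.

#include "_Zip.au3" ; third-party UDF, same as above

; Hypothetical helper: returns True if the .docx contains $sText, False otherwise
Func _DocxContainsText($sDocxPath, $sText)
    ; Work on a copy in Temp so the original file is never touched (file name is an assumption)
    Local $sZipPath = @TempDir & "\docx_search_copy.zip"
    ; Flag 1 = overwrite an existing copy; FileCopy returns 0 on failure
    If Not FileCopy($sDocxPath, $sZipPath, 1) Then Return SetError(1, 0, False)
    ; Assumed from the example above: an array comes back only when the text is found
    Local $aRet = _Zip_SearchInFile($sZipPath, $sText)
    ; Delete the copied file
    FileDelete($sZipPath)
    Return IsArray($aRet)
EndFunc   ;==>_DocxContainsText

You would then call it as If _DocxContainsText("Full_Path.docx", "Text_To_Find") Then ... inside your file loop.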

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind

My UDFs:


ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 


@gius

ReplaceTemplateDOCX (Ru) - Extracts the docx to a temp folder, replaces text in the template, then packs the files again. It is intended for filling in documents. Instead of names and other personal data, the template contains label text. You fill out the application form, and the program inserts your data into the document in place of the labels. The same principle applies to replacing text in docx documents.


Melba23,

this is my code

#include <File.au3>
#include <Array.au3>
#include <MsgBoxConstants.au3>
#include <GUIConstants.au3>
#include <GUIlistview.au3>
#include <FontConstants.au3>
#include <WinAPI.au3>

Global $Zip = FileCopy("Full_Path.docx", "Full_Path.zip")

GUIRegisterMsg($WM_NOTIFY, "WM_NOTIFY")
Global $listView
Global $folders[2] = [@DesktopDir, @MyDocumentsDir]
Global $aResult[100000][2], $n = 0

Local $obj = ObjCreate("xd2txcom.Xdoc2txt.1")

Local $sFilter = "*.doc;*.txt;*.pdf;*.xls" ; Create a string with the correct format for multiple filters
Local $word = "" ; Filled from the input control once {ENTER} is pressed (no control can be created before the GUI exists)
$Form1 = GUICreate(" ", 623, 438, 192, 124, BitOR($GUI_SS_DEFAULT_GUI, $WS_MAXIMIZEBOX, $WS_TABSTOP))


GUISetFont(18, 600, 0, "MS Sans Serif") ; Set the font for all controls in one call
$Label1 = GUICtrlCreateLabel("Search:", 16, 24, 422, 41)

GUICtrlSetFont(-1, 14, 400, 0, "MS Sans Serif")
$input = GUICtrlCreateInput("", 364, 24, 129, 37) ; $input is the ControlID of the input control
GUISetState(@SW_SHOW)



While 1
    Switch GUIGetMsg()
        Case $GUI_EVENT_CLOSE
            Exit
        Case $input ; When {ENTER} pressed in input
            $word = GUICtrlRead($input) ; Read the content of the input
            GUIDelete($Form1) ; Delete GUI
            SplashTextOn("", "...", 200, 200, -1, -1, 4, "", 24)
            ExitLoop
    EndSwitch
WEnd

For $k = 0 To UBound($folders) - 1 ; loop through folders
    $aFiles = _FileListToArrayRec($folders[$k], $sFilter, 1, 1, 0, 2) ; list files only, recursively, with full paths ($sFilter already contains the wildcards)
    If Not @error Then
        For $i = 1 To $aFiles[0] ; loop through files
            If StringRight($aFiles[$i], 4) = '.doc' Then
                $content = $obj.ExtractText($aFiles[$i], False)
            Else
                $content = FileRead($aFiles[$i])
            EndIf
            ; the following regex captures the full lines containing $word
            $res = StringRegExp($content, '(?im)(.*\b\Q' & $word & '\E\b.*)\R?', 3) ; if $word must be a lone word
            ; $res = StringRegExp($content, '(?im)(.*\Q' & $word & '\E.*)\R?', 3) ; if $word can be part of another word
            If IsArray($res) Then
                $aResult[$n][0] = $aFiles[$i] ; file path
                For $j = 0 To UBound($res) - 1
                    $aResult[$n + $j][1] = $res[$j] ; lines
                Next
                $n += $j
            EndIf
        Next
    EndIf
Next

ReDim $aResult[$n][2]
aGUI($aResult)
SplashOff()
While 1
    Switch GUIGetMsg()
        Case $GUI_EVENT_CLOSE
            Exit
    EndSwitch
WEnd

Func aGUI($array, $title = "")
    Local $gui, $i, $findStrPos, _
            $itemText, $replacedText, _
            $leftStr
    ;$gui = GUICreate($title, 1024, 768, 192, 124)
    $gui = GUICreate($title, 800, 640, 192, 124, BitOR($GUI_SS_DEFAULT_GUI, $WS_MAXIMIZEBOX, $WS_TABSTOP))
    $Label1 = GUICtrlCreateLabel("", 100, 0, 400, 35, $SS_CENTER)
    GUICtrlSetFont(-1, 16, 400, 4, 'Comic Sans Ms')
    $listView = _GUICtrlListView_Create($gui, "file", 20, 35, @DesktopWidth - 20, @DesktopHeight - 120, BitOR($LVS_REPORT, $LVS_SINGLESEL))
    $hFont1 = _WinAPI_CreateFont(25, 6, 0, 0, $FW_MEDIUM, False, False, False, _
            $DEFAULT_CHARSET, $OUT_DEFAULT_PRECIS, $CLIP_DEFAULT_PRECIS, $PROOF_QUALITY, $DEFAULT_PITCH, 'Tahoma') ; <<<<<<<<<<<<<<<< make our own font using WinAPI
    _WinAPI_SetFont($listView, $hFont1, True) ; <<<<<<<<<<<<<<<<<<<<<<<<<<< Here we set the font for the $listview items
    $header = HWnd(_GUICtrlListView_GetHeader($listView)) ; <<<<<<<<<<<<<<<< Here we get the header handle
    _WinAPI_SetFont($header, $hFont1, True) ; <<<<<<<<<<<<<<<<<<< Here we set the header font
    _GUICtrlListView_SetExtendedListViewStyle($listView, $LVS_EX_GRIDLINES)
    _GUICtrlListView_AddColumn($listView, "Text")
    _GUICtrlListView_AddColumn($listView, "")
    _GUICtrlListView_AddArray($listView, $array)
    For $i = 0 To UBound($array) - 1 Step 1
        $itemText = _GUICtrlListView_GetItemText($listView, $i)
        If $itemText <> "" Then
            $findStrPos = StringInStr($itemText, "\", 0, -1)
            $leftStr = StringLeft($itemText, $findStrPos)
            $replacedText = StringReplace($itemText, $leftStr, "")
            _GUICtrlListView_SetItemText($listView, $i, $replacedText)
        EndIf
    Next
    _GUICtrlListView_SetColumnWidth($listView, 0, $LVSCW_AUTOSIZE_USEHEADER)
    _GUICtrlListView_SetColumnWidth($listView, 1, $LVSCW_AUTOSIZE_USEHEADER)
    GUISetState(@SW_SHOW)
EndFunc   ;==>aGUI

Func WM_NOTIFY($hWnd, $iMsg, $iwParam, $ilParam)
    Local $hWndFrom, $iIDFrom, $iCode, $tNMHDR, _
            $sIndices, $sData, $sAdata, $file, $splitFile
    $tNMHDR = DllStructCreate($tagNMHDR, $ilParam)
    $hWndFrom = HWnd(DllStructGetData($tNMHDR, "hWndFrom"))
    $iIDFrom = DllStructGetData($tNMHDR, "IDFrom")
    $iCode = DllStructGetData($tNMHDR, "Code")
    Switch $hWndFrom
        Case $listView
            Switch $iCode
                Case $NM_DBLCLK
                    $sIndices = _GUICtrlListView_GetSelectedIndices($listView)
                    $sData = _GUICtrlListView_GetItemText($listView, $sIndices)
                    $sAdata = _ArraySearch($aResult, $sData, Default, Default, Default, 3)
                    $file = _ArrayToString($aResult, Default, $sAdata, 0, Default)
                    $splitFile = StringSplit($file, "|")
                    ShellExecute($splitFile[1])
            EndSwitch
    EndSwitch
EndFunc   ;==>WM_NOTIFY

In this code I have to replace the line

$res = StringRegExp($content, '(?im)(.*\b\Q' & $word & '\E\b.*)\R?', 3) ; if $word must be a lone word

with

$aRet = _Zip_SearchInFile()

Is this change right?

For AZJIO,

thanks for your advice, your project is really interesting.
