KatoXY

Get property from pdf files

12 posts in this topic

Hi,

I have a huge number of PDF files. All of them have properties set, such as Author, Subject and Keywords.

Is there any way to get this information?

I mean something like the _WordDocPropertyGet function, but one that works with PDF files.

There is a script available to retrieve the extended properties of a file.
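
As a rough sketch of how such a script typically works (this is not the linked script itself): Windows exposes extended properties through the Shell.Application COM object, and GetDetailsOf returns one property per column index. The folder, the file name and the index range 0 to 40 below are assumptions for illustration, and the meaning of each index differs between Windows versions. Windows only reports what it can read itself, so the PDF fields may well come back empty.

```autoit
; Sketch only: dump the extended properties that Windows exposes for one file.
Local $sFolder = "C:\temp"   ; assumed folder
Local $sFile = "Test.pdf"    ; assumed file name

Local $oShell = ObjCreate("Shell.Application")
If Not IsObj($oShell) Then Exit MsgBox(16, "Error", "Could not create Shell.Application")

Local $oFolder = $oShell.NameSpace($sFolder)
If Not IsObj($oFolder) Then Exit MsgBox(16, "Error", "Folder not found: " & $sFolder)

Local $oItem = $oFolder.ParseName($sFile)
If Not IsObj($oItem) Then Exit MsgBox(16, "Error", "File not found: " & $sFile)

Local $sValue
For $i = 0 To 40   ; assumed index range, adjust for your Windows version
    $sValue = $oFolder.GetDetailsOf($oItem, $i)
    ; print only the columns that actually hold a value
    If $sValue <> "" Then ConsoleWrite($i & ": " & $sValue & @CRLF)
Next
```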


My UDFs and Tutorials:

Spoiler

UDFs:
Active Directory (NEW 2017-04-18 - Version 1.4.8.0) - Download - General Help & Support - Example Scripts - Wiki
OutlookEX (NEW 2017-02-27 - Version 1.3.1.0) - Download - General Help & Support - Example Scripts - Wiki
ExcelChart (2015-04-01 - Version 0.4.0.0) - Download - General Help & Support - Example Scripts
Excel - Example Scripts - Wiki
Word - Wiki
PowerPoint (2015-06-06 - Version 0.0.5.0) - Download - General Help & Support

Tutorials:
ADO - Wiki

 

KatoXY,

And a slightly updated version with the property codes for XP, Vista and Win7 here. :D

M23


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind.

My UDFs:

Spoiler

ArrayMultiColSort ---- Sort arrays on multiple columns
ChooseFileFolder ---- Single and multiple selections from specified path treeview listing
Date_Time_Convert -- Easily convert date/time formats, including the language used
ExtMsgBox --------- A highly customisable replacement for MsgBox
GUIExtender -------- Extend and retract multiple sections within a GUI
GUIFrame ---------- Subdivide GUIs into many adjustable frames
GUIListViewEx ------- Insert, delete, move, drag, sort, edit and colour ListView items
GUITreeViewEx ------ Check/clear parent and child checkboxes in a TreeView
Marquee ----------- Scrolling tickertape GUIs
NoFocusLines ------- Remove the dotted focus lines from buttons, sliders, radios and checkboxes
Notify ------------- Small notifications on the edge of the display
Scrollbars ---------- Automatically sized scrollbars with a single command
StringSize ---------- Automatically size controls to fit text
Toast -------------- Small GUIs which pop out of the notification area

 

water:

Generally this script works with files, but it doesn't read the properties from PDF files.

I took one from the Dell page: http://support.dell.com/support/edocs/systems/op990/en/ts/ts_en.pdf

Normally it has a Title and an Author, but the script doesn't read them.

Melba23:

I don't understand that topic. I probably wasn't able to use this function correctly.

I checked on Windows 7 x64 with AutoIt 3.3.6.1.

If you are looking for properties that aren't visible to Windows, then you have to open the document and retrieve the data.

How to access this data depends on the program you have installed as your PDF reader (Adobe Acrobat Reader, Foxit ...).

What do you use?


Adobe Acrobat Reader X

Ver 10.0.1.434

You can use a somewhat "brute" method: read the file and search for the keyword. I don't know how fast this method is if you have a lot of documents.
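
A minimal sketch of what this "brute" method could look like, assuming the PDF stores its Info dictionary uncompressed as plain entries such as /Author (Some Name). The file path and the list of keys are assumptions for illustration. Metadata that is stored in compressed object streams, as UTF-16 or hex strings, or only as XMP will not be found this way, and strings containing nested parentheses are cut short.

```autoit
; Sketch only: read the raw PDF bytes and search for Info dictionary entries.
Local $sFilePath = "C:\temp\Test.pdf"   ; assumed path

Local $hFile = FileOpen($sFilePath, 16) ; 16 = open in binary mode
If $hFile = -1 Then Exit MsgBox(16, "Error", "Cannot open " & $sFilePath)
Local $sRaw = BinaryToString(FileRead($hFile)) ; treat the bytes as ANSI text
FileClose($hFile)

Local $aKeys = StringSplit("Title|Author|Subject|Keywords", "|", 2) ; 2 = no count element
Local $aMatch
For $sKey In $aKeys
    ; Matches e.g. "/Author (Some Name)"; escaped characters such as \) stay inside the capture
    $aMatch = StringRegExp($sRaw, "/" & $sKey & "\s*\(((?:\\.|[^\\)])*)\)", 1)
    If @error Then
        ConsoleWrite($sKey & ": <not found>" & @CRLF)
    Else
        ConsoleWrite($sKey & ": " & $aMatch[0] & @CRLF)
    EndIf
Next
```

For a large number of files the same block could be wrapped in a function and called from a FileFindFirstFile/FileFindNextFile loop. Each call reads the whole file into memory, so speed will depend mostly on the file sizes.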

Or you can try the Adobe COM interface as described


Here is an example script to retrieve Title, Author, Keywords, Subject, Creator, Producer and CreationDate:

$oMyError = ObjEvent("AutoIt.Error","MyErrFunc")    ; Initialize a COM error handler
$sFilePath = "C:\temp\Test.pdf"   ; full path of the PDF to read
$oApp = ObjCreate("AcroExch.App")   ; start Adobe Acrobat
$oDoc = ObjCreate("AcroExch.AVDoc")  ; connect to Ac Viewer
If $oDoc.Open($sFilePath, "") Then   ; open the file
    $oPDDoc = $oDoc.GetPDDoc
    $sAuthor = $oPDDoc.GetInfo("Author")
    $sTitle =  $oPDDoc.GetInfo("Title")
    $sKeywords = $oPDDoc.GetInfo("Keywords")
    $sSubject = $oPDDoc.GetInfo("Subject")
    $sCreator = $oPDDoc.GetInfo("Creator")
    $sProducer = $oPDDoc.GetInfo("Producer")
    $sCreationDate = $oPDDoc.GetInfo("CreationDate")
    MsgBox(64, "PDF Info", "Author: " & $sAuthor & @CRLF & _
        "Title: " & $sTitle & @CRLF & _
        "Subject: " & $sSubject & @CRLF & _
        "Keywords: " & $sKeywords & @CRLF & _
        "Creator: " & $sCreator & @CRLF & _
        "Producer: " & $sProducer & @CRLF & _
        "CreationDate: " & $sCreationDate)
    $oDoc.Close(1)   ; close the document without saving
EndIf
$oApp.Exit()   ; close the Acrobat process
; release objects
$oPDDoc = 0
$oDoc = 0
$oApp = 0

; This is my custom defined error handler
Func MyErrFunc()

  Msgbox(0,"AutoItCOM Test","We intercepted a COM Error !"    & @CRLF  & @CRLF & _
             "err.description is: " & @TAB & $oMyError.description  & @CRLF & _
             "err.windescription:"   & @TAB & $oMyError.windescription & @CRLF & _
             "err.number is: "       & @TAB & hex($oMyError.number,8)  & @CRLF & _
             "err.lastdllerror is: "   & @TAB & $oMyError.lastdllerror   & @CRLF & _
             "err.scriptline is: "   & @TAB & $oMyError.scriptline   & @CRLF & _
             "err.source is: "       & @TAB & $oMyError.source       & @CRLF & _
             "err.helpfile is: "       & @TAB & $oMyError.helpfile     & @CRLF & _
             "err.helpcontext is: " & @TAB & $oMyError.helpcontext _
            )

Endfunc

A good reference for Acrobat X.

Edited by water

Mmmh,

the product "Adobe Acrobat Reader X Ver 10.0.1.434" doesn't exist.

You may have either "Adobe Acrobat X" (the $$$ product) or just the free "Adobe Reader X".

I would assume you have Reader only.

With that you can't use the ActiveX components like "$oApp = ObjCreate("AcroExch.App")", as shown in the third post from @water.

The "brute" method, also mentioned by @water, may work and is maybe good enough.

Otherwise I would use pdftk.exe or perhaps http://www.becyhome.de/download_eng.htm#becypdfmetaedit

which can also be used as a command-line tool and has a batch option.
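
A rough sketch of the pdftk route from AutoIt, assuming pdftk.exe is installed and can be found on the PATH. It runs pdftk's dump_data operation and captures the console output, which lists the document info as key/value lines (typically prefixed with InfoKey and InfoValue). The file path is an assumption for illustration.

```autoit
#include <Constants.au3>   ; provides $STDOUT_CHILD

; Sketch only: let pdftk dump the metadata and capture its console output.
Local $sFilePath = "C:\temp\Test.pdf"   ; assumed path

Local $iPID = Run('pdftk "' & $sFilePath & '" dump_data', "", @SW_HIDE, $STDOUT_CHILD)
If $iPID = 0 Then Exit MsgBox(16, "Error", "Could not start pdftk")

Local $sOutput = ""
While 1
    $sOutput &= StdoutRead($iPID)
    If @error Then ExitLoop   ; the stream is closed once pdftk has finished
    Sleep(10)
WEnd

ConsoleWrite($sOutput)
```

The captured text can then be searched for the wanted keys, for example with StringRegExp, and it does not depend on Acrobat being installed.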

HTH, Reinhard

Edited by ReFran

Oops :D

I didn't know that Adobe Acrobat was still installed on my new machine. I told our IT department to uninstall Adobe Acrobat some months ago - but they didn't do it.

So, as the script was working fine, I assumed that Adobe Reader had a COM interface as well.


Leave Adobe Acrobat on the machine.

For most quick tasks you can still use older versions (5-7, 8, 9).

For merge, split, bookmarks ... I mostly use pdftk or mbtPdfAsm.

The Reader has a small interface for embedded use (e.g. in IE), but only a few properties can be controlled through it (zoom, page view, ...).

best regards, Reinhard
