Sign in to follow this  
Followers 0
JonF

Modify properties of an existing PDF file

3 posts in this topic

There's several topics and UDFs I've found that create PDF files, but I don't see anything that changes the properties of an existing PDF. I've tried PDFTK and neither update_info or update_info_utf8 change the properties as displayed in Acrobat.

We want to file a whole bunch of PDFs of journal articles as references, with the Title property set to the title of the paper, the Author property set to the author(s) names, the Subject property set to the journal reference in an extremely fixed format, and the Keywords set to the complete citation ready for copying and pasting. For example:

Title: Tests for skewness, kurtosis, and normality for time series data

Author: Bai J, Ng S

Subject: J Bus Econ Stat. 2005 Jan;23(1):49-60

Keywords: Bai J, Ng S. Tests for skewness, kurtosis, and normality for time series data. J Bus Econ Stat. 2005 Jan;23(1):49-60.

Then we name the file with the authors, a hyphen, and the title:

Bai J, Ng S-Tests for skewness, kurtosis, and normality for time series data.pdf

Collecting this information is a pain, and often one wants to scroll through the article while entering this info into a dialog box. Obviously an AutoIT program can create the keywords field and the filename from the other properties, and there may be opportunities to help out putting the rest of the information together.

Any way to write these properties to a PDF, or do I have to open Acrobat and the Properties dialog and punch the values in? (I do have full Acrobat).

Share this post


Link to post
Share on other sites



Mmmh,

I shortly tested it with PdfTk. It can dump out the data, but it seems the update works only for userdefined, not general, meta data / InfoKeys.

However Adobe (full Version can do - have a look at the Acro-JS help file),

mbtPdfAsm can do - you may also download the GUI = BeCyPdfAsm for quick testing -

and also BeCyPdfMetaEdit can do it. With all you can work with batch jobs.

Urls you can find with search here or in the Intranet. I'm just a little bit in hurry.

HTH, Reinhard

Share this post


Link to post
Share on other sites

IF your pdf has not outlines (those have /Title also), here's a fast hack, no error checking:

#include <Array.au3>;just for display the arrays

_Test()

Func _Test()
Local $sFile = @ScriptDir & "\PDFReference13.pdf"

Local $aOldData = _PDF_GetProperties($sFile)
_ArrayDisplay($aOldData)

Local $aNewData[6][2] = [["Title", "New Title"],["Producer", "New Producer"],["Author", "New Author"],["Creator", "New Creator"],["Subject", "New Subject"],["Keywords", "New keywords"]]

Local $sNewFile = _PDF_SetProperties($sFile, $aOldData, $aNewData)
Local $aCheck = _PDF_GetProperties($sNewFile)
_ArrayDisplay($aCheck)
EndFunc ;==>_Test

Func _PDF_GetProperties($sFile)
Local $a_Prop[6][2] = [["Title", ""],["Producer", ""],["Author", ""],["Creator", ""],["Subject", ""],["Keywords", ""]]
Local $hFile = FileOpen($sFile)
Local $sTxt = FileRead($hFile)
FileClose($hFile)
Local $title = StringRegExp($sTxt, "(?i)(/Title) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[0][1] = "no match"
Else
$a_Prop[0][1] = $title[1]
EndIf
Local $producer = StringRegExp($sTxt, "(?i)(/Producer) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[1][1] = "no match"
Else
$a_Prop[1][1] = $producer[1]
EndIf
Local $author = StringRegExp($sTxt, "(?i)(/Author) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[2][1] = "no match"
Else
$a_Prop[2][1] = $author[1]
EndIf
Local $creator = StringRegExp($sTxt, "(?i)(/Creator) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[3][1] = "no match"
Else
$a_Prop[3][1] = $creator[1]
EndIf
Local $subject = StringRegExp($sTxt, "(?i)(/Subject) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[4][1] = "no match"
Else
$a_Prop[4][1] = $subject[1]
EndIf
Local $keywords = StringRegExp($sTxt, "(?i)(/Keywords) {0,1}\((.*?)\)", 1)
If @error = 1 Then
$a_Prop[5][1] = "no match"
Else
$a_Prop[5][1] = $keywords[1]
EndIf
Return $a_Prop
EndFunc ;==>_PDF_GetProperties

Func _PDF_SetProperties($sFile, $aOld, $aNew)
Local $hFile = FileOpen($sFile)
Local $sTxt = FileRead($hFile)
FileClose($hFile)
For $i = 0 To UBound($aOld) - 1
If $aOld[$i][1] <> "no match" Or $aOld[$i][1] <> "" Then
$sTxt = StringRegExpReplace($sTxt, "(?i)(/" & $aOld[$i][0] & ") {0,1}\((.*?)\)", "/" & $aOld[$i][0] & " (" & $aNew[$i][1] &") ", 1)
EndIf
Next
Local $sFileName = StringRegExpReplace($sFile, ".*\\(.*).{4}", "$1")
Local $sNewFile = StringReplace($sFile, $sFileName, $sFileName & "_mod.pdf")
Local $hNew = FileOpen($sNewFile, 18)
FileWrite($hNew, $sTxt)
FileClose($hNew)
Return $sNewFile
EndFunc ;==>_PDF_SetProperties

It will create a new pdf with the name original_mod.pdf and with the new properties.

I suck at RegExp, but this script is tested on several pdf's, WITHOUT outlines (in this case the Title property is altered, the rest is ok).

If you use parentheses within a field, e.g. John (Cheese) Doe, the right replacement is John \(Cheese\) Doe.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!


Register a new account

Sign in

Already have an account? Sign in here.


Sign In Now
Sign in to follow this  
Followers 0

  • Similar Content

    • Mag91
      By Mag91
      Hey Community,
      cause im too new in the Auto it world i will try it with the your help. hopefully.
      I woud like to know how i can handle my Problem.
      ----
      I have a Excel Data with 362 random numbers.
      For Example:
      1166642335374 1172899897343
      .....
      this numbers are a part of the filepath ...example
      D:\Projekte\1166_64233_5374
      as u can see its the first number of the Excel data. After the first 4 numbers it shoud make a "_" than another 5 "_"
      This is my first question. How can i handle this to make it Shell execute.
       
      --------
      Second question:
      If i am in the path.
      For Example:
      D:\Projekte\1166_64233_5374
      the code shoud search for specific PDF Files.
      They are named like: 0050569E364B1ED79B900F73E62660EC.pdf
      the first 15 letters are always the same
      0050569E364B1ED
      when he found this data he has to copy it on a Folder on the Desktop.
      (There can also be 2 or 3 pdfs in one Folder with this letters)
      ----
      Please give me some help :-)
       
       
       
       
       
       
    • Mag91
      By Mag91
      Hey Everybody,
      as you know im on a very low autoit-level.
      My question is: How can i read all PDFs from a Folder wich is open and copy them to a Folder on a Desktop.
       
      The Folder wich contains the PDFs is variable Z:\Projektls\"*"*"*EVERYTIME ANOTHER ENDING"*"*"*"*"
      There can be 1 PDF or even 15 PDFs.
      i tried it with _FileListToArray and _FileCopy but i Need some help to understand this language
       
      THANKS!
       
    • Skeletor
      By Skeletor
      Hi Guys,
      I've been reading this post ...
      When I came accross the examples, non of them had what I was looking for.
      I basically want to "snapshot" my GUI's multiple tabs and send them into the pdf.
      A little nudge from you guys would be great.
      Im really stuck with this one, therefore I have no code.
      Lets discuss or point me in a right direction... thanks alot
       

    • FrancescoDiMuro
      By FrancescoDiMuro
      Good morning
      I was looking around the forum if there were some customizable solutions about creating a PDF from "0" to something like a report...
      What I'd like to do is something with a header ( 2 logos and a title ), with a table which contains data read from a file
      At the moment, I was working with HTML, since I know it and it's very simple to do a table with some data inside...
      But know, I'm a bit stuck about the exporting the HTML page to PDF... And, here too, if someone knows how to do it, please, I'm here listening
      Thanks guys
       
    • FrancescoDiMuro
      By FrancescoDiMuro
      Good morning guys
      I'd like to know if there is a way to convert a PDF in CSV or, eventually, in TXT, in order to read from it, like a database...
      I have a PDF and I think ( I dind't search a lot on the forum ) with AutoIt, but I'd like work with Excel styles...
      Does anyone know a good program which convert PDF to CSV? 
      PS: the PDF file is 5 MB, and it contains 439 pages...
      Thanks everyone for the help