JonF

Modify properties of an existing PDF file

JonF

There are several topics and UDFs I've found that create PDF files, but I don't see anything that changes the properties of an existing PDF. I've tried PDFTK, and neither update_info nor update_info_utf8 changes the properties as displayed in Acrobat.

We want to file a whole bunch of PDFs of journal articles as references, with the Title property set to the title of the paper, the Author property set to the authors' names, the Subject property set to the journal reference in a strictly fixed format, and the Keywords property set to the complete citation, ready for copying and pasting. For example:

Title: Tests for skewness, kurtosis, and normality for time series data

Author: Bai J, Ng S

Subject: J Bus Econ Stat. 2005 Jan;23(1):49-60

Keywords: Bai J, Ng S. Tests for skewness, kurtosis, and normality for time series data. J Bus Econ Stat. 2005 Jan;23(1):49-60.

Then we name the file with the authors, a hyphen, and the title:

Bai J, Ng S-Tests for skewness, kurtosis, and normality for time series data.pdf

Collecting this information is a pain, and often one wants to scroll through the article while entering it into a dialog box. Obviously an AutoIt program can build the Keywords field and the filename from the other properties, and there may be ways to help with putting the rest of the information together.

Any way to write these properties to a PDF, or do I have to open Acrobat and the Properties dialog and punch the values in? (I do have full Acrobat).
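
The derived fields described above can be sketched in a few lines. This is only an illustration: the function names _Ref_Keywords and _Ref_FileName are made up here, and the rule for stripping characters that Windows rejects in file names is an assumption, not part of the filing scheme.

```autoit
; Hypothetical helpers; the names are not from any existing UDF.

; Citation = "Author. Title. Subject."
Func _Ref_Keywords($sAuthor, $sTitle, $sSubject)
    Return $sAuthor & ". " & $sTitle & ". " & $sSubject & "."
EndFunc   ;==>_Ref_Keywords

; File name = "Author-Title.pdf"
Func _Ref_FileName($sAuthor, $sTitle)
    Local $sName = $sAuthor & "-" & $sTitle
    ; Assumption: drop characters Windows does not allow in file names
    $sName = StringRegExpReplace($sName, '[\\/:*?"<>|]', "")
    Return $sName & ".pdf"
EndFunc   ;==>_Ref_FileName

Local $sAuthor = "Bai J, Ng S"
Local $sTitle = "Tests for skewness, kurtosis, and normality for time series data"
Local $sSubject = "J Bus Econ Stat. 2005 Jan;23(1):49-60"

ConsoleWrite(_Ref_Keywords($sAuthor, $sTitle, $sSubject) & @CRLF)
ConsoleWrite(_Ref_FileName($sAuthor, $sTitle) & @CRLF)
```

With the sample values, the two lines written to the console match the Keywords and file name given in the example above.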

ReFran

Hmm,

I briefly tested it with PdfTk. It can dump the data, but the update seems to work only for user-defined metadata, not for the standard Info keys.

However, Acrobat (the full version) can do it (have a look at the Acro-JS help file),

mbtPdfAsm can do it (you can also download its GUI, BeCyPdfAsm, for quick testing),

and BeCyPdfMetaEdit can do it too. All of them can be used in batch jobs.

You can find the URLs by searching here or on the Internet. I'm in a bit of a hurry.

HTH, Reinhard
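
For the Acrobat route, one possibility from AutoIt is Acrobat's COM automation interface rather than Acro-JS itself. The following is a minimal sketch, assuming that full Acrobat is installed and registers the AcroExch.PDDoc object; the file path and values are placeholders, and it has not been tried against the files in this thread.

```autoit
; Sketch only: needs full Acrobat (not the free Reader) for AcroExch.PDDoc.
; The path and property values are placeholders.
Local $sFile = @ScriptDir & "\article.pdf"

Local $oPDDoc = ObjCreate("AcroExch.PDDoc")
If Not IsObj($oPDDoc) Then
    ConsoleWrite("Could not create AcroExch.PDDoc" & @CRLF)
    Exit 1
EndIf

If $oPDDoc.Open($sFile) Then
    ; SetInfo(key, value) writes one entry of the document Info dictionary
    $oPDDoc.SetInfo("Title", "Tests for skewness, kurtosis, and normality for time series data")
    $oPDDoc.SetInfo("Author", "Bai J, Ng S")
    $oPDDoc.SetInfo("Subject", "J Bus Econ Stat. 2005 Jan;23(1):49-60")
    ; 1 = PDSaveFull in the Acrobat SDK; this overwrites the original file
    If Not $oPDDoc.Save(1, $sFile) Then ConsoleWrite("Save failed" & @CRLF)
    $oPDDoc.Close()
Else
    ConsoleWrite("Could not open " & $sFile & @CRLF)
EndIf
```

Because Save is pointed at the source path, it is worth running this on copies first.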

taietel

If your PDF has no outlines (those have /Title entries too), here's a quick hack with no error checking:

#include <Array.au3> ; only needed for _ArrayDisplay

_Test()

Func _Test()
    Local $sFile = @ScriptDir & "\PDFReference13.pdf"

    Local $aOldData = _PDF_GetProperties($sFile)
    _ArrayDisplay($aOldData)

    ; Must list the keys in the same order as the array returned by _PDF_GetProperties
    Local $aNewData[6][2] = [["Title", "New Title"], ["Producer", "New Producer"], ["Author", "New Author"], ["Creator", "New Creator"], ["Subject", "New Subject"], ["Keywords", "New keywords"]]

    Local $sNewFile = _PDF_SetProperties($sFile, $aOldData, $aNewData)
    Local $aCheck = _PDF_GetProperties($sNewFile)
    _ArrayDisplay($aCheck)
EndFunc   ;==>_Test

; Returns a 6x2 array of [key, value]; the value is "no match" when the key is not found
Func _PDF_GetProperties($sFile)
    Local $a_Prop[6][2] = [["Title", ""], ["Producer", ""], ["Author", ""], ["Creator", ""], ["Subject", ""], ["Keywords", ""]]
    Local $hFile = FileOpen($sFile)
    Local $sTxt = FileRead($hFile)
    FileClose($hFile)
    Local $aMatch
    For $i = 0 To UBound($a_Prop) - 1
        ; First "/Key (value)" in the file; the value ends at the first ")"
        $aMatch = StringRegExp($sTxt, "(?i)(/" & $a_Prop[$i][0] & ") {0,1}\((.*?)\)", 1)
        If @error Then
            $a_Prop[$i][1] = "no match"
        Else
            $a_Prop[$i][1] = $aMatch[1]
        EndIf
    Next
    Return $a_Prop
EndFunc   ;==>_PDF_GetProperties

; Writes a copy of $sFile with the new values and returns the path of the copy
Func _PDF_SetProperties($sFile, $aOld, $aNew)
    Local $hFile = FileOpen($sFile)
    Local $sTxt = FileRead($hFile)
    FileClose($hFile)
    For $i = 0 To UBound($aOld) - 1
        ; Skip keys that were not found in the file
        If $aOld[$i][1] <> "no match" Then
            ; Replace only the first occurrence of "/Key (value)"
            $sTxt = StringRegExpReplace($sTxt, "(?i)(/" & $aOld[$i][0] & ") {0,1}\((.*?)\)", "/" & $aOld[$i][0] & " (" & $aNew[$i][1] & ") ", 1)
        EndIf
    Next
    ; "name.pdf" becomes "name_mod.pdf" in the same folder
    Local $sNewFile = StringRegExpReplace($sFile, "(?i)\.pdf$", "") & "_mod.pdf"
    Local $hNew = FileOpen($sNewFile, 18) ; 2 (overwrite) + 16 (binary)
    FileWrite($hNew, $sTxt)
    FileClose($hNew)
    Return $sNewFile
EndFunc   ;==>_PDF_SetProperties

It will create a new PDF named original_mod.pdf with the new properties.

I suck at RegExp, but this script has been tested on several PDFs WITHOUT outlines. If a PDF does have outlines, the first /Title found may belong to an outline entry, so the wrong /Title gets changed; the rest is OK.

If you use parentheses within a field, e.g. John (Cheese) Doe, the right replacement is John \(Cheese\) Doe.
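
The escaping can be done by a small helper instead of by hand. This is a sketch: the name _PDF_EscapeLiteral is made up here, and escaping the backslash itself is an addition based on how PDF literal strings work, not something the script above handles. How StringRegExpReplace treats backslashes inside its replacement string should be checked before feeding values containing a backslash into _PDF_SetProperties.

```autoit
; Hypothetical helper: escape a value for use inside a PDF literal string "( ... )"
Func _PDF_EscapeLiteral($sValue)
    ; Backslash first, so the escapes added below are not escaped again
    $sValue = StringReplace($sValue, "\", "\\")
    $sValue = StringReplace($sValue, "(", "\(")
    $sValue = StringReplace($sValue, ")", "\)")
    Return $sValue
EndFunc   ;==>_PDF_EscapeLiteral

ConsoleWrite(_PDF_EscapeLiteral("John (Cheese) Doe") & @CRLF) ; John \(Cheese\) Doe
```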
