
PDF or PDF/A file? - (Moved)


Go to solution Solved by bdr529,


  • protfromkpax changed the title to PDF or PDF/A file?
  • Developers
Posted (edited)

English please!!!

Moved to the appropriate AutoIt General Help and Support forum, as the Developer General Discussion forum very clearly states:

Quote

General development and scripting discussions.


Do not create AutoIt-related topics here, use the AutoIt General Help and Support or AutoIt Technical Discussion forums.

Moderation Team

Edited by Jos


Posted (edited)
1 hour ago, Danp2 said:

I assume you tried running the code. What were the results?

The code checks whether the file is in PDF format (the version, e.g. 1.5 or 1.7, appears at the beginning of a PDF file), but it can't tell the difference between PDF and PDF/A.
I ran it; the result was "PDF-File detected" 🙂
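
For illustration only, a header check of the kind described above might look like the sketch below. The original script is not shown in this thread, so the function name `_IsPdfHeader` and the file path are hypothetical. It reads the first few bytes and looks for the "%PDF-" signature, which is why it cannot distinguish PDF from PDF/A.

```autoit
; Hypothetical helper (not from the thread): tests only the file signature.
Func _IsPdfHeader($sPath)
    Local $hFile = FileOpen($sPath, 16) ; 16 = binary mode
    If $hFile = -1 Then Return SetError(1, 0, False)
    ; A PDF starts with a header such as "%PDF-1.7"
    Local $sHead = BinaryToString(FileRead($hFile, 8))
    FileClose($hFile)
    ; == is a case-sensitive string comparison in AutoIt
    Return StringLeft($sHead, 5) == "%PDF-"
EndFunc

; Placeholder path
If _IsPdfHeader("c:\path\to\yourfile.PDF") Then ConsoleWrite("PDF-File detected" & @CRLF)
```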

Edited by protfromkpax

Validating that a document is PDF/A is kind of a black magic test. There are test suites available that check the contents of a PDF against the standard, but the standard is kind of nebulous in its definitions. The most reliable (IMHO) is from VeraPDF (https://verapdf.org/software/). It's Java-based but comes with a decent command line tool.

The command line:

verapdf.bat --format text "c:\path\to\yourfile.PDF"

will simplify the output to pass/fail.
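
As a rough sketch, the command line above could be called from an AutoIt script as shown below. The wrapper name `_RunVeraPdf` and both paths are hypothetical, and the location of verapdf.bat depends on where veraPDF was copied. The sketch only returns the raw console text; the exact wording of the pass/fail line is not assumed here and should be checked against real output.

```autoit
#include <AutoItConstants.au3>

; Hypothetical wrapper (not part of veraPDF): runs verapdf.bat and returns
; whatever it prints to standard output.
Func _RunVeraPdf($sVeraBat, $sPdfFile)
    ; cmd /c needs an extra pair of quotes around a command line that contains quoted paths
    Local $sCmd = @ComSpec & ' /c ""' & $sVeraBat & '" --format text "' & $sPdfFile & '""'
    Local $iPid = Run($sCmd, "", @SW_HIDE, $STDOUT_CHILD)
    If $iPid = 0 Then Return SetError(1, 0, "")
    Local $sOutput = ""
    While 1
        $sOutput &= StdoutRead($iPid)
        If @error Then ExitLoop ; the stream is closed once the process has ended
        Sleep(50)
    WEnd
    Return $sOutput
EndFunc

; Both paths are placeholders
ConsoleWrite(_RunVeraPdf("c:\path\to\verapdf.bat", "c:\path\to\yourfile.PDF") & @CRLF)
```

Reading the output in a loop rather than after the process exits avoids stalling if the tool prints more than the pipe buffer holds.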

 

14 hours ago, rsn said:

Validating that a document is PDF/A is kind of a black magic test. There are test suites available that check the contents of a PDF against the standard, but the standard is kind of nebulous in its definitions. The most reliable (IMHO) is from VeraPDF (https://verapdf.org/software/). It's Java-based but comes with a decent command line tool.

The command line:


verapdf.bat --format text "c:\path\to\yourfile.PDF"

will simplify the output to pass/fail.

 

Thanks, the GUI works, but the command prompt says "access denied"...
I'm writing software that a specific professional group of users uses to sign sets of PDF files. Due to a change in the law we have to use PDF/A, so I need a check. Since my software runs on many other PCs, I'm looking for a solution that I can write in AutoIt and ship as an update...
I can see that it probably won't be easy 🙂


Not sure why you'd get an "access denied" in any of it. The applet isn't really "installed," just kind of copied to your user profile. Maybe a Java issue?

The VeraPDF test suite is actually open source (GPL3/MPL2), so if you had the time and talent (unlike me!) you could compile your own version of the test or convert it to your language of choice. See https://github.com/verapdf. Now that I think on it, since it's so liberally licensed, you might even be able to bundle it with your app, as long as some form of Java interpreter is present on the PC as well (a custom/mini build of OpenJDK would work to get around Oracle's fees for business/enterprise use).

  • Solution
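#include <String.au3>

MsgBox(0, "PDF/A check", check_pdfa("AutoIt_Featured_640x480.pdf"))

; Reads the PDF version from the file header and the PDF/A level declared in the XMP metadata.
; This only reports what the file claims; it does not validate conformance.
; @error: 0 = PDF/A level found, 1 = no usable pdfaid entries, 2 = invalid combination "1u",
;         3 = file could not be opened (check added to the original)
Func check_pdfa($file_init_pdf)
    Local $fileopen = FileOpen($file_init_pdf, 16) ; 16 = binary mode
    If $fileopen = -1 Then Return SetError(3, 0, "")
    Local $fileread = BinaryToString(FileRead($fileopen))
    FileClose($fileopen)
    ; The header looks like "%PDF-1.7"; skip the "%" and keep "PDF-1.7"
    Local $versione_pdf = StringMid($fileread, 2, 7)
    ; Attribute form with single quotes: pdfaid:part='1'
    Local $_StringBetween_part = _StringBetween($fileread, "pdfaid:part='", "'")
    Local $_StringBetween_conformance = _StringBetween($fileread, "pdfaid:conformance='", "'")
    If Not IsArray($_StringBetween_part) Or Not IsArray($_StringBetween_conformance) Then
        ; Attribute form with double quotes: pdfaid:part="1"
        $_StringBetween_part = _StringBetween($fileread, 'pdfaid:part="', '"')
        $_StringBetween_conformance = _StringBetween($fileread, 'pdfaid:conformance="', '"')
    EndIf
    If Not IsArray($_StringBetween_part) Or Not IsArray($_StringBetween_conformance) Then
        ; Element form: <pdfaid:part>1</pdfaid:part>
        $_StringBetween_part = _StringBetween($fileread, "pdfaid:part>", "<")
        $_StringBetween_conformance = _StringBetween($fileread, "pdfaid:conformance>", "<")
    EndIf
    ; String comparison with = is case-insensitive, so "B" and "b" both match
    If IsArray($_StringBetween_part) And IsArray($_StringBetween_conformance) And _
            ($_StringBetween_part[0] = "1" Or $_StringBetween_part[0] = "2" Or $_StringBetween_part[0] = "3") And _
            ($_StringBetween_conformance[0] = "a" Or $_StringBetween_conformance[0] = "b" Or $_StringBetween_conformance[0] = "u") Then
        ; Conformance level "u" exists only for parts 2 and 3, so "1u" is rejected
        If $_StringBetween_part[0] & $_StringBetween_conformance[0] <> "1u" Then
            Return SetError(0, 0, $versione_pdf & "   PDF/A-" & $_StringBetween_part[0] & $_StringBetween_conformance[0])
        Else
            Return SetError(2, 0, $versione_pdf)
        EndIf
    Else
        Return SetError(1, 0, $versione_pdf)
    EndIf
EndFunc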

 

AutoIt_Featured_640x480.pdf

I am the one who thanks the AutoIt community


You are a great magician! This is exactly what I was looking for. I only understand a little of the code, but I will learn everything.
Thank you so much, I want to dance with joy! (Now many nights with your code await me 🙂)


@bdr529 Until I read your code, I never thought to read the metadata. I opened a PDF in HxD and there it is: the version of the PDF and the level of conformance. I didn't realize that some of the metadata is excluded from the viewable properties. Great work!
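
To make the metadata lookup concrete, here is a minimal sketch of pulling the two values out of raw file text with a regular expression instead of `_StringBetween`. The helper name `_ParsePdfaId` and the sample string are hypothetical. It assumes the XMP packet is stored as readable text in the file, in either attribute form or element form, as in the solution above. Like that solution, it only reports what the file declares and does not validate conformance.

```autoit
; Hypothetical helper (not from the thread): extracts the declared PDF/A part
; and conformance level from raw file text. Returns e.g. "1B";
; @error = 1 when no pdfaid:part entry is found.
Func _ParsePdfaId($sData)
    ; Accepts pdfaid:part="1", pdfaid:part='1' and <pdfaid:part>1</pdfaid:part>
    Local $sSep = '\s*(?:=\s*["'']|>)\s*'
    Local $aPart = StringRegExp($sData, 'pdfaid:part' & $sSep & '(\d)', 1)
    If @error Then Return SetError(1, 0, "")
    Local $aConf = StringRegExp($sData, 'pdfaid:conformance' & $sSep & '([A-Za-z])', 1)
    ; No conformance entry: return the part number alone
    If @error Then Return SetError(0, 0, $aPart[0])
    Return SetError(0, 0, $aPart[0] & $aConf[0])
EndFunc

; Sample in element form, as it might appear inside an XMP packet
Local $sSample = '<pdfaid:part>1</pdfaid:part><pdfaid:conformance>B</pdfaid:conformance>'
ConsoleWrite(_ParsePdfaId($sSample) & @CRLF) ; writes 1B for this sample
```

For a real file, the text passed in would be the result of `BinaryToString(FileRead(...))` on a handle opened in binary mode, as in the solution.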
