Jump to content
Kiran_L

Read data from PDF files

Recommended Posts

Hi guys,

 

I am trying to read a pdf file with unstructured data. I dontot know how to handle pdf activities in AutoIt,

Can you help me with any UDF to open the PDF and read the doc.

 

Thanks for your time.

 

Share this post


Link to post
Share on other sites

Hi @Kiran_L,

Try checking this old thread and this link. You might get an idea somehow. Else, can you post your made code so far so that anyone can easily help.


Programming is "To make it so simple that there are obviously no deficiencies" or "To make it so complicated that there are no obvious deficiencies" by C.A.R. Hoare.

Share this post


Link to post
Share on other sites

You can use mupdf

https://mupdf.com/downloads/

or other commercial solutions like QuickPDF (look in my signature for QuickPDF.au3 UDF)

 


Signature beginning:   Wondering who uses AutoIT and what it can be used for ?
* GHAPI UDF - modest beginning - communication with GitHub REST API Forum Rules *
ADO.au3 UDF     POP3.au3 UDF     XML.au3 UDF    How to use IE.au3  UDF with  AutoIt v3.3.14.x  for other useful stuff click the following button

Spoiler

Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind. 

My contribution (my own projects): * Debenu Quick PDF Library - UDF * Debenu PDF Viewer SDK - UDF * Acrobat Reader - ActiveX Viewer * UDF for PDFCreator v1.x.x * XZip - UDF * AppCompatFlags UDF * CrowdinAPI UDF * _WinMergeCompare2Files() * _JavaExceptionAdd() * _IsBeta() * Writing DPI Awareness App - workaround * _AutoIt_RequiredVersion() * Chilkatsoft.au3 UDF * TeamViewer.au3 UDF * JavaManagement UDF * VIES over SOAP * WinSCP UDF * GHAPI UDF - modest begining - comunication with GitHub REST APIErrorLog.au3 UDF - A logging Library *

My contribution to others projects or UDF based on  others projects: * _sql.au3 UDF  * POP3.au3 UDF *  RTF Printer - UDF * XML.au3 UDF * ADO.au3 UDF SMTP Mailer UDF * Dual Monitor resolution detection * * 2GUI on Dual Monitor System * _SciLexer.au3 UDF * SciTE - Lexer for console pane

Useful links: * Forum Rules * Forum etiquette *  Forum Information and FAQs * How to post code on the forum * AutoIt Online Documentation * AutoIt Online Beta Documentation * SciTE4AutoIt3 getting started * Convert text blocks to AutoIt code * Games made in Autoit * Programming related sites * Polish AutoIt Tutorial * DllCall Code Generator * 

Wiki: Expand your knowledge - AutoIt Wiki * Collection of User Defined Functions * How to use HelpFile * Good coding practices in AutoIt * 

IE Related:  * How to use IE.au3  UDF with  AutoIt v3.3.14.x * Why isn't Autoit able to click a Javascript Dialog? * Clicking javascript button with no ID * IE document >> save as MHT file * IETab Switcher (by LarsJ ) * HTML Entities * _IEquerySelectorAll() (by uncommon) * IE in TaskScheduler

I encourage you to read: * Global Vars * Best Coding Practices * Please explain code used in Help file for several File functions * OOP-like approach in AutoIt * UDF-Spec Questions *  EXAMPLE: How To Catch ConsoleWrite() output to a file or to CMD *

"Homo sum; humani nil a me alienum puto" - Publius Terentius Afer
"Program are meant to be read by humans and only incidentally for computers and execute" - Donald Knuth, "The Art of Computer Programming"
:naughty:  :ranting:, be  :) and       \\//_.

Anticipating Errors :  "Any program that accepts data from a user must include code to validate that data before sending it to the data store. You cannot rely on the data store, ...., or even your programming language to notify you of problems. You must check every byte entered by your users, making sure that data is the correct type for its field and that required fields are not empty."

Signature last update: 2019-10-01

Share this post


Link to post
Share on other sites

I Would recommend U Using xpdf Just Search In Forum For Xpdf Some One Had Posted It Earlier it Will Allow U To read Data From Pdf As Text

 

Share this post


Link to post
Share on other sites

@rsalois please don't resurrect old threads, especially just to say you have no idea how to answer the OP's question...


"Profanity is the last vestige of the feeble mind. For the man who cannot express himself forcibly through intellect must do so through shock and awe" - Spencer W. Kimball

How to get your question answered on this forum!

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • By MakzNovice
      Hello Experts,
      I have Zero experience with Autoit + Adobe Acrobat, and I really in need to get this working as PoC.
      I am trying to automate some manual actions below are the steps I would like to do.
      INPUT to script : 
      1. PDF file to open
      2. String that I would like to add as \\Server\Directory name
      Steps : 
      1. Open the file in Adobe Acrobat Pro
      2. Browse to View > Tools > Send For Review > Open (see image 1)
      3. On the launched tool bar click on "Send for Shared Connecting" (see image 2)
      4. Next select option "Automatically Collect comments on my..." in dropdown and click 'Next' (see image 3)
      5. Select radiobutton "Network folder" and paste the input "\\Server\Directory" in text field and click 'Next' (see image 4)
      Experts, I would really appreciate a quick script which I can run and get rolling.
      Please note, I would not likwe to rely on MouseClick and/or cordinates match approach.
      PLEASE SUPPORT!!!!
      Makz
      **********************************************************************************************************
      Image 1

       
      Image 2

       
      Image 3

       
      Image 4

    • By jitendriya
      Hi every one .
      I want to read a pdf file and write into a excel using autoit , so how can i do this with out using third party server please tell me .
      Thank you..
    • By mLipok
      ; #INDEX# =======================================================================================================================
      ; Title .........: UDF for "Debenu Quick PDF Library"
      ; AutoIt Version : 3.3.10.2++
      ; Language ......: English
      ; Description ...: A collection of functions for Debenu Quick PDF Library
      ; Author(s) .....: mLipok
      ; Modified ......:
      ; ===============================================================================================================================
      Release note:
       
       
      Erratum v0.7:
       
      Forum link:
       
       
    • By mLipok
      Today, in the end as well, worked out using the Acrobat Reader ActiveX COM Object "AcroPDF.PDF.1"
      #include-once #include <Constants.au3> #include <GUIConstantsEx.au3> #include <WindowsConstants.au3> #include <Misc.au3> #include <MenuConstants.au3> #include <WinAPI.au3> ;~ Thanks to BrewManNH ;~ http://www.autoitscript.com/forum/topic/134878-guiregistermsg-replacement-for-guictrlsetonevent-and-guigetmsg/ ;~ Thanks to mikell ;~ http://www.autoitscript.com/forum/topic/161985-how-to-close-gui-with-guiregistermsg/ ; Install a custom error handler Global $oMyError = ObjEvent("AutoIt.Error", "_ComErrFunc") Global $__hExampleGUI Global $__idOPEN Global $_fExit Global $__hACROBAT_GUI = '' Global $__idACROBAT_GUI_CTRL_AX = '' Global $__oACROBAT_READER = '' #include <GUIConstantsEx.au3> ;~ GUIRegisterMsg($WM_ERASEBKGND, "_WM_EXTRACTOR") ;~ GUIRegisterMsg($WM_PAINT, "_WM_EXTRACTOR") ;~ GUIRegisterMsg($WM_ACTIVATE, "_WM_EXTRACTOR") ;~ GUIRegisterMsg($WM_CAPTURECHANGED, "_WM_EXTRACTOR") ;~ GUIRegisterMsg($WM_DEVICECHANGE, "_WM_EXTRACTOR") GUIRegisterMsg($WM_EXITSIZEMOVE, "_WM_EXTRACTOR") GUIRegisterMsg($WM_COMMAND, "_WM_EXTRACTOR") GUIRegisterMsg($WM_SYSCOMMAND, "_WM_EXTRACTOR") GUIRegisterMsg($WM_HSCROLL, "_WM_EXTRACTOR") _ExampleProgram_Gui() While 1 Sleep(10) If $_fExit Then _ACROBAT_GUI_DELETE() DeleteGui() Exit EndIf WEnd Func DeleteGui() GUIDelete($__hExampleGUI) EndFunc ;==>DeleteGui Func _ExampleProgram_Gui() ; Create a GUI with various controls. $__hExampleGUI = GUICreate("Example") $__idOPEN = GUICtrlCreateButton("&Open", 310, 370, 85, 25) ; Display the GUI. GUISetState(@SW_SHOW, $__hExampleGUI) EndFunc ;==>_ExampleProgram_Gui #Region ACROBAT FUNCTION Func _AcrobatInit() $__oACROBAT_READER = ObjCreate("AcroPDF.PDF.1"); Return $__oACROBAT_READER.GetVersions EndFunc ;==>_AcrobatInit Func _Acrobat_Events(ByRef $aMSG) If $aMSG[1] = $__hACROBAT_GUI Then Switch $aMSG[0] Case $GUI_EVENT_CLOSE _ACROBAT_GUI_DELETE() EndSwitch EndIf EndFunc ;==>_Acrobat_Events Func _ACROBAT_Destroy() $__oACROBAT_READER = "" ;~ MsgBox(1,'test','destroyed') EndFunc ;==>_ACROBAT_Destroy Func _AcrobatShow($sFile, $sTitle = "PDF ", $iLeft = 50, $iTop = 0, $iWidth = 1000, $iHeight = 700) If FileExists($sFile) Then _AcrobatInit() ; Set option $__oACROBAT_READER.src = $sFile $__oACROBAT_READER.SetLayoutMode(4) $__oACROBAT_READER.SetPageMode(1) $__oACROBAT_READER.SetShowToolbar(0) $__oACROBAT_READER.SetView(1) ; Create GUI $__hACROBAT_GUI = GUICreate($sTitle, $iWidth, $iHeight, $iLeft, $iTop, BitOR($GUI_SS_DEFAULT_GUI, $WS_SIZEBOX, $WS_MAXIMIZEBOX)) $__idACROBAT_GUI_CTRL_AX = GUICtrlCreateObj($__oACROBAT_READER, 5, 5, $iWidth - 20, $iHeight - 10) GUICtrlSetStyle($__idACROBAT_GUI_CTRL_AX, $WS_VISIBLE) GUISetState() EndIf EndFunc ;==>_AcrobatShow Func _ACROBAT_Refresh() If IsObj($__oACROBAT_READER) Then Local $hPreviouslyGui = GUISwitch($__hACROBAT_GUI) GUISetState(@SW_LOCK) Local $iGUI_PDFWidth = WinGetPos($__hACROBAT_GUI)[2] - 20 Local $iGUI_PDFHeight = WinGetPos($__hACROBAT_GUI)[3] - 40 Local $sFile = $__oACROBAT_READER.src ; this below line do not works with Acro Reader ; Local $iCurrentPage = $__oACROBAT_READER.GetNumber Local $iCurrentPage = 0 _ACROBAT_Destroy() GUICtrlDelete($__idACROBAT_GUI_CTRL_AX) _AcrobatInit() $__idACROBAT_GUI_CTRL_AX = GUICtrlCreateObj($__oACROBAT_READER, 5, 5, $iGUI_PDFWidth, $iGUI_PDFHeight) $__oACROBAT_READER.src = $sFile ;~ $__oACROBAT_READER.SetCurrentPage($iCurrentPage) GUISetState(@SW_UNLOCK) GUISwitch($hPreviouslyGui) EndIf EndFunc ;==>_ACROBAT_Refresh Func _ACROBAT_GUI_DELETE() _ACROBAT_Destroy() if IsHWnd($__hACROBAT_GUI) then GUIDelete($__hACROBAT_GUI) EndFunc ;==>_ACROBAT_GUI_DELETE #EndRegion ACROBAT FUNCTION #Region MSG and ERROR HANDLER Func _WM_EXTRACTOR($hWnd, $iMsg, $wParam, $lParam) ;~ ConsoleWrite('! $hWnd = ' & $hWnd & ' $iMsg = ' & $iMsg & '('&HEX($iMsg)&')'& ' $wParam = ' & $wParam & ' $lParam = ' & $lParam & @CRLF) If $hWnd = ControlGetHandle($__hACROBAT_GUI, '', $__idACROBAT_GUI_CTRL_AX) Then ConsoleWrite('! -------------- $hWnd = ' & $hWnd & ' $iMsg = ' & $iMsg & '(' & Hex($iMsg) & ')' & ' $wParam = ' & $wParam & ' $lParam = ' & $lParam & @CRLF) EndIf If $hWnd = $__hACROBAT_GUI Then Switch $iMsg Case $WM_COMMAND #cs Case $WM_ACTIVATE Local $test = BitAND($wParam, 0x00000004) if $test <> 0 then MsgBox(1,'$WM_ACTIVATE','test') _ACROBAT_Refresh() EndIf Case $WM_ERASEBKGND WinGetHandle("[ACTIVE]") if $__hACROBAT_GUI <> _WinAPI_GetWindow ( $__hACROBAT_GUI, $GW_HWNDPREV ) then ConsoleWrite('! Case $WM_ERASEBKGND' & @CRLF) _ACROBAT_Refresh() _WinAPI_RedrawWindow($__hACROBAT_GUI,0,0,$RDW_NOERASE) EndIf Case $WM_PAINT _WinAPI_RedrawWindow($__hACROBAT_GUI,0,0,$RDW_NOERASE) Case $WM_CAPTURECHANGED _ACROBAT_Refresh() Case $WM_DEVICECHANGE _ACROBAT_Refresh() #ce Case $WM_EXITSIZEMOVE _ACROBAT_Refresh() Case $WM_SYSCOMMAND ;~ Local $test = BitAND($wParam, 0xFFF0) Local $test = BitAND($wParam, 0x0000FFFF) Switch $test Case $SC_CLOSE _ACROBAT_GUI_DELETE() Case $SC_CONTEXTHELP Case $SC_DEFAULT Case $SC_HOTKEY Case $SC_HSCROLL Case $SC_KEYMENU Case $SC_MAXIMIZE _ACROBAT_Refresh() Case $SC_MINIMIZE Case $SC_MONITORPOWER Case $SC_MOUSEMENU Case $SC_MOVE ;~ _ACROBAT_Refresh() Case $SC_NEXTWINDOW ;~ _ACROBAT_Refresh() Case $SC_PREVWINDOW ;~ _ACROBAT_Refresh() Case $SC_RESTORE _ACROBAT_Refresh() Case $SC_SCREENSAVE Case $SC_SIZE Case $SC_TASKLIST Case $SC_VSCROLL EndSwitch EndSwitch EndIf If $hWnd = $__hExampleGUI Then Switch $iMsg Case $WM_COMMAND Local $nID = BitAND($wParam, 0x0000FFFF) Local $hCtrl = $lParam Switch $nID Case $__idOPEN if not IsObj($__oACROBAT_READER) then _AcrobatShow(FileOpenDialog("Choose PDF", "C:\Temp", "PDF Files(*.pdf)", 3)) ; put your own start folder here) EndIf EndSwitch Case $WM_SYSCOMMAND Local $test = BitAND($wParam, 0xFFF0) Switch $test Case $SC_CLOSE $_fExit = True Case $SC_CONTEXTHELP Case $SC_DEFAULT Case $SC_HOTKEY Case $SC_HSCROLL Case $SC_KEYMENU Case $SC_MAXIMIZE Case $SC_MINIMIZE Case $SC_MONITORPOWER Case $SC_MOUSEMENU Case $SC_MOVE Case $SC_NEXTWINDOW Case $SC_PREVWINDOW Case $SC_RESTORE Case $SC_SCREENSAVE Case $SC_SIZE Case $SC_TASKLIST Case $SC_VSCROLL EndSwitch EndSwitch EndIf Return $GUI_RUNDEFMSG EndFunc ;==>_WM_EXTRACTOR Func _ComErrFunc() Local $HexNumber = Hex($oMyError.number, 8) MsgBox(0, "AutoItCOM Test", _ "We intercepted a COM Error !" & @CRLF & @CRLF & _ "err.description is: " & @TAB & $oMyError.description & @CRLF & _ "err.windescription:" & @TAB & $oMyError.windescription & @CRLF & _ "err.number is: " & @TAB & $HexNumber & @CRLF & _ "err.lastdllerror is: " & @TAB & $oMyError.lastdllerror & @CRLF & _ "err.scriptline is: " & @TAB & $oMyError.scriptline & @CRLF & _ "err.source is: " & @TAB & $oMyError.source & @CRLF & _ "err.helpfile is: " & @TAB & $oMyError.helpfile & @CRLF & _ "err.helpcontext is: " & @TAB & $oMyError.helpcontext _ ) SetError(1) EndFunc ;==>_ComErrFunc #EndRegion MSG and ERROR HANDLER Any comments are welcome.
      Cheers
      mLipok
×
×
  • Create New...