
pdf to text?


Hi,

I know some people already have this running.

I need a quick-and-dirty "save PDF as text" step to add to a script.

Has anyone done this? I've run out of ideas after researching methods.

Randall

give me a minute and i'll try to whip something up. are we talking about a pdf that's being displayed, or just any arbitrary pdf file?


give me a minute and i'll try to whip something up. are we talking about a pdf that's being displayed, or just any arbitrary pdf file?

It's been 33 minutes already. Do you have 30-plus of them done by now? :lmao:

Common sense plays a role in the basics of understanding AutoIt... If you're lacking in that, do us all a favor, and step away from the computer.



i actually haven't started yet because i asked for elaboration regarding what exactly the script should do. the last thing i want to do is write a script that doesn't do what it's supposed to. (again)

I resemble that remark!


You might be able to incorporate a third-party program such as http://www.pdf-to-html-word.com/pdf-to-text/

http://www.pdfzone.com/article2/0,1895,1864583,00.asp

Another route is to add a printer:

Port is "FILE:"

Manufacturer is "Generic"

Model is "Generic / Text Only"

But I don't know of a command line option with Adobe Reader (or Foxit Reader) for printing a PDF document.

Use Mozilla | Take a look at My Disorganized AutoIt stuff | Very very old: AutoBuilder 11 Jan 2005 prototype I need to update my sig!

Hi,

Great!

Is the exe written in AutoIt? Is the source available?

But is it really that complicated?

You couldn't use this in your own program for copyright reasons, I presume; so how is it "free"?

We do have at least one COM object to work with, don't we?

$oPDF = ObjCreate("AcroPDF.PDF.1") ; the PDF viewer control registered by Adobe Reader

$Version = $oPDF.GetVersions()

Is it really hard to do with some obj calls?

It is going to be part of (or already is in) Word12 beta, so will be fairly easily accessible soon for many of us.

Randall


My "PDF to TXT" program is written in PowerBasic. The DLL for manipulating PDF files can be distributed with a program that uses it, as long as the license key is not exposed (the source code initializes the library with a function call that passes the developer's license key as a parameter).

Let me clarify that the primary purpose of the library is creating PDF files and forms. I just took advantage of a feature that supports text extraction from an existing PDF.

Unfortunately, I don't think there is a free COM-based solution for converting a PDF to text--I looked far and wide. The Adobe COM libraries are commercial--in fact, they require an installation of Adobe Acrobat or equivalent.

There are other free executables (as opposed to COM or standard DLL libraries) for converting PDFs to text. Based on testing by myself and others, the one I developed with the ISEDQUICKPDF library works as well as any of them. Moreover, it permits convenient batch conversions, including converting all PDFs linked to an Internet web page.


As far as I know, Adobe Reader allows one to open or print a PDF with command line parameters, but not save to text (doing that requires operation of the user interface). I have tried automating save as text by sending keystrokes, but found that it does not work reliably--at least for converting a batch of PDFs to text. Apparently, Adobe Reader is programmed to defy such automation attempts. No matter what delays I tried between keystrokes, etc., a lock up would occur. Also, Adobe Reader does not use standard menus, so the menu automation technique would not work either.

If anyone can demonstrate an automation solution with Adobe Reader, I would also be interested. In the meantime, my suggestion, based on experience, is to use a 3rd party command-line utility (for a free approach).
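For anyone who wants to try the command-line route from an AutoIt script, here is a rough sketch. It is not the program discussed earlier in this thread. It assumes a separate free utility, pdftotext (shipped with the Xpdf and Poppler tools), is installed and on the PATH. The helper name _PdfToText and the folder C:\pdfs are made up for illustration.

```autoit
; Sketch only: assumes pdftotext.exe (Xpdf or Poppler tools) is on the PATH.
; _PdfToText is a hypothetical helper name, not part of AutoIt or any UDF.
Func _PdfToText($sPdf, $sTxt)
    If Not FileExists($sPdf) Then Return SetError(1, 0, False)
    ; -layout asks pdftotext to keep the physical page layout where it can
    Local $sCmd = 'pdftotext -layout "' & $sPdf & '" "' & $sTxt & '"'
    Local $iExit = RunWait($sCmd, "", @SW_HIDE)
    ; RunWait sets @error when the program could not be started at all
    If @error Then Return SetError(2, 0, False)
    ; pdftotext exits with 0 on success; pass any other code back in @extended
    If $iExit <> 0 Then Return SetError(3, $iExit, False)
    Return True
EndFunc

; Batch use: convert every PDF in one folder (the path is a placeholder)
Local $sDir = "C:\pdfs"
Local $hSearch = FileFindFirstFile($sDir & "\*.pdf")
If $hSearch <> -1 Then
    While 1
        Local $sName = FileFindNextFile($hSearch)
        If @error Then ExitLoop ; no more matching files
        ; Swap the 4-character ".pdf" extension for ".txt"
        _PdfToText($sDir & "\" & $sName, $sDir & "\" & StringTrimRight($sName, 4) & ".txt")
    WEnd
    FileClose($hSearch)
EndIf
```

Because RunWait blocks until the converter exits, there are no keystroke delays to tune, which is what made the Adobe Reader approach unreliable for batches.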


  • 13 years later...
  • Jos locked this topic