syno Posted January 15, 2009

Hi guys,

I am trying to put together a script that searches a document for a given string and returns a result when it finds it. The code I have so far is:

    ; Read the whole file into a string
    $File = FileRead("C:\Documents and Settings\My Documents\Software Testing\Autoit Scripts\Project\Calculation.pdf")
    ; StringInStr returns the position of the match, or 0 if there is none
    $res = StringInStr($File, "income")
    If $res = 0 Then
        MsgBox(0, "No String Found", "Please try again!")
        Exit
    EndIf
    MsgBox(0, "Yahoo", "We have found one. It can be found at " & $res)

However, even if the string 'income' exists in the PDF document, nothing is found. Is it possible to use StringInStr to search for strings in a PDF document?

Thanks
FireFox Posted January 15, 2009

@syno
- Try converting your PDF to a text file first; once your search is done, you can rewrite the PDF if there are changes.
- Try StringReplace and check @extended, which holds the number of replacements made. If it is greater than 0, the string you are searching for exists:

    $read = FileRead("yourfile.ext")
    StringReplace($read, "income", "income")
    ; @extended must be read right after StringReplace
    If @extended > 0 Then
        MsgBox(64, "income", "found !")
    EndIf

Cheers, FireFox.
SpookMeister Posted January 15, 2009 (edited)

I don't think so. PDF documents have a "proprietary" format and cannot be read directly like the strings of a text file. It might be possible to open the file in Adobe Reader (or another PDF viewer) and perform a search there, but I'm not sure how much help you will be able to find here on it.

Edited January 15, 2009 by SpookMeister
trancexx Posted January 15, 2009

Text inside a PDF file is compressed (with the Flate or LZW algorithm, depending on the application that created the file); that's why you cannot find that string. The text portions are easy to detect but, as I said, compressed.
syno (Author) Posted January 16, 2009

OK, that makes sense. In that case, is there some other way of converting the PDF document into a text file using AutoIt for this purpose? If not, it looks like I will need to use a PDF-to-DOC converter first.

Thanks for your help.
ptrex Posted January 17, 2009

@all

Of course everything is searchable on a PC if you use the right tools and approach. For your needs, you can fall back on the native MS Indexing Service and extend it with the PDF IFilter.

Read more here: MS Indexing Service

I hope this gets you started.

Regards,
ptrex
trancexx Posted January 17, 2009

I've made a request for an LZW algorithm in Ward's machine code thread, mainly for this purpose. I got no response, but I guess you never know. I'm still waiting.
ReFran Posted January 17, 2009

With the free command-line tool PDFTK.exe you can uncompress the file and then search for text. That doesn't work with images or scanned documents, but normal text is no problem.

You can also search for text using AutoIt and Adobe Reader, either through the Find menu item or with some Adobe JS code that returns the page(s) where the text occurs.

Which way you go also depends on the results you want: page numbers, only whether the text occurs, and so on. So you may want to tell us a bit more.

Best regards, Reinhard
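
The PDFTK route described above could look roughly like the sketch below. It is not a tested solution: it assumes pdftk.exe is on the PATH, the input path and the temporary file name are made up for illustration, and the `uncompress` output option is the only pdftk feature used. Even after uncompressing, a word can be split across several text operators or stored in a font-specific encoding, so a plain StringInStr search may still miss text that is visible in a viewer.

```autoit
; Sketch only: assumes pdftk.exe is on the PATH.
; Both paths below are hypothetical examples, not taken from the thread.
Local $sPdf = "C:\Temp\Calculation.pdf"
Local $sPlain = @TempDir & "\Calculation_uncompressed.pdf"

; Ask pdftk to write a copy of the PDF with its page streams uncompressed.
Local $iExit = RunWait('pdftk "' & $sPdf & '" output "' & $sPlain & '" uncompress', "", @SW_HIDE)
If @error Or $iExit <> 0 Then
    MsgBox(16, "pdftk failed", "Could not uncompress " & $sPdf)
    Exit 1
EndIf

; Read the uncompressed copy and search it the same way as the original script.
Local $sData = FileRead($sPlain)
If @error Then
    MsgBox(16, "Read failed", "Could not read " & $sPlain)
    Exit 1
EndIf

Local $iPos = StringInStr($sData, "income")
If $iPos = 0 Then
    MsgBox(0, "No String Found", "Please try again!")
Else
    MsgBox(0, "Found", "First match at character position " & $iPos)
EndIf

; Remove the temporary copy when done.
FileDelete($sPlain)
```

The position reported is an offset into the raw PDF file, not a page number; if page numbers are needed, the Adobe Reader approach mentioned above is probably the better fit.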