Max82 Posted November 29, 2008

Can someone help me, please? I am a novice in AutoIt programming. For my teacher, I have to write a small program that goes through all the files in a folder (about 10,000 PDF files) and checks, one by one, whether each is an image file (a plain scanned image without text) or a real PDF with text inside it (OCR). Every image file (without text) should be moved to another folder for further processing with an OCR program. Doing this by hand would be an endless task, so the teacher asked me to write this small program. Is it possible with AutoIt? Maybe by creating an array? And how could I check for the presence of text inside each file? Any suggestions will be greatly appreciated.

Max from Rome (Italy)

IKilledBambi Posted November 29, 2008 (edited)

For checking whether there is text in the file, you could try checking the size, maybe?

DirGetSize ( "path" [, flag] )

This might also be useful:

FileRead ( filehandle or "filename" [, count] )

P.S. If this is useful, I'm pretty sure it would be filehandle.

P.P.S. In the SciTE script editor, try pressing the F1 key and using the help file.

~Bambeh

Edited November 29, 2008 by IKilledBambi

TehWhale Posted November 29, 2008

_FileListToArray(), FileRead, FileMove.

goldenix Posted November 29, 2008 (edited)

You have one folder and it contains many files with no extensions? Or are some of them picture files that were renamed to *.pdf? I mean, maybe you can use the Windows sorting option: right-click, arrange icons by Type, then select all the pictures and move them to the other directory.

Edited November 29, 2008 by goldenix

Max82 Posted November 29, 2008 (Author)

Thank you so much, TehWhale. Unfortunately I am a real novice with AutoIt, as I already said. So I have to start the script with _FileListToArray() to populate the array with the individual files in the directory, right? And is FileRead useful for checking whether a single file can be opened for text reading? Can this function also handle PDF files? So I have to create the array with _FileListToArray, run a loop to check each file (maybe with For...Next?), check whether the file can be opened with FileRead, and, if an error is returned, move the file to the other folder with FileMove?

Thank you so much if you want to answer me in your spare time.

Max

TehWhale Posted November 29, 2008 (edited)

That's pretty much it.

    #include <File.au3>

    ; List everything in the source folder; element [0] holds the count
    $Array = _FileListToArray("C:\docs\")
    $Detector = "image" ; something that is only IN image files
    For $i = 1 To $Array[0]
        ; Move the file if its contents contain the detector string
        If StringInStr(FileRead("C:\docs\" & $Array[$i]), $Detector) Then FileMove("C:\docs\" & $Array[$i], "C:\images\" & $Array[$i])
    Next

This should help you out a bit. You need to go through a few image files (open them in Notepad), see if there's something that all the image files have in common, and put that in the $Detector variable.

Edited November 29, 2008 by TehWhale

Max82 Posted November 29, 2008 (Author)

Thank you so much, TehWhale, I'll follow your advice. Maybe it is better to check all the PDF files (the ones with text inside) for a generic string (a single letter such as "a", for example, which is present in documents in any language), so that if it is not found, the file is surely an image file (with no text inside) and can be moved to the other folder.

Thank you so much. I'll give you feedback if I succeed. I owe you a pizza (or an espresso, if you prefer) when you come to Rome.

Max

ReFran Posted November 29, 2008

If you find the key "/Subtype /Image" in a PDF, there is an image in it.

HTH, Reinhard
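
Putting the suggestions in this thread together, a minimal sketch might look like the following. It is only a heuristic under stated assumptions, not a definitive test. The folders are the ones from TehWhale's example. The helper name `_LooksImageOnly` is hypothetical. The extra check for a "/Font" key is an assumption that was not suggested in the thread: a page that shows real text normally references a font resource, while a plain scan usually does not. Both keys can be hidden inside compressed object streams in newer PDFs, and the sketch also assumes that AutoIt's string functions cope with the raw bytes, so it is worth trying on a handful of known files before trusting it with 10,000.

```autoit
#include <File.au3>

; Assumed folders, taken from TehWhale's example (no trailing backslash here)
Global Const $sSourceDir = "C:\docs"
Global Const $sImageDir = "C:\images"

; Hypothetical helper: True if the raw PDF data contains an image key
; but no font key. Heuristic only; compressed object streams can hide both.
Func _LooksImageOnly($sRaw)
    ; The key may be written with or without a space after /Subtype
    Local $bHasImage = (StringInStr($sRaw, "/Subtype /Image") > 0) Or (StringInStr($sRaw, "/Subtype/Image") > 0)
    ; Assumption: text pages reference a /Font resource, scans usually do not
    Local $bHasFont = StringInStr($sRaw, "/Font") > 0
    Return $bHasImage And Not $bHasFont
EndFunc

Func _SortPdfs()
    ; Flag 1 = return files only; element [0] holds the count
    Local $aFiles = _FileListToArray($sSourceDir, "*.pdf", 1)
    If @error Then Return 0
    DirCreate($sImageDir)
    Local $iMoved = 0
    For $i = 1 To $aFiles[0]
        Local $sPath = $sSourceDir & "\" & $aFiles[$i]
        Local $hFile = FileOpen($sPath, 16) ; 16 = binary read mode
        If $hFile = -1 Then ContinueLoop
        Local $sRaw = BinaryToString(FileRead($hFile))
        FileClose($hFile)
        If _LooksImageOnly($sRaw) Then
            ; FileMove returns 1 on success
            $iMoved += FileMove($sPath, $sImageDir & "\" & $aFiles[$i])
        EndIf
    Next
    Return $iMoved
EndFunc

MsgBox(64, "Done", _SortPdfs() & " file(s) moved to " & $sImageDir)
```

Files that contain both an image and a font (for example, scans that have already been through OCR) stay where they are, which matches the goal of moving only the files that still need OCR.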