Max82

PLEASE HELP ME :)

8 posts in this topic

Can someone help me, please? I am a novice in AutoIt programming. For my teacher, I have to write a small program that goes through all the files in a folder (about 10,000 PDF files) and checks each one in turn to see whether it is an image file (a simple scanned image without text) or a real PDF with text inside it (OCR). Every image file (without text) should be moved to another folder for further processing with an OCR program. Doing this manually would be an endless task, so the teacher asked me to write this little program. Is this possible with AutoIt? Maybe by creating an array? And how could I check for the presence of text inside each file? Any suggestions will be greatly appreciated.

Max from Rome (Italy)


#2 ·  Posted (edited)

For checking if there is text in the file you could try checking the size maybe? - DirGetSize ( "path" [, flag] )

This also might be useful?

FileRead ( filehandle or "filename" [, count] ) P.S. If this is useful, I'm pretty sure it would be filehandle.

P.P.S. In the SciTE script editor, try pressing the F1 key and using the help file.

~Bambeh

Edited by IKilledBambi


_FileListToArray(), FileRead, FileMove.


#4 ·  Posted (edited)

Do you have one folder that contains many files with no extensions? Or are some of them picture files that were renamed to *.pdf?

I mean, maybe you can use the Windows sorting option: right-click, arrange icons by Type, then select all the pictures and move them to another directory.

Edited by goldenix


Thank you so much, TehWhale. Unfortunately, I am a real novice with AutoIt, as I already said. So I have to start the script with _FileListToArray() to populate the array with the files in the directory, right? And is FileRead useful for checking whether each file can be opened for text reading? Can this function also handle PDF files? So I have to create the array with _FileListToArray, run a loop to check each file (maybe with For...Next?), check whether the file can be opened with FileRead, and, if an error is returned, move the file to the other folder with FileMove?

Thank you so much if you want to answer me in your spare time.

Max


#6 ·  Posted (edited)

That's pretty much it.

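#include <File.au3>
; List only the PDF files in the source folder (flag 1 = files only).
$Array = _FileListToArray("C:\docs\", "*.pdf", 1)
If @error Then Exit ; folder not found or no matching files
$Detector = "image" ; something that occurs only in image files
For $i = 1 To $Array[0]
    ; Move the file when the detector string appears in its contents.
    If StringInStr(FileRead("C:\docs\" & $Array[$i]), $Detector) Then FileMove("C:\docs\" & $Array[$i], "C:\images\" & $Array[$i])
Next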

This should help you out a bit. You need to go through a few image files (open them in Notepad), see whether there's something that all image files have in common, and put that in the $Detector variable.
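As an alternative to opening each sample in Notepad, the comparison could be scripted. The following is only a rough sketch: the helper name _CountMarker and the path C:\docs\sample.pdf are made up for illustration, and the two markers are just candidates to try, not a confirmed detector. It also assumes that the null bytes found in PDF data do not upset the string functions on your AutoIt version.

```autoit
; Hypothetical helper: count how often $sMarker occurs in $sData.
Func _CountMarker($sData, $sMarker)
    ; StringReplace sets @extended to the number of replacements it made.
    StringReplace($sData, $sMarker, "")
    Return @extended
EndFunc

; Read the sample in binary mode so the bytes are not decoded as text.
Local $hFile = FileOpen("C:\docs\sample.pdf", 16) ; 16 = binary mode; the path is an assumption
If $hFile <> -1 Then
    Local $sData = BinaryToString(FileRead($hFile)) ; ANSI view of the raw bytes
    FileClose($hFile)
    ; Print the counts for two candidate markers.
    ConsoleWrite("/Subtype /Image: " & _CountMarker($sData, "/Subtype /Image") & @CRLF)
    ConsoleWrite("/Font: " & _CountMarker($sData, "/Font") & @CRLF)
EndIf
```

Running this against a handful of known scans and known text PDFs should show which marker, if any, separates the two groups.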

Edited by TehWhale


Thank you so much, TehWhale, I'll follow your advice. Maybe it is better to check all the PDF files (the ones with text inside) for a generic string (a single letter such as "a", for example, which is present in documents in any language), so that if it is not found, the file is surely an image file (with no text inside) and can be moved to the other folder. Thank you so much, I'll give you feedback if I succeed. I owe you a pizza (or an espresso, if you prefer) when you come to Roma :)

Max


If you find this key in a PDF:

"/Subtype /Image"

then there is an image in it.

HTH, Reinhard
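
Building on the "/Subtype /Image" key, one possible way to put the pieces of this thread together is sketched below. The function name _LooksLikeScanOnly is invented for illustration, and the extra "/Font" check is an assumption that is not from the thread: the idea is that a PDF with real text normally references a font, while a bare scan does not. Treat it as a heuristic only, because PDFs that keep their objects in compressed object streams may hide both keys from a plain search.

```autoit
#include <File.au3>

; Hypothetical classifier: True when the raw PDF data mentions an image but no font.
Func _LooksLikeScanOnly($sData)
    ; PDF writers may or may not put whitespace between the two names.
    Local $bHasImage = StringRegExp($sData, "/Subtype\s*/Image")
    Local $bHasFont = StringInStr($sData, "/Font") > 0 ; assumption: text needs a font
    Return $bHasImage And Not $bHasFont
EndFunc

Local $sSrc = "C:\docs\", $sDst = "C:\images\" ; folders used earlier in the thread
Local $aFiles = _FileListToArray($sSrc, "*.pdf", 1) ; 1 = files only
If Not @error Then
    For $i = 1 To $aFiles[0]
        Local $hFile = FileOpen($sSrc & $aFiles[$i], 16) ; 16 = binary mode
        If $hFile = -1 Then ContinueLoop ; skip files that cannot be opened
        Local $sData = BinaryToString(FileRead($hFile))
        FileClose($hFile)
        ; Move only the files that look like scans without a text layer.
        If _LooksLikeScanOnly($sData) Then FileMove($sSrc & $aFiles[$i], $sDst & $aFiles[$i])
    Next
EndIf
```

It is probably wise to try this on a copy of a few dozen files first and check the result by hand before running it on all 10000.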
