Jump to content
Miliardsto

Tesseract OCR - Reading text

Recommended Posts

What have you tried, where is your script? 


If I posted any code, assume that code was written using the latest release version unless stated otherwise. Also, if it doesn't work on XP I can't help with that because I don't have access to XP, and I'm not going to.
Give a programmer the correct code and he can do his work for a day. Teach a programmer to debug and he can do his work for a lifetime - by Chirag Gude
How to ask questions the smart way!

I hereby grant any person the right to use any code I post, that I am the original author of, on the autoitscript.com forums, unless I've specifically stated otherwise in the code or the thread post. If you do use my code all I ask, as a courtesy, is to make note of where you got it from.

Back up and restore Windows user files _Array.au3 - Modified array functions that include support for 2D arrays.  -  ColorChooser - An add-on for SciTE that pops up a color dialog so you can select and paste a color code into a script.  -  Customizable Splashscreen GUI w/Progress Bar - Create a custom "splash screen" GUI with a progress bar and custom label.  -  _FileGetProperty - Retrieve the properties of a file  -  SciTE Toolbar - A toolbar demo for use with the SciTE editor  -  GUIRegisterMsg demo - Demo script to show how to use the Windows messages to interact with controls and your GUI.  -   Latin Square password generator

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

  • Similar Content

    • By PuneetTewani
      #include <IE.au3>
      #include <Tesseract.au3>
      #include <MsgBoxConstants.au3>
      #include <Math.au3>
      #include <FileConstants.au3>
      #include <StringConstants.au3>
      #include <File.au3>
      #include <ScreenCapture.au3>
      #include <sound.au3>
      #Include <WinAPI.au3>
      #include <Date.au3>
       
      $OCR_Result = _TesseractScreenCapture(0,"",1,2,220,660,500,730,1)
      $OCR_Result1 = _TesseractScreenCapture(0,"",1,2,220,660,500,730,1)
      $OCR_Result2 = _TesseractScreenCapture(0,"",1,2,220,660,500,730,1)
      $OCR_Result3 = _TesseractScreenCapture(0,"",1,2,220,660,500,730,1)

      $sound = _SoundStatus("C:\ExpertAdvisorBuyAlert.wav")
      while _nowtime < 3.30 pm
          If $sound = True Then
             if $OCR_Result1 > $OCR_Result2
             
          EndIf
      EndIf
      Wend
      Trying to ocr some values on chart in real time(once per minute) and buy/sell securities on basis of alert generated in my software.
      I am struck onto few steps.
      1. On Tesseract Screen Capture indentation parameters. How can we determine the exact parameters if I just want numeric values only.
      2. The Tesseract Screen Capture generates and error Obj1 on line 185 which needs to be resolved.
      3. Sometimes lines get overlapped with values. What to do in that case.
      3. Detecting the sound as and when it approaches and then comparing the ocr values to decide on either buy or sell.
      The values that needs to be fetched are encircled.

    • By mdepot
      I have a situation where I am repeatedly capturing a region of the screen and feeding it into Tesseract OCR.  Since the OCR is a relatively slow operation, I would like to create an in memory cache of the ocr results.  An ideal hash key for this cache would be a checksum of the captured image.  With this I could capture the region, checksum it, and then only if I don't get a cache hit I would write the image out to disk for external OCR.
      Now I know I can do this by saving the captured image out to disk, and then summing the disk file with _Crypt_HashFile().  But that's still slower than I would like, and it shouldn't be necessary.  Ideally, it should be possible to checksum the image data directly in memory so I don't have to go to disk at all.  In order to do that, I need a way to dump a representation of the image into a string  (or some equivalent).  Then I could use the _Crypt_HashData() function against that string to create my cache hash key.
      Googling around I found an article here that shows a way to convert an image object to a byte array using System.Drawing.  This was the closest thing I found to what I'm trying to do.  I don't know if that method could be used from within AutoIT, or if perhaps there may be a better way I don't know about.  If someone could give me a shove in the right direction it would be a big help.  Thanks!
    • By newITman
      HI All!
      Im new here and interested in  tesseract ocr.
      There are many examples in the forum but too difficult to me .
      I just want to see how its working in few line cod .
      I have installed  tesseract and microsoft office 2003 .
      My cod:
      $ImageToReadPath = @MyDocumentsDir & "\GDIPlus_Image10.jpg"
      $ResultTextPath = @MyDocumentsDir & "\Result"
      $OutPutPath = $ResultTextPath & "auto.txt"
      ;$TesseractExePath = @ProgramsDir & "\Tesseract.exe"
      $TesseractExePath =@ProgramFilesDir & "\Tesseract-OCR\tesseract.exe"
      ShellExecuteWait($TesseractExePath, '"' & $ImageToReadPath & '" "' & $ResultTextPath & '"', "", "", @SW_HIDE)
      If @error Then
          Exit MsgBox(0, "Error", @error)
      EndIf
      MsgBox(0, "Result", FileRead($OutPutPath))
      FileDelete($OutPutPath)
       
      Please help me.
      my picture:

    • By nbg15
      Hello everybody..
       
      i have this picture here *attached* and this script here: 
       
      $ImageToReadPath = @MyDocumentsDir & "\GDIPlus_Image2.jpg" $ResultTextPath = @MyDocumentsDir & "\Result" $OutPutPath = $ResultTextPath & ".txt" $TesseractExePath = @MyDocumentsDir & "\Tesseract.exe" ShellExecuteWait($TesseractExePath, '"' & $ImageToReadPath & '" "' & $ResultTextPath & '"', "", "", @SW_HIDE) If @error Then Exit MsgBox(0, "Error", @error) EndIf MsgBox(0, "Result", FileRead($OutPutPath)) FileDelete($OutPutPath)  
      but tesseract doesnt recognized the correct word... and gives me trash back...

      this is the image >> 
      and the result was >> "samm" 

      the image was an normal jpg and generated with this code here:
       
      _ScreenCapture_Capture(@MyDocumentsDir & "\GDIPlus_Image2.jpg", 712,268,853,284)
      Could anybody give me a hint what i can do better to get this easy image to text?
       
      thank u very much!!!
       
       
      Edit: i also tried to capture the screen as bmp with a higher resolution... nothing changed... 
       
       
      _ScreenCapture_SetBMPFormat(4) _ScreenCapture_Capture(@MyDocumentsDir & "\GDIPlus_Image.bmp", 712,279,853,295)  
×
×
  • Create New...