OCRSpace UDF


MrKm

AutoIT-OCRSpace-UDF1.3.zip

This tiny yet powerful UDF helps you convert images to text using the OCRSpace API version 3.50.

Detect text from a local file.

#include "OCRSpaceUDF\_OCRSpace_UDF.au3"

; =========================================================
; Example 2 : Gets text from an image from a local path reference
;           : Searchable PDF is requested here (it is not requested by default).
;           : Processes it using table logic, suited to receipts.
; =========================================================

$b_Create_Searchable_PDF = True

; Use a table logic for receipt OCR
$b_Table = True

; Set your key here.
$v_OCRSpaceAPIKey = ""

$OCROptions = _OCRSpace_SetUpOCR($v_OCRSpaceAPIKey, 1, $b_Table, True, "eng", True, Default, Default, $b_Create_Searchable_PDF)
$sText_Detected = _OCRSpace_ImageGetText($OCROptions, @scriptdir & "\receipt.jpg", 0, "SEARCHABLE_URL")

ConsoleWrite( _
        " Detected text   : " & $sText_Detected & @CRLF & _
        " Error Returned  : " & @error & @CRLF & _
        " PDF URL         : " & Eval("SEARCHABLE_URL") & @CRLF)

 

Detect text from a URL reference.

#include "OCRSpaceUDF\_OCRSpace_UDF.au3"

; =========================================================
; Example 1 : Gets text from an image using a URL reference
;           : Searchable PDF is not requested.
;           : Processes it using basic OCR logic.
; =========================================================

$v_OCRSpaceAPIKey = ""

; SetUp some preferences..
$OCROptions = _OCRSpace_SetUpOCR($v_OCRSpaceAPIKey, 1, False, True, "eng", True, Default, Default, False)
; Make the request..
$sText_Detected = _OCRSpace_ImageGetText($OCROptions, "https://i.imgur.com/vbYXwJm.png", 0)

ConsoleWrite( _
        " Detected text   : " & $sText_Detected & @CRLF & _
        " Error Returned  : " & @error & @CRLF)
 

 

Detect text from a URL reference to an array

#include "OCRSpaceUDF\_OCRSpace_UDF.au3"
#include <array.au3>

; Set your key here.
$v_OCRSpaceAPIKey = ""
    
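; Table logic is not needed here, so it is passed as False ($b_Table was never defined in this example).
$OCROptions = _OCRSpace_SetUpOCR($v_OCRSpaceAPIKey, 1, False, True, "eng", True, Default, Default, False)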
    
; Below, the return type is set to 1 to return an array containing the coordinates of the bounding box of each detected word,
; in the format : #WordDetected , #Left , #Top , #Height , #Width
    
$aText_Detected = _OCRSpace_ImageGetText($OCROptions, "https://i.imgur.com/Z1enogD.jpeg", 1)

_ArrayDisplay($aText_Detected, "")

 

Download Latest Version

https://github.com/MurageKabui/AutoIT-OCRSpace-UDF

Edited by MrKm
Updated project URL.

  • 5 months later...

hi, thank you for this! I have had problems with accuracy with Tesseract (specifically numbers). I found this to be more reliable. Although sometimes it works better with Engine 1, sometimes Engine 2.

Also, sometimes I get what I assume to be a timeout message. Any idea on how to handle this:

"C:\Program Files (x86)\AutoIt3\Include\_OCRSpace_UDF.au3" (508) : ==> Unknown function name.:
If Not _WinAPI_IsInternetConnected() Then
If Not ^ ERROR


  • Moderators

Hi,

Please make sure you read the OCRSpace API link before you get too involved in this UDF. This is NOT a stand-alone OCR solution: it requires an internet connection to the host server, and the free subscription has various limits on request frequency and size. I do not want to put anyone off, but this is not clear from the OP.

M23

 


On 1/28/2022 at 10:02 PM, drrehak said:

hi, thank you for this! I have had problems with accuracy with Tesseract (specifically numbers). I found this to be more reliable. Although sometimes it works better with Engine 1, sometimes Engine 2.

Also, sometimes I get what I assume to be a timeout message. Any idea on how to handle this:

"C:\Program Files (x86)\AutoIt3\Include\_OCRSpace_UDF.au3" (508) : ==> Unknown function name.:
If Not _WinAPI_IsInternetConnected() Then
If Not ^ ERROR

You're welcome, and thanks for the feedback!

_WinAPI_IsInternetConnected() is declared in WinAPIDiag.au3. My bad, I forgot to include it in the UDF. You can add the include to your own script or to the OCRSpace UDF:

#include <WinAPIDiag.au3>
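
For the occasional failed request, here is a minimal sketch of one way to retry, reusing the URL example from the first post. It assumes that a non-zero @error after _OCRSpace_ImageGetText means the request failed (the usual AutoIt convention, which the UDF's examples also print). The names $iMaxAttempts, $iDelayMs and $iError and the retry values are hypothetical and not part of the UDF.

```autoit
#include <WinAPIDiag.au3> ; declares _WinAPI_IsInternetConnected()
#include "OCRSpaceUDF\_OCRSpace_UDF.au3"

; Set your key here.
$v_OCRSpaceAPIKey = ""

$OCROptions = _OCRSpace_SetUpOCR($v_OCRSpaceAPIKey, 1, False, True, "eng", True, Default, Default, False)

; Hypothetical retry settings, not part of the UDF.
$iMaxAttempts = 3
$iDelayMs = 2000

$sText_Detected = ""
$iError = 1

For $i = 1 To $iMaxAttempts
    $sText_Detected = _OCRSpace_ImageGetText($OCROptions, "https://i.imgur.com/vbYXwJm.png", 0)
    $iError = @error ; copy it right away, the next function call resets @error
    If $iError = 0 Then ExitLoop ; assumption: 0 means the request succeeded
    Sleep($iDelayMs) ; wait before trying again
Next

ConsoleWrite( _
        " Detected text   : " & $sText_Detected & @CRLF & _
        " Error Returned  : " & $iError & @CRLF)
```

Keep the free-tier request limits in mind when choosing the number of attempts and the delay.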