
UWPOCR - Windows Platform Optical character recognition API Implementation


Try “fa-IR”? Also check if it is supported by your OS and install additional language pack if required (and available).



Edited by KaFu


On 4/17/2022 at 11:55 AM, KaFu said:

Try “fa-IR”? Also check if it is supported by your OS and install additional language pack if required (and available).



Yes, I tried fa-IR, farsi-IR, Per-IR, persian-IR, fa, persian, and per.

None of them worked. I've also installed the additional Persian language pack on my Windows 11 machine.


I've installed Persian and tested it.

GetText throws error 7 here: _UWPOCR_Log("FAIL __UWPOCR_GetText -> WaitForAsync IOcrResult")

So the OCR engine does not seem to respond. That's where it lost me :) Sorry, I have no further clue.

Here's a test sentence:


Edited by KaFu
  • 1 month later...

I used this UDF to OCR the content of a Command Prompt window. After I adjusted the ClearType settings in Control Panel, the recognition rate became very poor. Now I can't get back to the initial result even after disabling ClearType again. Is there any requirement stated in the API reference?

I also noticed that the OCR is not very accurate on strings of numbers.

Cleartype off.jpg


Don't know about the ClearType setting, but maybe using a different font type and size for the command prompt will increase accuracy? Create a shortcut to cmd.exe, in the right-click properties you can adjust the layout settings.

  • 4 weeks later...

Is there a minimum size the picture has to be? For small pictures (139x26 in my case) it doesn't work.
However, if I make the picture bigger without changing the size of the text, it detects the text properly.

I have attached both files.
The first one (Test.jpg) doesn't work.
The second, bigger one (Test2.jpg) does work.

Any clue what's going on?



Edited by Patrik96
  • 2 months later...



I am interested in trying this out in my own program, but I have a quick question.


I will be trying to use this on an application (I would prefer not to take screenshots), so I would use the second example from @mLipok.

1) If I am trying to find the location of a given text, e.g. “hello”, and then use those location details to left-click the center of that word, how would I add that?

Edited by Nick3399
  • 2 weeks later...

Superb toolset, many thanks.

In my process, I'm trying to read small boxes of black text on a white background, placed on a page-wide graphic. The decode is about 75% reliable and I'm looking for tips to improve that. The code is still way too messy to post here. The process is, in summary:

1. Use a WebCapture routine to capture the full page to a 1280x768 bitmap on a hidden window.
2. Convert bmp, using handle, to image using _GDIPlus_BitmapCreateFromHBITMAP($hBmp)
3. Crop image to extract the required box, using _GDIPlus_BitmapCloneArea()
4. Create a 100x200 blank white canvas and merge the cropped image into the middle of it, using _GDIPlus_ImageGetGraphicsContext() and _GDIPlus_GraphicsDrawImage(). I do this because the OCR is unhappy about small images (but will detect small text on a large enough file!)
5. Finally using _UWPOCR_GetText() to extract the text
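
Step 4 above could be sketched roughly as follows. This is only an illustration, not the poster's actual code: _PadOnWhiteCanvas() and _CenterOffset() are hypothetical helper names, the 200x100 canvas and the 60x20 stand-in crop are arbitrary demo values, and _GDIPlus_GraphicsDrawImageRect() is used instead of _GDIPlus_GraphicsDrawImage() so that the crop is drawn at its exact pixel size.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: offset that centres an inner length inside an outer length.
Func _CenterOffset($iOuter, $iInner)
    Return Int(($iOuter - $iInner) / 2)
EndFunc   ;==>_CenterOffset

; Hypothetical helper: draw $hSrcImage in the middle of a white canvas.
; Returns a new GDI+ bitmap handle; the caller must dispose it.
Func _PadOnWhiteCanvas($hSrcImage, $iCanvasW, $iCanvasH)
    Local $iSrcW = _GDIPlus_ImageGetWidth($hSrcImage)
    Local $iSrcH = _GDIPlus_ImageGetHeight($hSrcImage)
    Local $hCanvas = _GDIPlus_BitmapCreateFromScan0($iCanvasW, $iCanvasH)
    Local $hGfx = _GDIPlus_ImageGetGraphicsContext($hCanvas)
    _GDIPlus_GraphicsClear($hGfx, 0xFFFFFFFF) ; opaque white background
    ; DrawImageRect keeps the source at its own pixel size, without any DPI scaling
    _GDIPlus_GraphicsDrawImageRect($hGfx, $hSrcImage, _
            _CenterOffset($iCanvasW, $iSrcW), _CenterOffset($iCanvasH, $iSrcH), $iSrcW, $iSrcH)
    _GDIPlus_GraphicsDispose($hGfx)
    Return $hCanvas
EndFunc   ;==>_PadOnWhiteCanvas

; Demo with a stand-in crop; in the real process this would come from _GDIPlus_BitmapCloneArea().
_GDIPlus_Startup()
Global $hCrop = _GDIPlus_BitmapCreateFromScan0(60, 20)
Global $hPadded = _PadOnWhiteCanvas($hCrop, 200, 100)
_GDIPlus_ImageSaveToFile($hPadded, @TempDir & "\padded_demo.png")
_GDIPlus_BitmapDispose($hPadded)
_GDIPlus_BitmapDispose($hCrop)
_GDIPlus_Shutdown()
```

The saved file can then be handed to _UWPOCR_GetText() by path, as in step 5.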

I have tried enlarging the cropped image to a larger size, using _GDIPlus_ImageResize() instead of step 4 but this introduced extraneous noise (randomly coloured pixels) around the character edges, which affected decode reliability.

Any suggestions on process techniques to maximise the OCR reliability, whilst retaining the inherent simplicity of using the built in Win10 OCR capabilities?

I'm not, on this occasion, looking for coding; it's more about suggestions on whether, for example, I should try capturing a bigger web page first, or if there's a way of specifying the font/size/content type which would give the OCR module a tighter focus, etc. 



Win10 x64
Autoit (compiling to x86)


It would be great to see the input image to be sure what suggestions to give you.





Thanks Dany,

Typical source snip attached as file TIM2.bmp, with GDIPlus_ImageT.jpg showing how it appears just before submitting to UWPOCR for processing, as follows:

#include <UWPOCR.au3>

; Last argument is the "UseOCRLine" switch: False here, True was also tried
Local $sTIMTextResult = _UWPOCR_GetText(@ScriptDir & "\GDIPlus_ImageT.jpg", "en-GB", False)
MsgBox(0, "Capture Time", $sTIMTextResult)

; Strip separators that the OCR tends to misread or invent
$sTIMTextResult = StringReplace($sTIMTextResult, " ", "")
$sTIMTextResult = StringReplace($sTIMTextResult, ":", "")
$sTIMTextResult = StringReplace($sTIMTextResult, ".", "")
; Map letters that are really misread digits
$sTIMTextResult = StringReplace($sTIMTextResult, "O", "0")
$sTIMTextResult = StringReplace($sTIMTextResult, "C", "0")
$sTIMTextResult = StringReplace($sTIMTextResult, "I", "1")

; Rebuild the hh:mm:ss form from the six remaining digits
$sTIMTextResult = StringLeft($sTIMTextResult, 2) & ":" & StringMid($sTIMTextResult, 3, 2) & ":" & StringMid($sTIMTextResult, 5, 2)

MsgBox(0, "Modified Capture Time", $sTIMTextResult)

I've used the cleanup code with some success to allow for misreads of the colons and of 0/1 as O, C or I.
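
The same cleanup could be wrapped in a reusable function that also rejects results which still do not contain six digits. This is a sketch only: _NormalizeOcrTime() is a hypothetical name, and the assumption that every O or C is a misread 0 and every I a misread 1 is carried over from the code above.

```autoit
; Hypothetical helper: normalise an OCR'd "hh:mm:ss" reading.
; Returns "hh:mm:ss" on success, or "" with @error = 1 if six digits cannot be recovered.
Func _NormalizeOcrTime($sRaw)
    Local $s = StringRegExpReplace($sRaw, "[\s:.]", "") ; drop whitespace, colons and dots
    $s = StringRegExpReplace($s, "(?i)[oc]", "0")       ; O and C are taken as misread zeros
    $s = StringRegExpReplace($s, "(?i)i", "1")          ; I is taken as a misread one
    If Not StringRegExp($s, "^\d{6}$") Then Return SetError(1, 0, "")
    Return StringLeft($s, 2) & ":" & StringMid($s, 3, 2) & ":" & StringMid($s, 5, 2)
EndFunc   ;==>_NormalizeOcrTime

ConsoleWrite(_NormalizeOcrTime("12 3O.45") & @CRLF) ; prints 12:30:45
```

Checking @error after the call makes it easy to retry the capture instead of passing a malformed time on.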

Since posting yesterday, I experimented further, a bit more systematically, and found that:

- So long as the overall image was big enough, UWPOCR would at least try to decode the image
- The size of the text in the image, or the image itself, made little difference to the success
- Language setting and "UseOCRLine" parameters made no observable difference

What seems most promising at the moment is a filter I have just introduced that forces each pixel in the source image to either black or white: a pixel becomes white if its R, G and B values are all greater than 240, otherwise black. The decode reliability has increased dramatically. I can get away with the extra time used because the function is called infrequently and has a long window of opportunity to complete; also, the images involved are comparatively small. So I think I have a solution to the immediate problem.
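
A filter of that kind could look something like the sketch below. It is not the poster's code: _IsNearWhite() and _BinarizeBitmap() are hypothetical names, and the per-pixel _GDIPlus_BitmapGetPixel() / _GDIPlus_BitmapSetPixel() loop is slow, which is only acceptable for small images like the ones described here.

```autoit
#include <GDIPlus.au3>

; Hypothetical helper: True when R, G and B of an ARGB value are all above $iThreshold.
Func _IsNearWhite($iARGB, $iThreshold = 240)
    Local $iR = BitAND(BitShift($iARGB, 16), 0xFF)
    Local $iG = BitAND(BitShift($iARGB, 8), 0xFF)
    Local $iB = BitAND($iARGB, 0xFF)
    Return ($iR > $iThreshold) And ($iG > $iThreshold) And ($iB > $iThreshold)
EndFunc   ;==>_IsNearWhite

; Hypothetical helper: force every pixel of a GDI+ bitmap to pure white or pure black, in place.
; _GDIPlus_Startup() must have been called before using it.
Func _BinarizeBitmap($hBitmap, $iThreshold = 240)
    Local $iW = _GDIPlus_ImageGetWidth($hBitmap), $iH = _GDIPlus_ImageGetHeight($hBitmap)
    For $iY = 0 To $iH - 1
        For $iX = 0 To $iW - 1
            If _IsNearWhite(_GDIPlus_BitmapGetPixel($hBitmap, $iX, $iY), $iThreshold) Then
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFFFFFFFF) ; opaque white
            Else
                _GDIPlus_BitmapSetPixel($hBitmap, $iX, $iY, 0xFF000000) ; opaque black
            EndIf
        Next
    Next
EndFunc   ;==>_BinarizeBitmap

; Quick check of the threshold rule on two sample colours
ConsoleWrite(_IsNearWhite(0xFFF5F6F7) & @CRLF) ; all channels above 240
ConsoleWrite(_IsNearWhite(0xFFF5F6F0) & @CRLF) ; blue channel is exactly 240, so not white
```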

I believe that my source image may be not pure black and white and that's what is at the root of the poor decoding. To me, that points to the Windows 10 native OCR being weak - the UDF is certainly working well and is remarkably easy to understand, use and integrate. I'd certainly be interested in your thoughts.




//Edit: Well that theory has just been blown out of the water. I realised that the attached images were captured after I had applied the filter. So I turned it off to run new images... and the OCR is working fine !?! So new images attached without filter described above.






Edited by g0gcd
Updated information

Hello @g0gcd. What I would do is append your image to an image with a similar text pattern, so that the OCR engine can get a better result.

So you will end up with a joined image like this one:


then process it with the OCR.

Test Code:

#include <ScreenCapture.au3>
#include <GDIPlus.au3>
#include <StringConstants.au3> ;for $STR_STRIPALL
#include "..\UWPOCR.au3"

_Example()

Func _Example()
    _GDIPlus_Startup() ;GDI+ must be started before any _GDIPlus_* call

    ;hImage/hBitmap GDI
    Local $hTimer = TimerInit()
    Local $sImageFilePath = @ScriptDir & "\JoinedImage.jpg"
    Local $sImageTIM2FilePath = @ScriptDir & "\TIM2.bmp"
    Local $sText = "0123456789" ;helper text drawn next to the source image
    Local Const $iW = 270, $iH = 40
    Local $hImageToProcess = _GDIPlus_ImageLoadFromFile($sImageTIM2FilePath)
    Local $hBitmap = _GDIPlus_BitmapCreateFromScan0($iW, $iH) ;create an empty bitmap
    Local $hBmpCtxt = _GDIPlus_ImageGetGraphicsContext($hBitmap) ;get the graphics context of the bitmap
    _GDIPlus_GraphicsSetSmoothingMode($hBmpCtxt, $GDIP_SMOOTHINGMODE_HIGHQUALITY)
    _GDIPlus_GraphicsClear($hBmpCtxt, 0xFFFFFFFF) ;clear bitmap with color white
    _GDIPlus_GraphicsDrawString($hBmpCtxt, $sText, 0, 0, "Arial", 18) ;draw some text to the bitmap
    _GDIPlus_GraphicsDrawImage($hBmpCtxt, $hImageToProcess, 140, -16) ;place the source image right of the helper text
    _GDIPlus_ImageSaveToFile($hBitmap, $sImageFilePath) ;save bitmap to disk
    Local $sOCRTextResult = _UWPOCR_GetText($sImageFilePath)
    ;remove the helper text and all whitespace from the result before showing it
    MsgBox(0, "Time Elapsed: " & TimerDiff($hTimer), StringStripWS(StringReplace($sOCRTextResult, $sText, ""), $STR_STRIPALL))

    ;release GDI+ resources
    _GDIPlus_GraphicsDispose($hBmpCtxt)
    _GDIPlus_BitmapDispose($hBitmap)
    _GDIPlus_ImageDispose($hImageToProcess)
    _GDIPlus_Shutdown()
EndFunc   ;==>_Example





That's inspired! ✔️

I'll try that approach and let you know how it goes.

Brilliant, thanks



Thank you so much DanyFirex!

By experiment I have found:

1. The canvas (combined image) itself needs to be of a significant size. I found that 200x200 pixels was the minimum; any less than this caused intermittent decoding irrespective of the image quality. My solution uses 900 wide by 300 tall.

2. The helper text provided assistance wherever it was placed on the canvas, but the best improvement came from placing it immediately to the left of the source image text, with a gap of about 1 or 2 "spaces" between the helper text and the source image text. No further improvement came from adding alphabetic characters or punctuation to the helper text. (Note: en-GB used)

3. The helper text helped immensely whatever font and size was used for it, but in my case, the best improvement came from using the same font and size as the source (Arial, 27pt).

4. I found that the source image text font size should be between 12 and 30. Too big and the OCR misreads; too small and the OCR misses small characteristic differences. My source was 9pt from the capture process, which I enlarged to 27pt.

5. With poor source images, it does help to force pixels into pure black/white before merging into the canvas. There may be a UDF out there that does that but I did it by hand as I needed to find the edge of my source "white" box within a coloured image anyway.

6. The OCR seems to "like" a substantial white border around the two elements. If the helper text was too close to any edge, I had problems decoding. Between 50 and 100 pixels seemed to be a minimum acceptable border. (Note, this may also explain 1.)

7. I inserted __UWPOCR_Initialize() before each _UWPOCR_GetText() function call. (I process several boxes on each run.) I can't see why this might be helpful but, whilst running repeated, frequent testing, I found several results were corrupted with previous or nonsensical values. I will continue to hunt my code for an error on my part, where I haven't cleared a variable or have inadvertently re-used it!
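
Findings 1, 2, 3 and 6 could be combined along these lines. This is a sketch under assumptions, not the code actually used: _HasBorder() and _BuildHelperCanvas() are hypothetical names, the 75 pixel border is just a value inside the 50 to 100 range mentioned above, and $iSrcX has to be tuned by hand so that the source image lands one or two character widths to the right of the helper text.

```autoit
#include <GDIPlus.au3>

; Hypothetical check: does a rectangle keep at least $iBorder pixels clear on every side?
Func _HasBorder($iCanvasW, $iCanvasH, $iX, $iY, $iW, $iH, $iBorder)
    Return $iX >= $iBorder And $iY >= $iBorder _
            And ($iX + $iW) <= ($iCanvasW - $iBorder) _
            And ($iY + $iH) <= ($iCanvasH - $iBorder)
EndFunc   ;==>_HasBorder

; Hypothetical helper: white canvas, helper digits on the left, source image at ($iSrcX, $iBorder).
; Requires _GDIPlus_Startup() to have been called.
; Returns a new bitmap handle (caller disposes), or 0 with @error = 1 if the border would be violated.
Func _BuildHelperCanvas($hSrcImage, $iSrcX, $sHelper = "0123456789", $iCanvasW = 900, $iCanvasH = 300, $iBorder = 75)
    Local $iW = _GDIPlus_ImageGetWidth($hSrcImage), $iH = _GDIPlus_ImageGetHeight($hSrcImage)
    If Not _HasBorder($iCanvasW, $iCanvasH, $iSrcX, $iBorder, $iW, $iH, $iBorder) Then Return SetError(1, 0, 0)
    Local $hCanvas = _GDIPlus_BitmapCreateFromScan0($iCanvasW, $iCanvasH)
    Local $hGfx = _GDIPlus_ImageGetGraphicsContext($hCanvas)
    _GDIPlus_GraphicsClear($hGfx, 0xFFFFFFFF) ; white background
    _GDIPlus_GraphicsDrawString($hGfx, $sHelper, $iBorder, $iBorder, "Arial", 27) ; helper text, same font and size as the source
    _GDIPlus_GraphicsDrawImageRect($hGfx, $hSrcImage, $iSrcX, $iBorder, $iW, $iH) ; source image at its own pixel size
    _GDIPlus_GraphicsDispose($hGfx)
    Return $hCanvas
EndFunc   ;==>_BuildHelperCanvas
```

The returned bitmap can be saved with _GDIPlus_ImageSaveToFile() and the file passed to _UWPOCR_GetText(); the helper digits then need to be removed from the result, as in the test code earlier in the thread.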

I hope these notes are helpful to anyone else who is struggling with decoding a less than perfect source image. My infinite thanks to DanyFirex for the pointer towards using "helper" text, as that unlocked a massive improvement in reliability.

Best Regards / Saludos




Edited by g0gcd
Update with findings