
UWPOCR - Windows Platform Optical character recognition API Implementation


Hello guys.  I recently saw some posts that Windows 10 provides OCR API. So I decided to create a UDF.     What's UWPOCR? UWPOCR UDF is a simple library to use Universal Windows Pl

Here is a script to compare UWPOCR vs Tesseract. (it just load image process and show result in editboxes and display the processed image to compare visually.  #include <WinAPI


On 4/17/2022 at 11:55 AM, KaFu said:

Try “fa-IR”? Also check if it is supported by your OS and install additional language pack if required (and available).

 

 

Yes, I tried fa-IR, farsi-IR, Per-IR, persian-IR, fa, persian, per.

None of them worked. I've also installed the additional Persian language pack on my Windows 11.


I've installed Persian and tested it.

GetText throws error 7 here: _UWPOCR_Log("FAIL __UWPOCR_GetText -> WaitForAsync IOcrResult")

So the OCR engine does not seem to respond. That's where it lost me :), sorry, I have no further clue.

Here's a test sentence:

Persian.png

Edited by KaFu
  • 1 month later...

I used this UDF to OCR the content of a Command Prompt window. After I adjusted the ClearType settings in Control Panel, the recognition rate became very poor. Now I can't get back to the initial result, even after disabling ClearType. Is there any requirement stated in the API reference?

I also noticed that OCR on number strings is not very accurate.

Cleartype off.jpg


Don't know about the ClearType setting, but maybe using a different font type and size for the command prompt will increase accuracy? Create a shortcut to cmd.exe, in the right-click properties you can adjust the layout settings.

  • 4 weeks later...

Is there a minimum size the picture has to be? For small pictures (139x26 in my case) it doesn't work.
However, if I make the picture bigger without changing the size of the text, it detects the text properly.

I have attached both files.
The first one (Test.jpg) doesn't work.
The second, bigger one (Test2.jpg) does work.

Any clue what's going on?

Test.jpg

Test2.jpg

Edited by Patrik96
  • 2 months later...

Hello,

 

I am interested in trying this out in my own program, but I have a quick question.

I will be using this on an application window (I would prefer not to take screenshots), so I would use the second example from @mLipok.

1) If I am trying to find the location of a given text, e.g. "hello", and then use that location to left-click the center of that word, how would I add that?
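For the click part, here is a rough sketch of the arithmetic only. It assumes the OCR step has already produced a 2D array of words with bounding boxes; the layout used below ([text, x, y, width, height], coordinates relative to the captured window) is purely hypothetical and is not taken from the UDF, so it would need adapting to whatever the UDF actually returns. `_FindWordCenter` is an illustrative helper name, not part of any library.

```autoit
; Sketch only: the $aWords layout ([text, x, y, width, height]) is an assumption,
; not the UDF's documented return format.

; Returns a 2-element array [centerX, centerY] for the first entry whose text
; matches $sWord (case-insensitive), or sets @error = 1 if nothing matches.
Func _FindWordCenter(Const ByRef $aWords, $sWord)
    For $i = 0 To UBound($aWords) - 1
        If StringCompare($aWords[$i][0], $sWord) = 0 Then
            Local $aCenter[2]
            $aCenter[0] = $aWords[$i][1] + Int($aWords[$i][3] / 2)
            $aCenter[1] = $aWords[$i][2] + Int($aWords[$i][4] / 2)
            Return $aCenter
        EndIf
    Next
    Return SetError(1, 0, 0)
EndFunc   ;==>_FindWordCenter

; Hypothetical OCR output, for illustration only.
Local $aWords[2][5] = [["hello", 40, 100, 60, 20], ["world", 110, 100, 64, 20]]

Local $aCenter = _FindWordCenter($aWords, "hello")
If Not @error Then
    ; Convert window-relative coordinates to screen coordinates before clicking.
    Local $aWinPos = WinGetPos("[ACTIVE]")
    If Not @error Then MouseClick("left", $aWinPos[0] + $aCenter[0], $aWinPos[1] + $aCenter[1], 1, 0)
EndIf
```

Note that `WinGetPos` returns the outer window rectangle, so if the OCR input was the client area only, the offsets need adjusting accordingly.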

Edited by Nick3399
  • 2 weeks later...

Superb toolset, many thanks.

In my process, I'm trying to read small boxes of black text on a white background, placed on a page-wide graphic. The decode is about 75% reliable and I'm looking for tips to improve that. The code is still way too messy to post here. The process is, in summary:

1. Use a WebCapture routine to capture the full page to a 1280x768 bitmap on a hidden window.
2. Convert bmp, using handle, to image using _GDIPlus_BitmapCreateFromHBITMAP($hBmp)
3. Crop image to extract the required box, using _GDIPlus_BitmapCloneArea()
4. Create a 100x200 blank white canvas and merge the cropped image into the middle of it, using _GDIPlus_ImageGetGraphicsContext() and _GDIPlus_GraphicsDrawImage(). I do this because the OCR is unhappy about small images (but will detect small text on a large enough file!)
5. Finally using _UWPOCR_GetText() to extract the text
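For anyone wanting to try the padding idea from step 4, a minimal sketch could look like the following. It is not the poster's actual code: the helper name `_PadImageToCanvas`, the file names and the 400x200 minimum canvas size are all placeholders chosen for illustration. It uses `_GDIPlus_GraphicsDrawImageRect()` with the source's own pixel size rather than `_GDIPlus_GraphicsDrawImage()`, so the image is copied without any scaling.

```autoit
#include <GDIPlus.au3>

; Draws $hImage centred on a new white canvas of at least $iMinW x $iMinH pixels
; and returns the new bitmap handle. The caller disposes both images.
Func _PadImageToCanvas($hImage, $iMinW, $iMinH)
    Local $iW = _GDIPlus_ImageGetWidth($hImage)
    Local $iH = _GDIPlus_ImageGetHeight($hImage)
    ; Never make the canvas smaller than the source image.
    Local $iCanvasW = ($iW > $iMinW) ? $iW : $iMinW
    Local $iCanvasH = ($iH > $iMinH) ? $iH : $iMinH

    Local $hCanvas = _GDIPlus_BitmapCreateFromScan0($iCanvasW, $iCanvasH)
    Local $hGraphics = _GDIPlus_ImageGetGraphicsContext($hCanvas)
    _GDIPlus_GraphicsClear($hGraphics, 0xFFFFFFFF) ; opaque white background
    ; Draw at the original pixel size so the text itself is not resampled.
    _GDIPlus_GraphicsDrawImageRect($hGraphics, $hImage, Int(($iCanvasW - $iW) / 2), Int(($iCanvasH - $iH) / 2), $iW, $iH)
    _GDIPlus_GraphicsDispose($hGraphics)
    Return $hCanvas
EndFunc   ;==>_PadImageToCanvas

_GDIPlus_Startup()
Local $hSmall = _GDIPlus_ImageLoadFromFile(@ScriptDir & "\cropped.png") ; placeholder path
If Not @error Then
    Local $hPadded = _PadImageToCanvas($hSmall, 400, 200) ; placeholder minimum size
    _GDIPlus_ImageSaveToFile($hPadded, @ScriptDir & "\cropped_padded.png")
    _GDIPlus_ImageDispose($hPadded)
    _GDIPlus_ImageDispose($hSmall)
EndIf
_GDIPlus_Shutdown()
```

The padded image would then go to `_UWPOCR_GetText()` as in step 5. Which canvas size the OCR engine actually needs is not established in this thread, so the minimum size is something to experiment with.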

I have tried enlarging the cropped image to a larger size, using _GDIPlus_ImageResize() instead of step 4 but this introduced extraneous noise (randomly coloured pixels) around the character edges, which affected decode reliability.
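One thing that may be worth trying for the enlargement route, offered as an untested idea rather than a known fix: scale by a whole-number factor with nearest-neighbour sampling, which only repeats existing pixels and so cannot invent intermediate colours. The helper name `_UpscaleNearest` is hypothetical, and whether this actually improves the recognition rate is not something this thread confirms.

```autoit
#include <GDIPlus.au3>

; Returns a new bitmap $iFactor times larger than $hImage, scaled with
; nearest-neighbour sampling. The caller disposes the returned bitmap.
; Assumes _GDIPlus_Startup() has already been called.
Func _UpscaleNearest($hImage, $iFactor)
    Local $iW = _GDIPlus_ImageGetWidth($hImage) * $iFactor
    Local $iH = _GDIPlus_ImageGetHeight($hImage) * $iFactor
    Local $hScaled = _GDIPlus_BitmapCreateFromScan0($iW, $iH)
    Local $hGraphics = _GDIPlus_ImageGetGraphicsContext($hScaled)
    _GDIPlus_GraphicsClear($hGraphics, 0xFFFFFFFF) ; opaque white background
    _GDIPlus_GraphicsSetInterpolationMode($hGraphics, 5) ; 5 = InterpolationModeNearestNeighbor
    _GDIPlus_GraphicsDrawImageRect($hGraphics, $hImage, 0, 0, $iW, $iH)
    _GDIPlus_GraphicsDispose($hGraphics)
    Return $hScaled
EndFunc   ;==>_UpscaleNearest
```

Nearest-neighbour output looks blocky, so it is a trade-off: no stray coloured pixels, but jagged character edges. Comparing it against the padded-canvas approach on the same input would show which one the OCR handles better.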

Any suggestions on process techniques to maximise the OCR reliability, whilst retaining the inherent simplicity of using the built in Win10 OCR capabilities?

I'm not, on this occasion, looking for coding; it's more about suggestions on whether, for example, I should try capturing a bigger web page first, or if there's a way of specifying the font/size/content type which would give the OCR module a tighter focus, etc. 

Thanks

John

Win10 x64
Autoit 3.3.14.5 (compiling to x86)


It would be great to see the input image to be sure what suggestions to give you.

 

Regards

 
