
Using AutoIt for visual pattern recognition


I am working on an application to aid in and evaluate pattern recognition for those suffering from trauma or dementia. Since the project is in its early stages, I plan to test the resulting application and wanted to make every accommodation for a test platform while in the design/pseudo coding stage.

Can AutoIt be used to scan images and look for specific visual patterns? One of my use cases would be showing photographs of random people and identifying which ones are family.

I would generate these images, place them at random, and display them as a single composite image. I would then want software to search a sample output and identify which people are family, or to identify specific celebratory events and match them with a photograph. The testing scope seemed too small to justify bringing in an AI package for this; it appeared that relatively simple image-matching logic would do, but I am not sure about AutoIt's capabilities or how easily it could be extended.

I did not mean to be too verbose, but test planning is usually part of the design process, and I want to make sure I have the right tool for the job--and, if so, to get some general advice.

Thanks in advance!






  • Moderators


Welcome to the AutoIt forums.

That sounds like a very interesting and worthwhile project. I have little experience with image recognition in AutoIt, but there certainly are a number of libraries available to do that - although there seems to be a varying degree of success with all of them.

However, if you are creating a single composite image from a number of separate images, it would not be difficult to determine which of the images was under the cursor at any time, nor which had been clicked upon. Given that you would already know which image was which, would that be sufficient to allow you to do what you require?
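As a rough, language-neutral illustration of that idea (written in Python rather than AutoIt, and not tested against any real GUI), here is a minimal sketch of a hit test over the layout you used to build the composite. The `PlacedImage` record, its field names, and the `image_at` helper are all hypothetical; the only assumption is that your generator remembers the rectangle each photograph was drawn into.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class PlacedImage:
    """One photograph as it was placed on the composite (hypothetical record)."""
    label: str   # e.g. "family" or "stranger"; naming is an assumption
    left: int    # x of the top-left corner, in composite pixels
    top: int     # y of the top-left corner, in composite pixels
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        # Half-open rectangle: the right and bottom edges are excluded.
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def image_at(placements: Sequence[PlacedImage], x: int, y: int) -> Optional[PlacedImage]:
    """Return the topmost image under (x, y), or None if the point is on background.

    Assumes images were drawn in list order, so later entries sit on top
    of earlier ones where they overlap.
    """
    for placed in reversed(placements):
        if placed.contains(x, y):
            return placed
    return None
```

With this kind of record, a click or cursor position can be resolved to a known image without analysing any pixels, which is why no recognition library would be needed for that part.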

Rather than being too verbose, I think you need to explain in a little more detail what you envisage happening once you have this composite image. Then we can offer suggestions as to how you might go about both generating it and reacting to user interaction with it.


Any of my own code posted anywhere on the forum is available for use by others without any restriction of any kind




Thanks M23, your handle reminds me of Messier 23.

I envision the user tapping or using a mouse to click on the object they identify as family--a family version of "Where is Waldo" for a portion of the application.

But to achieve this from a decoupled testing perspective, I need to parse the resulting image (which may have portions of the family member deliberately obscured) and have my scripting code attempt to identify the family member itself. Even if I knew where the image was to begin with, which I would from a coding perspective, I do not believe I could be sure that the family member's rendered image was actually discernible.

I do appreciate the conversation; thanks much for your time.






  • 1 month later...

Could this kind of thing be used to click specific icons that sit in different locations in a window depending on the screen resolution and scaling setting?

In Delphi, toolbar buttons are not by default "visible" to AutoIt, so it cannot click them cleanly, and relying on coordinates is not a very reliable solution (I think). But if someone has a good and reliable solution for this without image recognition, it would be way faster (I think).



An alternative approach that might solve your problem:

Maybe review your assumption that the images should be combined into one image; then you don't have to reverse the process with slow image recognition software to extract them again.

Set up zones on your screen to display each image, keeping the unique value of each image in a grid/table/array, then match the final click to the original zone value to get back the image that is shown there.

Just think of it as placing playing cards or chess pieces on a table in a grid, and asking somebody to pick one up. Each card remains separate; you just keep track of where they are.
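To make the card-table idea concrete, here is a minimal sketch of the bookkeeping, in Python rather than AutoIt and purely illustrative. The function names (`deal_images`, `zone_at`, `label_at`) and the assumption of equally sized cells starting at the top-left corner of the display area are mine, not part of any library.

```python
import random
from typing import Dict, Optional, Sequence, Tuple

Zone = Tuple[int, int]  # (row, column) of a cell in the grid


def deal_images(labels: Sequence[str], columns: int,
                rng: Optional[random.Random] = None) -> Dict[Zone, str]:
    """Shuffle the image labels and assign each one to a grid zone.

    Zones are filled row by row; pass a seeded rng for a repeatable layout.
    """
    if rng is None:
        rng = random.Random()
    shuffled = list(labels)
    rng.shuffle(shuffled)
    # divmod(i, columns) yields (row, column) for the i-th cell.
    return {divmod(i, columns): label for i, label in enumerate(shuffled)}


def zone_at(x: int, y: int, cell_width: int, cell_height: int) -> Zone:
    """Convert a click position (relative to the grid's top-left) to a zone."""
    return (y // cell_height, x // cell_width)


def label_at(grid: Dict[Zone, str], x: int, y: int,
             cell_width: int, cell_height: int) -> Optional[str]:
    """Return the label of the image shown in the clicked zone, if any."""
    return grid.get(zone_at(x, y, cell_width, cell_height))
```

For example, with 100x100 cells in two columns, a click at (150, 50) falls in row 0, column 1, and `label_at` returns whichever label was dealt to that zone. The grid dictionary plays the role of the table you keep track of, so nothing ever has to be recovered from pixels.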
