Easiest text recognition


Hello.

I'm trying to create a bot that when you press a key it would scan a window and if it finds certain words, the mouse would go to that position and it would press on the word.
The window I use is SCRCPY, a mirror that lets you use your phone on your computer, connected via USB. For this reason, the window helper doesn't show any visible / hidden text.
I've been looking for scripts around this forum for a couple of days, but everything seems very complicated and I barely understand anything. Any ideas how I can achieve this while keeping the code simple and making it scan fast enough (I can only give it a couple of seconds before the text disappears)?

From what I read, UWPOCR can only read text from image / bitmap. I'd like it to do it in realtime on my screen since the text only appears for a couple of seconds. Either that, or I don't understand how it works, which may be very possible as well.

No idea how MODI OCR or Tesseract work either. I'm a complete newbie to this. The most I've been able to do was 'when you press <button>, mouse goes there, clicks, scrolls, etc', basic stuff.

I see, as I said, complete beginner, no idea what bitmap / GDI etc really meant.

Looking at the code I can see that it's gonna read the screen, but how do I configure it?

I want it to scan the active window, check for 9 words and if any of these appear click on them, if not click elsewhere. I kinda know the 'logistics' of how it should work - script takes a screenshot, scans it for words, finds a word, gets coords, inputs coords into mousemove and clicks, sleeps the specified time to account for any lags in the program, returns to scanning for the rest of the words. I just have absolutely 0 idea how to put it into code.

https://prnt.sc/26dcu1r - This is a screenshot of the window I'm working with. As background, this is SCRCPY, software that lets you see and control your phone from your computer via a USB cable (with USB debugging and ADB). I run several of these windows at once. The app is Snapchat, a popular mobile chat app. We use it to interact with our followers and clients, and I want to semi-automate my process: when people send a message, I open it and, depending on what they have to say, send one of my premade messages. Automatically sending messages at the push of a button is already done. The next step is:

Snapchat automatically deletes all messages in a chat when you leave that chat, unless you save them. In order to save them, you have to simply tap on the message and it gets saved (the second message in the screenshot is saved - the background is darker and the red line on the left thicker, showing it is saved). Now, unless you unsave it, it will remain there and won't ever be deleted.

After we send our premade messages, I want to push a button, the script would search for that message (for example, in our screenshot, if we want to save the first message it would search for the word 'test'), it would drive the mouse there and click on it. However, we usually send more premade messages one after the other so once it saved the first message we need it to continue scanning for other messages in case there is anything left to be saved.

We also need this script to run only on the specified window. I already did this in my script with 'AutoItSetOption("MouseCoordMode",0)' but I don't know if it will work. We need this because, as stated earlier, I personally run 4 phones connected at the same time with 4 windows open. I need the script to save the messages only when I press the button and only in the window that's active at that moment.
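
For illustration, a minimal sketch of how coordinates can be tied to whichever window is active. It assumes the SCRCPY window is the active one when the script runs. Mode 0 (used above) measures from the window's outer corner, title bar and border included, while mode 2 measures from the client area, which is usually the more convenient choice for a mirrored phone screen. The `10, 10` position is an arbitrary example value.

```autoit
; Make mouse and pixel coordinates relative to the client area of the
; active window instead of the whole screen (the default is mode 1).
Opt("MouseCoordMode", 2)
Opt("PixelCoordMode", 2)

; Remember which window is active right now and read its client size.
Local $hWnd = WinGetHandle("[ACTIVE]")
Local $aSize = WinGetClientSize($hWnd)
If @error Then
    MsgBox(16, "Error", "Could not read the size of the active window.")
    Exit
EndIf

ConsoleWrite("Client area: " & $aSize[0] & " x " & $aSize[1] & @CRLF)

; Moves the pointer to 10,10 inside the active window, wherever that
; window happens to sit on the desktop (example coordinates only).
MouseMove(10, 10)
```

With four SCRCPY windows open, the same relative coordinates then apply to whichever one has focus when the key is pressed.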

In summary: button pressed, script scans the image for words, finds words, clicks on them, continues scanning for others; if there are others, save them too, else stop. (I don't want the script to exit, just to pause and wait for other keys to be pressed. For example: after we send the messages and save them in a conversation, we move to another conversation and repeat.)

I hope this answers your questions, if not, let me know.
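
The "press a key, do the work, then wait for the next key press without exiting" part of this flow can be sketched with `HotKeySet`. This is only a skeleton under assumptions: F8 and Shift+Esc are arbitrary key choices, and `_SaveMessages` and `_Quit` are hypothetical function names whose bodies still have to be filled in with the actual search-and-click logic.

```autoit
; Register the hotkeys. The key choices are arbitrary examples.
HotKeySet("{F8}", "_SaveMessages") ; run the save routine
HotKeySet("+{ESC}", "_Quit")       ; Shift+Esc stops the script

; Idle loop: the script stays alive and simply waits for hotkeys.
While 1
    Sleep(100)
WEnd

; Hypothetical handler: the search-and-click code would go here.
Func _SaveMessages()
    ConsoleWrite("Hotkey pressed in window: " & WinGetTitle("[ACTIVE]") & @CRLF)
EndFunc

Func _Quit()
    Exit
EndFunc
```

After `_SaveMessages` returns, control drops back into the idle loop, so the script is ready for the next conversation.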

Any help?

Another idea:
When you send a message there's a red line next to your message on the left side. Wouldn't it be easier to make a script that would search the first column of pixels and once it finds a red pixel, it would move the mouse to the right of it and click?

Also, since our messages always look the same, I can also screenshot the message and make the script search for it instead of reading the text. Would that be easier?

10 minutes ago, goldieczr said:

Any help?

Not sure what type of help you're expecting.  Start coding with one of the options I already gave you.  When you have a script, we certainly will try to help you.  Otherwise it is just a chat convo....

56 minutes ago, Nine said:

Not sure what type of help you're expecting.  Start coding with one of the options I already gave you.  When you have a script, we certainly will try to help you.  Otherwise it is just a chat convo....

As I said, I have absolutely no clue where to begin. Zero experience in making scripts for AutoIt, zero ideas on how to make what I want possible. I searched for tutorials and threads for hours and hours and didn't find anything that could guide me where I need to go. Even UWPOCR is, for me, just a bunch of text with some words that I sometimes understand. I have no clue how to make it work for the active window, how to make it scan my screen, how to make it search for words and drive my mouse there, nothing.

OP - What I guess Nine is suggesting to you is to learn to walk before you run.

Get to know the basics of coding first, and like good folk we will help you.

A bit like someone teaching another to fish, but not doing the fishing for them.

Kind of - we help you to help yourself.

All of us here are unpaid volunteers, with a varying level of expertise, and don't worry about looking foolish, as we have all been there ... a beginner who doesn't know or understand much.

This site is not a Code It For You site, it is more a teaching interactive site. We work with you to get to where you want to go.

So make a start. :) 

1 hour ago, TheSaint said:

Get to know the basics of coding first

Absolutely agree.

@goldieczr You may want to read the FREE book on AutoIt:

Or if you are a visual learner, you can watch this series on YouTube: https://www.youtube.com/playlist?list=PLNpExbvcyUkOJvgxtCPcKsuMTk9XwoWum

Best of luck in your coding journey :)

I don't want this to come across as rude, but I work 3 jobs and barely get any sleep; I don't really have time to learn a programming language from scratch. I don't expect anyone to just write me a script, but I was expecting a more direct way of doing it, something more along the lines of 'yeah check this tutorial copy that code then do this and that' rather than 'oh just start learning programming and hopefully after a year or two you'll be able to do it'. This isn't a hobby of mine, it's a necessity to make my work easier. This would shave around an hour off my daily work, so I don't see any worth in learning something for months just for that.

2 hours ago, goldieczr said:

yeah check this tutorial copy that code then do this and that'

Unfortunately programming doesn't quite work like that, even in the best case scenario you'd have to write code to bring it all together, there is no magical solution here.

2 hours ago, goldieczr said:

I don't see any worth in learning something for months just for that.

This is a misconception you have. Depending on the level of skill you have with computers, you can easily get started within a couple of weeks, assuming you dedicate some time each day towards learning it.

AutoIt is especially easy to use even for beginners.

Learning to code is also a useful skill in general.

Having said that, if you still don't want to take any time for learning, then there are other no-code automation solutions out there. I haven't used them personally, so I can't help you with recommendations in that regard.

You are asking for one of the more complicated things, text recognition, but let me give you some direction on the easy bits and parts.

Even if you google around to see whether Snapchat has better APIs to do this, it's still a lot of work.

Easier would be if you do not need the words

I already completed the first 4 steps in the first 5 minutes after installing autoit because they're really extremely easy to use, however the 5th step with OCR seems like it needs quite some learning of other stuff before I can understand how it works and apply it into my script, which is where I got stuck.

I may actually be interested in the PixelChecksum method, however there's a dilemma: as already said, the message sent by me will have a red line on the left of the screen. However, the vertical position of that red line isn't fixed, since it depends on the length of the message before it (if the person that sent us a message wrote a long one, our reply will be shown lower on the screen). Is it possible to make the script scan a rectangular piece of the screen when a key is pressed and, once it detects a red pixel, move the mouse a bit to the right and click?

Unfortunately Snapchat is very conservative with their app usage. For that reason, no emulators work for snap and we have to use real phones connected to the computer, so there's no API to help us with that.

Yes, you need the win* functions to get the area.

And based on that you can start with the pixel* functions to find your red pixels
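
As a rough sketch of that combination (not a tested solution), the following uses `WinGetClientSize` for the area and `PixelSearch` for the red pixels. Every constant is an assumption to be measured on the real window: the exact red, the tolerance, the width of the strip on the left, the click offset and the number of rows to skip. `_ClickRedLines` is a hypothetical function name and F8 an arbitrary key.

```autoit
Opt("MouseCoordMode", 2) ; mouse coordinates relative to the active window's client area
Opt("PixelCoordMode", 2) ; pixel coordinates relative to the same client area

; Assumed values: measure the real ones, e.g. with the AutoIt Window Info tool.
Global Const $RED = 0xFF0000     ; assumed colour of the line (RGB)
Global Const $TOLERANCE = 30     ; allowed shade variation around $RED
Global Const $STRIP_WIDTH = 20   ; width in pixels of the strip scanned on the left
Global Const $CLICK_OFFSET = 60  ; how far to the right of the line to click
Global Const $SKIP_ROWS = 40     ; rows skipped after a hit so the same line is not found twice

HotKeySet("{F8}", "_ClickRedLines") ; F8 is an arbitrary choice

While 1
    Sleep(100) ; idle until the hotkey is pressed
WEnd

Func _ClickRedLines()
    ; Work only on the window that is active when the key is pressed.
    Local $hWnd = WinGetHandle("[ACTIVE]")
    Local $aSize = WinGetClientSize($hWnd)
    If @error Then Return

    Local $iTop = 0
    While $iTop < $aSize[1]
        ; Search the left strip from $iTop down to the bottom of the client area.
        Local $aHit = PixelSearch(0, $iTop, $STRIP_WIDTH, $aSize[1] - 1, $RED, $TOLERANCE, 1, $hWnd)
        If @error Then ExitLoop ; no more red pixels below $iTop

        ; Click to the right of the red pixel, then give the app time to react.
        MouseClick("left", $aHit[0] + $CLICK_OFFSET, $aHit[1], 1, 5)
        Sleep(300)

        ; Continue below this hit to look for further messages.
        $iTop = $aHit[1] + $SKIP_ROWS
    WEnd
EndFunc
```

Restarting each search below the previous hit is what lets the loop reach several messages in one key press. Note that this sketch does not tell saved messages from unsaved ones; since a saved message is described as having a thicker line and a darker background, an extra colour check (for example with `PixelGetColor`) would probably be needed before clicking.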

To get into all the pixel detail you have to use BitBlt, but I would classify that as advanced stuff. Search the forum for BitBlt and GDI for more advanced screen pixel things.
A long time ago I made the script below; maybe there are some things you can reuse.
