Donovan6

2 Advanced Functions - Wav & Image Analysis

4 posts in this topic

Hi fellow Autoit'ers,

I've had a brainchild for voice recognition software, very simple at first, but I hope to develop it into something quite complex over time (I saw Iron Man, with the protagonist speaking to his house computer, lol). I know commercial versions of this software exist, but what's the fun in that :)

Before I start working on it, however, I need to know whether anyone knows about or has worked with audio recognition, or whether AutoIt even has that capability.

This will be with recorded sounds first: fixed wave files for simple numbers. Say I record myself saying the numbers 1-10 and have the software try to interpret them correctly by analyzing the sound.

Once I can do this, I would like to experiment with more complex audio structures and possibly whole sentences.

What are your thoughts? Any hints or tips on where to start?

That is quite hard, because when you compare a stored sound with what you speak, the two can be out of phase or differ greatly in amplitude. If you want to start, it's good to know that voice recognition is not easy.
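
To make the phase and amplitude problem concrete, here is a rough sketch in Python (not AutoIt, and nowhere near a real speech recognizer). It assumes both sounds are already loaded as plain lists of sample values. The function names `normalize` and `best_alignment` are made up for this example. The idea is to scale both signals to the same energy so loudness stops mattering, then slide one against the other and keep the shift where they agree best.

```python
import math

def normalize(samples):
    """Remove the DC offset and scale to unit energy so loudness no longer matters."""
    mean = sum(samples) / len(samples)
    centered = [s - mean for s in samples]
    energy = math.sqrt(sum(c * c for c in centered))
    if energy == 0:
        return centered  # pure silence: nothing to scale
    return [c / energy for c in centered]

def best_alignment(reference, recording, max_lag):
    """Slide `recording` against `reference` and return (best_score, best_lag).

    The score is a normalized cross-correlation (close to 1.0 for a perfect
    match). A positive lag means the recording starts later than the reference.
    """
    ref = normalize(reference)
    rec = normalize(recording)
    best_score, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, r in enumerate(ref):
            j = i + lag
            if 0 <= j < len(rec):
                score += r * rec[j]
        if score > best_score:
            best_score, best_lag = score, lag
    return best_score, best_lag

# Synthetic test data (an assumption for illustration, not real speech):
# the same short tone burst, but the "recording" is 3x louder and 5 samples late.
burst = [math.sin(2 * math.pi * k / 8) for k in range(16)]
reference = [0.0] * 10 + burst + [0.0] * 10
recording = [0.0] * 15 + [3.0 * b for b in burst] + [0.0] * 5

score, lag = best_alignment(reference, recording, max_lag=8)
print("best lag:", lag, "score:", round(score, 3))
```

This only handles a simple time shift and a volume difference. Real speech also varies in speed and pitch between utterances, which is a large part of why recognition is hard.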


When the words fail... music speaks

Yep, I've browsed the forum a bit and saw that other people have done it with various methods, but each one has some sort of problem, from timeouts to misrecognition.

And it seems any attempt at voice recognition will require Microsoft's VR development kit. If I have to use that much already-established code, I might as well just use commercial software.

Another thing that I wanted to ask but forgot to add in my original post is PATTERN RECOGNITION: how to extract letters and numbers from an image.

Say we found some really good open source software and it worked phenomenally for what we need. What would be the best way to interface it with an AU3 script? Because this thought could turn into something seriously cool. Just like the computer in Iron Man.
