Sori

Using Windows Speech Recognition and Autoit together

First, I didn't see a forum section that was well suited to this article, so I placed it in the help section, since that is the most-read section. Feel free to move this article to the appropriate section.

If you're here at the forums, you understand that AutoIt is a very powerful tool for controlling Windows.

You may also be aware of a tool included with Windows 7 called "Windows Speech Recognition", which lets you use voice commands to run tasks such as opening the Start menu or running a program.

What you may not be aware of is "Windows Speech Recognition Macros", or "WSR Macros" for short.

This is a free program put out by Microsoft that lets you create your own voice commands. Essentially, you create a trigger word or phrase, then choose what you want WSR to do when it hears that phrase.

While that sounds amazing, in practice it's very limited. This is where AutoIt comes into play.

Using WSR Macros, you can set a voice command to run a program. Any program.

That opens up a very large range of things you can do using AutoIt.

For example, I use PotPlayer for watching videos. If I say "Pause", WSR has no clue what I'm talking about. But I can create a simple and very short program in AutoIt:

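; Match window titles by substring, so "PotPlayer" matches the full title
Opt("WinTitleMatchMode", 2)

; Only send the pause key (Space) when PotPlayer is the active window
If WinActive("PotPlayer") <> 0 Then
    Send("{Space}")
EndIf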

Now I can tell WSR to run "Pause.exe" (compiled from the au3) when I say "Pause".
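
The script above only acts while PotPlayer is already the active window. A possible variant, offered only as a sketch, brings the player to the front first and then sends the key. It assumes the window title contains "PotPlayer" and that Space toggles pause, as in the script above; the 2-second timeout is an arbitrary choice.

```autoit
; Sketch only: activate PotPlayer before sending the pause key.
Opt("WinTitleMatchMode", 2) ; match any substring of the window title

If WinExists("PotPlayer") Then
    WinActivate("PotPlayer")
    ; Wait up to 2 seconds (arbitrary) for the window to receive focus
    If WinWaitActive("PotPlayer", "", 2) Then
        Send("{SPACE}")
    EndIf
EndIf
```

The trade-off is that saying "Pause" will now pull PotPlayer to the foreground, which may or may not be what you want while working in another window.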

Using this simple combination of programs, you can create any string of trigger words to do just about anything you can script.

 

Helpful Tips:

Using more than one word for the trigger helps WSR tell the difference between you just talking and you giving a command. It also helps when you are transcribing, so that a command doesn't run when you only meant to transcribe the word.

Because language is artistic, there can be several ways to ask for the same action:

"Open my music folder"
"Let's listen to some music"
"Open my music"

To prevent a ton of file clutter, instead of selecting "New Speech Macro..." for every way you might say something, select "Explore Speech Macros...", then open the original macro file (WSRMac extension) and alter the contents a bit.

Like this...

(Not encapsulated in a code box, to keep the color for easier reading)

-----

<?xml version="1.0" encoding="UTF-16"?>
<speechMacros>
   <command>
      <listenFor>Open my music folder</listenFor>
      <run command="C:\Music" params=""/>
   </command>
   <command>
      <listenFor>Let's listen to some music</listenFor>
      <run command="C:\Music" params=""/>
   </command>
   <command>
      <listenFor>Open my music</listenFor>
      <run command="C:\Music" params=""/>
   </command>

<Signature> 

(encrypted signature data, very long and not useful for this example)

-----

Notice that you don't need a separate file; just add another command to the same file. If you really wanted to tidy up, you could merge all your different voice commands into one giant (and very confusing) file.

If you need help with your stuff, feel free to get me on my Skype.

I often get bored and enjoy helping with projects.

Helpful Tip:

To cut down on AutoIt script clutter, you can use the params option.

For instance:

Take pressing the arrow keys on the keyboard. Instead of having four different AutoIt programs, you can have one that reads the parameter and sends the appropriate key.

Example:

 

WSR

-----

  <command>
    <listenFor>Up</listenFor>
    <run command="C:\Users\Sorimachi\Documents\Speech Macros\Arrow Keys.au3" params="1"/>
  </command>
  <command>
    <listenFor>Down</listenFor>
    <run command="C:\Users\Sorimachi\Documents\Speech Macros\Arrow Keys.au3" params="2"/>
  </command>
  <command>
    <listenFor>Left</listenFor>
    <run command="C:\Users\Sorimachi\Documents\Speech Macros\Arrow Keys.au3" params="3"/>
  </command>
  <command>
    <listenFor>Right</listenFor>
    <run command="C:\Users\Sorimachi\Documents\Speech Macros\Arrow Keys.au3" params="4"/>
  </command>

-----

AutoIt

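; $CmdLine[0] holds the number of parameters. Exit if WSR passed none,
; because reading $CmdLine[1] would otherwise fail with an array error.
If $CmdLine[0] < 1 Then Exit

; The first parameter comes from params="..." in the WSR macro
$command = $CmdLine[1]

If $command = "1" Then
    UpArrow()
ElseIf $command = "2" Then
    DownArrow()
ElseIf $command = "3" Then
    LeftArrow()
ElseIf $command = "4" Then
    RightArrow()
EndIf

Func UpArrow()
    Send("{UP}")
EndFunc   ;==>UpArrow

Func DownArrow()
    Send("{DOWN}")
EndFunc   ;==>DownArrow

Func LeftArrow()
    Send("{LEFT}")
EndFunc   ;==>LeftArrow

Func RightArrow()
    Send("{RIGHT}")
EndFunc   ;==>RightArrow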
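
A possible alternative, sketched under assumptions: instead of numeric codes, the params attribute could carry the key name itself (for example params="UP"), and the script could check it against a short list of allowed keys before sending it. The params values and the exit codes here are hypothetical and are not part of the macros shown above.

```autoit
; Sketch only: WSR is assumed to pass the key name, e.g. params="UP".
Local $aAllowed[4] = ["UP", "DOWN", "LEFT", "RIGHT"]

; No parameter given: nothing to do (exit code 1 is an arbitrary choice)
If $CmdLine[0] < 1 Then Exit 1

Local $sKey = StringUpper($CmdLine[1])

; Only send keys from the allowed list, so a stray parameter does nothing
For $i = 0 To UBound($aAllowed) - 1
    If $sKey = $aAllowed[$i] Then
        Send("{" & $sKey & "}")
        Exit 0
    EndIf
Next

; Parameter was not an allowed key (exit code 2 is an arbitrary choice)
Exit 2
```

With this approach, adding another key means adding one entry to the list and one command to the macro file, rather than a new branch and a new function.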

Awesome!!! I was thinking about this yesterday and how amazing it would be to incorporate WSR to control things I've made for work...

Thanks for the info!

You're very welcome. I'm glad that I was able to reach out to at least one person to help them.

If you need any assistance with integration, or some advice on how to do a certain thing with voice commands, feel free to ask.



  • Similar Content

    • Sori
      By Sori
      Modified version of code here:

      #include <File.au3>
      #include <Misc.au3>

      ;Only allow one instance of the program to run.
      If _Singleton("Voice Commands", 1) = 0 Then
          Exit
      EndIf

      Dim $spokenWords
      Dim $voiceCommands = _FileListToArray(@WorkingDir & "\Voice Commands")
      Dim $voiceCommandsCap = $voiceCommands[0]
      Dim $voiceCommandName
      Dim $splitCommand
      Dim $splitRecognition
      Dim $parameter
      Dim $sendParameter
      Dim $count
      Dim $skipSearch
      Dim $transcriptionMode

      Global $h_Context = ObjCreate("SAPI.SpInProcRecoContext")
      Global $h_Recognizer = $h_Context.Recognizer
      Global $h_Grammar = $h_Context.CreateGrammar(1)
      $h_Grammar.Dictationload
      $h_Grammar.DictationSetState(1)

      ;Create a token for the default audio input device and set it
      Global $h_Category = ObjCreate("SAPI.SpObjectTokenCategory")
      $h_Category.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
      Global $h_Token = ObjCreate("SAPI.SpObjectToken")
      $h_Token.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
      $h_Recognizer.AudioInput = $h_Token

      Global $i_ObjInitialized = 0
      Global $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
      If @error Then
          ConsoleWrite("ObjEvent error: " & @error & @CRLF)
          $i_ObjInitialized = 0
      Else
          ConsoleWrite("ObjEvent created Successfully!" & @CRLF)
          $i_ObjInitialized = 1
      EndIf

      While $i_ObjInitialized
          Sleep(5000)
          ;Allow the Audio In to finalize processing on the last 5 second capture
          $h_Context.Pause
          ;Resume audio in processing
          $h_Context.Resume
          ;Reset event function allocation (what is this? I think its garbage collection or something, needs clarification)
          $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
      WEnd

      Func SpRecEvent_Hypothesis($StreamNumber, $StreamPosition, $Result)
          ConsoleWrite("Hypothesis(): Hypothized text is: " & $Result.PhraseInfo.GetText & @CRLF)
      EndFunc   ;==>SpRecEvent_Hypothesis

      Func SpRecEvent_Recognition($StreamNumber, $StreamPosition, $RecognitionType, $Result)
          ConsoleWrite($RecognitionType & "||" & $Result.PhraseInfo.GetText & @CRLF)
          $spokenWords = $Result.PhraseInfo.GetText
          CheckCommands()
      EndFunc   ;==>SpRecEvent_Recognition

      Func SpRecEvent_SoundStart($StreamNumber, $StreamPosition)
          ConsoleWrite("Sound Started" & @CRLF)
      EndFunc   ;==>SpRecEvent_SoundStart

      Func SpRecEvent_SoundEnd($StreamNumber, $StreamPosition)
          ConsoleWrite("Sound Ended" & @CRLF)
      EndFunc   ;==>SpRecEvent_SoundEnd

      Func CheckCommands()
          ;=== Special Voice Commands ===
          ;-- Transcription Mode--
          If $spokenWords = "Transcription Mode" Then
              If $transcriptionMode = 0 Then
                  $transcriptionMode = 1
                  ConsoleWrite("Transcription Mode On" & @CRLF)
                  $skipSearch = 1
              Else
                  $transcriptionMode = 0
                  ConsoleWrite("Transcription Mode Off" & @CRLF)
                  $skipSearch = 1
              EndIf
          EndIf

          If $transcriptionMode = 1 Then
              If $spokenWords <> "Transcription Mode" Then
                  Send($spokenWords)
                  $skipSearch = 1
              EndIf
          Else
              $skipSearch = 0
          EndIf
          ;---------
          ;==============================

          If $skipSearch = 0 Then
              ;=== Voice Command Search ===
              ;%% in the file name denotes that whatever is said after the command, should be sent as a parameter
              $count = 1
              While $count <= $voiceCommandsCap
                  ConsoleWrite("count: " & $count & @CRLF)
                  ConsoleWrite($voiceCommands[$count] & @CRLF)
                  If StringInStr($voiceCommands[$count], "%%") <> 0 Then
                      ConsoleWrite("found %%" & @CRLF)
                      $splitCommand = StringSplit($voiceCommands[$count], " %%")
                      If $splitCommand[0] > 2 Then
                          $voiceCommandName = $splitCommand[1] & " " & $splitCommand[2]
                      Else
                          $voiceCommandName = $splitCommand[1]
                      EndIf
                      $splitRecognition = StringReplace($spokenWords, $voiceCommandName & " ", "")
                      ;$splitRecognition = StringSplit($spokenWords, $voiceCommandName)
                      ConsoleWrite("spokenWords: " & $spokenWords & @CRLF)
                      ;ConsoleWrite("split By: " & $voiceCommandName & @CRLF)
                      ConsoleWrite("splitRecognition: " & $splitRecognition & @CRLF)
                      ;$parameter = $splitRecognition[1]
                      $parameter = $splitRecognition
                      $sendParameter = 1
                      ConsoleWrite("voiceCommandName: " & $voiceCommandName & @CRLF)
                      ConsoleWrite("Parameter: " & $parameter & @CRLF)
                  Else
                      $splitCommand = StringSplit($voiceCommands[$count], ".au3")
                      $voiceCommandName = $splitCommand[1]
                      $sendParameter = 0
                  EndIf
                  $count = $count + 1
              WEnd

              ConsoleWrite("Checking Command to List" & @CRLF)
              If StringInStr($spokenWords, $voiceCommandName) <> 0 Then
                  If $sendParameter = 1 Then
                      Run(@WorkingDir & "\Voice Commands\" & $voiceCommandName & " %%.exe " & $parameter)
                      ;Run("AutoIt3.exe " & $voiceCommandName &"%%.au3" & $parameter)
                  Else
                      ShellExecute(@WorkingDir & "\Voice Commands\" & $voiceCommandName & ".exe")
                      ;Run("AutoIt3.exe " & $voiceCommandName &".au3")
                  EndIf
              EndIf
              ;==============================
          EndIf
          $skipSearch = 0
      EndFunc   ;==>CheckCommands

      My issue is with the recognition itself.
      Too often, the engine does not recognize what I'm saying.
      I've tried searching for information on recognition, training, etc., but I'm not finding what I need.
      I think I can use the last hypothesized entry to check the commands, but...
      I'm not sure how to get only the last entry for the hypothesis.
      If you look in the console as you speak, the hypothesis keeps changing until you stop speaking. So I'm fairly certain I need to check for a pause in speech to use this method.
      Is the voice recognition engine shared amongst all programs?
      Is there a good application to train the voice recognition?
      sidenote:
      The code is terribly inefficient at the moment. It's a work in progress. After I get it to recognize my voice, then I'll work on making the actual commands better optimized.
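
      On the idea of using only the last hypothesis: one rough way is to overwrite a single variable each time a hypothesis event fires, then read and clear it once the sound ends. The sketch below simulates that with plain function calls. RememberHypothesis and TakeLastHypothesis are hypothetical helper names, meant to be called from SpRecEvent_Hypothesis and SpRecEvent_SoundEnd in the script above. Whether the SoundEnd event fires reliably at each pause in speech is an assumption that would need testing.

      ```autoit
      ; Sketch only: keep just the most recent hypothesis text.
      Global $g_sLastHypothesis = ""

      ; Hypothetical helper: call from SpRecEvent_Hypothesis with $Result.PhraseInfo.GetText
      Func RememberHypothesis($sText)
          $g_sLastHypothesis = $sText
      EndFunc   ;==>RememberHypothesis

      ; Hypothetical helper: call from SpRecEvent_SoundEnd; returns the last guess and clears it
      Func TakeLastHypothesis()
          Local $sText = $g_sLastHypothesis
          $g_sLastHypothesis = ""
          Return $sText
      EndFunc   ;==>TakeLastHypothesis

      ; Simulated sequence of hypothesis events followed by a pause in speech
      RememberHypothesis("open")
      RememberHypothesis("open my")
      RememberHypothesis("open my music")
      ConsoleWrite(TakeLastHypothesis() & @CRLF) ; writes "open my music"
      ```

      In the real script, the returned text could be assigned to $spokenWords before calling CheckCommands(), with an empty string meaning there was nothing to check.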