SAPI (Windows Speech Recognition API) recognition help

Sori · March 1, 2014

Modified version of code here:

#include <File.au3>
#include <Misc.au3>

;Only allow one instance of the program to run.
If _Singleton("Voice Commands", 1) = 0 Then
    Exit
EndIf

Dim $spokenWords
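Dim $voiceCommands = _FileListToArray(@WorkingDir & "\Voice Commands")
;_FileListToArray sets @error when the folder is missing or empty, so stop before indexing the result
If @error Then
    ConsoleWrite("Could not list the Voice Commands folder. @error: " & @error & @CRLF)
    Exit
EndIf
Dim $voiceCommandsCap = $voiceCommands[0]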
Dim $voiceCommandName
Dim $splitCommand
Dim $splitRecognition
Dim $parameter
Dim $sendParameter
Dim $count
Dim $skipSearch
Dim $transcriptionMode

Global $h_Context = ObjCreate("SAPI.SpInProcRecoContext")
Global $h_Recognizer = $h_Context.Recognizer
Global $h_Grammar = $h_Context.CreateGrammar(1)
$h_Grammar.Dictationload
$h_Grammar.DictationSetState(1)

;Create a token for the default audio input device and set it
Global $h_Category = ObjCreate("SAPI.SpObjectTokenCategory")
$h_Category.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
Global $h_Token = ObjCreate("SAPI.SpObjectToken")
$h_Token.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
$h_Recognizer.AudioInput = $h_Token

Global $i_ObjInitialized = 0

Global $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
If @error Then
    ConsoleWrite("ObjEvent error: " & @error & @CRLF)
    $i_ObjInitialized = 0
Else
    ConsoleWrite("ObjEvent created Successfully!" & @CRLF)
    $i_ObjInitialized = 1
EndIf

While $i_ObjInitialized
    Sleep(5000)
    ;Allow the Audio In to finalize processing on the last 5 second capture
    $h_Context.Pause
    ;Resume audio in processing
    $h_Context.Resume
    ;Reset event function allocation (what is this? I think its garbage collection or something, needs clarification)
    $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
WEnd

Func SpRecEvent_Hypothesis($StreamNumber, $StreamPosition, $Result)
    ConsoleWrite("Hypothesis(): Hypothesized text is: " & $Result.PhraseInfo.GetText & @CRLF)
EndFunc   ;==>SpRecEvent_Hypothesis

Func SpRecEvent_Recognition($StreamNumber, $StreamPosition, $RecognitionType, $Result)
    ConsoleWrite($RecognitionType & "||" & $Result.PhraseInfo.GetText & @CRLF)
    $spokenWords = $Result.PhraseInfo.GetText
    CheckCommands()
EndFunc   ;==>SpRecEvent_Recognition

Func SpRecEvent_SoundStart($StreamNumber, $StreamPosition)
    ConsoleWrite("Sound Started" & @CRLF)
EndFunc   ;==>SpRecEvent_SoundStart

Func SpRecEvent_SoundEnd($StreamNumber, $StreamPosition)
    ConsoleWrite("Sound Ended" & @CRLF)
EndFunc   ;==>SpRecEvent_SoundEnd

Func CheckCommands()
    ;=== Special Voice Commands ===

    ;-- Transcription Mode--
    If $spokenWords = "Transcription Mode" Then
        If $transcriptionMode = 0 Then
            $transcriptionMode = 1
            ConsoleWrite("Transcription Mode On" & @CRLF)
            $skipSearch = 1
        Else
            $transcriptionMode = 0
            ConsoleWrite("Transcription Mode Off" & @CRLF)
            $skipSearch = 1
        EndIf
    EndIf

    If $transcriptionMode = 1 Then
        If $spokenWords <> "Transcription Mode" Then
            Send($spokenWords)
            $skipSearch = 1
        EndIf
    Else
        $skipSearch = 0
    EndIf
    ;---------

    ;==============================

    If $skipSearch = 0 Then
        ;=== Voice Command Search ===
        ;%% in the file name denotes that whatever is said after the command, should be sent as a parameter
        $count = 1
        While $count <= $voiceCommandsCap
            ConsoleWrite("count: " & $count & @CRLF)
            ConsoleWrite($voiceCommands[$count] & @CRLF)
            If StringInStr($voiceCommands[$count], "%%") <> 0 Then
                ConsoleWrite("found %%" & @CRLF)
                ;Flag 1 splits on the whole " %%" string instead of on each of its characters
                $splitCommand = StringSplit($voiceCommands[$count], " %%", 1)
                $voiceCommandName = $splitCommand[1]
                $sendParameter = 1
            Else
                ;Strip the file extension to get the command name
                $voiceCommandName = StringRegExpReplace($voiceCommands[$count], "\.[^.]*$", "")
                $sendParameter = 0
            EndIf
            ConsoleWrite("voiceCommandName: " & $voiceCommandName & @CRLF)

            ;Check this command before moving on, so every file is tested and not only the last one
            ConsoleWrite("Checking Command to List" & @CRLF)
            If StringInStr($spokenWords, $voiceCommandName) <> 0 Then
                If $sendParameter = 1 Then
                    ;Whatever was said after the command name is the parameter
                    $parameter = StringReplace($spokenWords, $voiceCommandName & " ", "")
                    ConsoleWrite("spokenWords: " & $spokenWords & @CRLF)
                    ConsoleWrite("Parameter: " & $parameter & @CRLF)
                    ;Quote the path because it contains spaces
                    Run('"' & @WorkingDir & '\Voice Commands\' & $voiceCommandName & ' %%.exe" ' & $parameter)
                Else
                    ShellExecute(@WorkingDir & "\Voice Commands\" & $voiceCommandName & ".exe")
                EndIf
                ExitLoop
            EndIf
            $count = $count + 1
        WEnd
        ;==============================
    EndIf
    $skipSearch = 0
EndFunc   ;==>CheckCommands

My issue is with the recognition itself.

Too often, the engine does not recognize what I'm saying.

I've tried searching for information on recognition, training, and so on, but I'm not finding what I need.

I think I can use the last hypothesized entry to check the commands, but I'm not sure how to get only the last entry for the hypothesis.

If you look in the console as you speak, the hypothesis is constantly changing until you stop speaking. So I'm fairly certain I need to check for a pause in speech to use this method.
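
That idea could be sketched roughly as follows. The sketch keeps only the newest hypothesis and hands it to CheckCommands() when SoundEnd fires. $lastHypothesis is a hypothetical new global, and treating SoundEnd as the pause in speech is an assumption. These two handlers would replace the ones of the same name in the script above. If SpRecEvent_Recognition also fires for the same phrase, the command would run a second time unless that handler is changed as well.

```autoit
; Sketch only: $lastHypothesis is a hypothetical new global.
; $spokenWords and CheckCommands() come from the script above.
Global $lastHypothesis = ""

Func SpRecEvent_Hypothesis($StreamNumber, $StreamPosition, $Result)
    ; Overwrite on every event so only the newest guess is kept
    $lastHypothesis = $Result.PhraseInfo.GetText
EndFunc   ;==>SpRecEvent_Hypothesis

Func SpRecEvent_SoundEnd($StreamNumber, $StreamPosition)
    ConsoleWrite("Sound Ended" & @CRLF)
    ; Assumption: SoundEnd marks the pause, so the stored hypothesis is the final one
    If $lastHypothesis <> "" Then
        $spokenWords = $lastHypothesis
        $lastHypothesis = ""
        CheckCommands()
    EndIf
EndFunc   ;==>SpRecEvent_SoundEnd
```

Clearing $lastHypothesis after use keeps a stale guess from being replayed if SoundEnd fires again without new speech.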

Is the voice recognition engine shared amongst all programs?

Is there a good application to train the voice recognition?

Side note:

The code is terribly inefficient at the moment. It's a work in progress. Once I get it to recognize my voice, I'll work on optimizing the actual commands.

Edited March 1, 2014 by Sori

Surya · August 22, 2015

I think you can use my UDF, Utter.au3. It lets you add the specific words that you want the computer to recognize, and free recognition is also available. Here is the link: https://www.autoitscript.com/forum/topic/175719-utter-utilizing-more-of-sapi/ I hope that helps, Sori. If you need any more help, just ask me.
