Showing results for tags 'speech recognition'.
Found 3 results
Utter is a freeware Windows speech API automation script. It can call most of the functions in sapi.dll. "SAPI" stands for Speech API (Microsoft's Speech Application Programming Interface), and SAPI.dll is the file that manages speech recognition on Windows. Utter uses most of the SAPI functions to make the most of SAPI.dll, and you can add speech recognition to your project by using Utter.

Utter zipped and updated (new version with examples)

Modified ......: 12/04/2017
Version .......: 126.96.36.199
Author ........: Surya

I am new to AutoIt. It sounds great and I love it, and while I am getting used to it I wanted to write my own UDF. First of all, I thank all the forum members, because I couldn't have done it without research. So I wrote UTTER. It is a UDF that uses most of the SAPI DLL functions; in simple words, it can do many things related to the computer's speech recognition. If you have any doubts about the code or find any bugs, please feel free to notify me. I will always be there to help. It's my first UDF, so please tell me if you find any errors. Thank you!

Utter has recently been updated, with examples included. The zip can be downloaded from the download section of AutoIt: Download utter

!! CAUTION !!
- Remember to shut down the created recognition engine instance before starting another one. If you start another instance without shutting the previous one down, it will lead to an error!
- Remember that "|" is the default GUIDataSeparatorChar; change it according to your needs. The grammar delimiter is GUIDataSeparatorChar. If no GUIDataSeparatorChar is found in the input string, the entire string is treated as one word!
- Do not call the internal functions. They are meant to be called only from inside the UDF's own functions. Do not change the values of the variables used in the functions!
- The receiving functions must have at least one parameter to accept the speech commands from the _Utter_Speech_GrammarRecognize() function.

Please report any bugs or complaints.
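The delimiter rule in the caution above can be illustrated with plain AutoIt. This is only a sketch of how a word list splits on GUIDataSeparatorChar; it does not call Utter, and it is an assumption that Utter splits its grammar string in a comparable way. The helper name _Example_SplitWords() is hypothetical and not part of the UDF.

```autoit
; Sketch only: shows how a word list splits on GUIDataSeparatorChar.
; _Example_SplitWords() is a hypothetical helper, NOT an Utter function.
Func _Example_SplitWords($sWords)
    ; Calling Opt() with only the option name returns the current setting
    ; ("|" unless the script has changed it).
    Local $sSep = Opt("GUIDataSeparatorChar")
    ; Flag 2 returns a 0-based array without the count in element 0.
    Return StringSplit($sWords, $sSep, 2)
EndFunc   ;==>_Example_SplitWords

; Three commands separated by the default "|".
Global $aWords = _Example_SplitWords("open notepad|close notepad|shutdown")
ConsoleWrite(UBound($aWords) & " entries" & @CRLF)

; No separator in the input: the whole string comes back as a single entry.
Global $aOne = _Example_SplitWords("open notepad")
ConsoleWrite(UBound($aOne) & " entry: " & $aOne[0] & @CRLF)
```

If the script changes the separator with Opt("GUIDataSeparatorChar", ","), the same helper splits on the new character instead, which is why the word list and the separator setting need to agree.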
Version 3.0.0.1
949 downloads
Utter is simply a UDF created to make maximum use of SAPI (Speech API) on Windows. You can add your own words to be recognized by the computer, and you can set the speed and pitch and select the voice you want through the speech synthesis included in Windows. Utter can create a free grammar recognition engine as well as a custom grammar recognition engine to suit your needs, so it is flexible. The shutdown function of the UDF must be called to destroy the currently running engine before another one is created. When AutoIt closes, the engine closes too. Many functionalities are included, and an update will follow soon.
Modified version of code here:

#include <File.au3>
#include <Misc.au3>

;Only allow one instance of the program to run.
If _Singleton("Voice Commands", 1) = 0 Then
    Exit
EndIf

Dim $spokenWords
Dim $voiceCommands = _FileListToArray(@WorkingDir & "\Voice Commands")
;Element 0 of the array returned by _FileListToArray holds the number of files found
Dim $voiceCommandsCap = $voiceCommands[0]
Dim $voiceCommandName
Dim $splitCommand
Dim $splitRecognition
Dim $parameter
Dim $sendParameter
Dim $count
Dim $skipSearch
Dim $transcriptionMode

Global $h_Context = ObjCreate("SAPI.SpInProcRecoContext")
Global $h_Recognizer = $h_Context.Recognizer
Global $h_Grammar = $h_Context.CreateGrammar(1)
$h_Grammar.DictationLoad
$h_Grammar.DictationSetState(1)

;Create a token for the default audio input device and set it
Global $h_Category = ObjCreate("SAPI.SpObjectTokenCategory")
$h_Category.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
Global $h_Token = ObjCreate("SAPI.SpObjectToken")
$h_Token.SetId("HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\AudioInput\TokenEnums\MMAudioIn\")
$h_Recognizer.AudioInput = $h_Token

Global $i_ObjInitialized = 0
Global $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
If @error Then
    ConsoleWrite("ObjEvent error: " & @error & @CRLF)
    $i_ObjInitialized = 0
Else
    ConsoleWrite("ObjEvent created Successfully!" & @CRLF)
    $i_ObjInitialized = 1
EndIf

While $i_ObjInitialized
    Sleep(5000)
    ;Allow the Audio In to finalize processing on the last 5 second capture
    $h_Context.Pause
    ;Resume audio in processing
    $h_Context.Resume
    ;Reset event function allocation (what is this? I think it's garbage collection or something, needs clarification)
    $h_ObjectEvents = ObjEvent($h_Context, "SpRecEvent_")
WEnd

Func SpRecEvent_Hypothesis($StreamNumber, $StreamPosition, $Result)
    ConsoleWrite("Hypothesis(): Hypothesized text is: " & $Result.PhraseInfo.GetText & @CRLF)
EndFunc   ;==>SpRecEvent_Hypothesis

Func SpRecEvent_Recognition($StreamNumber, $StreamPosition, $RecognitionType, $Result)
    ConsoleWrite($RecognitionType & "||" & $Result.PhraseInfo.GetText & @CRLF)
    $spokenWords = $Result.PhraseInfo.GetText
    CheckCommands()
EndFunc   ;==>SpRecEvent_Recognition

Func SpRecEvent_SoundStart($StreamNumber, $StreamPosition)
    ConsoleWrite("Sound Started" & @CRLF)
EndFunc   ;==>SpRecEvent_SoundStart

Func SpRecEvent_SoundEnd($StreamNumber, $StreamPosition)
    ConsoleWrite("Sound Ended" & @CRLF)
EndFunc   ;==>SpRecEvent_SoundEnd

Func CheckCommands()
    ;=== Special Voice Commands ===
    ;-- Transcription Mode--
    If $spokenWords = "Transcription Mode" Then
        If $transcriptionMode = 0 Then
            $transcriptionMode = 1
            ConsoleWrite("Transcription Mode On" & @CRLF)
            $skipSearch = 1
        Else
            $transcriptionMode = 0
            ConsoleWrite("Transcription Mode Off" & @CRLF)
            $skipSearch = 1
        EndIf
    EndIf

    If $transcriptionMode = 1 Then
        If $spokenWords <> "Transcription Mode" Then
            Send($spokenWords)
            $skipSearch = 1
        EndIf
    Else
        $skipSearch = 0
    EndIf
    ;---------
    ;==============================

    If $skipSearch = 0 Then
        ;=== Voice Command Search ===
        ;%% in the file name denotes that whatever is said after the command, should be sent as a parameter
        $count = 1
        While $count <= $voiceCommandsCap
            ConsoleWrite("count: " & $count & @CRLF)
            ConsoleWrite($voiceCommands[$count] & @CRLF)
            If StringInStr($voiceCommands[$count], "%%") <> 0 Then
                ConsoleWrite("found %%" & @CRLF)
                $splitCommand = StringSplit($voiceCommands[$count], " %%")
                If $splitCommand[0] > 2 Then
                    $voiceCommandName = $splitCommand[1] & " " & $splitCommand[2]
                Else
                    $voiceCommandName = $splitCommand[1]
                EndIf
                $splitRecognition = StringReplace($spokenWords, $voiceCommandName & " ", "")
                ;$splitRecognition = StringSplit($spokenWords, $voiceCommandName)
                ConsoleWrite("spokenWords: " & $spokenWords & @CRLF)
                ;ConsoleWrite("split By: " & $voiceCommandName & @CRLF)
                ConsoleWrite("splitRecognition: " & $splitRecognition & @CRLF)
                ;$parameter = $splitRecognition
                $parameter = $splitRecognition
                $sendParameter = 1
                ConsoleWrite("voiceCommandName: " & $voiceCommandName & @CRLF)
                ConsoleWrite("Parameter: " & $parameter & @CRLF)
            Else
                $splitCommand = StringSplit($voiceCommands[$count], ".au3")
                $voiceCommandName = $splitCommand[1]
                $sendParameter = 0
            EndIf
            $count = $count + 1
        WEnd

        ;Note: this check runs after the loop, so $voiceCommandName holds the value from the last file only
        ConsoleWrite("Checking Command to List" & @CRLF)
        If StringInStr($spokenWords, $voiceCommandName) <> 0 Then
            If $sendParameter = 1 Then
                Run(@WorkingDir & "\Voice Commands\" & $voiceCommandName & " %%.exe " & $parameter)
                ;Run("AutoIt3.exe " & $voiceCommandName &"%%.au3" & $parameter)
            Else
                ShellExecute(@WorkingDir & "\Voice Commands\" & $voiceCommandName & ".exe")
                ;Run("AutoIt3.exe " & $voiceCommandName &".au3")
            EndIf
        EndIf
        ;==============================
    EndIf
    $skipSearch = 0
EndFunc   ;==>CheckCommands

My issue is with the recognition itself. Too often the engine does not recognize what I'm saying. I've tried searching for information on recognition, training, etc., but I'm not finding what I need.

I think I can use the last hypothesized entry to check the commands, but I'm not sure how to get only the last entry for the hypothesis. If you watch the console as you speak, the hypothesis changes constantly until you stop speaking. So I'm fairly certain I need to check for a pause in speech to use this method.

Is the voice recognition engine shared amongst all programs? Is there a good application to train the voice recognition?

Sidenote: The code is terribly inefficient at the moment. It's a work in progress. Once I get it to recognize my voice, I'll work on optimizing the actual commands.
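One possible way to keep only the last hypothesis is to overwrite a single variable on every Hypothesis event and read it once the utterance ends. The sketch below is not tested against a live engine. The global $g_sLastHypothesis and the helpers _RememberHypothesis() and _TakeLastHypothesis() are hypothetical names introduced for illustration. It relies on SAPI's FalseRecognition event as the "utterance finished without a confident result" signal, and it assumes the engine raises that event after the final Hypothesis event. The handlers here would replace the ones with the same names in the script above.

```autoit
; Sketch: remember only the newest hypothesis and use it as a fallback.
Global $g_sLastHypothesis = "" ; hypothetical global, not in the original script

; Overwrite the stored text, so only the newest hypothesis survives.
Func _RememberHypothesis($sText)
    $g_sLastHypothesis = $sText
EndFunc   ;==>_RememberHypothesis

; Return the stored hypothesis and clear it for the next utterance.
Func _TakeLastHypothesis()
    Local $sText = $g_sLastHypothesis
    $g_sLastHypothesis = ""
    Return $sText
EndFunc   ;==>_TakeLastHypothesis

; Raised repeatedly while the engine is still guessing.
Func SpRecEvent_Hypothesis($StreamNumber, $StreamPosition, $Result)
    _RememberHypothesis($Result.PhraseInfo.GetText)
EndFunc   ;==>SpRecEvent_Hypothesis

; A confident result arrived: the stored hypothesis is no longer needed.
Func SpRecEvent_Recognition($StreamNumber, $StreamPosition, $RecognitionType, $Result)
    _TakeLastHypothesis()
    ConsoleWrite("Recognized: " & $Result.PhraseInfo.GetText & @CRLF)
    ; In the full script, set $spokenWords here and call CheckCommands().
EndFunc   ;==>SpRecEvent_Recognition

; No confident result: fall back to the last hypothesis, if there was one.
Func SpRecEvent_FalseRecognition($StreamNumber, $StreamPosition, $Result)
    Local $sFallback = _TakeLastHypothesis()
    If $sFallback <> "" Then ConsoleWrite("Fallback: " & $sFallback & @CRLF)
EndFunc   ;==>SpRecEvent_FalseRecognition

; Demo of the bookkeeping alone, without SAPI:
_RememberHypothesis("open")
_RememberHypothesis("open note")
_RememberHypothesis("open notepad")
ConsoleWrite(_TakeLastHypothesis() & @CRLF) ; prints "open notepad"
```

Taking the fallback in SpRecEvent_SoundEnd instead would match the "check for a pause in speech" idea more literally, but the order of SoundEnd relative to Recognition is not something this sketch can vouch for, so FalseRecognition is used here.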