
Another Voice Recognition Topic



I recently transitioned to Windows 7 as my primary operating system and I absolutely love the voice recognition features. I have used speech recognition in XP with a number of different programs and have never come across anything that worked so well. However, being a developer, I am naturally looking to extend the voice recognition to do some more complex and advanced things.

Upon investigating, I found it very difficult to work out how to properly implement any part of a speech recognition solution. I've been through the MSDN documentation, posts throughout this forum, and all around Google. In all honesty, the best I could do was make slight modifications to code that I copied, so it really isn't worth pasting my code in. I did implement the speech synthesis portion, so my knowledge of COM implementation isn't totally lacking.

So if anybody could provide any input on how to set up SAPI on Windows 7 (or Vista, since they are probably using the same SAPI engine) to listen for a set of words or phrases, I would appreciate it greatly. I'm not even exactly sure which objects/functions are relevant or how they work.

"Everything is vague to a degree you do not realize till you have tried to make it precise." - Bertrand Russell [The Philosophy of Logical Atomism]


Not a direct answer to your question, but I find I can do most of what I want with Vocola -- see here: http://vocola.net/

It is amazing. It was originally built on top of Dragon NaturallySpeaking and a Python tool called NatLink. Version 3 supports WSR (Windows Speech Recognition) without NatLink. It is a scripting language for VR (voice recognition) and allows extensions and external calls... any of my efforts at programming VR will be extensions to Vocola.

Dale



Well, I guess that is a fallback, but I am really interested in using this as an interface layer for the applications I already write, so I am looking for a bit more power and some stronger integration. Also, since SAPI 5.3 is standard on Windows Vista and Windows 7, it would be much easier to work with it directly than to add other software on top. For personal shortcuts, Vocola could be a decent solution.



There used to be a program called "Say-Now". You could set it to execute a program/script upon the successful match of a word or sentence. I would tie it to AutoIt scripts and make it carry out automated procedures. I'll be back with some links and more info... :)

P.S. I'm pretty sure it had an API to work with (which could be used with AutoIt), but you had to purchase it...

Edit: It appears as if "Say-Now" is some sort of phone service now... Anyhow, here is the old "Say-Now"

http://www.say-now.com/

I can't find the API... x.x



The site says that the software runs on Windows 2000 and XP, which would mean that it uses SAPI 5.1 (XP/2000/ME) rather than SAPI 5.3 (included in Vista and 7). Much of the reason I was inspired to start including this in my software is the incredible improvement from SAPI 5.1 to SAPI 5.3, from speech in XP to speech in Vista and 7, so I would prefer to stay away from that.

The other issue is that I don't want to have to load programs in order to carry out commands. It is slow and inefficient to load an executable for each command (especially considering that, I believe, AutoIt's compiling is still essentially bundling a script and an interpreter together and would therefore have a large overhead). And without the API it has the same issues as the suggestion above.

However, even with the API... it claims to use SAPI, which stands for Speech Application Programming Interface. So both of the proposed solutions are strange in that they themselves use speech recognition APIs. The real question is why use an API for another API: an interface for programming an interface for programming speech recognition. It would seem more logical to use SAPI directly rather than an interface for an interface for speech.

Unless anybody has any knowledge or experience in this, I may just have to wait until 7 is publicly released, at which point I'd imagine there will be more resources to read through.
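To make the overhead point concrete, here is a minimal sketch of the alternative: keeping a table of phrases and handlers inside one running process, so a recognized phrase becomes a function call instead of an executable launch. It is written in Python purely for illustration and does not touch SAPI at all; the names `CommandDispatcher`, `normalize` and `build_demo_dispatcher`, and the two sample phrases, are hypothetical, and the recognized text is assumed to arrive as a plain string from whatever recognizer is in use.

```python
from typing import Callable, Dict, List, Optional


def normalize(phrase: str) -> str:
    """Lower-case a phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


class CommandDispatcher:
    """Maps spoken phrases to in-process handlers (hypothetical helper)."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[], str]] = {}

    def register(self, phrase: str, handler: Callable[[], str]) -> None:
        # Store the handler under the normalized phrase.
        self._handlers[normalize(phrase)] = handler

    def phrases(self) -> List[str]:
        # The phrase list a recognizer would be asked to listen for.
        return sorted(self._handlers)

    def dispatch(self, recognized: str) -> Optional[str]:
        # Look up the handler and call it directly; no new process is started.
        handler = self._handlers.get(normalize(recognized))
        if handler is None:
            return None
        return handler()


def build_demo_dispatcher() -> CommandDispatcher:
    # Sample commands; the handlers only return a description of the action.
    dispatcher = CommandDispatcher()
    dispatcher.register("open notepad", lambda: "opening notepad")
    dispatcher.register("close window", lambda: "closing window")
    return dispatcher
```

With something like this, the only per-command cost is a dictionary lookup and a function call, and `phrases()` gives the fixed word list that the recognizer would need to be configured with.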



  • 10 months later...

I've just created a SAPIListBox UDF that allows an AutoIt script to respond to spoken words / phrases from a predefined list of items. Click on this link to access the UDF. It does use the SAPI 5.1 SDK, not 5.3, so I guess it's not what you are after, unless there is a similar ActiveX listbox-type object in SAPI 5.3?

Cheers, Sean.
