Processing Speech with Java

The Java Speech API allows Java applications to incorporate speech technology into their user interfaces.

This class is identical to the previous one except that it instantiates the SpeakableShares class. It is also included with the code download for this chapter.

The result of running this example is both audible and visual.

For example, some products might recognize a period as a sentence ending when it was meant as something else. By embedding these controls in the JSML file, it is possible to move the specification of the pronunciation details out of the program and into the data.

Speech Recognition

The other side of Java Speech is speech recognition. As you might have predicted, the state of the art in recognition is not nearly as advanced as that of speech synthesis.
The reason for this is simple: recognition is a harder problem. If you are an English speaker, your ear might be finely tuned to the nuances of the language as it is pronounced by native speakers from your region of the United States or Canada. If you relocate to another part of your country, or to another English-speaking country such as Australia, your ability to understand the language is diminished for a while. Over time, your brain learns the subtleties of the new dialect, and you once again become a fluent listener.

For a computer, the problem is similar. Recognizers receive information electronically via microphones. They must then determine what set of syllables to create from the set of phonemes (sounds) just received. These syllables must then be combined into words.

Recognition Grammars

A grammar simplifies the job of the speech recognizer by limiting the number of possible words and phrases that it has to consider when trying to determine what a speaker has said. There are two kinds of grammars: rule grammars and dictation grammars.

Rule grammars are composed of tokens and rules. When a user speaks, the input is compared to the rules and tokens in the grammar to determine the identity of the word or phrase. An application provides a rule grammar to a recognizer, normally during initialization.

Dictation grammars are built into the recognizer itself. They define thousands of words that can be spoken in a free-form fashion. Dictation grammars come closer to our ultimate goal of unrestricted speech but, at present, they are slower than rule grammars and more prone to errors.

Note: There are four basic error types that recognizers suffer from regardless of the grammar employed: failure to recognize a valid word; misinterpreting a word as another valid word; detecting a word where none was present; and failure to recognize that a word was spoken.

Java Speech supports dynamic grammars. This means that grammars can be modified at runtime.
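To make the idea of a rule grammar concrete, here is a minimal sketch in plain Java. It does not use the Java Speech API at all: TinyRuleGrammar, setRule(), and match() are hypothetical names invented for this illustration, and the "heard" input is ordinary text standing in for what a recognizer would produce. The point is only to show how a small, fixed set of rules limits what can be recognized.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only: not part of the Java Speech API.
class TinyRuleGrammar {
    // Rule name -> the phrases (token sequences) that the rule accepts.
    private final Map<String, List<String>> rules = new LinkedHashMap<>();

    // Adds or replaces a rule; phrases are stored in normalized form.
    void setRule(String name, String... phrases) {
        List<String> normalized = new ArrayList<>();
        for (String phrase : phrases) {
            normalized.add(normalize(phrase));
        }
        rules.put(name, normalized);
    }

    // Returns the name of the first rule that accepts the input, if any.
    Optional<String> match(String heard) {
        String candidate = normalize(heard);
        for (Map.Entry<String, List<String>> rule : rules.entrySet()) {
            if (rule.getValue().contains(candidate)) {
                return Optional.of(rule.getKey());
            }
        }
        return Optional.empty(); // anything outside the grammar is rejected
    }

    // Lowercases the text and collapses runs of whitespace.
    private static String normalize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }
}

class TinyRuleGrammarDemo {
    public static void main(String[] args) {
        TinyRuleGrammar grammar = new TinyRuleGrammar();
        grammar.setRule("greeting", "hello world", "good morning");
        grammar.setRule("farewell", "bye");

        System.out.println(grammar.match("Hello   World")); // accepted by "greeting"
        System.out.println(grammar.match("open the door")); // not in the grammar
    }
}
```

Because the grammar holds only three phrases, every other input is rejected outright. A real recognizer works on audio rather than strings, but the same narrowing is what makes rule grammars faster and less error-prone than dictation grammars.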
After a change is made to the grammar, it must be committed using the commitChanges() method of the recognizer. When these changes are committed, they are committed atomically, meaning all at once. Listing 12.9 shows a simple grammar.

Listing 12.9 A Simple Grammar

    grammar javax. ...
    ... Hello world ...

A recognizer that is working against this grammar will understand no other words, phrases, or parts of phrases. The reason for this restriction is to simplify the processing and increase the likelihood that an accurate result will be obtained. This rule grammar is formatted according to the Java Speech Grammar Format (JSGF) specification. Grammars formatted in JSGF can be converted logically into RuleGrammar objects and back again. Listing 12.10 shows a program that will serve as a recognizer for this grammar.

Listing 12.10 The HelloRecognizer Class

    /*
     * HelloRecognizer.java
     *
     * Created on March 1... PM
     */
    package unleashed. ...
    ... Stephen Potts ...
    import javax. ...
    import java.io.FileReader;
    import java.util.Locale;

    public class HelloRecognizer extends ResultAdapter ...

As with the synthesizer, we use the Central class to create the recognizer:

    recognizer = Central.createRecognizer(
        new EngineModeDesc(Locale.ENGLISH));

Once again, we have chosen English as the language for this example. Once we have a recognizer, we can load the grammar:

    FileReader grammar1 = new FileReader( ... );

We create a RuleGrammar object and set it to be enabled. The event listener will do the runtime work of the program. We set the event listener here:

    recognizer.addResultListener(new HelloRecognizer());

Next, we complete the initialization of the recognizer by committing the grammar, getting the focus, and putting the recognizer in the RESUMED state:

    recognizer.commitChanges();
    recognizer.requestFocus();
    recognizer.resume();

When a spoken pattern is recognized as part of the grammar, a result event occurs and the resultAccepted() method is called. The event object contains the information that we need to find out which phrase in the grammar was spoken:

    Result res = (Result)(re.getSource());
    ResultToken tokens[] = res.getBestTokens();

The word Best in the method name is a reference to the inexact nature of the process of speech recognition. We then extract the string that the recognizer guesses is the correct one:

    gst = tokens ...

In addition, you learned about the speech engine that provides services to both of these capabilities. You learned how to synthesize speech using an implementation of the Synthesizer interface provided by the IBM ViaVoice product. We wrote several programs that produced speech from written input.

You also learned how to use the Java Speech Markup Language (JSML) to give instructions to the speech engine about how to pronounce the words, dates, and numbers that appear in the text. You saw examples of how you can use XML tags to communicate this information to the synthesizer.

Finally, we took a look at the art of recognizing speech with software. We created a simple grammar using the Java Speech Grammar Format (JSGF) and loaded it into a recognizer program. We then spoke into the microphone and watched as our spoken words appeared in the console. In addition, you saw how a command can be tied to a spoken word by the way the word bye was used to close this program.

The subject of speech in Java is larger than a single chapter can cover. This chapter provides enough information for you to get both a synthesizer and a recognizer working on your computer. Hopefully, you will be able to copy and paste these programs and enhance them to meet the requirements of your projects.

Authors of this Chapter

Stephen Potts is an independent consultant, author, and Java instructor in Atlanta, Georgia (United States). Steve received his computer science degree from Georgia Tech. He has worked in a number of disciplines during his career. His previous books include Special Edition Using Visual C++ 4 and Java 1. How-To. He can be reached via e-mail at stevepotts@mindspring.

Source of this material

This chapter, Processing Speech with Java, is from the book Java 2 Unleashed, Sixth Edition (ISBN: 0-6...X), written by Stephen Potts, Alex Pestrikov, and Mike Kopack and published by Sams Publishing. All rights reserved.
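To revisit the result-handling step from the recognizer walk-through, here is a minimal sketch in plain Java that needs no speech engine. It is not the chapter's HelloRecognizer class: HeardPhrase, ByeAwareListener, and phraseAccepted() are hypothetical names, and the "recognized" phrases are simulated arrays of tokens. It only illustrates the pattern of joining the best-guess tokens into a string and tying the spoken word bye to a shutdown command.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an accepted result: just the best-guess tokens.
class HeardPhrase {
    final List<String> tokens;

    HeardPhrase(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }
}

// Plays the role of a result listener: rebuilds each phrase and reacts to "bye".
class ByeAwareListener {
    private final List<String> transcript = new ArrayList<>();
    private boolean running = true;

    // Called once per accepted phrase, in the spirit of resultAccepted().
    void phraseAccepted(HeardPhrase phrase) {
        if (!running) {
            return; // ignore anything that arrives after shutdown was requested
        }
        String text = String.join(" ", phrase.tokens);
        transcript.add(text);
        System.out.println("Heard: " + text);
        if (text.equalsIgnoreCase("bye")) {
            running = false; // the chapter's program closes at this point
        }
    }

    boolean isRunning() {
        return running;
    }

    List<String> transcript() {
        return transcript;
    }
}

class ByeAwareListenerDemo {
    public static void main(String[] args) {
        ByeAwareListener listener = new ByeAwareListener();
        HeardPhrase[] simulated = {
            new HeardPhrase("hello", "world"),
            new HeardPhrase("bye"),
            new HeardPhrase("good", "morning") // arrives after "bye", so it is ignored
        };
        for (HeardPhrase phrase : simulated) {
            listener.phraseAccepted(phrase);
        }
        System.out.println("Still running? " + listener.isRunning());
    }
}
```

Keeping the command check inside the listener mirrors the event-driven structure described in the chapter: the main method only sets things up, and all runtime decisions happen when a result arrives.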