Title: |
US5615299:
Speech recognition using dynamic features

|
Country: |
US United States of America

|
Inventor: |
Bahl, Lalit R.; Amawalk, NY
de Souza, Peter V.; San Jose, CA
Gopalakrishnan, Ponani; Yorktown Heights, NY
Picheny, Michael A.; White Plains, NY

|
Assignee: |
International Business Machines Corporation, Armonk, NY

|
Published / Filed: |
1997-03-25 / 1994-06-20

|
Application Number: |
US1994000262093

|
IPC Code: |
Advanced:
G10L 11/00;
G10L 15/02;
G10L 15/04;
G10L 15/14;
G10L 15/20;
G10L 19/00;
G10L 21/02;
Core:
G10L 15/00;
G10L 21/00;
IPC-7:
G10L 5/06;
G10L 7/08;

|
ECLA Code: |
G10L15/02; G10L19/00S;

|
U.S. Class: |
Current:
704/254;
704/233;
704/240;
704/242;
704/256;
704/256.8;
704/E15.004;
704/E19.007;
Original:
395/002.63;
395/002.65;
395/002.49;
395/002.51;
395/002.42;

|
Field of Search: |
395/2.4,2.49,2.51,2.6,2.63,2.64,214
381/041,42,43,45

|
Priority Number: |
1994-06-20 | US1994000262093

|
Abstract: |
A speech recognition technique utilizes a set of N different principal discriminant matrices. Each principal discriminant matrix is associated with a distinct class. The class is an indication of the proximity of a speech segment to neighboring phones. A technique for speech encoding includes arranging a speech signal into a series of frames. A feature vector is derived which represents the speech signal for a speech segment or series of speech segments for each frame. A set of N different projected vectors is generated for each frame, by multiplying the principal discriminant matrices by the vector. This speech encoding technique is capable of being used in speech recognition systems by utilizing models, in which each model transition is tagged with one of the N classes. The projected vector is utilized with the corresponding tag to compute the probability that at least one particular speech part is present in said frame.
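
The tagging idea in the abstract can be illustrated with a small sketch. It assumes that each frame already has one projected vector per class, and that each model transition carries a class tag that selects which projected vector to score. The diagonal Gaussian output model, the dictionary layout, and all names and values below are assumptions made for illustration; the record does not specify them.

```python
import math

def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (an assumed output model)."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def score_transition(transition, projected_vectors):
    """Score one frame for a transition, using the projected vector its class tag selects."""
    x = projected_vectors[transition["tag"]]
    return log_gaussian(x, transition["mean"], transition["var"])

# Hypothetical frame: one projected vector per class (N = 2 here).
projected_vectors = [[0.0], [3.0]]

# Hypothetical transitions, each tagged with one of the N classes.
transitions = [
    {"name": "t0", "tag": 0, "mean": [0.0], "var": [1.0]},
    {"name": "t1", "tag": 1, "mean": [0.0], "var": [1.0]},
]

scores = {t["name"]: score_transition(t, projected_vectors) for t in transitions}
```

In this toy setup the two transitions share the same Gaussian parameters, so any difference in their scores comes only from which projected vector the tag selects.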

|
Attorney, Agent or Firm: |
Perman & Green, LLP

|
Primary / Asst. Examiners: |
MacDonald, Allen R.; Smits, Talivaldis Ivars

|
Designated Country: |
DE FR GB

|
First Claim (of 25): |
We claim:
1. A method for speech encoding, comprising the steps of:
- producing a set of N distinct principal discriminant matrices, each principal discriminant matrix being associated with a different class, each class being an indication of the proximity of a speech segment to one or more neighboring speech segments;
- arranging a speech signal into a series of frames;
- deriving a feature vector which represents said speech signal for each frame; and
- generating a set of N different projected vectors for each frame, by multiplying each of said N distinct principal discriminant matrices by said feature vector.
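
The steps of the claim can be sketched in a few lines of plain Python. This is a minimal illustration, not the patented implementation: the frame length, hop size, toy two-element feature vector (mean and energy), and the two small matrices are all assumed values chosen to keep the arithmetic easy to follow.

```python
def frame_signal(samples, frame_len, hop):
    """Arrange a sampled signal into a series of (possibly overlapping) frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]

def feature_vector(frame):
    """Toy feature vector (mean, energy); a stand-in for real acoustic features."""
    n = len(frame)
    mean = sum(frame) / n
    energy = sum(x * x for x in frame) / n
    return [mean, energy]

def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def project_all(matrices, vector):
    """Generate one projected vector per matrix, i.e. N projected vectors per frame."""
    return [mat_vec(m, vector) for m in matrices]

# N = 2 hypothetical matrices, each mapping a 2-element feature vector to 1 element.
matrices = [
    [[1.0, 0.0]],  # class 0 (assumed values)
    [[0.0, 2.0]],  # class 1 (assumed values)
]

samples = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
frames = frame_signal(samples, frame_len=4, hop=2)
projected = [project_all(matrices, feature_vector(f)) for f in frames]
```

Each entry of `projected` holds N vectors for one frame, one per class, so a later stage can pick the vector that matches the class it is interested in.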

|
Other Abstract Info: |
DERABS G1996-041833

|
Other References: |
"Vector Quantization Procedure For Speech Recognition Systems Using Discrete Parameter Phoneme-Based Markov Word Models", IBM Technical Disclosure Bulletin, vol. 32, No. 7, Dec. 1989, pp. 320-321.
"Phoneme Recognition Using Time-Delay Neural Networks" by A. Waibel et al., IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 37, No. 3, 1989, pp. 328-339.
"Admissible Strategies for Reducing Search Effort in Real Time Speech Recognition Systems" by L. R. Bahl et al., Elsevier Science Publishers B.V., 1990, pp. 1371-1374.
"Application of an Auditory Model to Speech Recognition" by Jordan Cohen, J. Acoust. Soc. Am. 85 (6), Jun. 1989, pp. 2623-2629.
"An IBM Based Large-Vocabulary Isolated-Utterance Speech Recognizer" by A. Averbuch et al., 1986 IEEE, pp. 53-56.
"Syllogistic Reasoning in Fuzzy Logic and its Application to Usuality And Reasoning with Dispositions" by Lotfi A. Zadeh, 1985 IEEE, pp. 754-762.
"A Method For The Construction of Acoustic Markov Models for Words" by L. R. Bahl et al., 1993 IEEE, pp. 443-452 (vol. 1, No. 4).
"Speech Recognition Using Noise-Adaptive Prototypes" by Arthur Nadas and D. Nahamoo, IEEE Transactions on Acoustics, vol. 37, No. 10, Oct. 1989, pp. 1495-1503.
"Differential Competitive Learning For Centroid Estimation and Phoneme Recognition" by S. Kong and B. Kosko, IEEE Transactions on Neural Networks, vol. 2, No. 1, Jan. 1991, pp. 118-124.
"A Maximum Likelihood Approach to Continuous Speech Recognition" by L. R. Bahl et al., IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 2, Mar. 1983, pp. 179-190.
"Multonic Markov Word Models For Large Vocabulary Continuous Speech Recognition" by L. R. Bahl et al., IEEE Transactions on Speech and Audio Processing, vol. 1, No. 3, Jul. 1993, pp. 334-343.
"Speaker Adaptation via VQ Prototype Modification" by D. Rtischev et al., IEEE Transactions on Speech and Audio Processing, vol. 2, No. 1, Part 1, Jan. 1994, pp. 94-97.
"A Fast Approximate Acoustic Match For Large Vocabulary Speech Recognition" by L. Bahl et al., IEEE Transactions on Speech and Audio Processing, vol. 1, No. 1, Jan. 1993, pp. 59-67.
