Title: US5895447: Speech recognition using thresholded speaker class model selection or model adaptation
Country: US United States of America

8 pages

Inventor: Ittycheriah, Abraham Poovakunnel; Danbury, CT
Maes, Stephane Herman; Danbury, CT

Assignee: International Business Machines Corporation, Armonk, NY
Published / Filed: 1999-04-20 / 1997-01-28

Application Number: US1997000787031

IPC Code: Advanced: G10L 15/06; G10L 21/02;
IPC-7: G10L 9/06;

ECLA Code: G10L15/07; G10L21/0272; S10L25/12;

U.S. Class: Current: 704/231; 704/246; 704/251; 704/E15.011; 704/E21.012;
Original: 704/231; 704/246; 704/251;

Field of Search: 704/231,246,251,275

Priority Number:
1997-01-28  US1997000787031
1996-02-02  US1996000011058P

Abstract: Clusters of quantized feature vectors are processed against each other using a threshold distance value to cluster mean values of sets of parameters contained in speaker specific codebooks to form classes of speakers against which feature vectors computed from an arbitrary input speech signal can be compared to identify a speaker class. The number of codebooks considered in the comparison may thus be reduced to limit mixture elements which engender ambiguity and reduce system response speed when the speaker population becomes large. A speaker class processing model which is speaker independent within the class may be trained on one or more members of the class and selected for implementation in a speech recognition processor in accordance with the speaker class recognized, to further improve speech recognition to a level comparable to that of a speaker dependent model. Formation of speaker classes can be supervised by identification of groups of speakers to be included in the class, with the speaker class dependent model trained on members of a respective group.
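
The class-formation step described in the abstract can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the patent: the function names `codebook_distance` and `cluster_speakers`, the symmetric nearest-codeword Euclidean distance, the greedy seed-based grouping, the toy two-dimensional codebooks, and the threshold of 0.5.

```python
import math


def codebook_distance(a, b):
    """Symmetric average nearest-codeword distance between two codebooks.

    Assumption: each codebook is a list of mean vectors, and Euclidean
    distance is used. The record itself does not specify the measure.
    """
    def one_way(src, dst):
        return sum(min(math.dist(u, v) for v in dst) for u in src) / len(src)

    return 0.5 * (one_way(a, b) + one_way(b, a))


def cluster_speakers(codebooks, threshold):
    """Greedily group speakers whose codebooks lie within `threshold`
    of the first member (the seed) of an existing class."""
    classes = []  # each class is a list of speaker ids; members[0] is the seed
    for speaker, codebook in codebooks.items():
        for members in classes:
            if codebook_distance(codebook, codebooks[members[0]]) <= threshold:
                members.append(speaker)
                break
        else:
            # No existing class is close enough, so start a new one.
            classes.append([speaker])
    return classes


# Hypothetical toy codebooks: two codewords per speaker, two dimensions.
codebooks = {
    "spk_a": [(0.0, 0.0), (1.0, 1.0)],
    "spk_b": [(0.1, 0.0), (1.0, 1.1)],
    "spk_c": [(5.0, 5.0), (6.0, 6.0)],
}
speaker_classes = cluster_speakers(codebooks, threshold=0.5)
print(speaker_classes)
```

With these toy values, the first two speakers fall within the threshold of each other and share a class, while the third starts a class of its own. A larger threshold yields fewer, broader classes.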

Attorney, Agent or Firm: Whitham, Curtis, Whitham; Tassinari, Jr., Robert P.

Primary / Asst. Examiners: Hudspeth, David R.; Smits, Talivaldis Ivars

Parent Case:

    This application is a continuation-in-part of a provisional U.S. patent application Ser. No. 60/011,058, entitled Speaker Identification System, filed Feb. 2, 1996, priority of which is hereby claimed under 35 U.S.C. §119(e)(1) and which is hereby fully incorporated by reference.

First Claim:
Having thus described our invention, what we claim as new and desire to secure by Letters Patent is as follows:
1. A speech processing system including
  • means for clustering information values representing respective frames of utterances of a plurality of speakers by speaker class in accordance with a threshold value to provide speaker class specific clusters of information,
  • means for comparing information representing frames of an utterance of a speaker with respective clusters of said speaker class specific clusters of information to identify a speaker class, and
  • means for processing speech information with a speaker class dependent model selected in accordance with a speaker class identified by said means for comparing information.
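
The comparing and model-selection elements of the claim can be sketched as follows. This is a hedged illustration only: the per-frame nearest-cluster voting rule, the function name `identify_speaker_class`, the toy cluster centres, and the string placeholders standing in for speaker class dependent models are all assumptions, not details stated in the claim.

```python
import math
from collections import Counter


def identify_speaker_class(frames, class_clusters):
    """Return the class whose clusters are nearest to the most frames.

    Assumption: each frame is a feature vector, each class is represented
    by a list of cluster centres, and every frame casts one vote for the
    class that owns its nearest centre.
    """
    votes = Counter()
    for frame in frames:
        nearest = min(
            class_clusters,
            key=lambda name: min(
                math.dist(frame, centre) for centre in class_clusters[name]
            ),
        )
        votes[nearest] += 1
    return votes.most_common(1)[0][0]


# Hypothetical class-specific clusters and placeholder models.
class_clusters = {
    "class_0": [(0.0, 0.0), (1.0, 1.0)],
    "class_1": [(5.0, 5.0), (6.0, 6.0)],
}
class_models = {
    "class_0": "model_for_class_0",
    "class_1": "model_for_class_1",
}

# Three frames of an utterance; two lie near class_0, one near class_1.
frames = [(0.2, 0.1), (0.9, 1.2), (5.1, 4.8)]
identified = identify_speaker_class(frames, class_clusters)
selected_model = class_models[identified]
print(identified, selected_model)
```

Because only one set of clusters per class is searched, rather than one codebook per enrolled speaker, the number of comparisons grows with the number of classes instead of the number of speakers.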

Patent     Pub. Date  Inventor         Assignee                                    Title
USRE31188  1983-03    Pirz et al.      Bell Telephone Laboratories, Incorporated   Multiple template speech recognition system
US4181821  1980-01    Pirz et al.      Bell Telephone Laboratories, Incorporated   Multiple template speech recognition system
US4363102  1982-12    Holmgren et al.  Bell Telephone Laboratories, Incorporated   Speaker identification system using word recognition templates
US5165095  1992-11    Borcherding      Texas Instruments Incorporated              Voice telephone dialing
US5608840  1997-03    Tsuboka          Matsushita Electric Industrial Co., Ltd.    Method and apparatus for pattern recognition employing the hidden Markov model
US5608841  1997-03    Tsuboka          Matsushita Electric Industrial Co., Ltd.    Method and apparatus for pattern recognition employing the hidden Markov model
US5638489  1997-06    Tsuboka          Matsushita Electric Industrial Co., Ltd.    Method and apparatus for pattern recognition employing the Hidden Markov Model
Foreign References:
Publication      Date     IPC Code   Assignee                Title
EP0831456A2      1998-03  G10L 3/00  CANON KABUSHIKI KAISHA  Speech recognition method and apparatus therefor
JP1997409258769  1997-10  G10L 3/00

Other Abstract Info: DERABS G1999-286725 DERABS G1999-492636 DERABS G1999-492854

Other References:
  • Tetsuo Kosaka and Shigeki Sagayama, "Tree-Structured Speaker Clustering for Fast Speaker Adaptation," Proc. ICASSP 94, vol. I, pp. 245-248, May 1994.
  • Ananth Sankar, Francoise Beaufays, and Vassilios Digalakis, "Training Data Clustering for Improved Speech Recognition," Proc. Eurospeech 95, pp. 503-506, Sep. 1995.
  • Mukund Padmanabhan, Lalit R. Bahl, David Nahamoo, and Michael A. Picheny, "Speaker Clustering and Transformation for Speaker Adaptation in Large-Vocabulary Speech Recognition Systems," Proc. ICASSP 96, vol. II, pp. 701-704, May 1996.
  • Lawrence R. Rabiner and Ronald W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, pp. 485-489, 1978.

