Title: US6442519: Speaker model adaptation via network of similar users

Country: US United States of America

15 pages

Inventor: Kanevsky, Dimitri; Ossining, NY
Libal, Vit V.; Tarnvald, Czech Republic
Sedivy, Jan; Prague, Czech Republic
Zadrozny, Wlodek W.; Tarrytown, NY

Assignee: International Business Machines Corp., Armonk, NY

Published / Filed: 2002-08-27 / 1999-11-10

Application Number: US1999000437646

IPC Code: Advanced: G10L 15/06;
IPC-7: G10L 15/06; G10L 15/14;

ECLA Code: G10L15/07;

U.S. Class: 704/243; 704/270.1;

Field of Search: 704/243,244,245,270,270.1,275,233,246,250,251,255,256 379/88.01,88.02,88.03,88.04

Priority Number:
1999-11-10  US1999000437646

Abstract: A speech recognition system, method and program product for recognizing speech input from computer users connected together over a network of computers. Speech recognition computer users on the network are clustered into classes of similar users according to their similarities, including characteristics such as nationality, profession, sex, age, etc. Each computer in the speech recognition network includes at least one user-based acoustic model trained for a particular user. The acoustic models include an acoustic model domain, with similar acoustic models being clustered according to an identified domain. User characteristics are collected from databases over the network and from users using the speech recognition system, and are then distributed over the network during or after user activities. Existing acoustic models are modified in response to user production activities. As recognition progresses, similar language models among similar users are identified on the network. Update information, including information about user activities and user acoustic model data, is transmitted over the network and identified similar language models are updated. Acoustic models improve for users that are connected over the network as similar users use their respective speech recognition systems.
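The clustering of users by shared characteristics described in the abstract can be illustrated with a minimal sketch. This is not the patented method: the function name `cluster_by_characteristics`, the trait keys, and the sample users are all hypothetical, and exact matching on every key is an assumed simplification of whatever similarity measure the patent actually uses.

```python
from collections import defaultdict


def cluster_by_characteristics(users, keys=("nationality", "profession")):
    """Group user ids whose traits agree on every key in `keys`.

    Assumption: exact match on the chosen keys stands in for the
    patent's notion of "similar users".
    """
    clusters = defaultdict(list)
    for user_id, traits in users.items():
        # The tuple of trait values acts as the cluster label.
        signature = tuple(traits.get(k) for k in keys)
        clusters[signature].append(user_id)
    return dict(clusters)


# Hypothetical users, for illustration only.
users = {
    "alice": {"nationality": "US", "profession": "radiologist"},
    "bob": {"nationality": "US", "profession": "radiologist"},
    "carol": {"nationality": "CZ", "profession": "lawyer"},
}
clusters = cluster_by_characteristics(users)
```

Here `alice` and `bob` fall into the same class because both traits match, while `carol` forms a class of her own.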

Attorney, Agent or Firm: Percello, Louis J.; Fitch, Even, Tabin & Flannery

Primary / Asst. Examiners: Banks-Harold, Marsha D.; Lerner, Martin

Maintenance Status: CC Certificate of Correction issued

Parent Case:

    The present invention is related to U.S. patent application Ser. No. 08/787,031, filed Jan. 28, 1997 entitled "Speaker Recognition Using Thresholded Speaker Class Model Selection or Model Adaptation" to Ittycheriah, et al. now issued as U.S. Pat. No. 5,895,447, U.S. patent application Ser. No. 08/788,471, filed Jan. 28, 1997 entitled "Text Independent Speaker Recognition for Transparent Command Ambiguity Resolution and Continuous Access Control" now U.S. Pat. No. 6,073,101 issued Jun. 6, 2000, and U.S. patent application Ser. No. 08/787,029, filed Jan. 28, 1997 entitled "Speaker Model Prefetching" both to Stephane Maes now U.S. Pat. No. 6,088,669 issued Jul. 11, 2000, and (Ser. No. 09/422,383) entitled "Language Model Adaptation Via Network of Similar Users" filed Oct. 21, 1999, all assigned to the assignee of the present invention. These patents and patent applications are herein incorporated by reference in their entirety.

Family: None

First Claim (of 26 claims):
We claim:     1. A speech recognition system for recognizing speech input from computer users connected together over a network of computers, a plurality of said computers each including at least one acoustic model trained for a particular user, said system comprising:
  • means for comparing acoustic models of one or more computer users, each of said computer users using one of a plurality of computers;
  • means for clustering users on a network of said plurality of computers into clusters of similar users responsive to said comparison of acoustic models;
  • means for modifying each of said acoustic models responsive to user production activities;
  • means for comparing identified similar acoustic models and, responsive to modification of one or more of said acoustic models, modifying one or more compared said identified similar acoustic models; and
  • means for transmitting acoustic model data over said network to other computers connected to said network.
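
The elements of claim 1 (comparing acoustic models, clustering similar users, modifying a model, and propagating the change to similar models) can be sketched as follows. This is a toy illustration under stated assumptions, not the patent's implementation: each acoustic model is reduced to a short list of floats, similarity is Euclidean distance, clustering is a greedy threshold pass, and peers receive a fixed fraction `weight` of the change. The names `distance`, `cluster_models`, `adapt_and_propagate`, `threshold` and `weight` are all hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two parameter vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cluster_models(models, threshold):
    """Greedy clustering: a user joins the first cluster that already
    contains a model within `threshold`, otherwise starts a new cluster."""
    clusters = []
    for user, vec in models.items():
        for cluster in clusters:
            if any(distance(vec, models[peer]) <= threshold for peer in cluster):
                cluster.append(user)
                break
        else:
            clusters.append([user])
    return clusters


def adapt_and_propagate(models, clusters, user, new_vec, weight=0.5):
    """Replace `user`'s model with `new_vec`, then shift every peer in the
    same cluster by `weight` times the change (an assumed update rule)."""
    delta = [n - o for n, o in zip(new_vec, models[user])]
    models[user] = list(new_vec)
    for cluster in clusters:
        if user in cluster:
            for peer in cluster:
                if peer != user:
                    models[peer] = [p + weight * d
                                    for p, d in zip(models[peer], delta)]
    return models


# Hypothetical two-parameter models, for illustration only.
models = {"u1": [0.0, 0.0], "u2": [0.5, 0.0], "u3": [10.0, 10.0]}
clusters = cluster_models(models, threshold=1.0)
adapt_and_propagate(models, clusters, "u1", [1.0, 2.0])
```

In this example `u1` and `u2` share a cluster, so adapting `u1` also moves `u2` halfway along the same change, while the dissimilar `u3` is left untouched. In a networked setting the `delta` would be the "acoustic model data" transmitted to the other computers; here everything runs in one process for simplicity.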

U.S. References: 13 backward references (listed below); 66 forward references

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US5664058 | 1997-09 | Vysotsky | NYNEX Science & Technology | Method of training a speaker-dependent speech recognizer with automated supervision of training sufficiency | 12
US5864807 | 1999-01 | Campbell et al. | Motorola, Inc. | Method and apparatus for training a speaker recognition system | 6
US5895447 | 1999-04 | Ittycheriah et al. | International Business Machines Corporation | Speech recognition using thresholded speaker class model selection or model adaptation | 8
US5897616 | 1999-04 | Kanevsky et al. | International Business Machines Corporation | Apparatus and methods for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases | 15
US5950157 | 1999-09 | Heck et al. | SRI International | Method for establishing handset-dependent normalizing models for speaker recognition | 13
US6088669 | 2000-07 | Maes | International Business Machines Corporation | Speech recognition with attempted speaker recognition for speaker model prefetching or alternative speech modeling | 11
US6141641 | 2000-10 | Hwang et al. | Microsoft Corporation | Dynamically configurable acoustic model for speech recognition system | 14
US6163769 | 2000-12 | Acero et al. | Microsoft Corporation | Text-to-speech using clustered context-dependent phoneme-based units | 12
US6182037 | 2001-01 | Maes | International Business Machines Corporation | Speaker recognition over large population with fast and detailed matches | 9
US6182038 | 2001-01 | Balakrishnan et al. | Motorola, Inc. | Context dependent phoneme networks for encoding speech information | 9
US6253179 | 2001-06 | Beigi et al. | International Business Machines Corporation | Method and apparatus for multi-environment speaker verification | 14
US6327568 | 2001-12 | Joost | U.S. Philips Corporation | Distributed hardware sharing for speech processing | 6
US6363348 | 2002-03 | Besling et al. | U.S. Philips Corporation | User model-improvement-data-driven selection and update of user-oriented recognition model of a given type for word recognition at network server | 11

Foreign References: None

Other References:
  • L.R. Bahl, P.V. de Souza, P.S. Gopalakrishnan, D. Nahamoo, M. Picheny, Decision Trees for Phonological Rules in Continuous Speech, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Toronto, Canada, May 1991.
  • Frederick Jelinek, Statistical Methods for Speech Recognition, The MIT Press, Cambridge, Jan. 1999, pp. 165-171.
  • M.J.F. Gales and P.C. Woodland, Means and variance adaptation within the MLLR framework, Computer Speech and Language (1996) 10, 249-264. (16 pages)
  • Chin-Hui Lee and J.L. Gauvain, Bayesian Adaptive Learning and MAP Estimation of HMM, Automatic Speech and Speaker Recognition, 1996 Kluwer Academic Publishers, Boston, pp. 83-105.
  • Jerome R. Bellegarda, Context-Dependent Vector Clustering for Speech Recognition, Automatic Speech and Speaker Recognition, Kluwer Academic Publishers, Boston, pp. 133-153.
  • D. Matrouf, M. Adda-Decker, L. Lamel, and J. Gauvain, Language Identification Incorporating Lexical Information, Proceedings of the 1998 International Conference on Spoken Language Processing, ICSLP '98, Sydney, Australia, Dec. 1998, pp. 181-184.
