 The Delphion Integrated View

Title: US5729656: Reduction of search space in speech recognition using phone boundaries and phone ranking


Country: US United States of America

Pages: 17

 
Inventor: Nahamoo, David; White Plains, NY
Padmanabhan, Mukund; Ossining, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1998-03-17 / 1994-11-30

Application Number: US1994000347013

IPC Code: Advanced: G10L 15/04; G10L 15/00; G10L 15/02; G10L 15/14;
IPC-7: G10L 5/06;

ECLA Code: G10L15/04; S10L15/08P; S10L15/14M;

U.S. Class: Current: 704/254; 704/255; 704/E15.005;
Original: 395/002.63;

Field of Search: 381/041,42,43 395/2.6,2.62-2.66,2.4

Priority Number:
1994-11-30  US1994000347013

Abstract:     A method for estimating the probability of phone boundaries and the accuracy of the acoustic modelling in reducing a search-space in a speech recognition system. The accuracy of the acoustic modelling is quantified by the rank of the correct phone. The system includes a microphone for converting an utterance into an electrical signal, which is processed by an acoustic processor and label match which finds the best-matched acoustic label prototype. A probability distribution on phone boundaries is produced for every time frame using a first decision tree. These probabilities are compared to a threshold and some time frames are identified as boundaries between phones. An acoustic score is computed for all phones between every given pair of hypothesized boundaries, and the phones are ranked on the basis of this score. A second decision tree is traversed for every time frame to obtain the worst case rank of the correct phone at that time, and a short list of allowed phones is made for every time frame. A fast acoustic word match processor matches the label string from the acoustic processor to produce an utterance signal which includes at least one word. From recognition candidates produced by the fast acoustic match and the language model, the detailed acoustic match matches the label string from the acoustic processor against acoustic word models and outputs a word string corresponding to an utterance.
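The boundary-detection and short-list steps described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented implementation: the per-frame boundary probabilities, the acoustic scores, and the worst-case rank are hypothetical values standing in for the outputs of the two decision trees and the acoustic scoring, and the function names (`find_boundaries`, `rank_phones`, `short_list`) are invented for this example. It also assumes that a frame counts as a boundary when its probability is at or above the threshold, and that a higher score means a better match.

```python
from typing import Dict, List


def find_boundaries(boundary_probs: List[float], threshold: float) -> List[int]:
    """Return the time-frame indices whose boundary probability meets the threshold.

    Assumption: ">=" is used for the comparison; the abstract only says the
    probabilities are "compared to a threshold".
    """
    return [t for t, p in enumerate(boundary_probs) if p >= threshold]


def rank_phones(scores: Dict[str, float]) -> List[str]:
    """Order phones from best to worst, assuming a higher score is better."""
    return sorted(scores, key=lambda phone: scores[phone], reverse=True)


def short_list(ranked: List[str], worst_case_rank: int) -> List[str]:
    """Keep only the phones at or above the predicted worst-case rank (1-based)."""
    return ranked[:worst_case_rank]


# Hypothetical per-frame boundary probabilities (stand-in for the first tree's output).
boundary_probs = [0.05, 0.10, 0.85, 0.20, 0.15, 0.90, 0.30]
boundaries = find_boundaries(boundary_probs, threshold=0.5)

# Hypothetical acoustic scores for the segment between two hypothesized boundaries.
segment_scores = {"AA": -12.0, "IY": -7.5, "S": -20.1, "T": -9.3}
ranked = rank_phones(segment_scores)

# Hypothetical worst-case rank of the correct phone (stand-in for the second tree's output).
allowed = short_list(ranked, worst_case_rank=2)

print(boundaries, ranked, allowed)
```

Under these assumptions, only the phones in `allowed` would be passed on to the later matching stages for that stretch of frames, which is how the search space shrinks.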

Primary / Asst. Examiners: MacDonald, Allen R.; Mattson, Robert


Designated Country: DE FR GB 

Family: 5 known family members

First Claim (of 10):
We claim:
1. A method of recognizing speech, comprising the steps of:
  • a) inputting a plurality of words of training data;
  • b) training a plurality of first binary decision trees to ask a maximally informative question at each node based upon contextual information in the training data, wherein each first binary decision tree corresponds to a different time in a sequence of the training data;
  • c) traversing one of the first binary decision trees for every time frame of an input sequence of speech to determine a probability distribution for every time frame, the probability distribution being the probability that a node is a phone boundary;
  • d) comparing the probabilities associated with the time frames with a threshold for identifying some time frames as boundaries between phones;
  • e) providing an acoustic score for all phones between every given pair of boundaries to generate a second binary decision tree of such acoustic scores;
  • f) traversing the second binary decision tree of such acoustic scores for all phones to rank the phones from best to worst on the basis of this score; and
  • g) outputting a recognition result in response to the score.
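Step b) refers to asking a "maximally informative question" at each tree node. One common way to make that concrete is to pick the candidate question that gives the largest reduction in entropy of the boundary/non-boundary outcome. The sketch below assumes that criterion; this record does not state which measure the patent actually uses. The sample format (a three-label context plus a boundary flag), the two candidate questions, and all function names are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[int], bool]          # (context labels, is_boundary)
Question = Callable[[Sequence[int]], bool]   # yes/no question about the context


def entropy(samples: List[Sample]) -> float:
    """Entropy in bits of the boundary/non-boundary outcome."""
    if not samples:
        return 0.0
    p = sum(1 for _, is_boundary in samples if is_boundary) / len(samples)
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def information_gain(samples: List[Sample], question: Question) -> float:
    """Entropy reduction obtained by splitting the samples on the question."""
    if not samples:
        return 0.0
    yes = [s for s in samples if question(s[0])]
    no = [s for s in samples if not question(s[0])]
    n = len(samples)
    conditional = (len(yes) / n) * entropy(yes) + (len(no) / n) * entropy(no)
    return entropy(samples) - conditional


def best_question(samples: List[Sample], questions: Dict[str, Question]) -> str:
    """Return the name of the candidate question with the largest gain."""
    return max(questions, key=lambda name: information_gain(samples, questions[name]))


# Hypothetical training data: (previous, current, next) acoustic labels per frame.
samples: List[Sample] = [
    ((3, 3, 3), False),
    ((3, 3, 7), False),
    ((3, 7, 7), True),
    ((7, 7, 7), False),
    ((7, 2, 2), True),
    ((2, 2, 2), False),
]

# Hypothetical candidate questions about the context.
questions: Dict[str, Question] = {
    "label_changed": lambda ctx: ctx[1] != ctx[0],
    "current_label_is_odd": lambda ctx: ctx[1] % 2 == 1,
}

print(best_question(samples, questions))
```

In this toy data the label-change question separates boundary frames from the rest perfectly, so it would be chosen for the node; a real tree would repeat the selection recursively on each branch.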


Forward References: 35 U.S. patents reference this one

       
U.S. References: 16 backward references, 35 forward references

Patent | Pub.Date | Inventor | Assignee | Title
US4718094 | 1988-01 | Bahl et al. | International Business Machines Corp. | Speech recognition system
US4741036 | 1988-04 | Bahl et al. | International Business Machines Corporation | Determination of phone weights for markov models in a speech recognition system
US4773093 | 1988-09 | Higgins et al. | ITT Defense Communications | Text-independent speaker recognition system and method based on acoustic segment matching
US4803729 | 1989-02 | Baker | Dragon Systems, Inc. | Speech recognition method
US4805219 | 1989-02 | Baker et al. | Dragon Systems, Inc. | Method for speech recognition
US4813074 | 1989-03 | Marcus | U.S. Philips Corp. | Method of and device for segmenting an electric signal derived from an acoustic signal
US4852173 | 1989-07 | Bahl et al. | International Business Machines Corporation | Design and construction of a binary-tree system for language modelling
US4977599 | 1990-12 | Bahl et al. | International Business Machines Corporation | Speech recognition employing a set of Markov models that includes Markov models representing transitions to and from silence
US5027408 | 1991-06 | Kroeker et al. | - | Speech-recognition circuitry employing phoneme estimation
US5144671 | 1992-09 | Mazor et al. | GTE Laboratories Incorporated | Method for reducing the search complexity in analysis-by-synthesis coding
US5222146 | 1993-06 | Bahl et al. | International Business Machines Corporation | Speech recognition apparatus having a speech coder outputting acoustic prototype ranks
US5233681 | 1993-08 | Bahl et al. | International Business Machines Corporation | Context-dependent speech recognizer using estimated next word context
US5263117 | 1993-11 | Nadas et al. | International Business Machines Corporation | Method and apparatus for finding the best splits in a decision tree for a language model for a speech recognizer
US5280562 | 1994-01 | Bahl et al. | International Business Machines Corporation | Speech coding apparatus with single-dimension acoustic prototypes for a speech recognizer
US5293584 | 1994-03 | Brown et al. | International Business Machines Corporation | Speech recognition system for natural language translation
US5390278 | 1995-02 | Gupta et al. | Bell Canada | Phoneme based speech recognition
       
Foreign References:

Publication | Date | IPC Code | Assignee | Title
EP0313975A2 | 1989-05 | G06F 15/36 | IBM | Design and construction of a binary-tree system for language modelling
EP0387602A2 | 1990-09 | G01L 5/06 | IBM | Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system.
EP0424655A2 | 1991-05 | G10L 5/06 | GRUNDIG E.M.V. Elektro-Mechanische Versuchsanstalt Max Grundig holländ. Stiftung & Co. KG. | System for the transmission of widescreen videosignals for the displaying on television receivers with a conventional or wide aspect ratio


Other Abstract Info: DERABS G1996-261876

Other References:
  • L.R. Bahl et al., "Faster Acoustic Match Computation", IBM Technical Disclosure Bulletin, vol. 23, No. 4, Sep. 1980, pp. 1718-1719.
  • P.S. Gopalakrishnan et al., "Channel-Bank-Based Thresholding to Improve Search Time in the Fast-Match", IBM Technical Disclosure Bulletin, vol. 37, No. 02A, Feb. 1994, pp. 113-114.
  • V. Algazi et al., "Transform Representation of the Spectra of Acoustic Speech Segments With Applications I: General Approach and Application to Speech Recognition", IEEE Transactions on Speech and Audio Processing, vol. 1, No. 2, Apr. 1993, pp. 180-195.
  • L. Bahl, "A Fast Approximate Acoustic Match for Large Vocabulary Speech Recognition", IEEE Transactions on Speech and Audio Processing, vol. 1, No. 1, Jan. 1993, pp. 59-67.


    Thomson Reuters Copyright © 1997-2014 Thomson Reuters 