
 The Delphion Integrated View

Title: US5680509: Method and apparatus for estimating phone class probabilities a-posteriori using a decision tree

Country: US United States of America

Pages: 11

Inventor: Gopalakrishnan, Ponani S.; Yorktown Heights, NY
Nahamoo, David; White Plains, NY
Padmanabhan, Mukund; Ossining, NY
Picheny, Michael Alan; White Plains, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1997-10-21 / 1994-09-27

Application Number: US1994000312584

IPC Code: Advanced: G10L 15/06; G10L 15/08;
IPC-7: G10L 5/06;

ECLA Code: G10L15/063; G10L15/08;

U.S. Class: Current: 704/240; 704/231; 704/255; 704/E15.008; 704/E15.014;
Original: 395/002.49; 395/002.54; 395/002.64;

Field of Search: 395/002, 2.4, 2.49, 5.54, 2.6-2.65, 2.51; 381/043

Priority Number:
1994-09-27  US1994000312584

Abstract:     A method and apparatus for estimating the probability of phones, a-posteriori, in the context of not only the acoustic feature at that time, but also the acoustic features in the vicinity of the current time, and its use in cutting down the search-space in a speech recognition system. The method constructs and uses a decision tree, with the predictors of the decision tree being the vector-quantized acoustic feature vectors at the current time, and in the vicinity of the current time. The process starts with an enumeration of all (predictor, class) events in the training data at the root node, and successively partitions the data at a node according to the most informative split at that node. An iterative algorithm is used to design the binary partitioning. After the construction of the tree is completed, the probability distribution of the predicted class is stored at all of its terminal leaves. The decision tree is used during the decoding process by tracing a path down to one of its leaves, based on the answers to binary questions about the vector-quantized acoustic feature vector at the current time and its vicinity.
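
The decoding step described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Node` class, the `phone_posteriors` function, the clamping of out-of-range frames at utterance boundaries, and the toy tree with its probabilities are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional


@dataclass
class Node:
    """Hypothetical tree node: internal nodes ask a question, leaves hold a distribution."""
    offset: int = 0                           # frame to inspect, relative to the current time t
    label_set: FrozenSet[int] = frozenset()   # question: "is the VQ label at t+offset in this set?"
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    dist: Optional[Dict[str, float]] = None   # phone class distribution, set only at leaves


def phone_posteriors(root: Node, labels: List[int], t: int) -> Dict[str, float]:
    """Trace a path from the root to a leaf for time frame t and return the leaf distribution."""
    node = root
    while node.dist is None:
        # Assumption: frames outside the utterance are clamped to the nearest valid frame.
        i = min(max(t + node.offset, 0), len(labels) - 1)
        node = node.yes if labels[i] in node.label_set else node.no
    return node.dist


# Toy tree with made-up probabilities over two phone classes.
leaf_a = Node(dist={"AA": 0.8, "IY": 0.2})
leaf_b = Node(dist={"AA": 0.3, "IY": 0.7})
leaf_c = Node(dist={"AA": 0.1, "IY": 0.9})
root = Node(
    offset=0, label_set=frozenset({1, 2}),
    yes=Node(offset=-1, label_set=frozenset({1}), yes=leaf_a, no=leaf_b),
    no=leaf_c,
)

labels = [1, 2, 5, 1]   # vector-quantized labels, one per time frame
per_frame = [phone_posteriors(root, labels, t) for t in range(len(labels))]
```

Each frame costs at most one set-membership test per tree level, which is why such a tree is cheap enough to run on every frame before a more expensive search.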

Attorney, Agent or Firm: Tassinari, Jr., Robert P. ;

Primary / Asst. Examiners: Tung, Kee M.;


Family: None

First Claim (of 11):
We claim:     1. A method of recognizing speech, comprising:
  • (a) inputting a set of training data comprising a plurality of records, each record of the training data comprising a sequence of 2K+1 feature vectors and a member of the class, each feature vector being represented by a label;
  • (b) forming a binary decision tree, the tree comprising a root node and a plurality of child nodes each associated with a binary question, the tree terminating in a plurality of terminal nodes, wherein the step of forming the trees comprises:
    • (i) for each index t in the sequence of feature vectors, wherein the index t refers to the tth label in the sequence of 2K+1 labels in a training record, dividing the labels at each of indexes t-K, . . . , t, . . . t+K, into pairs of sets, respectively, wherein the labels at each of the indexes are divided so as to minimize entropy of the classes associated with the pairs of sets;
    • (ii) selecting from the pairs of sets a lowest entropy pair;
    • (iii) generating a binary question and assigning it to the node, wherein the question asks whether a label to be classified, occurring at index T corresponding to the index of the lowest entropy pair, is a member of the first set or the second set;
  • (c) partitioning the data at the current node into two child nodes in accordance with this question;
  • (d) repeating steps (b)(i)-(b)(iii) for each child node;
  • (e) for each child node, computing a probability distribution of the occurrence of the class members, given the members of the set of labels at that node;
  • inputting a sequence of speech to be recognized;
  • traversing the binary decision tree for every time frame of an input sequence of speech to determine a distribution of most likely phones for each time frame, the most likely phones for each time frame collectively forming a phone sequence;
  • outputting a recognition result based upon the distribution of most likely phones.
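
Steps (a) through (e) of the claim can be sketched roughly as follows. This is an illustrative approximation, not the patented procedure: the greedy one-label-at-a-time hill climb in `best_subset` is a simple stand-in for the iterative partitioning algorithm the patent refers to, and every name here (`entropy`, `split_entropy`, `best_subset`, `build`, `min_gain`), the dictionary-based tree layout, and the toy records are assumptions. Distributions are stored at leaves only, as the abstract describes.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a Counter of class counts."""
    n = sum(counts.values())
    if n == 0:
        return 0.0
    return -sum(c / n * math.log2(c / n) for c in counts.values() if c)


def split_entropy(records, pos, subset):
    """Weighted class entropy after asking 'is labels[pos] in subset?'."""
    yes, no = Counter(), Counter()
    for labels, cls in records:
        (yes if labels[pos] in subset else no)[cls] += 1
    total = len(records)
    return (sum(yes.values()) * entropy(yes) + sum(no.values()) * entropy(no)) / total


def best_subset(records, pos):
    """Stand-in search: toggle one label at a time while the entropy keeps dropping."""
    alphabet = sorted({labels[pos] for labels, _ in records})
    subset, best = set(), split_entropy(records, pos, set())
    improved = True
    while improved:
        improved = False
        for lab in alphabet:
            trial = subset ^ {lab}          # move this label to the other side
            h = split_entropy(records, pos, trial)
            if h < best - 1e-12:
                subset, best, improved = trial, h, True
    return frozenset(subset), best


def build(records, min_gain=1e-9):
    """Recursively split on the lowest-entropy position; stop when nothing is gained."""
    counts = Counter(cls for _, cls in records)
    total = sum(counts.values())
    width = len(records[0][0])              # 2K+1 labels per record
    candidates = [(best_subset(records, p), p) for p in range(width)]
    (subset, h), pos = min(candidates, key=lambda c: c[0][1])
    if entropy(counts) - h < min_gain:
        return {"dist": {c: k / total for c, k in counts.items()}}
    yes = [r for r in records if r[0][pos] in subset]
    no = [r for r in records if r[0][pos] not in subset]
    return {"pos": pos, "subset": subset,
            "yes": build(yes, min_gain), "no": build(no, min_gain)}


# Toy training data with K = 1: (labels at t-1, t, t+1) paired with a phone class.
records = [
    ((1, 1, 2), "AA"),
    ((2, 1, 1), "AA"),
    ((1, 2, 2), "IY"),
    ((2, 2, 1), "IY"),
]
tree = build(records)
```

In this toy data only the centre label is informative, so the root question is asked at position 1 and both children become pure leaves. A greedy search of this kind can stall in a local optimum on real data.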

U.S. References: 19 forward references; 5 backward references (listed below)

Patent  Pub. Date  Inventor  Assignee  Title
US4759068  1988-07  Bahl et al.  International Business Machines Corporation  Constructing Markov models of words from multiple utterances
US4852173  1989-07  Bahl et al.  International Business Machines Corporation  Design and construction of a binary-tree system for language modelling
US5033087  1991-07  Bahl et al.  International Business Machines Corp.  Method and apparatus for the automatic determination of phonological rules as for a continuous speech recognition system
US5263117  1993-11  Nadas et al.  International Business Machines Corporation  Method and apparatus for finding the best splits in a decision tree for a language model for a speech recognizer
US5267345  1993-11  Brown et al.  International Business Machines Corporation  Speech recognition apparatus which predicts word classes from context and words from word classes

Foreign References: None

Other Abstract Info: DERABS G1997-525994

Other References:
  • L. R. Bahl et al., "A Fast Approximate Acoustic Match for Large . . . ," IEEE Trans. on Speech & Audio Processing, vol. 1, no. 1, Jan. 1993, pp. 59-67.
  • A. Nadas et al., "An iterative flip-flop approximation of the . . . ," Proc. of International Conf. on Acoustics, Speech etc., 1991, pp. 565-568.
