 The Delphion Integrated View

Title: US4759068: Constructing Markov models of words from multiple utterances


Country: US United States of America

35 pages

 
Inventor: Bahl, Lalit R.; Amawalk, NY
DeSouza, Peter V.; Yorktown Heights, NY
Mercer, Robert L.; Yorktown Heights, NY
Picheny, Michael A.; White Plains, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1988-07-19 / 1985-05-29

Application Number: US1985000738933

IPC Code: Advanced: G10L 15/14;
IPC-7: G10L 5/00;

ECLA Code: G10L15/14; T05K999/99;

U.S. Class: Current: 704/242; 704/240; 704/243; 704/254; 704/256; 704/256.2;
Original: 381/043;

Field of Search: 381/041-43 364/513.5

Priority Number:
1985-05-29  US1985000738933

Abstract:     Speech recognition is improved by splitting each feneme string at a consistent point into a left portion and a right portion. The present invention addresses the problem of constructing fenemic baseforms which take into account variations in pronunciation of words from one utterance thereof to another. Specifically, the invention relates to a method of constructing a fenemic baseform for a word in a vocabulary of word segments including the steps of:
  • (a) transforming multiple utterances of the word into respective strings of fenemes;
  • (b) defining a set of fenemic Markov model phone machines;
  • (c) determining the best single phone machine P1 for producing the multiple feneme strings;
  • (d) determining the best two phone baseform of the form P1P2 or P2P1 for producing the multiple feneme strings;
  • (e) aligning the best two phone baseform against each feneme string;
  • (f) splitting each feneme string into a left portion and a right portion with the left portion corresponding to the first phone machine of the two phone baseform and the right portion corresponding to the second phone machine of the two phone baseform;
  • (g) identifying each left portion as a left substring and each right portion as a right substring;
  • (h) processing the set of left substrings and the set of right substrings in the same manner as the set of feneme strings corresponding to the multiple utterances including the further step of inhibiting further splitting of a substring when the single phone baseform thereof has a higher probability of producing the substring than does the best two phone baseform; and
  • (k) concatenating the unsplit single phones in an order corresponding to the order of the feneme substrings to which they correspond.
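
The recursive procedure in steps (c) through (k) can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes a deliberately simplified phone machine (not described in this record) that emits one or more fenemes with fixed per-feneme probabilities and a self-loop; the names `STAY`, `EXIT`, `EMIT`, `log_prob` and `grow_baseform` are hypothetical, and all emission probabilities are assumed to be nonzero.

```python
import math

STAY, EXIT = 0.6, 0.4  # assumed self-loop / exit probabilities of the toy phone machine

def log_prob(emit, phone, s):
    """Log-probability that one simplified phone machine produces all of s."""
    return (sum(math.log(emit[phone][f]) for f in s)
            + (len(s) - 1) * math.log(STAY) + math.log(EXIT))

def best_single(emit, strings):
    """Step (c): phone with the highest total log-probability over all strings."""
    scores = {p: sum(log_prob(emit, p, s) for s in strings) for p in emit}
    best = max(scores, key=scores.get)
    return best, scores[best]

def best_split(emit, first, second, s):
    """Steps (e)-(f): best (score, k) with s[:k] from `first` and s[k:] from `second`."""
    return max((log_prob(emit, first, s[:k]) + log_prob(emit, second, s[k:]), k)
               for k in range(1, len(s)))

def best_two(emit, p1, strings):
    """Step (d): best two-phone baseform of the form P1P2 or P2P1."""
    best = None
    for p2 in emit:
        for pair in [(p1, p2), (p2, p1)]:
            splits = [best_split(emit, pair[0], pair[1], s) for s in strings]
            score = sum(sc for sc, _ in splits)
            if best is None or score > best[0]:
                best = (score, pair, [k for _, k in splits])
    return best

def grow_baseform(emit, strings):
    """Steps (c)-(k): split recursively, then concatenate the unsplit single phones."""
    p1, single_score = best_single(emit, strings)
    # In this toy model every phone emits at least one feneme,
    # so a one-feneme string cannot be split further.
    if any(len(s) < 2 for s in strings):
        return [p1]
    two_score, _pair, cuts = best_two(emit, p1, strings)
    if single_score >= two_score:  # step (h): inhibit further splitting
        return [p1]
    left = [s[:k] for s, k in zip(strings, cuts)]
    right = [s[k:] for s, k in zip(strings, cuts)]
    return grow_baseform(emit, left) + grow_baseform(emit, right)

# Toy data: two phones, two fenemes (0 and 1), three utterances of one word.
EMIT = {"A": {0: 0.9, 1: 0.1}, "B": {0: 0.1, 1: 0.9}}
UTTERANCES = [[0, 0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1]]
print(grow_baseform(EMIT, UTTERANCES))  # ['A', 'B']
```

In this toy run the strings are cut after the run of feneme 0, neither half benefits from a further split, and the two unsplit phones are concatenated in order.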

Attorney, Agent or Firm: Block, Marc A.

Primary / Asst. Examiners: Kemeny, Emanuel S.

Maintenance Status: E3 Expired

Family: 10 known family members

First Claim (of 13 claims):
We claim:     1. In a speech recognition system having an acoustic processor, a method of processing multiple utterances of a word in the construction of a fenemic baseform for the word, the method comprising the steps of:
  • (a) providing as input a string of fenemes generated by the acoustic processor in response to an utterance of the word;
  • (b) repeating step (a) for each utterance of the multiple utterances; and
  • (c) locating a consistent point in each input string of fenemes, wherein each string of fenemes is divided by the consistent point thereof into a left portion and a right portion (i) each of the left portions corresponding to a first sound-representing model in a set of sound-representing models and (ii) each of the right portions corresponding to a second sound-representing model in the set of sound-representing models.
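
Step (c) of the claim can be illustrated with a minimal Python sketch, under stated assumptions: each sound-representing model is reduced to a table of per-feneme emission probabilities (all nonzero), and the consistent point is taken to be the cut that maximizes the joint log-likelihood of the left and right portions. The function name `consistent_point` and the toy models `LEFT` and `RIGHT` are hypothetical and do not come from the patent.

```python
import math
from itertools import accumulate

def consistent_point(s, left_model, right_model):
    """Return k (1 <= k < len(s)) maximizing
    log P(s[:k] | left_model) + log P(s[k:] | right_model).
    Requires len(s) >= 2 so that both portions are non-empty."""
    n = len(s)
    # left_ll[i] is the log-likelihood of the first i + 1 fenemes under the left model.
    left_ll = list(accumulate(math.log(left_model[f]) for f in s))
    # right_ll[j] is the log-likelihood of the last j + 1 fenemes under the right model.
    right_ll = list(accumulate(math.log(right_model[f]) for f in reversed(s)))
    return max(range(1, n), key=lambda k: left_ll[k - 1] + right_ll[n - k - 1])

# Toy models over a two-feneme alphabet {"x", "y"}.
LEFT = {"x": 0.8, "y": 0.2}
RIGHT = {"x": 0.2, "y": 0.8}

for utterance in ["xxxyy", "xxy", "xyyy"]:
    k = consistent_point(utterance, LEFT, RIGHT)
    print(utterance[:k], "|", utterance[k:])
```

Each utterance has a different length, yet the cut falls at the same acoustic event (the change from "x" to "y") in all of them, which is the sense in which the point is consistent across utterances.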


Forward References: 40 U.S. patents reference this one

       
U.S. References: Forward references (40)   |   Backward references (10)

Pages | Patent    | Pub. Date | Inventor        | Assignee                                    | Title
13    | US4038503 | 1977-07   | Moshier         | Dialog Systems, Inc.                        | Speech recognition apparatus
44    | US4181821 | 1980-01   | Pirz et al.     | Bell Telephone Laboratories, Incorporated   | Multiple template speech recognition system
47    | US4319085 | 1982-03   | Welch et al.    | Threshold Technology Inc.                   | Speech recognition apparatus and method
15    | US4348553 | 1982-09   | Baker et al.    | International Business Machines Corporation | Parallel pattern verifier with dynamic time warping
27    | US4481593 | 1984-11   | Bahler          | Exxon Corporation                           | Continuous speech recognition
9     | US4513436 | 1985-05   | Nose et al.     | Oki Electric Industry, Co., Ltd.            | Speech recognition system
20    | US4587670 | 1986-05   | Levinson et al. | AT&T Bell Laboratories                      | Hidden Markov model speech recognition arrangement
12    | US4590605 | 1986-05   | Hataoka et al.  | Hitachi, Ltd.                               | Method for production of speech reference templates
28    | US4593367 | 1986-06   | Slack et al.    | ITT Corporation                             | Probabilistic learning element
11    | US4618983 | 1986-10   | Nishioka et al. | Sharp Kabushiki Kaisha                      | Speech recognition with preliminary matching
       
Foreign References:
Pages | Publication | Date    | IPC Code  | Assignee                    | Title
67    | EP0025685   | 1981-03 | G06K 9/66 | INTERSTATE ELECTRONICS CORP | Training circuit for audio signal recognition computer
53    | EP0033412   | 1981-08 | G10L 1/04 | SCOTT INSTR CO              | Method and apparatus for speech recognition


Other References:
  • M. Cravero et al. "Phonetic Units for Hidden Markov Models", CSELT Technical Reports, vol. XIV No. 2 Apr. 1986, pp. 121-125.
  • L. R. Rabiner et al., "Recent Developments in the Application of Hidden Markov Models to Speaker-Independent Isolated Word Recognition", AT&T 1985 article, p. 1214.
  • H. Bourlard et al., "Speaker Dependent Connected Speech Recognition Via Phonemic Markov Models", 1985 IEEE, pp. 1213-1216.
  • Douglas E. Paul et al., "Training of HMM Recognizers by Simulated Annealing", 1985, IEEE, pp. 13-16.
  • Yves Kamp et al., "State Reduction in Hidden Markov Chains Used for Speech Recognition", 1985, IEEE, pp. 1138-1145.
  • K. Sugawara, "Isolated Word Recognition Using Hidden Markov Models", 1985, IEEE, pp. 1-4.
  • R. Schwartz, "Context-Dependent Modeling for Acoustic-Phonetic Recognition of Continuous Speech", 1985, IEEE, pp. 1205-1208.
  • J. F. Mari et al., "Speaker Independent Connected Digit Recognition Using Hidden Markov Models", 1985, Speech Tech, pp. 127-132.
  • R. Schwartz et al., "Improved Hidden Markov Modeling of Phonemes for Continuous Speech Recognition", 1984, IEEE, pp. 35.6.1-35.6.4.
  • S. E. Levinson et al., "Speaker Independent Isolated Digit Recognition Using Hidden Markov Models", 1983, IEEE, pp. 1049-1052.
  • Jean-Paul Haton et al., "Problems in the Design and Use of a Connected Speech Understanding System", 1982, IEEE, pp. 1616-1620.
  • D. M. Choy et al., "Speech Compression by Phoneme Recognition", 1982, IBM TDB, vol. 25, No. 6, pp. 2884-2886.
  • Bahl, et al., "Interpolation of Estimators Derived from Sparse Data", 1981, IBM TDB, vol. 24, No. 4, pp. 2038-2041.
  • Bahl, et al., "Faster Acoustic Match Computation", 1980, IBM TDB, vol. 23, No. 4, pp. 1718-1719.
  • Das, et al., "System for Temporal Registration of Quasi-Phonemic Utterance Representations", Dec., 1980, IBM TDB, vol. 23, No. 7A, pp. 3047-3050.
  • Bakis et al., "Continuous Speech Recognition Via Centisecond Acoustic States", Apr. 1976, Research Report, pp. 1-8.
  • Bakis et al., "Spoken Word Spotting Via Centisecond Acoustic States", Mar., 1976, IBM TDB, vol. 18, No. 10, pp. 3479-3481.
  • Itakura, "Minimum Prediction Residual Principle Applied to Speech Recognition" Feb., 1975, IEEE, pp. 145-150.

