 The Delphion Integrated View

Title: US6073091: Apparatus and method for forming a filtered inflected language model for automatic speech recognition


Country: US United States of America

Pages: 11
Inventor: Kanevsky, Dimitri; Ossining, NY
Monkowski, Michael Daniel; New Windsor, NY
Sedivy, Jan; Prague, Czech Republic

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2000-06-06 / 1997-08-06

Application Number: US1997000906812

IPC Code: Advanced: G10L 15/18;
IPC-7: G06F 17/28; G10L 5/06; G10L 9/00;

ECLA Code: G10L15/197;

U.S. Class: Current: 704/009; 704/257; 704/E15.023;
Original: 704/009; 704/257;

Field of Search: 704/001,9,240,243,255,256,257 707/530,531

Priority Number:
1997-08-06  US1997000906812

Abstract:     A method of forming a language model for a language having a selected vocabulary of word forms comprises: (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence; (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of frequency ranges; (c) respectively assigning maps to the subsets; (d) filtering a textual corpora using the maps assigned to the subsets in order to generate indexed integers; (e) determining n-gram statistics for the indexed integers; and (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model.

Attorney, Agent or Firm: F. Chau & Associates, LLP

Primary / Asst. Examiners: Isen, Forester W.; Edouard, Patrick N.

Maintenance Status: E1 Expired


Family: None

First Claim (1 of 27):
What is claimed is: 1. A method of forming a language model for a language having a selected vocabulary of word forms, the method comprising the steps of:
  • (a) mapping the word forms into integer vectors in accordance with frequencies of word form occurrence;
  • (b) partitioning the integer vectors into subsets, the subsets respectively having ranges of frequencies of word form occurrence associated therewith, the subsets being arranged in a descending order of ranges;
  • (c) respectively assigning maps to the subsets;
  • (d) filtering a textual corpora using the maps assigned to the subsets in order to generate indexed integers;
  • (e) determining n-gram statistics for the indexed integers;
  • (f) estimating n-gram language model probabilities from the n-gram statistics to form the language model; and
  • (g) determining a probability of a word sequence uttered by a speaker, using said language model.
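
Steps (a) through (g) of the claim can be illustrated with a minimal Python sketch. This record does not give the patent's detailed description, so several choices here are assumptions made only for illustration: the integer vector is taken to be a (stem, ending) pair produced by a naive fixed-length suffix split (`split_form`), the subsets are three frequency bands set by the hypothetical thresholds `high` and `mid`, the per-subset maps keep the full form, the stem only, or a single shared class, and the n-grams are bigrams with add-one smoothing. None of these details should be read as the patented method itself.

```python
from collections import Counter

START = -1  # hypothetical sentence-start index


def split_form(word, suffix_len=2):
    # Naive stand-in for morphological analysis (an assumption):
    # treat the last `suffix_len` letters as the ending.
    if len(word) <= suffix_len:
        return word, ""
    return word[:-suffix_len], word[-suffix_len:]


def build_model(corpus, high=3, mid=2):
    freq = Counter(w for sent in corpus for w in sent)

    # (a) Map word forms to integer vectors (stem_id, ending_id),
    #     numbering stems and endings by descending frequency.
    stem_freq, end_freq = Counter(), Counter()
    for w, c in freq.items():
        stem, end = split_form(w)
        stem_freq[stem] += c
        end_freq[end] += c
    stem_id = {s: i for i, (s, _) in enumerate(stem_freq.most_common())}
    end_id = {e: i for i, (e, _) in enumerate(end_freq.most_common())}
    vector = {}
    for w in freq:
        stem, end = split_form(w)
        vector[w] = (stem_id[stem], end_id[end])

    # (b) + (c) Partition by frequency range, in descending order, and
    #     give each subset its own map (thresholds are illustrative).
    def key_for(word):
        c = freq.get(word, 0)
        if c >= high:                  # frequent: keep the full vector
            return ("form",) + vector[word]
        if c >= mid:                   # mid-range: keep the stem only
            return ("stem", vector[word][0])
        return ("rare",)               # rare or unseen: one shared class

    index = {("rare",): 0}             # reserve 0 for the shared class

    def to_int(word):
        # Assign consecutive integers to distinct keys on first use.
        return index.setdefault(key_for(word), len(index))

    # (d) Filter the corpus into sequences of indexed integers.
    filtered = [[to_int(w) for w in sent] for sent in corpus]

    # (e) Collect bigram statistics over the indexed integers.
    bigrams, contexts = Counter(), Counter()
    for sent in filtered:
        for prev, cur in zip([START] + sent, sent):
            bigrams[prev, cur] += 1
            contexts[prev] += 1

    # (f) Estimate bigram probabilities with add-one smoothing.
    vocab = len(index)

    def prob(prev, cur):
        return (bigrams[prev, cur] + 1) / (contexts[prev] + vocab)

    # (g) Score a word sequence with the resulting model.
    def sequence_prob(words):
        ids = [to_int(w) for w in words]
        p = 1.0
        for prev, cur in zip([START] + ids, ids):
            p *= prob(prev, cur)
        return p

    return to_int, sequence_prob
```

Calling `build_model` on a list of tokenized sentences returns the word-to-index filter and a scoring function. In this sketch, mid-frequency inflections that share a stem collapse to one index, which is one plausible way the filtering could shrink the n-gram table for a highly inflected language.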


Forward References: 22 U.S. patents reference this one

       
U.S. References: 22 forward references; 5 backward references, listed below.

Patent | Pub. Date | Pages | Inventor | Assignee | Title
US5467425 | 1995-11 | 21 | Lau et al. | International Business Machines Corporation | Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models
US5490061 | 1996-02 | 19 | Tolin et al. | Toltran, Ltd. | Improved translation system utilizing a morphological stripping process to reduce words to their root configuration to produce reduction of database size
US5680511 | 1997-10 | 22 | Baker et al. | Dragon Systems, Inc. | Systems and methods for word recognition
US5828999 | 1998-10 | 11 | Bellegarda et al. | Apple Computer, Inc. | Method and system for deriving a large-span semantic language model for large-vocabulary recognition systems
US5835888 | 1998-11 | 14 | Kanevsky et al. | International Business Machines Corporation | Statistical language model for inflected languages
       
Foreign References: None

Other Abstract Info: DERABS G2000-463859

Other References:
  • L. Bahl, F. Jelinek, R. Mercer, "A Maximum Likelihood Approach to Continuous Speech Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 2, Mar. 1983, pp. 179-190; Section IV, Language Modeling, on p. 181. (12 pages)



    Copyright © 1997-2014 Thomson Reuters