
 The Delphion Integrated View

Title: US5467425: Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models

Country: US United States of America

21 pages

Inventor: Lau, Raymond; Cambridge, MA
Rosenfeld, Ronald; Pittsburgh, PA
Roukos, Salim; Scarsdale, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1995-11-14 / 1993-02-26

Application Number: US1993000023543

IPC Code: Advanced: G06F 17/28; G10L 15/06; G10L 15/10; G10L 15/18; G10L 15/28;
IPC-7: G10L 9/00;

ECLA Code: G10L15/197; S10L15/183; S10L15/197;

U.S. Class: Current: 704/243; 704/240; 704/255; 704/E15.023;
Original: 395/002.52; 395/002.49; 395/002.64;

Field of Search: 381/ 395/2.4,2.49,2.45,2.52-2.54,2.59,2.64-2.66

Priority Number:
1993-02-26  US1993000023543

Abstract:     The present invention is an n-gram language modeler that significantly reduces the memory storage requirement and convergence time of language modelling systems and methods. The invention assigns each n-gram to one of "n" non-intersecting classes. A count is determined for each n-gram, representing the number of times that n-gram occurred in the training data. The n-grams are separated into classes and complement counts are determined. Using these counts and complement counts, factors are determined, one factor for each class, via an iterative scaling algorithm. The language model probability, i.e., the probability that a word occurs given the occurrence of the previous two words, is then computed from these factors.
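As a toy illustration of the separating-and-counting step the abstract describes (this is not the patented method itself; the corpus, count threshold, and class-key rule below are invented purely for illustration), the idea might be sketched as:

```python
from collections import Counter

def ngrams(words, n=3):
    """All n-word windows of a token list."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

# Invented toy corpus; real training data would be far larger.
corpus = "the cat sat on the mat the cat sat on the rug".split()
counts = Counter(ngrams(corpus))           # count of each trigram

# Assign each trigram to exactly one (non-intersecting) class. A class
# keeps the predicted (last) word plus x previous words, with x shrinking
# from n-1 toward 0; the count threshold used to pick x is illustrative.
THRESHOLD = 1
classes = {}
for gram, count in counts.items():
    x = 2 if count > THRESHOLD else 1      # illustrative backoff rule
    history = gram[len(gram) - 1 - x:-1]   # the x history words kept
    key = (gram[-1], history)              # (predicted word, kept history)
    classes.setdefault(key, []).append(gram)
```

Each trigram lands in exactly one class, so the classes partition the n-grams as the abstract requires; the per-class counts would then feed the iterative scaling step.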

Attorney, Agent or Firm: Sterne, Kessler, Goldstein & Fox; Tasinari, Robert

Primary / Asst. Examiners: MacDonald, Allen R.; Doerrler, Michelle

Maintenance Status: E2 Expired
CC Certificate of Correction issued


Family: 4 known family members

First Claim (of 21 claims):
What is claimed is:     1. A computer based language modelling system receiving data in the form of training text divided into a series of n-grams, each n-gram comprising a series of "n" words, each n-gram having an associated count, the history of an n-gram being represented by the initial n-1 words of the n-gram, comprising:
  • a language modelling means for determining a conditional probability of a predicted word given the previous (n-1) words, comprising:
    • a memory means for storing the data;
    • a separating means coupled to said memory means for examining each word within each n-gram and classifying each n-gram into one of one or more classes based upon one or more words in a given n-gram, each class having one or more similar n-grams associated with said class, said similar n-grams having the same predicted word and x previous words, where x varies from (n-1) to zero, to associate each n-gram with exactly one of said one or more classes, each class is identified with one of one or more sets based upon the value of x used when determining the class of the n-gram;
    • a factor means coupled to the output of said separating means and to said memory means for determining a factor for each of said one or more classes, said factor representing the relative strength of predicting said predicted word given the previous (n-1) words, the value of each factor being approximately equal to the ratio of the sum of the counts of each n-gram associated with a given class over the sum of the counts of all (n-1)-grams which when followed by said predicted word would belong to said given class; and
    • a conditional probability means coupled to the output of said factor means for determining said conditional probability of the occurrence of said predicted word given that a particular sequence of (n-1) previous words have occurred using said factors, said conditional probability approximately equal to the ratio of a first factor, said first factor associated with the class that a given n-gram is associated with, said given n-gram equal to said predicted word and the history of said predicted word, the history equal to a particular sequence of (n-1) previous words, over the sum of one or more factors, said one or more factors associated with all of the classes of n-grams obtained by using said particular sequence of (n-1) words followed by any word of the vocabulary.
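The factor and conditional-probability approximations in the claim can be sketched on a toy bigram case (n = 2, one class per distinct bigram). Everything below is a simplifying assumption for illustration: the corpus is invented, and the single count ratio stands in for the factors the patent obtains by iterative scaling.

```python
from collections import Counter

# Invented toy data; a real model would use a large corpus and n = 3.
corpus = "a b a b a c".split()
bigram_counts = Counter(zip(corpus, corpus[1:]))  # count per (history, word)
history_counts = Counter(corpus[:-1])             # count per 1-word history
vocab = set(corpus)

def factor(history, word):
    """Claim's approximation: sum of counts in the class over the sum of
    counts of histories that, followed by the predicted word, would
    belong to the class."""
    return bigram_counts[(history, word)] / history_counts[history]

def cond_prob(history, word):
    """Factor of the observed class, normalized by the sum of factors of
    all classes reachable from the same history with any vocab word."""
    return factor(history, word) / sum(factor(history, w) for w in vocab)
```

With this construction the factors normalize into a proper distribution over the vocabulary for each history, which is the quantity the claim's conditional probability means produces.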


Forward References: 76 U.S. patents reference this one

U.S. References: Forward references (76) | Backward references (3)

Patent     Pub. Date  Inventor      Assignee                                     Title
US4817156  1989-03    Bahl et al.   International Business Machines Corporation  Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker
US4831550  1989-05    Katz          International Business Machines Corporation  Apparatus and method for estimating, from sparse data, the probability that a particular one of a set of events is the next event in a string of events
US5293584  1994-03    Brown et al.  International Business Machines Corporation  Speech recognition system for natural language translation
Foreign References: None

Other Abstract Info: DERABS G1995-403732; DERABS G1997-414037

Other References:
  • Jelinek et al., "Classifying Words for Improved Statistical Language Models", ICASSP '90, pp. 621-624, 1990.
  • Ney et al., "On Smoothing Techniques for Bigram-Based Natural Language Modelling", ICASSP '91, pp. 825-828, 1991.
  • Paeseler et al., "Continuous-Speech Recognition Using a Stochastic Language Model", ICASSP '89, pp. 719-722, 1989.
  • Lalit R. Bahl, Frederick Jelinek, and Robert L. Mercer, "A Maximum Likelihood Approach to Continuous Speech Recognition", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, no. 2, Mar. 1983, pp. 179-190. Cited by 42 patents.

Copyright © 1997-2014 Thomson Reuters