Title: US5835888: Statistical language model for inflected languages


Country: US United States of America

Pages: 14

 
Inventor: Kanevsky, Dimitri; Ossining, NY
Roukos, Salim Estephan; Scarsdale, NY
Sedivy, Jan; Praha, Czech Republic

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1998-11-10 / 1996-06-10

Application Number: US1996000662726

IPC Code: Advanced: G10L 15/18; G10L 15/14;
IPC-7: G06F 17/28;

ECLA Code: G10L15/18; S10L15/14; S10L15/19;

U.S. Class: Current: 704/009; 704/010; 704/243; 704/257; 704/E15.018;
Original: 704/009; 704/010; 704/243; 704/257;

Field of Search: 704/009,1,10,243,241,244,255,257,256,240 707/531,532,533,534

Priority Number:
1996-06-10  US1996000662726

Abstract:     A statistical language model for inflected languages, having very large vocabularies, is generated by splitting words into stems, prefixes and endings, and deriving trigrams for the stems, ending and prefixes. The statistical dependence of endings and prefixes from each stem is also obtained, and the resulting language model is a weighted sum of these scores.
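The splitting step named in the abstract can be illustrated with a minimal sketch. It assumes a longest-match rule over small affix lists; the lists `PREFIXES` and `ENDINGS`, the `min_stem` parameter, and the function names are hypothetical and are not taken from the patent, which does not describe its splitting procedure in this record. The example words are toy strings.

```python
from typing import NamedTuple

# Hypothetical toy affix lists (assumption). A real system would derive
# them from a morphological dictionary for the target language.
PREFIXES = ("ne", "po", "vy")
ENDINGS = ("ami", "ovi", "ou", "a", "u", "y")


class Split(NamedTuple):
    prefix: str
    stem: str
    ending: str


def split_word(word: str, min_stem: int = 2) -> Split:
    """Split a word into (prefix, stem, ending) by longest affix match."""
    prefix = ""
    for p in sorted(PREFIXES, key=len, reverse=True):
        # Only strip a prefix if enough characters remain for a stem.
        if word.startswith(p) and len(word) - len(p) >= min_stem:
            prefix = p
            break
    rest = word[len(prefix):]
    ending = ""
    for e in sorted(ENDINGS, key=len, reverse=True):
        if rest.endswith(e) and len(rest) - len(e) >= min_stem:
            ending = e
            break
    stem = rest[:len(rest) - len(ending)]
    return Split(prefix, stem, ending)


def split_corpus(words):
    """Return three parallel streams (prefixes, stems, endings), in corpus order."""
    splits = [split_word(w) for w in words]
    return (
        [s.prefix for s in splits],
        [s.stem for s in splits],
        [s.ending for s in splits],
    )
```

Because the three streams stay aligned with the original word order, n-gram statistics can later be collected over each stream separately.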

Primary / Asst. Examiners: Thomas, Joseph;

Maintenance Status: E2 Expired

Family: 3 known family members

First Claim (of 21):
We claim: 1. A method for generating a language model, comprising:
  • providing to a classifier a corpus of words having a given sequence;
  • performing a plurality of transformations on the words in the corpus, while preserving the given sequence, to generate a plurality of classes, each of the plurality of classes having a different stream of word components associated therewith, said different stream of word components being generated in accordance with one of the transformations;
  • estimating statistical data representing each of the classes; and
  • weighting and combining the statistical data to form the language model.
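The last two steps of the claim (estimating statistics per class, then weighting and combining them) might look like the following sketch. It assumes unsmoothed maximum-likelihood trigram estimates and fixed interpolation weights; the function names, the `<s>` padding symbol, the toy streams, and the weight values are all hypothetical. The record does not say how the patent estimates or weights the statistics, so this is one possible reading, not the claimed method itself.

```python
from collections import Counter


def trigram_model(stream):
    """Build an unsmoothed maximum-likelihood trigram estimator for one stream."""
    padded = ["<s>", "<s>"] + list(stream)
    trigrams = list(zip(padded, padded[1:], padded[2:]))
    tri_counts = Counter(trigrams)
    # Count each two-token context once per trigram it starts.
    ctx_counts = Counter((a, b) for a, b, _ in trigrams)

    def prob(a, b, c):
        ctx = ctx_counts[(a, b)]
        return tri_counts[(a, b, c)] / ctx if ctx else 0.0

    return prob


def combine(scores, weights):
    """Weighted sum of per-stream scores; weights are assumed to sum to 1."""
    if len(scores) != len(weights):
        raise ValueError("one weight per score is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(w * s for w, s in zip(weights, scores))


# Toy streams standing in for two classes of word components (assumption).
stem_stream = ["a", "b", "c", "a", "b", "d"]
ending_stream = ["x", "y", "x", "x", "y", "x"]

stem_prob = trigram_model(stem_stream)
ending_prob = trigram_model(ending_stream)

score = combine(
    [stem_prob("a", "b", "c"), ending_prob("x", "y", "x")],
    weights=[0.75, 0.25],
)
```

In practice the weights would be tuned on held-out data and the estimates smoothed to handle unseen trigrams; both are omitted here to keep the sketch short.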


Forward References: 43 U.S. patents reference this one

U.S. References (6 backward references):

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US4342085 | 1982-07 | Glickman et al. | International Business Machines Corporation | Stem processing for data reduction in a dictionary storage file | 12
US4831550 | 1989-05 | Katz | International Business Machines Corporation | Apparatus and method for estimating, from sparse data, the probability that a particular one of a set of events is the next event in a string of events | 18
US4882759 | 1989-11 | Bahl et al. | International Business Machines Corporation | Synthesizing word baseforms used in speech recognition | 23
US5467425 | 1995-11 | Lau et al. | International Business Machines Corporation | Building scalable N-gram language models using maximum likelihood maximum entropy N-gram models | 21
US5490061 | 1996-02 | Tolin et al. | Toltran, Ltd. | Improved translation system utilizing a morphological stripping process to reduce words to their root configuration to produce reduction of database size | 19
US5680511 | 1997-10 | Baker et al. | Dragon Systems, Inc. | Systems and methods for word recognition | 22
       
Foreign References:
Publication | Date | IPC Code | Assignee | Title | Pages
EP0282272A1 | 1988-09 | G10L 5/06 | FUJITSU LIMITED | Voice recognition system | 26
EP0376501A2 | 1990-07 | G10L 5/06 | DRAGON SYSTEMS INC | Speech recognition system | 47
JP1994406035902A | 1994-02 | - | - | - | -


Other Abstract Info: DERABS G98-034103

