Title: US5806021: Automatic segmentation of continuous text using statistical approaches

Country: US United States of America

Pages: 10

Inventor: Chen, Chengjun Julian; White Plains, NY
Liu, Fu-Hua; Elmsford, NY
Picheny, Michael Alan; White Plains, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1998-09-08 / 1996-09-04

Application Number: US1996000700823

IPC Code: Advanced: G06F 17/27;
IPC-7: G06F 17/20; G06F 17/27;

ECLA Code: G06F17/27;

U.S. Class: Current: 704/009; 704/001; 704/257; 715/202; 715/243; 715/257; 715/260;
Original: 704/009; 704/001; 704/257; 707/531;

Field of Search: 704/001,9,10,241,242,254,255,256,257 707/530,531,534,535,536

Priority Number:
1996-09-04  US1996000700823
1995-10-30  US1995000008120P

Abstract: An automatic segmenter for continuous text segments such text in a rapid, consistent and semantically accurate manner. Two statistical methods for segmentation of continuous text are used. The first method, called "forward-backward matching", is easy and fast but can produce occasional errors in long phrases. The second method, called "statistical stack search segmenter", utilizes statistical language models to generate more accurate segmentation output at an expense of two times more execution time than the "forward-backward matching" method. In some applications where speed is a major concern, "forward-backward matching" can be used, while in other applications where highly accurate output is desired, "statistical stack search segmenter" is ideal.

Attorney, Agent or Firm: Whitham, Curtis & Whitham ; Tassinari, Robert P. ;

Primary / Asst. Examiners: Hudspeth, David R.; Lestina, Matthew J.

Family: None

First Claim (of 9):
Having thus described our invention, what we claim as new and desire to secure by letters patent is as follows:
1. A computer implemented method of segmenting continuous text comprising the steps of:
  • a) determining a phrase from a string of characters in a first direction;
  • b) determining from a beginning of the phrase a longest possible word beginning at the beginning of the phrase;
  • c) repeating steps a) and b) until the phrase is completed;
  • d) repeating steps a), b) and c) in a direction opposite said first direction, beginning with the end of the phrase and working backwards; and
  • e) choosing a result having a higher likelihood than other possible results.
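
The steps of the claim can be illustrated with a minimal sketch. This is not the patented implementation: the toy lexicon, its frequency values, the `UNKNOWN_PROB` floor, and the use of a summed unigram log-probability as the "likelihood" in step e) are all assumptions made for illustration, and the function names are hypothetical. The sketch also assumes the phrase of step a) has already been isolated (for example, at punctuation).

```python
import math

# Hypothetical toy lexicon: word -> relative frequency (illustrative values only).
LEXICON = {"研究": 0.02, "研究生": 0.005, "生命": 0.02, "命": 0.001, "起源": 0.01}
MAX_WORD_LEN = max(len(w) for w in LEXICON)
UNKNOWN_PROB = 1e-8  # assumed floor probability for strings not in the lexicon


def forward_match(phrase):
    """Steps b) and c): take the longest lexicon word at each position, left to right."""
    words, i = [], 0
    while i < len(phrase):
        for size in range(min(MAX_WORD_LEN, len(phrase) - i), 0, -1):
            candidate = phrase[i:i + size]
            # A single character is always accepted so the scan cannot stall.
            if size == 1 or candidate in LEXICON:
                words.append(candidate)
                i += size
                break
    return words


def backward_match(phrase):
    """Step d): the same greedy scan, starting at the end and working backwards."""
    words, j = [], len(phrase)
    while j > 0:
        for size in range(min(MAX_WORD_LEN, j), 0, -1):
            candidate = phrase[j - size:j]
            if size == 1 or candidate in LEXICON:
                words.append(candidate)
                j -= size
                break
    words.reverse()  # words were collected from the end, so restore reading order
    return words


def log_likelihood(words):
    """Assumed scoring: sum of unigram log-probabilities of the words."""
    return sum(math.log(LEXICON.get(w, UNKNOWN_PROB)) for w in words)


def segment(phrase):
    """Step e): keep whichever of the two segmentations scores higher."""
    return max([forward_match(phrase), backward_match(phrase)], key=log_likelihood)
```

With this lexicon, `forward_match("研究生命起源")` greedily takes 研究生 first and is left with the single character 命, while `backward_match` yields 研究 / 生命 / 起源. The backward result scores higher under the assumed unigram model, so `segment` returns it. When the two scans agree, either result is returned unchanged.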

Forward References: 58 U.S. patents reference this one

U.S. References (9 backward references):

Patent | Pub. Date | Inventor | Assignee | Title
US4069393 | 1978-01 | Martin et al. | Threshold Technology, Inc. | Word recognition apparatus and method
US4750122 | 1988-06 | Kaji et al. | Hitachi, Ltd. | Method for segmenting a text into words
US5029084 | 1991-07 | Morohasi et al. | International Business Machines Corporation | Japanese language sentence dividing method and apparatus
US5199077 | 1993-03 | Wilcox et al. | Xerox Corporation | Wordspotting for voice editing and indexing
US5425129 | 1995-06 | Garman et al. | International Business Machines Corporation | Method for word spotting in continuous speech
US5448474 | 1995-09 | Zamora | International Business Machines Corporation | Method for isolation of Chinese words from connected Chinese text
US5655058 | 1997-08 | Balasubramanian et al. | Xerox Corporation | Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications
US5657424 | 1997-08 | Farrell et al. | Dictaphone Corporation | Isolated word recognition using decision tree classifiers and time-indexed feature vectors
US5706397 | 1998-01 | Chow | Apple Computer, Inc. | Speech recognition system with multi-level pruning for acoustic matching

Foreign References: None

Other Abstract Info: DERABS G1998-506231
