Title: US5423032: Method for extracting multi-word technical terms from text

Country: US United States of America

11 pages

Inventor: Byrd, Roy J.; Ossining, NY
Justeson, John S.; Poughkeepsie, NY
Katz, Slava M.; Westport, CT

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1995-06-06 / 1992-01-03

Application Number: US1992000816908

IPC Code: Advanced: G06F 17/30;
IPC-7: G06F 17/30;

ECLA Code: G06F17/30T1E;

U.S. Class: Current: 704/001; 706/934; 707/754; 707/917; 707/999.005; 707/E17.084;
Original: 395/600; 395/934; 364/943.41; 364/943.42; 364/974;

Field of Search: 395/600,700,934,51,144 364/943.41,974,282.1,943.42

Priority Number:
1992-01-03  US1992000816908
1991-10-31  US1991000785641

Abstract: A method and apparatus for extracting multi-word technical terms from a text file in a computer system. Word strings are selected from the text that have at least two words, at most a specified maximum number of words, none of a special set of selected tokens, and only selected characters. Word strings that occur fewer than a specified minimum number of times in the text file are deleted. The remaining strings form a set of word strings very likely to be multi-word technical terms. The quality of this set can be improved further by deleting word strings that do not satisfy certain grammatical constraints.
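The frequency step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `keep_frequent`, the character set in `ALLOWED_WORD`, the case folding, and the default threshold of 2 are all assumptions made for illustration; the abstract only says that the characters and the minimum count are "selected" or "specified".

```python
import re
from collections import Counter

# Assumption: the "selected characters" are ASCII letters, digits,
# hyphens and apostrophes. The abstract does not list them.
ALLOWED_WORD = re.compile(r"[A-Za-z0-9'-]+")


def keep_frequent(word_strings, min_count=2):
    """Count candidate word strings and drop the rare or ill-formed ones.

    word_strings: iterable of word tuples, one entry per occurrence in the text.
    Returns a dict mapping each surviving string to its occurrence count.
    """
    counts = Counter(
        tuple(w.lower() for w in ws)  # assumption: counting ignores case
        for ws in word_strings
        if len(ws) >= 2 and all(ALLOWED_WORD.fullmatch(w) for w in ws)
    )
    # Delete strings that occur fewer than min_count times.
    return {" ".join(ws): n for ws, n in counts.items() if n >= min_count}


occurrences = [
    ("finite", "state", "automaton"),
    ("Finite", "state", "automaton"),
    ("text", "file"),
    ("state", "automaton"),
    ("state", "automaton"),
    ("50%", "faster"),
    ("50%", "faster"),
]
print(keep_frequent(occurrences))
# prints {'finite state automaton': 2, 'state automaton': 2}
```

Here "text file" is dropped because it occurs only once, and "50% faster" is dropped because "%" is outside the assumed character set.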

Attorney, Agent or Firm: Schlemmer, Jr., Roy ; Cameron, Douglas W. ; Drumheller, Ronald L. ;

Primary / Asst. Examiners: Kriess, Kevin A.; Butler, Dennis M.

Maintenance Status: E1 Expired

Related Applications:
Application Number Filed Patent Pub. Date  Title
US1991000785641 1991-10-31       

Parent Case:

    This application is a continuation-in-part of co-pending application Ser. No. 07/785,641 filed Oct. 31, 1991, the priority of which is retained.

Family: None

First Claim (of 28 claims):
Having thus described our invention, what we claim as new and desire to secure by Letters Patents is:

1. Programmed computer apparatus for extracting a list of candidate multi-word technical terms from an input text file, a multi-word technical term being a string of at least two words having a particular meaning in some technical field, said apparatus comprising:
  • means for storing a stoplist of tokens which are assumed to not occur in multi-word technical terms, a token being a word, character or string of characters delimited by blanks and/or punctuation;
  • means for storing a maximum length parameter specifying a maximum number of tokens in any candidate multi-word technical term to be extracted;
  • means responsive to the stored stoplist for extracting text fragments from an input text file by identifying delimiting tokens in the input text file, including means for identifying as a delimiting token each token in the input text file which is the same as a token in the stored stoplist, the identified delimiting tokens defining text fragments therebetween;
  • means for deriving from the extracted text fragments all possible subsequences of tokens having a length of at least two tokens and not more than a maximum number of tokens specified by the stored maximum length parameter;
  • means for testing each of the derived subsequences against at least one filtering condition; and
  • means for creating a sublist of the derived subsequences which pass the at least one filtering condition, the created sublist being the list of candidate multi-word technical terms.
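
The elements of claim 1 can be sketched in Python as follows. This is a hedged illustration, not the claimed apparatus. The stoplist contents, the tokenizer regex, the treatment of punctuation as delimiting tokens, the reading of "subsequences" as contiguous runs, and the names `extract_fragments`, `subsequences` and `candidate_terms` are all assumptions. The filtering condition is passed in as a hypothetical `passes` predicate because the claim leaves it open.

```python
import re

# Hypothetical stoplist; the patent's actual stoplist is not reproduced here.
STOPLIST = {"a", "an", "the", "of", "in", "is", "and", "to", "for", "from"}

# Assumption: a token is a word (optionally hyphenated or with an
# apostrophe) or a single punctuation character.
TOKEN = re.compile(r"\w+(?:[-']\w+)*|[^\w\s]")


def extract_fragments(text, stoplist=STOPLIST):
    """Split text into fragments lying between delimiting tokens.

    Delimiting tokens are stoplist entries and, by assumption, punctuation.
    """
    fragments, current = [], []
    for tok in TOKEN.findall(text):
        tok = tok.lower()
        if tok in stoplist or not tok[0].isalnum():
            if current:
                fragments.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        fragments.append(current)
    return fragments


def subsequences(fragment, max_len):
    """All contiguous token runs of length 2..max_len inside one fragment."""
    return [
        tuple(fragment[i:i + n])
        for n in range(2, max_len + 1)
        for i in range(len(fragment) - n + 1)
    ]


def candidate_terms(text, max_len=3, passes=lambda seq: True):
    """Derive subsequences from every fragment and keep those passing the filter."""
    out = []
    for frag in extract_fragments(text):
        out.extend(s for s in subsequences(frag, max_len) if passes(s))
    return out


print(candidate_terms("The finite state automaton is a model of computation."))
# prints [('finite', 'state'), ('state', 'automaton'), ('finite', 'state', 'automaton')]
```

In the example, "the", "is", "a", "of" and the final period act as delimiters, so the only fragment with at least two tokens is "finite state automaton". A minimum-frequency test or a grammatical test could be supplied through `passes`.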

Forward References: 64 U.S. patents reference this one

U.S. References: 8 backward references (listed here); 64 forward references

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US4241402 | 1980-12 | Mayper, Jr. et al. | Operating Systems, Inc. | Finite state automaton with multiple state types | 32
US4420816 | 1983-12 | Yoshida | Sharp Kabushiki Kaisha | Electronic word retrieval device for searching and displaying one of different forms of a word entered | 5
US4823306 | 1989-04 | Barbic et al. | International Business Machines Corporation | Text search system | 10
US4916655 | 1990-04 | Ohsone et al. | Hitachi, Ltd. | Method and apparatus for retrieval of a search string | 22
US4972349 | 1990-11 | Kleinberger | | Information retrieval system and method | 28
US5005127 | 1991-04 | Kugimiya et al. | Sharp Kabushiki Kaisha | System including means to translate only selected portions of an input sentence and means to translate selected portions according to distinct rules | 16
US5228133 | 1993-07 | Oppedahl | | Method to perform text search in application programs in computer by selecting a character and scanning the text string to/from the selected character offset position | 10
US5303361 | 1994-04 | Colwell et al. | Lotus Development Corporation | Search and retrieval system | 20

Foreign References: None

Other Abstract Info: DERABS G95-214999 DERG95-214999

Other References:
  • "APS Text Search and Retrieval Classroom Manual", The Planning Research Corporation, Jun. 1989, pp. 2-5 to 2-39, 3-4 and B2-5.
