 The Delphion Integrated View

Title: US6006223: Mapping words, phrases using sequential-pattern to find user specific trends in a text database


Country: US United States of America

Pages: 14

Inventor: Agrawal, Rakesh; San Jose, CA
Srikant, Ramakrishnan; San Jose, CA
Lent, Brian Scott; Union City, WA

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1999-12-21 / 1997-08-12

Application Number: US1997000909911

IPC Code: Advanced: G06F 17/30;
IPC-7: G06F 17/30;

ECLA Code: G06F17/30T1E;

U.S. Class: Current: 707/005; 707/003; 707/006; 707/102; 707/203; 707/E17.084; 715/236;
Original: 707/005; 707/003; 707/006; 707/102; 707/203; 707/511; 707/532;

Field of Search: 707/001,2,3,6,7,8,9,10,100,203,101,102,103,201,511,531,532,535 705/010 704/001,8,9,10

Priority Number:
1997-08-12  US1997000909911

Abstract:     A method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends. The method passes over a desired database using a dynamically generated shape query. Documents within the database are selected based on specific classifications and user defined partitions. Once a partition is specified, transaction IDs are assigned to the words in the text documents depending on their placement within each document. The transaction IDs encode both the position of each word within the document as well as representing sentence, paragraph, and section breaks, and are represented in one embodiment as long integers with the sentence boundaries. A maximum and minimum gap between words in the phrases and the minimum support all phrases must meet for the selected time period may be specified. A generalized sequential pattern method is used to generate those phrases in each partition that meet the minimum support threshold. The shape query engine takes the set of phrases for the partition of interest and selects those that match a given shape query. A query may take the form of requesting a trend such as "recent upwards trend", "recent spikes in usage", "downward trends", and "resurgence of usage". Once the phrases matching the shape query are found, they are presented to the user.
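
The transaction-ID scheme described in the abstract can be illustrated with a minimal sketch. The jump sizes (`SENTENCE_JUMP`, `PARAGRAPH_JUMP`), the function names, and the inclusive gap semantics are assumptions made for illustration; the abstract does not state the values or the exact comparison rule used in the patent.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical jump sizes. The abstract says only that the IDs encode word
# position plus sentence, paragraph, and section breaks, not which values.
SENTENCE_JUMP = 1_000
PARAGRAPH_JUMP = 100_000

Document = List[List[List[str]]]  # paragraphs -> sentences -> words


def assign_transaction_ids(doc: Document) -> List[Tuple[int, str]]:
    """Give each word an increasing integer ID that jumps at boundaries."""
    seq: List[Tuple[int, str]] = []
    tid = 0
    for paragraph in doc:
        for sentence in paragraph:
            for word in sentence:
                seq.append((tid, word.lower()))
                tid += 1
            tid += SENTENCE_JUMP      # extra distance between sentences
        tid += PARAGRAPH_JUMP         # extra distance between paragraphs
    return seq


def contains_phrase(seq: Sequence[Tuple[int, str]],
                    phrase: Sequence[str],
                    min_gap: int = 1,
                    max_gap: int = 1) -> bool:
    """True if the words of `phrase` occur in order with every ID difference
    between consecutive words in [min_gap, max_gap] (assumed inclusive).
    min_gap should be at least 1 so that matches move forward in the text."""

    def match(idx: int, prev_tid: Optional[int]) -> bool:
        if idx == len(phrase):
            return True
        for tid, word in seq:
            if word != phrase[idx]:
                continue
            if prev_tid is None or min_gap <= tid - prev_tid <= max_gap:
                if match(idx + 1, tid):
                    return True
        return False

    return match(0, None)


# One paragraph with two sentences.
doc = [[["Data", "mining", "is", "fun"], ["Text", "databases", "grow"]]]
seq = assign_transaction_ids(doc)
print(contains_phrase(seq, ["data", "mining"]))              # adjacent words
print(contains_phrase(seq, ["fun", "text"]))                 # crosses a sentence break
print(contains_phrase(seq, ["fun", "text"], max_gap=2_000))  # wider gap allowed
```

Because the sentence jump is larger than any small `max_gap`, a tight gap setting keeps phrases inside one sentence, while a larger one lets them span sentence boundaries.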

Attorney, Agent or Firm: Gray Cary Ware Freidenrich

Primary / Asst. Examiners: Amsbury, Wayne; Channavajjala, Srirama

Maintenance Status: CC Certificate of Correction issued

Family: None

First Claim (1 of 16):
What is claimed is:     1. A computer executed method for discovering trends in a database, comprising:
  • mapping words in a plurality of words to a data-sequence of data contained in a data field and identifiable by a position identifier, the data-sequence having transactions where a transaction includes a set of items, a word mapped to a single-item transaction in a data-sequence;
  • mapping phrases to a sequential-pattern of data contained in the data field and identifiable by a position identifier, the sequential-pattern of data having sets of items, a phrase mapped to a sequential-pattern having one item in each set of items;
  • generating a time stamp for each word of a plurality of words mapped in a data field, the time stamp specifying a data field location;
  • partitioning the database into discrete portions using the time stamp for each word;
  • determining a support value for a phrase, the support value representing a number of data-sequences in a selected data partition containing the phrase; and
  • outputting trends based upon the support values of the phrases, by:
      ◦ determining frequent phrases using the mapping of each phrase, a phrase being frequent if the presence of the phrase in the selected partition of the database exceeds a minimum required support value; and
      ◦ outputting trends using only the frequent phrases.
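
The partitioning, support-counting, and trend-output steps of the claim can be sketched roughly as follows. This is a simplified illustration, not the claimed method: it assumes each document carries a single period label (the claim time-stamps individual words), matches only contiguous phrases, and uses a hypothetical `upward_trend` check in place of a full shape query. All function names are invented for the example.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Phrase = Tuple[str, ...]


def contains(words: Sequence[str], phrase: Phrase) -> bool:
    """True if `phrase` occurs as a contiguous run of words."""
    n = len(phrase)
    return any(tuple(words[i:i + n]) == phrase
               for i in range(len(words) - n + 1))


def support_by_partition(docs: Sequence[Tuple[int, List[str]]],
                         phrases: Sequence[Phrase]) -> Dict[Phrase, Dict[int, int]]:
    """Count, per period, how many documents contain each phrase."""
    history: Dict[Phrase, Dict[int, int]] = {p: defaultdict(int) for p in phrases}
    for period, words in docs:
        for p in phrases:
            if contains(words, p):
                history[p][period] += 1
    return {p: dict(h) for p, h in history.items()}


def frequent_phrases(history: Dict[Phrase, Dict[int, int]],
                     period: int, min_support: int) -> List[Phrase]:
    """Phrases whose support in `period` exceeds `min_support` (strictly,
    following the claim's wording "exceeds")."""
    return [p for p, h in history.items() if h.get(period, 0) > min_support]


def upward_trend(history: Dict[Phrase, Dict[int, int]],
                 phrase: Phrase, periods: Sequence[int]) -> bool:
    """Hypothetical stand-in for a shape query: strictly rising support."""
    counts = [history[phrase].get(t, 0) for t in periods]
    return all(a < b for a, b in zip(counts, counts[1:]))


docs = [
    (1995, ["data", "mining", "tools"]),
    (1995, ["neural", "networks"]),
    (1996, ["data", "mining", "rules"]),
    (1996, ["fast", "data", "mining"]),
    (1997, ["data", "mining", "for", "text"]),
    (1997, ["text", "data", "mining"]),
    (1997, ["scalable", "data", "mining"]),
]
phrases = [("data", "mining"), ("neural", "networks")]
history = support_by_partition(docs, phrases)

# Trends are reported only for phrases that are frequent in the chosen period.
for p in frequent_phrases(history, period=1997, min_support=2):
    print(p, history[p], "upward" if upward_trend(history, p, [1995, 1996, 1997]) else "other")
```

Filtering to frequent phrases before examining trends keeps the set of support histories small, which is the role the minimum support value plays in the claim.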


Forward References: 46 U.S. patents reference this one

U.S. References: 6 backward references, 46 forward references

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US5615341 | 1997-03 | Agrawal et al. | International Business Machines Corporation | System and method for mining generalized association rules in databases | 22
US5675819 | 1997-10 | Schuetze | Xerox Corporation | Document information retrieval using global word co-occurrence patterns | 29
US5729730 | 1998-03 | Wlaschin et al. | Dex Information Systems, Inc. | Method and apparatus for improved information storage and retrieval system | 31
US5787386 | 1998-07 | Kaplan et al. | Xerox Corporation | Compact encoding of multi-lingual translation dictionaries | 21
US5790848 | 1998-08 | Wlaschin | Dex Information Systems, Inc. | Method and apparatus for data access and update in a shared file environment | 19
US5794178 | 1998-08 | Caid et al. | HNC Software, Inc. | Visualization of information using graphical representations of context vector based relationships and attributes | 45

Foreign References: None

Other Abstract Info: DERABS G2000-105190

Other References:
  • Zaiane, O. R., et al., "Discovering Web Access Patterns and Trends by Applying OLAP and Data Mining Technology on Web Logs", IEEE, Apr. 1998, pp. 19-29.
  • Chen, M.-S., et al., "Efficient Data Mining for Traversal Patterns", IEEE, Apr. 1998, pp. 209-221.
  • Klemettinen, M., et al., "A Data Mining Methodology and Its Application to Semi-Automatic Knowledge Acquisition", IEEE, Sep. 1997, pp. 670-677.
  • Feldman, R., et al., "Knowledge Discovery in Textual Databases (KDT)", Proc. of the 1st Int'l. Conf. on Knowledge Discovery in Databases (KDD) and Data Mining, 1995; Bar-Ilan University, Israel, Math and Computer Science Dept., KDD-95, pp. 112-117.
  • Feldman, R., et al., "Mining Associations in Text in the Presence of Background Knowledge", Proc. of the 2nd Int'l. Conf. on Knowledge Discovery on Databases and Data Mining, 1996; Technology Spotlight / Spatial, Temporal & Multimedia Data Mining, pp. 343-346 (undated).
  • Renouf, A., "Making Sense of Text: Automated Approaches to Meaning Extraction", 17th Int'l. On-Line Information Meeting Proceedings / Online Information 93, pp. 77-87, England, Dec. 1993.
  • Srikant, R., et al., "Mining Sequential Patterns: Generalizations and Performance Improvements", Proc. of the 5th Int'l. Conf. on Extending Database Technology (EDBT), 1996, pp. 3-17.
  • Deerwester, S., et al., "Indexing by Latent Semantic Analysis", Journal of the American Society for Information Science, 41(6):391-407, 1990.
  • Croft, W., et al., "The Use of Phrases and Structured Queries in Information Retrieval", 14th Int'l. ACM SIGIR Conf. on Research and Development on Information Retrieval, 1991; ACM 0-89791-448, pp. 32-45.
  • Agrawal, R., et al., "Fast Algorithms for Mining Association Rules", Proceedings of the 20th VLDB Conference, Santiago, Chile, pp. 487-499, 1994.
  • Agrawal, R., et al., "Active Data Mining", IBM Almaden Research Center, California, 6 pages (undated abstract).
  • Agrawal, R., et al., "Querying Shapes of Histories", Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 13 pages, 1995.
  • Agrawal, R., et al., "Mining Sequential Patterns", IEEE (1063-6382), pp. 3-14, 1995.

