Title: US5724573: Method and system for mining quantitative association rules in large relational tables


Country: US United States of America

Pages: 21
Inventor: Agrawal, Rakesh; San Jose, CA
Srikant, Ramakrishnan; San Jose, CA

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1998-03-03 / 1995-12-22

Application Number: US1995000577945

IPC Code: Advanced: G06F 17/30;
IPC-7: G06F 17/30;

ECLA Code: G06F17/30T;

U.S. Class: Current: 707/006; 707/001; 707/003; 707/E17.058;
Original: 395/606; 395/601; 395/603;

Field of Search: 395/601,603,606

Priority Number:
1995-12-22  US1995000577945

Abstract:     A method and apparatus are disclosed for mining quantitative association rules from a relational table of records. The method comprises the steps of: partitioning the values of selected quantitative attributes into intervals, combining adjacent attribute values and intervals into ranges, generating candidate itemsets, determining frequent itemsets, and outputting an association rule when the support for a frequent itemset bears a predetermined relationship to the support for a subset of the frequent itemset. Preferably, the partitioning step includes determining whether to partition and the number of partitions based on a partial incompleteness measure. The candidate generation includes discarding those itemsets not meeting a user-specified interest level and those having a subset which is not a frequent itemset. The frequent itemsets are determined using super-candidates that include information of the candidate itemsets. Preferably, each super-candidate has a data structure, such as a multi-dimensional tree or array, representing quantitative attributes common to the replaced candidate itemsets.
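The partitioning and range-combining steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the equi-depth partitioning strategy, the function names, the sample ages, and the choice to always keep each base interval even if it alone reaches the maximum support are all assumptions made for illustration. Support is counted in records, as in the first claim.

```python
from collections import Counter

def equi_depth_intervals(values, num_intervals):
    """Partition values into intervals holding roughly equal numbers of records.

    Assumption: equi-depth partitioning; the abstract does not fix a strategy.
    """
    counts = Counter(values)
    target = len(values) / num_intervals
    intervals, lo, filled = [], None, 0
    for v in sorted(counts):
        if lo is None:
            lo = v
        filled += counts[v]
        if filled >= target:
            intervals.append((lo, v))
            lo, filled = None, 0
    if lo is not None:  # leftover values form a final, smaller interval
        intervals.append((lo, max(counts)))
    return intervals

def interval_support(values, lo, hi):
    """Number of records whose value falls inside [lo, hi]."""
    return sum(1 for v in values if lo <= v <= hi)

def combine_into_ranges(values, intervals, max_support):
    """Combine adjacent intervals into ranges while support stays below max_support.

    Assumption: a single base interval is always kept, even if its own
    support reaches max_support.
    """
    ranges = []
    for i in range(len(intervals)):
        lo = intervals[i][0]
        for j in range(i, len(intervals)):
            hi = intervals[j][1]
            support = interval_support(values, lo, hi)
            if j > i and support >= max_support:
                break
            ranges.append(((lo, hi), support))
    return ranges

# Hypothetical attribute values (ages), not taken from the patent.
ages = [23, 25, 29, 34, 38, 41, 45, 52]
intervals = equi_depth_intervals(ages, 4)
ranges = combine_into_ranges(ages, intervals, max_support=5)
for (lo, hi), support in ranges:
    print(f"Age {lo}..{hi}: support {support}")
```

Each resulting range, paired with its attribute, would then be a candidate item for the frequent-itemset search.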

Attorney, Agent or Firm: Tran, Khanh Q. ; Pintner, James C. ;

Primary / Asst. Examiners: Amsbury, Wayne;

Family: None

First Claim (of 30 claims):
What is claimed is:     1. A method for identifying quantitative association rules from a table of records, each record having a plurality of attributes associated therewith, the attributes including quantitative and categorical attributes, each attribute having a value, the method comprising the steps of:
  • partitioning the values of each quantitative attribute from a selected group of quantitative attributes into a respective plurality of intervals;
  • determining a support for each value of the categorical attributes and the non-partitioned quantitative attributes, and a support for each interval of the partitioned quantitative attributes, the support for a value being a number of records in the table whose attribute values include the value, the support for an interval being a number of records in the table whose attribute values are part of the interval;
  • for each quantitative attribute, combining adjacent values of the attribute if the attribute is not partitioned, or adjacent intervals of the attribute if the attribute is partitioned, into ranges, as long as the support for each range is less than a maximum support;
  • identifying items with at least a minimum support, each item representing a quantitative attribute and a range, or a categorical attribute and a value, the items with at least the minimum support making up a seed set;
  • generating candidate itemsets from the seed set, each itemset being a set of items and having a support, the support of the itemset being a number of records in the table which support the itemset;
  • determining frequent itemsets from the candidate itemsets, the frequent itemsets being those itemsets whose support is more than the minimum support, the determined frequent itemsets becoming the next seed set;
  • repeating the steps of generating candidate itemsets and determining frequent itemsets until all the frequent itemsets are found; and
  • outputting an association rule when the support of a selected frequent itemset bears a predetermined relationship to the support of a subset of the selected frequent itemset, thereby satisfying a minimum confidence constraint, the association rule being an expression of the form X ⇒ Y where X and Y are itemsets.
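The level-wise loop and the rule-output step in the claim can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the claimed method itself: the item encoding `(attribute, lo, hi)`, the function names, the sample records, and the thresholds are hypothetical, and the sketch omits the interest-level pruning and super-candidate structures mentioned in the abstract. It treats "frequent" as support greater than or equal to the minimum, and takes confidence to be the support of the itemset divided by the support of the antecedent X.

```python
from itertools import combinations

def supports(record, item):
    """An item is (attribute, lo, hi); a categorical value uses lo == hi."""
    attr, lo, hi = item
    return lo <= record[attr] <= hi

def support_count(records, itemset):
    """Number of records that support every item in the itemset."""
    return sum(1 for r in records if all(supports(r, it) for it in itemset))

def generate_candidates(frequent, k):
    """Join frequent k-itemsets into (k+1)-itemsets over distinct attributes,
    discarding any candidate that has a k-subset which is not frequent."""
    candidates = set()
    for a, b in combinations(frequent, 2):
        union = a | b
        attrs = {item[0] for item in union}
        if len(union) == k + 1 and len(attrs) == k + 1:
            if all(frozenset(s) in frequent for s in combinations(union, k)):
                candidates.add(union)
    return candidates

def mine_frequent_itemsets(records, items, min_support):
    """Repeat candidate generation and counting until no itemsets survive."""
    seed = {frozenset([it]) for it in items
            if support_count(records, [it]) >= min_support}
    all_frequent, k = {}, 1
    while seed:
        for s in seed:
            all_frequent[s] = support_count(records, s)
        candidates = generate_candidates(seed, k)
        seed = {c for c in candidates
                if support_count(records, c) >= min_support}
        k += 1
    return all_frequent

def generate_rules(all_frequent, min_confidence):
    """Emit (X, Y, confidence) when support(X ∪ Y) / support(X) is high enough."""
    rules = []
    for itemset, sup in all_frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(itemset, size):
                x = frozenset(antecedent)
                confidence = sup / all_frequent[x]
                if confidence >= min_confidence:
                    rules.append((x, itemset - x, confidence))
    return rules

# Hypothetical records and items, chosen only to exercise the sketch.
records = [
    {"Age": 23, "Married": "No", "NumCars": 1},
    {"Age": 25, "Married": "Yes", "NumCars": 1},
    {"Age": 29, "Married": "No", "NumCars": 0},
    {"Age": 34, "Married": "Yes", "NumCars": 2},
    {"Age": 38, "Married": "Yes", "NumCars": 2},
]
items = [
    ("Age", 20, 29), ("Age", 30, 39),
    ("Married", "Yes", "Yes"), ("Married", "No", "No"),
    ("NumCars", 0, 1), ("NumCars", 2, 2),
]
all_frequent = mine_frequent_itemsets(records, items, min_support=2)
rules = generate_rules(all_frequent, min_confidence=0.9)
```

With this sample data, one of the rules produced is (Age: 30..39) and (Married: Yes) ⇒ (NumCars: 2), since both records matching the antecedent also have two cars.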


Forward References: 72 U.S. patents reference this one

       
U.S. References: Forward references (72)  |  Backward references (5)

Patent | Pub. Date | Inventor | Assignee | Title
US5253361 | 1993-10 | Thurman et al. | Emtek Health Care Systems, Inc. | System for accessing a row of time-dependent data by referring to a composite index table indicating page locations of linked row labels
US5315709 | 1994-05 | Alston, Jr. et al. | Bachman Information Systems, Inc. | Method and apparatus for transforming objects in data models
US5504890 | 1996-04 | Sanford | | System for data sharing among independently-operating information-gathering entities with individualized conflict resolution rules
US5537586 | 1996-07 | Amram et al. | Individual, Inc. | Enhanced apparatus and methods for retrieving and selecting profiled textural information records from a database of defined category structures
US5614341 | 1997-03 | Agrawal et al. | Xerox Corporation | Multilayered photoreceptor with adhesive and intermediate layers
       
Foreign References: None

Other Abstract Info: DERABS G98-178893 DERG98-178893

Other References:
  • DeWitt et al., "The Gamma Database Machine Project", IEEE Trans. Knowledge & Data Engineering, Mar. 31, 1990.
  • Mannila et al., "Improved Methods for Finding Association Rules", Pub. No. c-1993-65, University of Helsinki, 193, Dec. 31, 1993.
  • Park et al., "Efficient Data Mining for Association Rules", IBM Research Report, R210156, Aug. 31, 1995.
  • R. Agrawal, T. Imielinski, A Swami, Mining Association Rules Between Sets of Items in Large Databases, In Proc. of the ACM SIGMOD Conference on Management of Data, pp. 207-216, Washington, D.C. May 1993.
  • R. Agrawal, R. Srikant, Fast Algorithms for Mining Association Rules, In Proc. of the VLDB Conference, Santiago, Chile, pp. 487-499, Sep. 1994.
  • N. Beckmann, H. Kriegel, R. Schneider, B. Seeger, The R*-tree: An Efficient and Robust Access Method for Points and Rectangles, In Proc. of ACM SIGMOD, pp. 322-331, Atlantic City, NJ, May 1990.
  • R. T. Ng, J. Han, Efficient and Effective Clustering Methods for Spatial Data Mining, In Proc. of the VLDB Conference, Santiago, Chile, pp. 144-155, Sep. 1994.
  • J. S. Park, M. Chen, P. S. Yu, An Effective Hash-Based Algorithm for Mining Association Rules, In Proc. of the ACM-SIGMOD Conference on Management of Data, pp. 175-186 San Jose, California May 1995.
  • G. P. Shapiro, Discovery, Analysis, and Presentation of Strong Rules, Knowledge Discovery in Databases, pp. 229-248, AAAI/MIT Press, Menlo Park, CA, 1991 (GTE Lab. Incorporated).
  • M. Houtsma, A. Swami, Set-Oriented Mining for Association Rules, IBM Research Report 9567 (83573), Computer Science, Oct. 22, 1993.
  • R. Srikant, R. Agrawal, Mining Generalized Association Rules, In Proc. of the VLDB Conference, pp. 407-419, Zurich, Switzerland, Sep. 1995.
  • J. Han, Y. Fu, Discovery of Multiple-Level Association Rules from Large Databases, In Proc. of the VLDB Conference, pp. 420-431, Zurich Switzerland, Sep. 1995.
  • A. Savasere, E. Omiecinski, S. Navathe, An Efficient Algorithm for Mining Association Rules in Large Databases, Proceedings of the 21st VLDB Conference pp. 432-444, Zurich, Switzerland, Sep. 1995.

