 The Delphion Integrated View

Title: US6230151: Parallel classification for data mining in a shared-memory multiprocessor system


Country: US United States of America

29 pages

 
Inventor: Agrawal, Rakesh; San Jose, CA
Ho, Ching-Tien; San Jose, CA
Zaki, Mohammed J.; Rochester, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2001-05-08 / 1998-04-16

Application Number: US1998000061808

IPC Code: Advanced: G06F 9/45; G06F 17/30;
IPC-7: G06F 15/18;

ECLA Code: G06F17/30S8R1; G06F9/45M3;

U.S. Class: Current: 706/012; 707/101;
Original: 706/012; 706/012; 707/101;

Field of Search: 395/706 705/006 706/049,50,12 707/001,101

Priority Number:
1998-04-16  US1998000061808

Abstract:     A method and system for generating a decision-tree classifier in parallel in a shared-memory multiprocessor system is disclosed. The processors first generate in the shared memory an attribute list for each record attribute. Each attribute list is assigned to a processor. The processors independently determine the best splits for their respective assigned lists, and cooperatively determine a global best split for all attribute lists. The attribute lists are reassigned to the processors and split according to the global best split into the lists for child nodes. The split attribute lists are again assigned to the processors and the process is repeated for each new child node until each attribute list for the new child nodes includes only tuples of the same record class or a fixed number of tuples.
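
To make the abstract's terms concrete, here is a minimal Python sketch of attribute lists and of the stopping condition. It is an illustration under stated assumptions, not the patent's implementation: the tuple layout `(value, class_label, record_id)`, the pre-sorting by value, the helper names `build_attribute_lists` and `is_leaf`, the reading of "a fixed number of tuples" as an at-most threshold (`min_tuples`), and the sample records are all hypothetical.

```python
def build_attribute_lists(records, labels):
    """Return one list per attribute.

    Each tuple is (value, class_label, record_id). Sorting by value is an
    assumption made here so that split points can later be scanned in order.
    """
    n_attrs = len(records[0])
    return [
        sorted(
            ((rec[a], labels[rid], rid) for rid, rec in enumerate(records)),
            key=lambda t: t[0],
        )
        for a in range(n_attrs)
    ]


def is_leaf(attr_list, min_tuples=1):
    """Stop when all tuples share one class, or when the list holds at most
    min_tuples tuples (an assumed reading of "a fixed number of tuples")."""
    classes = {label for _, label, _ in attr_list}
    return len(classes) <= 1 or len(attr_list) <= min_tuples


# Hypothetical records with two numeric attributes and a class label each.
records = [(25, 1), (40, 3), (35, 2), (50, 4)]
labels = ["no", "yes", "no", "yes"]
lists = build_attribute_lists(records, labels)
```

The record id carried in every tuple is what lets a split chosen on one attribute be applied consistently to the lists of all other attributes.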

Attorney, Agent or Firm: Tran, Khanh Q. ; McSwain, Marc D. ;

Primary / Asst. Examiners: Lintz, Paul R.; Khatri, Anil

Maintenance Status: E2 Expired

Family: None

First Claim (1 of 51):
What is claimed is:
1. A method for generating a decision-tree classifier in a shared-memory multiprocessor system from a set of records, the tree having a plurality of nodes, the method comprising the steps of:
  • (a) generating cooperatively by the processors, in the shared memory, an attribute list for each attribute of the records, the attribute lists corresponding to a current node and including tuples each having information on a record class;
  • (b) assigning each attribute list of the current node to one of the processors;
  • (c) each processor accessing the attribute lists assigned to the processor, in the shared memory, to determine a best split for each attribute list;
  • (d) the processors cooperatively determining, through the shared memory, a global best split for all the attribute lists associated with the current node;
  • (e) reassigning each attribute list of the current node to one of the processors;
  • (f) each processor splitting the attribute lists reassigned to the processor according to the global best split into new attribute lists, the new lists corresponding to child nodes of the current node and residing in the shared memory; and
  • (g) repeating steps (b)-(f) with each newly created child node as the current node, until each attribute list for the newly created child nodes includes only tuples of the same record class.
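
Steps (b) through (g) can be sketched in Python as follows. This is a hedged illustration, not the claimed method itself: worker threads in a `ThreadPoolExecutor` stand in for processors (CPython threads share memory but do not give true CPU parallelism), the Gini index is an assumed split criterion that the claim does not name, the majority-class fallback is an added safeguard, and every function name and the sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def gini(counts, n):
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_split(attr_list):
    """Step (c): best 'value <= threshold' split of one sorted list,
    returned as (score, threshold), or None if no split exists."""
    n = len(attr_list)
    left, right = Counter(), Counter(lbl for _, lbl, _ in attr_list)
    best = None
    for i in range(n - 1):
        value, lbl, _ = attr_list[i]
        left[lbl] += 1
        right[lbl] -= 1
        if value == attr_list[i + 1][0]:
            continue  # cannot split between equal values
        n_left, n_right = i + 1, n - i - 1
        score = (n_left * gini(left, n_left) + n_right * gini(right, n_right)) / n
        if best is None or score < best[0]:
            best = (score, value)
    return best


def split_list(attr_list, left_ids):
    """Step (f): partition one list by record id, preserving its order."""
    left = [t for t in attr_list if t[2] in left_ids]
    right = [t for t in attr_list if t[2] not in left_ids]
    return left, right


def build_tree(lists, pool):
    """Steps (b)-(g) for one node; `lists` holds that node's attribute lists."""
    classes = {lbl for _, lbl, _ in lists[0]}
    if len(classes) == 1:  # (g): only tuples of the same record class remain
        return {"leaf": classes.pop()}
    # (b)+(c): one task per attribute list; the pool assigns tasks to workers.
    candidates = list(pool.map(best_split, lists))
    # (d): reduce the per-list results to a global best split.
    scored = [(c[0], a, c[1]) for a, c in enumerate(candidates) if c is not None]
    if not scored:  # assumption: fall back to a majority-class leaf
        majority = Counter(lbl for _, lbl, _ in lists[0]).most_common(1)[0][0]
        return {"leaf": majority}
    _, attr, threshold = min(scored)
    left_ids = {rid for value, _, rid in lists[attr] if value <= threshold}
    # (e)+(f): reassign the lists and split each one in parallel.
    halves = list(pool.map(lambda al: split_list(al, left_ids), lists))
    # (g): repeat with each child node as the current node.
    return {
        "attr": attr,
        "threshold": threshold,
        "left": build_tree([h[0] for h in halves], pool),
        "right": build_tree([h[1] for h in halves], pool),
    }


# Step (a), done serially here for brevity, on hypothetical data.
records = [(25, 1), (40, 3), (35, 2), (50, 4)]
labels = ["no", "yes", "no", "yes"]
lists = [
    sorted(((r[a], labels[i], i) for i, r in enumerate(records)), key=lambda t: t[0])
    for a in range(2)
]
with ThreadPoolExecutor(max_workers=2) as pool:
    tree = build_tree(lists, pool)
```

Recursion runs on the calling thread and only the per-list work is submitted to the pool, so worker threads never wait on each other.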


       
U.S. References: Forward references (23)  |  Backward references (13)

Patent | Pub. Date | Inventor | Assignee | Title | Pages
US4825354 | 1989-04 | Agrawal et al. | American Telephone and Telegraph Company, AT&T Bell Laboratories | Method of file access in a distributed processing computer network | 21
US5463773 | 1995-10 | Sakaibara et al. | Fujitsu Limited | Building of a document classification tree by recursive optimization of keyword selection function | 17
US5615341 | 1997-03 | Agrawal et al. | International Business Machines Corporation | System and method for mining generalized association rules in databases | 22
US5668988 | 1997-09 | Chen et al. | International Business Machines Corporation | Method for mining path traversal patterns in a web environment by converting an original log sequence into a set of traversal sub-sequences | 12
US5742811 | 1998-04 | Agarwal et al. | International Business Machines Corporation | Method and system for mining generalized sequential patterns in a large database | 25
US5819266 | 1998-10 | Agrawal et al. | International Business Machines Corporation | System and method for mining sequential patterns in a large database | 16
US5864839 | 1999-01 | Bourgoin | TM Patents, L.P. | Parallel system and method for generating classification/regression tree | 28
US5875285 | 1999-02 | Chang | | Object-oriented data mining and decision making system | 12
US5884305 | 1999-03 | Kleingerg et al. | International Business Machines Corporation | System and method for data mining from relational data by sieving through iterated relational reinforcement | 14
US5884320 | 1999-03 | Agarwal et al. | International Business Machines Corporation | Method and system for performing proximity joins on high-dimensional data points in parallel | 17
US5899992 | 1999-05 | Iyer et al. | International Business Machines Corporation | Scalable set oriented classifier | 16
US5960446 | 1999-09 | Schmuck et al. | International Business Machines Corporation | Parallel file system and method with allocation map | 27
US6003029 | 1999-12 | Agrawal et al. | International Business Machines Corporation | Automatic subspace clustering of high dimensional data for data mining applications | 21
       
Foreign References: None

Other References:
  • Holt et al., "Efficient mining of association rules in text databases", CIKM ACM, pp 234-242, Jan. 1999.*
  • Anand et al., "The role of domain knowledge in data mining", CIKM ACM, pp 37-43, Jun. 1995.*
  • Cromp et al., "Data mining of multidimensional remotely sensed images", CIKM ACM, pp 471-480, Nov. 1993.*
  • Shafer et al., "SPRINT: a scalable parallel classifier for data mining", Proc. of the 22nd VLDB conf., pp 544-555, 1996.*
  • Oguchi et al., "Dynamic remote memory acquisition for parallel mining on ATM connected PC cluster", ACM ICS, pp 246-252, Jan. 1991.*
  • Weiss, "Strip mining on SIMD architecture", ACM, pp 234-243, Jan. 1991.*
  • Goil et al., "High performance multidimensional analysis of large datasets", ACM DOLAP, pp 34-39, Aug. 1998.*
  • Muller et al., "A high performance multi structure file system design", ACM, pp 56-67, Mar. 1991.*
  • Callahan et al., "Parallel implementation of a frontal finite element solver on multiple platform", ACM SAC, pp 491-495, Apr. 1999.*
  • Kennedy et al., "Optimizing for parallelism and data locality", ACM ICS, pp 323-334, Jun. 1992.*
  • Agrawal et al., "Automatic subspace clustering of high dimensional data for data mining applications", ACM SIGMOD, pp 94-105, May 1998.*
  • Li et al., "Free parallel data mining", ACM SIGMOD, pp 541-543, May 1998.*
  • Shintani et al., "Parallel mining algorithms for generalized association rules with classification hierarchy", ACM SIGMOD, pp 25-36, May 1998.*
  • Zaki et al., "Parallel classification for data mining on shared memory multiprocessors", IEEE, pp 198-205, 1999.*
  • Zaki et al., "Evaluation of sampling for data mining of association rules", IEEE, pp 42-50, 1997.*
  • Agrawal, "Parallel mining of association rules", IEEE, vol. 8, No. 6, pp 962-969, Dec. 1996.*
  • Park et al., "Efficient parallel data mining for association rules", ACM CIKM, pp 31-36, Jun. 1995.*
  • Zaki et al., "A localized algorithm for parallel association mining", ACM SPAA, pp 321-330, 1996.

