Work Files Saved Searches
   My Account                                                  Search:   Quick/Number   Boolean   Advanced   Derwent    Help   


 The Delphion Integrated View

  Buy Now:   Buy PDF- 20pp  PDF  |   File History  |   Other choices   
  Tools:  Citation Link  |  Add to Work File:    
  View:  Expand Details   |  INPADOC   |  Jump to: 
  Go to:  Derwent  
 Email this to a friend  Email this to a friend 
       
Title: US6269376: Method and system for clustering data in parallel in a distributed-memory multiprocessor system
[ Derwent Title ]


Country: US United States of America

View Images High
Resolution

 Low
 Resolution

 
20 pages

 
Inventor: Dhillon, Inderjit Singh; Berkeley, CA
Modha, Dharmendra Shantilal; San Jose, CA

Assignee: International Business Machines Corporation, Armonk, NY
other patents from INTERNATIONAL BUSINESS MACHINES CORPORATION (280070) (approx. 44,393)
 News, Profiles, Stocks and More about this company

Published / Filed: 2001-07-31 / 1998-10-26

Application Number: US1998000179027

IPC Code: Advanced: G06F 17/18; G06K 9/62;
Core: more...
IPC-7: G06F 17/00;

ECLA Code: G06F17/18; G06K9/62B1;

U.S. Class: Current: 707/101; 707/006; 707/102;
Original: 707/101; 707/006; 707/102;

Field of Search: 707/003-6,200-206,101-103 704/009,2-8,241 725/116

Priority Number:
1998-10-26  US1998000179027

Abstract:     A method, apparatus, article of manufacture, and a memory structure for clustering data points in parallel using a distributed-memory multi-processor system is disclosed. The disclosed system has particularly advantageous application to a rapid and flexible k-means computation for data mining. The method comprises the steps of dividing a set of data points into a plurality of data blocks, initializing a set of k global centroid values in each of the data blocks k initial global centroid values, performing a plurality of asynchronous processes on the data blocks, each asynchronous process assigning each data point in each data block to the closest global centroid value within each data block, computing a set of k block accumulation values from the data points assigned to the k global centroid values, and recomputing the k global centroid values from the k block accumulation values.

Attorney, Agent or Firm: Gates & Cooper LLP ;

Primary / Asst. Examiners: Black, Thomas; Jung, David

INPADOC Legal Status: Show legal status actions

Family: None

First Claim:
Show all 42 claims
What is claimed is:     1. A method of clustering a set of data points into k clusters, comprising:
  • (a) dividing the set of data points into P data blocks of substantially equal size, each data block assigned to one of P processors;
  • (b) selecting k initial global centroids with a first processor and broadcasting the k initial global centroids from the first processor to the remaining P-1 processors;
  • (c) computing the distance from each data point in each data block to the global centroid values by using the processor associated with the data block;
  • (d) assigning each data point in each data block to a global centroid value closest to the data point by using the processor associated with the data block;
  • (e) computing k block accumulation values in each block from the data points assigned thereto; and
  • (f) recomputing the k global centroid values from the k block accumulation values computed for each data block.


Background / Summary: Show background / summary

Drawing Descriptions: Show drawing descriptions

Description: Show description

Forward References: Show 13 U.S. patent(s) that reference this one

       
U.S. References: Go to Result Set: All U.S. references   |  Forward references (13)   |   Backward references (9)   |   Citation Link

Buy
PDF
Patent  Pub.Date  Inventor Assignee   Title
Buy PDF- 16pp US5448727  1995-09 Annevelink  Hewlett-Packard Company Domain based partitioning and reclustering of relations in object-oriented relational database management systems
Buy PDF- 15pp US5506801  1996-04 Tawel  California Institute of Technology High-performance ultra-low power VLSI analog processor for data compression
Buy PDF- 15pp US5519789  1996-05 Etoh  Matsushita Electric Industrial Co., Ltd. Image clustering apparatus
Buy PDF- 22pp US6024018  2000-02 Darel et al.  Intex Israel Technologies Corp., Ltd On press color control system
Buy PDF- 19pp US6047282  2000-04 Wilson et al.  AuthenTec, Inc. Apparatus and method for expandable biometric searching
Buy PDF- 22pp US6049777  2000-04 Sheena et al.  Microsoft Corporation Computer-implemented collaborative filtering based method for recommending an item to a user
Buy PDF- 19pp US6049797  2000-04 Guha et al.  Lucent Technologies, Inc. Method, apparatus and programmed medium for clustering databases with categorical attributes
Buy PDF- 20pp US6070159  2000-05 Wilson et al.  AuthenTec, Inc. Method and apparatus for expandable biometric searching
Buy PDF- 27pp US6092072  2000-07 Guha et al.  Lucent Technologies, Inc. Programmed medium for clustering large databases
       
Foreign References: None

Other References:
  • Ding et al., "Necessary conditions on minimal system configuration for general MISO Mamdani Fuzzy Systems as Universal Approximators", Systems, Man and Cybernetics, Part B, IEEE Transactions on, vol. 30, Issue 6, pp. 857-864, Dec. 2000.* (8 pages) [ISI abstract]
  • Paliwal et al., "Comments on `modified K-means algorithm for vector quantizer design`", Image Processing, IEEE Transactions on, vol. 9, Issue 11, pp. 1964-1967, Nov. 2000.* (4 pages) [ISI abstract]
  • Lee et al., "Morphology-based three-dimensional interpolation", Medical Imaging, IEEE Transactions on, vol. 19, Issue 7, pp. 711-721, Jul. 2000.* (11 pages) [ISI abstract]
  • Berger, Marsha et al., "An Algorithm for Point Clustering and Grid Generation," IEEE Transactions on Systems, Man, and Cybernetics, vol. 21, No. 5, Sep./Oct. 1991, pp. 1278-1286. (9 pages) Cited by 5 patents [ISI abstract]
  • Duda, Richard O. et al., "Pattern Classification and Scene Analysis," John Wiley & Sons, New York, 1973, pp. 210-257.
  • Ester, Martin et al., "A Database Interface for Clustering in Large Spatial Databases," Conference on Knowledge Discovery and Data Mining, KDD-95, AAAI Press, 1996, pp. 94-99.
  • Fisher, Douglas H., "Knowledge Acquisition Via Incremental Conceptual Clustering," pp. 267-283 (originally published in Machine Learning, No. 2:139-172); 1987.
  • Fukunaga, K. et al., "A Branch and Bound Algorithm for Computing .kappa.-Nearest Neighbors," IEEE Transactions on Computers, Jul. 1975, pp. 750-753. (4 pages)
  • Han, Eui-Hong (Sam) et al., "Clustering Based on Associated Rule Hypergraphs," Proceedings of the Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD '97), Tucson, Arizona, 1997, pp. 9-13.
  • Huang, Zhexue, "A Fast Clustering Algorithm to Cluster Very Large Categorical Data Sets in Data Mining," SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, SIGMOD-DMKD 1997, pp. 1-8.
  • Keim, Daniel A., "Enhancing Visual Clustering of Query-Dependent Database Visualization Techniques Using Screen-Filling Curves," Proceedings of the Workshop on Database Issues for Data Visualization, Atlanta, Georgia, 1995, pp. 101-110.
  • Ketterlin, A. et al., "Conceptual Clustering in Structured Databases: a Practical Approach," Proceedings of the First International Conference on Knowledge Discovery & Data Mining, KDD-95, AAAI Press, 1996, pp. 180-185.
  • Ng, Raymond T. et al., "Efficient and Effective Clustering Methods for Spatial Data Mining," Proceedings of the 20th VLDB Conference, Santiago, Chile, 1994, pp. 144-155.
  • Pollard, David, "Quantization and the Method of .kappa.-Means," IEEE Transactional on Information Theory, vol. IT-28, No. 2, Mar. 1982, pp. 199-205. (7 pages) Cited by 2 patents
  • Pollard, David, "A Central Limit Theorem for .kappa.-Means Clustering," The Annals of Probability, vol. 10, No. 4, 1982, pp. 919-926. (8 pages) Cited by 2 patents
  • Ripley, Brian, "Pattern Recognition & Neural Networks," Cambridge University Press, 1996, pp. 311-322.
  • Ralambondrainy, H., "A Conceptual Version of the .kappa.-means Algorithm," Pattern Recognition Letters 16, 1995, pp. 1147-1157. (11 pages) Cited by 5 patents [ISI abstract]
  • Selim, Shokri Z. et al., ".kappa.-Means-Type Algorithms: A Generalized Convergence Theorem and Characterization of Local Optimality," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-6, No. 1, Jan. 1984, pp. 81-87. (7 pages) Cited by 5 patents
  • Wharton, Stephen W., "A Generalized Histogram Clustering Scheme for Multidimensional Image Data," Pattern Recognition, vol. 16, No. 2, 1983, pp. 193-199. (7 pages) Cited by 2 patents


  • Inquire Regarding Licensing

    Powered by Verity


    Plaques from Patent Awards      Gallery of Obscure PatentsNominate this for the Gallery...

    Thomson Reuters Copyright © 1997-2010 Thomson Reuters 
    Subscriptions  |  Web Seminars  |  Privacy  |  Terms & Conditions  |  Site Map  |  Contact Us  |  Help