Title: |
US6952700:
Feature weighting in κ-means clustering

|
Country: |
US United States of America

|
Inventor: |
Modha, Dharmendra Shantilal; San Jose, CA, United States of America
Spangler, William Scott; San Martin, CA, United States of America

|
Assignee: |
International Business Machines Corporation, Armonk, NY, United States of America

|
Published / Filed: |
2005-10-04 / 2001-03-22

|
Application Number: |
US2001000813896

|
IPC Code: |
Advanced:
G06F 15/00;
G06F 17/00;
G06K 9/62;
IPC-7:
G06F 17/00;

|
ECLA Code: |
G06K9/62B1;

|
U.S. Class: |
707/101;
707/100;

|
Field of Search: |
707/100,101,104.1

|
Priority Number: |
2001-03-22 | US2001000813896

|
Abstract: |
A method and system are provided for integrating multiple feature spaces in a k-means clustering algorithm when analyzing data records having multiple, heterogeneous feature spaces. The method assigns different relative weights to these various feature spaces. Optimal feature weights are also determined that lead to a clustering that simultaneously minimizes the average intra-cluster dispersion and maximizes the average inter-cluster dispersion along all the feature spaces. Examples are provided that empirically demonstrate the effectiveness of feature weighting in clustering using two different feature domains.
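
The weight-selection idea in the abstract can be illustrated with a small Python sketch. It assumes squared Euclidean dispersion in every feature space and scores a clustering by the product of per-space within-cluster to between-cluster dispersion ratios. That scoring rule and the names `dispersion_ratio`, `clustering_score`, and `pick_weights` are illustrative assumptions, not details stated in this record.

```python
import numpy as np

def dispersion_ratio(space, labels):
    """Within-cluster over between-cluster dispersion for one feature space.

    Assumes squared Euclidean dispersion (an assumption of this sketch).
    """
    total = float(((space - space.mean(axis=0)) ** 2).sum())
    within = 0.0
    for j in np.unique(labels):
        members = space[labels == j]
        within += float(((members - members.mean(axis=0)) ** 2).sum())
    # For squared Euclidean distance, total scatter = within + between.
    between = total - within
    return within / between if between > 0 else float("inf")

def clustering_score(spaces, labels):
    """Product of per-space ratios; lower means tighter, better-separated clusters."""
    return float(np.prod([dispersion_ratio(s, labels) for s in spaces]))

def pick_weights(spaces, candidates):
    """candidates: dict mapping a weight tuple to the labels that weighting produced."""
    return min(candidates, key=lambda w: clustering_score(spaces, candidates[w]))
```

In use, each candidate weight tuple would first be run through a weighted clustering step to obtain its labels, and `pick_weights` would then return the tuple whose clustering scores lowest.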

|
Attorney, Agent or Firm: |
McGinn & Gibb, PLLC;
Guzman, Esq., Leonard T.;

|
Primary / Asst. Examiners: |
Le, Uyen;

|
Family: |
2 known family members

|
First Claim (of 24): |
1. A method for evaluating and outputting a final clustering solution for a plurality of multi-dimensional data records, said data records having multiple, heterogeneous feature spaces represented by feature vectors, said method comprising: defining a distortion between two feature vectors as a weighted sum of distortion measures on components of said feature vectors; clustering said multi-dimensional data records into k-clusters using a convex programming formulation; selecting feature weights of said feature vectors, and minimizing distortion of said k-clusters.
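
The weighted distortion named in the claim can be sketched in Python as follows. This is a minimal illustration with fixed weights, assuming squared Euclidean distortion in each feature space and plain Lloyd-style iterations. It does not implement the claim's convex programming formulation, and the names `weighted_distortion` and `weighted_kmeans` are hypothetical.

```python
import numpy as np

def weighted_distortion(x_parts, c_parts, weights):
    """Weighted sum of per-feature-space squared Euclidean distortions."""
    return sum(w * float(np.sum((x - c) ** 2))
               for w, x, c in zip(weights, x_parts, c_parts))

def weighted_kmeans(spaces, weights, k, n_iter=20, seed=0):
    """Cluster records described by several feature spaces.

    spaces: list of float arrays, one per feature space, each (n_records, dim).
    weights: one non-negative weight per feature space.
    Returns (labels, centroids), with centroids given per feature space.
    """
    rng = np.random.default_rng(seed)
    n = spaces[0].shape[0]
    idx = rng.choice(n, size=k, replace=False)  # initial centroids: k distinct records
    centroids = [s[idx].copy() for s in spaces]
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # (n, k) matrix of weighted distortions between records and centroids
        d = sum(w * ((s[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
                for w, s, c in zip(weights, spaces, centroids))
        labels = d.argmin(axis=1)
        for j in range(k):
            members = labels == j
            if members.any():  # keep the old centroid if a cluster empties
                for s, c in zip(spaces, centroids):
                    c[j] = s[members].mean(axis=0)
    return labels, centroids
```

Raising the weight of one feature space makes agreement in that space count for more when records are assigned to clusters, which is the lever the claim's feature weights provide.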

|
Forward References: |
5 U.S. patent(s) reference this one
