 |
 |
|
|
|
|
Title: |
US5978794:
Method and system for performing spatial similarity joins on high-dimensional points
[ Derwent Title ]

|
Country: |
US United States of America

|
| |
Inventor: |
Agrawal, Rakesh; San Jose, CA
Shim, Kyuseok; Bedminster, NJ
Srikant, Ramakrishnan; San Jose, CA

|
Assignee: |
International Business Machines Corporation, Armonk, NY
other patents from INTERNATIONAL BUSINESS MACHINES CORPORATION (280070) (approx. 44,393)
News, Profiles, Stocks and More about this company

|
Published / Filed: |
1999-11-02
/ 1996-04-09

|
Application Number: |
US1996000629688

|
IPC Code: |
Advanced:
G06F 17/30;
Core:
more...
IPC-7:
G06F 17/00;

|
ECLA Code: |
G06F17/30G3;

|
U.S. Class: |
Current:
707/003;
707/004;
707/101;
707/E17.012;
Original:
707/003;
707/004;
707/101;

|
Field of Search: |
707/001-206,200
364/282.1,DIG. 1

|
Priority Number: |
| 1996-04-09 |
US1996000629688 |

|
Abstract: |
A method and system are disclosed for performing spatial similarity joins on high-dimensional points that represent data objects of a database. The method comprises the steps of: generating a data structure based on the similarity distance ε for organizing the high-dimensional points, traversing the data structure to select pairs of leaf nodes from which the high-dimensional points are joined, and joining the points from selected pairs of nodes according to a joining condition based on the similarity distance ε. An efficient data structure referred to as an ε-K-D-B tree is disclosed to provide fast access to the high-dimensional points and to minimize system storage requirements. The invention provides algorithms for generating the ε-K-D-B tree using biased splitting to minimize the number of nodes to be examined during join operations. The traversing step includes joining selected pairs of nodes and also self-joining selected nodes. Alternatively, the data structure is an R+tree generated using biased splitting.

|
Attorney, Agent or Firm: |
Tran, Khanh Q. ;

|
Primary / Asst. Examiners: |
Black, Thomas G.; Jung, David Yiuk

|
INPADOC Legal Status: |
Show legal status actions

|
Family: |
None

|
First Claim:
Show all 23 claims |
What is claimed is:
1. A method for performing similarity joins on high-dimensional points representing data objects of a database, the method comprising the steps of:
- generating a multi-dimensional data structure having a plurality of leaf nodes for organizing the points, each leaf node being split into .left brkt-bot.1/epsilon.right brkt-top. child nodes, where epsilon is a similar distance, based on the depth of the leaf node whenever the number of points associated with the leaf node exceeds a predetermined value, the dimensions used for splitting the nodes being in an order of correlation among the dimensions such that one selected for splitting next has the least correlation with previously used dimensions;
- traversing the interior nodes of the data structure to select pairs of leaf nodes from which the points are joined; and
- joining the points from the selected pairs of leaf nodes based on a joining condition that the distance between any two points to be joined is at most epsilon.

|
Background / Summary: |
Show background / summary

|
Drawing Descriptions: |
Show drawing descriptions

|
Description: |
Show description

|
Forward References: |
Show 17 U.S. patent(s) that reference this one

|
 |
 |
|
|
|
|
Foreign References: |
None

|
Other Abstract Info: |
DERABS G1999-619884
DERABS G1999-619884

|
Other References: |
J. Nievergelt, H. Hinterberger, The Grid File: An Adaptable, Symmetric Multikey File Structure, Readings in Database Systems, 2nd Edition, Morgan Kaugmann Publishers, San Mateo, CA (book), 1984.
D. B. Lomet, B. Salzberg, The hB-Tree: a Multiattribute Indexing Method with Good Guaranteed Performance, Readings in Database Systems, Second Edition, Morgan Kaufmann Publishers, San Mateo, California (book), 1990.
N. Yazdani, M. Ozsoyoglu & G. Ozsoyoglu, A Framework for Feature-Based Indexing for Spatial Databases, Proceedings, 7th International Working Conf. on Scientific & Statistical Database Mng.(IEEE Computer Society Press) pp. 259-269, Sep. 28-30, 1994 Charlottesville, Virginia.
Brinkhoff, Kriegel, B. Seeger, Efficient processing of Spatial Joins Using R-trees, SIGMOD 5/93 Washington, DC, ACM 0-89791-592-5/93/0005/0237/1993.
K.I. Lin, H.V. Jagadish & C. Faloutsos, The TV-Tree: An Index Structure for High-Dimensional Data, VLDB Journal 3, pp. 517-542, 1994.
M.L. Lo & C. V. Ravishankar, Spatial Joins Using Seeded Trees, SIGMOD 94-5/94 Minneapolis, Minn, ACM 0-89791-639-5/94/0005, pp. 209-220, 1994.
T. Sellis, N. Roussopoulos, C. Faloutsos, The R+ -Tree: A Dynamic Index for Multi-Dimensional Objects, Dept. of Computer Science, Univ. of Maryland, College Park, MD 20742, Feb. 1987.
J. L. Bentley, Multidimensional Binary Search Trees Used for Associative Searching, ACM, vol. 18, No. 9, pp. 509-517, Sep. 1975.
(9 pages)
Cited by 33 patents
H. V. Jagadish, A Retrieval Technique for Similar Shapes, ACM 0-89791-425-2/91/0005/0208, pp. 208-217, 1991.
R. Agrawal, K.I. Lin, H. S. Sawhney & K. Shim, Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases, Proceedings of the 21st VLDB Conference, Zurich, Switzerland, 1995.
H. Samet, (Book) The Design and Analysis of Spatial Data Structures, Chapter 2, Point Data, pp. 43-80, 1989 (QA76.9 D35 S26 1990).
A. Guttman, R-Trees: A Dynamic Index Structure for Spatial Searching, ACM 0-89791-12808/84/006/0047, pp. 47-57, 1984.
C. Faloutsos & K. I (David) Lin, Fast Map: A Fast Algorithm for Indexing, Data-Mining and Visualization of Traditional and Multimedia Datasets, ACM 0-89791-731-6/95/0005, pp. 163-174, 1995.
N. Beckman, H. P. Kriegel, R. Schneider, B. Seeger, The R* -tree: An Efficient and Robust Access Method for Points and Rectangles, ACM 089791-365/90/0005/0322 pp. 322-331, 1990.
J. T. Robinson, The K-D-B-Tree: A Search Structure for Large Multidimensional Dynamic Indexes, Proc. of ACM SIGMOD, Ann ARbor, Mi, Apr. 1981.
P. Mishra & M. H. Eich, Join Processing in Relational Databases, ACM Computing Surveys, vol. 24, No. 1, pp. 63-113, Mar. 1992.
(51 pages)
Cited by 17 patents
[ISI abstract]
M. Kitsuregawa, L. Harada and M. Takagi, Algorithms and performance Evaluation of Join Processing on KD-Tree Indexed Relations, Trans. Inst. Electron, Inf. Comm. Eng. D-I (Japan), vol. J76D-I, No. 4, pp. 172-183, Apr. 1993, 36 REF.
Jeffrey Ullman, Principles of Database And Knowledge-Base Systems, 1988, pp. 361-368, 375.
R.L. Graham et al., 3rd Edition, "Concrete Mathematics", Addison-Wesley Co., 1989, at pp. 67-78.

|


|
Nominate this for the Gallery...

|
|