

 The Delphion Integrated View

Title: US5813002: Method and system for linearly detecting data deviations in a large database


Country: US United States of America

10 pages

 
Inventor: Agrawal, Rakesh; San Jose, CA
Arning, Andreas; Wendelsheim, Germany

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1998-09-22 / 1996-07-31

Application Number: US1996000692906

IPC Code: Advanced: G06F 17/30;
IPC-7: G06F 17/30;

ECLA Code: G06F17/30;

U.S. Class: Current: 707/005; 707/002; 707/003; 707/006; 707/E17.001;
Original: 707/005; 707/002; 707/003; 707/006;

Field of Search: 707/002,3,5,6

Priority Number:
1996-07-31  US1996000692906

Abstract: A method for detecting deviations in a database is disclosed, comprising the steps of: determining respective frequencies of occurrence for the attribute values of the data items, and identifying any itemset whose similarity value satisfies a predetermined criterion as a deviation, based on the frequencies of occurrence. The determination of the frequencies of occurrence includes computing an overall similarity value for the database, and for each first itemset, computing a difference between the overall similarity value and the similarity value of a second itemset. The second itemset has all the data items except those of the first itemset. Preferably, a smoothing factor is used for indicating how much dissimilarity in an itemset can be reduced by removing a subset of items from the itemset. The smoothing factor is evaluated as each item is incrementally removed from the itemset, thereby allowing a data item to be identified as a deviation when the difference in similarity value is the highest.
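
The smoothing-factor idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented procedure: the abstract gives no formula, so the dissimilarity function (population variance), the weighting by the size of the remaining set, and the function names `dissimilarity`, `smoothing_factors`, and `find_deviation` are all assumptions made for illustration. The sketch also removes one item at a time by brute force, which is quadratic rather than linear.

```python
from statistics import pvariance


def dissimilarity(values):
    """Assumed dissimilarity measure: population variance (0 for fewer than 2 items)."""
    return pvariance(values) if len(values) > 1 else 0.0


def smoothing_factors(values):
    """For each item, score how much dissimilarity drops when that item alone is removed."""
    overall = dissimilarity(values)
    factors = []
    for i in range(len(values)):
        rest = values[:i] + values[i + 1:]
        # Assumed weighting: scale the reduction by the size of the remaining set.
        factors.append(len(rest) * (overall - dissimilarity(rest)))
    return factors


def find_deviation(values):
    """Return the item whose removal gives the highest smoothing factor, with that factor."""
    factors = smoothing_factors(values)
    best = max(range(len(values)), key=factors.__getitem__)
    return values[best], factors[best]
```

For the input `[1, 4, 4, 4]`, removing the value 1 leaves a set with zero variance, so under these assumptions 1 is the item reported as the deviation.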

Attorney, Agent or Firm: Tran, Khanh Q.

Primary / Asst. Examiners: Black, Thomas G.; Coby, Frantz

Maintenance Status: E2 Expired


Family: None

First Claim (of 15):
What is claimed is:
1. A method for detecting deviations in a database having a plurality of data items, each data item being characterized by an attribute value, each subset of the data items being an itemset, and each itemset having a similarity value based on the attribute values of the data items in the itemset, the method comprising the steps of:
  • determining a frequency of occurrence for each attribute value;
  • identifying any itemset whose similarity value satisfies a predetermined deviation criterion as a deviation, based on relative frequencies of occurrence of the attribute values; and
  • computing a smoothing factor that represents a reduction in similarity between any two itemsets when a subset of the data items common to the two itemsets is disregarded.
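
The frequency-of-occurrence step in the claim can be sketched as follows. This is only an illustration under stated assumptions: the claim does not say what the "predetermined deviation criterion" is, so the fixed relative-frequency threshold and the function names `relative_frequencies` and `rare_values` are hypothetical.

```python
from collections import Counter


def relative_frequencies(values):
    """Map each attribute value to its share of all data items."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def rare_values(values, threshold=0.1):
    """Assumed criterion: flag attribute values whose relative frequency is below threshold."""
    freqs = relative_frequencies(values)
    return sorted(value for value, freq in freqs.items() if freq < threshold)
```

With nineteen items valued `"a"` and one valued `"b"`, the value `"b"` has a relative frequency of 0.05 and is the only one flagged at the default threshold.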



Forward References: 6 U.S. patents reference this one

       
U.S. References: Forward references (6)  |  Backward references (28)

Patent | Pub. Date | Inventor | Assignee | Title
US3670310 | 1972-06 | Bharwani et al. | Infodata Systems Incorporated | Method for information storage and retrieval
US3694813 | 1972-09 | Loh et al. | International Business Machines Corporation | Method of achieving data compaction utilizing variable-length dependent coding techniques
US4290115 | 1981-09 | Pitt et al. | System Development Corporation | Data processing method and means for determining degree of match between two data arrays
US4653021 | 1987-03 | Takagi | Kabushiki Kaisha Toshiba | Data management apparatus
US4800511 | 1989-01 | Tanaka | Fuji Photo Film Co., Ltd. | Method of smoothing image data
US4839853 | 1989-06 | Deerwester et al. | Bell Communications Research, Inc. | Computer information retrieval using latent semantic structure
US4868750 | 1989-09 | Kucera et al. | Houghton Mifflin Company | Collocational grammar system
US4956774 | 1990-09 | Shibamiya et al. | International Business Machines Corporation | Data base optimizer using most frequency values statistics
US5031206 | 1991-07 | Riskin | Fon-Ex, Inc. | Method and apparatus for identifying words entered on DTMF pushbuttons
US5129082 | 1992-07 | Tirfing et al. | Sun Microsystems, Inc. | Method and apparatus for searching database component files to retrieve information from modified files
US5168565 | 1992-12 | Morita | Ricoh Company, Ltd. | Document retrieval system
US5255386 | 1993-10 | Prager | International Business Machines Corporation | Method and apparatus for intelligent help that matches the semantic similarity of the inferred intent of query or command to a best-fit predefined command intent
US5276629 | 1994-01 | Reynolds | Reynolds Software, Inc. | Method and apparatus for wave analysis and event recognition
US5297039 | 1994-03 | Kanaegami et al. | Mitsubishi Denki Kabushiki Kaisha | Text search system for locating on the basis of keyword matching and keyword relationship matching
US5351247 | 1994-09 | Dow et al. | Digital Equipment Corporation | Adaptive fault identification system
US5375235 | 1994-12 | Berry et al. | Northern Telecom Limited | Method of indexing keywords for searching in a database recorded on an information recording medium
US5418951 | 1995-05 | Damashek | The United States of America as represented by the Director of National Security Agency | Method of retrieving documents that concern the same topic
US5440481 | 1995-08 | Kostoff et al. | The United States of America as represented by the Secretary of the Navy | System and method for database tomography
US5488725 | 1996-01 | Turtle et al. | West Publishing Company | System of document representation retrieval by successive iterated probability sampling
US5542089 | 1996-07 | Lindsay et al. | International Business Machines Corporation | Method and apparatus for estimating the number of occurrences of frequent values in a data set
US5544049 | 1996-08 | Henderson et al. | Xerox Corporation | Method for performing a search of a plurality of documents for similarity to a plurality of query words
US5544352 | 1996-08 | Egger | Libertech, Inc. | Method and apparatus for indexing, searching and displaying data
US5576954 | 1996-11 | Driscoll | University of Central Florida | Process for determination of text relevancy
US5598557 | 1997-01 | Doner et al. | Caere Corporation | Apparatus and method for retrieving and grouping images representing text files based on the relevance of key words extracted from a selected file to the text files
US5642502 | 1997-06 | Driscoll | University of Central Florida | Method and system for searching for relevant documents from a text database collection, using statistical ranking, relevancy feedback and small pieces of text
US5659732 | 1997-08 | Kirsch | Infoseek Corporation | Document retrieval over networks wherein ranking and relevance scores are computed at the client for multiple database documents
US5666442 | 1997-09 | Wheeler | InfoGlide Corporation | Comparison system for identifying the degree of similarity between objects by rendering a numeric measure of closeness, the system including all available information complete with errors and inaccuracies
US5699509 | 1997-12 | Gary et al. | Abbott Laboratories | Method and system for using inverted data to detect corrupt data
       
Foreign References: None

Other Abstract Info: DERABS G1998-531461

Other References:
  • J. W. Shavlik, T. G. Dietterich, Readings in Machine Learning (book), Morgan Kaufmann Pub. Inc., San Mateo, CA, Chapter 3, Unsupervised Concept Learning and Discovery, pp. 263-283, 1990.
  • P. F. Velleman, D. C. Hoaglin, Applications, Basics, & Computing of Exploratory Data Analysis (book), Chapter 3, pp. 65-81, Chapter 5, pp. 121-147, and Chapter 6, pp. 159-184, date unknown.
  • D. Chamberlin, Using the New DB2: IBM's Object-Relational Database System (book), Chapter 5, Active Data, pp. 323-342, date unknown.
  • D. C. Hoaglin, F. Mosteller, J. W. Tukey, Understanding Robust and Exploratory Data Analysis (book), J. Wiley & Sons, Inc., pp. 1-55, date unknown.
  • L. G. Valiant, A Theory of the Learnable (Artificial Intelligence and Language Processing), Inductive Learning from Preclassified Training Examples, pp. 192-200, date unknown.
  • D. E. Rumelhart and D. Zipser, Feature Discovery by Competitive Learning, Orig. Pub. in Cognitive Science, 9.1, 1985, pp. 307-325.
  • J. R. Quinlan, 2.2 Algorithms, Induction of Decision Trees, Orig. Pub. in Machine Learning, 1: 81-106, 1986, Kluwer Academic Publishers, Boston, pp. 57-69.
  • R. S. Michalski, R. E. Stepp, Learning from Observation: Conceptual Clustering, Chapter 11, pp. 331-362, date unknown.
  • S. J. Hanson, M. Bauer, Conceptual Clustering, Categorization, and Polymorphy, Machine Learning 3: 343-372, 1989, Kluwer Academic Publishers, Manufactured in The Netherlands.
  • D. H. Fisher, Knowledge Acquisition via Incremental Conceptual Clustering, Originally Published in Machine Learning, 2: 139-172, 1987, Kluwer Academic Publishers, Boston, 3.2.1 pp. 267-283.
  • D. Angluin, P. Laird, Learning from Noisy Examples, Machine Learning 2: 343-370, 1988, Kluwer Academic Publishers, Boston, Manufactured in The Netherlands.
  • D. W. Aha, D. Kibler, M. K. Albert, Instance-Based Learning Algorithms, Machine Learning, 6, 37-66, 1991, Kluwer Academic Publishers, Boston, Manufactured in The Netherlands.

