Title: US6185527: System and method for automatic audio content analysis for word spotting, indexing, classification and retrieval


Country: US United States of America

Pages: 21

Inventor: Petkovic, Dragutin; Saratoga, CA
Ponceleon, Dulce Beatriz; Palo Alto, CA
Srinivasan, Savitha; San Jose, CA

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2001-02-06 / 1999-01-19

Application Number: US1999000234663

IPC Code: Advanced: G06F 3/16; G06F 17/30; G10L 15/00; G10L 15/04; G10L 15/08; G10L 15/10; G10L 15/18; G10L 15/26;
IPC-7: G10L 11/02; G10L 15/04; G10L 15/08;

ECLA Code: G06F17/30U5; G06F17/30U1T; G10L15/26A; S10H210/046; S10H210/061; S10L15/08W;

U.S. Class: Current: 704/231; 704/233; 704/251; 704/253; 704/270; 704/E15.045;
Original: 704/231; 704/233; 704/251; 704/253; 704/270;

Field of Search: 704/235,251,253,257,270,231,233

Priority Number:
1999-01-19  US1999000234663

Abstract:     A system and method for indexing an audio stream for subsequent information retrieval and for skimming, gisting, and summarizing the audio stream includes using special audio prefiltering such that only relevant speech segments that are generated by a speech recognition engine are indexed. Specific indexing features are disclosed that improve the precision and recall of an information retrieval system used after indexing for word spotting. The invention includes rendering the audio stream into intervals, with each interval including one or more segments. For each segment of an interval it is determined whether the segment exhibits one or more predetermined audio features such as a particular range of zero crossing rates, a particular range of energy, and a particular range of spectral energy concentration. The audio features are heuristically determined to represent respective audio events including silence, music, speech, and speech on music. Also, it is determined whether a group of intervals matches a heuristically predefined meta pattern such as continuous uninterrupted speech, concluding ideas, hesitations and emphasis in speech, and so on, and the audio stream is then indexed based on the interval classification and meta pattern matching, with only relevant features being indexed to improve subsequent precision of information retrieval. Also, alternatives for longer terms generated by the speech recognition engine are indexed along with respective weights, to improve subsequent recall.
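As an illustration of the per-segment feature tests the abstract describes, the following is a minimal sketch that computes a zero crossing rate and a mean energy for one segment and applies range checks. The threshold values, the function names, and the output labels are assumptions made for this example and are not taken from the patent; spectral energy concentration is omitted to keep the sketch dependency-free.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


def mean_energy(samples):
    """Mean squared amplitude of the segment."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


# Hypothetical thresholds, chosen only for illustration.
SILENCE_ENERGY_MAX = 1e-4
SPEECH_ZCR_RANGE = (0.02, 0.35)


def classify_segment(samples):
    """Label one segment using simple range checks on its features."""
    if mean_energy(samples) <= SILENCE_ENERGY_MAX:
        return "silence"
    low, high = SPEECH_ZCR_RANGE
    if low <= zero_crossing_rate(samples) <= high:
        return "speech_candidate"
    return "other"
```

A real system would tune such ranges on labelled audio and would combine several features per event type, as the abstract indicates for silence, music, speech, and speech on music.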

Attorney, Agent or Firm: Rogitz, John L.

Primary / Asst. Examiners: Smits, Talivaldis I.;


Family: 7 known family members

First Claim (of 36 claims):
We claim:     1. A computer-implemented method for analyzing an audio signal, comprising:
  • detecting audio events in one or more intervals of the audio signal, each interval including a temporal sequence of one or more segments;
  • indexing the audio signal based on the audio events; and
  • skimming, gisting, or summarizing the audio signal using the indexing thereof.
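
The three steps of the claim (detect events, index, skim) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the patented method: the interval dictionaries, the event labels, the `build_index`, `search`, and `skim` helpers, and the example weights are all hypothetical. It indexes every recognizer alternative with its weight, whereas the abstract mentions weighted alternatives specifically for longer terms, and it skims using the event labels directly rather than the term index.

```python
from collections import defaultdict


def build_index(intervals):
    """Map each term to (interval_start, weight) postings.

    Only intervals labelled "speech" are indexed; music and silence
    are skipped so that they cannot produce spurious hits.
    """
    index = defaultdict(list)
    for interval in intervals:
        if interval["event"] != "speech":
            continue
        for alternatives in interval["words"]:
            # Each recognized word carries weighted alternatives.
            for term, weight in alternatives:
                index[term].append((interval["start"], weight))
    return dict(index)


def search(index, term):
    """Return postings for a term, highest weight first."""
    return sorted(index.get(term, []), key=lambda posting: -posting[1])


def skim(intervals):
    """Return the (start, end) spans worth playing back when skimming."""
    return [(iv["start"], iv["end"]) for iv in intervals if iv["event"] == "speech"]


# Hypothetical detector and recognizer output for a nine-second clip.
example_intervals = [
    {"start": 0.0, "end": 2.0, "event": "music", "words": [[("noise", 0.3)]]},
    {"start": 2.0, "end": 6.0, "event": "speech",
     "words": [[("patent", 0.8), ("patient", 0.2)]]},
    {"start": 6.0, "end": 7.0, "event": "silence", "words": []},
    {"start": 7.0, "end": 9.0, "event": "speech", "words": [[("patient", 0.7)]]},
]
```

Keeping the lower-weight alternative ("patient" at 2.0 s) in the index is what lets a later query still find an interval where the recognizer's top choice was a different word.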



Forward References: 143 U.S. patent(s) reference this one

       
U.S. References (13 backward references):

Patent | Pub. Date | Pages | Inventor | Assignee | Title
US5199077 | 1993-03 | 16 | Wilcox et al. | Xerox Corporation | Wordspotting for voice editing and indexing
US5293584 | 1994-03 | 15 | Brown et al. | International Business Machines Corporation | Speech recognition system for natural language translation
US5404510 | 1995-04 | 22 | Smith et al. | Oracle Corporation | Database index design based upon request importance and the reuse and modification of similar existing indexes
US5436653 | 1995-07 | 52 | Ellis et al. | The Arbitron Company | Method and system for recognition of broadcast segments
US5504518 | 1996-04 | 46 | Ellis et al. | The Arbitron Company | Method and system for recognition of broadcast segments
US5526407 | 1996-06 | 36 | Russell et al. | Riverrun Technology | Method and apparatus for managing information
US5606643 | 1997-02 | 18 | Balasubramanian et al. | Xerox Corporation | Real-time audio recording system for automatic speaker indexing
US5612729 | 1997-03 | 46 | Ellis et al. | The Arbitron Company | Method and system for producing a signature characterizing an audio broadcast signal
US5655058 | 1997-08 | 21 | Balasubramanian et al. | Xerox Corporation | Segmentation of audio data for indexing of conversational speech for real-time or postprocessing applications
US5712953 | 1998-01 | 13 | Langs | Electronic Data Systems Corporation | System and method for classification of audio or audio/video signals based on musical content
US5787387 | 1998-07 | 29 | Aguilar | Voxware, Inc. | Harmonic adaptive speech coding method and system
US5937422 | 1999-08 | 11 | Nelson et al. | The United States of America as represented by the National Security Agency | Automatically generating a topic description for text and searching and sorting text by topic using the same
US6100882 | 2000-08 | 11 | Sharman et al. | International Business Machines Corporation | Textual recording of contributions to audio conference using speech recognition
       
Foreign References:

Publication | Date | Pages | IPC Code | Assignee | Title
EP0702351A2 | 1996-03 | 13 | G10L 3/00 | IBM | Method and apparatus for analysing audio input events in a speech recognition system
EP0780777A1 | 1997-06 | 14 | G06F 17/30 | Hewlett-Packard Company | Indexing of recordings
EP0820025A1 | 1998-01 | 11 | G06F 17/30 | AT & T CORP | Method for providing a compressed rendition of a video program in a format suitable for electronic searching and retrieval
JP08063184 | 1996-03 | - | - | - | -
JP08087292 | 1996-04 | - | - | - | -
JP10049189 | 1998-02 | - | - | - | -


Other Abstract Info: DERABS G2000-633246

Other References:
  • White Paper: "Retrieving Spoken Documents by Combining Multiple Index Sources." Jones et al. pp. 30-38. Computer Laboratory, Engineering Dept., Univ. of Cambridge, England. 1996.
  • Article: "Content-Based Classification, Search, and Retrieval of Audio." Wold et al. Muscle Fish, IEEE Multimedia. vol. 3, No. 3. 16 pgs. Fall, 1996.
  • White Paper: "Automatic Audio Content Analysis". Pfeiffer et al. Univ. of Mannheim, Mannheim, Germany. ACM Multimedia. 1996.
  • Article: "SpeechSkimmer: A System for Interactively Skimming Recorded Speech". Barry Arons. ACM Transactions on Computer-Human Interaction, vol. 4, No. 1, pp. 3-38. 1997.

