Title: US5649060: Automatic indexing and aligning of audio and text using speech recognition

Country: US United States of America

Pages: 16

Inventor: Ellozy, Hamed A.; Bedford Hills, NY
Kanevsky, Dimitri; Ossining, NY
Kim, Michelle Y.; Scarsdale, NY
Nahamoo, David; White Plains, NY
Picheny, Michael Alan; White Plains, NY
Zadrozny, Wlodek Wlodzimierz; Mohegan Lake, NY

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 1997-07-15 / 1995-10-23

Application Number: US1995000547113

IPC Code: Advanced: G03B 31/00; G06F 17/30; G10L 15/00; G10L 15/18; G10L 15/26; G11B 27/028; G11B 27/10; G11B 27/28; H04N 5/91; G10L 15/22;
IPC-7: G10L 9/00;

ECLA Code: G11B27/28; G06F17/30U1T; G10L15/26A; G11B27/028; G11B27/10;

U.S. Class: Current: 704/278; 369/025.01; 704/002; 704/009; 704/235; 704/251; 704/276; 704/E15.045; G9B/027.008; G9B/027.017; G9B/027.029;
Original: 395/002.87; 395/002.6; 395/002.44; 395/002.85; 395/752; 395/759; 369/025;

Field of Search: 395/2.4,2.44,2.55,2.59,2.6,2.64,2.79,2.84,2.85,2.86,2.87 369/025 364/419.02,419.08

Priority Number:
1995-10-23  US1995000547113
1993-10-18  US1993000138949

Abstract: A method of automatically aligning a written transcript with speech in video and audio clips. The disclosed technique involves as a basic component an automatic speech recognizer. The automatic speech recognizer decodes speech (recorded on a tape) and produces a file with a decoded text. This decoded text is then matched with the original written transcript via identification of similar words or clusters of words. The result of this matching is an alignment of the speech with the original transcript. The method can be used (a) to create indexing of video clips, (b) for "teleprompting" (i.e. showing the next portion of text when someone is reading from a television screen), or (c) to enhance editing of a text that was dictated to a stenographer or recorded on a tape for its subsequent textual reproduction by a typist.
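
Illustrative sketch (not part of the patent record): the matching step described in the abstract, where decoded text is lined up with the written transcript through runs of identical words, might look roughly like the following. This is a minimal sketch that swaps in Python's standard `difflib.SequenceMatcher` as a stand-in for the patent's matcher. The function name `align_transcript`, the `(word, start_seconds)` input shape, and the sample data are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def align_transcript(decoded, transcript):
    """Align recognizer output with a written transcript.

    decoded    -- list of (word, start_seconds) pairs (hypothetical recognizer output)
    transcript -- list of words from the original written transcript
    Returns (transcript_index, transcript_word, start_seconds) for every
    transcript word that falls inside a run of words shared with the decoded text.
    """
    decoded_words = [word.lower() for word, _ in decoded]
    transcript_words = [word.lower() for word in transcript]
    # autojunk=False keeps frequent words from being discarded as noise.
    matcher = SequenceMatcher(a=decoded_words, b=transcript_words, autojunk=False)
    aligned = []
    for block in matcher.get_matching_blocks():
        # Each block is a cluster of identical consecutive words in both texts.
        for offset in range(block.size):
            d = block.a + offset
            t = block.b + offset
            aligned.append((t, transcript[t], decoded[d][1]))
    return aligned


# Hypothetical data: the recognizer misheard "fox" as "fax".
decoded = [("the", 0.0), ("quick", 0.4), ("brown", 0.8), ("fax", 1.2), ("jumps", 1.6)]
transcript = ["The", "quick", "brown", "fox", "jumps"]
for entry in align_transcript(decoded, transcript):
    print(entry)
```

Words the recognizer got wrong simply stay untimed in this sketch; a fuller treatment could interpolate their times from the matched neighbors.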

Attorney, Agent or Firm: Whitham, Curtis, Whitham and McGinn ; Tassinari, Jr., Robert P. ;

Primary / Asst. Examiners: MacDonald, Allen R.; Sartori, Michael A.

Related Applications:
Application Number Filed Patent Pub. Date  Title
US1993000138949 1993-10-18       

Parent Case:

    This application is a continuation of application Ser. No. 08/138,949 filed Oct. 18, 1993, now abandoned.

Designated Country: DE FR GB 

Family: 7 known family members

First Claim (of 8 claims):
We claim:     1. An apparatus for indexing an audio recording comprising:
  • an acoustic recorder for storing an ordered series of acoustic information signal units representing sounds generated from spoken words, said acoustic recorder having a plurality of recording locations, each recording location storing at least one acoustic information signal unit;
  • a timer connected to said acoustic recorder for time stamping said acoustic information signal units;
  • a speech recognizer connected to said acoustic recorder for generating an ordered series of recognized words having a high conditional probability of occurrence given the occurrence of the sounds represented by the acoustic information signal units from said acoustic recorder, each recognized word corresponding to at least one acoustic information signal unit and comprising a series of one or more characters, each recognized word having a context of at least one preceding or following recognized word;
  • a time alignment device connected to said speech recognizer and receiving time stamps of said acoustic information signal units for aligning said acoustic information signal units according to respective time stamps of said acoustic information signal units;
  • a text storage device for storing a transcript of text of the spoken words corresponding to ordered series of acoustic information signal units stored on said acoustic recorder;
  • mapping means connected to said text storage device for determining a size of an acoustic information signal unit to be passed to said speech recognizer from said acoustic recorder, said mapping means generating an ordered series of index words, said ordered series of index words comprising a representation of at least some of the spoken words represented by the acoustic information signal units, each index word having a context of at least one preceding or following index word and comprising a series of one or more characters;
  • a segmenter controlled by said mapping means for controlling playback of acoustic information signal units to said speech recognizer; and
  • alignment means connected to said acoustic recorder and to said mapping means for comparing the ordered series of recognized words with the ordered series of index words to pair recognized words and index words which are the same word and which have matching contexts, a recognized word being the same as an index word when both words comprise the same series of characters, a context of a target recognized word comprises the number of other recognized words preceding and following the target recognized word in the ordered series of recognized words, a context of a target index word comprises the number of other index words preceding and following the target index word in the ordered series of index words, and the context of a recognized word matches the context of an index word if the context of the target recognized word is within a selected threshold value of the context of the target index word, said alignment means tagging each paired index word with the recording location of the acoustic information signal unit corresponding to the recognized word paired with the index word.
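
Illustrative sketch (not part of the patent record): the pairing rule recited for the alignment means can be read as "same characters, and roughly the same position in both word series". The Python below is one possible reading, not the claimed apparatus. It assumes that a word's context is the pair (words before it, words after it) and that two contexts match when both counts differ by at most the threshold. The function name `pair_words`, the default threshold, and the sample data are hypothetical.

```python
def pair_words(recognized, index_words, threshold=2):
    """Pair index words with recognized words and tag them with recording locations.

    recognized  -- list of (word, recording_location) pairs from the recognizer
    index_words -- list of words derived from the stored transcript
    threshold   -- largest allowed difference between contexts (assumed meaning)
    Returns a dict mapping index-word position -> recording location.
    """
    n_rec, n_idx = len(recognized), len(index_words)
    tags = {}
    used = set()  # recognized positions that are already paired
    for i, index_word in enumerate(index_words):
        idx_context = (i, n_idx - 1 - i)  # words preceding, words following
        for j, (rec_word, location) in enumerate(recognized):
            if j in used or rec_word != index_word:
                continue  # must be the same series of characters
            rec_context = (j, n_rec - 1 - j)
            if (abs(rec_context[0] - idx_context[0]) <= threshold
                    and abs(rec_context[1] - idx_context[1]) <= threshold):
                tags[i] = location  # tag the index word with the recording location
                used.add(j)
                break
    return tags


# Hypothetical data: "to" was misrecognized as "two", so it stays untagged.
recognized = [("we", 10), ("hold", 11), ("these", 12),
              ("truths", 13), ("two", 14), ("be", 15)]
index_words = ["we", "hold", "these", "truths", "to", "be"]
print(pair_words(recognized, index_words, threshold=1))
```

The position check is what stops a common word near the start of the recording from being paired with the same word much later in the transcript.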

Forward References: 174 U.S. patents reference this one

U.S. References (17 backward references):

Patent | Pub. Date | Inventor | Assignee | Title
US4682957 | 1987-07 | Young | - | Teleconferencing and teaching method and apparatus
US4689022 | 1987-08 | Peers et al. | Peers; John | System for control of a video storage means by a programmed processor
US4695975 | 1987-09 | Bedrij | Profit Technology, Inc. | Multi-image communications system
US4783803 | 1988-11 | Baker et al. | Dragon Systems, Inc. | Speech recognition apparatus and method
US4847698 | 1989-07 | Freeman | Actv, Inc. | Interactive television system for providing full motion synched compatible audio/visual displays
US4884972 | 1989-12 | Gasper | Bright Star Technology, Inc. | Speech synchronized animation
US4984274 | 1991-01 | Yahagi et al. | Casio Computer Co., Ltd. | Speech recognition apparatus with means for preventing errors due to delay in speech recognition
US5010495 | 1991-04 | Willetts | American Language Academy | Interactive language learning system
US5027406 | 1991-06 | Roberts et al. | Dragon Systems, Inc. | Method for interactive speech recognition and training
US5111409 | 1992-05 | Gasper et al. | - | Authoring and use systems for sound synchronized animation
US5119474 | 1992-06 | Beitel et al. | International Business Machines Corp. | Computer-based, audio/visual creation and presentation system and method
US5136655 | 1992-08 | Bronson | Hewlett-Packard Company | Method and apparatus for indexing and retrieving audio-video data
US5145375 | 1992-09 | Rubio | - | Moving message learning system and method
US5149104 | 1992-09 | Edelstein | - | Video game having audio player interation with real time video synchronization
US5199077 | 1993-03 | Wilcox et al. | Xerox Corporation | Wordspotting for voice editing and indexing
US5272571 | 1993-12 | Henderson et al. | L. R. Linn and Associates | Stenotype machine with linked audio recording
US5333275 | 1994-07 | Wheatley et al. | - | System and method for time aligning speech

Foreign References:
Publication | Date | IPC Code | Assignee | Title
EP0507743A2 | 1992-02 | G11B 27/028 | STENOGRAPH CORP | Information storage and retrieval systems
JP61084174 | 1986-04 | - | - | -
PCT/US91/09536 | 1991-12 | G11B 17/32 | - | -

Other Abstract Info: DERABS G1995-148939

Other References:
  • Bahl, L.R., et al. "A Maximum Likelihood Approach to Continuous Speech Recognition." IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 2, pp. 179-190, Mar. 1983.
  • Brown, P.F., et al. "Aligning Sentences In Parallel Corpora." Proceedings 29th Annual Meeting of the Association for Computational Linguistics, Berkeley, Calif., Jun. 1991, pp. 169-176.
  • de Souza, P.V. "A Statistical Approach to the Design of an Adaptive Self-Normalized Silence Detector." IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. ASSP-31, No. 3, Jun. 1983, pp. 678-684.
  • Leung, H.C., et al. "A Procedure For Automatic Alignment of Phonetic Transcriptions With Continuous Speech." Proceedings of ICASSP 84, 1984, pp. 2.7.1 to 2.7.3.
  • IBM Technical Disclosure Bulletin, Mar. 1991, vol. 33, #10A, pp. 295-296, "Correlating Audio and Moving-Image Tracks".
