Title: US6073095: Fast vocabulary independent method and apparatus for spotting words in speech
Country: US United States of America

9 pages

Inventor: Dharanipragada, Satyanarayana; Ossining, NY
Eide, Ellen Marie; Mount Kisco, NY
Roukos, Salim Estephan; Scarsdale, NY

Assignee: International Business Machines Corporation, Armonk, NY
Published / Filed: 2000-06-06 / 1997-10-15

Application Number: US1997000950621

IPC Code: Advanced: G10L 15/08; G10L 15/00; G10L 15/18;
IPC-7: G10L 7/08;

ECLA Code: G10L15/08; S10L15/08W; S10L15/187; S10L15/197;

U.S. Class: Current: 704/242; 704/255; 704/E15.014;
Original: 704/242; 704/255;

Field of Search: 704/256,242,231,236,246,250,251,257,255,270

Priority Number:
1997-10-15  US1997000950621

Abstract: A fast vocabulary independent method for spotting words in speech utilizes a preprocessing step and a coarse-to-detailed search strategy for spotting a word/phone sequence in speech. The preprocessing includes a Viterbi-beam phone level decoding using a tree-based phone language model. The coarse search matches phone-ngrams to identify regions of speech as putative word hits, and the detailed search performs an acoustic match at the putative hits with a model of the given word included in the vocabulary of the recognizer.

Attorney, Agent or Firm: F. Chau & Associates, LLP ;

Primary / Asst. Examiners: Dorvil, Richemond;

First Claim (of 20 claims):
We claim: 1. A fast vocabulary independent method for spotting words in speech comprising the steps of:
  • converting a speech waveform into a representation comprising phone-ngrams and a corresponding time interval of occurrence of each of the phone-ngrams;
  • receiving phone-ngrams of at least one input word;
  • performing a coarse match by selecting time intervals of the speech waveform having phone-ngrams that correspond to the phone-ngrams of the at least one input word; and
  • performing a detailed acoustic match at the selected time intervals.
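The steps of claim 1 can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the phone-level decoding has already been done and is available as a list of (phone, start, end) tuples, it uses trigrams (n = 3) as one possible choice of phone-ngram, and it stands in for the detailed acoustic match with a caller-supplied scoring function. All names (`build_index`, `coarse_match`, `spot_word`, `detailed_score`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Interval = Tuple[float, float]          # (start_sec, end_sec)
DecodedPhone = Tuple[str, float, float]  # (phone, start_sec, end_sec)


def phone_ngrams(phones: List[str], n: int = 3) -> List[Tuple[str, ...]]:
    """Return all contiguous phone n-grams of a phone sequence."""
    return [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]


def build_index(decoded: List[DecodedPhone],
                n: int = 3) -> Dict[Tuple[str, ...], List[Interval]]:
    """Map each phone n-gram in the decoded speech to its time intervals."""
    index: Dict[Tuple[str, ...], List[Interval]] = defaultdict(list)
    for i in range(len(decoded) - n + 1):
        window = decoded[i:i + n]
        key = tuple(phone for phone, _, _ in window)
        index[key].append((window[0][1], window[-1][2]))
    return index


def coarse_match(index: Dict[Tuple[str, ...], List[Interval]],
                 word_phones: List[str], n: int = 3) -> List[Interval]:
    """Select intervals whose n-grams match n-grams of the input word.

    Overlapping hits are merged into a single putative region.
    """
    hits: List[Interval] = []
    for gram in set(phone_ngrams(word_phones, n)):
        hits.extend(index.get(gram, []))
    hits.sort()
    merged: List[Interval] = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def spot_word(index: Dict[Tuple[str, ...], List[Interval]],
              word_phones: List[str],
              detailed_score: Callable[[Interval], float],
              threshold: float, n: int = 3) -> List[Interval]:
    """Run the coarse match, then keep regions passing the detailed match.

    detailed_score is a placeholder for an acoustic match against a
    model of the word; here it is simply supplied by the caller.
    """
    return [iv for iv in coarse_match(index, word_phones, n)
            if detailed_score(iv) >= threshold]
```

In this sketch the index is built once per recording, so each new query word costs only a few dictionary lookups before the more expensive detailed scoring is applied to the surviving regions. That ordering is the point of a coarse-to-detailed strategy.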

U.S. References:
Patent     Pub. Date  Inventor            Assignee                                     Title
US4817156  1989-03    Bahl et al.         International Business Machines Corporation  Rapidly training a speech recognizer to a subsequent speaker given training data of a reference speaker
US5023911  1991-06    Gerson              Motorola, Inc.                               Word spotting in a speech recognition system without predetermined endpoint detection
US5073939  1991-12    Vensko et al.       ITT Corporation                              Dynamic time warping (DTW) apparatus for use in speech recognition systems
US5199077  1993-03    Wilcox et al.       Xerox Corporation                            Wordspotting for voice editing and indexing
US5309547  1994-05    Niyada et al.       Matsushita Electric Industrial Co., Ltd.     Method of speech recognition
US5345536  1994-09    Hoshimi et al.      Matsushita Electric Industrial Co., Ltd.     Method of speech recognition
US5664227  1997-09    Mauldin et al.      Carnegie Mellon University                   System and method for skimming digital audio/video data
US5794194  1998-08    Takebayashi et al.  Kabushiki Kaisha Toshiba                     Word spotting in a variable noise level environment
US5819223  1998-10    Takagi              NEC Corporation                              Speech adaptation device suitable for speech recognition device and word spotting device
Foreign References:
Publication  Date     IPC Code   Assignee    Title
EP0533491    1993-03  G10L 5/06  XEROX CORP  Wordspotting using two hidden Markov models (HMM)

Other Abstract Info: DERABS G1999-266341

Other References:
  • ICASSP-88. Fissore et al., "Very Large vocabulary isolated utterance recognition: a comparison between one pass and two pass strategies" vol. 2928 pp. 203-206. Apr. 1988.
  • Spoken Language, 1996, ICSLP 96. Nitta et al., "Word-spotting based on inter-word and intra-word diphone models" pp. 1093-1096 vol. 2. Oct. 1996.

