Title: US6253175: Wavelet-based energy binning cepstal features for automatic speech recognition


Country: US United States of America

20 pages

 
Inventor: Basu, Sankar; St. Tenafly, NJ
Maes, Stephane H.; Danbury, CT

Assignee: International Business Machines Corporation, Armonk, NY

Published / Filed: 2001-06-26 / 1998-11-30

Application Number: US1998000201055

IPC Code: Advanced: G10L 15/02; G10L 15/10; G10L 15/20; G10L 21/02; G10L 15/06;
IPC-7: G10L 15/00;

ECLA Code: G10L15/02; S10L15/063C; S10L25/27;

U.S. Class: Current: 704/231; 704/237; 704/245; 704/E15.004;
Original: 704/231; 704/237; 704/245;

Field of Search: 704/231,236,237,245,250

Priority Number:
1998-11-30  US1998000201055

Abstract: Systems and methods for processing acoustic speech signals which utilize the wavelet transform (and alternatively, the Fourier transform) as a fundamental tool. The method essentially involves "synchrosqueezing" spectral component data obtained by performing a wavelet transform (or Fourier transform) on digitized speech signals. In one aspect, spectral components of the synchrosqueezed plane are dynamically tracked via a K-means clustering algorithm. The amplitude, frequency and bandwidth of each of the components are thus extracted. The cepstrum generated from this information is referred to as "K-mean Wastrum." In another aspect, the result of the K-means clustering process is further processed to limit the set of primary components to formants. The resulting features are referred to as "formant-based wastrum." Formants are interpolated in unvoiced regions, and the contribution of the unvoiced, turbulent part of the spectrum is added. This method requires adequate formant tracking. The resulting robust formant extraction has a number of applications in speech processing and analysis, including vocal tract normalization.
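The squeezing idea in the abstract can be illustrated with a minimal Python sketch of the Fourier-transform alternative. This is not the patent's implementation: the function names (`dft`, `squeeze`), the one-sample phase-advance estimate of instantaneous frequency, and the uniform output bins are all assumptions made for illustration. Each transform coefficient's energy is moved to the bin that matches its instantaneous frequency, so energy that the plain transform smears across neighboring bins is concentrated again.

```python
import cmath
import math


def dft(frame):
    """Plain DFT of one frame, keeping bins 0..n/2."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def squeeze(signal, start, size, sample_rate, num_bins):
    """Reassign DFT energy to bins indexed by instantaneous frequency.

    Hypothetical helper: instantaneous frequency is estimated from the
    phase advance of each coefficient between two frames one sample apart.
    """
    first = dft(signal[start:start + size])
    second = dft(signal[start + 1:start + 1 + size])  # same frame, one sample later
    bin_width = (sample_rate / 2) / num_bins
    squeezed = [0.0] * num_bins
    for a, b in zip(first, second):
        energy = abs(a) ** 2
        if energy == 0.0:
            continue  # no phase information in an empty coefficient
        # Phase advance per sample (radians) converted to Hz.
        advance = abs(cmath.phase(b * a.conjugate()))
        inst_freq = advance * sample_rate / (2 * math.pi)
        index = min(int(inst_freq / bin_width), num_bins - 1)
        squeezed[index] += energy
    return squeezed
```

For a single complex tone that does not sit on a DFT bin center, the plain DFT leaks energy into every bin, while `squeeze` places all of it in the one output bin that contains the tone's frequency. For real-valued speech frames the estimate is only approximate, because each coefficient also contains a mirror-image negative-frequency term.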

Attorney, Agent or Firm: F. Chau & Associates, LLP ;

Primary / Asst. Examiners: Tsang, Fan; Opsasnick, Michael N.


Family: Show 5 known family members

First Claim:
Show all 20 claims
What is claimed is:
1. A program storage device readable by machine, tangibly embodying a program of instructions executable by the machine to perform method steps for extracting spectral features from acoustic speech signals for use in automatic speech recognition, said method steps comprising:
  • digitizing acoustic speech signals for at least one of a plurality of frames of speech;
  • performing a first transform on each of said frames of digitized acoustic speech signals to extract spectral parameters for each frame;
  • performing a squeezing transform on said spectral parameters of each frame by grouping spectral components having similar instantaneous frequencies such that acoustic energy is concentrated at the instantaneous frequency values;
  • clustering said squeezed spectral parameters to determine elements corresponding to each frame, the location of the elements being determined by cluster centers resulting from said clustering;
  • mapping frequency, bandwidth and weight values to each element for each frame of speech;
  • mapping each element with its corresponding frame; and
  • generating spectral features from said element for each frame.
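The clustering and mapping steps of the claim can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed method itself: the function name `weighted_kmeans_1d`, the evenly spaced initial centers, the energy weighting, and the use of an energy-weighted standard deviation as the "bandwidth" of an element are all hypothetical choices. The input is one frame's squeezed spectrum, given as frequencies (Hz) with their energies.

```python
import math


def weighted_kmeans_1d(freqs, energies, k, iterations=20):
    """Cluster one frame's squeezed components along the frequency axis.

    Returns one element per non-empty cluster, carrying the three values
    named in the claim: frequency (cluster center), bandwidth and weight.
    """
    lo, hi = min(freqs), max(freqs)
    # Deterministic start (an assumption): centers spread evenly over the range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    groups = []
    for _ in range(max(1, iterations)):
        # Assignment step: each component joins its nearest center.
        groups = [[] for _ in range(k)]
        for f, e in zip(freqs, energies):
            nearest = min(range(k), key=lambda i: abs(f - centers[i]))
            groups[nearest].append((f, e))
        # Update step: each center moves to the energy-weighted mean.
        for i, group in enumerate(groups):
            total = sum(e for _, e in group)
            if total > 0:
                centers[i] = sum(f * e for f, e in group) / total
    elements = []
    for center, group in zip(centers, groups):
        weight = sum(e for _, e in group)
        if weight == 0:
            continue  # skip clusters that attracted no energy
        variance = sum(e * (f - center) ** 2 for f, e in group) / weight
        elements.append(
            {"frequency": center, "bandwidth": math.sqrt(variance), "weight": weight}
        )
    return elements
```

Mapping each element to its frame can then be as simple as storing the returned list under the frame index, for example `elements_by_frame[frame_index] = weighted_kmeans_1d(freqs, energies, k)`, where `elements_by_frame` is a hypothetical dictionary.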


Forward References: Show 7 U.S. patent(s) that reference this one

       
U.S. References: Forward references (7)   |   Backward references (3)

Patent      Pub. Date   Inventor             Assignee                                       Title
US5398302   1995-03     Thrift               -                                              Method and apparatus for adaptive learning in neural networks
US5899973   1999-05     Bandara et al.       International Business Machines Corporation    Method and apparatus for adapting the language model's size in a speech recognition system
US5956671   1999-09     Ittycheriah et al.   International Business Machines Corporation    Apparatus and methods for shift invariant speech recognition
       
Foreign References: None

Other Abstract Info: DERABS G2000-469361

Other References:
  • Daubechies et al., "A Nonlinear Squeezing of the Continuous Wavelet Transform Based on Auditory Nerve Models," Wavelets in Medicine and Biology, vol. 20, pp. 527-546, 1996.
  • Stephane H. Maes, "Robust Speech and Speaker Recognition Using Instantaneous Frequencies and Amplitudes Obtained With Wavelet-Derived Synchrosqueezing Measures," pp. 1-16, Mar. 1996.
  • Stephane H. Maes, "Fast Quasi-Continuous Wavelet Algorithms for Analysis and Synthesis of One-Dimensional Signals," SIAM J. Appl. Math., vol. 57, No. 6, pp. 1763-1801, Dec. 1997. (39 pages)

